Managing the Evolution and Preservation of the Data Web - CSH



The 4th edition of this workshop targeted one of the emerging and fundamental problems in the Semantic Web: the preservation of evolving linked datasets. This topic is of particular relevance to the Semantic Web community, since it raises awareness of the many research challenges in preserving and managing dynamic linked datasets. Fostering active usage of such evolving datasets requires further research advances on topics such as storage, synchronisation, change representation and querying over evolving graphs. This year, we accepted three papers, invited a keynote speaker, and discussed the future steps of the community, all of which we describe in brief.

In this year’s contributions we see a focus on the management of data versioning and the preservation of evolving knowledge. Singh et al. [4] present DELTA-LD, a change detection mechanism for linked datasets. DELTA-LD focuses on detecting changes at both the resource level (creation, removal, update, movement, or renewal of a resource) and the triple level (deleting or adding a triple). To do so, the approach considers (i) the extraction of features from the linked datasets in order to detect changes and identify similar representations in different versions (i.e. moved resources), and (ii) a classification of the changes and a representation of the change model using a provided ontology. Pandit et al. [3] investigate how to represent changes in consents and activities under the novel General Data Protection Regulation (GDPR).

In their position paper, they first discuss the use of PROV to represent the provenance of activities and ODRL to represent the consent, and identify the influence of consent changes. Then, they discuss how to detect and represent changes in activities, and how to link and use those changes to demonstrate compliance with GDPR obligations. Laajimi et al. [2] focus on evaluating the performance of archiving engines. In particular, they propose and evaluate the use of the Spark distributed system to archive RDF data. The authors represent RDF data and changes in Spark DataFrames, while archiving queries are resolved via Spark SQL. Then, the performance of different versioning approaches (e.g. fully materialized versions versus representing only the changing triples in each version) is evaluated, with particular attention to measuring the differing performance of star and chain queries.
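The two versioning approaches contrasted above, together with the triple-level changes (added and deleted triples) mentioned earlier, can be illustrated with a small sketch. It is not taken from any of the accepted papers: it uses plain Python sets rather than Spark, and the names `Delta`, `diff`, `build_change_based_archive` and `materialize`, as well as the example triples, are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, List, NamedTuple, Sequence, Set, Tuple

# A triple is modelled as (subject, predicate, object); an assumption for this sketch.
Triple = Tuple[str, str, str]


class Delta(NamedTuple):
    """Triple-level changes between two consecutive versions."""
    added: FrozenSet[Triple]
    deleted: FrozenSet[Triple]


def diff(old: Set[Triple], new: Set[Triple]) -> Delta:
    """Compute which triples were added and which were deleted."""
    return Delta(added=frozenset(new - old), deleted=frozenset(old - new))


def build_change_based_archive(
    versions: Sequence[Set[Triple]],
) -> Tuple[FrozenSet[Triple], List[Delta]]:
    """Store only the first version in full, plus one delta per later version."""
    base = frozenset(versions[0])
    deltas = [diff(a, b) for a, b in zip(versions, versions[1:])]
    return base, deltas


def materialize(
    base: FrozenSet[Triple], deltas: Sequence[Delta], version_index: int
) -> Set[Triple]:
    """Rebuild version `version_index` by replaying deltas on top of the base."""
    graph = set(base)
    for delta in deltas[:version_index]:
        graph -= delta.deleted
        graph |= delta.added
    return graph


# Fully materialized archive: every version is kept as a complete set of triples.
versions: List[Set[Triple]] = [
    {("ex:alice", "ex:knows", "ex:bob")},
    {("ex:alice", "ex:knows", "ex:bob"), ("ex:bob", "ex:knows", "ex:carol")},
    {("ex:bob", "ex:knows", "ex:carol")},
]

# Change-based archive: only the changing triples are kept per version.
base, deltas = build_change_based_archive(versions)
```

In this sketch, the fully materialized list answers a query on any version directly but repeats unchanged triples, whereas the change-based archive stores less and has to replay deltas (here via `materialize(base, deltas, 2)`) before a version can be queried.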


J. Debattista, J. D. Fernández, M. Vidal, J. Umbrich, Managing the Evolution and Preservation of the Data Web, J. Web Semant. 54 (2019) 1-3
