Aug 29, 2017 | 11:00–12:30
Join us for a lunch talk by Javier Fernandez on “Democratizing Big Semantic Data management, or how to query a labeled graph with 28 billion edges in a standard laptop” on Tuesday, August 29, at 11:00 am at the Hub in Room 101.
Abstract: “Linked Open Data” is a collective effort for the integration and combination of data from diverse sources, converting existing scattered data on the Web into profitable knowledge. Data providers make use of a common graph-based model, the Resource Description Framework (RDF), to describe and link data with various degrees of structure (or lack thereof). The potential of this Big Semantic Data is under-exploited when data management is based on traditional, human-readable RDF representations, which add unnecessary overhead when storing, exchanging and consuming RDF in the context of a large-scale and machine-understandable Semantic Web. In this talk, we will first discuss the main challenges emerging in a Big Semantic Data scenario, and we will present fundamental concepts of Compact Data Structures and RDF self-indexes. Then, we will introduce HDT, a compact data structure and binary serialization format that keeps big RDF datasets compressed while maintaining search and browse operations without prior decompression. Finally, we will show the application of HDT to represent and query more than 28 billion edges of the current Linked Open Data network.