Efficient Processing of Semantic Information on the Web Georg Lausen Technische Fakultät Universität Freiburg.


1 Efficient Processing of Semantic Information on the Web Georg Lausen Technische Fakultät Universität Freiburg

2 Processing of Semantic Information on the Web
The amount of information available on the Web is still increasing rapidly.
(Semi-)automatic data extraction.
Resource Description Framework (RDF).
SPARQL is the standard query language for RDF.
Efficiency and scalability of query processing.
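To make the RDF and SPARQL items concrete, here is a minimal sketch of how a basic graph pattern could be evaluated over a set of triples. It is plain Python with no RDF library; the `ex:` names, the toy data, and the helper functions `match_pattern` and `evaluate` are assumptions made for illustration, not part of the talk.

```python
# Hypothetical toy data: RDF statements as (subject, predicate, object) tuples.
TRIPLES = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:alice", "ex:name", '"Alice"'),
]


def is_var(term):
    """Terms starting with '?' are variables, as in SPARQL."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Return binding extended so that pattern matches triple, or None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against its existing value.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def evaluate(patterns, triples):
    """Evaluate a list of triple patterns (a basic graph pattern)."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for b in bindings
            for t in triples
            if (extended := match_pattern(pattern, t, b)) is not None
        ]
    return bindings


# Roughly: SELECT * WHERE { ?x ex:knows ?y . ?y ex:knows ?z }
results = evaluate(
    [("?x", "ex:knows", "?y"), ("?y", "ex:knows", "?z")], TRIPLES
)
print(results)
```

The nested loop over all triples for every pattern is the naive strategy; the efficiency and scalability question on this slide is essentially about how to avoid it on large graphs.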

3 Efficiency and Scalability: A Variety of Approaches
Single-machine RDF stores
Parallel database approach: Vertica and others
Approaches based on Hadoop (MapReduce paradigm)
  – Hadoop
  – Hadoop++
  – Integration of databases: HadoopDB
  – Language translation
      – Mapping SPARQL to Hadoop/HBase directly
      – Mapping SPARQL to Pig Latin
Non-Hadoop clusters
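As a rough illustration of the MapReduce paradigm behind the Hadoop-based approaches, the sketch below simulates a reduce-side join of two triple patterns in a single Python process. It is a generic textbook-style join, not the translation used by PigSPARQL or any other system named on the slide; the data and the function names `map_phase`, `shuffle`, `reduce_phase`, and `run_job` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (subject, predicate, object) tuples.
TRIPLES = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:carol", "ex:knows", "ex:dave"),
]


def map_phase(triple):
    """Emit (join key, tagged value) pairs for ?x ex:knows ?y . ?y ex:knows ?z."""
    s, p, o = triple
    if p == "ex:knows":
        yield o, ("L", s)  # left pattern: the join variable ?y is the object
        yield s, ("R", o)  # right pattern: the join variable ?y is the subject


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine every left match with every right match sharing the key."""
    lefts = [v for tag, v in values if tag == "L"]
    rights = [v for tag, v in values if tag == "R"]
    for x in lefts:
        for z in rights:
            yield (x, key, z)


def run_job(triples):
    pairs = [kv for t in triples for kv in map_phase(t)]
    groups = shuffle(pairs)
    return sorted(
        row for key, values in groups.items() for row in reduce_phase(key, values)
    )


rows = run_job(TRIPLES)
print(rows)
```

On a real cluster the map and reduce calls would run on different machines and the shuffle would move data over the network, which is where much of the cost of such joins tends to arise.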

4 Cluster-based Parallelism vs. Parallel Database / Single-Machine RDF Store
Each technology has its own advantages and problems. Rough characterization:

                                               Querying   Loading
Parallel Database / Single-Machine RDF Store      +          -
Cluster-based Parallelism                         -          +

Loading in the context of Web research: the Extract-Transform-Load (ETL) scheme.
SPARQL provides a declarative way to specify both the transformation and the querying.

5 ETL and Querying in the Context of Web Research
[Diagram: Web documents, initial RDF graph, RDF store; steps labeled E, L, T; annotations: efficient loading, efficient querying, SPARQL]
PigSPARQL: Mapping SPARQL to Pig Latin; to appear at Semantic Web Information Management (SWIM 2011).

