Efficient Processing of Semantic Information on the Web Georg Lausen Technische Fakultät Universität Freiburg.

Processing of Semantic Information on the Web
The amount of information available on the Web is still increasing rapidly.
(Semi-)automatic data extraction.
Resource Description Framework (RDF).
SPARQL is the standard query language for RDF.
Efficiency and scalability of query processing.
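To make the RDF and SPARQL points above concrete, here is a minimal sketch of how a conjunction of triple patterns (a SPARQL basic graph pattern) can be matched against a set of RDF triples. It is an illustration only, not the method of any system named in these slides; the `ex:` and `foaf:` terms, the sample data, and all function names are assumptions made up for the example.

```python
# Illustrative only: a tiny in-memory RDF graph and a naive pattern matcher.
# All terms (ex:..., foaf:...) and the data are invented for this sketch.

TRIPLES = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:name", "Bob"),
]


def is_var(term):
    """Variables are written SPARQL-style, with a leading '?'."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Return `binding` extended so `pattern` matches `triple`, else None."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return result


def evaluate(patterns, triples):
    """Evaluate a conjunction of triple patterns by nested-loop matching."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


# Roughly: SELECT ?name WHERE { ex:alice foaf:knows ?x . ?x foaf:name ?name }
query = [("ex:alice", "foaf:knows", "?x"), ("?x", "foaf:name", "?name")]
results = evaluate(query, TRIPLES)
print(results)
```

The nested loop over all triples for every pattern is what makes efficiency and scalability a concern once the graph grows; real RDF stores rely on indexes and join planning instead.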

Efficiency and Scalability: A Variety of Approaches
Single-machine RDF stores
Parallel database approach: Vertica and others
Approaches based on Hadoop (MapReduce paradigm)
  – Hadoop
  – Hadoop++
  – Integration of databases: HadoopDB
  – Language translation
      Mapping SPARQL to Hadoop/HBase directly
      Mapping SPARQL to Pig Latin
Non-Hadoop clusters
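As a rough illustration of what evaluating a SPARQL join in the MapReduce paradigm can look like, the sketch below simulates a reduce-side join of two triple patterns in a single Python process. It does not use Hadoop and is not taken from any of the systems listed above; the data, the two fixed patterns, and the function names are assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical sample data; map, shuffle and reduce are simulated in-process.
TRIPLES = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:name", "Bob"),
    ("ex:carol", "foaf:knows", "ex:bob"),
]


def map_phase(triples):
    """Emit (join key, tagged value) pairs for two patterns sharing ?x.

    Pattern A: ?x foaf:name ?name   (join key is the subject)
    Pattern B: ?y foaf:knows ?x     (join key is the object)
    """
    for s, p, o in triples:
        if p == "foaf:name":
            yield s, ("A", o)
        elif p == "foaf:knows":
            yield o, ("B", s)


def shuffle(pairs):
    """Group mapper output by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Per key, combine every pattern-A match with every pattern-B match."""
    for key, values in groups.items():
        names = [v for tag, v in values if tag == "A"]
        knowers = [v for tag, v in values if tag == "B"]
        for y in knowers:
            for name in names:
                yield {"?y": y, "?x": key, "?name": name}


joined = sorted(
    reduce_phase(shuffle(map_phase(TRIPLES))), key=lambda b: b["?y"]
)
print(joined)
```

Each additional join on a different variable would need another map/shuffle/reduce round in this scheme, which is one reason translation strategies and their job counts matter for query performance.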

Cluster-based Parallelism vs Parallel Database/Single Machine RDF-Store
Each technology has its own advantages and problems. Rough characterization:

                                                  Querying   Loading
Parallel Database / Single Machine RDF-Store         +          -
Cluster-based Parallelism                            -          +

Loading in the context of Web research: Extract-Transform-Load (ETL) scheme.
SPARQL provides a declarative way for specifying the transformation and querying.

ETL and Querying in the context of Web research
[Figure: Web documents, Initial RDF graph, RDF store; stages labelled E, L, T; annotations "Efficient Loading" and "Efficient querying" (SPARQL).]
PigSPARQL: Mapping SPARQL to Pig Latin; to appear at Semantic Web Information Management – SWIM 2011
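The idea of mapping SPARQL to Pig Latin can be sketched as a small translator that turns two triple patterns with one shared variable into a Pig Latin script. This is only a guess at the general shape of such a translation, not the actual PigSPARQL algorithm. The function name, the input path, and the assumption that triples are stored one per line as three space-separated fields are all hypothetical.

```python
def translate(pattern_a, pattern_b, input_path):
    """Translate two triple patterns into a Pig Latin script (as text).

    Sketch-level restrictions (assumptions, not PigSPARQL behaviour):
    both predicates are constants and exactly one variable is shared.
    """
    positions = ("s", "p", "o")
    shared = [
        (positions[i], positions[j])
        for i, a in enumerate(pattern_a)
        for j, b in enumerate(pattern_b)
        if a == b and a.startswith("?")
    ]
    if len(shared) != 1:
        raise ValueError("this sketch supports exactly one shared variable")
    col_a, col_b = shared[0]
    return "\n".join([
        # Assumes whitespace-separated s/p/o fields; real RDF literals
        # may contain spaces and would need a proper loader.
        f"triples = LOAD '{input_path}' USING PigStorage(' ') "
        "AS (s:chararray, p:chararray, o:chararray);",
        f"t0 = FILTER triples BY p == '{pattern_a[1]}';",
        f"t1 = FILTER triples BY p == '{pattern_b[1]}';",
        f"joined = JOIN t0 BY {col_a}, t1 BY {col_b};",
    ])


# Roughly: { ?y foaf:knows ?x . ?x foaf:name ?name }
script = translate(
    ("?y", "foaf:knows", "?x"),
    ("?x", "foaf:name", "?name"),
    "triples.txt",  # hypothetical input file
)
print(script)
```

Generating Pig Latin rather than MapReduce jobs directly leaves the decision of how to split the script into jobs to Pig itself.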