gStore: Answering SPARQL Queries Via Subgraph Matching. Presented by Guan Wang, Kent State University, October 24, 2011.

Presentation transcript:

1 gStore: Answering SPARQL Queries Via Subgraph Matching Presented by Guan Wang Kent State University October 24, 2011

2 Outline: RDF & SPARQL; Previous Solutions for SPARQL Queries; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

4 What is RDF? RDF is a general-purpose framework that provides structured, machine-understandable metadata for the Web. It is based on the idea of making statements about resources in the form of subject-predicate-object expressions, known in RDF as triples. Diagram: a Statement links a Subject to an Object through a Predicate.

5 RDF Model Example: the resource page.html has the Creator "Guan" and the Title "Guan's Home Page".

Subject    Predicate  Object
page.html  Creator    Guan
page.html  Title      Guan's Home Page

6 What is SPARQL? SPARQL is a query language for RDF. It provides a standard format for writing queries that target RDF data and a set of standard rules for processing those queries and returning the results. The building blocks of a SPARQL query are graph patterns that include variables. The result of the query is the set of values these variables must take for the patterns to match the RDF graph.

7 Example of SPARQL: Select ?name Where { ?m ?name. ?m “ ”. ?m “ ”. } Names beginning with ? or $ are variables. Graph patterns are given as a list of triple patterns enclosed in braces {}. The variables listed after the SELECT keyword are the ones returned as results (as in SQL). Each conjunction, denoted by a dot, corresponds to a join.
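
To make the "each dot is a join" remark concrete, here is a minimal Python sketch that evaluates a list of triple patterns against a tiny in-memory graph. The data, the predicate names (hasName, bornIn, diedIn) and the literal values are hypothetical stand-ins, since the slide's own predicates and literals did not survive in the transcript; this is not how any particular SPARQL engine is implemented.

```python
# Toy RDF graph as (subject, predicate, object) triples.
# All identifiers and values below are hypothetical.
TRIPLES = [
    ("p1", "hasName", "Alice"),
    ("p1", "bornIn", "1900"),
    ("p1", "diedIn", "1980"),
    ("p2", "hasName", "Bob"),
    ("p2", "bornIn", "1900"),
    ("p2", "diedIn", "1990"),
]


def is_var(term):
    """Terms starting with '?' are treated as variables in this sketch."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A variable must keep the value it was bound to earlier.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def evaluate(patterns, triples):
    """Evaluate patterns left to right; each step joins on shared variables."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for b in bindings
            for t in triples
            if (extended := match_pattern(pattern, t, b)) is not None
        ]
    return bindings


QUERY = [
    ("?m", "hasName", "?name"),
    ("?m", "bornIn", "1900"),
    ("?m", "diedIn", "1980"),
]

names = [b["?name"] for b in evaluate(QUERY, TRIPLES)]
print(names)  # ['Alice']
```

The shared variable ?m is what ties the three patterns together: every additional pattern filters or extends the bindings produced so far, which is exactly a join on ?m.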

8 RDF Graph

9 SPARQL Queries: Query Graph. SPARQL Query: Select ?name Where { ?m ?name. ?m “ ”. ?m “ ”. }

10 Subgraph Match vs. SPARQL Queries

12 Existing Solutions: Three-Column Table. SPARQL Query: Select ?name Where { ?m ?name. ?m “ ”. ?m “ ”. } Shortcoming: too many self-joins.
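
The self-join problem is easy to see once the query is translated to SQL over a single three-column table. The sketch below uses Python's built-in sqlite3 module; the table name, column names, predicates and values are hypothetical and chosen only for illustration.

```python
import sqlite3

# One table holds every triple; names and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("p1", "hasName", "Alice"),
        ("p1", "bornIn", "1900"),
        ("p1", "diedIn", "1980"),
        ("p2", "hasName", "Bob"),
        ("p2", "bornIn", "1900"),
        ("p2", "diedIn", "1990"),
    ],
)

# Three triple patterns sharing ?m become three copies of the same table,
# joined on the subject column: one self-join per extra pattern.
SQL = """
SELECT t1.object
FROM triples AS t1
JOIN triples AS t2 ON t2.subject = t1.subject
JOIN triples AS t3 ON t3.subject = t1.subject
WHERE t1.predicate = 'hasName'
  AND t2.predicate = 'bornIn' AND t2.object = '1900'
  AND t3.predicate = 'diedIn' AND t3.object = '1980'
"""

rows = conn.execute(SQL).fetchall()
print(rows)  # [('Alice',)]
```

A query with k triple patterns needs k - 1 self-joins in this layout, which is the cost the slide points at.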

13 Existing Solutions: Property Table. Shortcoming: a big waste of space, because subjects that lack a property leave NULL cells in the table.

14 Existing Solutions: Vertically Partitioned Tables. Shortcoming: too many merge joins.

15 Existing Solutions: RDF-3X. Exploits the fact that an RDF triple has only three elements (subject, predicate and object): it builds all six possible permutation indexes and optimizes the order of merge joins. Shortcoming: difficult to handle updates.
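
As a rough illustration of the "all six indexes" idea, the Python sketch below keeps one sorted list per ordering of (subject, predicate, object) and answers a lookup with a prefix scan. Plain sorted lists stand in for the real index structures, and the data and helper names are hypothetical; this is not RDF-3X's actual storage format.

```python
from bisect import bisect_left
from itertools import permutations

# Hypothetical toy data.
TRIPLES = [
    ("p1", "hasName", "Alice"),
    ("p1", "bornIn", "1900"),
    ("p2", "hasName", "Bob"),
    ("p2", "bornIn", "1900"),
]

POS = {"s": 0, "p": 1, "o": 2}


def build_indexes(triples):
    """Build one sorted list per ordering: spo, sop, pso, pos, osp, ops."""
    indexes = {}
    for order in permutations("spo"):
        name = "".join(order)
        indexes[name] = sorted(tuple(t[POS[c]] for c in order) for t in triples)
    return indexes


def prefix_scan(index, prefix):
    """Return all entries of a sorted index that start with `prefix`."""
    start = bisect_left(index, prefix)
    out = []
    for entry in index[start:]:
        if entry[: len(prefix)] != prefix:
            break
        out.append(entry)
    return out


INDEXES = build_indexes(TRIPLES)

# Pattern (?m, bornIn, "1900"): predicate and object are known,
# so the "pos" ordering turns the lookup into one contiguous range.
hits = prefix_scan(INDEXES["pos"], ("bornIn", "1900"))
print(hits)  # [('bornIn', '1900', 'p1'), ('bornIn', '1900', 'p2')]
```

Because every index is sorted, results for different patterns come out ordered and can be combined with merge joins. The flip side is that each inserted or deleted triple has to be applied to all six orderings, which hints at why updates are awkward.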

17 Overview of gStore (Storage): Represent an RDF dataset as an RDF graph G and store it as an adjacency list table.
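
A minimal sketch of the adjacency list view, assuming each subject becomes a vertex that lists its outgoing (predicate, object) pairs. The function name and sample data are hypothetical, and the actual table layout in gStore may differ.

```python
from collections import defaultdict


def build_adjacency(triples):
    """Group triples by subject: vertex -> list of (edge label, neighbor)."""
    adjacency = defaultdict(list)
    for subject, predicate, obj in triples:
        adjacency[subject].append((predicate, obj))
    return dict(adjacency)


# Hypothetical toy data.
TRIPLES = [
    ("p1", "hasName", "Alice"),
    ("p1", "bornIn", "1900"),
    ("p2", "hasName", "Bob"),
]

ADJ = build_adjacency(TRIPLES)
print(ADJ["p1"])  # [('hasName', 'Alice'), ('bornIn', '1900')]
```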

18 Overview of gStore (Encoding): Encode each entity and class vertex into a bitstring, called its signature. Link these vertex signatures according to the RDF graph's structure to form a data signature graph G*.
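
The signature idea can be sketched in Python as follows. The bit width, the hash scheme and the helper names are assumptions made for illustration and are simpler than the encoding used in the gStore paper; the point is only that a vertex signature is the bitwise OR of the encodings of its adjacent edges, and that a query vertex can match a data vertex only if every bit of the query signature is also set in the data signature.

```python
import hashlib

SIG_BITS = 64  # hypothetical signature width


def hash_bits(text, k=2):
    """Set k hash-selected bits of a SIG_BITS-wide bitstring for `text`."""
    sig = 0
    for i in range(k):
        digest = hashlib.sha256(f"{i}:{text}".encode()).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return sig


def vertex_signature(adjacent):
    """OR together the encodings of all (edge label, neighbor) pairs."""
    sig = 0
    for label, neighbor in adjacent:
        sig |= hash_bits("label:" + label) | hash_bits("node:" + neighbor)
    return sig


def may_match(query_sig, data_sig):
    """Filter test: every bit set in the query must be set in the data."""
    return (query_sig & data_sig) == query_sig


# Hypothetical data vertex and query vertex (constants only).
data_p1 = vertex_signature(
    [("hasName", "Alice"), ("bornIn", "1900"), ("diedIn", "1980")]
)
query_m = vertex_signature([("bornIn", "1900"), ("diedIn", "1980")])

print(may_match(query_m, data_p1))  # True
```

The test never rejects a true match, because the data vertex's signature contains every bit contributed by the query's edges. It can accept a non-match when unrelated strings happen to hash to the same bits, so surviving candidates still need to be verified against the actual graph.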

19 Overview of gStore (VS*-tree)

21 Encoding Technique

22 Encoding Technique

24 VS*-tree: Each leaf node of the tree corresponds to one vertex signature in G*. Given two leaf nodes d1 and d2, we introduce an edge between them if and only if there is an edge between d1 and d2 in G*. Given two non-leaf nodes d1 and d2, we introduce a super edge from d1 to d2 if and only if there is at least one edge from d1's children to d2's children. The label of the super edge d1→d2 is obtained by a bitwise OR over the n edge labels from d1's children to d2's children.
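
The super-edge rule can be sketched in a few lines of Python. The node names, the parent map and the 4-bit labels are hypothetical, and this covers only the edge-labeling step for one level of the tree, not the construction of the whole VS*-tree.

```python
from collections import defaultdict


def super_edges(child_edges, parent):
    """Lift edges between children to super edges between their parents.

    child_edges: {(u, v): label_bits} for edges between child nodes.
    parent:      child node -> its parent node in the tree.
    The label of each super edge is the bitwise OR of the labels of all
    child edges that it summarizes.
    """
    lifted = defaultdict(int)
    for (u, v), label in child_edges.items():
        lifted[(parent[u], parent[v])] |= label
    return dict(lifted)


# Hypothetical example: leaves d1, d2 under node A; leaves d3, d4 under node B.
PARENT = {"d1": "A", "d2": "A", "d3": "B", "d4": "B"}
LEAF_EDGES = {
    ("d1", "d3"): 0b0010,
    ("d2", "d3"): 0b0100,
    ("d2", "d4"): 0b1000,
}

SUPER = super_edges(LEAF_EDGES, PARENT)
print({k: bin(v) for k, v in SUPER.items()})  # {('A', 'B'): '0b1110'}
```

Because the super-edge label covers every bit of every child edge it summarizes, a query edge whose label has a bit outside the super-edge label cannot match any of those child edges, so the whole pair of subtrees can be skipped for that edge.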

25 VS*-tree

26 Query Algorithm

28 Experiments: Datasets used: Yago and DBLP, popular semantic datasets with millions of triples. Data size: approximately 4 GB.

29 Experiments (Exact Queries)

30 Experiments (Wildcard Queries)

32 Conclusions: gStore proposes storing and querying RDF data from a graph database perspective. It uses the VS*-tree as an index over the vertex bitstrings (signatures), which supports SPARQL queries in a scalable manner. Caveat: signature-based filtering can produce false positives.

33 References
[ICDE09] Thanh Tran, Haofen Wang, Sebastian Rudolph, Philipp Cimiano, "Top-k Exploration of Query Candidates for Efficient Keyword Search on Graph-Shaped (RDF) Data", DOI /ICDE
[VLDB07] Daniel J. Abadi, Adam Marcus, Samuel R. Madden, Kate Hollenbach, "Scalable Semantic Web Data Management Using Vertical Partitioning", VLDB '07, September 23-28, 2007, Vienna, Austria.
[PVLDB08] Cathrin Weiss, Panagiotis Karras, Abraham Bernstein, "Hexastore: Sextuple Indexing for Semantic Web Data Management", PVLDB '08, August 23-28, 2008, Auckland, New Zealand.
[PVLDB08] Thomas Neumann, Gerhard Weikum, "RDF-3X: a RISC-style Engine for RDF", PVLDB '08, August 23-28, 2008, Auckland, New Zealand.
[VLDB11] Lei Zou, Jinghui Mo, Lei Chen, M. Tamer Özsu, Dongyan Zhao, "gStore: Answering SPARQL Queries via Subgraph Matching", VLDB '11, August 29 - September 3, 2011, Seattle, Washington.

Thank you!