1 Evaluating Conjunctive Triple Pattern Queries over Large Structured Overlay Networks Erietta Liarou, Stratos Idreos, and Manolis Koubarakis 2006. Waled.

Slides:



Advertisements
Similar presentations
George Anadiotis, Spyros Kotoulas and Ronny Siebes VU University Amsterdam.
Advertisements

17 th International World Wide Web Conference 2008 Beijing, China XML Data Dissemination using Automata on top of Structured Overlay Networks Iris Miliaraki.
Distributed Hash Tables
P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
Peer to Peer and Distributed Hash Tables
Evaluation of a Scalable P2P Lookup Protocol for Internet Applications
Pastry Peter Druschel, Rice University Antony Rowstron, Microsoft Research UK Some slides are borrowed from the original presentation by the authors.
P2PIR'06: "Distributed Cache Table (DCT)" Gleb Skobeltsyn, Karl Aberer D istributed T able: Efficient Query-Driven Processing of Multi-Term Queries in.
1 gStore: Answering SPARQL Queries Via Subgraph Matching Presented by Guan Wang Kent State University October 24, 2011.
P2P-DIET: One-time and Continuous Queries in Super-Peer Networks By Stratos Idreos, Manolis Koubarakis and Christos Tryfonopoulos Intelligent Systems Laboratory.
GridVine: Building Internet-Scale Semantic Overlay Networks By Lan Tian.
Kademlia: A Peer-to-peer Information System Based on the XOR Metric.
CHORD – peer to peer lookup protocol Shankar Karthik Vaithianathan & Aravind Sivaraman University of Central Florida.
Technische Universität Yimei Liao Chemnitz Kurt Tutschku Vertretung - Professur Rechner- netze und verteilte Systeme Chord - A Distributed Hash Table Yimei.
Technische Universität Chemnitz Kurt Tutschku Vertretung - Professur Rechner- netze und verteilte Systeme Chord - A Distributed Hash Table Yimei Liao.
CHORD: A Peer-to-Peer Lookup Service CHORD: A Peer-to-Peer Lookup Service Ion StoicaRobert Morris David R. Karger M. Frans Kaashoek Hari Balakrishnan Presented.
Chord: A scalable peer-to- peer lookup service for Internet applications Ion Stoica, Robert Morris, David Karger, M. Frans Kaashock, Hari Balakrishnan.
Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan Presented.
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications Robert Morris Ion Stoica, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT.
10/31/2007cs6221 Internet Indirection Infrastructure ( i3 ) Paper By Ion Stoica, Daniel Adkins, Shelley Zhuang, Scott Shenker, Sonesh Sharma Sonesh Sharma.
1 Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications Robert Morris Ion Stoica, David Karger, M. Frans Kaashoek, Hari Balakrishnan.
Topics in Reliable Distributed Systems Lecture 2, Fall Dr. Idit Keidar.
Introduction to Peer-to-Peer (P2P) Systems Gabi Kliot - Computer Science Department, Technion Concurrent and Distributed Computing Course 28/06/2006 The.
Looking Up Data in P2P Systems Hari Balakrishnan M.Frans Kaashoek David Karger Robert Morris Ion Stoica.
Chord: A Scalable Peer-to-Peer Lookup Protocol for Internet Applications Stoica et al. Presented by Tam Chantem March 30, 2007.
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek and Hari alakrishnan.
Structure Overlay Networks and Chord Presentation by Todd Gardner Figures from: Ion Stoica, Robert Morris, David Liben- Nowell, David R. Karger, M. Frans.
Topics in Reliable Distributed Systems Fall Dr. Idit Keidar.
Peer To Peer Distributed Systems Pete Keleher. Why Distributed Systems? l Aggregate resources! –memory –disk –CPU cycles l Proximity to physical stuff.
Wide-area cooperative storage with CFS
Possible uses of Everlab cluster Everlab Workshop 7-8 June, Jerusalem Iris Miliaraki Christos Tryfonopoulos Technical University of Crete Dept. of Electronics.
Structured P2P Network Group14: Qiwei Zhang; Shi Yan; Dawei Ouyang; Boyu Sun.
SIMULATING A MOBILE PEER-TO-PEER NETWORK Simo Sibakov Department of Communications and Networking (Comnet) Helsinki University of Technology Supervisor:
CSE 461 University of Washington1 Topic Peer-to-peer content delivery – Runs without dedicated infrastructure – BitTorrent as an example Peer.
INTRODUCTION TO PEER TO PEER NETWORKS Z.M. Joseph CSE 6392 – DB Exploration Spring 2006 CSE, UT Arlington.
Managing Large RDF Graphs (Infinite Graph) Vaibhav Khadilkar Department of Computer Science, The University of Texas at Dallas FEARLESS engineering.
RDF (Resource Description Framework) Why?. XML XML is a metalanguage that allows users to define markup XML separates content and structure from formatting.
Chord & CFS Presenter: Gang ZhouNov. 11th, University of Virginia.
Overlay network concept Case study: Distributed Hash table (DHT) Case study: Distributed Hash table (DHT)
Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications Xiaozhou Li COS 461: Computer Networks (precept 04/06/12) Princeton University.
Storage and Retrieval of Large RDF Graph Using Hadoop and MapReduce Mohammad Farhan Husain, Pankil Doshi, Latifur Khan, Bhavani Thuraisingham University.
1 Distributed Hash Tables (DHTs) Lars Jørgen Lillehovde Jo Grimstad Bang Distributed Hash Tables (DHTs)
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
Presentation 1 By: Hitesh Chheda 2/2/2010. Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT Laboratory for Computer Science.
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan Presented.
Efficient RDF Storage and Retrieval in Jena2 Written by: Kevin Wilkinson, Craig Sayers, Harumi Kuno, Dave Reynolds Presented by: Umer Fareed 파리드.
National Institute of Advanced Industrial Science and Technology Query Processing for Distributed RDF Databases Using a Three-dimensional Hash Index Akiyoshi.
Paper Survey of DHT Distributed Hash Table. Usages Directory service  Very little amount of information, such as URI, metadata, … Storage  Data, such.
1 Towards Taxonomy-based Routing in P2P Networks Alexander L¨oser 指導老師 : 許子衝 老師 學生 : 羅英辰 學號 :M97G0216.
Trust calculus for PKI Roman Novotný, Milan Vereščák.
1 Secure Peer-to-Peer File Sharing Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan MIT Laboratory.
Chord Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,
Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 2: Distributed Hash.
Data Indexing in Peer- to-Peer DHT Networks Garces-Erice, P.A.Felber, E.W.Biersack, G.Urvoy-Keller, K.W.Ross ICDCS 2004.
Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks
Click to edit Master title style Multi-Destination Routing and the Design of Peer-to-Peer Overlays Authors John Buford Panasonic Princeton Lab, USA. Alan.
LOOKING UP DATA IN P2P SYSTEMS Hari Balakrishnan M. Frans Kaashoek David Karger Robert Morris Ion Stoica MIT LCS.
CS 347Notes081 CS 347: Parallel and Distributed Data Management Notes 08: P2P Systems.
1 Secure Peer-to-Peer File Sharing Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan MIT Laboratory.
CS694 - DHT1 Distributed Hash Table Systems Hui Zhang University of Southern California.
Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications * CS587x Lecture Department of Computer Science Iowa State University *I. Stoica,
Simple Load Balancing for Distributed Hash tables
Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M
Magdalena Balazinska, Hari Balakrishnan, and David Karger
A Scalable Peer-to-peer Lookup Service for Internet Applications
Building Peer-to-Peer Systems with Chord, a Distributed Lookup Service
Distributed Hash Tables
MIT LCS Proceedings of the 2001 ACM SIGCOMM Conference
Consistent Hashing and Distributed Hash Table
Presentation transcript:

1 Evaluating Conjunctive Triple Pattern Queries over Large Structured Overlay Networks Erietta Liarou, Stratos Idreos, and Manolis Koubarakis Waled Almukawi. Albert-Ludwigs-Universität Freiburg Rechnernetze und Telematik

2 Outline Motivation Terminologies  Conjunctive Queries  RDF  DHT System Model & Data Model Query Chain Algorithm Spread By Value Algorithm Experiments Future Works References

3 Motivation One of the most challenging and interesting open problems in the frontiers of P2P and Semantic Web is how to evaluate queries expressed in RDF query languages on top of P2P networks. Previous Rdfpeers used only atomic formula or disjoint RDF triple pattern. The conjunctive queries are interesting, since they provide a flexible methods that can deal effectively with a full functionality of existing RDF query.

4 Conjunctive Queries The conjunctive queries are simply the fragment of first order logic given by the set of atomic formula using conjunction Λ. A conjunctive query q formula : ?x1,..., ?xn : (s1, p1, o1) ^ (s2, p2, o2) ^ · · · ^ (sn, pn, on) where ?x1,..., ?xn are answer variables, each (si, pi, oi) is a triple pattern, and each variable ?xi appears in at least one triple pattern (si, pi, oi).

Conjunctive queries (2) Example: Q:ans(x,y):- country(Id,x),capital(y,x ),continent(x,“Europa“) Q‘:ans(x,y):-(?x,“type“,“country“),(?x,“has-capital“,?y), (?x, continent,“Europa“). 5

6 Resource Description Framework (RDF) RDF is a framework for describing Web resources, such as the title, author, content …e.t.c. RDF Statements is The combination of a Resource, a Property, and a Property value forms a Statement (known as the subject, predicate and object of a Statement). Subject Object Uris, blank node Predicate Uris, Literals, Blank nodes

7 Distriuted Hash Tables (DHT) DHTs are an important class of P2P networks that offer distributed hash table functionality, and allow one to develop scalable, robust and fault-tolerant distributed applications. Distribute data over a large P2P network Can Quickly find any given item ni Log N nodes. Look up for an Item in the DHT is performed in O(Log N) nodes.

System Model ( ref:[4] ) 10 N11N N42N42 +0 keyssucc.distan N63 N50 N48 N42 +4 N42 +2 N42 +1 N N42 +8 finger table N27 N11 N2 N56 N50 N17 N63 N48 N6 N33 N42 27 N27N N11N11 +0 keyssucc.distan N17N11 +4 N17N11 +2 N17N11 +1 N48N N27N11 +8 finger table 28N33N N27N27 +0 keyssucc.distan N63 N48 N42 N33 N27 +4 N27 +2 N N N27 +8 finger table Key 28 N42 10 N11 N42+32 N11 27 N27 N N27 28 N33 N27+1 N33

9 Data Model Each network node is able to describe in RDF the resources that it wants to make available to the rest of the network. It creates and inserts the RDF triples ( Subject, Predicate, Object ).  The subject of a triple identifies the resource that the statement is about.  The predicate identifies a property or a characteristic of the subject.  The object identifies the value of the property.

10 Query Chain Algorithm The main idea of QC is that the query is evaluated by a chain of nodes. Intermediate results flow through the nodes of this chain Finally the last node in the chain delivers the result back to the node that submitted the query. Distribute the Load Balance by  Split the query into the triple patterns that it consists of.  Evaluate each triple pattern separately to a deferent node.  Intermediate result flow through these nodes as triple and partially satisfy the query.

Query Chain Algorithm 11 N42 N27 N11 N2 N56 N50 X X N63 N48 N6 N33 T1=Hash(S) T2=Hash(P) T3=Hash(O) r1 r2 r3 Indexing a new triple

Query Chain Algorithm 12 N42 N27 N11 N2 N56 N50 X X N63 N48 N6 N33 T1=Hash(p1) T2=Hash(p2) T3=Hash(o3) r1 r2 r3 Indexing a new triple

13 Query Chain Algorithm Evaluating a query q is ?x:(?x,p1,?y), (?y,p2,?z), (?z,p3,o3). q is ?x:(?x,p1,?y), (?y,p2,?z), (?z,p3,o3). x x r1 S (H(p1)) R1 searches in its TT and evaluates t1 R1 searches in its TT and evaluates t1 r1 r2 S (H(p2)) QEval(q,2,R’,ip(x)) QEval (q,i,R’,ip(x)) L={s1,s2} R‘={s1,s2} L={s1,s2} R‘={s1,s2} S1,p1,s2

14 Query Chain Algorithm Evaluating a query R2 searches in its TT and evaluates t2 =(?y,p2,?z) R2 searches in its TT and evaluates t2 =(?y,p2,?z) r2 r3 S (H(o3)) R3 searches in its TT and evaluates t3 (?z,p3,o3) R3 searches in its TT and evaluates t3 (?z,p3,o3) r3 X X IP(x) Delivers answer answer (q,R’) Delivers answer answer (q,R’) QEval (q,3,R’,ip(x)) S2,p2,s3 L={s2,s3},R‘={s1,s3} L={s3},R‘={s1} S3,p3,s3

Query chain algorithms Local processing at each chain node 15 n n QEval (q,i,R,ip(x)) L = π X ( σ F (TT) ) e.g : q1=(?s1,p1,o1) L=π 1( σ 2=p1Λ3=o1(TT) ) R’=π Y(R ∞ L ) n’ QEval (q, i+1, R’, ip(x)) nknk nknk QEval (q,k, R’, ip(x)) R’=π Y(R ∞ π X ( σ F (TT) ) Answer (q,R’)

16 Spread By Value Algorithm In SBV the triple pattern t will be stored at the successor nodes of the identifiers Hash(s), Hash(p), Hash(o), Hash(s + p), Hash(s + o), Hash(p + o) and Hash(s + p + o). SBV exploits the values of matching triples found while processing the query incrementally.

SBV algorithm 17 N42 N27 N11 N2 N56 N50 X X N63 N48 N6 N33 T1=Hash(p1) T2=Hash(s2+p2) T3=Hash(s3+o3) r1 r2 r3 Indexing a new triple T4=Hash(p1)

18 SBV Algorithm Evaluating a query q is ?x:(?x,p1,?y), (?y,p2,?z), (?z,p3,o3). q is ?x:(?x,p1,?y), (?y,p2,?z), (?z,p3,o3). x x r1 S (H(p1)) QEval(q, V, u, IP(x)), r1 evaluates (?x,p1,?y) QEval(q, V, u, IP(x)), r1 evaluates (?x,p1,?y)

SBV Algorithm Evaluating a query R1 searches in its TT and finds v ’a={?x=s1,y=s2},v’b={?x=s1,y=o1} q’a=(s2p2,?z)(?z,p3,o3) q’b=(o1p2,?z)(?z,p3,o3) R1 searches in its TT and finds v ’a={?x=s1,y=s2},v’b={?x=s1,y=o1} q’a=(s2p2,?z)(?z,p3,o3) q’b=(o1p2,?z)(?z,p3,o3) r1 r2 S (H(s2+p2)) S (H(p2+01)) QEval(q’a,V,v’a,ip(x)) r4 QEval(q’b,V,v’b,ip(x)) (?x,p1,?y),(?y,p2,z3) t1=(s1,p1,s2) t4=(s1,p1,o1) (?x,p1,?y),(?y,p2,z3) t1=(s1,p1,s2) t4=(s1,p1,o1)

SBV Algorithm Evaluating a query Each node searches in its TT for matching triples (?z,p3,o3 ) v’’a={?x=s1,?y=s2,?z=s3} q’’a=(s3,p3,o3) Each node searches in its TT for matching triples (?z,p3,o3 ) v’’a={?x=s1,?y=s2,?z=s3} q’’a=(s3,p3,o3) r2 r3 S (H(s3+p3+o3)) QEval (q’’a,V,v’’a,ip(x)) x x IP(x) v’’’a={?x=s1,?y=s2,?z=s3} Delivers answer Answer (q,V) v’’’a={?x=s1,?y=s2,?z=s3} Delivers answer Answer (q,V) s2,p2,s3 s3,p3,s3

Comparing the query chains in QC and SBV 21 q is ?x : (s1,p1,?x),(?x,p2,?y),(?y,p3,o3). ?x has four matching values for (s1,p1,?x). ?y has three matching values for each value of ?x. QCSBV

Optimizing network traffic IP cache (IPC) used by our algorithms to significantly reduce network traffic. Another effect of IPC, is that we reduce the routing load incurred by nodes in the network. 22

Experiments A simulator for chord network with java code was done. Our metrics are: 1) The amount of network traffic that is created. 2) how well the query processing load and storage load are distribution among the network nodes. 23

24 Experiments Some Experimental evaluating of the two algorithms,and the result are showed in the following figures (b) IPC effect (a) Traffic cost

Experiments Query processing and storage load distribution 25

Future work Order of triple choosed to evaluate the query. RDFpeers reasoning. 26

27 Reference 1) E. Liarou, S. Idreos, and M. Koubarakis. Evaluating Conjunctive Triple Pattern Queries over Large Structured Overlay Networks. In I. Cruz, S. Decker, D. Allemang, C. Preist, D. Schwabe, P. Mika, M. Uschold, and L. Aroyo. 2) E. Liarou, S. Idreos, and M. Koubarakis. Publish-Subscribe with RDF Data over Large Structured,Overlay Networks. In DBISP2P ’05. 3) Book : Foundations of Databases, Addison-Wesley, ISBN ) I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan. Chord: A Scalable peer-to-peer lookup service for internet applications. 5) E. Prud’hommeaux and A. Seaborn. SPARQL Query Language for RDF

Thank you for your Attention. 28