On Triple Dissemination, Forward-Chaining and Load Balancing in DHT Based RDF Stores Dominic Battre, Felix Heine, Andre Höing, and Odej Kao


On Triple Dissemination, Forward-Chaining and Load Balancing in DHT Based RDF Stores
Dominic Battre, Felix Heine, Andre Höing, and Odej Kao
Presented by Aldarwich Yaser
Albert-Ludwigs-University Freiburg, SS 2009
Department of Computer Science, Computer Networks and Telematics
Prof. Christian Schindelhauer

Overview
• Motivation
• Introduction
  - RDF
  - DHT
  - Pastry
• Triple dissemination
• Reasoning
• Load balancing
• References

Motivation
• Centralized databases
  - Shortcomings: unable to handle high load; limited capacity (e.g. Sesame, Jena)
• Decentralized databases
  - Examples: BabelPeers, RDFPeers, and Edutella
  - Provide scalability, efficiency, and capacity
• Reasoning
  - Infer new data from existing information
• Load balancing

RDF Introduction
• Resource Description Framework (RDF)
• Used for representing information on the Web
• RDFS provides a powerful model for storing knowledge and drawing inferences from it
• In RDF, everything is represented by triples of the form (S, P, O)
  - Example: Germany (S) hasCapital (P) Berlin (O)

DHT Introduction
• Solves the item location problem in a distributed network of nodes
• A key k is used to calculate the ID: ID = hash(k)
• Operations: Put(k, x), Get(k)
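
The Put/Get interface above can be illustrated with a minimal single-process sketch. Everything here is an assumption made for illustration: the class name `ToyDHT`, the 32-bit ID space, SHA-1 as the hash function, and the successor rule for picking the responsible node (Pastry itself picks the numerically closest node ID).

```python
import hashlib
from bisect import bisect_left

ID_BITS = 32  # assumption: a small ID space keeps the numbers readable


def hash_id(key: str) -> int:
    """Map a key to a position in the ID space: ID = hash(k)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** ID_BITS)


class ToyDHT:
    """In-memory stand-in for a DHT; no networking, no node churn."""

    def __init__(self, node_names):
        # Each node gets its ID by hashing its name; the ring is sorted by ID.
        self.ring = sorted((hash_id(name), name) for name in node_names)
        self.ids = [node_id for node_id, _ in self.ring]
        self.storage = {name: {} for name in node_names}

    def responsible_node(self, key: str) -> str:
        # Successor rule (assumption): first node with ID >= hash(k), wrapping around.
        index = bisect_left(self.ids, hash_id(key)) % len(self.ring)
        return self.ring[index][1]

    def put(self, key: str, value) -> None:
        """Put(k, x): store x on the node responsible for k."""
        self.storage[self.responsible_node(key)].setdefault(key, []).append(value)

    def get(self, key: str):
        """Get(k): return all values stored under k."""
        return list(self.storage[self.responsible_node(key)].get(key, []))
```

A call such as `ToyDHT(["n1", "n2"]).put("Berlin", "x")` followed by `get("Berlin")` on the same object returns the stored value, because both operations hash the key to the same node.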

Triple dissemination
• Triple T = (s, p, o) is stored under three identifiers:
  - identifier = hash(s), at the node responsible for s
  - identifier = hash(p), at the node responsible for p
  - identifier = hash(o), at the node responsible for o
• Query q = (s, p, o): in the example, looked up with identifier = hash(p)
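
A rough sketch of this dissemination scheme follows. The fixed network of four nodes, the modulo placement, and the `None` wildcard in query patterns are assumptions for illustration; the sketch also looks a query up by its first known component, whereas the slide's example uses hash(p).

```python
import hashlib
from collections import defaultdict

NUM_NODES = 4  # assumption: a fixed toy network


def responsible_node(term: str) -> int:
    """Node index responsible for identifier = hash(term)."""
    digest = hashlib.sha1(term.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % NUM_NODES


# node index -> {lookup term -> set of triples stored under that term}
stores = defaultdict(lambda: defaultdict(set))


def disseminate(triple):
    """Store the triple three times: under hash(s), hash(p), and hash(o)."""
    for term in triple:
        stores[responsible_node(term)][term].add(triple)


def query(pattern):
    """Answer a pattern (s, p, o) in which None marks an unknown component."""
    known = [term for term in pattern if term is not None]
    if not known:
        raise ValueError("at least one component must be known")
    key = known[0]  # assumption: route by the first known component
    candidates = stores[responsible_node(key)][key]
    return {
        triple
        for triple in candidates
        if all(want is None or want == got for want, got in zip(pattern, triple))
    }
```

Because every triple is indexed under each of its three components, any pattern with at least one known component can be answered by contacting a single node.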

Pastry Protocol
• Each peer has a 128-bit ID: nodeID
  - Unique and uniformly distributed
  - Computed by applying a cryptographic hash function to the IP address
• A message takes O(log N) steps to reach its destination
• Node state contains:
  - Leaf set
  - Routing table
  - Neighborhood set

Pastry (prefix-matching)
• Route(m, )?
[Figure: prefix-matching route of message m, with node-ID and key labels]
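
Prefix-matching routing can be sketched as below. This is a simplification under stated assumptions: IDs are short hexadecimal strings chosen for illustration, each node's routing state is reduced to a plain list of known nodes, and the leaf set and the numerically-closest fallback of real Pastry are left out. The function names are hypothetical.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that two IDs have in common."""
    length = 0
    for x, y in zip(a, b):
        if x != y:
            break
        length += 1
    return length


def next_hop(current: str, key: str, known_nodes):
    """Pick a known node whose ID shares a strictly longer prefix with the key.

    Returns None when no known node improves on the current node.
    """
    here = shared_prefix_len(current, key)
    better = [n for n in known_nodes if shared_prefix_len(n, key) > here]
    if not better:
        return None
    # Assumption: greedily take the longest match among the candidates.
    return max(better, key=lambda n: shared_prefix_len(n, key))


def route(start: str, key: str, routing_tables):
    """Follow next_hop until no node with a longer shared prefix is known."""
    path = [start]
    while True:
        hop = next_hop(path[-1], key, routing_tables.get(path[-1], []))
        if hop is None:
            return path
        path.append(hop)
```

Each hop strictly lengthens the shared prefix, so the loop terminates after at most as many hops as the ID has digits.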

RDF Reasoning
• Queries are formulated in general terms
• RDFS retrieves data even if the description does not exactly match the query
• Example:
  - Christian fatherOf Schindelhauer
  - fatherOf subPropertyOf relativeOf
  - => Christian relativeOf Schindelhauer

RDFS Rules
Rule name  Preconditions                                             Generated triple
rdfs2      (a, rdfs:domain, x), (u, a, v)                            (u, rdf:type, x)
rdfs3      (a, rdfs:range, x), (u, a, v)                             (v, rdf:type, x)
rdfs5      (u, rdfs:subPropertyOf, v), (v, rdfs:subPropertyOf, x)    (u, rdfs:subPropertyOf, x)
rdfs9      (u, rdfs:subClassOf, x), (v, rdf:type, u)                 (v, rdf:type, x)
rdfs11     (u, rdfs:subClassOf, v), (v, rdfs:subClassOf, x)          (u, rdfs:subClassOf, x)
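
The five rules listed on this slide can be applied by forward chaining until no new triple appears. The sketch below runs on a single in-memory set of triples and compares every pair of triples, which is only practical for small inputs; the function names are assumptions, and the distributed evaluation discussed in the talk is not modelled.

```python
RDF_TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
SUB_PROPERTY = "rdfs:subPropertyOf"
SUB_CLASS = "rdfs:subClassOf"


def apply_rules_once(triples):
    """Return every triple derivable in one step by rdfs2, 3, 5, 9, and 11."""
    derived = set()
    for (s1, p1, o1) in triples:
        for (s2, p2, o2) in triples:
            # rdfs2: (a, rdfs:domain, x), (u, a, v) => (u, rdf:type, x)
            if p1 == DOMAIN and p2 == s1:
                derived.add((s2, RDF_TYPE, o1))
            # rdfs3: (a, rdfs:range, x), (u, a, v) => (v, rdf:type, x)
            if p1 == RANGE and p2 == s1:
                derived.add((o2, RDF_TYPE, o1))
            # rdfs5: (u, subPropertyOf, v), (v, subPropertyOf, x) => (u, subPropertyOf, x)
            if p1 == SUB_PROPERTY and p2 == SUB_PROPERTY and o1 == s2:
                derived.add((s1, SUB_PROPERTY, o2))
            # rdfs9: (u, subClassOf, x), (v, rdf:type, u) => (v, rdf:type, x)
            if p1 == SUB_CLASS and p2 == RDF_TYPE and o2 == s1:
                derived.add((s2, RDF_TYPE, o1))
            # rdfs11: (u, subClassOf, v), (v, subClassOf, x) => (u, subClassOf, x)
            if p1 == SUB_CLASS and p2 == SUB_CLASS and o1 == s2:
                derived.add((s1, SUB_CLASS, o2))
    return derived


def forward_chain(triples):
    """Repeat rule application until a fixpoint is reached."""
    closure = set(triples)
    while True:
        new = apply_rules_once(closure) - closure
        if not new:
            return closure
        closure |= new
```

For instance, given (Student, rdfs:subClassOf, Person) and (alice, rdf:type, Student), rule rdfs9 adds (alice, rdf:type, Person) to the closure.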

Node Architecture
• Each node hosts multiple RDF databases:
  - Local triples database
  - Received triples database
  - Replica database
  - Generated triples database

Triple dissemination in DHT
[Figure: Node1, Node2, Node3, and Node4, each holding Generated Triples, Local Triples, Received Triples, and Replica databases]

Triple life cycle
• Triples are subject to events such as nodes joining and departing
• Triple lifetime
  - Long-lived triples need few refreshes
  - Short-lived triples: generated triples
• Updating triples
  - Inferred triples must be updated as well
• Soft state
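
Soft state generally means that an entry disappears unless it is refreshed in time. A minimal sketch of that idea is shown below; the class name, the explicit `now` argument, and the lifetime values used with it are assumptions for illustration, not values from the talk.

```python
class SoftStateStore:
    """Keeps triples only while their lifetime has not run out."""

    def __init__(self):
        self.expiry = {}  # triple -> time at which it expires

    def refresh(self, triple, now, ttl):
        """Insert a triple or extend its lifetime to now + ttl."""
        self.expiry[triple] = now + ttl

    def expire(self, now):
        """Drop and return all triples whose lifetime has run out."""
        dead = {t for t, deadline in self.expiry.items() if deadline <= now}
        for triple in dead:
            del self.expiry[triple]
        return dead

    def live(self, now):
        """Triples that are still valid at time now."""
        return {t for t, deadline in self.expiry.items() if deadline > now}
```

Passing the current time explicitly keeps the sketch deterministic; a long lifetime models a triple that needs few refreshes, a short one a generated triple that must be re-derived or refreshed often.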

Node Departure
• Node substitution
• Correction of the routing table
• Replica duty
• Decreasing number of replicas
[Figure: nodes n1, n2, n3, n4, n9]

Node Arrival
• More complicated
• Receiving queries
• Task of replica nodes
• Time reduction
[Figure: nodes n1, n2, n3, n4, n6, n9]

Load balancing
• Load imbalance is a major criticism of DHT-based RDF stores
• Many collisions are unavoidable
• Example:
  - The DHT stores many triples with the predicate rdf:type
  - rdfs:subClassOf creates many more triples with the predicate rdf:type
• Overlay tree
  - Built for individual DHT positions, such as the one that stores the triples with rdf:type
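
The collision problem can be made concrete by counting how many triples land on each lookup key. The data and the function name below are made up for illustration; the point is only that every triple sharing a term hashes to the same DHT position, so a better hash function cannot spread that load.

```python
from collections import Counter


def load_per_key(triples):
    """Count how many triples are stored under each term (s, p, or o)."""
    load = Counter()
    for triple in triples:
        for term in set(triple):  # a term occurring twice is indexed once
            load[term] += 1
    return load


# Assumption: toy data in which every resource carries an rdf:type triple.
triples = [
    (f"item{i}", "rdf:type", "Person" if i % 2 == 0 else "City")
    for i in range(6)
]
triples.append(("item0", "knows", "item2"))

load = load_per_key(triples)
hottest_key, hottest_count = load.most_common(1)[0]
```

In this toy data set the key rdf:type receives all six typing triples, while each class receives only three and each resource at most three.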

Load balancing with remote triples database
[Figure: Node1, Node2, Node3, and Node4 with Local Triples, Received Triples, Generated Triples, and Remote Triples databases, linked by references]

Replicated overlay tree
[Figure: overlay tree with levels Root, Rank1, Rank2]

Query routing in overlay tree
[Figure: overlay tree with levels Root, Rank1, Rank2; a Query enters the tree and a Result is returned]

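Handling RDFS rules in load balancing
• Problem with RDFS rules
  - When a node is overloaded, its triples are split across other nodes
  - The preconditions of a rule can then end up on different nodes
  - Example: (a, rdfs:domain, x) and (u, a, v)
[Figure: the triples (a, rdfs:domain, x) and (u, a, v) distributed over Node1, Node2, and Node3]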

Handling RDFS rules in load balancing
• Solution
  - Copy the most common RDFS schema triples to each node in the overlay tree
[Figure: Node1, Node2, Node3, and Node4, each holding (a, rdfs:domain, x) next to its share of the (u, a, v) triples]
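
The effect of copying schema triples can be sketched with rule rdfs2 alone. The round-robin split, the helper names, and the example triples are assumptions for illustration; the talk does not specify how an overloaded node divides its triples.

```python
RDF_TYPE = "rdf:type"
DOMAIN = "rdfs:domain"


def rdfs2_local(triples):
    """Apply rule rdfs2 using only the triples available on one node."""
    derived = set()
    for (a, p, x) in triples:
        if p != DOMAIN:
            continue
        for (u, q, v) in triples:
            if q == a:
                derived.add((u, RDF_TYPE, x))
    return derived


def split(triples, parts):
    """Spread triples round-robin over `parts` nodes (assumed strategy)."""
    ordered = sorted(triples)
    return [set(ordered[i::parts]) for i in range(parts)]


schema = {("enrolledIn", DOMAIN, "Student")}
instances = {
    ("alice", "enrolledIn", "cs101"),
    ("bob", "enrolledIn", "cs102"),
    ("carol", "enrolledIn", "cs103"),
}

# Everything on one node: all preconditions meet locally.
full = rdfs2_local(schema | instances)

# Plain split: the schema triple lands on only one of the nodes.
without_copy = set()
for part in split(schema | instances, 2):
    without_copy |= rdfs2_local(part)

# Split the instance triples only and copy the schema triple to every node.
with_copy = set()
for part in split(instances, 2):
    with_copy |= rdfs2_local(part | schema)
```

Here `without_copy` misses the inferences for triples that were separated from the schema triple, while `with_copy` matches the result computed on a single node.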

Conclusion
• P2P-based distributed databases offer better scalability and source integration
• The real power of RDF stems from the possibility of deriving new data from explicit knowledge
• The overlay tree is the proposed solution to the overload problem

References
• Battre, Heine, Kao: Top k RDF query evaluation in P2P

Thanks for your Attention