PeerDB: A P2P-based System for Distributed Data Sharing Wee Siong Ng, Beng Chin Ooi, Kian-Lee Tan, Aoying Zhou Shawn Jeffery CS294-4 Peer-to-Peer Systems.

PeerDB: A P2P-based System for Distributed Data Sharing
Wee Siong Ng, Beng Chin Ooi, Kian-Lee Tan, Aoying Zhou
Shawn Jeffery
CS294-4 Peer-to-Peer Systems
11/05/03

Slide 2: Overview
- A P2P "database" system
  - Allows content-based search
- No global schema
- Utilizes mobile agents
  - Provides flexibility and extensibility
- Dynamically adjusts topology

Slide 3: Background: P2P vs Distributed Databases

                   P2P                 DDBMS
Membership         Ad-hoc              Controlled
Schema             No global schema    Shared (or at least some way to mediate)
Query result set   Incomplete          Complete
Content location   "Word of mouth"     Shared catalog

Slide 4: BestPeer
- Generic P2P platform
- Mobile Agents
  - Carry code and data
  - Collect stats
  - Security issues?
- Dynamic Reconfiguration
  - How does this compare to Gia?
- Location Independent Global Names Lookup (LIGLO) Servers
  - Small number
  - Provides a global identity for peers and peer status
  - Why not use a DHT/KBR/DOLR?

Slide 5: BestPeer Security
- Private and sharable data
  - Agents only able to access sharable data
  - Does this adequately restrict the power of mobile agents?
- Communications on the wire also encrypted
- What's missing?

Slide 6: Architecture
[Figure: architecture diagram. Recoverable labels: Sharable Data, Local Data, Database]

Slide 7: Schema "Mediation"
- Problems with supporting SQL queries:
  - No global schema information
  - Different nodes could name the same table/attribute differently ("len", "length")
- Solution: User supplies metadata for each relation name and attribute
  - Users expected to do a lot
- Formula based on matching relation keywords and attribute keywords to determine if a query matches a table
- What about other schema mediation work (such as Piazza)?
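
The slide does not give the matching formula itself, so the following is only a minimal sketch of the idea: score a candidate relation by how well its user-supplied keywords overlap with the query's keywords. The function names, the dictionary layout, and the 50/50 weighting between relation-level and attribute-level overlap are all assumptions made for illustration, not PeerDB's actual formula.

```python
def keyword_overlap(query_terms, keywords):
    """Fraction of query terms that appear among the given keywords."""
    query_terms = {t.lower() for t in query_terms}
    keywords = {k.lower() for k in keywords}
    if not query_terms:
        return 0.0
    return len(query_terms & keywords) / len(query_terms)


def match_score(query, relation, rel_weight=0.5):
    """Combine relation-level and attribute-level keyword overlap.

    Both arguments are dicts with "relation_keywords" (a list) and
    "attribute_keywords" (attribute name -> list of keywords).
    The weighting is an assumption, not the formula from the paper.
    """
    rel_part = keyword_overlap(query["relation_keywords"],
                               relation["relation_keywords"])
    attr_scores = []
    for terms in query["attribute_keywords"].values():
        # Each query attribute is credited with its best-matching remote attribute.
        best = max(
            (keyword_overlap(terms, kws)
             for kws in relation["attribute_keywords"].values()),
            default=0.0,
        )
        attr_scores.append(best)
    attr_part = sum(attr_scores) / len(attr_scores) if attr_scores else 0.0
    return rel_weight * rel_part + (1 - rel_weight) * attr_part


query = {"relation_keywords": ["protein"],
         "attribute_keywords": {"length": ["length"]}}
remote = {"relation_keywords": ["protein", "sequence"],
          "attribute_keywords": {"len": ["len", "length"], "name": ["name"]}}
score = match_score(query, remote)
```

In this toy case the remote attribute `len` still matches the query attribute `length` because the remote user listed "length" among its keywords, which is exactly the extra annotation work the slide says users are expected to do.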

Slide 8: Local Query Processing – Phase I
- "Master Agent" coordinates the entire affair
- Check Local Dictionary for matching relations
  - Use the relation matching strategy even for the local DB
- Create "Relation Matching Agents" and flood to all neighbors
- Wait for responses
  - Display results to user as they arrive

Slide 9: Local Query Processing – Phase II
- User selects the relations he/she wants
  - Create a "Data Retrieval Agent"
  - Rewrite query in terms of new relations
- If local, submit SQL to local db
- Contact remote nodes directly to access the data
  - Creates remote join plans locally - optimization?
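
To make "rewrite query in terms of new relations" concrete, here is a rough sketch that swaps the user's relation and attribute names for the names used at the selected remote node. The function name and the name map are hypothetical, and the identifier-level substitution is a deliberate simplification: it would also rewrite matching words inside string literals, which a real rewriter working on a parsed query would avoid.

```python
import re


def rewrite_query(sql, name_map):
    """Replace whole identifiers in `sql` according to `name_map`.

    `name_map` maps the names used in the user's query to the names
    used by the selected remote relation (hypothetical example data).
    """
    def swap(match):
        word = match.group(0)
        return name_map.get(word, word)

    # Match whole identifiers so that "len" inside "length" is left alone.
    return re.sub(r"[A-Za-z_][A-Za-z0-9_]*", swap, sql)


rewritten = rewrite_query(
    "SELECT name, length FROM protein WHERE length > 100",
    {"protein": "prot_seq", "length": "len"},
)
```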

Slide 10: Remote Query Processing
- Phase I: Find relations
  - Relation Matching Agents flood with TTL
  - Check Export Dictionary for a match
- Return matches directly
- Phase II: Get data
  - Data Retrieval Agent submits SQL to DBMS
  - Return data to the requesting node directly
  - Run further data processing before returning
- Again, security issues
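
Phase I can be pictured as a TTL-bounded flood over the overlay. The sketch below simulates it in a single process: the neighbor table, the export dictionaries, and the function name are invented for illustration, and real agents would travel between nodes rather than being driven by one loop. Matches are collected at the origin, mirroring "return matches directly".

```python
from collections import deque


def flood_relation_match(neighbors, export_dicts, origin, keyword, ttl):
    """Simulate flooding a relation-matching request up to `ttl` hops.

    neighbors:    node -> list of neighboring nodes
    export_dicts: node -> {relation name -> set of keywords}
    Returns (node, relation) pairs whose keywords contain `keyword`.
    """
    matches = []
    visited = {origin}
    frontier = deque([(origin, ttl)])
    while frontier:
        node, remaining = frontier.popleft()
        if remaining == 0:
            continue  # TTL exhausted: do not forward any further
        for nxt in neighbors.get(node, []):
            if nxt in visited:
                continue  # each node handles the request at most once
            visited.add(nxt)
            for relation, kws in export_dicts.get(nxt, {}).items():
                if keyword in kws:
                    matches.append((nxt, relation))
            frontier.append((nxt, remaining - 1))
    return matches


neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
export_dicts = {"B": {"proteins": {"protein"}}, "D": {"prot": {"protein"}}}
near = flood_relation_match(neighbors, export_dicts, "A", "protein", ttl=1)
far = flood_relation_match(neighbors, export_dicts, "A", "protein", ttl=3)
```

With a TTL of 1 only the direct neighbor B is consulted; raising the TTL to 3 also reaches D, which illustrates why the result set in a P2P system is incomplete by design.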

Slide 11: Statistics
- Master Agents monitor stats in the network
- Keywords for some relations returned during Phase I
  - Update metadata
- Number of objects returned for selected relations
  - Can be used for topology change decisions
- Use most recently returned results as metric to determine who to connect with
  - Frequent updates: might need to change neighbors after each result returned
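
One way to read the reconfiguration metric is "keep the peers that returned the most objects for the latest query as direct neighbors". The sketch below assumes exactly that policy; the function name, the neighbor limit, and the tie-breaking by peer id are assumptions, not details taken from the slides.

```python
def choose_neighbors(result_counts, max_neighbors):
    """Pick the peers that returned the most objects in the latest query.

    result_counts: peer id -> number of objects returned.
    Ties are broken by peer id so the choice is deterministic.
    """
    ranked = sorted(result_counts.items(), key=lambda item: (-item[1], item[0]))
    return [peer for peer, _ in ranked[:max_neighbors]]


new_neighbors = choose_neighbors({"p1": 3, "p2": 10, "p3": 0, "p4": 7}, 2)
```

Because the ranking is recomputed from the most recent results only, the neighbor set can change after every query, which is the churn concern the slide raises.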

Slide 12: Caches
- Cache all query results locally
- Soft state
- LRU replacement
- Users choose which copy they want
  - Only provided with peer id and an indication of which is the source
  - What about timestamp, etc?
  - Again, user heavily involved
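
For reference, LRU replacement over cached query results can be sketched in a few lines. The class name, the use of the query string as cache key, and the capacity counted in entries (rather than bytes) are assumptions for illustration only.

```python
from collections import OrderedDict


class QueryResultCache:
    """Tiny LRU cache keyed by query text (illustrative, not PeerDB's code)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, query):
        if query not in self._entries:
            return None
        self._entries.move_to_end(query)  # mark as most recently used
        return self._entries[query]

    def put(self, query, rows):
        self._entries[query] = rows
        self._entries.move_to_end(query)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used


cache = QueryResultCache(capacity=2)
cache.put("q1", ["row-a"])
cache.put("q2", ["row-b"])
cache.get("q1")             # q1 becomes the most recently used entry
cache.put("q3", ["row-c"])  # evicts q2, the least recently used entry
```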

Slide 13: Relation Matching Performance
- Significant tradeoff between precision and recall
- Which is more important?
- Is their approach acceptable?
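
As a reminder of what is being traded off, precision is the fraction of returned relations that are relevant, and recall is the fraction of relevant relations that were returned. A small sketch with made-up relation names:

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a set of returned relations."""
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Four relations returned, two of them relevant; one relevant relation missed.
p, r = precision_recall({"a", "b", "c", "d"}, {"a", "b", "e"})
```

A looser matching threshold returns more relations, which tends to raise recall and lower precision, and leaves the user to weed out the bad matches by hand.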

Slide 14: Experimental Methodology
- Compare P2P Model vs Client/Server model
  - CS returns via the search path (?)
- Compare static vs reconfigurable networks
- Compare agent vs message based approach
- 32 Nodes
  - Is this enough?

Slide 15: Evaluation Scenarios (Metrics?)
- Fixed set of nodes
  - Easily test P2P protocols, Reconfiguration strategies
- Latency
- Quality and Quantity
- What else is important?

Slide 16: Performance
- As you increase the amount of storage on each node, latency decreases
  - Due to caching
- In general, reconfiguration performs better
- Response times O(1 minute)
  - Is this acceptable?
- Agent-based approach shown to be better
  - What if an agent produces more data than it processes?

Slide 17: Discussion: A P2P DBMS?
- PeerDB represents a tiny step towards a P2P DB (also PIER, Piazza)
  - What does it do right?
  - What else is needed?
  - Is it ideal to have a P2P DB?
  - Is it feasible?
