An Overview of Issues in P2P database systems. Presented by Ahmed Ataullah, Wednesday, November 29th 2006.


1 An Overview of Issues in P2P database systems
Presented by Ahmed Ataullah
Wednesday, November 29th 2006

2 Why mix P2P and databases
More and more intelligent mobile devices
  - Storage capacities of 8 gigabytes and beyond are becoming the norm
  - Most devices are multipurpose and do more than just storage
  - These nodes can often independently connect to other multipurpose devices
P2P systems have a ‘network effect’
  - No special infrastructure required to join (usually)
  - No requirements of availability and reliability
  - Community orientation
Some motivating P2P database examples
  - Provincial health care network
  - Travel agents (worldwide)

3 P2PDBMS – A generally accepted definition
Unmanaged distributed database system
  - Number of nodes > 10^6
  - Most nodes (at least half) are offline at any given time
  - Nodes can leave at any given time and join from different locations
Nodes are independent local database systems as well
  - Each has a local schema and may contribute some local resources (data, processing power, bandwidth etc.)

4 Widely accepted assumptions
No central control
  - No standard schema (FNAME == FIRST_NAME)
  - No standardized local DBMS
Goal-centric communities
  - Peers are co-operative
      - Some work related to game theory has been done with the contrary assumption
  - Location-dependent and location-independent scenarios are treated differently by applications
No reliability, serializability and correctness guarantees. Best effort is acceptable
  - Virtually no access control
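The "no standard schema" assumption (FNAME == FIRST_NAME) can be illustrated with a small sketch. It assumes each peer keeps its own mapping from local column names to a vocabulary it shares with its neighbours; the peer names, the `PEER_MAPPINGS` table and the `to_common` helper are hypothetical and are not taken from the presentation.

```python
# Hypothetical per-peer column mappings onto a shared vocabulary.
# Only FNAME and FIRST_NAME come from the slide; all other names are assumed.
PEER_MAPPINGS = {
    "peer_a": {"FNAME": "first_name", "LNAME": "last_name"},
    "peer_b": {"FIRST_NAME": "first_name", "SURNAME": "last_name"},
}


def to_common(peer_id, row):
    """Rename a peer-local row's columns to the shared vocabulary.

    Columns the peer has no mapping for are dropped rather than guessed.
    """
    mapping = PEER_MAPPINGS[peer_id]
    return {mapping[col]: value for col, value in row.items() if col in mapping}


# Two peers storing the same kind of record under different column names.
rows = [
    to_common("peer_a", {"FNAME": "Ada", "LNAME": "Lovelace"}),
    to_common("peer_b", {"FIRST_NAME": "Alan", "SURNAME": "Turing", "AGE": 41}),
]
print(rows)
```

Without central control there is no authority to publish such a vocabulary, so in practice mappings would more likely be agreed pairwise between neighbouring peers; the single table here only keeps the sketch short.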

5 P2P Database Management Systems
What it boils down to…
  - File sharing, formalized and taken up a notch
  - Our objective is to port everything from the relational world (tables, constraints, foreign keys, materialized views, triggers etc.) into a highly scalable and loosely connected network of database systems
Why is that so difficult?

6 The Query Processing Nightmare
    SELECT MIN(PRICE), DATE, FLIGHT_NUMBER
    FROM FLIGHTS NATURAL JOIN AVAILABILITY
    WHERE ORIGIN = 'TORONTO' AND DESTINATION = 'LONDON'
Schema issues
  - Schemas may not agree
  - Knowledge may not be consistent: Toronto = YYZ and London = LHR or LGW, etc.
Correctness
  - Have to look at every peer
  - Not possible? Alternative solutions?
Response time
  - Most accurate answer up to a certain point in time
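The correctness and response-time points above can be sketched as a best-effort minimum: ask the peers that are online, normalize city names, and stop when a time budget runs out. This is a minimal simulation under stated assumptions, not a method from the presentation. The `Peer` class, the simulated `latency_ms` cost, the alias table and all sample flights are made up for illustration.

```python
from dataclasses import dataclass, field

# Assumed alias table, following the slide's example (Toronto = YYZ, London = LHR or LGW).
CITY_ALIASES = {"YYZ": "TORONTO", "LHR": "LONDON", "LGW": "LONDON"}


def canon(city):
    """Map an airport code or city name to one canonical city name."""
    key = city.strip().upper()
    return CITY_ALIASES.get(key, key)


@dataclass
class Peer:
    name: str
    online: bool
    latency_ms: int  # simulated cost of asking this peer
    # Each flight is (origin, destination, price, date, flight_number).
    flights: list = field(default_factory=list)


def cheapest(peers, origin, destination, budget_ms):
    """Return (best_flight_or_None, peers_answered) within a simulated time budget."""
    best, spent, answered = None, 0, 0
    for peer in peers:
        if not peer.online:
            continue  # offline peers cannot contribute to the answer
        if spent + peer.latency_ms > budget_ms:
            break  # budget exhausted: return the best answer seen so far
        spent += peer.latency_ms
        answered += 1
        for o, d, price, date, number in peer.flights:
            if canon(o) == canon(origin) and canon(d) == canon(destination):
                if best is None or price < best[0]:
                    best = (price, date, number)
    return best, answered


# Made-up sample data; the cheapest fare sits on an offline peer.
peers = [
    Peer("agent1", True, 40, [("Toronto", "London", 820, "2006-12-01", "F1")]),
    Peer("agent2", False, 10, [("YYZ", "LHR", 500, "2006-12-01", "F2")]),
    Peer("agent3", True, 50, [("YYZ", "LGW", 640, "2006-12-02", "F3")]),
    Peer("agent4", True, 200, [("YYZ", "LHR", 600, "2006-12-03", "F4")]),
]
quick = cheapest(peers, "TORONTO", "LONDON", budget_ms=100)
patient = cheapest(peers, "TORONTO", "LONDON", budget_ms=1000)
print(quick, patient)
```

The two calls return different "minimum" prices for the same query, and neither sees the fare held by the offline peer, which is the sense in which the answer is only the most accurate one available up to a certain point in time.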

7 The Query Processing Nightmare
    SELECT MIN(PRICE), DATE, FLIGHT_NUMBER
    FROM FLIGHTS NATURAL JOIN AVAILABILITY
    WHERE ORIGIN = 'TORONTO' AND DESTINATION = 'LONDON'
Data placement issues
  - A correct answer may have to be derived
  - May require coordination among peers
Local vs. remote processing
  - Dynamic coordination rules
  - Is bandwidth more available or processing power?
Cyclic nature of networks
  - Query propagation and update requests (and all other algorithms) have to be bounded
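One common way to bound propagation in a cyclic network is a hop limit (TTL) combined with duplicate suppression. The sketch below simulates that centrally, assuming a hypothetical `neighbours` adjacency map; in a real peer-to-peer network each peer would instead remember the query identifiers it has already forwarded. The slides do not prescribe this particular technique.

```python
from collections import deque


def flood(neighbours, start, ttl):
    """Return the peers a query reaches from `start` within `ttl` hops.

    Each peer is visited at most once, so cycles cannot cause endless forwarding.
    """
    seen = {start}
    queue = deque([(start, ttl)])
    reached = []
    while queue:
        peer, hops_left = queue.popleft()
        reached.append(peer)
        if hops_left == 0:
            continue  # hop budget used up: do not forward further
        for nxt in neighbours.get(peer, ()):
            if nxt not in seen:  # duplicate suppression
                seen.add(nxt)
                queue.append((nxt, hops_left - 1))
    return reached


# Hypothetical overlay containing the cycles A-B-C-A and A-C-D-E-A.
overlay = {
    "A": ["B", "C"],
    "B": ["C", "A"],
    "C": ["A", "D"],
    "D": ["E"],
    "E": ["A"],
}
near = flood(overlay, "A", ttl=2)
everyone = flood(overlay, "A", ttl=10)
print(near, everyone)
```

The hop limit trades coverage for cost: with `ttl=2` peer E is never asked, so any answer derived from this propagation is again best effort.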

8 The Query Optimization Nightmare
    SELECT MIN(PRICE), DATE, FLIGHT_NUMBER
    FROM FLIGHTS NATURAL JOIN AVAILABILITY
    WHERE ORIGIN = 'TORONTO' AND DESTINATION = 'LONDON'
Redundancy issues
  - Same flight and price but different date?
Materialized views
  - How often do we update these views?
Update propagation
  - A problem for offline peers (push/pull strategy)
Inserts and deletes
  - Is every item unique?
  - Ownership model

9 Other issues which need attention
    SELECT MIN(PRICE), DATE, FLIGHT_NUMBER
    FROM FLIGHTS NATURAL JOIN AVAILABILITY
    WHERE ORIGIN = 'TORONTO' AND DESTINATION = 'LONDON'
Semantic optimization
  - Not very well studied
  - Must have a well-designed model
Fairness
  - Can one agent lie about his/her ticket prices?
  - Incentives and detection mechanisms
Access control
  - Can it be offered at a high granularity? Consequences?

10 Conclusion (lessons learnt)
P2P database systems are more than just database engines with networking modules above them
A lot more work can be done in various sub-areas
  - A minor tweak or assumption change can often lead to surprisingly different results
  - Interesting ideas like semantic query optimization, fine-grained access control, fairness and control-related issues have not been addressed
  - The need to do so has perhaps also not been recognized

