Implementing Database Coordination in P2P Networks * Ilya Zaihrayeu SemPGRID-04, 18 May 2004, New York, USA * work with Fausto Giunchiglia.

Why P2P Databases
P2P data sharing: files … relational data?
File sharing: KaZaa + Morpheus = more than 460 million downloads (download.com, May 2004)
P2P databases: only academic testbeds so far
Promises: a large-scale, fault-tolerant multi-database system with low start-up and maintenance costs and high “output” for each individual party
Difficulties: existing data integration solutions are not applicable because of their centralized nature
Challenges: new methodologies, theories, algorithms, models, mechanisms and tools need to be developed

Why P2P Databases, cont’d
Application: non-performance-critical domains where the local autonomy of each party is essential
Medical care scenario
–John goes skiing and has an accident
–John is taken to a local clinic for treatment; the doctors need to know whether any drugs are contraindicated for him
–John does not know these details, but his database layer has a link to his family doctor’s databases
Cooperating real estate agents example
–Agents coordinate their data to push sales
–When on site with a customer who wants to sell, an agent updates his database and makes the data available to other agents
–When on site with a customer who may want to buy, an agent shows details from his database and may query other agents’ databases
Other examples: scientific databases (genomic data), tourism, etc.

Data Coordination Model
Interest Groups: groups of peers able to answer queries about a certain topic
–e.g., group topics “Tourism in Trentino”, “Real Estate in Scotland”, etc.
–each Interest Group has a group manager (GM), which helps maintain the group
Acquaintances: “known” nodes that contribute data
–acquaintance query: a query over the relations of an acquaintance whose results satisfy some local relation
Correspondence Rules: solve the heterogeneity problem at the instance level
–semantic heterogeneity at the structure level is solved by acquaintance queries
Coordination Rules: coordinate data (queries and updates) with acquaintances

Interest Groups
Help to cope with a large number of nodes by clustering the network
Nodes self-organize into interest groups
A node may form a child interest group
One node may belong to multiple groups
Schema matching is used to monitor group constitution
The GM supports group constitution, “talks” to other GMs and provides information about the group to newcomers
[Figure: topic hierarchy rooted at “All topics”; topics shown include Arts, Shopping, Movies, Music, Publications, Computers, Lyrics, Books, …]

Acquaintance query
An acquaintance query is a conjunctive query: q(X) :- r_1(X_1), …, r_n(X_n)
–q(X): the head, which refers to a local relation
–r_1(X_1), …, r_n(X_n): the subgoals of the body, which refer to relations of an acquaintance, plus comparison predicates
–X, X_1, …, X_n: variables or constants
E.g., P1: films (title, year, genre) :- P2: movie (title, year, director); genres (title, genre); year>1995
[Figure: four nodes (1 to 4) holding relations A, B, C, D, E, F, G and I, connected by the acquaintance queries I :- A,B; B :- C,D; D :- I,G; C :- E,F; F :- G. The queries for I, B and D form a loop.]
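The films example above can be read as a join plus a selection. The following is a minimal sketch of that reading, not the system's implementation: the rows are made-up toy data and the function name `films_from_p2` is hypothetical.

```python
# Toy rows for the two relations of acquaintance P2 (illustrative data only).
movie = [
    ("Cast Away", 2000, "R. Zemeckis"),
    ("Forrest Gump", 1994, "R. Zemeckis"),
]
genres = [
    ("Cast Away", "Drama"),
    ("Forrest Gump", "Drama"),
]


def films_from_p2(movie_rows, genre_rows, min_year=1995):
    """Evaluate films(title, year, genre) :-
    movie(title, year, director), genres(title, genre), year > min_year."""
    result = set()
    for title, year, _director in movie_rows:
        if year <= min_year:            # comparison predicate: year > 1995
            continue
        for g_title, genre in genre_rows:
            if g_title == title:        # join on the shared variable `title`
                result.add((title, year, genre))
    return result


films = films_from_p2(movie, genres)
print(sorted(films))
```

The head variables (title, year, genre) determine which columns are kept; `director` appears only in the body and is dropped.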

Correspondence Rules and Coordination Rules
Correspondence rules define how constants from the local domain are translated into constants in the domain of an acquaintance (forward translation) and vice versa (backward translation)
–the two translations are not necessarily symmetric, e.g. currency translation
Coordination rules coordinate data with acquaintances and acquainted nodes
–they are activated by the user (user query) or from the network (network query, results, update)
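Forward and backward translation can be pictured as two separate mappings per attribute. The sketch below is only an illustration under assumed values: the exchange rates, the Italian genre labels and the function names `forward` and `backward` are hypothetical and do not come from the slides.

```python
# Hypothetical correspondence rules between a local domain (prices in EUR,
# English genre labels) and an acquaintance's domain (prices in USD,
# Italian genre labels). All values are assumptions for illustration.
FORWARD_RATE = 1.20    # local EUR -> acquaintance USD (assumed)
BACKWARD_RATE = 0.82   # acquaintance USD -> local EUR (assumed, not 1/1.20)

GENRE_FORWARD = {"Drama": "Drammatico", "Action": "Azione"}
GENRE_BACKWARD = {remote: local for local, remote in GENRE_FORWARD.items()}


def forward(attr, value):
    """Translate a local constant into the acquaintance's domain."""
    if attr == "price":
        return value * FORWARD_RATE
    if attr == "genre":
        return GENRE_FORWARD[value]
    return value  # attributes without a rule pass through unchanged


def backward(attr, value):
    """Translate an acquaintance's constant into the local domain."""
    if attr == "price":
        return value * BACKWARD_RATE
    if attr == "genre":
        return GENRE_BACKWARD[value]
    return value


# A table-based rule round-trips exactly; the currency rule does not.
print(backward("genre", forward("genre", "Drama")))
print(backward("price", forward("price", 100.0)))
```

Keeping the two directions as independent rules, rather than deriving one from the other, is what allows asymmetric cases such as currency conversion.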

Algorithmic notes
Query answering algorithm
–Use acquaintance queries and correspondence rules to translate queries and data
–Propagate a query to acquaintances if the acquaintance queries are relevant
–Compute only new tuples and reconcile results
–Handle loops in query propagation by defining a termination point (no propagation through acquaintance queries that have already been used)
“Getting acquainted” protocol
–Retrieve the database schemas and apply a matching operator to them
–Based on the matching results, generate (with the help of the user) acquaintance queries and correspondence rules, and tune the coordination rules
Updates handling (work with E. Franconi, G. Kuper, A. Lopatenko)
–Data may go through a loop more than once, so a termination point must be defined
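The termination rule for query propagation can be sketched as follows. This is a simplified, centralized simulation under stated assumptions, not the distributed algorithm itself: the network, the acquaintance query ids (`aq01` etc.) and the tuples are invented, and query/data translation is omitted.

```python
from collections import deque

# Hypothetical network: node -> list of (acquaintance_query_id, acquaintance).
# The ids and the loop 0 -> 1 -> 2 -> 0 are made up for illustration.
LINKS = {
    0: [("aq01", 1)],
    1: [("aq12", 2)],
    2: [("aq20", 0)],
}

# Tuples each node can contribute (assumed to be already translated).
LOCAL = {
    0: {("Cast Away",)},
    1: {("Cast Away",), ("Philadelphia",)},
    2: {("Apollo 13",)},
}


def answer(start):
    """Collect answers reachable from `start`, using each acquaintance
    query at most once so that loops in the network terminate."""
    results = set(LOCAL[start])
    used = set()                  # acquaintance queries already used
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for aq_id, acquaintance in LINKS.get(node, []):
            if aq_id in used:
                continue          # termination point: never reuse a query
            used.add(aq_id)
            new_tuples = LOCAL[acquaintance] - results   # only new tuples
            results |= new_tuples
            queue.append(acquaintance)
    return results, used


results, used = answer(0)
print(sorted(results), sorted(used))
```

Because the set of acquaintance queries is finite and each one is used at most once, the loop always ends, even though node 2 points back to node 0.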

Implementing P2P databases on top of JXTA
Benefits
–independence from system platform and networking protocol
–IP-independence (location independence)
–basic building blocks for P2P applications
We implement Interest Groups and Acquaintances in JXTA
We encode database-related functionality into a set of custom JXTA services (DB-related services)
DB-related services
–Node-level services: Queries handler, DB operations, …
–Group-level services: Screening service, GM service, …

Architecture
[Figure, first part: a node. Labels: PDBMS, User Interface (UI), Database Manager (DBM), JXTA Layer, SS, Wrapper, Source Database (SDB), User.]
[Figure, second part: a P2P database network. Labels: User-1, User-2, User-n, nodes on the network.]

Architecture, cont’d
[Figure: internal structure of a node. Labels: User Interface (UI); Wrapper; DBM; Query Planner; Query Propagation; Results Handler; Updates Handler; P2P Management; Coordination Rules; Acquaintances; Acquaintance queries; Correspondence Rules; SS; JXTA Layer; Pipes (In, Out); Discovery; Peer Groups; Services (JXTA Core Services, DB-related services); Advertisements (Peer Adv, Peer Gr. Adv, Gr. topic, Pipe Adv, GM in-pipe adv).]

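Demo: toy databases and topology
Relations:
(1) Movie (title, year, genre)
(2) Credits (name, title, role)
(3) Movie2 (title, year, director)
(4) Genre (title, genre)
[Figure: topology of six nodes (0 to 5), including a rendezvous peer and a mediator peer, with a query Q. Bracketed labels in the figure: [1,2], [2], [2,3,4], [3], [4]. Parenthesized labels in the figure: (1:-1), (2:-2), (3:-3), (4:-4), (1:-3,4), (2:-2), (4:-1).]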
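The four demo relations are enough to illustrate the matching step of the “getting acquainted” protocol. The sketch below uses plain attribute-name equality as the matching operator, which is an assumption made for illustration; the slides do not say which matcher is used, and the function name `match` is hypothetical.

```python
# Schemas of the four demo relations, as listed on the slide.
SCHEMAS = {
    "Movie": ("title", "year", "genre"),
    "Credits": ("name", "title", "role"),
    "Movie2": ("title", "year", "director"),
    "Genre": ("title", "genre"),
}


def match(local, remote_relations):
    """For each attribute of the local relation, list the remote relations
    that offer an attribute with the same name (naive name matching)."""
    return {
        attr: [r for r in remote_relations if attr in SCHEMAS[r]]
        for attr in SCHEMAS[local]
    }


matches = match("Movie", ["Movie2", "Genre"])
print(matches)

# Every attribute of Movie is covered by Movie2 or Genre, so a user could be
# offered an acquaintance query that derives Movie from those two relations.
covered = all(matches[attr] for attr in SCHEMAS["Movie"])
print(covered)
```

If the label (1:-3,4) in the figure is read as “relation 1 is derived from relations 3 and 4”, this is the kind of match that would suggest it; that reading is an interpretation of the figure, not something the slide states.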

Query example 1
“List titles of movies featuring Tom Hanks”
Q(t) :- Credits (n,t,r); n=“Tom Hanks”
[Figure: demo topology, with the same labels as on the “Demo: toy databases and topology” slide]

Query example 2
“Titles of drama movies issued after 1995”
Q(t) :- Movie (t,y,g); g=“Drama”; y>1995;
[Figure: demo topology, with the same labels as on the “Demo: toy databases and topology” slide]

Query example 3
“Names of actors playing in action movies in 2003”
Q(n) :- Movie (t,y,g); Credits (n,t,r); r=“Actor”; g=“Action”; y=2003;
[Figure: demo topology, with the same labels as on the “Demo: toy databases and topology” slide]
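On a single node that holds both Movie and Credits, query example 3 corresponds to an ordinary SQL join. The sketch below runs it on an in-memory SQLite database; the rows are toy data chosen for illustration, and the use of SQLite is an assumption, since the slides do not name the source database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Movie (title TEXT, year INTEGER, genre TEXT);
    CREATE TABLE Credits (name TEXT, title TEXT, role TEXT);
""")

# Toy rows (illustrative only).
conn.executemany("INSERT INTO Movie VALUES (?, ?, ?)", [
    ("The Matrix Reloaded", 2003, "Action"),
    ("The Matrix", 1999, "Action"),
    ("Mystic River", 2003, "Drama"),
])
conn.executemany("INSERT INTO Credits VALUES (?, ?, ?)", [
    ("Keanu Reeves", "The Matrix Reloaded", "Actor"),
    ("Keanu Reeves", "The Matrix", "Actor"),
    ("Sean Penn", "Mystic River", "Actor"),
    ("Clint Eastwood", "Mystic River", "Director"),
])

# Q(n) :- Movie(t,y,g); Credits(n,t,r); r="Actor"; g="Action"; y=2003
rows = conn.execute("""
    SELECT DISTINCT c.name
    FROM Movie AS m
    JOIN Credits AS c ON c.title = m.title   -- shared variable t
    WHERE c.role = 'Actor' AND m.genre = 'Action' AND m.year = 2003
    ORDER BY c.name
""").fetchall()

actors = [name for (name,) in rows]
print(actors)
conn.close()
```

In the P2P setting the two relations may live on different nodes, in which case the join inputs would first have to be gathered through acquaintance queries.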

References
F. Giunchiglia and I. Zaihrayeu. Making peer databases interact - a vision for an architecture supporting data coordination. 6th International Workshop on Cooperative Information Agents (CIA-2002), Madrid, Spain, September 18-20, 2002.
P. Bernstein, F. Giunchiglia, A. Kementsietsidis, J. Mylopoulos, L. Serafini, and I. Zaihrayeu. Data management for peer-to-peer computing: A vision. WebDB, 2002.
A. Halevy, Z. Ives, D. Suciu, and I. Tatarinov. Schema mediation in a peer data management system. ICDE, 2003.
V. Kantere, I. Kiringa, J. Mylopoulos, A. Kementsietsidis, and M. Arenas. Coordinating peer databases using ECA rules. DBISP2P, September 2003.
E. Franconi, G. Kuper, A. Lopatenko, and I. Zaihrayeu. The coDB Robust Peer-to-Peer Database System. Proc. of the 2nd Workshop on Semantics in Peer-to-Peer and Grid Computing (SemPGrid'04), 2004.
JXTA project, see http://www.jxta.org

Announcement Submission deadline: 30 June, 2004 www.p2pkm.org

Thank you
