Published by Constance Moody. Modified over 9 years ago.
1
Peer-to-Peer Data Integration Using Distributed Bridges
Neal Arthorne, B.Eng. Computer Systems (2002)
Supervisor: Babak Esfandiari
April 12, 2005
Candidate Thesis for M.A.Sc. in Electrical Engineering
2
Introduction
- Multiple autonomous, heterogeneous data sources
  - E.g. chemistry and genetics databases, digital repositories, astronomy databases
- Data is distributed and network-accessible
- Each data source may use a different syntax or query language (SQL, Web Services, etc.)
3
Related Work
- Federated database systems [Sheth, 1990]
  - Global federated schema
- Mediator approach [Wiederhold, 1992]
  - Databases wrapped in a software layer that translates to a common information model
  - Middleware lies between user applications and data sources
- Theoretical description of data integration
  - Schemas mapped with first-order logic (FOL) statements, using the local-as-view (LAV) or global-as-view (GAV) approach
4
Related Work Cont'd
- OWL/RDF/RDFS used to describe semantic relationships (ontologies)
- Peer-to-Peer Data Integration: WWW approach to integrating data
  - PIAZZA (Halevy et al.), Lenzerini, Franconi
  - Focused on query optimization and decidability in FOL systems
5
Limitations of Related Work
- Global shared schemas are fragile and not scalable
  - Centrally located and administered
  - Changes affect all component databases or middleware
- P2P data integration is limited
  - Semantic differences not addressed
  - Centrally stored mappings
  - Large databases not compatible with centralized metadata
6
Proposed Solutions
- User-contributed mappings between schemas (bridges)
- Fully decentralized distribution of mappings
  - Anyone can publish a new mapping
  - No global schema means improved scalability
- Provide semantic mappings for data
- Distributed searching compatible with large databases
- Use existing Universal Peer-to-Peer (U-P2P) framework
7
Universal Peer-to-Peer (U-P2P)
- Peers share XML metadata with binary attachments
- Communities formed around a shared XML Schema
- Community itself is published: anyone can create a community
- Flexible deployment: pluggable Network Adapters
[Diagram: Community (example: Book Community), Resource (example: War & Peace) and Attachments (example: file://...), linked with multiplicities 1 and 0..*]
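The community/resource/attachment structure on this slide can be pictured with a small data model. This is only an illustrative sketch, not U-P2P's actual classes: the names `Resource`, `Community`, `publish` and the `root` community are hypothetical, the schema and metadata strings are placeholders, and modelling a published community as a resource in a root community is an assumption about how "community itself is published" might work.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """One shared object: XML metadata plus zero or more attachments."""
    title: str
    metadata_xml: str
    attachments: List[str] = field(default_factory=list)  # attachment URIs


@dataclass
class Community:
    """A group of resources that all follow one shared XML Schema."""
    name: str
    schema_xsd: str  # placeholder text standing in for the shared schema
    resources: List[Resource] = field(default_factory=list)

    def publish(self, resource: Resource) -> None:
        """Share a resource in this community."""
        self.resources.append(resource)


# Assumption: a community is published as a resource of a root community,
# so creating a community is just another publish operation.
root = Community(name="Root Community", schema_xsd="<schema/>")
books = Community(name="Book Community", schema_xsd="<schema/>")
root.publish(Resource(title=books.name, metadata_xml="<community/>"))

# The slide's example resource, with its (elided) file attachment.
books.publish(Resource(title="War & Peace", metadata_xml="<book/>",
                       attachments=["file://..."]))
```

Keeping the schema on the community rather than on each resource mirrors the slide's point that a community is formed around one shared XML Schema.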
8
P2P Data Integration with U-P2P
- Proposed Bridge Community and bridge schema
  - Anyone can publish a bridge
  - Includes simple semantic relation
  - Attached mappings and/or transforms
- U-P2P modularized for database proxies
- Distributed Network Adapter (Gnutella)
  - Compatible with large databases
  - No central indexing servers
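To illustrate how a search can work with no central indexing server, here is a minimal sketch of Gnutella-style query flooding: each peer checks its own store and forwards the query to its neighbours until a hop limit (TTL) runs out. It is a simplified stand-in, not the U-P2P network adapter. The function `flood_search`, the peer topology and the shared records are all hypothetical, and a plain `seen` set replaces the per-message identifiers that real Gnutella peers use to drop duplicate queries.

```python
from collections import deque
from typing import Dict, List, Tuple


def flood_search(neighbours: Dict[str, List[str]],
                 shared: Dict[str, List[str]],
                 start: str, keyword: str, ttl: int = 3) -> List[Tuple[str, str]]:
    """Breadth-first flood of a keyword query, limited to `ttl` hops."""
    hits = []
    seen = {start}                      # peers that already received the query
    queue = deque([(start, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        # Every peer the query reaches (including the origin) searches locally.
        for item in shared.get(peer, []):
            if keyword.lower() in item.lower():
                hits.append((peer, item))
        if remaining == 0:
            continue                    # hop budget spent, do not forward
        for nxt in neighbours.get(peer, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return hits


# Hypothetical three-peer line topology: A - B - C.
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
shared = {"A": [],
          "B": ["DSpace: thesis on bridges"],
          "C": ["Fedora: thesis archive"]}

one_hop = flood_search(neighbours, shared, "A", "thesis", ttl=1)
two_hops = flood_search(neighbours, shared, "A", "thesis", ttl=2)
```

With `ttl=1` the query only reaches peer B; raising it to 2 also reaches peer C. The index stays on each peer, which is why a large database behind a proxy does not need to export its metadata to a central server.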
9
Shareable Bridges in U-P2P
[Diagram: Resource, Community and Bridge, linked with multiplicities 1, 0..*, 1 and 2; a Bridge carries a semantic relation and data mappings]
10
Example Bridge
DSpace to Fedora bridge
…
Bridge Source: d2a9d6f78dcf91828f68a52f78260e05
Bridge Target: 134d8f8ecd57acb35206b4cd13e38622
OWL Relation: owl:sameAs
Transform attachments: d2a9d6f78dcf91828f68a52f78260e05, da1058314b7d8890fc7df7f879a0a7db, file://…
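A bridge like the one on this slide can be treated as a small record that links two communities. The sketch below is illustrative only: the `Bridge` class and `bridges_from` helper are hypothetical, and reading the first two identifiers on the slide as the source and target community IDs is an assumption based on the order of the slide's callouts. The transform attachments are left out because the slide elides them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Bridge:
    """A published mapping from one community's schema to another's."""
    title: str
    source: str                      # ID of the source community
    target: str                      # ID of the target community
    relation: str                    # semantic relation, e.g. an OWL property
    transforms: Tuple[str, ...] = () # URIs of attached transforms (elided here)


def bridges_from(bridges: List[Bridge], community_id: str) -> List[Bridge]:
    """Return every known bridge that starts at the given community."""
    return [b for b in bridges if b.source == community_id]


# Values copied from the slide; their roles are an assumption.
DSPACE_ID = "d2a9d6f78dcf91828f68a52f78260e05"
FEDORA_ID = "134d8f8ecd57acb35206b4cd13e38622"

known = [Bridge(title="DSpace to Fedora bridge", source=DSPACE_ID,
                target=FEDORA_ID, relation="owl:sameAs")]

reachable = [b.target for b in bridges_from(known, DSPACE_ID)]
```

Because a bridge is just another shared record, a peer can collect bridges from anyone who publishes them and look up which communities are reachable from the one it is currently searching.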
11
Case Study: Digital Repositories
[Diagram: Peers A, B and C host DSpace and Fedora communities; labels: Generic Central Server, Gnutella Protocol, Centralized P2P, Fedora Database Proxy, Bridge]
12
Conclusion
- P2P approach to integration: anyone can create a bridge
- Fully distributed network adapter brings in large data sources via proxies
- Demonstrated integration with digital repositories
  - Simple semantic relationship (OWL)
  - Query translation
13
Future Work
- Manual navigation between schemas
  - Need to automate retrieving bridges
- XPath query translation is limited
  - Need to provide robust query translation modules
  - Translate instance data
- Semantic relationships not exploited
  - Use OWL ontologies to give bridges a context
- Software agents can be introduced to discover and use bridges
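To make the XPath translation point concrete, here is a minimal sketch of one naive way a bridge mapping could rewrite a query: replace the leading location path with its counterpart in the target schema. This is not the thesis's translation module. The function `translate_xpath` and the element names `/item/title`, `/object/label` and so on are hypothetical and are not taken from the DSpace or Fedora schemas.

```python
from typing import Dict


def translate_xpath(query: str, mapping: Dict[str, str]) -> str:
    """Rewrite the leading location path of `query` using `mapping`."""
    # Try longer source paths first so the most specific mapping wins.
    for src in sorted(mapping, key=len, reverse=True):
        # Match only at a step boundary: the end of the query, a further
        # step ("/"), or a predicate ("[").
        if query == src or query.startswith((src + "/", src + "[")):
            return mapping[src] + query[len(src):]
    raise LookupError(f"no mapping covers query: {query}")


# Hypothetical path mapping carried by a bridge.
mapping = {"/item": "/object", "/item/title": "/object/label"}

by_title = translate_xpath("/item/title[contains(., 'bridges')]", mapping)
by_author = translate_xpath("/item/author", mapping)
nested = translate_xpath("/item[title='bridges']", mapping)
```

The last call shows one way such a prefix rewrite can fall short: it yields `/object[title='bridges']`, leaving the element name inside the predicate untranslated. Handling that case needs a real XPath parser rather than string matching.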