SPIDER: a Schema Mapping Debugger
Bogdan Alexe, Laura Chiticariu, Wang-Chiew Tan
University of California, Santa Cruz

Schema Mappings and Data Exchange

- Schema mappings are logical assertions that describe the correspondence between two schemas
  - Higher-level, declarative programming constructs
  - Hide implementation details, allow for optimizations
  - Key elements in data exchange and data integration systems
- Data Exchange [FKMP03]
  - Translate data conforming to a source schema S into data conforming to a target schema T so that the schema mapping M is satisfied

[Figure: source instance I over source schema S, target instance J over target schema T, schema mapping M between the schemas, XQuery/SQL/Java at the implementation level]

Debugging Schema Mappings

The process of exploring, understanding and refining a schema mapping through the use of (test) data, at the level of schema mappings.

Debugging a Data Exchange

- Approach 1: At the level of the implementation (unsatisfactory)
  - Specific to the exchange engine
  - Specific to the implementation language, e.g., XQuery, XSLT, etc.
  - Commercial tools available: Altova MapForce, Stylus Studio, etc.
- Approach 2: At the level of the schema mapping (desirable)
  - Currently, NO SUPPORT!
- Motivation for debugging at the level of the schema mappings:
  - Uniformity in specifying and debugging
  - Reduce programming effort by allowing a user to specify and debug at the level of schema mappings

Example

Source schema: MANHATTAN CREDIT
  CardHolders: cardNo, limit, ssn, name
  Dependents: accNo, ssn, name

Target schema: FARGO FINANCE
  Accounts: accNo, creditLine, accHolder
  Clients: ssn, name

The schema figure also marks a foreign key fk1 and the dependencies D1, D2 and C1.

Source instance I

  CardHolders
  cardNo  limit  ssn  name
  123     $15K   ID1  Alice
  456     $7K    ID2  Bob

  Dependents
  accNo  ssn  name
  123    ID2  Bob

Target instance J (a solution for I under the schema mapping)

  Accounts
  accNo  creditLine  accHolder
  123    L1          ID1
  A2     L2          ID2
  456    L3          ID2

  Clients
  ssn  name
  ID1  Alice
  ID2  Bob

Source-to-target dependencies

  D1: foreach s0 in MANHATTAN-CREDIT.CardHolders
      exists t0 in FARGO-FINANCE.Accounts, t1 in FARGO-FINANCE.Clients
      where t0.accHolder = t1.ssn
      with s0.cardNo = t0.accNo and s0.ssn = t0.accHolder and s0.name = t1.name

  D2: foreach s0 in MANHATTAN-CREDIT.Dependents
      exists t0 in FARGO-FINANCE.Clients
      with s0.ssn = t0.ssn and s0.name = t0.name

Target dependencies

  C1: foreach s0 in FARGO-FINANCE.Clients
      exists t0 in FARGO-FINANCE.Accounts
      with s0.ssn = t0.accHolder

Features

- Computing routes for selected source or target data
  - Compute all routes
  - Compute one route
  - Compute alternative routes on demand
  - Guided exploration of all routes
- Standard debugging features
  - Breakpoints on dependencies
  - Watch windows: zoom into details about each step in the routes
- Schema-level exploration of routes
  - Facilitates the understanding of schema mappings directly at the level of source and target schemas
- Implementation details
  - On top of the Clio data exchange system
  - Supports relational and XML schema mappings
  - Schema mapping language: XSML (XML Schema Mapping Language)

Main Idea: Debugging Schema Mappings with Routes

- Routes illustrate the relationship between source and target data with the schema mapping
  - Declarative semantics, based on the logical satisfaction of the dependencies
  - Independent of any implementation of the schema mapping
  - Concept applies to any mapping-based data exchange or data integration system

Debugging scenario 1: Unknown credit limit?

$15K is not copied over to the target.

[Figure: a route for the Accounts tuple (123, L1, ID1): D1 applied to a CardHolders tuple yields Accounts (123, L1, ID1) and Clients (ID1, Alice)]

Corrected dependency:

  D'1: foreach s0 in MANHATTAN-CREDIT.CardHolders
       exists t0 in FARGO-FINANCE.Accounts, t1 in FARGO-FINANCE.Clients
       where t0.accHolder = t1.ssn
       with s0.cardNo = t0.accNo and s0.ssn = t0.accHolder and s0.name = t1.name
            and s0.limit = t0.creditLine

Debugging scenario 2: Unknown account number?

123 is not copied to the target as Bob's account number.

[Figure: route for the Accounts tuple (A2, L2, ID2): Dependents (123, ID2, Bob) yields Clients (ID2, Bob) via D2, which yields Accounts (A2, L2, ID2) via C1]

Corrected dependency:

  D'2: foreach s0 in MANHATTAN-CREDIT.Dependents, s1 in MANHATTAN-CREDIT.CardHolders
       where s0.accNo = s1.cardNo
       exists t0 in FARGO-FINANCE.Clients, t1 in FARGO-FINANCE.Accounts
       where t1.accHolder = t0.ssn
       with s0.ssn = t0.ssn and s0.name = t0.name
            and s1.cardNo = t1.accNo and s1.limit = t1.creditLine

[Figure: forest of routes for the Accounts tuple (A2, L2, ID2), built from Clients (ID2, Bob), Dependents (123, ID2, Bob), a CardHolders tuple, and the dependencies D1, D2 and C1]

Routes obtained from the forest:

  1. Dependents (123, ID2, Bob) yields Clients (ID2, Bob) via D2; Clients (ID2, Bob) yields Accounts (A2, L2, ID2) via C1.
  2. CardHolders (456, $7K, ID2, Bob) yields Accounts (456, L3, ID2) and Clients (ID2, Bob) via D1; Clients (ID2, Bob) yields Accounts (A2, L2, ID2) via C1.

Schema-level exploration of routes

- Illustrate the schema mapping at the level of the source and target schemas

[Figure: the MANHATTAN CREDIT and FARGO FINANCE schemas with fk1, D1, D2 and C1, and a selected schema element highlighted]

Compute All Routes and Compute One Route

- Compute all routes
  - For each selected target tuple t, consider every possibility for witnessing t.
  - Do not consider the same tuple twice.
  - Complete, polynomial-time algorithm
  - The route forest is a polynomial representation of all routes (possibly exponentially many) for the selected tuples
  - Computation can be user-guided, or stopped with breakpoints on dependencies
- Compute one route
  - Non-exhaustive: adapts compute all routes to stop when one witness is found
  - Inference procedure: deduces all consequences of a proven tuple and avoids recomputation of "branches"
  - Complete, polynomial-time algorithm

[Figure: SPIDER takes the source instance I, source schema S, target schema T, target instance J, schema mapping M and a tuple selection, e.g. Accounts (A2, L2, ID2), and produces routes]

Towards a full-fledged debugger

SPIDER is the first prototype debugger for schema mappings.

- Future work
  - Extension to handle nested schema mappings
  - Adapt the target instance with changes in the schema mapping
- Acknowledgements
  - Daniel Pepper, UC Santa Cruz
  - The Clio team in IBM Almaden Research Center
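To make the notion of a route concrete, the following Python sketch searches backwards from a selected target tuple of the example until it reaches source tuples. It is only an illustration under stated assumptions: it is not SPIDER's algorithm (no route forest, no inference procedure), the dependencies D1, D2 and C1 are hard-coded by hand, and the names `rows`, `witnesses` and `one_route` are hypothetical helpers introduced here.

```python
# Illustrative sketch only: a naive backward search for one route.
# Facts are (relation, values) pairs; L1, L2, L3 and A2 are the labeled
# nulls of the example target instance, written as plain strings.

I = [  # source instance (MANHATTAN CREDIT)
    ("CardHolders", ("123", "$15K", "ID1", "Alice")),
    ("CardHolders", ("456", "$7K", "ID2", "Bob")),
    ("Dependents", ("123", "ID2", "Bob")),
]
J = [  # target instance (FARGO FINANCE)
    ("Accounts", ("123", "L1", "ID1")),
    ("Accounts", ("A2", "L2", "ID2")),
    ("Accounts", ("456", "L3", "ID2")),
    ("Clients", ("ID1", "Alice")),
    ("Clients", ("ID2", "Bob")),
]


def rows(instance, relation):
    """Return the value tuples stored for one relation."""
    return [values for rel, values in instance if rel == relation]


def witnesses(fact):
    """Yield (dependency, premises) pairs whose conclusion contains `fact`."""
    rel, values = fact
    # D1: CardHolders(c, lim, s, n) -> Accounts(c, _, s) and Clients(s, n)
    for c, lim, s, n in rows(I, "CardHolders"):
        accounts = [("Accounts", a) for a in rows(J, "Accounts")
                    if a[0] == c and a[2] == s]
        client = ("Clients", (s, n))
        if accounts and client in J and fact in accounts + [client]:
            yield "D1", [("CardHolders", (c, lim, s, n))]
    # D2: Dependents(a, s, n) -> Clients(s, n)
    for a, s, n in rows(I, "Dependents"):
        if fact == ("Clients", (s, n)):
            yield "D2", [("Dependents", (a, s, n))]
    # C1: Clients(s, n) -> Accounts(_, _, s)
    for s, n in rows(J, "Clients"):
        if rel == "Accounts" and values[2] == s:
            yield "C1", [("Clients", (s, n))]


def one_route(fact, seen=frozenset()):
    """Return a list of steps (dependency, premises, conclusion) that derives
    `fact` from source tuples, or None if no route is found."""
    if fact in seen:  # do not consider the same tuple twice on one branch
        return None
    for dep, premises in witnesses(fact):
        steps = []
        for p in premises:
            if p in I:  # source tuples need no justification
                continue
            sub = one_route(p, seen | {fact})
            if sub is None:
                break
            steps += sub
        else:
            return steps + [(dep, premises, fact)]
    return None


route = one_route(("Accounts", ("A2", "L2", "ID2")))
for dep, premises, conclusion in route:
    print(dep, premises, "=>", conclusion)
```

For the tuple Accounts (A2, L2, ID2), this sketch returns the route that goes through D1 and then C1, because it tries D1 first when justifying Clients (ID2, Bob); the route through D2 and C1 is equally valid, and an exhaustive variant would report both.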
