1
Using Schema Matching to Simplify Heterogeneous Data Translation
Tova Milo, Sagit Zohar
Tel Aviv University
2
Introduction
There are large amounts of data available on the Web, but the format of the data is not homogeneous.
Most applications can handle only one or a small number of formats.
There is a need to translate data from one format to another.
3
Introduction
Two approaches to translating data:
A specific program that translates from format A to format B (e.g. LaTeX to HTML).
Data translation languages.
4
Introduction
The solution: TranScm
A data translation system.
Automatically translates a portion (often a large portion) of the desired data.
Does not replace data translation languages, but reduces the amount of programming needed in them.
5
TranScm Architecture
Components shown in the diagram: Rule Base, Matching Module, Typing Module, GUI, Input Schema, Output Schema, Import/Export Library.
6
Data Model
Tree (forest) model.
Similar to OEM.
Allows an order on children.
Can handle cyclic structures using ids as “pointers”.
7
Data Model
Example tree from the diagram: an Article node with children title (“Conceptual Concepts”), authors (author values “Al Gore Ithm” and “G WWW Bush”), and sections.
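The tree model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not TranScm's actual representation: the class `Node`, the fields `node_id` and `ref`, the helper `index_by_id`, and the `see` back-reference are all hypothetical names introduced here.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    label: str
    value: Optional[str] = None                       # atomic value for leaf nodes
    children: list[Node] = field(default_factory=list)  # a list preserves child order
    node_id: Optional[str] = None                     # hypothetical id field
    ref: Optional[str] = None                         # id of another node, used as a "pointer"


def index_by_id(root: Node) -> dict[str, Node]:
    """Collect id -> node so references resolve without cyclic object links."""
    found: dict[str, Node] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.node_id is not None:
            found[node.node_id] = node
        stack.extend(node.children)
    return found


# The Article example from the slide, plus an assumed back-reference
# to show how a cycle is expressed through an id instead of a tree edge.
article = Node("Article", node_id="a1", children=[
    Node("title", value="Conceptual Concepts"),
    Node("authors", children=[
        Node("author", value="Al Gore Ithm"),
        Node("author", value="G WWW Bush"),
    ]),
    Node("sections", children=[Node("see", ref="a1")]),
])
ids = index_by_id(article)
```

Because the cycle lives in the `ref` string rather than in the `children` lists, the structure itself stays a tree and can be traversed without cycle detection.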
8
Schema Model
Labeled graphs.
Some nodes may be ordered.
Each vertex is a schema element (type).
Labels carry information about the node.
9
Schema Model Article [3] title [1] string authors [0,…,->] author [1] ref string sections [2]
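A labeled schema graph of this kind might be represented as follows. This is a rough sketch under assumptions: the `SchemaElement` class, the `kind` label, and the edge from `sections` back to `Article` are invented for illustration and do not reproduce the bracketed annotations on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaElement:
    name: str
    labels: dict = field(default_factory=dict)    # information about the node
    children: list = field(default_factory=list)  # names of child schema elements
    ordered: bool = False                         # whether child order matters


def reachable(schema, start):
    """List element names reachable from start, visiting each vertex once."""
    seen = []
    stack = [start]
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # already visited, so cycles in the graph terminate
        seen.append(name)
        stack.extend(reversed(schema[name].children))
    return seen


schema = {
    "Article": SchemaElement("Article", {"kind": "record"},
                             ["title", "authors", "sections"], ordered=True),
    "title": SchemaElement("title", {"kind": "string"}),
    "authors": SchemaElement("authors", {"kind": "list"}, ["author"], ordered=True),
    "author": SchemaElement("author", {"kind": "string"}),
    # Hypothetical edge back to Article, only to show that the graph may be cyclic.
    "sections": SchemaElement("sections", {"kind": "list"}, ["Article"]),
}
order = reachable(schema, "Article")
```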
10
Rules
Rules are the basis of the matching and translation.
Rules have an associated priority.
11
Rules
Each rule has two components:
Matching component: Match function, Decendents (sic) function.
Translation component: Translation function.
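The rule structure described above could be modeled roughly like this. The sketch is an assumption throughout: the field names, the two sample rules, and the convention that a larger number means a higher priority are not taken from TranScm.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int
    match: Callable[[dict, dict], bool]        # matching component: examines schema labels
    descendents: Callable[[dict, dict], bool]  # matching component: checks the children
    translate: Callable[[Any], Any]            # translation component: converts matched data


def by_priority(rules):
    """Return rules with the highest priority first (the direction is an assumption)."""
    return sorted(rules, key=lambda r: r.priority, reverse=True)


# Two hypothetical rules operating on schema elements given as plain dicts.
same_name = Rule(
    name="same-name",
    priority=10,
    match=lambda s, t: s["name"] == t["name"],
    descendents=lambda s, t: len(s["children"]) == len(t["children"]),
    translate=lambda value: value,
)
fallback = Rule(
    name="string-to-string",
    priority=1,
    match=lambda s, t: s["type"] == t["type"] == "string",
    descendents=lambda s, t: not s["children"] and not t["children"],
    translate=str,
)
ordered = by_priority([fallback, same_name])
```

Keeping the three functions as fields of one record mirrors the slide's split into a matching component and a translation component.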
12
Matching
The Match function examines schema labels to determine possible matches.
The Decendents function checks the numbers and types of the children of the current node.
13
Matching
Example from the diagram: one schema has Article → author, the other has Article → authors → author.
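One way to picture the matching step is a loop that tries rules in priority order and keeps the targets accepted by both functions of the first rule that succeeds. This is a minimal sketch, not the system's algorithm; the dict layout, the rule names, and the function `find_matches` are hypothetical.

```python
def find_matches(source, targets, rules):
    """Return (target name, rule name) pairs from the highest-priority rule that applies."""
    for rule in sorted(rules, key=lambda r: r["priority"], reverse=True):
        hits = [
            t["name"] for t in targets
            # Both the label check and the children check must accept the pair.
            if rule["match"](source, t) and rule["descendents"](source, t)
        ]
        if hits:
            return [(name, rule["name"]) for name in hits]
    return []  # no rule produced a match


rules = [
    {"name": "same-label", "priority": 2,
     "match": lambda s, t: s["name"] == t["name"],
     "descendents": lambda s, t: len(s["children"]) == len(t["children"])},
    {"name": "same-type", "priority": 1,
     "match": lambda s, t: s["type"] == t["type"],
     "descendents": lambda s, t: len(s["children"]) == len(t["children"])},
]
source = {"name": "author", "type": "string", "children": []}
targets = [
    {"name": "authors", "type": "list", "children": ["author"]},
    {"name": "author", "type": "string", "children": []},
    {"name": "title", "type": "string", "children": []},
]
result = find_matches(source, targets, rules)
```

With only the lower-priority `same-type` rule available, the same source would match both `author` and `title`, which is the kind of tie the next slides address.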
14
When Matching Fails
Matching can fail for two reasons:
Something in the source can’t be matched to something in the target with the current set of rules.
Something in the source matches several items in the target equally well.
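The two failure cases can be told apart with a small check over the candidate matches for one source element. The sketch assumes each candidate carries a numeric score (for instance, the priority of the rule that produced it); the scoring and the function `classify` are illustrative assumptions.

```python
def classify(candidates):
    """candidates maps target name -> score; report a match or the kind of failure."""
    if not candidates:
        return ("unmatched", None)      # nothing in the target matched
    best = max(candidates.values())
    winners = sorted(t for t, score in candidates.items() if score == best)
    if len(winners) == 1:
        return ("matched", winners[0])
    return ("ambiguous", winners)       # several targets match equally well


no_match = classify({})
clear = classify({"author": 2, "title": 1})
tie = classify({"author": 1, "title": 1})
```

Both failure outcomes are cases where a system like the one described would need input from the user.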
15
When Matching Fails
Via the GUI, the user can do the following:
Add
Disable
Modify
Override
16
Translation
Using the mapping generated in the matching step and the appropriate rules, data is transformed from the input schema to the output schema.
The translation process can make use of data translation languages.
The translation process can perform type checking.
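As a rough illustration of translation driven by a mapping, the sketch below walks a source tree and applies the target label and conversion function recorded for each source label. Everything here is assumed for the example: the tuple encoding of nodes, the mapping layout, the `writer` target label, and the string-only leaf check standing in for type checking.

```python
def translate(node, mapping):
    """node is (label, payload); payload is a list of child nodes or a string leaf."""
    label, payload = node
    if label not in mapping:
        raise KeyError(f"no match recorded for {label!r}")
    target_label, convert = mapping[label]
    if isinstance(payload, list):
        # Inner node: translate the children in order.
        return (target_label, [translate(child, mapping) for child in payload])
    if not isinstance(payload, str):
        # Minimal stand-in for type checking: leaves must be strings.
        raise TypeError(f"{label!r} expects a string leaf, got {type(payload).__name__}")
    return (target_label, convert(payload) if convert else payload)


# Hypothetical mapping: source label -> (target label, conversion function or None).
mapping = {
    "Article": ("Article", None),
    "title": ("title", None),
    "author": ("writer", str.strip),
}
source = ("Article", [
    ("title", "Conceptual Concepts"),
    ("author", " Al Gore Ithm "),
])
translated = translate(source, mapping)
```

A source label with no entry in the mapping raises an error instead of being silently dropped, so gaps left by the matching step are visible.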
17
Conclusion
TranScm:
Provides a general mechanism for data translation.
Handles the common, relatively simple translations automatically.
Can use data translation languages for more difficult translations.