XML To Relational Model. Key Index – Forward Traversal Backward Traversal.

Presentation on theme: "XML To Relational Model. Key Index – Forward Traversal Backward Traversal."— Presentation transcript:

1 XML To Relational Model

2

3

4

5

6 Key Index – Forward Traversal Backward Traversal

7

8

9

10

11

12 Binary Approach: B_name(source, ordinal, flag, target). Create as many tables as there are distinct subelement and attribute names in the XML document; that is, partition the Edge table by name. Universal table: take the outer join of all binary tables.
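
The partitioning step can be sketched in a few lines of Python. This is an illustration only: it assumes the Edge table has the columns (source, ordinal, name, flag, target), which this transcript does not show, and the sample rows and the function name partition_by_name are hypothetical.

```python
from collections import defaultdict

def partition_by_name(edge_rows):
    """Split Edge rows (source, ordinal, name, flag, target) into one
    binary table per distinct name: B_name(source, ordinal, flag, target)."""
    tables = defaultdict(list)
    for source, ordinal, name, flag, target in edge_rows:
        tables[f"B_{name}"].append((source, ordinal, flag, target))
    return dict(tables)

# Hypothetical Edge rows; the column layout is an assumption.
edge_rows = [
    (1, 1, "name", "string", "v1"),
    (1, 2, "author", "ref", 2),
    (2, 1, "name", "string", "v2"),
]
binary_tables = partition_by_name(edge_rows)
```

Each distinct name ends up with its own table, so a query that touches only one element or attribute name reads only that table.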

13

14

15

16

17

18

19 Universal Table with Overflow

20

21

22

23 Converting Ordered XML to Relations

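24 Skynet Hitech. Company (XML document; markup lost in extraction)
Company name: Skynet Hitech
Department: Research; Manager: John Smith; Employee: Tom Jackson
Department: Sales; Manager: Linda White; Employee: Kevin Lee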

25 Ordered XML model for Skynet Hitech. Company

26 Schema of the storing table. Attributes:
ID: the unique index for each tuple
DID: the document ID
Path: the path from the root to the leaf node, used to find a particular node
Surrogate Pattern: numeric representation of the node's position
Value: text value associated with each node

27 Numbering nodes

28 Tuple that stores “Linda White”
ID: 00334
DID: 501
Path: Company/Department/Manager
Surrogate Pattern: 1[1]2[2]2[1]
Value: Linda White

29 Old Skynet file stored in the RDBMS
OLD
Path | Surrogate Pattern | Value
Company/Name | 1[1]1[1] | Skynet Hitech
Company/Department/Name | 1[1]2[1]1[1] | Research
Company/Department/Manager | 1[1]2[1]2[1] | John Smith
Company/Department/Employee | 1[1]2[1]3[1] | Tom Jackson
Company/Department/Name | 1[1]2[2]1[1] | Sales
Company/Department/Manager | 1[1]2[2]2[1] | Linda White
Company/Department/Employee | 1[1]2[2]3[1] | Kevin Lee
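
The stored rows can be reproduced with a short Python sketch. Two things here are assumptions rather than part of the slides: the XML markup is reconstructed from the Path column, and the numbering rule is inferred from the patterns shown, namely that each step k[j] gives the ordinal k of the element's name among its parent's distinct child names and the occurrence j among same-named siblings. The function name leaf_tuples is hypothetical.

```python
import xml.etree.ElementTree as ET

# Reconstructed document (an assumption based on the stored paths).
XML = """<Company>
  <Name>Skynet Hitech</Name>
  <Department>
    <Name>Research</Name>
    <Manager>John Smith</Manager>
    <Employee>Tom Jackson</Employee>
  </Department>
  <Department>
    <Name>Sales</Name>
    <Manager>Linda White</Manager>
    <Employee>Kevin Lee</Employee>
  </Department>
</Company>"""

def leaf_tuples(root):
    """Return (path, surrogate pattern, value) for every leaf element."""
    rows = []

    def visit(node, path, pattern):
        children = list(node)
        if not children:
            rows.append((path, pattern, (node.text or "").strip()))
            return
        name_no = {}   # tag -> ordinal of that tag among distinct child tags
        seen = {}      # tag -> number of occurrences so far
        for child in children:
            k = name_no.setdefault(child.tag, len(name_no) + 1)
            seen[child.tag] = seen.get(child.tag, 0) + 1
            visit(child, f"{path}/{child.tag}",
                  f"{pattern}{k}[{seen[child.tag]}]")

    visit(root, root.tag, "1[1]")   # the root is always 1[1]
    return rows

rows = leaf_tuples(ET.fromstring(XML))
```

Under the inferred rule, the second Department is 2[2] and its Manager is 2[1], which gives the pattern stored for Linda White.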

30

31

32

33

34

35

36

37

38

39

40 <!ELEMENT book (booktitle, author)>

41

42 Basic Inline Algorithm: A relation is created for the root element of the element graph. All of that element’s descendants are inlined into the relation, with two exceptions. Children below a “*” node are made into separate relations; this corresponds to creating a new relation for a set-valued child. Each node that has a backpointer edge pointing to it is also made into a separate relation.

43 Drawbacks: Grossly inefficient for many queries; for example, “List all authors having first name Jack” has to be executed as the union of 5 separate queries. The approach also creates a large number of relations.

44 To determine the set of relations to be created for an element, we construct an element graph as follows. Do a DFS traversal of the DTD graph, starting at the element node for which we are constructing relations. Each node is marked as “visited” the first time it is reached and is unmarked once all its children have been traversed. If an unmarked node in the DTD graph is reached during the DFS, a new node bearing the same name is created in the element graph. A regular edge is created from the most recently created node in the element graph with the same name as the DFS parent of the current DTD node to the newly created node. If an attempt is made to traverse an already marked DTD node, a backpointer edge is added from the most recently created node in the element graph to the most recently created node in the element graph with the same name as the marked DTD node.
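
The traversal described on this slide can be sketched as follows. This is a minimal illustration, assuming the DTD graph is given as a dictionary from element name to a list of child names (with “*” and other operator nodes left out); the function name build_element_graph and the sample graph are hypothetical.

```python
def build_element_graph(dtd, start):
    """DFS over the DTD graph, returning (nodes, edges, backpointers).
    nodes[i] is the name of element-graph node i; edges and backpointers
    are lists of (from_node_id, to_node_id)."""
    nodes, edges, backpointers = [], [], []
    latest = {}      # name -> id of the most recently created node
    marked = set()   # DTD nodes currently on the DFS path

    def dfs(name, parent):
        if name in marked:
            # Already marked: add a backpointer instead of a new node.
            backpointers.append((latest[parent], latest[name]))
            return
        marked.add(name)
        nodes.append(name)
        node_id = len(nodes) - 1
        if parent is not None:
            edges.append((latest[parent], node_id))
        latest[name] = node_id
        for child in dtd.get(name, []):
            dfs(child, name)
        marked.discard(name)   # unmark once all children are traversed

    dfs(start, None)
    return nodes, edges, backpointers

# Hypothetical recursive DTD graph: editor contains monograph, which
# contains editor again.
dtd = {
    "editor": ["monograph"],
    "monograph": ["title", "author", "editor"],
    "author": ["name"],
}
nodes, edges, backpointers = build_element_graph(dtd, "editor")
```

Because a node is unmarked after its subtree is finished, an element reachable along two different paths is copied once per path, while a recursive reference produces a backpointer.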

45

46 Fragmentation: Example. Results in 5 relations. Just retrieving the first and last names of an author requires three joins!
author (authorID: integer, id: string)
name (nameID: integer, authorID: integer)
firstname (firstnameID: integer, nameID: integer, value: string)
lastname (lastnameID: integer, nameID: integer, value: string)
address (addressID: integer, authorID: integer, value: string)
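
The three joins can be made concrete with Python's built-in sqlite3 module. The table and column names follow the relations listed on this slide; the sample data is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author    (authorID INTEGER PRIMARY KEY, id TEXT);
CREATE TABLE name      (nameID INTEGER PRIMARY KEY, authorID INTEGER);
CREATE TABLE firstname (firstnameID INTEGER PRIMARY KEY, nameID INTEGER, value TEXT);
CREATE TABLE lastname  (lastnameID INTEGER PRIMARY KEY, nameID INTEGER, value TEXT);
-- Invented sample data: one author with one name.
INSERT INTO author VALUES (1, 'a1');
INSERT INTO name VALUES (10, 1);
INSERT INTO firstname VALUES (100, 10, 'Jack');
INSERT INTO lastname VALUES (200, 10, 'Smith');
""")

# author -> name -> firstname, lastname: three joins for two values.
rows = conn.execute("""
SELECT f.value, l.value
FROM author a
JOIN name n      ON n.authorID = a.authorID
JOIN firstname f ON f.nameID = n.nameID
JOIN lastname l  ON l.nameID = n.nameID
""").fetchall()
```

With inlining, the same two values would sit in columns of a single author relation and need no join at all.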

47

48 Shared Inlining Method. Relations are created for: all elements in the DTD graph whose nodes have an in-degree greater than one (nodes with an in-degree of one are inlined); elements with an in-degree of zero; elements below a “*” node; and, of mutually recursive elements all having in-degree one, one of them is made a separate relation. Each element node X that is a separate relation inlines all nodes Y that are reachable from it such that the path from X to Y does not contain a node that is to be made a separate relation.
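
The first three rules can be sketched as a simple in-degree computation. This is a partial illustration: it does not handle the rule for mutually recursive elements of in-degree one, and the DTD graph below is an assumption reconstructed from the relation schemas on slide 52, with each child tagged by whether it sits below a “*” node. The function name shared_relations is hypothetical.

```python
def shared_relations(dtd):
    """Return the element names that get their own relation under the
    first three Shared rules: in-degree 0, in-degree > 1, or below '*'.
    dtd maps a parent name to a list of (child name, below_star)."""
    indegree = {name: 0 for name in dtd}
    starred = set()
    for parent, children in dtd.items():
        for child, below_star in children:
            indegree[child] = indegree.get(child, 0) + 1
            if below_star:
                starred.add(child)
    return {name for name, degree in indegree.items()
            if degree == 0 or degree > 1 or name in starred}

# Assumed DTD graph (reconstructed, not taken verbatim from the slides).
DTD = {
    "book": [("booktitle", False), ("author", False)],
    "article": [("title", False), ("author", True), ("contactauthor", False)],
    "monograph": [("title", False), ("author", False), ("editor", False)],
    "editor": [("monograph", True)],
    "author": [("name", False), ("address", False)],
    "name": [("firstname", False), ("lastname", False)],
}
relations = shared_relations(DTD)
```

For this graph the result is book, article, monograph, title and author; every other element has an in-degree of one and is inlined into its parent.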

49 Issues with Sharing Elements: The parent of a shared element is not fixed at the schema level, so the type and id of the parent must be stored: a parentCODE field (type of parent) and a parentID field (id of parent). There is no foreign key relationship.

50 Hybrid: Same as Shared except that it inlines some elements not inlined in Shared. It inlines elements with in-degree greater than one that are not recursive or reached through a “*” node. Set sub-elements and recursive elements are treated as in Shared.

51

52
book (bookID: integer, book.booktitle.isroot: boolean, book.booktitle: string)
article (articleID: integer, article.contactauthor.isroot: boolean, article.contactauthor.authorid: string)
monograph (monographID: integer, monograph.parentID: integer, monograph.parentCODE: integer, monograph.editor.isroot: boolean, monograph.editor.name: string)
title (titleID: integer, title.parentID: integer, title.parentCODE: integer, title: string)
author (authorID: integer, author.parentID: integer, author.parentCODE: integer, author.name.isroot: boolean, author.name.firstname.isroot: boolean, author.name.firstname: string, author.name.lastname.isroot: boolean, author.name.lastname: string, author.address.isroot: boolean, author.address: string, author.authorid: string)

53 Shared Inline

54 Hybrid

55

