Download presentation
Presentation is loading. Please wait.
Published byOphelia Jackson Modified over 9 years ago
1
Relational-Style XML Query Taro L. Saito, Shinichi Morishita University of Tokyo June 10 th, SIGMOD 2008 Vancouver, Canada Presented by Sangkeun-Lee Reference slides: www.xerial.org/pub/paper/2008/leo-sigmod2008-slides.pdf Intelligent Database Systems Lab School of Computer Science & Engineering Seoul National University, Seoul, Korea
2
Copyright 2009 by CEBT If your Manager Says… Center for E-Business Technology It’s a tragedy…
3
Copyright 2009 by CEBT Migration to XML Database Benefits of using XML XML is a portable text-data format Tree-structured XML can reduce redundancy of relational data Center for E-Business Technology CompanyEmployeeOffice 1e1NY 1e2NY Co Emp Office NY e1e2 Relational Data XML Data
4
Copyright 2009 by CEBT Problem Querying relational data translated into XML Q: Retrieve a node tuple (Co, Emp, Office) from the XML data E.g. Xpath, a path expression query /Co/Emp/Office Center for E-Business Technology CompanyEmployeeOffice 1e1NY 1e2NY Co Emp Office NY e1e2 Relational Data XML Data
5
Copyright 2009 by CEBT Structure Variations Tree-representation of relational data is not unique Center for E-Business Technology CompanyEmployeeOffice 1e1NY 1e2NY Co Emp Office NY e1e2 Co Office Emp e1 NY Office Emp e1e2 Co NY
6
Copyright 2009 by CEBT Inconvenience of Xpath Query User must know the entire XML structures to produce correct path queries Center for E-Business Technology Co Emp Office NY e1e2 Co Office Emp e1 NY Office Emp e1e2 Co NY e2 /Co/Office/Emp//Co/Emp[Office]//Office[Co]/Emp
7
Copyright 2009 by CEBT Relational-Style XML Query Query relations in XML With an SQL-like syntax SELECT Co, Emp, Office from (XML Data) Center for E-Business Technology Co Emp Office NYNY NYNY e1e2 Co Office Emp e1 NYNY Office Emp e1e2 Co NYNY e2 CompanyEmployeeOffice 1e1NY 1e2NY Input XML Result
8
Copyright 2009 by CEBT To Retrieve Relations in XML Center for E-Business Technology
9
Copyright 2009 by CEBT Problem Definition Convert an SQL query, SELECT A,B,C, into an XML structure query There can be many structural variations of (A,B,C) For N nodes, there exists N^(N-1) structural variations – 3^2 = 9 Center for E-Business Technology …
10
Copyright 2009 by CEBT An Example Center for E-Business Technology This example involves various tree structures that denote data in the same relation.
11
Copyright 2009 by CEBT Amoeba A node tuple (A,B,C) is an amoeba iff one of the A,B and C is a common ancestor of the others Amoeba join retrieves all amoeba structures in the XML data Center for E-Business Technology …
12
Copyright 2009 by CEBT Relation in XML A Key observation Relation is simply embedded in XML Center for E-Business Technology Co Emp Office NY e1e2 Co Office Emp e1 NY Office Emp e1e2 Co NY e2 /Co/Office/Emp//Co/Emp[Office]//Office[Co]/Emp CompanyEmployeeOffice 1e1NY 1e2NY
13
Copyright 2009 by CEBT Hidden Semantics in XML Some amoeba structure may not form a relation Why this structure is not allowed? Because there are functional dependencies (FD) implied in the XML structure Center for E-Business Technology Company Office Emp Company Office Employee 1 M N 1
14
Copyright 2009 by CEBT Functional Dependencies (FD) FD: X->Y (From a given X,Y is uniquely determined) Employee -> office (Each employee belongs to an office) Office -> company (Each office belongs to a company) Relation in XML must have an amoeba structure corresponding to each FD Relations and FDs are sufficient to describe a schema of XML Center for E-Business Technology Company Office Emp Company Office Employee 1 M N 1 Invalid structure!!
15
Copyright 2009 by CEBT Examples of FDs Center for E-Business Technology
16
Copyright 2009 by CEBT Detecting FDs A type of FDs required to determine XML structures to query is one-to-many(or one-to-one) relationships: FD: Emp -> Office – Each employee belongs to an office – An office may have several employees (one-to-many) We can observe these relationships by counting node occurrences or directory from the ER-diagram Center for E-Business Technology Company Office Emp Company Office Employee 1 M N 1
17
Copyright 2009 by CEBT If FDs are ignored… The company has M offices, and each office has N employees: # of (company, office, employee) tuples: When M = 100, N=5 100x(100x5) While, # of correct answers is only M*N = 500 Center for E-Business Technology Company Office Employee 1 M N 1 Company Office Emp Office Emp
18
Copyright 2009 by CEBT Avoiding Invalid Relations However, an amoeba structure itself is a connected component of tree nodes, and thus invalid nodes may be connected, as illustrated in Figure 9. To avoid these irrelevant node connections, while allowing various tree structures in describing XML data, we introduce a restricted class of XML structures, called a tree relation Center for E-Business Technology
19
Copyright 2009 by CEBT FD-Aware Amoeba Join Tree Relation >& >& > FDs: Emp-> Office, Office -> Company Bottom-up construction of query results Amoeba Join (Employee, Office) Amoeba Join (Office, Company) FD-aware amoeba join avoids invalid XML structures Center for E-Business Technology Company Office Employee 1 M N 1 Company Office Emp Office Emp
20
Copyright 2009 by CEBT Query Performance FD-aware amoeba join scales well For various sizes of XML data Center for E-Business Technology
21
Copyright 2009 by CEBT Experiments – Query Set Center for E-Business Technology
22
Copyright 2009 by CEBT Think in Relational-Style First, consider XML := Relations + their annotations Steps Detect relational part from XML data Detect one-to-many(one) relationships (FDs) Write relational queries – SELECT Co, Emp, Office Things more in the paper XML Algebra Data Integration Pushing Structural Constraints for better performance Query Incomplete Relations Other experiments Center for E-Business Technology
23
Copyright 2009 by CEBT Contributions & Conclusions Relation in XML Defined using amoeba structure and FDs XML Algebra Relational-Style XML Query Retrieves relations in XML with a SQL-like query syntax (SQL over XML) Allows structural variations of XML data Departure from path expression queries Target XML structures are automatically determined Applications Database integration Managing relational data enhanced with XML syntax “It’s Just SQL” A large number of XML data and queries are still relational Center for E-Business Technology
24
Copyright 2009 by CEBT Contributions & Conclusions Center for E-Business Technology It’s not tragedy anymore…
25
My thought on Center for E-Business Technology Really?
26
Copyright 2009 by CEBT Thoughts on The Paper Good Points Generally, I think it’s a very good paper The paper shows us very interesting motivation and clearly define the problem The approach in the paper is intuitive and well-explained The authors performed well-designed experiments The paper proposes many future works – Maybe, one of us can work on them However, I doubt if it is really useful – Can we use this for an important business project? Although the author insists that their algorithm scales well, some queries caused out of memory and didn’t work – It’s a crucial problem to be used in real-life business Isn’t Xpath good enough? – People who use XML docs usually know the structure of XML docs Center for E-Business Technology
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.