Presentation is loading. Please wait.

Presentation is loading. Please wait.

Relational-Style XML Query Taro L. Saito, Shinichi Morishita University of Tokyo June 10 th, SIGMOD 2008 Vancouver, Canada Presented by Sangkeun-Lee Reference.

Similar presentations


Presentation on theme: "Relational-Style XML Query Taro L. Saito, Shinichi Morishita University of Tokyo June 10 th, SIGMOD 2008 Vancouver, Canada Presented by Sangkeun-Lee Reference."— Presentation transcript:

1 Relational-Style XML Query Taro L. Saito, Shinichi Morishita University of Tokyo June 10 th, SIGMOD 2008 Vancouver, Canada Presented by Sangkeun-Lee Reference slides: www.xerial.org/pub/paper/2008/leo-sigmod2008-slides.pdf Intelligent Database Systems Lab School of Computer Science & Engineering Seoul National University, Seoul, Korea

2 Copyright  2009 by CEBT If your Manager Says… Center for E-Business Technology It’s a tragedy…

3 Copyright  2009 by CEBT Migration to XML Database  Benefits of using XML XML is a portable text-data format Tree-structured XML can reduce redundancy of relational data Center for E-Business Technology CompanyEmployeeOffice 1e1NY 1e2NY Co Emp Office NY e1e2 Relational Data XML Data

4 Copyright  2009 by CEBT Problem  Querying relational data translated into XML  Q: Retrieve a node tuple (Co, Emp, Office) from the XML data E.g. Xpath, a path expression query /Co/Emp/Office Center for E-Business Technology CompanyEmployeeOffice 1e1NY 1e2NY Co Emp Office NY e1e2 Relational Data XML Data

5 Copyright  2009 by CEBT Structure Variations  Tree-representation of relational data is not unique Center for E-Business Technology CompanyEmployeeOffice 1e1NY 1e2NY Co Emp Office NY e1e2 Co Office Emp e1 NY Office Emp e1e2 Co NY

6 Copyright  2009 by CEBT Inconvenience of Xpath Query  User must know the entire XML structures to produce correct path queries Center for E-Business Technology Co Emp Office NY e1e2 Co Office Emp e1 NY Office Emp e1e2 Co NY e2 /Co/Office/Emp//Co/Emp[Office]//Office[Co]/Emp

7 Copyright  2009 by CEBT Relational-Style XML Query  Query relations in XML With an SQL-like syntax  SELECT Co, Emp, Office from (XML Data) Center for E-Business Technology Co Emp Office NYNY NYNY e1e2 Co Office Emp e1 NYNY Office Emp e1e2 Co NYNY e2 CompanyEmployeeOffice 1e1NY 1e2NY Input XML Result

8 Copyright  2009 by CEBT To Retrieve Relations in XML Center for E-Business Technology

9 Copyright  2009 by CEBT Problem Definition  Convert an SQL query, SELECT A,B,C, into an XML structure query There can be many structural variations of (A,B,C) For N nodes, there exists N^(N-1) structural variations – 3^2 = 9 Center for E-Business Technology …

10 Copyright  2009 by CEBT An Example Center for E-Business Technology This example involves various tree structures that denote data in the same relation.

11 Copyright  2009 by CEBT Amoeba  A node tuple (A,B,C) is an amoeba iff one of the A,B and C is a common ancestor of the others  Amoeba join retrieves all amoeba structures in the XML data Center for E-Business Technology …

12 Copyright  2009 by CEBT Relation in XML  A Key observation Relation is simply embedded in XML Center for E-Business Technology Co Emp Office NY e1e2 Co Office Emp e1 NY Office Emp e1e2 Co NY e2 /Co/Office/Emp//Co/Emp[Office]//Office[Co]/Emp CompanyEmployeeOffice 1e1NY 1e2NY

13 Copyright  2009 by CEBT Hidden Semantics in XML  Some amoeba structure may not form a relation Why this structure is not allowed?  Because there are functional dependencies (FD) implied in the XML structure Center for E-Business Technology Company Office Emp Company Office Employee 1 M N 1

14 Copyright  2009 by CEBT Functional Dependencies (FD)  FD: X->Y (From a given X,Y is uniquely determined) Employee -> office (Each employee belongs to an office) Office -> company (Each office belongs to a company)  Relation in XML must have an amoeba structure corresponding to each FD  Relations and FDs are sufficient to describe a schema of XML Center for E-Business Technology Company Office Emp Company Office Employee 1 M N 1 Invalid structure!!

15 Copyright  2009 by CEBT Examples of FDs Center for E-Business Technology

16 Copyright  2009 by CEBT Detecting FDs  A type of FDs required to determine XML structures to query is one-to-many(or one-to-one) relationships: FD: Emp -> Office – Each employee belongs to an office – An office may have several employees (one-to-many)  We can observe these relationships by counting node occurrences or directory from the ER-diagram Center for E-Business Technology Company Office Emp Company Office Employee 1 M N 1

17 Copyright  2009 by CEBT If FDs are ignored…  The company has M offices, and each office has N employees:  # of (company, office, employee) tuples: When M = 100, N=5 100x(100x5)  While, # of correct answers is only M*N = 500 Center for E-Business Technology Company Office Employee 1 M N 1 Company Office Emp Office Emp

18 Copyright  2009 by CEBT Avoiding Invalid Relations  However, an amoeba structure itself is a connected component of tree nodes, and thus invalid nodes may be connected, as illustrated in Figure 9.  To avoid these irrelevant node connections, while allowing various tree structures in describing XML data, we introduce a restricted class of XML structures, called a tree relation Center for E-Business Technology

19 Copyright  2009 by CEBT FD-Aware Amoeba Join  Tree Relation >& >& >  FDs: Emp-> Office, Office -> Company  Bottom-up construction of query results Amoeba Join (Employee, Office) Amoeba Join (Office, Company)  FD-aware amoeba join avoids invalid XML structures Center for E-Business Technology Company Office Employee 1 M N 1 Company Office Emp Office Emp

20 Copyright  2009 by CEBT Query Performance  FD-aware amoeba join scales well For various sizes of XML data Center for E-Business Technology

21 Copyright  2009 by CEBT Experiments – Query Set Center for E-Business Technology

22 Copyright  2009 by CEBT Think in Relational-Style  First, consider XML := Relations + their annotations  Steps Detect relational part from XML data Detect one-to-many(one) relationships (FDs) Write relational queries – SELECT Co, Emp, Office  Things more in the paper XML Algebra Data Integration Pushing Structural Constraints for better performance Query Incomplete Relations Other experiments Center for E-Business Technology

23 Copyright  2009 by CEBT Contributions & Conclusions  Relation in XML Defined using amoeba structure and FDs XML Algebra  Relational-Style XML Query Retrieves relations in XML with a SQL-like query syntax (SQL over XML) Allows structural variations of XML data  Departure from path expression queries Target XML structures are automatically determined  Applications Database integration Managing relational data enhanced with XML syntax  “It’s Just SQL” A large number of XML data and queries are still relational Center for E-Business Technology

24 Copyright  2009 by CEBT Contributions & Conclusions Center for E-Business Technology It’s not tragedy anymore…

25 My thought on Center for E-Business Technology Really?

26 Copyright  2009 by CEBT Thoughts on The Paper  Good Points Generally, I think it’s a very good paper The paper shows us very interesting motivation and clearly define the problem The approach in the paper is intuitive and well-explained The authors performed well-designed experiments The paper proposes many future works – Maybe, one of us can work on them  However, I doubt if it is really useful – Can we use this for an important business project? Although the author insists that their algorithm scales well, some queries caused out of memory and didn’t work – It’s a crucial problem to be used in real-life business Isn’t Xpath good enough? – People who use XML docs usually know the structure of XML docs Center for E-Business Technology


Download ppt "Relational-Style XML Query Taro L. Saito, Shinichi Morishita University of Tokyo June 10 th, SIGMOD 2008 Vancouver, Canada Presented by Sangkeun-Lee Reference."

Similar presentations


Ads by Google