1 Materialized View Maintenance for XML Documents
  Yuan Fa, Yabing Chen, Tok Wang Ling, Ting Chen
  National University of Singapore
  Presenter: Qing Li (City University of Hong Kong)
2 Agenda
  - Background of Materialized View Maintenance
  - ORA-SS Data Model
  - XML View
  - Incremental XML View Maintenance
  - Related Works
  - Conclusion
3 Background
4 Introduction to Views
  - Views
    - Relational view
    - XML view
  - Materialized views
  - Maintaining materialized views
    - Re-computation
    - Incremental approach
5 Overview of Architecture
  [Figure: boxes for Data source, Updated Data source, Materialized View, and Updated Materialized View, connected by arrows labeled f, δ, and δ']
  - δ: changes on the source data
  - f: function to compute the view content from scratch
  - δ': changes on the view
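Note: the relationship between the three symbols can be written as a single condition. This is a sketch under assumed notation that is not on the slide: S stands for the data source, V for the materialized view, and ⊕ for "apply a change to".

```latex
\[
V = f(S), \qquad f(S \oplus \delta) = V \oplus \delta'
\]
```

Read this way, incremental maintenance means deriving δ' from δ so that the right-hand side can be evaluated without recomputing f over the whole updated source.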
6 Incremental Approach
  - Why choose the incremental approach?
    - Re-computing the materialized view from scratch is usually too costly when only part of the view needs to change.
    - The incremental approach absorbs incoming updates and modifies the materialized view incrementally, without halting query processing.
  - We therefore prefer the incremental approach.
7 XML View Maintenance
  - What is important for incremental XML view maintenance?
    - A good XML data model for defining flexible views with swap, join, and aggregation
    - An efficient incremental view maintenance method
8 Contributions
  - XML view
    - Defined views with swap, join, and aggregation using ORA-SS
    - Extended the XML view transformation to support these flexible views
  - Materialized view maintenance for XML documents
    - Developed a relevance checking process for each source XML update; updates that do not affect the view are detected
    - Developed an incremental method to maintain views with swap, join, and aggregation
9 ORA-SS DATA MODEL
10 ORA-SS Data Model
  - Object-Relationship-Attribute model for Semi-Structured data [4]
  - Basic concepts:
    - Object classes
    - Relationship types
    - Attributes
  - Captures rich semantic information
11 ORA-SS: Object Class
  - Represented as a labeled rectangle
  - Attributes are labeled circles connected to the object class by edges
12 ORA-SS: Relationship Type
  - Represented as a labeled edge
  - Label: (name, n, p, c)
    - name: relationship name
    - n: degree
    - p: parent participation constraint
    - c: child participation constraint
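Note: a minimal sketch of how the four-part relationship label could be held in code. The class name, the parser, and the sample label are hypothetical; in particular, the participation constraints are kept as opaque strings because the slide does not give their syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationshipType:
    """One ORA-SS relationship label: (name, n, p, c)."""
    name: str                  # relationship name
    degree: int                # n: degree of the relationship
    parent_participation: str  # p: kept as an opaque string (syntax assumed)
    child_participation: str   # c: kept as an opaque string (syntax assumed)


def parse_label(label: str) -> RelationshipType:
    """Parse a label written as '(name, n, p, c)' into its four parts."""
    parts = [piece.strip() for piece in label.strip("() ").split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 components, got {len(parts)}: {label!r}")
    name, degree, parent, child = parts
    return RelationshipType(name, int(degree), parent, child)


# Hypothetical label for the jp relationship type between project and part.
rel = parse_label("(jp, 2, 1:n, 1:n)")
```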
13 ORA-SS: Attribute
  - Represented as a labeled circle
  - Object attributes are distinguished from relationship attributes
14 Source XML Document DOC1 - SPJ
15 ORA-SS Schema Diagram of DOC1
16 Source XML Document DOC2 - JD
17 ORA-SS Schema Diagram of DOC2
18 ORA-SS: Summary
  - A semantically rich, labeled, directed graph schema
  - Captures semantic information that lets it:
    - distinguish attributes from object classes
    - express the degree of relationship types
    - specify the participation constraints on the object classes in a relationship type
    - distinguish object attributes from relationship attributes
19 XML VIEW
20 XML View Definition
  - A view is defined using an ORA-SS schema diagram
  - Supported operations:
    - Selection
    - Projection
    - Swap
    - Join
    - Aggregation
21 XML View Example
  - The view shows the projects of department dn1 and the parts of each project.
  - Object class supplier is dropped from source schema 1.
  - part and project are swapped.
  - A new relationship type jp is created between project and part.
  - A new attribute total_quantity is created for jp; it is the sum of the quantities of a specific part that the suppliers supply to the project.
22 XML View Example (cont.)
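Note: the swap and the total_quantity attribute can be illustrated with a small sketch. The flattened (supplier, part, project, quantity) tuples and their values are hypothetical stand-ins for DOC1, and the selection on department dn1 is left out to keep the example short.

```python
from collections import defaultdict

# Hypothetical supplier -> part -> project paths from DOC1, flattened to tuples.
spj_paths = [
    ("s1", "p1", "j1", 15),
    ("s2", "p1", "j1", 20),
    ("s1", "p2", "j1", 5),
    ("s2", "p1", "j2", 30),
]


def swap_and_aggregate(paths):
    """Drop supplier, put project above part, and sum quantity per (project, part)."""
    view = defaultdict(lambda: defaultdict(int))
    for _supplier, part, project, quantity in paths:
        view[project][part] += quantity  # total_quantity of the jp relationship
    return {project: dict(parts) for project, parts in view.items()}


view = swap_and_aggregate(spj_paths)
```

With this sample data, part p1 under project j1 ends up with a total_quantity of 35, the sum over suppliers s1 and s2.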
23 XML View Materialization
  - Materialized view
    - The view is materialized using the view transformation technique.
  - Previous work
    - Daofeng Luo, Ting Chen, Tok Wang Ling, and Xiaofeng Meng. On View Transformation Support for a Native XML DBMS. DASFAA 2004, Jeju Island, Korea, March 2004.
    - It performs accurate and efficient view transformation based on ORA-SS.
    - However, the method transforms only a single source ORA-SS schema into a view schema.
  - Our extended work
    - We enrich the method to handle complex views that span multiple source XML schemas, have selection conditions, and use aggregation functions.
24 Extended XML View Materialization Outline
  - Projection (on object class or relationship type): selects instances of object classes and relationship types from the source XML documents
  - Selection (on attributes of an object class or relationship type): prunes the instances retrieved by the Projection procedure by checking the selection conditions in the view schema
  - Join (across different object classes): joins elements with the same name and key attributes from different source XML documents
  - Aggregation (on attributes): applies the aggregation function to the values of an aggregate attribute when the attribute has an aggregation function associated with it
25 XML Materialized View Example
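Note: the four procedures can be sketched as a small pipeline. This is only an illustration on flattened records, not the ORA-SS based transformation itself; the record layout, the sample values, and all function names are assumptions.

```python
# Hypothetical flattened stand-ins for DOC1 (supplier-part-project) and DOC2 (project-department).
DOC1 = [
    {"supplier": "s1", "part": "p1", "project": "j1", "quantity": 15},
    {"supplier": "s2", "part": "p1", "project": "j1", "quantity": 20},
    {"supplier": "s1", "part": "p2", "project": "j2", "quantity": 5},
]
DOC2 = [
    {"project": "j1", "department": "dn1"},
    {"project": "j2", "department": "dn2"},
]


def projection(rows, keep):
    """Keep only the fields that appear in the view schema."""
    return [{field: row[field] for field in keep} for row in rows]


def selection(rows, attr, value):
    """Prune projected rows that fail the selection condition attr == value."""
    return [row for row in rows if row[attr] == value]


def join(left, right, key):
    """Keep left rows whose key also occurs in the right rows (a semi-join)."""
    right_keys = {row[key] for row in right}
    return [row for row in left if row[key] in right_keys]


def aggregation(rows):
    """Sum quantity for each (project, part) pair."""
    totals = {}
    for row in rows:
        pair = (row["project"], row["part"])
        totals[pair] = totals.get(pair, 0) + row["quantity"]
    return totals


def materialize(doc1, doc2, department):
    spj = projection(doc1, ("part", "project", "quantity"))  # supplier is dropped
    jd = selection(projection(doc2, ("project", "department")), "department", department)
    return aggregation(join(spj, jd, "project"))


view = materialize(DOC1, DOC2, "dn1")
```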
26 VIEW MAINTENANCE
27 Incremental Materialized XML View Maintenance Outline
  1. Obtain the source update tree from the update specification, the source document, and the source schema.
  2. Check the relevance of the source update, that is, whether the update affects the view. If the update is relevant, proceed to step 3; otherwise stop.
  3. Generate the view update tree, which contains the update information for the view.
  4. Merge the view update tree into the view to produce the complete updated materialized view.
28 Source Update Tree Example
  - Source update: suppose supplier s3 is going to supply part p1 to project j1 with a quantity of 10.
  - This inserts part p1, with child project j1, as a child element of supplier s3 in the source XML document DOC1.
  - The source update tree for this case is shown on the next slide; it contains the path from supplier s3 to project j1.
29 Source Update Tree Example (cont.)
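Note: a minimal sketch of the path from supplier s3 to project j1, built with Python's standard xml.etree.ElementTree. The identifier attribute names (sno, pno, jno) and the placement of quantity as a child element of project are assumptions, since the source XML is not reproduced in this text.

```python
import xml.etree.ElementTree as ET


def build_source_update_tree(supplier_id, part_id, project_id, quantity):
    """Build the single path supplier -> part -> project for one inserted supply."""
    supplier = ET.Element("supplier", {"sno": supplier_id})      # attribute name assumed
    part = ET.SubElement(supplier, "part", {"pno": part_id})     # attribute name assumed
    project = ET.SubElement(part, "project", {"jno": project_id})  # attribute name assumed
    ET.SubElement(project, "quantity").text = str(quantity)
    return supplier


tree = build_source_update_tree("s3", "p1", "j1", 10)
xml_text = ET.tostring(tree, encoding="unicode")
```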
30 Check Source Update Tree Relevance
  - Benefit: avoids generating and evaluating unnecessary maintenance statements
  - Insertion/Deletion
    - [STEP 1] Check whether the object classes or relationship types in the source update tree are in the view schema. Requires querying the schema only.
    - [STEP 2] Check whether each path in the source update tree satisfies the selection conditions in the view schema. Requires querying the schema using the source update tree.
    - [STEP 3] Check whether each path in the source update tree joins with any source XML document. Requires querying the schema, the source update tree, and the source XML documents.
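Note: the three insertion/deletion checks can be sketched as one short-circuiting function, ordered from cheapest to most expensive as on the slide. The dictionary representation of a path, the predicate-based selection conditions, and the precomputed set of joinable keys are all assumptions made for illustration.

```python
def is_relevant(path, view_schema_names, selection_conditions, join_attr, joinable_keys):
    """Return True if one source update path can affect the view.

    path: dict mapping names in the update path to their values (assumed layout)
    view_schema_names: names of object classes/attributes that occur in the view schema
    selection_conditions: dict mapping a name to a predicate over its value
    join_attr / joinable_keys: join key name and the key values present in the
        other source document (assumed to be looked up beforehand)
    """
    # STEP 1: schema only. Does anything in the path occur in the view schema?
    if view_schema_names.isdisjoint(path):
        return False
    # STEP 2: schema plus update tree. Do the path's values pass the selection conditions?
    for name, predicate in selection_conditions.items():
        if name in path and not predicate(path[name]):
            return False
    # STEP 3: schema, update tree, and source documents. Does the path join?
    return path.get(join_attr) in joinable_keys


path = {"supplier": "s3", "part": "p1", "project": "j1", "quantity": 10}
names_in_view = {"project", "part", "quantity"}
conditions = {"quantity": lambda q: q > 0}  # hypothetical selection condition
projects_of_dn1 = {"j1"}                    # hypothetical lookup result from DOC2

relevant = is_relevant(path, names_in_view, conditions, "project", projects_of_dn1)
```

Returning early keeps the expensive step, the lookup against the source XML documents, for updates that already passed the two cheaper checks.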
31 Check Source Update Tree Relevance (cont.)
  - Modification
    - [STEP 1] Check whether the modified attribute appears in the view schema. Requires querying the schema only.
    - [STEP 2] Check whether the new and old values of the modified attribute satisfy the selection condition. Requires querying the schema using the source update tree.
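Note: one way to read the two modification checks is as a classification of the update. The four outcomes below are an assumption about how the old/new comparison would be used (the slide only says both values are checked), and the function and parameter names are hypothetical.

```python
def classify_modification(attr, old_value, new_value, view_attrs, selection_conditions):
    """Classify an attribute modification by its effect on the view (assumed scheme)."""
    # STEP 1: schema only. Is the modified attribute part of the view schema?
    if attr not in view_attrs:
        return "irrelevant"
    # STEP 2: compare the old and new values against the selection condition.
    predicate = selection_conditions.get(attr, lambda value: True)
    was_in_view, is_in_view = predicate(old_value), predicate(new_value)
    if was_in_view and is_in_view:
        return "modify"      # the value stays in the view and changes
    if was_in_view:
        return "delete"      # the value leaves the view
    if is_in_view:
        return "insert"      # the value enters the view
    return "irrelevant"      # the value was never in the view and still is not


conditions = {"quantity": lambda q: q >= 10}  # hypothetical selection condition
outcome = classify_modification("quantity", 5, 20, {"quantity"}, conditions)
```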
32 Generate View Update Tree
  - Almost the same process as view materialization
  - The one exception: the source update tree is used as input instead of the updated source XML document itself
  - General process:
    - Projection (on object class or relationship type)
    - Selection (on attributes of an object class or relationship type)
    - Join (across different object classes)
    - Aggregation (on attributes)
33 Sample View Update Tree
34 Merge View Update Tree
  - After the view update tree is computed, its changes are merged into the materialized view.
  - Each path in the view update tree is merged one at a time.
  - Cases:
    - Insertion
    - Deletion
    - Modification
    - Handling aggregation
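Note: a rough sketch of merging insert and delete paths into a view that carries the total_quantity aggregate. Keeping a contribution count next to the sum is an assumption borrowed from counting-style maintenance, used here so that an entry can be removed once nothing contributes to it; the nested dictionary layout and the starting values are hypothetical. A modification could be treated as a delete of the old path followed by an insert of the new one.

```python
def merge_view_update_tree(view, update_paths):
    """Merge (op, project, part, quantity) paths into the view, one path at a time."""
    for op, project, part, quantity in update_paths:
        if op not in ("insert", "delete"):
            raise ValueError(f"unsupported operation: {op!r}")
        sign = 1 if op == "insert" else -1
        parts = view.setdefault(project, {})
        entry = parts.setdefault(part, {"total_quantity": 0, "count": 0})
        entry["total_quantity"] += sign * quantity  # adjust the aggregate in place
        entry["count"] += sign                      # number of contributing source paths
        if entry["count"] <= 0:                     # no contributor left: drop the part
            del parts[part]
            if not parts:                           # and the project if it is now empty
                del view[project]
    return view


# Hypothetical view state before supplier s3 starts supplying p1 to j1.
view = {"j1": {"p1": {"total_quantity": 35, "count": 2}}}
updated = merge_view_update_tree(view, [("insert", "j1", "p1", 10)])
```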
35 Updated Materialized View
36 RELATED WORKS
37 Related Works
  - Abiteboul et al., "Incremental Maintenance for Materialized Views over Semistructured Data", VLDB '98
    - The work assumes that updates are identified by object IDs (OIDs).
      - Updates are restricted to a single element or attribute.
      - Updates to XML documents may be subtrees, in which case OIDs are unlikely to be available.
    - The work handles views that are a portion of the source semi-structured data.
      - Complex views that swap XML elements in the hierarchy cannot be handled.
38 Related Works (cont.)
  - Zhuge et al., "Graph Structured Views and Their Incremental Maintenance", ICDE '98
    - The view retrieves a set of specific objects, together with their children, from the source semi-structured data.
    - The only hierarchical structure in the view is therefore a binary relationship: the view contains only objects and children that already exist in the source semi-structured data and satisfy the view specification.
    - Only the parent-child relationship needs to be checked against the view definition to determine whether an updated element affects the view.
39 Related Works Comparison
  - Existing works
    - Updates are limited to atomic value updates.
    - Any single insertion, deletion, or change of an atomic value triggers the view maintenance process.
    - Views with swap, join, and aggregation are not addressed.
  - Our work addresses the above issues.
40 CONCLUSION
41 Conclusion
  - Extended the XML view transformation to support flexible views with swap, join, and aggregation
  - Proposed a new incremental view maintenance method for XML documents
    - Flexible views with swap, join, and aggregation can be handled
42 Future Work
  - Transaction updates
    - To handle transactions, we will allow multiple changes to be specified in a single update tree, so that the view update tree can be derived in one pass.
    - All updates with counter effects need to be removed.
  - XML order support
    - Store order information in the source update tree
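Note: removing updates with counter effects could look roughly like the sketch below, which cancels an insert against a delete of the same path within one transaction. This is an assumed reading of "counter effects" for illustration, not a method described in the talk; the (operation, path) representation is hypothetical.

```python
def remove_counter_effects(updates):
    """Drop pairs of updates in one transaction that cancel each other out."""
    pending = []
    for op, path in updates:
        inverse = ("delete" if op == "insert" else "insert", path)
        if inverse in pending:
            pending.remove(inverse)     # the earlier update is undone by this one
        else:
            pending.append((op, path))  # no counterpart so far: keep it
    return pending


transaction = [
    ("insert", ("s3", "p1", "j1", 10)),
    ("insert", ("s1", "p2", "j2", 5)),
    ("delete", ("s3", "p1", "j1", 10)),
]
net_updates = remove_counter_effects(transaction)
```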
43 References
  1. S. Abiteboul, D. Quass, J. McHugh, J. Widom, and J. Wiener. The Lorel Query Language for Semistructured Data. Journal of Digital Libraries, 1(1), Nov.
  2. S. Abiteboul, J. McHugh, M. Rys, V. Vassalos, and J. Wiener. Incremental Maintenance for Materialized Views over Semistructured Data. In VLDB, pages 38-49.
  3. D. Luo, T. Chen, T. W. Ling, and X. Meng. On View Transformation Support for a Native XML DBMS. In 9th International Conference on Database Systems for Advanced Applications, Korea, March.
  4. G. Dobbie, X. Y. Wu, T. W. Ling, and M. L. Lee. ORA-SS: An Object-Relationship-Attribute Model for Semistructured Data. Technical Report TR21/00, School of Computing, National University of Singapore.
  5. Y. Papakonstantinou, H. Garcia-Molina, and J. Widom. Object Exchange across Heterogeneous Information Sources. In Proceedings of the 11th International Conference on Data Engineering, Taipei, Taiwan, Mar.
  6. D. Suciu. Query Decomposition and View Maintenance for Query Language for Unstructured Data. In VLDB, Bombay, India, September.
  7. Y. Zhuge and H. Garcia-Molina. Graph Structured Views and Their Incremental Maintenance. In Proceedings of the 14th International Conference on Data Engineering (ICDE).
  8. World Wide Web Consortium. XQuery: A Query Language for XML. W3C Working Draft.
44 Thanks for Your Attention