Structural indexes of XML Databases
Dr. Vu Le Anh (Master Informatique)



Outline
1. Motivation
2. Regular query processing over XML datasets
3. Indexes over XML datasets
4. Structural indexes
5. Structural indexes for distributed XML datasets
6. Summary

NCBI GEO dataset
– GEO is a public functional genomics data repository supporting MIAME-compliant data submissions.
– About 600 gigabytes (Feb ).
– Data are stored in XML datasets.
– A gene map is written in an XML file; the slide shows the file and its XML graph.

Virtual observatory
– A collection of interoperating data archives and software tools that use the internet to form a scientific research environment in which astronomical research programs can be conducted.
– IVOA (International Virtual Observatory Alliance): building an international community.
– Uses very large XML datasets for storing and exchanging data.

Problem
Efficient query processing over big (distributed) XML databases.
Two "interesting" ideas:
1. Storing the XML database in a relational database. Rewriting the XML queries into SQL and Datalog. Rewriting and combining the results.
2. Indexing the XML database. Using the indexes for query processing.

Data Graph: Data Model for XML
Data graph: a directed, rooted, labelled graph, consisting of:
– a set of nodes V;
– a set of label values;
– a set of edges, made up of the set of basic edges and the set of reference edges;
– the root;
– a labeling function.

Publication XML document
[Example XML listing; only the text values survive: John, ABC, Dr. Ben, Tom, …, Dr. Kiss, DEF, Dr. Baker, XYZ]

XML data graph
[Figure: the data graph of the publication XML document.]

Regular queries
– Query languages for XML: XQuery, XPath, UnQL, Lorel, XQL, XML-QL, etc.
– They are built around regular expressions.
– 3 basic operations:
  – Concatenation: . or /
  – Union: |
  – Iteration: *
– Shorthand: _ stands for any label value; // stands for (_)*, any sequence of label values.
– Example: //(Student | Professor)//Paper/Title

Regular queries
– A pair of nodes (u, v) matches the regular query R if there is a path from u to v whose label sequence matches R.
– The result of R, for an input set I and an output set O: $\{(u, v) \mid u \in I,\ v \in O,\ (u, v) \text{ matches } R\}$.
– General case: I = {root} and O = V.
– Every regular expression R can be represented by a nondeterministic finite automaton (NFA) that accepts the language L(R).
– The query graph is the graph representing the automaton.

Query processing based on the automaton
– The query graph of //B/D has the states q0, q1, q2: q0 loops on any label (*), a B transition leads from q0 to q1, and a D transition leads from q1 to q2.
– Input: I = {0}; Output: O = {0, 1, …, 15}
– [Figure: data graph with nodes 0 to 15 labelled A to F, annotated with the automaton states reached at each node.]
– The result = {(0,3), (0,11), (0,13)}
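The traversal on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data graph of the slide is not recoverable from the transcript, so `labels` and `children` below describe a small hypothetical graph, and the label sequence of a path is assumed to consist of the labels of the nodes visited after the start node. The names `NFA`, `step` and `evaluate` are illustrative.

```python
from collections import deque

# NFA for //B/D, i.e. (_)*/B/D.
# q0 loops on any label, B leads from q0 to q1, D leads from q1 to q2.
ANY = "_"
NFA = {
    ("q0", ANY): {"q0"},
    ("q0", "B"): {"q1"},
    ("q1", "D"): {"q2"},
}
START, FINAL = "q0", {"q2"}

def step(state, label):
    """States reachable from `state` when reading `label`."""
    return NFA.get((state, label), set()) | NFA.get((state, ANY), set())

def evaluate(labels, children, source):
    """Breadth-first search over (data node, automaton state) pairs."""
    seen = {(source, START)}
    queue = deque(seen)
    result = set()
    while queue:
        node, state = queue.popleft()
        if state in FINAL:
            result.add((source, node))
        for child in children.get(node, ()):
            for nxt in step(state, labels[child]):
                if (child, nxt) not in seen:
                    seen.add((child, nxt))
                    queue.append((child, nxt))
    return result

# Hypothetical node-labelled data graph (not the one on the slide).
labels = {0: "A", 1: "B", 2: "C", 3: "D", 4: "B", 5: "D", 6: "D"}
children = {0: [1, 2], 1: [3], 2: [4, 6], 4: [5]}
matches = evaluate(labels, children, 0)
```

Node 6 is labelled D but its parent is labelled C, so only the D nodes directly below a B node are returned.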

Transform to edge-labeled graph
– [Figure: a node-labeled graph and the corresponding edge-labeled graph.]
– The query graph is an edge-labeled graph.
– The data graph is therefore transformed into an edge-labeled graph as well.
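The transcript does not say how the transformation is carried out. A minimal sketch, assuming one common convention (each node's label is moved onto its incoming edges, and a virtual root is added to carry the label of the original root), could look as follows. The function name and the `virtual_root` node are hypothetical.

```python
def to_edge_labeled(labels, edges, root):
    """Move each node label onto the edges entering that node.

    labels: dict node -> label, edges: list of (u, v) pairs.
    Returns a list of (u, label, v) triples.
    """
    virtual_root = "virtual_root"  # assumed extra node, not from the slides
    triples = [(virtual_root, labels[root], root)]
    triples += [(u, labels[v], v) for (u, v) in edges]
    return triples

edge_labeled = to_edge_labeled({0: "A", 1: "B", 2: "D"}, [(0, 1), (1, 2)], 0)
```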

State-Data (SD) graph
– SD graph = query graph JOIN data graph.
– The SD graph may not be connected.
– SD nodes: (data node, state node) pairs.
– SD edges: constructed by matching the labels of the data edges against the labels of the query edges.

Joining R := a/(b|c)*/a and the data graph
– Query graph: states s0, s1, s2, with an a edge from s0 to s1, b and c loops on s1, and an a edge from s1 to s2.
– [Figure: data graph with nodes 1 to 5 and edges labelled a, c, a, a, b.]
– SD-graph nodes: (1,s0), (2,s0), (2,s1), (1,s1), (2,s2), (3,s1), (4,s2), (5,s2), (5,s1), (3,s0), (4,s1)
– Result: (1,4), (1,5)
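As a sketch of the join, the following Python builds the SD graph for R := a/(b|c)*/a and reads the result off by reachability. The edge list in `data_edges` is an assumption: the figure is lost, and this is merely one graph with five edges labelled a, c, a, a, b that yields the SD nodes and the result listed on the slide. The function names are illustrative.

```python
def sd_graph(data_edges, query_edges):
    """Join data edges and query edges on their label."""
    return {((u, s), (v, t))
            for (u, a, v) in data_edges
            for (s, b, t) in query_edges
            if a == b}

def answer(sd_edges, source, start, finals):
    """Pairs (source, v) such that (v, final state) is reachable from (source, start)."""
    seen = {(source, start)}
    stack = [(source, start)]
    while stack:
        current = stack.pop()
        for (src, dst) in sd_edges:
            if src == current and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return {(source, v) for (v, s) in seen if s in finals}

# Query graph of a/(b|c)*/a.
query_edges = [("s0", "a", "s1"), ("s1", "b", "s1"),
               ("s1", "c", "s1"), ("s1", "a", "s2")]
# Assumed data graph, consistent with the slide but not taken from it.
data_edges = [(1, "a", 2), (2, "c", 3), (3, "a", 4), (2, "a", 5), (3, "b", 2)]

sd_edges = sd_graph(data_edges, query_edges)
sd_nodes = {node for edge in sd_edges for node in edge}
result = answer(sd_edges, 1, "s0", {"s2"})
```

The join produces SD nodes such as (2,s0) and (3,s0) that are never reached from (1,s0), which is why the SD graph need not be connected.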

SD-graph representation in a relational database [KissVu05]
Main results:
– The data graph and the query graph can be represented by tables.
– SD graph (table) = join of the data table and the query table.
– The result is computed from the SD table.
– Regular query processing is reduced to Datalog + SQL.
– An index is built to support the SQL computation.
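The relational idea can be tried out with Python's built-in `sqlite3` module. The sketch below is not the schema of [KissVu05]: the table and column names are invented for illustration, the data is the assumed five-edge graph for a/(b|c)*/a, and a recursive common table expression stands in for the Datalog recursion.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE data_edge(src INTEGER, label TEXT, dst INTEGER);
CREATE TABLE query_edge(src_state TEXT, label TEXT, dst_state TEXT);
""")
# Assumed data graph and the query graph of a/(b|c)*/a.
con.executemany("INSERT INTO data_edge VALUES (?, ?, ?)",
                [(1, "a", 2), (2, "c", 3), (3, "a", 4), (2, "a", 5), (3, "b", 2)])
con.executemany("INSERT INTO query_edge VALUES (?, ?, ?)",
                [("s0", "a", "s1"), ("s1", "b", "s1"),
                 ("s1", "c", "s1"), ("s1", "a", "s2")])

# SD table = data table JOIN query table on the edge label.
con.execute("""
CREATE TABLE sd_edge AS
SELECT d.src AS u, q.src_state AS s, d.dst AS v, q.dst_state AS t
FROM data_edge d JOIN query_edge q ON d.label = q.label
""")

# Reachability from (1, s0); UNION removes duplicates, so cycles terminate.
rows = con.execute("""
WITH RECURSIVE reach(node, state) AS (
    SELECT 1, 's0'
    UNION
    SELECT e.v, e.t
    FROM reach r JOIN sd_edge e ON e.u = r.node AND e.s = r.state
)
SELECT node FROM reach WHERE state = 's2' ORDER BY node
""").fetchall()
answers = [node for (node,) in rows]
sd_size = con.execute("SELECT COUNT(*) FROM sd_edge").fetchone()[0]
```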

Step 1: Transform the data graph to an edge-labeled graph

Step 2: Query graph representation

Step 3: Using Datalog and SQL for the computation

Step 4: Computation in the relational database
Results: {4,5,6}

Classes of XML indexes
1. Indexing the basic values
  – The basic values are indexed (e.g. data(//emp/salary)).
  – Using a B+-tree.
2. Indexing the text values
  – Keywords should be indexed.
3. Indexes for XML trees
  – Quickly checking and computing the label sequence of the path between a pair of nodes.
  – Also applicable to near-tree XML datasets.
4. Structural indexes
  – Simulating the data graph by a smaller one to reduce the cost of computation.

XML-tree pre/post computing [Dietz82]
– A preorder and a postorder walk of the tree compute (pre(x), post(x)) for every node x.
– Example tree: (1,7) (2,4) (3,1) (4,2) (5,3) (6,6) (7,5)
– x is a descendant of y if and only if pre(y) < pre(x) and post(x) < post(y).
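A short Python sketch of the numbering follows. The tree shape is inferred from the (pre, post) pairs on the slide, and the node names are hypothetical. The descendant test used here is pre(y) < pre(x) and post(x) < post(y): a descendant is entered later and left earlier than its ancestor.

```python
def number_tree(children, root):
    """Assign preorder and postorder ranks (both starting at 1)."""
    pre, post = {}, {}
    counters = {"pre": 0, "post": 0}

    def visit(node):
        counters["pre"] += 1
        pre[node] = counters["pre"]
        for child in children.get(node, ()):
            visit(child)
        counters["post"] += 1
        post[node] = counters["post"]

    visit(root)
    return pre, post

def is_descendant(x, y, pre, post):
    """True if x is a proper descendant of y."""
    return pre[y] < pre[x] and post[x] < post[y]

# Tree shape inferred from the pairs on the slide; node names are made up.
children = {"r": ["a", "b"], "a": ["a1", "a2", "a3"], "b": ["b1"]}
pre, post = number_tree(children, "r")
pairs = sorted((pre[n], post[n]) for n in pre)
```

The test needs only two integer comparisons per pair of nodes and no traversal of the tree.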

Tree-structure improvement [Li&Moon VLDB 2001]
– Every node x is assigned a pair (order(x), size(x)).
– Example tree: (1,100) (10,30) (11,5) (17,5) (25,5) (41,10) (45,5)
– y is a descendant of x if and only if order(x) < order(y) and order(y) <= order(x) + size(x).
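The interval test can be checked directly against the numbers on the slide. In this sketch a node is just its (order, size) pair, and x is taken to be an ancestor of y when order(x) < order(y) <= order(x) + size(x); the function name is illustrative.

```python
def is_ancestor(x, y):
    """x and y are (order, size) pairs; True if x is a proper ancestor of y."""
    order_x, size_x = x
    order_y, _ = y
    return order_x < order_y <= order_x + size_x

root = (1, 100)
inner = (10, 30)
leaf_in_inner = (25, 5)
second_inner = (41, 10)
leaf_in_second = (45, 5)
```

Because each size is larger than strictly needed, unused order values remain inside every interval, so new nodes can be inserted without renumbering the whole tree.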

Regular query processing over XML trees and near-trees
– Very efficient when based on tree-structured indexes.
– [KissVu06]: application to near-tree XML datasets.
– Link graph: connects the link nodes.
– Tree-structured indexes are used for the basic (tree) structure.

Family of structural indexes

1-index [Milo & Suciu, LNCS 1997]
– Idea: group all "equivalent" data nodes into one index node.
– Computing the index nodes: bisimulation equivalence is used instead of the ≡ equivalence.
– The index graph is smaller than the data graph.
– Works for every regular query.
– Bisimulation can be computed in polynomial time (PTIME).

Bisimulation
A relation $\approx$ is a bisimulation if, whenever $x_1 \approx x_2$:
– x1 and x2 have the same label;
– if (y1, x1) is an edge, then there exists an edge (y2, x2) with $y_1 \approx y_2$.
[Figure: nodes y1 and y2, both labelled a, with edges to x1 and x2, both labelled b.]
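A naive way to compute the bisimulation classes is to start from the partition by label and refine it until nothing changes. The sketch below does exactly that. It is a simple fixpoint iteration, not the Paige and Tarjan partition refinement algorithm, and the example graph is hypothetical.

```python
def bisimulation_classes(labels, edges):
    """Partition nodes by (backward) bisimulation using naive refinement.

    labels: dict node -> label, edges: iterable of (parent, child) pairs.
    """
    parents = {n: set() for n in labels}
    for u, v in edges:
        parents[v].add(u)

    block = dict(labels)  # initial partition: nodes with the same label
    while True:
        ids, refined = {}, {}
        for n in labels:
            # A node is described by its own block and the blocks of its parents.
            signature = (block[n], frozenset(block[p] for p in parents[n]))
            refined[n] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            break  # no block was split, the partition is stable
        block = refined

    classes = {}
    for n, b in refined.items():
        classes.setdefault(b, set()).add(n)
    return {frozenset(c) for c in classes.values()}

# Hypothetical data graph: a paper with two sections, two titles, one algorithm.
labels = {1: "paper", 2: "section", 3: "title",
          4: "section", 5: "title", 6: "algorithm"}
edges = [(1, 2), (2, 3), (1, 4), (4, 5), (4, 6)]
index_nodes = bisimulation_classes(labels, edges)
```

Each resulting class becomes one index node, so the six data nodes collapse into four index nodes.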

Example: 1-index
[Figure: a data graph of a paper with section, title, algorithm, proof, uses, about and exp nodes, and its 1-index. The index nodes include {1} paper, {2,4,8,13} section, {3,5,9,14} title, {6,10} algorithm, {7} proof and {11} proof.]
Example query: /paper/section/algorithm

Using the 1-index?
– Good: it works for all regular queries.
– Bad: it is not small enough!
– Idea: design the index graph only for the most frequently used queries. The index graph then becomes very small.
– A new equivalence relation between nodes has to be defined.
– If a query is not supported, the answer is re-checked on the data graph.

Structural indexes and a given set of queries
Important query classes and their indexes:
– //a0/a1/…/ai (i <= k), paths not longer than k: A(k)-index
– Dynamic indexes: APEX, D(k)-index
– //S0/S1/…/Sk, SAPE queries: DL-1, DL-A*(k)-index
– Forward-backward queries: F&B-index

A(k)-Index [Kaushik et al. 02]
– Supports the //a0/a1/…/ai (i <= k) queries.
– Based on k-bisimulation.
– The k-bisimulation $\approx_k$:
  – $u \approx_0 v$ if u and v have the same label;
  – $u \approx_k v$ if $u \approx_{k-1} v$ and
    – if (u', u) is an edge, there exists an edge (v', v) with $u' \approx_{k-1} v'$;
    – if (v', v) is an edge, there exists an edge (u', u) with $u' \approx_{k-1} v'$.
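The definition translates into k rounds of partition refinement, starting from the partition by label. The sketch below is a straightforward reading of that definition, not the construction algorithm of [Kaushik et al. 02]. The example graph is an assumed movie database with node ids 1 to 9 (imdb, movie, tv, director, name), and the function name is hypothetical.

```python
def a_k_classes(labels, edges, k):
    """Classes of the k-bisimulation, as a sorted list of sorted node lists."""
    parents = {n: set() for n in labels}
    for u, v in edges:
        parents[v].add(u)

    block = dict(labels)  # round 0: same label
    for _ in range(k):
        ids, refined = {}, {}
        for n in labels:
            # Same previous class and the same set of parent classes.
            signature = (block[n], frozenset(block[p] for p in parents[n]))
            refined[n] = ids.setdefault(signature, len(ids))
        block = refined

    groups = {}
    for n, b in block.items():
        groups.setdefault(b, []).append(n)
    return sorted(sorted(g) for g in groups.values())

# Assumed data graph: imdb has a movie and a tv child, each director has a name.
labels = {1: "imdb", 2: "movie", 3: "director", 4: "name", 5: "tv",
          6: "director", 7: "name", 8: "director", 9: "name"}
edges = [(1, 2), (1, 5), (2, 3), (3, 4), (5, 6), (6, 7), (5, 8), (8, 9)]

a0 = a_k_classes(labels, edges, 0)
a1 = a_k_classes(labels, edges, 1)
a2 = a_k_classes(labels, edges, 2)
```

With k = 0 all directors fall into one class. With k = 1 the director below movie is separated from those below tv, and with k = 2 the names are separated as well.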

A(k)-index example
[Figure: a data graph with imdb, movie, tv, director and name nodes, and its A(0)-, A(1)- and A(2)-indexes.]
– A(0)-index: {1} imdb, {2} movie, {5} tv, {3,6,8} director, {4,7,9} name
– A(1)-index: {1} imdb, {2} movie, {3} director, {5} tv, {6,8} director, {4,7,9} name
– A(2)-index (= 1-index): {1} imdb, {2} movie, {3} director, {4} name, {5} tv, {6,8} director, {7,9} name

Split operation
[Figure: a data graph with nodes R, A, B and C1 to C6, shown next to its indexes. A(2) (= 1-index): {C1}, {C2,C3}, {C4}, {C5,C6}. A(1): {C1}, {C2,C3}, {C4,C5,C6}. A(0): {C1,C2,C3}, {C4,C5,C6}.]

Refinement (step 1)
[Figure: the same data graph and its A(2) (= 1-index), A(1) and A(0) indexes, with the first refinement step highlighted.]

Refinement (step 2)
[Figure: the same data graph and its A(2) (= 1-index), A(1) and A(0) indexes, with the second refinement step highlighted.]

DL-1-index [KissVu06]
– Supports //S0/S1/…/Sk queries (SAPE = Simple Alternation Path Expression).
– A dynamic index (dynamic labelling).

The //(d|e)/f SAPE query
– [Figure: a data graph with nodes labelled a, b, c, d, e, f and g.]
– A SAPE query: //(d|e)/f, that is R := S0/S1 with S0 = {d, e} and S1 = {f}.
– The pairs (4,9), (5,10), (6,11) and (7,12) match R.
– The result: T_G(R) = {9,10,11,12}
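A SAPE query can be evaluated directly on a node-labelled graph by keeping a frontier of nodes and filtering it step by step. The sketch below assumes that every node is reachable from the root, so the leading // may start at any node. The graph is hypothetical: it was chosen so that the matching pairs are (4,9), (5,10), (6,11) and (7,12), but it is not the graph of the figure.

```python
def eval_sape(labels, children, steps):
    """Evaluate //S0/S1/.../Sk; `steps` is a list of label sets."""
    # Nodes whose label is in S0 (any start node, because of the leading //).
    current = {n for n, lab in labels.items() if lab in steps[0]}
    for allowed in steps[1:]:
        # Move to the children whose label is allowed at this step.
        current = {c for n in current
                   for c in children.get(n, ())
                   if labels[c] in allowed}
    return current

# Hypothetical node-labelled data graph.
labels = {0: "a", 1: "b", 2: "b", 3: "c", 4: "d", 5: "d", 6: "e",
          7: "e", 8: "g", 9: "f", 10: "f", 11: "f", 12: "f"}
children = {0: [1, 2, 3], 1: [4], 2: [5, 6], 3: [7, 8],
            4: [9], 5: [10], 6: [11], 7: [12]}

result = eval_sape(labels, children, [{"d", "e"}, {"f"}])
```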

Example: a DL-1-index supporting the //(K|L) and //(B|C)/E queries
[Figure with four panels (a) to (d): a data graph with nodes 0 to 12 labelled A, B, C, D, E, F, K, L, M, N, and a sequence of indexes.]
– The data graph and the 1-index are the same.
– Initial DL-1-index: {0} A; {1,2,3,4} K,L,M,N; {5,6,7,8} B,C,D; {9,10,11,12} E,F
– R1 = //(K|L) support and R2 = //(B|C)/E support: the refined index nodes shown are {0} A; {1,2} K,L; {3,4} M,N; {5,6} B,C; {7,8} C,D; {9,10} E; {11,12} E,F

DL-A*(k)-index [KissVu06]
1. The A(i)-index is a special case of the DL-A*(k)-index.
2. The DL-A*(k)-index supports a given set of SAPE queries of length at most k.

A DL-A*(1)-index supporting the //(K|L) and //(B|C)/E queries
[Figure: the data graph with nodes labelled A, B, C, D, E, F, K, L, M, N; the initial index; the index after the //(K|L) refinement; the index after the //(B|C)/E refinement.]

Experiments
1. DL-1 vs. 1-index
2. DL-A*(k) vs. A(k)-index
2 datasets:
– XMark: 100 MB, nodes.
– TreeBank: 82 MB, nodes.


Distributed XML tree
– XML tree = fragments (subtrees).
– Each server stores some fragments.
– There are linking edges between fragments.
– Questions:
  – How to find an efficient protocol for regular query processing?
  – Waiting time vs. computing time.
  – Can structural indexes be applied?

Processing //a/b//a on an XML tree using 2 servers

Flow model (SPIDER algorithm)
Processing begins at the root. When a link edge leads from (F, q) to (F', q'):
1. Processing on F stops.
2. Processing starts on F' with state q'.
3. When processing over F' finishes, the result is sent to F.
4. F continues.
Drawback: waiting time!

2-phase parallel model
– Servers: each server computes every possible state on its own site and sends the link edges to the coordinator.
– The coordinator examines the link edges and requests the results from the servers.
– The servers send the results to the coordinator.
Drawback: computing time!

1-phase parallel model [KissVu07]
– The coordinator builds the structural tree-index for the whole system in order to determine the connected (F, q) states.
– The query is processed on the index first to compute the connected states.
– Good: efficient processing.
– Bad: the index may be big.

Structural tree-index
[Figure: an XML tree split into the fragments F0 to F5, and its Fa-index with the entries A F0, A F2, B F3, B F4, D F1, D F5, annotated with the automaton states q0 and q1.]
– (F2, q1), (F2, q2): not connected.
– Connected states: (F0, q0), (F1, q0), …

Experiments
– 19 local Linux servers.
– Waiting time: 1IP : 2P : SP = 1 : 1.94 :
– Computing time: 1IP : 2P : SP = 1 : 1.77 : 2.75

Native XML database systems
Product | Developer | License | Database type
Qizx/db | XMLMind | Commercial | Proprietary
Sedna XML DBMS | ISP RAS MODIS | Free | Proprietary
Sekaiju / Yggdrasill | Media Fusion | Commercial | Proprietary
SQL/XML-IMDB | QuiLogic | Commercial | Proprietary (native XML and relational)
Sonic XML Server | Sonic Software | Commercial | Object-oriented (ObjectStore)
Tamino | Software AG | Commercial | Proprietary. Relational through ODBC
TeraText DBS | TeraText Solutions | Commercial | Proprietary
TEXTML Server | IXIASOFT, Inc. | Commercial | Proprietary
TigerLogic XDMS | Raining Data | Commercial | Pick
Timber | University of Michigan | Open Source (non-commercial only) | Shore, Berkeley DB
TOTAL XML | Cincom | Commercial | Object-relational
Virtuoso | OpenLink Software | Commercial | Proprietary. Relational through ODBC
XDBM | Matthew Parry, Paul Sokolovsky | Open Source | Proprietary
XDB | ZVON.org | Open Source | Relational (PostgreSQL)
XediX TeraSolution | AM2 Systems | Commercial | Proprietary
X-Hive/DB | X-Hive Corporation | Commercial | Proprietary. Relational through JDBC
Xindice | Apache Software Foundation | Open Source | Proprietary
xml.gax.com | GAX Technologies | Commercial | Proprietary
Xpriori XMS | Xpriori | Commercial | Proprietary
XQuantum XML Database Server | Cognetic Systems | Commercial | Proprietary
XStreamDB Native XML Database | Bluestream Db. Soft. Corp. | Commercial | Proprietary
Xyleme Zone Server | Xyleme SA | Commercial | Proprietary

Summary
1. Big XML is used in many applications.
2. Our problem: efficient processing of regular queries over XML databases.
3. Two ideas:
  1. Using relational databases.
  2. Building special indexes for XML databases.

Summary
4. The tree index can be applied to XML trees and to near-tree XML data (using the link graph).
5. Structural indexes simulate the data graph by smaller index graphs. Their construction is based on equivalence relations.
6. Structural indexes are designed to support only a given set of queries.
7. They can be applied to query processing in distributed XML databases (cloud, social networks).

References
[Chung et al., SIGMOD 2002] Chin-Wan Chung, Jun-Ki Min, Kyuseok Shim: APEX: an adaptive path index for XML data. Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, June 3-6, 2002, Madison, Wisconsin.
[Dietz82] Dietz, P. F.: Maintaining order in a linked list. Proceedings of the Fourteenth Annual ACM Symposium on Theory of Computing (STOC '82), San Francisco, California, United States, May 1982. ACM, New York, NY.
[Goldman & Widom VLDB 97] Goldman, R. and Widom, J.: DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases. Proceedings of the 23rd International Conference on Very Large Data Bases, August 1997. M. Jarke, M. J. Carey, K. R. Dittrich, F. H. Lochovsky, P. Loucopoulos, and M. A. Jeusfeld, Eds. Morgan Kaufmann Publishers, San Francisco, CA.
[Kaushik et al. 02] Raghav Kaushik, Pradeep Shenoy, Philip Bohannon, Ehud Gudes: Exploiting Local Similarity for Indexing Paths in Graph-Structured Data. 18th International Conference on Data Engineering (ICDE'02), 2002, p. 0129.
[KissVu05] Attila Kiss, Vu Le Anh: A solution for regular queries on XML Data. PUMA Volume 15 (2005), Issue No. 2.
[KissVu06] Attila Kiss, Vu Le Anh: Efficient Processing SAPE Queries Using the Dynamic Labelling Structural Indexes. ADBIS 2006.
[KissVu07] Attila Kiss, Vu Le Anh: Efficient Processing Regular Queries In Shared-Nothing Parallel Database Systems Using Tree- And Structural Indexes. ADBIS Research Communications 2007.
[Li&Moon VLDB 2001] Li, Q., Moon, B.: Indexing and querying XML data for regular path expressions. In: Proceedings of VLDB 2001, pp. 367–370.
[Milo & Suciu, LNCS 1997] Milo, T., Suciu, D. (1999): Index structures for path expressions. 7th International Conference on Database Theory (ICDT).
[Paige & Tarjan 87] Paige, R. and Tarjan, R. E.: Three partition refinement algorithms. SIAM J. Comput. 16, 6 (Dec. 1987).

Thank you!