Tree-Pattern Queries on a Lightweight XML Processor MIRELLA M. MORO Zografoula Vagena Vassilis J. Tsotras Research partially supported by CAPES, NSF grant.

Presentation transcript:

Tree-Pattern Queries on a Lightweight XML Processor MIRELLA M. MORO Zografoula Vagena Vassilis J. Tsotras Research partially supported by CAPES, NSF grant IIS , UC Micro, and Lotus Interworks

UC Riverside Tree-Pattern Queries on a Lightweight XML Processor
Outline
- Motivation and Contributions
- Background
- Method Categorization
- Experimental Evaluation
- Conclusions

Motivation
- XML query languages: selection on both value and structure
  - "Tree-pattern" queries (TPQ) are very common in XML
- Many promising holistic solutions
- None in lightweight XML engines
  - Without optimization module (e.g. eXist, Galax)
  - Need: an effective, robust processing method
- Reasons:
  - No systematic comparison of query methods under a common storage model
  - No integration of all methods under such a storage model
- Context: XPath semantics, stored data (indexed at will)

Contributions
- TPQ methods over a unified environment
- Method categorization: data access patterns and matching algorithm
- Common storage model + integration of all methods
  - Captures the access features
  - Permits clustering data with off-the-shelf access methods (e.g. B+-tree)
- Novel variations of methods using index structures + handling of TPQ
- Extensive comparative study
  - Synthetic, benchmark and real datasets
  - Decision on the applicability, robustness and efficiency

Background
- XML database = forest of unranked, ordered, node-labeled trees, one tree per document
- Example document tree with (left, right) region numbers:
  Bib (1,20)
    article (2,19)
      title (3,5): t1 (4)
      author (6,13)
        last (7,9): DeWitt (8)
        first (10,12): David J. (11)
      procs (14,18)
        conf (15,17): VLDB (16)
- TPQ (figure): query nodes article, author, last, procs, conf
- article (2,19) is an ancestor of last (7,9) because 2 < 7 < 9 < 19
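The containment test behind "2 < 7 < 9 < 19" can be written down directly. The following is a minimal sketch; the `Region` type and the `is_ancestor` name are illustrative and do not come from the slides.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """(left, right) region label of an element, as on the slide."""
    left: int
    right: int


def is_ancestor(anc: Region, desc: Region) -> bool:
    # anc contains desc when desc's interval lies strictly inside anc's.
    return anc.left < desc.left and desc.right < anc.right


# Values taken from the slide's example tree.
bib = Region(1, 20)
article = Region(2, 19)
last = Region(7, 9)

print(is_ancestor(article, last))  # 2 < 7 and 9 < 19
print(is_ancestor(last, article))
```

Because the test only compares integers, an ancestor/descendant check needs no tree navigation, which is what allows the element lists to be joined directly.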

Common Storage Model
- Input = sequence (list) of elements
- One list per document tag = element list
- Numbering scheme; node clustering by index structures: B+-tree on (tag, initial)
- Element lists:
  bib: (1,26)
  book: (2,9) (10,17)
  author: (3,8) (11,16) (19,24)
  name: (4,5) (12,13) (20,21)
  address: (6,7) (14,15) (22,23)
  paper: (18,25)
- Document tree:
  bib (1,26)
    book (2,9)
      author (3,8)
        name (4,5)
        address (6,7)
    book (10,17)
      author (11,16)
        name (12,13)
        address (14,15)
    paper (18,25)
      author (19,24)
        name (20,21)
        address (22,23)
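As an illustration of how such element lists arise, the sketch below numbers a tree with one counter (incremented on entering and on leaving each element) and groups the regions by tag. The nested-tuple tree encoding and the function name are assumptions made for this example; a plain dictionary stands in for the B+-tree on (tag, initial).

```python
from collections import defaultdict
from itertools import count


def build_element_lists(tree):
    """tree is a (tag, [children]) pair; returns {tag: [(left, right), ...]}."""
    lists = defaultdict(list)
    counter = count(1)

    def visit(node):
        tag, children = node
        left = next(counter)       # number assigned when the element opens
        for child in children:
            visit(child)
        right = next(counter)      # number assigned when the element closes
        lists[tag].append((left, right))

    visit(tree)
    for regions in lists.values():
        regions.sort()             # each list is kept in order of left position
    return dict(lists)


# The bib document from the slide.
author = ("author", [("name", []), ("address", [])])
doc = ("bib", [("book", [author]), ("book", [author]), ("paper", [author])])
element_lists = build_element_lists(doc)
print(element_lists["author"])
```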

Method Categorization
- Parameters: access pattern and matching algorithm
- (1) Set-based techniques
- (2) Query driven
- (3) Input driven
- (4) Structural summaries

Cat 1: Set-based Techniques
- Access pattern: sorted/indexed
- Matching process: join sets, merge individual paths
- Input: sequences of elements, one list per query node element, possibly indexed (set-based)
- Major representative: TwigStack
  - Optimal XML pattern matching algorithm (ancestor/descendant)
- Stack-based processing
  - Set of stacks = compact encoding of partial and total results in linear space (possibly exponential number of answers)
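To give a feel for stack-based processing over sorted element lists, here is a sketch of a binary ancestor/descendant structural join with a single stack. It is deliberately simpler than TwigStack (one query edge, one stack, no holistic check across branches), and the function name is illustrative. Both inputs are assumed to be lists of (left, right) regions sorted by left.

```python
def stack_structural_join(ancestors, descendants):
    """Return (a, d) pairs where region a contains region d."""
    results = []
    stack = []                      # nested chain of currently open ancestors
    i = j = 0
    while j < len(descendants):
        d = descendants[j]
        if i < len(ancestors) and ancestors[i][0] < d[0]:
            a = ancestors[i]
            # Drop ancestors that ended before a starts, then open a.
            while stack and stack[-1][1] < a[0]:
                stack.pop()
            stack.append(a)
            i += 1
        else:
            # Drop ancestors that ended before d starts; the rest contain d.
            while stack and stack[-1][1] < d[0]:
                stack.pop()
            results.extend((a, d) for a in stack)
            j += 1
    return results


# book and name element lists from the Common Storage Model slide.
book = [(2, 9), (10, 17)]
name = [(4, 5), (12, 13), (20, 21)]
print(stack_structural_join(book, name))
```

Each input list is scanned once, and the stack never holds more entries than the nesting depth of the ancestor elements.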

TwigStack + Indexes
- B+-tree, built on the left attribute
  - From ancestor: probe descendants, skipping initial nodes
  - Ancestor skipping not effective (only up to the first element that follows)
- XB-tree: on (left, right) bounding segment
- XR-tree: on (left, right), B+-tree with complex index key + stab lists
- A comparative study* shows that
  - Skipping ancestors: XB-tree better (XB-tree size is smaller)
  - Recursive level of ancestors: XB-tree better again; searching on the stab lists of the XR-tree is less efficient
  - Plain B+-tree: skips descendants, BUT not ancestors
  - XBTwigStack is our choice
* H. Li et al. "An Evaluation of XML Indexes for Structural Joins". SIGMOD Record, 33(3), Sept. 2004
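Descendant skipping with an index on the left attribute amounts to a range lookup. In the sketch below a sorted Python list plus `bisect` stands in for the B+-tree; that substitution and the function name are assumptions for illustration only.

```python
from bisect import bisect_left, bisect_right


def descendants_of(ancestor, desc_list):
    """desc_list: (left, right) regions sorted by left.

    Returns the elements whose left position falls inside the ancestor's
    region, skipping everything that starts before the ancestor.
    """
    lefts = [d[0] for d in desc_list]
    lo = bisect_right(lefts, ancestor[0])   # first element starting after ancestor.left
    hi = bisect_left(lefts, ancestor[1])    # first element starting at or after ancestor.right
    return desc_list[lo:hi]


c_list = [(35, 36), (38, 39)]
print(descendants_of((32, 41), c_list))
print(descendants_of((2, 31), c_list))
```

The reverse direction has no such shortcut: the ancestors of a given element are not contiguous in left order, which is consistent with the slide's remark that a plain B+-tree skips descendants but not ancestors.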

Cat 2: Query Driven Techniques
- Access pattern: indexed/random
- Matching process: incremental construction of each result instance
- Processing: the query defines the way input is probed
- Major representatives: ViST and PRIX
  - Specific details: significantly different
  - Same strategy
    - Convert both document and query to sequences
    - Processing query = subsequence matching

ViST and PRIX
- Recursively identify matches = quadratic time
- Optimize the naïve solution:
  - Identify candidate nodes for each matching step
  - Index structures to cluster those candidates
- Subsequence matching process = a plan consisting of index nested-loop joins (INLJ) among relations, each of which groups document nodes with the same label
- For a given query, the join sequence is statically defined by the sequencing of the query
- INLJ plans are a superset of the static plans that PRIX and ViST use

ViST x PRIX x INLJ
- Percentage of nodes processed by each algorithm (INLJ: best plan)
- [Table: columns Dataset, #nodes, ViST, PRIX, INLJ; rows 100%, LEAVES: 80%, LEAVES: 1%, ROOT: 80%, ROOT: 1%, INTERNAL: 80%, INTERNAL: 1%]

INLJ: improved B+-tree
- TPQ → evaluation of relational plan
- Independence of the ordered XML model
- Total avoidance of false positives
- Example: consider b//c, starting from c
  - Document nodes: a (1,52), b (2,31), b (32,41), a (33,40), b (34,37), c (35,36), c (38,39), b (42,51)
  - b elem. list (index figure): 33, (34,41), (42,51), (2,31), (32,41)
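The b//c example can be mimicked with a small index nested-loop join that starts from the c list and probes the b list. This is only a sketch: a sorted list with `bisect` stands in for the B+-tree, the linear filter over the candidates is a simplification, and the ancestor-skipping improvement named on the slide is not reproduced. The b and c regions are read off the slide's document nodes.

```python
from bisect import bisect_left


def inlj_from_descendants(anc_list, desc_list):
    """For each descendant, probe the ancestor list (sorted by left)."""
    lefts = [a[0] for a in anc_list]          # stand-in for an index on left
    pairs = []
    for d in desc_list:                       # outer loop: descendants
        upper = bisect_left(lefts, d[0])      # only entries with left < d.left qualify
        for a in anc_list[:upper]:            # inner probe: keep those that still cover d
            if a[1] > d[1]:
                pairs.append((a, d))
    return pairs


b_list = [(2, 31), (32, 41), (34, 37), (42, 51)]
c_list = [(35, 36), (38, 39)]
pairs = inlj_from_descendants(b_list, c_list)
print(pairs)
```

Every emitted pair passes the full containment test, so the join itself produces no false positives.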

Cat 3: Input Driven Techniques
- Access pattern: sequential
- Matching process: input drives computation, merge individual paths
- Processing: at each point, the flow of computation is guided entirely by the input through a finite state machine (DFA/NFA)
- Advantages
  - Each node processed only once
  - Simplicity, sequential access pattern
- Problem: skipping elements

SingleDFA and IdxDFA
- SingleDFA
  - Start tag: triggers the DFA, choosing the next state
  - End tag: execution backtracks to the state in effect when the start tag was processed
  - TPQ matching: intermediate results compacted on stacks
- Experiments show reading whole input = not enough
- Speeding up navigation: IdxDFA
  - Instead of reading sequentially: use indexes and skip descendants
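The start/end behaviour described above can be sketched for the simplest case, a linear path made only of descendant steps such as //a//b//d. The sketch is not SingleDFA itself: it handles no branching and keeps no intermediate twig results, and the event format `(kind, tag, node_id)` is an assumption made for this example.

```python
def match_path(events, steps):
    """Return ids of nodes matching //steps[0]//steps[1]//...//steps[-1]."""
    n = len(steps)
    stack = [0]                    # state = number of query steps matched so far
    matches = []
    for kind, tag, node_id in events:
        if kind == "start":
            state = stack[-1]
            # A node matches when it carries the last tag and all earlier
            # steps were already matched by its ancestors.
            if tag == steps[-1] and state >= n - 1:
                matches.append(node_id)
            if state < n and tag == steps[state]:
                state += 1         # the start tag advances the automaton
            stack.append(state)
        else:
            stack.pop()            # the end tag restores the previous state
    return matches


# Hypothetical document: a1 contains b2 (which contains d3) and d4.
events = [
    ("start", "a", 1), ("start", "b", 2), ("start", "d", 3), ("end", "d", 3),
    ("end", "b", 2), ("start", "d", 4), ("end", "d", 4), ("end", "a", 1),
]
print(match_path(events, ["a", "b", "d"]))
print(match_path(events, ["a", "d"]))
```

Every event is read exactly once and in document order, which reflects both the advantage and the limitation listed for this category: nothing is revisited, but nothing is skipped either.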

IdxDFA: example
[Figure: document tree with numbered a, b, c, d nodes and the query tree over a, b, c, d]

IdxDFA: example
[Figure: document tree with numbered a, b, c, d nodes and the query tree over a, b, c, d]

Cat 4: Graph Summary Evaluation
- Access pattern: indexed/random
- Matching process: merge-join partitioned input, merge individual paths
- Structural summary: index node identifies a group of nodes in the document
- Processing: identify index nodes that satisfy the query + post-processing filtering
- Beneficial: when there is a reasonable structural index, much smaller than the document
- Problem: graph size comparable to or larger than the original document

Categories Summary
Category | Access pattern | Matching process | Methods
Set based | Sorted/indexed | Join sets, merge individual paths | TwigStack/XB, B+-tree, XR-tree
Query driven | Indexed/random | Incremental construction of each result instance | (ViST, PRIX) INLJ
Input driven | Sequential | Input drives computation, merge individual paths | SingleDFA, IdxDFA
Structural summary | Indexed/random | Merge-join partitioned input, merge individual paths | Structural indexes

Experimental Evaluation
1. Experiments with real datasets
2. Experiments with synthetic datasets
   - Further analyze each method
   - Characterize the methods according to specific features available in each custom dataset
3. More sets of experiments
   - Closely verify XBTwigStack versus INLJ

Setup
- Algorithms using the same API
- Analysis varying structure and selectivity
- Performance measure = total time required to compute a query
  - Number of nodes as secondary information
- Intel Pentium 4 2.6 GHz, 1 GB RAM
- Berkeley DB: 100 buffers, page size 8 KB, B+-tree
- Real/benchmark datasets
  - XMark (Internet auction, 1.4 GB raw data, ± 17 million nodes), Protein Sequence Database

XMark

Custom Data
- Goal: isolate important features
- Query //a//b[.//c]//d
  - Simple enough for detailed investigation
  - Complex enough to provide a large number of different data access possibilities
- Vary selectivity of each element separately
- Add recursion to key elements (root, leaf)
- [Figure: query tree with nodes a, b, c, d]
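To make the semantics of //a//b[.//c]//d concrete, the following naive evaluator checks the three containment conditions over region-encoded element lists. It is a baseline sketch for reference only, not one of the methods compared in the talk, and the sample regions are hypothetical.

```python
def contains(anc, desc):
    # Region containment on (left, right) labels.
    return anc[0] < desc[0] and desc[1] < anc[1]


def naive_twig(lists):
    """Return the d regions selected by //a//b[.//c]//d."""
    result = []
    for d in lists["d"]:
        for b in lists["b"]:
            if (contains(b, d)
                    and any(contains(b, c) for c in lists["c"])
                    and any(contains(a, b) for a in lists["a"])):
                result.append(d)   # one qualifying b is enough for this d
                break
    return result


# Hypothetical element lists: only d (5,6) sits under a b that has both
# an a ancestor and a c descendant.
lists = {
    "a": [(1, 10)],
    "b": [(2, 9), (13, 16)],
    "c": [(3, 4)],
    "d": [(5, 6), (11, 12), (14, 15)],
}
matches = naive_twig(lists)
print(matches)
```

Changing how many a, b, c or d regions qualify in such lists is the kind of per-element selectivity variation the custom datasets are built around.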

Custom Data
[Figure: query tree with nodes a, b, c, d]

Custom Data
[Figure: query tree with nodes a, b, c, d]

XBTwigStack x INLJ
- On a large dataset: 40 million nodes, 1 GB, 1% selectivity
- Difference of 40 s between XBTwigStack and the INLJ best plan

XBTwigStack x INLJ

Conclusions
- Categorization of TPQ processing algorithms
- Adaptations for processing TPQ
  - DFA + accessing nodes from B+-tree
  - INLJ + ancestor skipping
- Improved DFA-based method (IdxDFA): not enough
- Structural summary available and smaller than document: StrIdx
- XBTwigStack: most robust and predictable
  - INLJ when high selectivity: no guarantee about chosen plan without optimizer module

Questions?

EXTRA SLIDES

Background
- Region numbering scheme: (left, right)
- Document tree:
  Bib (1,36)
    article (2,19)
      title (3,5): t1 (4)
      author (6,13)
        last (7,9): DeWitt (8)
        first (10,12): David J. (11)
      procs (14,18)
        conf (15,17): VLDB (16)
    article (20,35)
      title (21,23): t2 (22)
      author (24,31)
        last (25,27): Lu (26)
        first (28,30): Hongjun (29)
      procs (32,34)
        conf (33)
- TPQ (figure): query nodes article, author, last, procs, conf

TwigStack
- [Figure: document with nodes a1, b1, a2, b2, c1, c2; query with nodes a, b, c; stacks Sa, Sb, Sc]
- Results: (a2, b2, c1), (a1, b1, c1), (a1, b1, c2), (a1, b2, c1)
- 1) Solutions to individual root-to-leaf paths
- 2) Merge-join those partial solutions
- → Before adding an element to a stack:
  (i) the node has a descendant on each of the query children streams
  (ii) each of those descendant nodes recursively satisfies this property
- → Optimized by indexes

TwigStack + Indexes
- B+-tree: built on the left attribute
  - Access ancestor, then probe descendant stream to skip unmatchable initial nodes
  - Ancestor skipping not effective: skip only up to the first element following a given one
- XB-tree: index on (left, right) bounding segment
  - Pointer to children (region completely included in parent)
  - Leaves sorted on left
  - Region: ancestor access effective
- XR-tree: index on (left, right) = B+-tree with complex index key + stab lists
  - Ancestor skipping: elements stabbed by left
- [Figure: nodes b1, b2, b3, a1, c1, c2]

ViST, Virtual Suffix Tree
- Input: sequence of (symbol, path) pairs
  (a1, ε)(b1, a1)(a2, a1b1)(b2, a1b1a2)(c1, a1b1a2b2)(c2, a1b1a2)
  - Document and query translated
  - Virtual suffix tree (B+-tree) indexed on left
- Processing
  - Structural query = find (non-contiguous) subsequence matches → suffix tree
- Benefit: query as a whole instead of merging parts
  - One query path at a time
  - Efficient when query top defines the results
- [Figure: document with nodes a1, b1, a2, b2, c1, c2]
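The (symbol, path) sequence is a preorder traversal in which each node is paired with the labels on the path from the root to its parent. The sketch below reproduces the slide's example under two assumptions made for brevity: the tree is a nested `(tag, [children])` tuple, and tags are single characters so a path can be a plain string (the empty string plays the role of ε).

```python
def structure_encoded_sequence(tree, prefix=""):
    """Preorder list of (symbol, prefix) pairs for a (tag, [children]) tree."""
    tag, children = tree
    sequence = [(tag, prefix)]
    for child in children:
        # Each child sees its parent's prefix extended by the parent's tag.
        sequence.extend(structure_encoded_sequence(child, prefix + tag))
    return sequence


# The slide's document: a1 > b1 > a2 > {b2 > c1, c2}.
doc = ("a", [("b", [("a", [("b", [("c", [])]), ("c", [])])])])
print(structure_encoded_sequence(doc))
```

A query tree is encoded the same way, after which answering it becomes a search for a (non-contiguous) subsequence of the document sequence.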

ViST, index
- Virtual suffix tree
  - B+-tree, nodes indexed on the left position
  - D-Ancestor and S-Ancestor
- [Figure: D-Ancestor B+-tree over the (symbol, prefix) keys (b, ε), (a, b), (b, ba), (c, ba), (c, bac), each pointing to an S-Ancestor B+-tree of region labels]

ViST
- [Figure: document tree with nodes labeled A, B, C, D, F and values v0 to v15; query tree over A, B, C, D]
- Query sequence: (A, ε)(B, A)(C, B)(D, B)
- Steps: ViST subsequence matching, then final filtering

ViST, algorithm
- Q = q1, …, qk: query sequence
- D-tree: B+ index of (symbol, prefix)
- S-tree: B+ index of region labels

  function Search(region, i)
    if i < |Q|
      T = retrieve the S-tree of q_i from the D-tree
      N = retrieve from T all nodes in the range region
      for each node c(left, right) ∈ N
        Search((left, right), i+1)
    else
      return result

- Example: Q = (b, ε) (a, b) (c, ba)
  - Calls shown in the figure: Search((1,13), (b, ε)), Search((1,13), (a, b)), Search((2,4), (c, ba)), Search((5,7), (c, ba)), Search((8,12), (c, ba))
  - [Figure: D-tree over the keys (b, ε), (a, b), (b, ba), (c, ba), (c, bac) with the S-tree region labels]

ViST, access order
- Q = (b, ε) (a, b) (c, ba)
- [Figure: the Search calls of the previous slide over the D-tree and S-trees, annotated with the order in which nodes are accessed]

ViST, discussion
- Worst-case storage requirement for D-Ancestor is more than linear in #elements
  - E.g. unary tree with n nodes: sequence O(n^2)
- False alarms
  - D1 = (a, ε) (b, a) (d, ab) (e, ab) (c, a) (f, ac) (d, ac)
  - D2 = (a, ε) (b, a) (d, ab) (b, a) (e, ab)
  - Q = (a, ε) (b, a) (d, ab) (e, ab)
  - Our implementation: no false alarms
- //a[//b]//c unordered
  - ViST: (a, ε)(b, a)(c, a) & (a, ε)(c, a)(b, a)
  - Our implementation: run the twig query only once

PRIX, PRüfer seqs. for Indexing XML
- Input: sequence of labels
  - Document & query mapped by Prüfer's method
  - Tree → sequence: remove one node at a time
- Processing
  - Sequence matching against indexed db: filter non-matches
  - Refinement phases: filter twig matches; the results form a tree, satisfy the twig query, and include the leaf nodes
- Example: LPS = A C B C C B A C A E E E D A; NPS = (any numbering scheme, here post-order)
- Bottom-up approach
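A sketch of how a labeled and a numbered Prüfer sequence can be derived from a tree numbered in post-order is shown below. It relies on one observation: with post-order numbers, the smallest-numbered leaf at step i is always node i, so entry i of each sequence simply describes the parent of node i. The sketch assumes removal continues until only the root is left (n-1 entries); the tree encoding and the sample tree are hypothetical and are not the tree behind the slide's LPS.

```python
def prufer_sequences(tree):
    """tree is a (label, [children]) pair; returns (LPS, NPS)."""
    parent_of = {}      # post-order number -> (parent label, parent number)
    counter = 0

    def visit(node):
        nonlocal counter
        label, children = node
        child_numbers = [visit(child) for child in children]
        counter += 1                      # post-order: number after the children
        for c in child_numbers:
            parent_of[c] = (label, counter)
        return counter

    root_number = visit(tree)
    # Removing leaves in increasing number order visits nodes 1 .. n-1.
    lps = [parent_of[i][0] for i in range(1, root_number)]
    nps = [parent_of[i][1] for i in range(1, root_number)]
    return lps, nps


# Hypothetical tree: A has children B and E; B has children C and D.
tree = ("A", [("B", [("C", []), ("D", [])]), ("E", [])])
lps, nps = prufer_sequences(tree)
print(lps, nps)
```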

PRIX, Processing
- Problems
  - Complex solution
  - //a[//b]//c unordered: same problem as ViST
- What we do
  - Region-based numbering scheme and XB-tree
  - Bottom-up traversal of the query + subtwig merging
  - Access nodes in the same order
  - Efficient when query bottom defines results

A(k) index
- A(k) → k is the degree of similarity, "size of common path"
- ≈k: k-bisimilarity
  1) for any two nodes u and v, u ≈0 v iff u and v have the same label
  2) u ≈k v iff u ≈k-1 v and, for every parent u' of u, there is a parent v' of v s.t. u' ≈k-1 v', and vice versa
- [Figure: original document with nodes labeled A, B, C, D, E and its A(0), A(1) and A(2) indexes]
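The two rules translate into a simple refinement loop: start from the labels, then repeatedly pair each node's current class with the set of its parents' classes. The sketch below is illustrative only; the dictionary-based graph encoding and the sample document are assumptions, not the document drawn on the slide.

```python
def k_bisimulation(labels, parents, k):
    """Partition nodes into k-bisimilarity classes.

    labels:  {node: label}
    parents: {node: [parent nodes]} (missing key = no parents)
    """
    block = dict(labels)                       # rule 1: A(0) groups by label
    for _ in range(k):
        # Rule 2: same class at k-1 and the same set of parent classes at k-1.
        block = {
            u: (block[u], frozenset(block[p] for p in parents.get(u, ())))
            for u in labels
        }
    groups = {}
    for node, signature in block.items():
        groups.setdefault(signature, []).append(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical document: A(1) has children B(2) and C(3); each has one D child.
labels = {1: "A", 2: "B", 3: "C", 4: "D", 5: "D"}
parents = {2: [1], 3: [1], 4: [2], 5: [3]}
print(k_bisimulation(labels, parents, 0))
print(k_bisimulation(labels, parents, 1))
```

Raising k can only split groups, never merge them, so the index grows toward the size of the document as k increases; this is the trade-off behind the "comparable/larger than original document" problem noted for structural summaries.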

Protein