
A New Top-down Algorithm for Tree Inclusion
Dr. Yangjun Chen
Dept. Applied Computer Science, University of Winnipeg
515 Portage Ave., Winnipeg, Manitoba, Canada R3B 2E9

Outline
- Motivation
- Basic algorithm for the tree inclusion problem
  - Definition
  - Algorithm description
- Improvements
- Summary

Motivation

We are given two ordered labeled trees $P$ and $T$, called the pattern and the target, respectively. An interesting problem is: can we obtain the pattern $P$ by deleting some nodes from the target $T$? That is, is there a sequence $v_1, \ldots, v_k$ of nodes such that, for $T_0 = T$ and $T_{i+1} = \mathrm{delete}(T_i, v_{i+1})$ for $i = 0, \ldots, k-1$, we have $T_k = P$? If this is the case, we say that $P$ is included in $T$, that $T$ contains $P$, or that $T$ covers $P$.

[Figure: a target tree $T$ before and after $\mathrm{delete}(T, c)$.]
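The deletion-based definition can be checked directly on small inputs. The following is a minimal sketch and not the top-down algorithm of these slides: it assumes the usual meaning of delete (the children of the deleted node take its place, in order), represents a tree as a hypothetical `(label, children)` tuple, and uses an example tree that is made up for illustration. It relies on the observation that the rightmost root of the target is either deleted or matched to the rightmost root of the pattern.

```python
from functools import lru_cache

# Assumed representation: a tree is (label, children), children is a tuple of
# trees, and a forest is a tuple of trees.
def tree(label, *children):
    return (label, tuple(children))

@lru_cache(maxsize=None)
def forest_included(pattern, target):
    """True if the ordered forest `pattern` results from deleting nodes of `target`."""
    if not pattern:
        return True                      # delete every remaining target node
    if not target:
        return False
    p_label, p_children = pattern[-1]    # rightmost pattern root
    t_label, t_children = target[-1]     # rightmost target root
    # Option 1: delete the rightmost target root; its children take its place.
    if forest_included(pattern, target[:-1] + t_children):
        return True
    # Option 2: match the two rightmost roots, then match below and to the left.
    return (p_label == t_label
            and forest_included(p_children, t_children)
            and forest_included(pattern[:-1], target[:-1]))

def included(P, T):
    return forest_included((P,), (T,))

# Hypothetical example: deleting c from T lifts d and e to the root.
T = tree("a", tree("b"), tree("c", tree("d"), tree("e")), tree("f"))
P = tree("a", tree("b"), tree("d"), tree("e"), tree("f"))
print(included(P, T))
```

This brute-force check is only meant to make the definition concrete; it makes no attempt to reach the time and space bounds discussed in the summary.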

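Motivation: linguistic analysis

[Figure: a pattern parse tree with nodes s, vp, v ("reads"), n ("book") and adv, next to the target parse tree of the sentence "The student reads the interesting book again and again", with nodes s, np, vp, det, n, v, adj and adv.]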

Tree inclusion algorithm: Definition

Definition 1. Let $F$ and $G$ be labeled ordered forests. We define an ordered embedding $(\varphi, G, F)$ as an injective function $\varphi: V(G) \to V(F)$ such that for all nodes $v, u \in V(G)$:
i) $\mathrm{label}(v) = \mathrm{label}(\varphi(v))$ (label preservation condition);
ii) $v$ is an ancestor of $u$ iff $\varphi(v)$ is an ancestor of $\varphi(u)$ (ancestor condition);
iii) $v$ is to the left of $u$ iff $\varphi(v)$ is to the left of $\varphi(u)$ (sibling condition).

[Figure: an example forest $G$ and an example forest $F$.]
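The three conditions of Definition 1 can be tested mechanically for a given candidate mapping. The sketch below is an illustration under assumptions: nodes are hypothetical `(id, label, children)` triples, the mapping is a plain dictionary, and the ancestor and left-of relations are derived from preorder and postorder numbers. The example forests are made up and are not taken from the figure.

```python
from itertools import permutations

def number(forest):
    """Map each node id to (label, preorder number, postorder number)."""
    info, counter = {}, {"pre": 0, "post": 0}
    def visit(node):
        node_id, label, children = node
        pre = counter["pre"]
        counter["pre"] += 1
        for child in children:
            visit(child)
        info[node_id] = (label, pre, counter["post"])
        counter["post"] += 1
    for root in forest:
        visit(root)
    return info

def is_ancestor(a, b):
    return a[1] < b[1] and a[2] > b[2]   # earlier in preorder, later in postorder

def is_left_of(a, b):
    return a[1] < b[1] and a[2] < b[2]   # earlier in both orders

def is_ordered_embedding(phi, G, F):
    g, f = number(G), number(F)
    # phi must be a total, injective function from V(G) into V(F).
    if set(phi) != set(g) or len(set(phi.values())) != len(phi):
        return False
    if any(image not in f for image in phi.values()):
        return False
    if any(g[v][0] != f[phi[v]][0] for v in g):          # condition i)
        return False
    for v, u in permutations(g, 2):
        if is_ancestor(g[v], g[u]) != is_ancestor(f[phi[v]], f[phi[u]]):
            return False                                 # condition ii)
        if is_left_of(g[v], g[u]) != is_left_of(f[phi[v]], f[phi[u]]):
            return False                                 # condition iii)
    return True

# Hypothetical example: G = a(b, b), F = a(d(b), e(b)).
G = [("g0", "a", [("g1", "b", []), ("g2", "b", [])])]
F = [("f0", "a", [("f1", "d", [("f2", "b", [])]), ("f3", "e", [("f4", "b", [])])])]
good = {"g0": "f0", "g1": "f2", "g2": "f4"}
swapped = {"g0": "f0", "g1": "f4", "g2": "f2"}
print(is_ordered_embedding(good, G, F), is_ordered_embedding(swapped, G, F))
```

The swapped mapping preserves labels and ancestors but reverses the left-to-right order of the two b nodes, so only condition iii) rejects it.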

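Algorithm

1. Let $T = \langle t; T_1, \ldots, T_k\rangle$ ($k \ge 1$) be a tree and $G = \langle P_1, \ldots, P_l\rangle$ ($l \ge 1$) be a forest. We handle $G$ as a tree $P = \langle p_v; P_1, \ldots, P_l\rangle$, where $p_v$ represents a virtual node, matching any node in $T$.
2. Consider a node $v$ in $P$ with children $v_1, \ldots, v_j$. We use a pair $\langle i, v\rangle$ ($i \le j$) to represent the ordered forest containing the first $i$ subtrees of $v$: $\langle P[v_1], \ldots, P[v_i]\rangle$. Then $\langle j, p_v\rangle$ represents the first $j$ trees in $G$.

[Figure: a node $v$ of $P$ with children $v_1, \ldots, v_i, \ldots, v_k$.]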
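As a small illustration of this notation, the sketch below wraps a forest under a virtual root and expands a pair of a count and a node into the forest it stands for. The `Node` class, the function names, and the order of the pair's components are assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)

def as_virtual_tree(forest):
    """Handle a forest as one tree by adding a virtual root (labelled 'p_v' here)."""
    return Node("p_v", list(forest))

def first_subtrees(i, v):
    """Forest denoted by the pair (i, v): the first i subtrees of v, in order."""
    if not 0 <= i <= len(v.children):
        raise ValueError("i must lie between 0 and the outdegree of v")
    return v.children[:i]

# Hypothetical forest G with three single-node trees.
G = [Node("a"), Node("b"), Node("c")]
P = as_virtual_tree(G)
print([n.label for n in first_subtrees(2, P)])
```

With the virtual root in place, a prefix of the forest and a prefix of a node's subtrees are described by the same kind of pair, which is why one return type can serve both recursive functions.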

Algorithm

3. In addition, $h(v)$ represents the height of $v$ in a tree, and $\delta(v)$ represents a link from $v$ in $P$ to the leaf node on the left-most path in $P[v]$. Let $v'$ be a leaf node in $P$. We denote by $\delta^{-1}(v')$ the set of nodes $x$ such that $\delta(v) = v'$ for each $v \in x$.

[Figure: a pattern tree $P$ with nodes $v_1, \ldots, v_5$ and the links $\delta(v_1)$ and $\delta(v_2)$; here $\delta^{-1}(v_3) = \{v_1, v_2, v_3\}$.]
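The link to the left-most leaf and its inverse are easy to compute. The following sketch assumes a tree stored as a dictionary from node name to its ordered list of children; the shape of the five-node tree is a guess that is merely consistent with the example value for the inverse link of $v_3$.

```python
def leftmost_leaf(children, v):
    """Follow first children from v until a leaf is reached."""
    while children[v]:
        v = children[v][0]
    return v

def inverse_links(children):
    """Group all nodes by their left-most leaf."""
    groups = {}
    for v in children:
        groups.setdefault(leftmost_leaf(children, v), set()).add(v)
    return groups

# Assumed shape: v1 has children v2 and v5; v2 has children v3 and v4.
children = {"v1": ["v2", "v5"], "v2": ["v3", "v4"], "v3": [], "v4": [], "v5": []}
print(leftmost_leaf(children, "v1"))
print(sorted(inverse_links(children)["v3"]))
```

Each group returned by `inverse_links` is exactly one left-most path of the tree, read from its top node down to its leaf.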

Algorithm

Tree inclusion checking is done by calling two functions recursively, top-down($T$, $G$) and bottom-up($T'$, $G$), where $T = \langle t; T_1, \ldots, T_k\rangle$ is a tree, and $T' = \langle T_1, \ldots, T_k\rangle$ and $G = \langle P_1, \ldots, P_l\rangle$ are two forests. Each of the two functions returns a pair $\langle i, v\rangle$ with $v$ being $p_v$ or a node on the left-most path in $P_1$.

Function: top-down(T, G)

In top-down($T$, $G$), two cases are handled.

Case 1: $G = \langle P_1\rangle$; or $G = \langle P_1, \ldots, P_l\rangle$ ($l > 1$), but $|T| \le |P_1| + |P_2|$. In this case, we try to find a pair $\langle i, v\rangle$ such that $T$ contains the first $i$ subtrees of $v$, where $v = p_v$, or $v \in \delta^{-1}(v')$ and $v'$ is the leaf node on the left-most path in $P_1$.

[Figure: the two situations of Case 1: $G$ consisting of $P_1$ alone, and $G = \langle P_1, P_2, \ldots, P_l\rangle$ with $|T| \le |P_1| + |P_2|$; $t$ is the root of $T$ and $p_1$ the root of $P_1$.]

Function: top-down(T, G), Case 1

i) If $t$ is a leaf node, we check whether $\mathrm{label}(t) = \mathrm{label}(\delta(p_1))$, where $p_1$ is the root of $P_1$. If so, a pair recording that $T$ contains the leaf $\delta(p_1)$ is returned; otherwise, a pair recording that nothing was matched is returned.

Function: top-down(T, G), Case 1

ii) If $|T| < |P_1|$ or $\mathrm{height}(t) < \mathrm{height}(p_1)$, we make a recursive call top-down($T$, $\langle P_{11}, \ldots, P_{1j}\rangle$), where $\langle P_{11}, \ldots, P_{1j}\rangle$ is the forest of the subtrees of $p_1$. The return value of this call is used as the return value of top-down($T$, $G$).

[Figure: a target $T$ with $|T| < |P_1|$, and $p_1$ with subtrees $P_{11}, \ldots, P_{1i}, \ldots, P_{1j}$.]

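Function: top-down(T, G), Case 1

iii) If $|T| \ge |P_1|$ (but $|T| \le |P_1| + |P_2|$) and $\mathrm{height}(t) \ge \mathrm{height}(p_1)$, two sub-cases need to be considered:
- $\mathrm{label}(t) = \mathrm{label}(p_1)$. Call bottom-up($\langle T_1, \ldots, T_k\rangle$, $\langle P_{11}, \ldots, P_{1j}\rangle$), which matches the subtrees of $t$ against the subtrees of $p_1$.
- $\mathrm{label}(t) \ne \mathrm{label}(p_1)$. Call bottom-up($\langle T_1, \ldots, T_k\rangle$, $\langle P_1\rangle$), which matches the subtrees of $t$ against $P_1$ itself.

[Figure: $t$ with subtrees $T_1, \ldots, T_i, \ldots, T_k$ and $p_1$ with subtrees $P_{11}, \ldots, P_{1i}, \ldots, P_{1j}$, shown once for each sub-case.]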

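Function: top-down(T, G), Case 1

In both sub-cases, assume that the return value is $\langle i, v\rangle$. A further check is needed. If $\mathrm{label}(t) = \mathrm{label}(v)$ and $i$ equals the outdegree of $v$, then $T$ contains the whole subtree $P[v]$, and the return value is changed to record this. Otherwise (fewer than all subtrees of $v$ were matched, or $\mathrm{label}(t) \ne \mathrm{label}(v)$), the return value is the same as $\langle i, v\rangle$.

[Figure: $T$ with root $t$ and $P_1$ with root $p_1$ and a node $v$ on its left-most path.]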

Function: top-down(T, G)

Case 2: $G = \langle P_1, \ldots, P_l\rangle$ ($l > 1$), and $|T| > |P_1| + |P_2|$. In this case, we call bottom-up($\langle T_1, \ldots, T_k\rangle$, $G$). Assume that the return value is $\langle i, v\rangle$. The following checks are then carried out.

[Figure: $G = \langle P_1, P_2, \ldots, P_l\rangle$ under $p_v$, and $T$ with root $t$ and subtrees $T_1, T_2, \ldots, T_k$, where $|T| > |P_1| + |P_2|$.]

Function: top-down(T, G), Case 2

iv) If $v$ is $p_1$'s parent (that is, $v = p_v$), the return value is the same as $\langle i, v\rangle$.
v) If $v$ is not $p_1$'s parent, check whether $\mathrm{label}(t) = \mathrm{label}(v)$ and $i$ equals the outdegree of $v$. If so, the return value is changed to record that $T$ contains the whole subtree $P[v]$. Otherwise, the return value remains $\langle i, v\rangle$.

[Figure: the two situations, $v = p_v$ with $P_1, \ldots, P_i$ matched, and $v$ a node below $p_v$.]
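The branching of top-down depends only on sizes, heights, and whether the root of the target is a leaf, so it can be summarised as a small dispatcher. The sketch below only selects the branch and does not perform the recursive calls; the function name, its parameters, and the order in which sub-cases i) and ii) are tested are assumptions.

```python
def select_branch(t_size, t_height, t_is_leaf, pattern_sizes, p1_height):
    """Name the branch of top-down(T, G) that applies.

    t_size, t_height, t_is_leaf describe the target tree T and its root t;
    pattern_sizes lists |P_1|, ..., |P_l|; p1_height is the height of p_1.
    """
    l = len(pattern_sizes)
    if l > 1 and t_size > pattern_sizes[0] + pattern_sizes[1]:
        return "case 2"            # T is large enough to hold more than P_1
    if t_is_leaf:
        return "case 1 (i)"        # compare t with the left-most leaf of P_1
    if t_size < pattern_sizes[0] or t_height < p1_height:
        return "case 1 (ii)"       # T cannot contain P_1: descend into p_1
    return "case 1 (iii)"          # compare labels, then call bottom-up

print(select_branch(10, 3, False, [4, 3], 2))
print(select_branch(5, 3, False, [4, 3], 2))
```

The size test of Case 2 is what keeps the target from being compared against pattern trees it is too small to contain.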

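Function: bottom-up(T', G)

bottom-up($T'$, $G$) is designed to handle the case that both $T'$ and $G$ are forests. Let $T' = \langle T_1, \ldots, T_k\rangle$ and $G = \langle P_1, \ldots, P_q\rangle$. In bottom-up($T'$, $G$), we make a series of calls top-down($T_l$, $\langle P_{j_l+1}, \ldots, P_q\rangle$), where $l = 1, \ldots, k$, $j_1 = 0$, and $j_1 \le j_2 \le \ldots \le j_h \le q$ (for some $h \le k$), controlled as follows.

[Figure: the forests $T' = \langle T_1, T_2, \ldots, T_i, \ldots, T_k\rangle$ and $G = \langle P_1, \ldots, P_i, \ldots, P_q\rangle$, with one top-down call per $T_l$.]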

Function: bottom-up(T', G)

1. Two index variables $l$ and $j$ are used to scan $T_1, \ldots, T_k$ and $P_1, \ldots, P_q$, respectively.
2. Let $\langle i_l, v_l\rangle$ be the return value of top-down($T_l$, $\langle P_{j+1}, \ldots, P_q\rangle$). If $v_l$ is $p_j$'s parent, set $j$ to be $j + i_l - 1$. Otherwise, $j$ is not changed. Set $l$ to be $l + 1$. Go to (2).
3. The loop terminates when all $T_l$'s or all $P_j$'s are examined.

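Function: bottom-up(T', G)

If $j > 0$ when the loop terminates, bottom-up($T'$, $G$) returns $\langle j, p_v\rangle$, the pair that represents the first $j$ trees in $G$.

[Figure: the forests $T' = \langle T_1, T_2, \ldots, T_i, \ldots, T_k\rangle$ and $G = \langle P_1, \ldots, P_j, \ldots, P_q\rangle$.]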

Function: bottom-up(T', G)

Otherwise, $j = 0$. In this case, we continue searching for a pair $\langle i, v\rangle$ such that $T'$ contains the first $i$ subtrees of $v$, where $v \in \delta^{-1}(v')$ and $v'$ is the leaf node on the left-most path in $P_1$, as described below.

i) Let $\langle i_1, v_1\rangle, \langle i_2, v_2\rangle, \ldots, \langle i_k, v_k\rangle$ be the respective return values of top-down($T_1$, $G$), top-down($T_2$, $G$), ..., top-down($T_k$, $G$). Since $j = 0$, each $v_l \in \delta^{-1}(v')$ ($l = 1, \ldots, k$).

[Figure: $P_1$ with the nodes $v_1, v_2, \ldots, v_k$ on its left-most path.]

Function: bottom-up(T', G)

ii) If each $i_l = 0$, return a pair with $i = 0$ and a special node that is considered to be a descendant of any node in $G$. Otherwise, find the first $v_g$ with children $w_1, \ldots, w_h$ such that $v_g$ is not a descendant of any other $v_j$, and $i_g > 0$. Call bottom-up($\langle T_{g+1}, \ldots, T_k\rangle$, $\langle P[w_{i_g+1}], \ldots, P[w_h]\rangle$), which tries to match the remaining subtrees of $v_g$ against the remaining trees of $T'$. Let $y$ be the node in its return value. If $y = v_g$, the return value of bottom-up($T'$, $G$) combines $i_g$ with the additional subtrees of $v_g$ found by this call. Otherwise, the return value is $\langle i_g, v_g\rangle$.

[Figure: $T_1, T_2, \ldots, T_g, T_{g+1}, \ldots, T_k$ and $P_1$ with $v_1, \ldots, v_g, \ldots, v_k$ on its left-most path; $T_g$ covers the first $i_g$ subtrees of $v_g$.]

Further improvements

In the case $j = 0$: let $\langle i_1, v_1\rangle, \ldots, \langle i_k, v_k\rangle$ be the return values of top-down($T_1$, $G$), ..., top-down($T_k$, $G$). We find the first $v_g$ such that it is not a descendant of any other $v_j$ and $i_g > 0$. Then a bottom-up call on $\langle T_{g+1}, \ldots, T_k\rangle$ and the remaining subtrees of $v_g$ is invoked. This shows that all the return values except $\langle i_g, v_g\rangle$ are not used in the subsequent computation. Thus, the work spent on computing those values should be avoided.

[Figure: $T_1, T_2, \ldots, T_g, T_{g+1}, \ldots, T_k$ and $P_1$ with $v_1, \ldots, v_g, \ldots, v_k$ on its left-most path.]

Further improvements

Let $\langle i_j, v_j\rangle$ be the return value of top-down($T_j$, $G$) such that $i_j > 0$ and $v_j$ is $p_1$ or a descendant of $p_1$. Then, during the execution of top-down($T_{j+1}$, $G$), once we have detected that it can only produce a return value $\langle i_{j+1}, v_{j+1}\rangle$ with $v_{j+1}$ being a descendant of $v_j$, we should stop the corresponding computation immediately, since this return value will not be used in the subsequent search. For this purpose, we change top-down($T_{j+1}$, $G$) to top-down($T_{j+1}$, $G$, $v_j$), where the extra argument $v_j$ is used to transfer this information and is called a controlling node.

Assume that the execution of top-down($T_{j+1}$, $G$, $v_j$) makes top-down calls on the subtrees $T_{j+1,1}, T_{j+1,2}, \ldots$ with controlling nodes $u_1, u_2, \ldots$, and that every $u_j$ is a proper descendant of $v_j$. Then the bottom-up call with some $u_i$ as its controlling node should not be conducted.

Summary

An efficient method for the tree inclusion problem:
- $O(|T| \cdot \min\{D_P, |\mathrm{leaves}(P)|\})$ time and
- $O(|T| + |P|)$ space,
where $D_P$ is the height of $P$ and $\mathrm{leaves}(P)$ is the set of the leaf nodes of $P$.

Future work:
- adapt the algorithm to a data stream environment
- adapt the algorithm to an indexing environment
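To get a feeling for the quantities in the time bound, the sketch below computes the size of the target, the height of the pattern, and the number of pattern leaves for small trees given as `(label, children)` tuples. The helper names are hypothetical, the height is assumed to count nodes (a single node has height 1), and the product ignores the constant hidden by the O-notation.

```python
def size(tree):
    _, children = tree
    return 1 + sum(size(c) for c in children)

def height(tree):
    _, children = tree
    return 1 + max((height(c) for c in children), default=0)

def leaf_count(tree):
    _, children = tree
    return 1 if not children else sum(leaf_count(c) for c in children)

def time_bound(T, P):
    """|T| * min(D_P, |leaves(P)|), the term inside the O-notation."""
    return size(T) * min(height(P), leaf_count(P))

# Hypothetical pattern a(b, c(d)) and target a(b, x(c(d)), e).
P = ("a", (("b", ()), ("c", (("d", ()),))))
T = ("a", (("b", ()), ("x", (("c", (("d", ()),)),)), ("e", ())))
print(size(T), height(P), leaf_count(P), time_bound(T, P))
```

For a deep, narrow pattern the leaf count is the smaller factor, and for a shallow, bushy pattern the height is; taking the minimum covers both shapes.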

Thank you.