Tree Edit Distance

Presentation transcript:

Tree Edit Distance (TED)
• The minimum number of edits needed to transform one tree into another.

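The edit operations
• Delete a node v: the children of v become children of v's parent w.
• Relabel a node: change the label of v.
• Insert a node v: the inverse of delete; v is placed under a parent and adopts a consecutive run of that parent's children.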
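The three operations can be sketched on a small ordered tree. This is a minimal illustration, not code from the slides: the `Node` class and the helper names `relabel`, `delete_child`, `insert_child` and `show` are assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical ordered, labeled tree node (not from the slides)."""
    label: str
    children: list = field(default_factory=list)


def relabel(node, new_label):
    """Relabel: change the label of a node in place."""
    node.label = new_label


def delete_child(parent, i):
    """Delete parent's i-th child v; v's children take its place, in order."""
    v = parent.children[i]
    parent.children[i:i + 1] = v.children


def insert_child(parent, i, j, label):
    """Insert a new node v under parent; v adopts parent's children i..j-1."""
    v = Node(label, parent.children[i:j])
    parent.children[i:j] = [v]
    return v


def show(node):
    """Render a tree as label(child child ...)."""
    if not node.children:
        return node.label
    return node.label + "(" + " ".join(show(c) for c in node.children) + ")"
```

Delete and insert are inverses of each other: deleting a node and then inserting it back over the same run of children restores the original tree.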

Existing Algorithms

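Recursive Algorithm [SZ89]
• Recurse on the rightmost roots, v in F and w in G:
$$d(F,G) = \min \begin{cases} d(F - v,\ G) + c_{\mathrm{del}}(v) & \text{(delete } v\text{)} \\ d(F,\ G - w) + c_{\mathrm{del}}(w) & \text{(delete } w\text{)} \\ d(F_v - v,\ G_w - w) + d(F - F_v,\ G - G_w) + c_{\mathrm{match}}(v,w) & \text{(match } v \text{ and } w\text{)} \end{cases}$$
Here $F_v$ is the tree rooted at v, and the $c$ terms are the costs of the individual operations.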
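The recursion on the rightmost roots can be sketched directly with memoization. This is a minimal sketch under assumptions that are not in the slides: unit costs for every operation, and trees encoded as nested tuples `(label, children)`. The names `size` and `forest_dist` are hypothetical.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, children), children is a tuple of trees,
# and a forest is a tuple of trees. Tuples are hashable, so they can be memoized.


def size(forest):
    """Number of nodes in a forest."""
    return sum(1 + size(children) for _, children in forest)


@lru_cache(maxsize=None)
def forest_dist(F, G):
    """Unit-cost edit distance between forests, recursing on the rightmost roots."""
    if not F:
        return size(G)          # everything in G must be accounted for
    if not G:
        return size(F)          # delete everything in F
    (v, v_kids), (w, w_kids) = F[-1], G[-1]
    # Delete v: its children take its place at the right end of F.
    delete_v = forest_dist(F[:-1] + v_kids, G) + 1
    # Delete w: symmetric, on G.
    delete_w = forest_dist(F, G[:-1] + w_kids) + 1
    # Match v and w: compare their child forests and the remaining forests.
    match = (forest_dist(v_kids, w_kids)
             + forest_dist(F[:-1], G[:-1])
             + (v != w))        # relabel cost: 0 if labels agree, else 1
    return min(delete_v, delete_w, match)


def leaf(x):
    return (x, ())


# Two small trees: f(d(a c(b)) e) and f(c(d(a b)) e).
T1 = ("f", (("d", (leaf("a"), ("c", (leaf("b"),)))), leaf("e")))
T2 = ("f", (("c", (("d", (leaf("a"), leaf("b"))),)), leaf("e")))
distance = forest_dist((T1,), (T2,))
```

For this pair the distance is 2: delete c from under d in the first tree, then insert c above d.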


Time Complexity [SZ89]
• A subproblem is relevant if it shows up while computing d(F,G).
• Number of relevant subproblems = time complexity = $O(n^2 m^2) = O(n^4)$.
• $O(nm \cdot \min\{\mathrm{Depth}(F), \mathrm{Leaves}(F)\} \cdot \min\{\mathrm{Depth}(G), \mathrm{Leaves}(G)\})$
[Figure: the relevant subforests of F and G, with rightmost roots v and w]

Klein98
• Same as the previous algorithm, but recurses on a light child in F.
• Number of relevant subproblems = (number of relevant subforests of F) $\cdot\, m^2 = O(n \log n \cdot m^2) = O(n^3 \log n)$, by heavy path decomposition [HT84].
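The logarithmic factor comes from a standard property of heavy path decomposition: if each node's largest child subtree is called heavy and the others light, any root-to-leaf path crosses at most log2(n) light edges. The sketch below checks that property on small trees. The tuple encoding and the names `subtree_size` and `max_light_depth` are assumptions for illustration, not part of the slides.

```python
import math

# Assumed encoding: a tree is (label, children), children is a tuple of trees.


def subtree_size(tree):
    """Number of nodes in the tree."""
    _, children = tree
    return 1 + sum(subtree_size(c) for c in children)


def max_light_depth(tree):
    """Largest number of light edges on any root-to-leaf path."""
    _, children = tree
    if not children:
        return 0
    # The heavy child is the one with the largest subtree (ties: first wins).
    heavy = max(range(len(children)), key=lambda i: subtree_size(children[i]))
    return max(max_light_depth(c) + (i != heavy)
               for i, c in enumerate(children))


def full_binary(depth):
    """Complete binary tree with the given depth (depth 0 is a single leaf)."""
    if depth == 0:
        return ("x", ())
    return ("x", (full_binary(depth - 1), full_binary(depth - 1)))


tree = full_binary(4)
n = subtree_size(tree)
# Every light edge at least halves the subtree size, hence the log2(n) bound.
bound_holds = max_light_depth(tree) <= math.log2(n)
```

A complete binary tree is close to the worst case here, while a path has no light edges at all.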

Decomposition strategy [DT03]
• For every two subforests (F,G), a strategy says right or left.
• Zhang & Shasha's strategy = right always.
• Klein's strategy = right iff the rightmost tree in F is smaller than the leftmost tree in F.
• Lower bound for strategy algorithms = $\Omega(nm \cdot \log n \cdot \log m)$.
• Any strategy algorithm computes the edit distance between any two subtrees of F and G (without their roots).

Our Results
• An $O(m^2 n (1 + \log\frac{n}{m})) = O(n^3)$ time, $O(nm)$ space algorithm. (Today: $O((nm)^{3/2}) = O(n^3)$ time and space.) [DMRW ICALP07]
• A strategy algorithm symmetrically dependent on the two input trees.
• A matching lower bound for all strategy algorithms. (Today: a lower bound of $\Omega(nm^2)$.)
• Local edit distance and affine gap penalties at the cost of one execution. (Today: local RNA edit distance.) [BHLW CPM06]

Our Algorithm
• Our algorithm to compute d(F,G):
1. If |F| < |G|, compute d(G,F).
2. Recursively run d(K_i, G) for every K_i.
3. Run Klein's strategy where the "master" is F (no need to recurse).
[Figure: F with subtrees K_1, ..., K_5, and G]

Analysis
• For the algorithm above, R(F,G) = ?

An $O((nm)^{3/2}) = O(n^3)$ Upper Bound
• We show that $R(F,G) = O((nm)^{3/2})$.
• Proof by induction on R(F,G). The steps of the derivation are justified, in order, by the inductive assumption, by (*) and (**), and by the fact that |G| < |F|.

An $O(nm^2(1 + \log\frac{n}{m}))$ Bound
• Proof idea:
• At most $\log(n/m)$ nested recursive calls where F is "master" before all trees have size ≤ m.
• For all trees of size ≤ m, use the previous $O(m^3)$ bound. There are at most n/m such trees, so the total is $(n/m) \cdot O(m^3) = O(nm^2)$.
[Figure: F with subtrees K_1, ..., K_5, and G]
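The arithmetic in the last bullet, written out as a worked equation. It only restates the slide's counts (at most n/m trees, each costing O(m^3)); the special case m = n is added here as a sanity check against the cubic bound.

```latex
\[
\underbrace{\frac{n}{m}}_{\text{trees of size} \,\le\, m}
\cdot
\underbrace{O(m^3)}_{\text{cost per tree}}
= O(nm^2),
\qquad
m = n \;\Rightarrow\; O(nm^2) = O(n^3).
\]
```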

A Matching Lower Bound for all decomposition strategy algorithms
• An $\Omega(nm^2)$ lower bound. [Figure: the trees F and G]
• Consider this computational path: if the strategy says left, delete from F; otherwise delete from G.
• For every two internal nodes v in F and w in G we get $\min\{|F_v|, |G_w|\}$ new subproblems ($F_v$ is the tree rooted at v).
• Summing over all such v, w gives the $\Omega(nm^2)$ bound.
• A lower bound matching the upper bound follows from a careful counting argument on a second pair of trees. [Figure: the trees F and G]

Thank you!