Chapter 6: Union-Find and Related Structures CS6310 ADVANCED DATA STRUCTURE SHADHA MUHI & HASNAA IMAD.


Outline:
Background
6.1 Union-Find: Merging Classes of a Partition
6.2 Union-Find with Copies and Dynamic Segment Trees
6.3 List Splitting
6.4 Problems on Root-Directed Trees
6.5 Maintaining a Linear Order

Union WHOM??? & Find WHAT???

Background: Disjoint Data Set
A disjoint-set data structure keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets. It is sometimes called a union-find data structure. A union-find algorithm performs three useful operations on such a data structure:
Find: determine which subset a particular element is in. This can be used to determine whether two elements are in the same subset.
Union: join two subsets into a single subset.
Makeset: make a set containing only a given element (a singleton).
With these three operations, many practical partitioning problems can be solved.

Background: Disjoint Data Set
Given a set {1, 2, …, n} of n elements, initially each element is in a different set: {1}, {2}, …, {n}. An intermixed sequence of union and find operations is then performed.

Background: Set as Tree
S = {2, 4, 5, 9, 11, 13, 30}
[Figure: some possible tree representations of S]

Background: Root-Directed Trees
A union-find structure can be represented by trees with parent pointers. The root of each tree is called the representative of the tree (class).
[Figure: two example trees, with representatives 13 and 7]

Background: Union Operation
Union(i, j): i and j are the roots of two different trees, i != j. To unite the trees, make one tree a subtree of the other. The time complexity of the union operation is O(1).

Background: Union Operation
Which tree should become a subtree of the other? Example: Union(7, 13).

Background: Smart Union Approaches
There are two approaches to applying the union operation:
Union by Height
Union by Weight

Background: Union by Height
Make the tree with the smaller height a subtree of the other tree.
[Figure: Union(7, 13) on trees of heights 4 and 3; the resulting representative is 13]

Background: Union by Weight
The tree with fewer elements becomes a subtree of the other tree.
[Figure: Union(7, 13) on trees of weights 10 and 8; the resulting representative is 7]
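The two smart-union rules can be sketched as follows. This is a minimal illustration, not the slides' own code: it assumes elements are the integers 0..n-1 stored in a parent array, and the names `make_forest`, `union_by_height`, and `union_by_weight` are hypothetical.

```python
def make_forest(n):
    """Create n singleton trees; parent[r] == r marks r as a root."""
    parent = list(range(n))
    height = [0] * n   # height of the tree rooted at each root
    weight = [1] * n   # number of elements in the tree rooted at each root
    return parent, height, weight


def union_by_height(parent, height, i, j):
    """Unite the trees with distinct roots i and j; return the new root."""
    if height[i] < height[j]:
        i, j = j, i                 # make i the root of the taller tree
    parent[j] = i                   # the shorter tree becomes a subtree
    if height[i] == height[j]:
        height[i] += 1              # equal heights: the result is one taller
    return i


def union_by_weight(parent, weight, i, j):
    """Unite the trees with distinct roots i and j; return the new root."""
    if weight[i] < weight[j]:
        i, j = j, i                 # make i the root of the heavier tree
    parent[j] = i                   # the lighter tree becomes a subtree
    weight[i] += weight[j]
    return i
```

Both functions do a constant amount of work, which matches the O(1) cost stated for union on roots.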

Background: Find Operation
Find(i) identifies the set that contains element i. Start at the node that represents element i and climb up the tree until the root is reached. Return the element in the root (the representative). The time complexity of the find operation is O(h), where h is the height of the tree.
Example: Find(9) = Find(14) = 7.
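The climb to the root can be sketched in a few lines. The tree shape below is an assumption made up for illustration (the slide's figure is not available); only the result Find(9) = Find(14) = 7 comes from the slide.

```python
def find(parent, i):
    """Follow parent pointers from i until a root (its own parent) is reached."""
    while parent[i] != i:
        i = parent[i]
    return i


# Hypothetical tree with representative 7: 9 is a child of 7, 14 a child of 9.
parent = {7: 7, 9: 7, 14: 9}
same_set = find(parent, 9) == find(parent, 14)
```

The loop runs once per edge on the path to the root, which is where the O(h) bound comes from.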

Quick Quiz:
What is the total number of union operations needed to reach the last tree? Answer: n-1
What is the time complexity for Union(1, n)? Answer: O(n-1) = O(n)

What are the applications that utilize this structure?
Many applications use this structure. Some of them are:
Checking network connectivity.
Kruskal's minimum spanning tree algorithm.
Finding least common ancestors.
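As an illustration of the second application, here is a minimal sketch of Kruskal's algorithm that uses find and union to detect whether an edge would close a cycle. It assumes vertices numbered 0..n-1 and edges given as (weight, u, v) tuples; the plain union used here has no height or weight rule, to keep the sketch short.

```python
def kruskal(n, edges):
    """Return the edges of a minimum spanning forest as (weight, u, v) tuples."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    chosen = []
    for w, u, v in sorted(edges):       # consider edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                    # different classes: no cycle is created
            parent[ru] = rv             # union of the two classes
            chosen.append((w, u, v))
    return chosen


example_edges = [(3, 0, 2), (1, 0, 1), (4, 2, 3), (2, 1, 2)]
mst = kruskal(4, example_edges)
```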

6.1 Union-Find: Merging Classes of a Partition
The classical version of the union-find structure works in the following model: items can be inserted into a set, each initially forming a one-element partition class. Items are identified by pointers. So we have the following operations:
insert: takes an item, returns a pointer to the node representing the item, and creates a one-element class for it (the make_set operation).
join: takes two pointers to nodes and joins the classes containing these items (the union operation).
same_class: takes two pointers to nodes and decides whether their items are in the same class (a find operation with two inputs).

6.1 Union-Find: Merging Classes of a Partition
same_class(i, j): we can query whether two items are in the same class by following, from both nodes, the path to their respective roots; the items are in the same class if both paths reach the same root. Examples: same_class(h, i), same_class(m, r).
join(i, j): we can join two classes by connecting the root of one tree to the root of the other tree. Example: join(a, d).

6.1 Union-Find: Merging Classes of a Partition
When joining two trees, which of the two roots should become the new root? The time taken by a query is the length of the path to the root: a long path means a long query time. The best-known solution uses the following two techniques:
Union by rank: each node has a rank field, which is 0 on insertion. Each time we join two classes, the root with the larger rank becomes the new root; if both roots have the same rank, we choose either one as the new root and increase its rank by one. (Without path compression the rank equals the height of the tree; with path compression it is an upper bound on the height.)
Path compression: in each query or update, after reaching the root we go along the path again and make all its nodes point directly to the root.
[Figure: join(a, c), followed by path compression]
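The two techniques combine into a short structure. The following is a minimal sketch under stated assumptions: items are hashable Python values that stand in for the pointers of the model, and the class name `UnionFind` and the helper `find` are illustrative choices, not names taken from the slides.

```python
class UnionFind:
    def __init__(self):
        self.parent = {}
        self.rank = {}

    def insert(self, item):
        """Create a one-element class for item (make_set)."""
        self.parent[item] = item
        self.rank[item] = 0
        return item

    def find(self, x):
        """Return the root of x's tree and compress the path to it."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:       # second pass: repoint to the root
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def join(self, a, b):
        """Unite the classes of a and b using union by rank."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra                 # ra is the root of larger rank
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1              # tie: the new root's rank grows
        self.parent[rb] = ra

    def same_class(self, a, b):
        return self.find(a) == self.find(b)
```

A typical use is `uf = UnionFind()`, then `uf.insert(x)` for every item, followed by any mix of `uf.join(x, y)` and `uf.same_class(x, y)` calls.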

6.1 Union-Find: Merging Classes of a Partition
With union by rank, the worst case for each operation is O(h), which is O(log n) even without path compression. The O(log n) upper bound holds for a single operation; for a sequence of operations, path compression gives a better amortized bound. The amortized complexity is expressed in terms of a version of the inverse Ackermann function, which grows very slowly, unlike the fast-growing Ackermann function itself:
A(m, 0) = 0 for m ≥ 1,
A(m, 1) = A(m − 1, 2) for m ≥ 1,
A(0, n) = 2n for n ≥ 0,
A(m, n) = A(m − 1, A(m, n − 1)) for m ≥ 1, n ≥ 2.
The inverse Ackermann function is defined as α(n) = min{i | A(i, 1) > n}.
In the bounds that follow, m is the number of operations and n is the number of elements.
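To get a feeling for how slowly α grows, the definition above can be evaluated directly for small arguments. This is a sketch that follows the recurrence exactly as stated on the slide; the cutoff in `alpha` is an assumption made for practicality, because A(4, 1) is far too large to compute.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def A(m, n):
    """Ackermann variant as defined on the slide (small arguments only)."""
    if m == 0:
        return 2 * n
    if n == 0:
        return 0
    if n == 1:
        return A(m - 1, 2)
    return A(m - 1, A(m, n - 1))


def alpha(n):
    """alpha(n) = min{i | A(i, 1) > n}."""
    for i in range(4):
        if A(i, 1) > n:
            return i
    # A(3, 1) <= n here. A(4, 1) = A(2, A(3, 1)) is a tower of exponentials
    # that exceeds any integer fitting in memory, so the answer is 4.
    return 4
```

With this definition A(0, 1) = 2, A(1, 1) = 4, A(2, 1) = 8, and A(3, 1) = 512, so α(n) is at most 4 for every input size that can occur in practice.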

6.1 Union-Find: Merging Classes of a Partition
Theorem. The union-find structure with union by rank and path compression supports the operation insert in O(1) time and the operations same_class and join in O(log n) time on a set with n elements. A sequence of m same_class or join operations on a set with n elements takes O((m + n)α(n)) time.

6.1 Union-Find: Merging Classes of a Partition
If we follow a node v over a sequence of operations, its rank is initially 0 and is then increased by some join operations, but only while v is still the root of its tree. Once v becomes a non-root node, its rank cannot change further, and a non-root node can never become a root again. A root node has level 0; once the node becomes a non-root node, its level increases.
Over a sequence of m operations, the work done on a node v while it is a root is O(1) per operation, and each operation touches at most two root nodes, so the part of the work done on root nodes by the m operations is O(m). The majority of the work is done on nodes while they are non-root nodes, in path compression.
To reduce the worst-case complexity of the union-find operations, the tree height, which is determined by the number of nodes and their indegree, should be reduced by increasing the indegree of the nodes. This reduction is performed in the join operation: when the two roots have the same height and small indegree, the edges into one root are redirected to the other root, so that the new root has larger indegree but the same height. The height of a tree with n nodes and indegree k is Θ(log_k(n)) = Θ(log(n) / log(k)).

6.1 Union-Find: Merging Classes of a Partition
The time of a join operation depends on the indegree of the root plus the height of the tree, and this sum cannot be better than O(log(n) / log(log(n))).
Rules for joining two components with roots r and s:
If r->height > s->height: s and its lower neighbors are made to point to a lower neighbor of r.
Else if r->height = s->height: all lower neighbors of s are added to the list of lower neighbors of r, and
  if r->height > r->indegree: s is made to point to one lower neighbor of r;
  else: s becomes the new root with r as its only lower neighbor.
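A rough way to see where the log(n) / log(log(n)) bound comes from is to balance the two cost terms, indegree k and height log_k(n). The following is an informal calculation under that simplified cost model, not the proof from the reference text.

```latex
\[
\text{cost}(k) \;=\; k + \log_k n \;=\; k + \frac{\log n}{\log k}.
\]
Choosing $k = \dfrac{\log n}{\log\log n}$ gives
$\log k = \log\log n - \log\log\log n = \Theta(\log\log n)$, hence
\[
k + \frac{\log n}{\log k}
\;=\; \frac{\log n}{\log\log n} + \Theta\!\left(\frac{\log n}{\log\log n}\right)
\;=\; \Theta\!\left(\frac{\log n}{\log\log n}\right).
\]
```

A smaller k makes the height term larger, and a larger k makes the indegree term larger, so under this model neither direction improves on the balanced choice by more than a constant factor.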

6.1 Union-Find: Merging Classes of a Partition
Theorem. The union-find structure described before supports the operation insert in O(1) time and the operations same_class and join in O(log(n) / log(log(n))) time on a set with n elements.

6.2 Union-Find with Copies and Dynamic Segment Trees
The union-find model is very restricted: the sets have to be disjoint, and we can only take unions of them. After n − 1 unions everything is in the same class.
Union-copy structure: keeps track of a set of items, represented by fingers, and a collection of sets, also represented by fingers.
Example: Items = {1, 2, 3, 4, 5, 6, 7}, Sets = A, B, C, D, E.
[Figure: items and sets]

6.2 Union-Find with Copies and Dynamic Segment Trees
The underlying representation of the set system is as follows. The data structure consists of item nodes, set nodes, and sets in two extended union-find structures, labeled A and B, which allow both normal and listing queries.
With the edges directed from items to sets:
Each item node has exactly one outgoing edge.
Each set in A has at least two incoming edges and one outgoing edge.
Each set in B has exactly one incoming edge and at least two outgoing edges.
Each set node has exactly one incoming edge.
With the edges directed from sets to items:
Each set node has exactly one outgoing edge.
Each set in B has at least two incoming edges and one outgoing edge.
Each set in A has exactly one incoming edge and at least two outgoing edges.
Each item node has exactly one incoming edge.
[Figure: items and sets, shown in both edge directions]

6.2 Union-Find with Copies and Dynamic Segment Trees
It supports the following operations, which are symmetric with respect to the roles of items and sets:
create item, create set
list sets, list items
insert
join sets, join items
copy set, copy item
destroy set, destroy item

6.2 Union-Find with Copies and Dynamic Segment Trees
list_items: lists all items of a given set. Example: list_items(B).
1. Put the initial outgoing edge of the set node on the stack.
2. While the stack is not empty, take the next edge from the stack.
2.1 If this edge goes to an item node, list that item.
2.2 If this edge goes to union-find structure A, perform a listing query and put all outgoing edges on the stack.
2.3 If this edge goes to union-find structure B, perform a naming query and put the one outgoing edge on the stack.
The total complexity of a list_items query that returns k items is O(k uf(n)). The same holds for a list_sets query.
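The stack-based traversal in steps 1 to 2.3 can be sketched on a toy graph. This is only an illustration of the control flow: the dictionaries `kind` and `out` are hypothetical stand-ins for the real structure, and the listing and naming queries of the union-find structures A and B are replaced by plain lookups of the stored outgoing edges.

```python
def list_items(set_node, kind, out):
    """List all items reachable from set_node.

    kind[v] is one of "set", "A", "B", "item"; out[v] is the list of
    targets of v's outgoing edges in the sets-to-items direction.
    """
    stack = [out[set_node][0]]          # step 1: the set node's single edge
    items = []
    while stack:                        # step 2
        v = stack.pop()
        if kind[v] == "item":           # step 2.1
            items.append(v)
        elif kind[v] == "A":            # step 2.2: listing query, all edges
            stack.extend(out[v])
        elif kind[v] == "B":            # step 2.3: naming query, the one edge
            stack.append(out[v][0])
    return items


# Toy system: sets X and Y are copies of each other and both contain items 1, 2.
kind = {"X": "set", "Y": "set", "b1": "B", "a1": "A", 1: "item", 2: "item"}
out = {"X": ["b1"], "Y": ["b1"], "b1": ["a1"], "a1": [1, 2]}
items_of_x = sorted(list_items("X", kind, out))
```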

6.2 Union-Find with Copies and Dynamic Segment Trees
copy_set: creates a representation for a new set, which is a copy of the given set, and returns a finger to it. It takes only constant time, O(1). The same holds for copy_item.
Given the set node, we follow its outgoing edge. There are only two cases:
1. The outgoing edge of the set node goes directly to an item node or to the structure A: we create a new set node and a set with two new elements in structure B. The two set nodes are joined to the elements in B, and the name of the set in B is the previous outgoing pointer of the node to be duplicated. Then we return the new set node.
2. The outgoing edge of the set node goes to the structure B: we create a new set node and a new element in B, join the set node to the element, and insert the element into the set that the previous outgoing edge pointed to.
[Figure: Case 1 and Case 2]

6.2 Union-Find with Copies and Dynamic Segment Trees
join_sets: replaces the first set by the union of the two given sets and destroys the other set. It requires the two sets to be disjoint.
Given two set nodes X and Y, there are four cases:
1. Both set nodes point to nodes in structure A.
2. The first set node points to a node in structure A and the second to a node in structure B.
3. Both set nodes point to nodes in structure B or to item nodes.
4. The first set node points to a node in structure A and the second to an item node.
In each case the complexity of join_sets, and likewise of join_items, is O(uf(n)).

6.2 Union-Find with Copies and Dynamic Segment Trees
[Figure: join_sets, Cases 1 to 4]

6.2 Union-Find with Copies and Dynamic Segment Trees Theorem. The union-copy structure keeps track of a system of sets of total size n, supporting the operations create item, create set, insert, copy set, copy item in O(1); list sets, list items in output-sensitive O(k uf(n)) time if the output has size k; join sets, join items in O(uf(n)) time; destroy set, destroy item in O(k uf(n)) time if the size of the destroyed object was k.

6.2 Union-Find with Copies and Dynamic Segment Trees
A segment tree is a static data structure. It can be made dynamic by representing the sets attached to its nodes with the union-copy structure.
Theorem. A segment tree that uses the union-copy structure to represent the sets associated with the tree nodes supports insert into a tree already containing n intervals in O(log n) time, and list intervals for a query value contained in k intervals in O(log n + k uf(n)) output-sensitive time.
[Figure: each set of intervals is represented by a union-find structure]

6.3 List Splitting
In the union-find model we keep joining classes until all elements are in one class. But how do we split a class into two classes, and which element goes to which subclass? This is resolved by assuming that all elements are linearly ordered: a split is specified by an item, and the list is cut to the right of that item, giving two smaller lists.
In the list-splitting problem, we start with an ordered list of n items, each with a weight. Later, this list is replaced by a set of lists that partition the items into intervals of the original ordering. The structure supports the following operations:
split: splits the current list into two lists.
same_list: checks whether two items are in the same list.
max_weight: returns a finger to the item of maximum weight.
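To make the semantics of the three operations concrete, here is a deliberately naive model. It is not the balanced-tree structure that the slides go on to describe: it keeps a sorted list of cut positions and scans a sublist to find the maximum, so split and max_weight can take linear time. The class name `SplitList` is hypothetical, and taking an item argument in `max_weight` (meaning "within the list containing this item") is an assumption.

```python
import bisect


class SplitList:
    def __init__(self, items, weights):
        self.items = list(items)
        self.weights = list(weights)
        self.pos = {item: i for i, item in enumerate(self.items)}
        self.cuts = []      # sorted positions p with a cut right after p

    def split(self, item):
        """Cut the list to the right of item."""
        p = self.pos[item]
        i = bisect.bisect_left(self.cuts, p)
        if i == len(self.cuts) or self.cuts[i] != p:
            self.cuts.insert(i, p)

    def _bounds(self, item):
        """First and last position of the sublist containing item."""
        p = self.pos[item]
        i = bisect.bisect_left(self.cuts, p)    # first cut at or after p
        lo = self.cuts[i - 1] + 1 if i > 0 else 0
        hi = self.cuts[i] if i < len(self.cuts) else len(self.items) - 1
        return lo, hi

    def same_list(self, x, y):
        return self._bounds(x) == self._bounds(y)

    def max_weight(self, item):
        """Item of maximum weight in the sublist containing item."""
        lo, hi = self._bounds(item)
        best = max(range(lo, hi + 1), key=lambda i: self.weights[i])
        return self.items[best]
```

For example, with items a..h and a single `split("d")`, the items a and d are still in the same list while d and e are not.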

6.3 List Splitting
[Figure: an example list after split(h), split(d), and split(e)]

6.3 List Splitting
The three operations listed previously can be supported by a balanced search tree, such as a height-balanced tree or a red-black tree. We build a single balanced tree from the list in O(n) time as preprocessing and include in each node a pointer to the maximum-weight item in its subtree. Then each splitting operation splits the current tree into two trees. For each same_list query, we go up from both nodes to the root of their current tree and check whether they arrive at the same root. For the max_weight query, we go to the root and report the pointer stored in it. Each of these operations takes just O(log n) worst-case time.
This is even a dynamic data structure: we can insert new elements into a sublist as neighbors of a given element, and we can delete elements and join lists again, if the tree supports this.

6.3 List Splitting Theorem. Using any balanced search tree that supports split and join, we can build a dynamic structure that supports list splitting, with operations split, same list, max weight, join, insert, and delete, all in O(log n) worst-case time on a list of initial length n.

6.4 Problems on Root-Directed Trees
Least common ancestor (lca): given two nodes of the tree, each node defines a path to the root. What is the first node that lies on both paths? The lca is the first common ancestor of the two given nodes.

6.4 Problems on Root-Directed Trees
The lca structure keeps track of a set of root-directed trees and supports at least the following operations:
create tree: creates a new tree with just one node, the root, and returns a pointer to that root.
add leaf: adds a new leaf that is linked to a given node and returns a pointer to that new leaf node.
lca: returns a pointer to the least common ancestor of the two given nodes, or NULL if they are not in the same tree.
link: takes two nodes x and y in different trees, where x is the root of its tree, and links the trees by introducing an edge from x to y (not all structures support this).
delete leaf: removes a given node, which must be a leaf.
cut: removes the link from a given node to its upper neighbor, making the given node the root of a new tree.
find root: returns a pointer to the root of the tree containing a given node.
depth: returns the distance from a given node to the root.

6.4 Problems on Root-Directed Trees
Case 1: x and y have equal depths.
[Figure: forward pointers of x and forward pointers of y; pointer j has length 2^j, for j = 0, 1, 2, …, log h]

6.4 Problems on Root-Directed Trees
Case 2: x and y have different depths (k1 and k2, k1 > k2).
Replace the node at depth k1 by the node k1 − k2 steps further along its path to the root. This reduces the problem to the equal-depth case.
Example: k1 = 9, k2 = 6, k1 − k2 = 3.
[Figure: x is replaced by its ancestor x' three steps up; forward pointers of x' and forward pointers of y lead to the lca]
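Both cases can be sketched with exponential forward pointers, where pointer j of a node leads to the ancestor 2^j steps closer to the root. This is a minimal sketch of a subset of the operations (create tree, add leaf, depth, find root, lca); nodes are plain integer handles, and the class name `LcaForest` and its helper `_ancestor` are hypothetical.

```python
class LcaForest:
    def __init__(self):
        self.depth = {}
        self.up = {}        # up[v][j] = ancestor of v, 2**j steps up

    def create_tree(self):
        v = len(self.depth)
        self.depth[v] = 0
        self.up[v] = []
        return v

    def add_leaf(self, parent):
        v = len(self.depth)
        self.depth[v] = self.depth[parent] + 1
        ptrs = [parent]
        j = 0
        # The 2**(j+1) ancestor is the 2**j ancestor of the 2**j ancestor.
        while j < len(self.up[ptrs[j]]):
            ptrs.append(self.up[ptrs[j]][j])
            j += 1
        self.up[v] = ptrs
        return v

    def find_root(self, v):
        while self.up[v]:
            v = self.up[v][-1]          # always take the longest pointer
        return v

    def _ancestor(self, v, k):
        """Ancestor of v that is k steps up; requires k <= depth[v]."""
        j = 0
        while k:
            if k & 1:
                v = self.up[v][j]
            k >>= 1
            j += 1
        return v

    def lca(self, x, y):
        if self.find_root(x) != self.find_root(y):
            return None
        if self.depth[x] < self.depth[y]:
            x, y = y, x
        x = self._ancestor(x, self.depth[x] - self.depth[y])    # case 2
        if x == y:
            return x
        for j in reversed(range(len(self.up[x]))):               # case 1
            if j < len(self.up[x]) and self.up[x][j] != self.up[y][j]:
                x, y = self.up[x][j], self.up[y][j]
        return self.up[x][0]
```

In the equal-depth case the loop tries the longest pointers first and jumps only when the two targets differ, so x and y end up as the two children of the lca on their respective paths.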

6.4 Problems on Root-Directed Trees Theorem. The lca structure based on trees with lists of exponential forward pointers attached to the nodes supports create tree, depth in O(1) and add leaf, delete leaf, lca, and find root in time O(log h), where h is the maximum height of the trees in the underlying set.

6.5 Maintaining a Linear Order
The structure we are implementing is not necessarily a linked list, but the underlying abstract model is a set with a linear order, which can be visualized as a list. The task is to maintain a linear order under insertion and deletion. The elements are identified by fingers (pointers). The following operations should be supported:
insert(x, y): inserts x as the immediate smaller neighbor of y and returns a finger to x.
delete(x): deletes element x.
compare(x, y): decides whether x is smaller than y in the current linear order.
This problem would be easy if the elements came with keys and the order was the order of the keys: then a single key comparison would decide the order relation. Our problem is that we have to assign these keys based on the neighbor information available at insertion time.
Solution: use a balanced search tree with the elements in the leaves, and compare two elements (in left-to-right order) by following both paths up toward the root and checking the order in which the paths enter their first common vertex.
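The compare step can be sketched on a small hand-built tree. This sketch covers only the comparison: the tree is static and not rebalanced, insert and delete are omitted, and the `Node` class and its fields are hypothetical names chosen for illustration.

```python
class Node:
    def __init__(self, children=()):
        self.parent = None
        self.children = list(children)      # ordered left to right
        for child in self.children:
            child.parent = self


def depth(v):
    d = 0
    while v.parent is not None:
        v = v.parent
        d += 1
    return d


def compare(x, y):
    """True if leaf x comes before leaf y in left-to-right order."""
    if x is y:
        return False
    dx, dy = depth(x), depth(y)
    while dx > dy:                          # bring both paths to equal depth
        x, dx = x.parent, dx - 1
    while dy > dx:
        y, dy = y.parent, dy - 1
    while x.parent is not y.parent:         # climb to the first common vertex
        x, y = x.parent, y.parent
    siblings = x.parent.children
    return siblings.index(x) < siblings.index(y)


# Leaves a, b, c, d in this left-to-right order, at different depths.
a, b, c, d = Node(), Node(), Node(), Node()
inner = Node([b, c])
root = Node([a, inner, d])
a_before_c = compare(a, c)
```

In a balanced tree both climbs take O(log n) steps, which matches the compare bound in the theorem for this section.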

6.5 Maintaining a Linear Order Theorem. Using a balanced search tree that allows constant time update at a known location, we can maintain a linear order with O(1) worst-case time of insert and delete and O(log n) worst-case time of compare.

