Union-Find A data structure for maintaining a collection of disjoint sets Course: Data Structures Lecturers: Haim Kaplan and Uri Zwick January 2014.



Union-Find
x ← Make-Set(info): Create an item x, with associated information info, and create a set containing it as its single item
Union(x,y): Unite the sets containing x and y
Find(x): Return a representative of the set containing x
Find(x)=Find(y) iff x and y are currently in the same set
Variation: Make-Set and Union specify a name for the new set; Find(x) returns the name of the set containing x

Union Find
a ← Make-Set()
b ← Make-Set()
Union(a,b)
Find(b) → a
Find(a) → a
c ← Make-Set()
d ← Make-Set()
e ← Make-Set()
Union(c,d)
Union(d,e)
Find(e) → d
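The operations above can be sketched in a few lines of Python. This is a deliberately naive illustration of the interface only, with no union by rank or path compression; the names `Item`, `make_set`, `find` and `union` are assumptions made for this sketch and do not come from the slides. Which element ends up as the representative is an implementation choice, so the sketch checks only that elements of the same set agree on their representative.

```python
class Item:
    """An item created by make_set; a root (parent is None) represents its set."""

    def __init__(self, info=None):
        self.info = info
        self.parent = None


def make_set(info=None):
    """Create an item and a singleton set containing it."""
    return Item(info)


def find(x):
    """Follow parent pointers up to the representative of x's set."""
    while x.parent is not None:
        x = x.parent
    return x


def union(x, y):
    """Unite the sets containing x and y (no effect if already united)."""
    rx, ry = find(x), find(y)
    if rx is not ry:
        ry.parent = rx


# The trace from the slide.
a, b = make_set("a"), make_set("b")
union(a, b)
c, d, e = make_set("c"), make_set("d"), make_set("e")
union(c, d)
union(d, e)
print(find(b) is find(a))  # True
print(find(e) is find(c))  # True
print(find(a) is find(c))  # False
```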

Union-Find
[Table: rows Make-Set, Link, Find; columns Amortized, Worst Case, Amortized; entries O(1), O(log n), O(1), O(log n), O(1), O(α(n))]
α(n): inverse Ackermann function, “almost constant”
Link(x,y): Unite the sets whose representative elements are x and y
Union(x,y) → Link(Find(x),Find(y))

Important application: Incremental Connectivity
A graph on n vertices is built by adding edges
At each stage we may want to know whether two given vertices are already connected
[Figure: a graph on vertices 1–7]
union(1,2) union(2,7) Find(1)=Find(6)? union(3,5) …
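As a rough illustration of incremental connectivity, the following Python sketch replays the operations listed on the slide on vertices 1 to 7. The dictionary-based representation and the helper names `find`, `union` and `connected` are assumptions of this sketch; it uses plain parent pointers without any of the optimizations discussed later.

```python
def find(parent, v):
    """Return the representative of the component containing v."""
    while parent[v] != v:
        v = parent[v]
    return v


def union(parent, u, v):
    """Add the edge (u, v): merge the components of u and v."""
    ru, rv = find(parent, u), find(parent, v)
    if ru != rv:
        parent[ru] = rv


def connected(parent, u, v):
    """Are u and v already connected by the edges added so far?"""
    return find(parent, u) == find(parent, v)


# Every vertex starts in its own component.
parent = {v: v for v in range(1, 8)}
union(parent, 1, 2)
union(parent, 2, 7)
before = connected(parent, 1, 6)  # Find(1)=Find(6)? No edge reaches 6 yet.
union(parent, 3, 5)
print(before)                     # False
print(connected(parent, 1, 7))    # True
```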

Fun application: Generating mazes
c1 ← Make-Set(1), c2 ← Make-Set(2), …, c16 ← Make-Set(16)
[Figure: 4×4 grid of cells numbered 1–16]
find(c6)=find(c7)? union(c6,c7)
find(c7)=find(c11)? union(c7,c11)
…
Choose edges in random order and remove them if they connect two different regions
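The maze procedure can be sketched as follows. This is a minimal Python sketch under stated assumptions: cells are numbered from 0 rather than 1, an "edge" is the wall between two adjacent cells, and the function name `generate_maze` and its `seed` parameter are hypothetical. The union-find here is the naive parent-pointer version, so it does not attain the running time quoted for the optimized structure.

```python
import random


def generate_maze(rows, cols, seed=None):
    """Return the list of walls removed to turn a rows x cols grid into a maze."""
    parent = list(range(rows * cols))  # one singleton region per cell

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    # Every wall separates two horizontally or vertically adjacent cells.
    walls = []
    for r in range(rows):
        for c in range(cols):
            cell = r * cols + c
            if c + 1 < cols:
                walls.append((cell, cell + 1))
            if r + 1 < rows:
                walls.append((cell, cell + cols))

    random.Random(seed).shuffle(walls)  # consider the walls in random order

    removed = []
    for a, b in walls:
        ra, rb = find(a), find(b)
        if ra != rb:          # the wall separates two different regions
            parent[ra] = rb   # union the regions
            removed.append((a, b))
    return removed


removed = generate_maze(4, 4, seed=1)
print(len(removed))  # 15: one removed wall per merge of 16 regions into one
```

Because a wall is removed only when it joins two different regions, the removed walls form a spanning tree of the grid, so there is exactly one path between any two cells.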

Fun application: Generating mazes
[Figure: grid of cells numbered 1–16]

Generating mazes – a larger example
Construction time: O(n² α(n²))

More serious applications:
Maintaining an equivalence relation
Incremental connectivity in graphs
Computing minimum spanning trees
…

Implementation using linked lists
Each set is represented as a linked list
Each element has a pointer to the list
[Figure: a set record with fields size (k), first and last, and a linked list of items, each pointing back to the set record]
Find(x) – O(1) time

Union using linked lists
[Figure: two lists of sizes k1 and k2, each with size, first and last fields]
Concatenate the two lists
Change “set pointers” of shorter list
Union(x,y) – O(min{k1,k2}) time

Union using linked lists: Analysis
Let n be the total number of Make-Set operations
Make-Set(x) and Find(x) take O(1) worst case time
Union(x,y) takes O(n) worst case time
But… Whenever the set pointer of an item is changed, the size of the set containing it is at least doubled
The set pointer of each item can be changed at most log n times
Total cost of all Union operations is O(n log n)
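The doubling argument can be checked on a small example. In the Python sketch below, a Python list stands in for the linked list, and the class names `ListSet` and `Item` as well as the `pointer_changes` counter are assumptions added purely for illustration. The shorter list is always appended to the longer one, so each change of an item's set pointer at least doubles the size of its set.

```python
class ListSet:
    """A set record; a Python list stands in for the linked list of items."""

    def __init__(self, item):
        self.items = [item]


class Item:
    def __init__(self, info=None):
        self.info = info
        self.set = ListSet(self)   # the "set pointer"
        self.pointer_changes = 0   # instrumentation for the analysis


def find(x):
    """O(1): follow the set pointer."""
    return x.set


def union(x, y):
    """Append the shorter list to the longer one: O(min{k1, k2}) pointer updates."""
    big, small = x.set, y.set
    if big is small:
        return
    if len(big.items) < len(small.items):
        big, small = small, big
    for item in small.items:
        item.set = big
        item.pointer_changes += 1
    big.items.extend(small.items)


# Merge 16 singletons in balanced rounds (sizes 1+1, 2+2, 4+4, 8+8).
items = [Item(i) for i in range(16)]
step = 1
while step < 16:
    for i in range(0, 16, 2 * step):
        union(items[i], items[i + step])
    step *= 2

print(max(it.pointer_changes for it in items))  # 4, which equals log2(16)
```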

Union-Find
[Table: rows Make-Set, Link, Find; columns Amortized, Worst Case, Amortized; entries O(1), O(log n), O(1), O(log n), O(1), O(α(n))]
Link(x,y): Unite the sets whose representative elements are x and y
Union(x,y) → Link(Find(x),Find(y))

Union Find
Represent each set as a rooted tree
Union by rank
Path compression
The parent of a vertex x is denoted by x.p
Find(x) traces the path from x to the root

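Union by rank
[Figure: linking two roots of equal rank r gives a root of rank r+1; when r1 < r2, the root of rank r1 becomes a child of the root of rank r2, whose rank is unchanged]
Union by rank on its own gives O(log n) find time
A tree of rank r contains at least 2^r elements
At most n/2^r nodes of rank ≥ r
If x is not a root, then x.rank < x.p.rank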

Path Compression

Union Find - pseudocode
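The pseudocode itself did not survive in this transcript. What follows is a minimal Python sketch of the standard formulation of union by rank with path compression, not a reconstruction of the slide. The field names `p` and `rank` follow the notation used on the earlier slides; the convention that a root is its own parent, and the class name `Node`, are assumptions of this sketch.

```python
class Node:
    def __init__(self, info=None):
        self.info = info
        self.p = self    # assumption: a root points to itself
        self.rank = 0


def make_set(info=None):
    return Node(info)


def find(x):
    """Return the root of x's tree, compressing the path from x to the root."""
    root = x
    while root.p is not root:
        root = root.p
    # Second pass: make every node on the path a child of the root.
    while x is not root:
        nxt = x.p
        x.p = root
        x = nxt
    return root


def link(x, y):
    """Unite the trees rooted at the distinct roots x and y; return the new root."""
    if x.rank > y.rank:
        x, y = y, x          # ensure x.rank <= y.rank
    if x.rank == y.rank:
        y.rank += 1          # equal ranks: the surviving root's rank grows by one
    x.p = y
    return y


def union(x, y):
    rx, ry = find(x), find(y)
    if rx is not ry:
        link(rx, ry)


nodes = [make_set(i) for i in range(8)]
for i in range(7):
    union(nodes[i], nodes[i + 1])
root = find(nodes[0])
print(all(find(v) is root for v in nodes))  # True
print(all(v.p is root for v in nodes))      # True: every path has been compressed
```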

Union-Find: union by rank + path compression
             make    link    find
Worst case   O(1)    O(1)    O(log n)
Amortized    O(1)    O(1)    O(α(n))

Nesting / Repeated application

Ackermann’s function (one of many variations)

The Tower function
n     T(n)
1     2
2     4
3     16
4     65,536
5     2^65,536

Ackermann’s function (modified)

Inverse functions

The log*n function
log*(n)   n
1         0 – 2
2         3 – 4
3         5 – 16
4         17 – 65,536
5         65,537 – 2^65,536
“For all practical purposes log*(n) ≤ 5”
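The two tables can be reproduced with a short Python sketch. The definitions used here are assumptions chosen to match the tabulated values: T(1) = 2 and T(i) = 2^T(i−1), and log*(n) is taken to be the smallest i ≥ 1 with T(i) ≥ n. The function names `tower` and `log_star` are hypothetical.

```python
def tower(i):
    """T(1) = 2 and T(i) = 2 ** T(i - 1) for i > 1."""
    t = 2
    for _ in range(i - 1):
        t = 2 ** t
    return t


def log_star(n):
    """Smallest i >= 1 with tower(i) >= n (the convention of the table above)."""
    i = 1
    while tower(i) < n:
        i += 1
    return i


print([tower(i) for i in range(1, 5)])           # [2, 4, 16, 65536]
print([log_star(n) for n in (2, 4, 16, 65536)])  # [1, 2, 3, 4]
print(log_star(65537))                           # 5
```

Python integers have arbitrary precision, so `tower(5)` (a number with 65,537 bits) is still computable; `tower(6)` is far beyond anything that fits in memory.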

Inverse Ackermann function is the inverse of the function

Inverse Ackermann function is the inverse of the function A “diagonal” The first “column”

O(log*n) upper bound
For the sake of simplicity, we prove an O(log*n) upper bound on the amortized cost of find
The O(α(n)) upper bound is more complicated (see potential based analysis below)
We use a variant of the accounting method in which items accumulate debits

O(log*n) upper bound
The level of a node x is defined to be level(x) = log*(x.rank)
level(x)   x.rank
1          0 – 2
2          3 – 4
3          5 – 16
4          17 – 65,536
5          65,537 – 2^65,536
…
i          T(i−1)+1 – T(i)

O(log*n) upper bound
level[x]   rank[x]
1          0 – 2
i          T(i−1)+1 – T(i)
The number of (non-root) nodes of level 1
The number of (non-root) nodes of level i > 1

O(log*n) upper bound
The ranks along each path are increasing
[Figure: a path from a node x up to the root; labels: Level i, Level i+1, Level < log*n]

O(log*n) upper bound
Consider a find operation passing through x:
If x is not a root, and not a child of the root, and level(x)=level(x.p), we charge x
Otherwise, we charge the find operation
Total charge for the find operation ≤ log*n
What is the total charge to all the nodes in an arbitrary sequence of operations?

O(log*n) upper bound

O(log*n) upper bound
Charge to each Find
Total charge to all nodes over all Find’s
amort(Make-Set)
amort(Find)

Lowest Common Ancestor (LCA)
LCA_T(x,y) – The lowest node z which is an ancestor of both x and y
[Figure: a tree T with root a and nodes a, b, c, d, e, f, g, h, i, j, k]
LCA(e,k) = a
LCA(f,g) = b
LCA(c,h) = c
…

The off-line LCA problem
Given a tree T and a collection P of pairs, find LCA_T(x,y) for every (x,y)∈P
Using Union-Find we can get O((m+n)α(m+n)) time, where n=|T| and m=|P|
There are more involved linear time algorithms, even for the on-line version

The off-line LCA problem
Going down: u→v Make-Set(v)
Going up: v→u Union(u,v)
[Figure: a node u with a child v; the nodes on the current path are marked “We want these to be the representatives (How do we do it?)”]
If w<v, then LCA(w,v) = “Find(w)”
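One possible way to keep the nodes on the current path as representatives is to always hang the finished child's tree directly under u. The Python sketch below does this with path compression only, so it forgoes union by rank and is not claimed to achieve the α bound quoted above. The function name `offline_lca`, the dictionary-of-children tree encoding, and the example tree are all assumptions; the tree is a hypothetical shape chosen only to agree with the three answers listed on the LCA slide.

```python
def offline_lca(children, root, pairs):
    """Answer every LCA query in `pairs` during a single depth-first traversal."""
    parent = {}       # union-find parent pointers
    visited = set()
    queries = {}
    for x, y in pairs:
        queries.setdefault(x, []).append(y)
        queries.setdefault(y, []).append(x)
    answer = {}

    def find(x):
        r = x
        while parent[r] != r:
            r = parent[r]
        while x != r:             # path compression
            nxt = parent[x]
            parent[x] = r
            x = nxt
        return r

    def visit(u):
        parent[u] = u             # going down: Make-Set(u)
        visited.add(u)
        for w in queries.get(u, []):
            if w in visited:      # w was visited before u, so LCA(w, u) = Find(w)
                answer[frozenset((u, w))] = find(w)
        for v in children.get(u, []):
            visit(v)
            parent[find(v)] = u   # going up: Union(u, v), keeping u as representative

    visit(root)
    return answer


# Hypothetical tree consistent with LCA(e,k)=a, LCA(f,g)=b, LCA(c,h)=c.
children = {"a": ["b", "c", "d"], "b": ["e", "f", "g"], "c": ["h", "i"], "d": ["j", "k"]}
result = offline_lca(children, "a", [("e", "k"), ("f", "g"), ("c", "h")])
print(result[frozenset(("e", "k"))])  # a
print(result[frozenset(("f", "g"))])  # b
print(result[frozenset(("c", "h"))])  # c
```

The traversal is recursive for brevity; a deep tree would need an explicit stack to stay within Python's recursion limit.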

The O(α(n)) upper bound for Union-Find (For those interested)

Amortized analysis (reminder) Actual cost of i-th operation Amortized cost of i-th operation Potential after i-th operation

Amortized analysis (cont.) Total actual cost
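The formulas on these two slides were lost in extraction. As a reminder, the standard potential-method identities are given below; the symbols c_i (actual cost), a_i (amortized cost), Φ_i (potential after the i-th operation) and m (number of operations) are notation assumed here, not recovered from the slides.

```latex
\begin{align*}
a_i &= c_i + \Phi_i - \Phi_{i-1}, \\
\sum_{i=1}^{m} a_i &= \sum_{i=1}^{m} c_i + \Phi_m - \Phi_0 .
\end{align*}
```

In particular, if Φ_m ≥ Φ_0 then the total actual cost is at most the total amortized cost.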

Level and Index Back to union-find…

Potentials

Bounds on level Definition Proof Claim

Bounds on index

Amortized cost of make
Actual cost: O(1)
ΔΦ: 0
Amortized cost: O(1)

Amortized cost of link
[Figure: root x is linked to root y; z1, …, zk are the other nodes affected]
Actual cost: O(1)
The potentials of y and z1,…,zk can only decrease
The potential of x is increased by at most α(n)
ΔΦ ≤ α(n)
Amortized cost: O(α(n))

Amortized cost of find
[Figure: path compression changes the parent of x from p[x] to y=p’[x]]
rank[x] is unchanged
rank[p[x]] is increased
level(x) is either unchanged or is increased
If level(x) is unchanged, then index(x) is either unchanged or is increased
If level(x) is increased, then index(x) is decreased by at most rank[x]–1
φ(x) is either unchanged or is decreased

Amortized cost of find
Suppose that:
[Figure: the find path x=x0, …, xi, …, xj, …, xl]
φ(x) is decreased!

Amortized cost of find
[Figure: the find path x=x0, …, xi, …, xj, …, xl]
The only nodes that can retain their potential are: the first, the last and the last node of each level
Actual cost: l+1
ΔΦ ≤ (α(n)+1) – (l+1)
Amortized cost: α(n)+1