CMSC 341 Disjoint Sets Based on slides from previous iterations of this course.

Today's Topics
- Introduction to Disjoint Sets
- Disjoint Set Example
- Operations of a Disjoint Set
- Types of Disjoint Sets
- Optimization of Disjoint Sets
- Code for Disjoint Sets
- Performance of Disjoint Sets

Introduction to Disjoint Sets

Disjoint Sets
- A data structure that keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets
- Great for problems that involve equivalence relations

Universe of Items
- The universal set is made up of all of the items that can be a member of a set
[Figure: items A, B, C, D, E inside the universe of items. From: https://www.youtube.com/watch?v=UBY4sF86KEY]

Disjoint Sets
- A group of sets where no item can be in more than one set
- Supported operations: Find(), Union(), MakeSet()
[Figure: the items A, B, C, D, E of the universe divided among the disjoint sets S1, S2, S3, S4. From: https://www.youtube.com/watch?v=UBY4sF86KEY]

Uses for Disjoint Sets
- Maze generation
- Kruskal's algorithm for computing the minimum spanning tree of a graph
- Given a set of cities, C, and a set of roads, R, that each connect two cities (x, y), determining whether it's possible to travel from any given city to another given city
- Determining whether there are cycles in a graph
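The last use in the list, cycle detection, can be sketched in a few lines of Java. This is a minimal illustration, not code from the slides: the class name CycleCheck, the method names, and the convention that a root is its own parent (parent[i] == i) are all assumptions made for the example, and no optimizations are applied.

```java
// Hypothetical example class; names and the parent[i] == i root
// convention are illustrative assumptions, not taken from the slides.
class CycleCheck {
    // Follow parent links until reaching an element that is its own parent.
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            x = parent[x];
        }
        return x;
    }

    // Returns true if the undirected graph on vertices 0..n-1 has a cycle.
    static boolean hasCycle(int n, int[][] edges) {
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;            // MakeSet: every vertex starts alone
        }
        for (int[] e : edges) {
            int a = find(parent, e[0]);
            int b = find(parent, e[1]);
            if (a == b) {
                return true;          // endpoints already connected: edge closes a cycle
            }
            parent[a] = b;            // Union: merge the two components
        }
        return false;
    }

    public static void main(String[] args) {
        int[][] path = {{0, 1}, {1, 2}, {2, 3}};
        int[][] loop = {{0, 1}, {1, 2}, {2, 0}};
        System.out.println(hasCycle(4, path)); // false
        System.out.println(hasCycle(3, loop)); // true
    }
}
```

The same two steps (Find both endpoints, Union if they differ) are what Kruskal's algorithm uses to decide whether to accept an edge.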

Disjoint Set Example

Disjoint Set with No Unions
[Figure: elements 1 through 13, each drawn as its own one-node tree.]

    Index:   0   1   2   3   4   5   6   7   8   9  10  11  12  13
    Value:  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1

- A negative number means we are at the root
- A positive number means we need to move or "walk" to that index to find our root
- The LONGER the path, the longer Find takes, and the farther we move from our goal of a constant-time operation

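Disjoint Set with Some Unions
[Figure: a tree rooted at 9 (9 has child 8, which has child 7), a tree rooted at 4 (with child 2), and every other element as a one-node tree.]

    Index:   0   1   2   3   4   5   6   7   8   9  10  11  12  13
    Value:  -1  -1   4  -1  -1  -1  -1   8   9  -1  -1  -1  -1  -1

Notice: the value stored at an index is the index that element is linked to (its parent)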
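To make the "walk" concrete, here is a small Java sketch that follows the links in the array from this slide. The class and method names are hypothetical, and the array is assumed to cover indices 0 through 13 (the slide lists 14 values).

```java
// Hypothetical demo class; the array literal is the one shown on the slide.
class ParentArrayDemo {
    // A negative entry marks a root; any other entry is the parent's index.
    static final int[] U = {-1, -1, 4, -1, -1, -1, -1, 8, 9, -1, -1, -1, -1, -1};

    // Walk parent links until a root (negative entry) is reached.
    static int findRoot(int[] u, int i) {
        while (u[i] >= 0) {
            i = u[i];
        }
        return i;
    }

    public static void main(String[] args) {
        System.out.println(findRoot(U, 7)); // walks 7 -> 8 -> 9, prints 9
        System.out.println(findRoot(U, 2)); // walks 2 -> 4, prints 4
        System.out.println(findRoot(U, 5)); // 5 is already a root, prints 5
    }
}
```

Element 7 needs two steps to reach its root while element 2 needs one, which is the path-length cost the previous slide warns about.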

Operations of a Disjoint Set

Find()
- Determines which subset an element is in
- Returns the name of the subset
- Find() typically returns an item from this set that serves as its "representative"
- By comparing the results of two Find() operations, one can determine whether two elements are in the same subset

Find()
- Asks the question: what set does item E currently belong to?
- What does Find(E) return? It returns S2
[Figure: the universe of items with E inside set S2. From: https://www.youtube.com/watch?v=UBY4sF86KEY]

Union()
- Merges two sets (each with one or more items) together
- Order can be important
- One of the roots from the 2 sets will become the root of the merged set

Union()
- Joins two subsets into a single subset
[Figure: the sets before and after Union(S2, S1). From: https://www.youtube.com/watch?v=UBY4sF86KEY]

MakeSet()
- Makes a set containing only a given element (a singleton)
- Implementation is generally trivial

Types of Disjoint Sets

Types of Disjoint Sets
There are two types of disjoint sets:
- Array Based Disjoint Sets
- Tree Based Disjoint Sets
(We can also implement them with a linked list)

Array Based Disjoint Sets
- We will assume that elements are 0 to n - 1
- Maintain an array A: for each element i, A[i] is the name of the set containing i
- The array stores only each element's set representative, not links

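Array Based Disjoint Sets
- Find(i) returns A[i] and runs in O(1)
- Union(i, j) requires scanning the entire array and runs in O(n)

    // Save both names first: A[j] itself is overwritten during the scan,
    // so comparing against A[j] inside the loop would miss later elements.
    int from = A[j];
    int to = A[i];
    for (int k = 0; k < n; k++) {
        if (A[k] == from) {
            A[k] = to;
        }
    }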
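A complete version of the array-based approach might look like the following sketch. The class name QuickFind and the choice to start each element i with the set name i are assumptions for illustration; the slides only say that A[i] holds the name of the set containing i.

```java
// Hypothetical class illustrating the array-based representation.
class QuickFind {
    private final int[] name;   // name[i] is the name of the set containing i

    QuickFind(int n) {
        name = new int[n];
        for (int i = 0; i < n; i++) {
            name[i] = i;        // assumption: each element starts in a set named after itself
        }
    }

    // O(1): a single array lookup.
    int find(int i) {
        return name[i];
    }

    // O(n): every element carrying j's set name is relabeled with i's.
    void union(int i, int j) {
        int keep = name[i];
        int drop = name[j];     // captured before the loop can overwrite name[j]
        if (keep == drop) {
            return;
        }
        for (int k = 0; k < name.length; k++) {
            if (name[k] == drop) {
                name[k] = keep;
            }
        }
    }

    public static void main(String[] args) {
        QuickFind q = new QuickFind(5);
        q.union(2, 3);
        q.union(0, 2);
        System.out.println(q.find(3) == q.find(0)); // true
        System.out.println(q.find(4) == q.find(0)); // false
    }
}
```

The trade-off is the mirror image of the tree-based version: Find is as fast as it can be, but every Union touches the whole array.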

Tree Based Disjoint Sets
- A disjoint-set forest represents each set as a tree
- Each node holds a reference to its parent node
- In a disjoint-set forest, the representative of each set is the root of that set's tree

Tree Based Disjoint Sets
- Find() follows parent nodes until it reaches the root
- Union() combines two trees into one by attaching the root of one to the root of the other

Animation Disjoint Sets https://www.cs.usfca.edu/~galles/visualization/DisjointSets.html

Optimization of Disjoint Sets

Optimization
Three main optimizations:
- Union-by-rank (size)
- Union-by-rank (height)
- Path Compression

Union-by-Rank (size)
- Size = number of nodes (including the root) in a given set
- A strategy to keep items in a tree from getting too deep (long paths) by uniting sets intelligently
- At each root, we record the size of its tree, that is, the number of nodes in the whole tree
- Link the smaller tree to the larger tree

Union-by-Rank (size)
[Figure: a tree rooted at 9 with children 10, 8 and 11 (8 has child 7), a tree rooted at 4 with child 2, and every other element as a one-node tree.]

    Index:   0   1   2   3   4   5   6   7   8   9  10  11  12  13
    Value:  -1  -1   4  -1  -2  -1  -1   8   9  -5   9   9  -1  -1

Notice two things:
- The value stored at an index is the index that element is linked to
- A root stores the negated size of its set, so as the set grows the root's value becomes more negative (see 4 and 9)

Union-by-Rank (height)
- A strategy to keep items in a tree from getting too deep (long paths) by uniting sets intelligently
- At each root, we record the height of its tree
- When uniting two trees, make the shorter tree a sub-tree of the taller one, so that the taller tree does not gain another level!!

Union-by-Rank (height)
[Figure: a tree rooted at 9 (9 has child 8, which has child 7), a tree rooted at 4 (with child 2), and every other element as a one-node tree.]

    Index:   0   1   2   3   4   5   6   7   8   9  10  11  12  13
    Value:  -1  -1   4  -1  -2  -1  -1   8   9  -3  -1  -1  -1  -1

Notice two things:
- The value stored at an index is the index that element is linked to
- A root stores the negated height of its tree, so as the tree gets taller the root's value becomes more negative (see 4 and 9)

What if we merge {2, 4} with {7, 8, 9}? Because 9 has a greater height than 4, 4 would be absorbed into 9.

Union-by-Rank (height)
[Figure: the merged tree rooted at 9, with 4 (and its child 2) now attached under 9.]
Update 4 to point to 9 (its entry changes from -2 to 9):

    Index:   0   1   2   3   4   5   6   7   8   9  10  11  12  13
    Value:  -1  -1   4  -1   9  -1  -1   8   9  -3  -1  -1  -1  -1

When uniting two trees, make the shorter tree a sub-tree of the taller one so that the taller tree does not gain another level!! Here the height recorded at 9 stays -3.
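The slides give code for union by size later on but not for union by height, so here is a hedged Java sketch of the height rule. It assumes the convention used in the arrays above (a root stores the negated number of levels in its tree, so a lone element stores -1); the class name HeightUnion and the tie-breaking choice are illustrative assumptions.

```java
// Hypothetical class illustrating union by height on a parent array.
class HeightUnion {
    final int[] u;   // u[i] < 0: i is a root and -u[i] is its tree's height; else u[i] is i's parent

    // Start from an existing parent array (copied so the caller's array is untouched).
    HeightUnion(int[] initial) {
        u = initial.clone();
    }

    int find(int i) {
        while (u[i] >= 0) {
            i = u[i];
        }
        return i;
    }

    void union(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        if (ra == rb) {
            return;                 // already in the same set
        }
        if (u[ra] < u[rb]) {        // ra's tree is taller (more negative)
            u[rb] = ra;
        } else if (u[rb] < u[ra]) { // rb's tree is taller
            u[ra] = rb;
        } else {                    // equal heights: rb becomes the root (arbitrary choice)
            u[ra] = rb;
            u[rb]--;                // and gains exactly one level
        }
    }

    public static void main(String[] args) {
        // The array from the slide, before merging {2, 4} with {7, 8, 9}.
        int[] start = {-1, -1, 4, -1, -2, -1, -1, 8, 9, -3, -1, -1, -1, -1};
        HeightUnion h = new HeightUnion(start);
        h.union(2, 7);
        System.out.println(h.u[4]); // 9: root 4 now points to 9
        System.out.println(h.u[9]); // -3: the taller tree did not gain a level
    }
}
```

Only the equal-height case changes a recorded height, which is why the value at 9 is still -3 after the merge.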

Example of Unions
If we union 5 and 9, how will they be joined?
- By rank (size)? 9 becomes a child of 5
- By rank (height)? 5 becomes a child of 9

Path Compression
- If our path gets longer, operations take longer
- We can shorten this (literally and figuratively) by pointing each element on the path directly at the root
- No more walking through intermediate nodes to get to the root
- Done as part of Find(), so the speed-up arrives gradually, as Find() calls are made

Path Compression
- Theoretically flattens out a tree
- Uses recursion
  - Recurse until you find the root
  - Base case: at the root, return the root value
  - Reassign each node's parent to the root as the call stack collapses

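Path Compression
Before Path Compression
[Figure: a tree rooted at 9 with children 10, 8 and 11; 6 is under 10, 7 is under 8, 12 is under 11, and 13 is under 12. A second tree is rooted at 4 with child 2.]

    Index:   0   1   2   3   4   5   6   7   8   9  10  11  12  13
    Value:  -1  -1   4  -1  -1  -1  10   8   9  -1   9   9  11  12

During a Find(), every index visited on the way up is updated to point to the root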

Path Compression
After Path Compression
[Figure: the tree rooted at 9 flattened, with the compressed nodes attached directly to 9.]

    Index:   0   1   2   3   4   5   6   7   8   9  10  11  12  13
    Value:  -1  -1   4  -1  -1  -1   9   9   9  -1   9   9   9   9

- After we run Find(6), we update it to point to 9
- After we run Find(13), we update it to point to 9, along with all other nodes between 13 and 9!
- Index 7 also points to 9 in this array, which is what a Find(7) would produce
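The recursive scheme described above (base case at the root, reassign as the call stack collapses) can be sketched in Java as follows. The class name is hypothetical, and the array is the "before" array from the slides, assumed to cover indices 0 through 13.

```java
// Hypothetical demo class for recursive Find with path compression.
class PathCompressionDemo {
    // Negative entries mark roots; other entries are parent indices.
    static int find(int[] u, int i) {
        if (u[i] < 0) {
            return i;               // base case: i is the root
        }
        u[i] = find(u, u[i]);       // reassign as the call stack collapses
        return u[i];
    }

    public static void main(String[] args) {
        int[] u = {-1, -1, 4, -1, -1, -1, 10, 8, 9, -1, 9, 9, 11, 12};
        find(u, 6);   // path 6 -> 10 -> 9
        find(u, 13);  // path 13 -> 12 -> 11 -> 9
        System.out.println(java.util.Arrays.toString(u));
        // [-1, -1, 4, -1, -1, -1, 9, 8, 9, -1, 9, 9, 9, 9]
    }
}
```

Only nodes on the paths that were actually searched get re-pointed: after these two calls index 7 still points to 8, and it would move to 9 only once Find(7) is run.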

Code for Disjoint Sets

Generic Code

    function MakeSet(x)
        x.parent := x

    function Find(x)
        if x.parent == x
            return x
        else
            return Find(x.parent)

    function Union(x, y)
        xRoot := Find(x)
        yRoot := Find(y)
        xRoot.parent := yRoot

Does this include any of the three optimizations (union-by-rank (height), union-by-rank (size), or path compression)?
ANSWER: NO
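A direct Java rendering of this pseudocode shows what the missing optimizations cost. The class name NaiveForest, the Node type with its label field, and the depth helper are all hypothetical additions for the demonstration.

```java
// Hypothetical class: a literal translation of the generic pseudocode.
class NaiveForest {
    static final class Node {
        final String label;
        Node parent;

        Node(String label) {        // plays the role of MakeSet(x)
            this.label = label;
            this.parent = this;     // a root is its own parent
        }
    }

    static Node find(Node x) {
        return x.parent == x ? x : find(x.parent);
    }

    static void union(Node x, Node y) {
        Node xRoot = find(x);
        Node yRoot = find(y);
        xRoot.parent = yRoot;       // always links x's root under y's root
    }

    // Helper (not in the pseudocode): number of links from x up to its root.
    static int depth(Node x) {
        int d = 0;
        while (x.parent != x) {
            x = x.parent;
            d++;
        }
        return d;
    }

    public static void main(String[] args) {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C"), d = new Node("D");
        union(a, b);
        union(a, c);
        union(a, d);
        System.out.println(find(a).label); // D
        System.out.println(depth(a));      // 3: the tree has degenerated into a chain
    }
}
```

With neither a union-by-rank rule nor path compression, n - 1 unions in this order build a chain of length n - 1, so a single Find can take linear time.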

Implementation (labeled C++ on the slide; the syntax shown is Java)

    class UnionFind {
        int[] u;   // u[i] < 0: i is a root and -u[i] is its set's size; otherwise u[i] is i's parent

        UnionFind(int n) {
            u = new int[n];
            for (int i = 0; i < n; i++)
                u[i] = -1;
        }

        int find(int i) {
            int j, root;
            for (j = i; u[j] >= 0; j = u[j])
                ;
            root = j;
            while (u[i] >= 0) {
                j = u[i];
                u[i] = root;
                i = j;
            }
            return root;
        }

        void union(int i, int j) {
            i = find(i);
            j = find(j);
            if (i != j) {
                if (u[i] < u[j]) {
                    u[i] += u[j];
                    u[j] = i;
                } else {
                    u[j] += u[i];
                    u[i] = j;
                }
            }
        }
    }

Does this include any of the three optimizations (union-by-rank (height), union-by-rank (size), or path compression)?
ANSWER: YES. union() uses union-by-rank (size) in its final if/else, and the second loop of find() performs path compression.

The UnionFind class

    class UnionFind {
        int[] u;

        UnionFind(int n) {
            u = new int[n];
            for (int i = 0; i < n; i++)
                u[i] = -1;
        }

        int find(int i) { ... }

        void union(int i, int j) { ... }
    }

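Trick 1: Iterative find

    int find(int i) {
        int j, root;
        // First pass: walk up from i until a root (negative entry) is found.
        for (j = i; u[j] >= 0; j = u[j])
            ;
        root = j;
        // Second pass: walk the same path again, pointing each node at the root.
        while (u[i] >= 0) {
            j = u[i];
            u[i] = root;
            i = j;
        }
        return root;
    }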

Trick 2: Union by size

    void union(int i, int j) {
        i = find(i);
        j = find(j);
        if (i != j) {
            if (u[i] < u[j]) {   // i's set is larger (more negative)
                u[i] += u[j];    // add j's size to i's
                u[j] = i;        // link j's root under i
            } else {
                u[j] += u[i];
                u[i] = j;
            }
        }
    }
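Putting the two tricks together gives a small class that can be compiled and run. This is a sketch assembled from the slide code; the size() helper and the main method are additions for the demonstration and are not part of the original class.

```java
// Assembled from the slide code; size() and main() are illustrative additions.
class UnionFind {
    private final int[] u;  // u[i] < 0: i is a root, -u[i] is its set's size; else u[i] is i's parent

    UnionFind(int n) {
        u = new int[n];
        for (int i = 0; i < n; i++) {
            u[i] = -1;      // n singleton sets
        }
    }

    // Iterative find with path compression.
    int find(int i) {
        int root = i;
        while (u[root] >= 0) {
            root = u[root];
        }
        while (u[i] >= 0) {
            int next = u[i];
            u[i] = root;
            i = next;
        }
        return root;
    }

    // Union by size: the smaller set's root is linked under the larger set's root.
    void union(int i, int j) {
        i = find(i);
        j = find(j);
        if (i == j) {
            return;
        }
        if (u[i] < u[j]) {
            u[i] += u[j];
            u[j] = i;
        } else {
            u[j] += u[i];
            u[i] = j;
        }
    }

    // Hypothetical helper: number of elements in i's set.
    int size(int i) {
        return -u[find(i)];
    }

    public static void main(String[] args) {
        UnionFind uf = new UnionFind(6);
        uf.union(0, 1);
        uf.union(2, 3);
        uf.union(1, 3);
        System.out.println(uf.find(0) == uf.find(2)); // true
        System.out.println(uf.size(0));               // 4
        System.out.println(uf.find(4) == uf.find(0)); // false
    }
}
```

Storing the size as a negative number in the root's own slot is what lets a single int array serve as both the parent links and the bookkeeping for union by size.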

Disjoint Sets Performance

Performance
In a nutshell:
- Once the two roots are known, a union takes Θ(1) time
  - It uses ONE pointer to connect one root to the other
  - The linking code has no loops or recursion
  - Θ(f(n)) is a tight bound: the running time is bounded both above and below by constant multiples of f(n), which holds here because the worst case and best case are identical
- Running time of find depends on the implementation
  - Union by size: Find is O(log(n))
  - Union by height: Find is O(log(n))
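The O(log(n)) bound for Find under union by size follows from a short counting argument, sketched here. It is not taken from the slides; d(x) and s(x) are notation introduced for this sketch. The same idea, with height in place of size, gives the bound for union by height.

```latex
% d(x): depth of node x (number of links from x to its root)
% s(x): number of nodes in the tree that currently contains x
% n:    total number of elements
%
% Under union by size, d(x) increases by 1 only when x's tree is linked
% under a tree that is at least as large, so s(x) at least doubles each time.
\[
  s(x) \;\ge\; 2^{d(x)}
  \quad\text{and}\quad
  s(x) \;\le\; n
  \;\;\Longrightarrow\;\;
  d(x) \;\le\; \log_2 n .
\]
```

Since Find(x) follows exactly d(x) links, it takes O(log(n)) time in the worst case.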

Performance
The average running time of find and union operations in the quick-union data structure is so close to a constant that it's hardly worth mentioning that, in an asymptotic sense, it's slightly slower than constant.

Performance
- With union-by-rank and path compression both in use, a sequence of f find and u union operations (in any order and possibly interleaved) takes Θ(u + f α(f + u, u)) time in the worst case
- α is an extremely slowly growing function, known as the inverse Ackermann function
- This function is never larger than 4 for any values of f and u you could ever use (though it can get arbitrarily large for unimaginably large values of f and u)
- Hence, for all practical purposes, think of quick-union as having find operations that run, on average, in constant time

Announcements
- Homework 5 is out: due Tuesday, November 17th at 8:59:59 PM
- Project 5 is out: due Tuesday, December 1st at 8:59:59 PM
- Next Time: Exam 2 Review
- No Class on Wednesday, November 25th
- Happy Thanksgiving!