CSE 373: Data Structures and Algorithms

Slides:



Advertisements
Similar presentations
CSE 326: Data Structures Part 7: The Dynamic (Equivalence) Duo: Weighted Union & Path Compression Henry Kautz Autumn Quarter 2002 Whack!! ZING POW BAM!
Advertisements

1 Union-find. 2 Maintain a collection of disjoint sets under the following two operations S 3 = Union(S 1,S 2 ) Find(x) : returns the set containing x.
1 Disjoint Sets Set = a collection of (distinguishable) elements Two sets are disjoint if they have no common elements Disjoint-set data structure: –maintains.
Union-Find: A Data Structure for Disjoint Set Operations
CSE 326: Data Structures Disjoint Union/Find Ben Lerner Summer 2007.
Disjoint Union / Find CSE 373 Data Structures Lecture 17.
CSE 326: Data Structures Disjoint Union/Find. Equivalence Relations Relation R : For every pair of elements (a, b) in a set S, a R b is either true or.
Greedy Algorithms Reading Material: Chapter 8 (Except Section 8.5)
CSE 326: Data Structures Lecture #19 Disjoint Sets Dynamic Equivalence Weighted Union & Path Compression David Kaplan Today we’re going to get.
Lecture 9 Disjoint Set ADT. Preliminary Definitions A set is a collection of objects. Set A is a subset of set B if all elements of A are in B. Subsets.
Lecture 16: Union and Find for Disjoint Data Sets Shang-Hua Teng.
Greedy Algorithms Like dynamic programming algorithms, greedy algorithms are usually designed to solve optimization problems Unlike dynamic programming.
CSE 373, Copyright S. Tanimoto, 2002 Up-trees - 1 Up-Trees Review of the UNION-FIND ADT Straight implementation with Up-Trees Path compression Worst-case.
CS2420: Lecture 42 Vladimir Kulyukin Computer Science Department Utah State University.
MA/CSSE 473 Day 36 Kruskal proof recap Prim Data Structures and detailed algorithm.
2IL05 Data Structures Fall 2007 Lecture 13: Minimum Spanning Trees.
Spring 2015 Lecture 11: Minimum Spanning Trees
1 22c:31 Algorithms Union-Find for Disjoint Sets.
CSE373: Data Structures & Algorithms Lecture 11: Implementing Union-Find Aaron Bauer Winter 2014.
CMSC 341 Disjoint Sets. 8/3/2007 UMBC CMSC 341 DisjointSets 2 Disjoint Set Definition Suppose we have an application involving N distinct items. We will.
CMSC 341 Disjoint Sets Textbook Chapter 8. Equivalence Relations A relation R is defined on a set S if for every pair of elements (a, b) with a,b  S,
Union-find Algorithm Presented by Michael Cassarino.
Union Find ADT Data type for disjoint sets: makeSet(x): Given an element x create a singleton set that contains only this element. Return a locator/handle.
CSE373: Data Structures & Algorithms Lecture 11: Implementing Union-Find Nicki Dell Spring 2014.
CSE373: Data Structures & Algorithms Lecture 10: Implementing Union-Find Dan Grossman Fall 2013.
Fundamental Data Structures and Algorithms Peter Lee April 24, 2003 Union-Find.
Nirmalya Roy School of Electrical Engineering and Computer Science Washington State University Cpt S 223 – Advanced Data Structures Disjoint Sets.
1 The Disjoint Set ADT CS146 Chapter 8 Yan Qing Lei.
1 Today’s Material The dynamic equivalence problem –a.k.a. Disjoint Sets/Union-Find ADT –Covered in Chapter 8 of the textbook.
Minimum Spanning Trees Featuring Disjoint Sets HKOI Training 2006 Liu Chi Man (cx) 25 Mar 2006.
Union-Find  Application in Kruskal’s Algorithm  Optimizing Union and Find Methods.
MA/CSSE 473 Days Answers to student questions Prim's Algorithm details and data structures Kruskal details.
CMSC 341 Disjoint Sets. 2 Disjoint Set Definition Suppose we have N distinct items. We want to partition the items into a collection of sets such that:
CSE 589 Applied Algorithms Spring 1999 Prim’s Algorithm for MST Load Balance Spanning Tree Hamiltonian Path.
MA/CSSE 473 Day 37 Student Questions Kruskal Data Structures and detailed algorithm Disjoint Set ADT 6,8:15.
CSE 373, Copyright S. Tanimoto, 2001 Up-trees - 1 Up-Trees Review of the UNION-FIND ADT Straight implementation with Up-Trees Path compression Worst-case.
Data Structures for Disjoint Sets Manolis Koubarakis Data Structures and Programming Techniques 1.
How many edges does a tree of n nodes have? 1.2n-4 2.n(n-1)/2 3.n-1 4.n/2 5.Between n and 2n.
CSE 326: Data Structures: Set ADT
Disjoint Sets Data Structure
CSE 373, Copyright S. Tanimoto, 2001 Up-trees -
Chapter 8 Disjoint Sets and Dynamic Equivalence
Greedy method Idea: sequential choices that are locally optimum combine to form a globally optimum solution. The choices should be both feasible and irrevocable.
Disjoint Sets Chapter 8.
Greedy Algorithms / Minimum Spanning Tree Yin Tat Lee
CMSC 341 Disjoint Sets Based on slides from previous iterations of this course.
CSE373: Data Structures & Algorithms Lecture 11: Implementing Union-Find Linda Shapiro Spring 2016.
Disjoint Set Neil Tang 02/23/2010
Disjoint Set Neil Tang 02/26/2008
CS200: Algorithm Analysis
CSE 332: Data Structures Disjoint Set Union/Find
CSE 373 Data Structures and Algorithms
CSE 332: Data Abstractions Union/Find II
CSE373: Data Structures & Algorithms Lecture 19: Spanning Trees
ICS 353: Design and Analysis of Algorithms
CSE373: Data Structures & Algorithms Implementing Union-Find
Union-Find.
CMSC 341 Disjoint Sets.
CSE 326 Union Find Richard Anderson Text Chapter CSE 326.
Disjoint Sets DS.S.1 Chapter 8 Overview Dynamic Equivalence Classes
Lecture 20: Disjoint Sets
CMSC 341 Disjoint Sets.
Running Time Analysis Union is clearly a constant time operation.
Kruskal’s algorithm for MST and Special Data Structures: Disjoint Sets
Lecture 21 Amortized Analysis
Disjoint Sets Textbook Chapter 8
CSE 373: Data Structures and Algorithms
Disjoint Set Operations: “UNION-FIND” Method
Presentation transcript:

CSE 373: Data Structures and Algorithms Lecture 23: Disjoint Sets

Kruskal's Algorithm Implementation sort edges in increasing order of length (e1, e2, e3, ..., em). T := {}. for i = 1 to m if ei does not add a cycle: add ei to T. return T. But how can we determine that adding ei to T won't add a cycle?

Disjoint-set Data Structure Keeps track of a set of elements partitioned into a number disjoint subsets two sets are said to be disjoint if they have no elements in common Initially, each element e is a set in itself: e.g., { {e1}, {e2}, {e3}, {e4}, {e5}, {e6}, {e7}}

Operations: Union Union(x, y) – Combine or merge two sets x and y into a single set Before: {{e3, e5, e7} , {e4, e2, e8}, {e9}, {e1, e6}} After Union(e5, e1): {{e3, e5, e7, e1, e6} , {e4, e2, e8}, {e9}}

Operations: Find Determine which set a particular element is in Useful for determining if two elements are in the same set Each set has a unique name name is arbitrary; what matters is that find(a) == find(b) is true only if a and b in the same set one of the members of the set is the "representative" (i.e. name) of the set {{e3, e5, e7, e1, e6} , {e4, e2, e8}, {e9}}

Operations: Find Find(x) – return the name of the set containing x. {{e3, e5, e7, e1, e6} , {e4, e2, e8}, {e9}} Find(e1) = e5 Find(e4) = e8

Kruskal's Algorithm Implementation (Revisited) sort edges in increasing order of length (e1, e2, e3, ..., em). initialize disjoint sets. T := {}. for i = 1 to m let ei = (u, v). if find(u) != find(v) union(find(u), find(v)). add ei to T. return T. What does the disjoint set initialize to? How many times do we do a union? How many time do we do a find? What is the total running time if we have n nodes and m edges?

Disjoint Sets with Linked Lists Approach 1: Create a linked list for each set. last/first element is representative cost of union? find? Approach 2: Create linked list for each set. Every element has a reference to its representative.

Disjoint Sets with Trees Observation: trees let us find many elements given one root (i.e. representative)… Idea: if we reverse the pointers (make them point up from child to parent), we can find a single root from many elements… Idea: Use one tree for each subset. The name of the class is the tree root.

Up-Tree for Disjoint Sets Initial state 1 2 3 4 5 6 7 Intermediate state 1 3 7 2 5 4 Roots are the names of each set. 6

Union Operation Union(x, y) – assuming x and y roots, point x to y. 1 3 7 2 5 4 6

Find Operation Find(x): follow x to root and return root Find(6) = 7 1 3 7 2 5 4 Find(6) = 7 6

Simple Implementation Array of indices Up[x] = 0 means x is a root. 1 2 3 4 5 6 7 up 1 7 7 5 1 3 7 4 2 5 6

Union Union(up[] : integer array, x,y : integer) : { //precondition: x and y are roots// up[x] := y } Constant Time!

Find Exercise: write an iterative version of Find. Find(up[] : integer array, x : integer) : integer { //precondition: x is in the range 1 to size if up[x] == 0 return x else return Find(up, up[x]) } Exercise: write an iterative version of Find.

A Bad Case … … … Union(1,2) Union(2,3) : Union(n-1, n) Find(1) n steps!! 2 1

Improving Find Can we do better? Yes! Improve union so that find only takes Θ(log n) Union-by-size Reduces complexity to Θ(m log n + n) Improve find so that it becomes even better! Path compression Reduces complexity to almost Θ(m + n)

Union by Rank Union by Rank (also called Union by Size) Always point the smaller tree to the root of the larger tree Union(1,7) 2 1 1 3 4 7 2 5 4 6

Example Again … … … … 1 2 3 n Union(1,2) 2 3 n Union(2,3) 1 2 n : 1 3 Union(n-1,n) 2 … 1 3 n Find(1) constant time

Improved Runtime for Find via Union by Rank Depth of tree affects running time of Find Union by rank only increases tree depth if depth were equal Results in O(log n) for Find Find log2n

Elegant Array Implementation 2 1 1 3 4 7 2 5 4 6 1 2 3 4 5 6 7 up 1 7 7 5 weight 2 1 4

Union by Rank Union(i,j : index){ //i and j are roots// wi := weight[i]; wj := weight[j]; if wi < wj then up[i] := j; weight[j] := wi + wj; else up[j] :=i; weight[i] := wi + wj; }

Kruskal's Algorithm Implementation (Revisited) sort edges in increasing order of length (e1, e2, e3, ..., em). initialize disjoint sets. T := {}. for i = 1 to m let ei = (u, v). if find(u) != find(v) union(find(u), find(v)). add ei to T. return T.

Kruskal's Algorithm Running Time (Revisited) Assuming |E| = m edges and |V| = n nodes Sort edges: O(m log m) Initialization: O(n) Finds: O(2 * m * log n) = O(m log n) Unions: O(m) Total running time: O (m log n + n + m log n + m) = O(m log n) note: log n and log m are within a constant factor of one another

Path Compression On a Find operation point all the nodes on the search path directly to the root. 7 1 1 7 5 4 2 Find(3) 2 3 6 5 4 6 8 9 10 8 9 3 10

Self-Adjustment Works PC-Find(x) x

Path Compression Exercise: Draw the resulting up tree after Find(e) with path compression. c g a f h Note that everything along the path got pointed to c. Note also that this is definitely not a binary tree! b d i e

Path Compression Find PC-Find(i : index) { r := i; while up[r]  0 do //find root r := up[r]; if i  r then //compress path k := up[i]; while k  r do up[i] := r; i := k; k := up[k] return(r) }

Disjoint Union / Find with Union By Rank and Path Comp. Worst case time complexity for a Union using Union by Rank is (1) and for Find using Path Compression is (log n). Time complexity for m  n operations on n elements is (m log* n) log * is the number of times you need to apply the log function before you get to a number <= 1 log * n < 5 for all reasonable n. Essentially constant time per operation!

Amortized Complexity For disjoint union / find with union by rank and path compression average time per operation is essentially a constant worst case time for a Find is (log n) An individual operation can be costly, but over time the average cost per operation is not This means the bottleneck of Kruskal's actually becomes the sorting of the edges

Other Applications of Disjoint Sets Good for applications in need of clustering cities connected by roads cities belonging to the same country connected components of a graph Forming equivalence classes (see textbook) Maze creation (see textbook)