1 Today’s Material
The dynamic equivalence problem
–a.k.a. Disjoint Sets/Union-Find ADT
–Covered in Chapter 8 of the textbook


2 Motivation
Consider the relation “=” between integers
1. For any integer A, A = A (reflexive)
2. For integers A and B, A = B means that B = A (symmetric)
3. For integers A, B, and C, A = B and B = C means that A = C (transitive)
Consider cities connected by two-way roads
1. A is trivially connected to itself
2. A is connected to B means B is connected to A
3. If A is connected to B and B is connected to C, then A is connected to C

3 Equivalence Relationships
An equivalence relation R obeys three properties:
1. reflexive: for any x, xRx is true
2. symmetric: for any x and y, xRy implies yRx
3. transitive: for any x, y, and z, xRy and yRz implies xRz
The preceding relations are all examples of equivalence relations
What are not equivalence relations?

4 Equivalence Relationships
An equivalence relation R obeys three properties:
1. reflexive: for any x, xRx is true
2. symmetric: for any x and y, xRy implies yRx
3. transitive: for any x, y, and z, xRy and yRz implies xRz
What about “<” on integers?
–Properties 1 and 2 are violated: x < x is never true, and x < y rules out y < x
What about “≤” on integers?
–Property 2 is violated: 1 ≤ 2 holds but 2 ≤ 1 does not
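
The three properties can be checked mechanically on a small range of integers. The sketch below is only an illustration: the names `Relation`, `is_reflexive`, `is_symmetric`, `is_transitive`, `is_equivalence`, and the three sample relations are assumptions introduced here, and a check over 0..n-1 can refute a property but cannot prove it for all integers.

```c
#include <assert.h>
#include <stdbool.h>

/* A relation is modelled as a predicate on two integers (hypothetical type). */
typedef bool (*Relation)(int, int);

bool less_than(int x, int y)  { return x < y; }
bool less_equal(int x, int y) { return x <= y; }
bool same_mod5(int x, int y)  { return x % 5 == y % 5; }  /* x, y >= 0 assumed */

/* Property 1: xRx for every x in 0..n-1 */
bool is_reflexive(Relation r, int n) {
    for (int x = 0; x < n; x++)
        if (!r(x, x)) return false;
    return true;
}

/* Property 2: xRy implies yRx */
bool is_symmetric(Relation r, int n) {
    for (int x = 0; x < n; x++)
        for (int y = 0; y < n; y++)
            if (r(x, y) && !r(y, x)) return false;
    return true;
}

/* Property 3: xRy and yRz imply xRz */
bool is_transitive(Relation r, int n) {
    for (int x = 0; x < n; x++)
        for (int y = 0; y < n; y++)
            for (int z = 0; z < n; z++)
                if (r(x, y) && r(y, z) && !r(x, z)) return false;
    return true;
}

bool is_equivalence(Relation r, int n) {
    return is_reflexive(r, n) && is_symmetric(r, n) && is_transitive(r, n);
}
```

Calling `is_equivalence(same_mod5, 20)` should succeed, while `less_than` fails the reflexive and symmetric checks and `less_equal` fails only the symmetric one.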

5 Equivalence Classes and Disjoint Sets
Any equivalence relation R divides all the elements into disjoint sets of “equivalent” items
Let ~ be an equivalence relation. If A~B, then A and B are in the same equivalence class.
Examples:
–On a computer chip, if ~ denotes “electrically connected,” then sets of connected components form equivalence classes
–On a map, cities that have two-way roads between them form equivalence classes
–What are the equivalence classes for the relation “Modulo N” applied to all integers?

6 Equivalence Classes and Disjoint Sets
Let ~ be an equivalence relation. If A~B, then A and B are in the same equivalence class.
Examples:
–The relation “Modulo N” divides all integers into N equivalence classes (for the remainders 0, 1, …, N-1)
–Under Mod 5:
  0 ~ 5 ~ 10 ~ 15 …
  1 ~ 6 ~ 11 ~ 16 …
  2 ~ 7 ~ 12 ~ …
  3 ~ 8 ~ 13 ~ …
  4 ~ 9 ~ 14 ~ …
–(5 equivalence classes denoting remainders 0 through 4 when divided by 5)
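
As a small illustration of the Mod 5 classes, the sketch below names each class by its remainder. `ClassOf` and `Equivalent` are hypothetical helper names, not part of the slides. One assumption is made explicit: C's `%` can return a negative remainder for a negative operand, so the result is shifted into 0..n-1.

```c
#include <assert.h>

/* Name of the equivalence class of x under "Modulo n": the remainder 0..n-1.
   Assumes n > 0. C's % may yield a negative value for negative x, so fix it up. */
int ClassOf(int x, int n) {
    assert(n > 0);
    int r = x % n;
    return r < 0 ? r + n : r;
}

/* x ~ y exactly when both fall in the same class */
int Equivalent(int x, int y, int n) {
    return ClassOf(x, n) == ClassOf(y, n);
}
```

With n = 5, the elements 1, 6, 11, and 16 all map to class 1, matching the second row of the list above.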

7 Union and Find: Problem Definition
Given a set of elements and some equivalence relation ~ between them, we want to figure out the equivalence classes
Given an element, we want to find the equivalence class it belongs to
–E.g. Under mod 5, 13 belongs to the equivalence class of 3
–E.g. For the map example, want to find the equivalence class of Eskisehir (all the cities it is connected to)
Given a new element, we want to add it to an equivalence class (union)
–E.g. Under mod 5, since 18 ~ 13, perform a union of 18 with the equivalence class of 13
–E.g. For the map example, Ankara is connected to Eskisehir, so add Ankara to equivalence class of Eskisehir

8 Disjoint Set ADT
Stores N unique elements
Two operations:
–Find: Given an element, return the name of its equivalence class
–Union: Given the names of two equivalence classes, merge them into one class (which may have a new name or one of the two old names)
ADT divides elements into E equivalence classes, 1 ≤ E ≤ N
–Names of classes are arbitrary
–E.g. 1 through N, as long as Find returns the same name for 2 elements in the same equivalence class

9 Disjoint Set ADT Properties
Disjoint set equivalence property: every element of a DS ADT belongs to exactly one set (its equivalence class)
Dynamic equivalence property: the set of an element can change after execution of a union
Example:
–Initial classes = {1,4,8}, {2,3}, {6}, {7}, {5,9,10}
–The name of each equivalence class is underlined on the slide
–Find(4) returns 8, the name of {1,4,8}
–Union(6, 2) merges {6} and {2,3} into {2,3,6}; the classes {1,4,8}, {7}, and {5,9,10} are unchanged

10 Disjoint Set ADT: Formal Definition
Given a set U = {a1, a2, …, an}
Maintain a partition of U, a set of subsets (or equivalence classes) of U denoted by {S1, S2, …, Sk} such that:
–each pair of subsets Si and Sj are disjoint
–together, the subsets cover U
–each subset has a unique name
Union(a, b) creates a new subset which is the union of a’s subset and b’s subset
Find(a) returns the unique name for a’s subset

11 Implementation Ideas and Tradeoffs
How about an array implementation?
–N element array A: A[i] holds the class name for element i
–E.g. Assume 8 ~ 4 ~ 3: pick 3 as class name and set A[8] = A[4] = A[3] = 3
–Example sets: {0}, {1, 2, 5, 9}, {3, 4, 8}, {6, 7}
Running time for Find(i)? O(1) (just return A[i])
Running time for Union(i, j)? O(N) (must scan A and relabel every element of one class)
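
The array idea above can be sketched as follows. This is a minimal illustration under assumptions: the names `qf_make_sets`, `qf_find`, and `qf_union` are hypothetical, each element starts as the name of its own class, and a union keeps the class name of its second argument.

```c
#include <assert.h>

#define N 10
int A[N];   /* A[i] = class name of element i */

/* Every element starts in a class by itself, named after the element. */
void qf_make_sets(void) {
    for (int i = 0; i < N; i++) A[i] = i;
}

/* O(1): the class name is stored directly. */
int qf_find(int i) {
    return A[i];
}

/* O(N): every element of i's class must be relabelled with j's class name. */
void qf_union(int i, int j) {
    int from = A[i], to = A[j];
    if (from == to) return;          /* already in the same class */
    for (int k = 0; k < N; k++)
        if (A[k] == from) A[k] = to;
}
```

After `qf_union(8, 4)` followed by `qf_union(4, 3)`, the entries A[8], A[4], and A[3] all hold 3, as in the example on the slide.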

12 Implementation Ideas and Tradeoffs
How about linked lists?
–One linked list for each equivalence class
–Class name = head of list
–E.g. sets: {0}, {1, 2, 5, 9}, {3, 4, 8}, {6, 7}
Running time for Union(i, j)? E.g. Union(1, 3)
–O(1): simply append one list to the end of the other
Running time for Find(i)?
–O(N): must scan all lists in the worst case

13 Implementation Ideas and Tradeoffs
Tradeoff between Union and Find: can we do both in O(1) time?
–N-1 Unions (the maximum possible) and M Finds take O(N^2 + M) for the array implementation and O(N + MN) for the linked list implementation
–Can we do this in O(M + N) time?
             Array Implementation   Linked List Implementation
Find(i)      O(1)                   O(N)
Union(i, j)  O(N)                   O(1)

14 Towards a new Data Structure
Intuition: Finding the representative member (= class name) for an element is like the opposite of searching for a key in a given set
So, instead of trees with pointers from each node to its children, let’s use trees with a pointer from each node to its parent
Such trees are known as Up-Trees

15 Up-Tree Data Structure
Each equivalence class (or disjoint set) is an up-tree with its root as its representative member
All members of a given set are nodes in that set’s up-tree
Example up-trees: {a, d, g, b, e}, {c, f}, {h} (the parent pointer of each root is NULL)
Up-Trees are not necessarily binary

16 Implementing Up-Trees
Forest of up-trees can easily be stored in an array (call it “up”)
up[X] = parent of X; = -1 if root
Sets: {a, b, d, e}, {c, f}, {g}, {h, i}
Array up:
Index:  0(a)  1(b)  2(c)  3(d)  4(e)  5(f)  6(g)  7(h)  8(i)
up:      -1     0    -1     0     1     2    -1    -1     7

17 Example Find
Find(x): Just follow parent pointers to the root
Sets: {a, b, d, e}, {c, f}, {g}, {h, i}
–Find(e) = a (path e → b → a)
–Find(f) = c (path f → c)
–Find(g) = g (g is a root)

18 Implementing Find(x)
    #define N 9
    int up[N];

    /* Returns setid of "x" */
    int Find(int x){
        while (up[x] >= 0){
            x = up[x];
        } /* end-while */
        return x;
    } /* end-Find */
Example: Find(4) follows e → b → a and returns 0 (a)
Running time? O(maxHeight)

19 Recursive Find(x)
    #define N 9
    int up[N];

    /* Returns setid of "x" */
    int Find(int x){
        if (up[x] < 0) return x;
        return Find(up[x]);
    } /* end-Find */
Example: Find(4) calls Find(1), which calls Find(0), which returns 0 (a)

20 Example Union
Union(x, y): Just hang one root from the other!
Example: Union(c, a) hangs root a from root c
–Sets before: {a, b, d, e}, {c, f}, {g}, {h, i}
–Sets after: {a, b, d, e, c, f}, {g}, {h, i}
–Array up: only up[0] (the entry for a) changes, from -1 to 2 (the index of c)

21 Implementing Union(x, y)
    #include <assert.h>
    #define N 9
    int up[N];

    /* Joins two sets. x and y must both be roots; y is hung from x */
    void Union(int x, int y){
        assert(up[x] < 0);
        assert(up[y] < 0);
        up[y] = x;
    } /* end-Union */
Running time? O(1)

22 MakeSets(): Creating initial sets
    #define N 9
    int up[N];

    /* Make initial sets */
    void MakeSets(){
        int i;
        for (i=0; i<N; i++){
            up[i] = -1;
        } /* end-for */
    } /* end-MakeSets */
Result: nine single-node up-trees {a}, {b}, {c}, {d}, {e}, {f}, {g}, {h}, {i}, each with a NULL parent

23 Detailed Example
Initial sets: {a}, {b}, {c}, {d}, {e}, {f}, {g}, {h}, {i}
Union(b, e): e is hung from b
Sets: {a}, {b, e}, {c}, {d}, {f}, {g}, {h}, {i}

24 Detailed Example
Union(a, d): d is hung from a
Sets: {a, d}, {b, e}, {c}, {f}, {g}, {h}, {i}

25 Detailed Example
Union(a, b): b is hung from a
Sets: {a, d, b, e}, {c}, {f}, {g}, {h}, {i}

26 Detailed Example
Union(h, i): i is hung from h
Sets: {a, d, b, e}, {c}, {f}, {g}, {h, i}

27 Detailed Example
Union(c, f): f is hung from c
Sets: {a, d, b, e}, {c, f}, {g}, {h, i}

28 Detailed Example
Union(c, a): a is hung from c
Sets: {a, d, b, e, c, f}, {g}, {h, i}
Q: Can we do a better job on this union for faster finds in the future?
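
The sequence of unions in the detailed example can be replayed with the simple (unweighted) Union and iterative Find from the earlier slides. This is a sketch under assumptions: the `NODE_*` enum names (mapping a..i to 0..8), `Depth`, and `BuildExample` are hypothetical helpers added here for illustration.

```c
#include <assert.h>

#define N 9
enum { NODE_A, NODE_B, NODE_C, NODE_D, NODE_E,
       NODE_F, NODE_G, NODE_H, NODE_I };   /* a..i mapped to 0..8 */

int up[N];   /* up[x] = parent of x, or -1 if x is a root */

void MakeSets(void) {
    for (int k = 0; k < N; k++) up[k] = -1;
}

int Find(int x) {
    while (up[x] >= 0) x = up[x];   /* follow parent pointers to the root */
    return x;
}

/* Simple union: both arguments must be roots; y is hung from x. */
void Union(int x, int y) {
    assert(up[x] < 0);
    assert(up[y] < 0);
    up[y] = x;
}

/* Number of parent links between x and its root. */
int Depth(int x) {
    int steps = 0;
    while (up[x] >= 0) { x = up[x]; steps++; }
    return steps;
}

/* Replays the six unions of the detailed example, in order. */
void BuildExample(void) {
    MakeSets();
    Union(NODE_B, NODE_E);
    Union(NODE_A, NODE_D);
    Union(NODE_A, NODE_B);
    Union(NODE_H, NODE_I);
    Union(NODE_C, NODE_F);
    Union(NODE_C, NODE_A);
}
```

After `BuildExample()`, a Find on e has to follow three links (e → b → a → c), which is the cost the question on the last slide of the example is pointing at.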

29 Implementation of Find & Union
    #include <assert.h>
    #define N 9
    int up[N];

    /* Joins two sets */
    void Union(int x, int y){
        assert(up[x] < 0);
        assert(up[y] < 0);
        up[y] = x;
    } /* end-Union */
Running time: O(1)

    /* Returns setid of "x" */
    int Find(int x){
        if (up[x] < 0) return x;
        return Find(up[x]);
    } /* end-Find */
Running time: O(MaxHeight)
Height depends on previous unions
–Best case: unions 1-2, 1-4, 1-5, … hang every element directly from 1, so Find is O(1)
–Worst case: unions 2-1, 3-2, 4-3, … build a single chain, so Find is O(N)
Q: Can we do better?

30 Let’s look back at our example
Union(c, a)
–Sets before: {a, d, b, e}, {c, f}, {g}, {h, i}
–Sets after: {a, d, b, e, c, f}, {g}, {h, i}
Q: Can we do a better job on this union for faster finds in the future?
How can we make the new tree shallow?

31 Speeding up Find: Union-by-Size
Idea: In Union, always make the root of the larger tree the new root (union-by-size)
Initial sets: {a, d, b, e}, {c, f}, {g}, {h, i}
After Union(c, a): {a, d, b, e, c, f}, {g}, {h, i}, with a’s tree hung from c
After Union(c, a) with union-by-size: {a, d, b, e, c, f}, {g}, {h, i}, with c’s smaller tree hung from a

32 Trick for Storing Size Information
Instead of storing -1 in root, store up-tree size as negative value in root node
Sets: {a, b, d, e}, {c, f}, {g}, {h, i}
Array up:
Index:  0(a)  1(b)  2(c)  3(d)  4(e)  5(f)  6(g)  7(h)  8(i)
up:      -4     0    -2     0     1     2    -1    -2     7

33 Implementing Union-by-Size
    #include <assert.h>
    #define N 9
    int up[N];

    /* Joins two sets. Assumes x & y are roots */
    void Union(int x, int y){
        assert(up[x] < 0);
        assert(up[y] < 0);
        if (up[x] < up[y]){ // x is bigger (more negative). Join y to x
            up[x] += up[y];
            up[y] = x;
        } else { // y is bigger or the same size. Join x to y
            up[y] += up[x];
            up[x] = y;
        } /* end-else */
    } /* end-Union */
Running time? O(1)
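
Since Union above assumes that both arguments are roots, callers that only know two arbitrary elements need to run Find first. The sketch below shows one way to wrap that up. `UnionElements` and `SetSize` are hypothetical helper names, not part of the slides, and returning 1 or 0 to report whether a merge happened is a design choice made here.

```c
#include <assert.h>

#define N 9
int up[N];   /* parent index, or -(size of tree) if root */

void MakeSets(void) {
    for (int k = 0; k < N; k++) up[k] = -1;   /* N trees of size 1 */
}

int Find(int x) {
    while (up[x] >= 0) x = up[x];
    return x;
}

/* Union-by-size on two roots, as on the slide. */
void Union(int x, int y) {
    assert(up[x] < 0);
    assert(up[y] < 0);
    if (up[x] < up[y]) {   /* x is bigger: join y to x */
        up[x] += up[y];
        up[y] = x;
    } else {               /* y is bigger or equal: join x to y */
        up[y] += up[x];
        up[x] = y;
    }
}

/* Hypothetical wrapper: merges the sets containing p and q.
   Returns 1 if a merge happened, 0 if they were already together. */
int UnionElements(int p, int q) {
    int rp = Find(p), rq = Find(q);
    if (rp == rq) return 0;   /* also avoids calling Union(r, r) */
    Union(rp, rq);
    return 1;
}

/* Size of the set containing x, read from its root. */
int SetSize(int x) {
    return -up[Find(x)];
}
```

The early return matters: calling Union with the same root twice would double the stored size and make the root its own parent.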

34 Running Time for Find with Union-by-Size
Finds are O(MaxHeight) for a forest of up-trees containing N nodes
Theorem: The number of nodes in an up-tree of height h built using union-by-size is $\ge 2^h$
–Pick the up-tree with MaxHeight
–Then $2^{\text{MaxHeight}} \le N$, so MaxHeight ≤ log N (base 2)
–Find takes O(log N)
Proof by induction
–Base case: h = 0, tree has $2^0 = 1$ node
–Induction hypothesis: assume true for all h < h′
–Induction step: a tree of height h′ is formed by a union in which the tree hung from the new root has height h′-1. That tree has $\ge 2^{h'-1}$ nodes by the induction hypothesis, and union-by-size makes the larger tree the new root, so the other tree has at least as many nodes
–So, total nodes $\ge 2^{h'-1} + 2^{h'-1} = 2^{h'}$
–Therefore, true for all h
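
The bound in the theorem is tight, which a small experiment can show. The sketch below assumes N = 16 and always merges trees of equal size in rounds. `MaxHeight` and `BuildWorstCase` are hypothetical helpers, and the Union tie-break (equal sizes join x to y) follows the union-by-size code from the earlier slide.

```c
#include <assert.h>

#define N 16
int up[N];   /* parent index, or -(size of tree) if root */

void MakeSets(void) {
    for (int k = 0; k < N; k++) up[k] = -1;
}

int Find(int x) {
    while (up[x] >= 0) x = up[x];   /* no path compression here */
    return x;
}

void Union(int x, int y) {          /* union-by-size on two roots */
    assert(up[x] < 0);
    assert(up[y] < 0);
    if (up[x] < up[y]) { up[x] += up[y]; up[y] = x; }
    else               { up[y] += up[x]; up[x] = y; }
}

/* Largest number of parent links from any node to its root. */
int MaxHeight(void) {
    int best = 0;
    for (int k = 0; k < N; k++) {
        int steps = 0;
        for (int x = k; up[x] >= 0; x = up[x]) steps++;
        if (steps > best) best = steps;
    }
    return best;
}

/* Merge equal-sized trees in rounds: sizes 1+1, 2+2, 4+4, 8+8.
   Each round adds one level, so the final height is log2(N). */
void BuildWorstCase(void) {
    MakeSets();
    for (int step = 1; step < N; step *= 2)
        for (int k = 0; k + step < N; k += 2 * step)
            Union(Find(k), Find(k + step));
}
```

With 16 elements the final tree has height 4, so $2^4 = 16 \le N$ holds with equality.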

35 Union-by-Height
Textbook describes alternative strategy of Union-by-height
–Keep track of height of each up-tree in the root nodes
–Union makes root of up-tree with greater height the new root
Same results and similar implementation as Union-by-Size
–Find is O(log N) and Union is O(1)
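
A possible shape for union-by-height is sketched below. The encoding is an assumption made for this illustration: a root stores -(height + 1), so a single-node tree still holds -1 and a more negative value means a taller tree. `UnionByHeight` is a hypothetical name, and the textbook's own code may differ in detail.

```c
#include <assert.h>

#define N 9
int up[N];   /* parent index, or -(height + 1) if root (assumed encoding) */

void MakeSets(void) {
    for (int k = 0; k < N; k++) up[k] = -1;   /* height 0 everywhere */
}

int Find(int x) {
    while (up[x] >= 0) x = up[x];
    return x;
}

/* Joins two sets. Assumes x and y are roots. */
void UnionByHeight(int x, int y) {
    assert(up[x] < 0);
    assert(up[y] < 0);
    if (x == y) return;            /* same tree: nothing to do */
    if (up[y] < up[x]) {           /* y is taller: hang x under y */
        up[x] = y;
    } else {
        if (up[x] == up[y])        /* equal heights: result is one taller */
            up[x]--;
        up[y] = x;                 /* hang y under x */
    }
}
```

Only a union of two trees of equal height makes the result taller, which is why the height stays at most log N.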

36 Can we make Find go faster?
Can we make Find(g) do something so that future Find(g) calls will run faster?
Right now, M Find(g) calls run in total O(M*logN) time
–Can we reduce this to O(M)?
Figure: up-trees for the sets {a, b, d, e, g}, {h, i}, and {c, f}
Idea: Make Find have side-effects so that future Finds will run faster.

37 Introducing Path Compression
Path Compression: Point everything along path of a Find to root
Reduces height of entire access path to 1
–Finds get faster!
Figure: the up-trees for {a, b, d, e, g}, {h, i}, and {c, f} before and after Find(g)

38 Another Path Compression Example
Figure: the up-tree for {a, b, d, h, e, i, g} before and after Find(g); the up-tree for {c, f} is unchanged

39 Implementing Path Compression
Path Compression: Point everything along path of a Find to root
Reduces height of entire access path to 1
–Finds get faster!
    #define N …
    int up[N];

    /* Returns setid of "x" */
    int Find(int x){
        if (up[x] < 0) return x;
        int root = Find(up[x]);
        up[x] = root; /* Point to the root */
        return root;
    } /* end-Find */
Running time: O(MaxHeight)
But, what happens to the tree height over time? It gets smaller
What’s the total running time if we do M Finds? Turns out this is equal to O(M*InvAckermann(M, N))
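
The recursive Find above uses one stack frame per node on the path. An iterative two-pass variant is sketched below as an alternative, not as the version from the slides. `FindCompress` is a hypothetical name, and N = 9 is assumed only so the sketch compiles on its own.

```c
#include <assert.h>

#define N 9
int up[N];   /* parent index, or a negative value if root */

void MakeSets(void) {
    for (int k = 0; k < N; k++) up[k] = -1;
}

/* Find with path compression, without recursion. */
int FindCompress(int x) {
    int root = x;
    while (up[root] >= 0)        /* pass 1: locate the root */
        root = up[root];
    while (up[x] >= 0) {         /* pass 2: repoint each node on the path */
        int next = up[x];
        up[x] = root;
        x = next;
    }
    return root;
}
```

On a chain 0 → 1 → 2 → 3, a single `FindCompress(0)` leaves 0, 1, and 2 all pointing directly at 3.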

40 Running time of Find with Path Compression
What’s the total running time if we do M Finds (with Union-by-Size/Height)?
Turns out this is equal to O(M*InvAckermann(M, N))
InvAckermann(M, N) <= 4 for all practical values of M and N
So, total running time of M Finds is at most about 4*M = O(M)
–Meaning that the amortized running time of Find with path compression is O(1) for all practical purposes

41 Summary of Disjoint Set ADT
The Disjoint Set ADT allows us to represent objects that fall into different equivalence classes or sets
Two main operations: Union of two classes and Find class name for a given element
Up-Tree data structure allows efficient array implementation
–Unions take O(1) worst case time, Finds can take O(N)
–Union-by-Size (or by-Height) reduces worst case time for Find to O(log N)
–If we use both Union-by-Size/Height & Path Compression, any sequence of M Union/Find operations results in O(1) amortized time per operation (for all practical purposes)

42 Applications of Disjoint Set ADT
Disjoint sets can be used to represent:
–Cities on a map (disjoint sets of connected cities)
–Electrical components on chip
–Computers connected in a network
–Groups of people related to each other by blood
–Textbook example: Maze generation using Unions/Finds:
  Start with walls everywhere and each cell in a set by itself
  Knock down walls randomly and Union cells that become connected
  Use Find to find out if two cells are already connected
  Terminate when starting and ending cell are in same set i.e. connected (or when all cells are in same set)
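
The maze idea can be sketched roughly as follows, under several assumptions: a 4x4 grid, the variant that continues until all cells are in the same set, and a shuffled list of interior walls visited once each instead of repeatedly picking a random wall (a swapped-in technique that guarantees termination). `GenerateMaze`, `wallA`, `wallB`, and `removed` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

#define ROWS 4
#define COLS 4
#define CELLS (ROWS * COLS)
#define MAX_WALLS (ROWS * (COLS - 1) + COLS * (ROWS - 1))

int up[CELLS];                            /* parent, or -(size) if root */
int wallA[MAX_WALLS], wallB[MAX_WALLS];   /* the two cells each wall separates */
int removed[MAX_WALLS];                   /* 1 if the wall was knocked down */

int Find(int x) {                         /* with path compression */
    if (up[x] < 0) return x;
    return up[x] = Find(up[x]);
}

void Union(int x, int y) {                /* union-by-size on two roots */
    if (up[x] < up[y]) { up[x] += up[y]; up[y] = x; }
    else               { up[y] += up[x]; up[x] = y; }
}

/* Builds a maze and returns the number of walls knocked down. */
int GenerateMaze(unsigned seed) {
    int n = 0;
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++) {
            int cell = r * COLS + c;
            if (c + 1 < COLS) { wallA[n] = cell; wallB[n] = cell + 1;    n++; }
            if (r + 1 < ROWS) { wallA[n] = cell; wallB[n] = cell + COLS; n++; }
        }
    for (int k = 0; k < CELLS; k++) up[k] = -1;   /* each cell alone */

    srand(seed);
    for (int k = n - 1; k > 0; k--) {             /* Fisher-Yates shuffle */
        int j = rand() % (k + 1);
        int t = wallA[k]; wallA[k] = wallA[j]; wallA[j] = t;
        t = wallB[k]; wallB[k] = wallB[j]; wallB[j] = t;
    }

    int knocked = 0;
    for (int k = 0; k < n; k++) {
        int ra = Find(wallA[k]), rb = Find(wallB[k]);
        removed[k] = (ra != rb);                  /* keep walls between connected cells */
        if (ra != rb) { Union(ra, rb); knocked++; }
    }
    return knocked;
}
```

Because a wall is removed only when it joins two different sets, exactly CELLS - 1 walls come down and there is a single path between any two cells, including the start and the end.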

43 Disjoint Set ADT Declaration & Operations
    class DisjointSet {
    private:
        int *up;   // Up links array
        int N;     // Number of elements (= initial number of sets)
    public:
        DisjointSet(int n);              // Creates N sets
        ~DisjointSet(){ delete[] up; }   // up is allocated with new[]
        int Find(int x);
        void Union(int x, int y);
    };

44 Operations: DisjointSet, Find
    /* Create N sets */
    DisjointSet::DisjointSet(int n){
        int i;
        N = n;
        up = new int[N];
        for (i=0; i<N; i++) up[i] = -1;
    } //end-DisjointSet

    /* Returns setid of "x" */
    int DisjointSet::Find(int x){
        if (up[x] < 0) return x;
        int root = Find(up[x]);
        up[x] = root; /* Point to the root */
        return root;
    } /* end-Find */

45 Operations: Union (by size)
    #include <cassert>

    /* Joins two sets. Assumes x & y are roots */
    void DisjointSet::Union(int x, int y){
        assert(up[x] < 0);
        assert(up[y] < 0);
        if (up[x] < up[y]){ // x is bigger. Join y to x
            up[x] += up[y];
            up[y] = x;
        } else { // y is bigger or the same size. Join x to y
            up[y] += up[x];
            up[x] = y;
        } /* end-else */
    } /* end-Union */