CSE 326: Union Find
Richard Anderson
Text: Chapter 8
7-29-2005

Union Find Problem
  Fixed collection of items
  Disjoint subsets
  Find(x): which subset is x in?
  Union(x, y): merge the sets containing x and y
  What run times do you expect for implementations of Union and Find?

Applications of Union Find
  Building connected components

    foreach edge e = (v1, v2) {
        if (Find(v1) != Find(v2))
            Union(v1, v2);
    }
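
The edge loop above can be sketched in Python as follows. This is a minimal illustration, not the course's reference code: the function name `connected_components`, the 0-based vertex numbering, and the sample edge list are all assumptions made for the example, and the union-find used is the plain unweighted version.

```python
def connected_components(n, edges):
    """Label vertices 0..n-1 by component using a bare-bones union-find."""
    parent = list(range(n))          # every vertex starts as its own root

    def find(x):
        while parent[x] != x:        # follow parent pointers up to the root
            x = parent[x]
        return x

    for v1, v2 in edges:
        r1, r2 = find(v1), find(v2)
        if r1 != r2:                 # endpoints in different components: merge
            parent[r1] = r2

    return [find(v) for v in range(n)]


# Hypothetical graph: a path 0-1-2, an edge 3-4, and an isolated vertex 5.
labels = connected_components(6, [(0, 1), (1, 2), (3, 4)])
num_components = len(set(labels))
```

Two vertices are in the same component exactly when their labels are equal, so the number of distinct labels is the number of components.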

Equivalence Classes
  Fixed collection of items partitioned into equivalence classes
  Equivalence classes identified by one of their members
  Example: {1, 3, 5}, {2, 8}, {4}, {6, 9, 10, 11}, {7}

Draw an in-tree representation of the following equivalence relation:
  {1, 11}, {2, 4, 6, 8, 10, 12}, {3, 5, 7, 9}, {13}, {14}, {15}

Array representation
  Store parents in the array P (a root is its own parent)

    x:     1  2  3  4  5  6  7  8  9 10 11
    P[x]:  3  8  3  4  3  9  7  8  9  9  6

Basic Find operation
  Write the code for Find(x) using the array P[ ]
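
One possible answer to the exercise, sketched in Python. It assumes the eleven values on the array-representation slide are P[1] through P[11] and that a root is its own parent; under that reading the array encodes the classes {1, 3, 5}, {2, 8}, {4}, {6, 9, 10, 11}, {7} from the earlier slide. The padding at index 0 and the `classes` dictionary are additions for this example only.

```python
# Parent array read as P[1..11]; index 0 is unused padding (an assumption,
# so that list indices match the item numbers).
P = [None, 3, 8, 3, 4, 3, 9, 7, 8, 9, 9, 6]


def find(x):
    """Return the root (representative) of the set containing x."""
    while P[x] != x:      # a root is its own parent
        x = P[x]
    return x


# Group items by representative to recover the equivalence classes.
classes = {}
for x in range(1, len(P)):
    classes.setdefault(find(x), []).append(x)
```

The cost of `find(x)` is the depth of x in its tree, which is why the later slides focus on keeping trees shallow.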

Union

    Union(x, y){
        x1 = Find(x);
        y1 = Find(y);
        P[x1] = y1;
    }

Unions create long chains
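
A small sketch of how the basic Union can build a chain. The helper names (`make_sets`, `depth`) and the particular sequence Union(0, 1), Union(1, 2), ... are assumptions chosen to trigger the bad case; other union orders may produce much shallower trees.

```python
def make_sets(n):
    return list(range(n))            # P[x] == x: every item starts as a root


def find(P, x):
    while P[x] != x:
        x = P[x]
    return x


def union(P, x, y):
    x1, y1 = find(P, x), find(P, y)
    P[x1] = y1                       # root of x's tree points to root of y's tree


def depth(P, x):
    """Number of parent pointers between x and its root."""
    d = 0
    while P[x] != x:
        x = P[x]
        d += 1
    return d


n = 8
P = make_sets(n)
for i in range(n - 1):
    union(P, i, i + 1)               # each union hangs the old root under a new one
chain_depth = depth(P, 0)
```

With this sequence every union puts the whole existing tree under a fresh singleton, so item 0 ends up n - 1 links from the root.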

Weighted Union
  W[x]: number of descendants of x, counting x itself (written wt(x))
  Weighted union: the smaller tree points to the bigger tree

    Union(x, y){
        x1 = Find(x); y1 = Find(y);
        if (x1 == y1) return;        // already in the same set; avoids double-counting W
        if (W[x1] > W[y1]){
            P[y1] = x1;
            W[x1] = W[x1] + W[y1];
        } else {
            P[x1] = y1;
            W[y1] = W[x1] + W[y1];
        }
    }
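
A Python sketch of weighted union, following the slide's rule that the root of the lighter tree points to the root of the heavier one. The class name `WeightedUnionFind`, the `depth` helper, and the early return when both items already share a root are assumptions added for this example.

```python
class WeightedUnionFind:
    def __init__(self, n):
        self.P = list(range(n))      # parent pointers; roots point to themselves
        self.W = [1] * n             # weight of each tree, counting the root itself

    def find(self, x):
        while self.P[x] != x:
            x = self.P[x]
        return x

    def union(self, x, y):
        x1, y1 = self.find(x), self.find(y)
        if x1 == y1:
            return                   # same set already (guard added in this sketch)
        if self.W[x1] > self.W[y1]:
            self.P[y1] = x1          # y's tree is lighter: hang it under x1
            self.W[x1] += self.W[y1]
        else:
            self.P[x1] = y1          # x's tree is lighter or equal: hang it under y1
            self.W[y1] += self.W[x1]

    def depth(self, x):
        d = 0
        while self.P[x] != x:
            x = self.P[x]
            d += 1
        return d


# Same union sequence that produces a chain with the unweighted Union.
n = 8
uf = WeightedUnionFind(n)
for i in range(n - 1):
    uf.union(i, i + 1)
max_depth = max(uf.depth(x) for x in range(n))
```

On this sequence each new singleton is attached directly under the existing root, so the tree stays flat instead of growing into a chain.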

Binomial Trees $B_0, B_1, \ldots, B_k$
  $B_0$: a single node
  $B_k = \mathrm{Union}(B_{k-1}, B_{k-1})$
  [Figure: $B_0$, and $B_k$ built from two copies of $B_{k-1}$]

Binomial Trees
  Draw $B_4$
  What is the height of $B_k$?
  What is the weight of $B_k$?
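
One way to explore these questions is to build the trees and measure them. The sketch below assumes the recursive definition from the previous slide (B_0 is a single node, B_k joins two copies of B_{k-1} by pointing one root at the other); the parent-list representation and the function names are choices made for this example.

```python
def binomial_tree(k):
    """Return (parent, root) for B_k on nodes 0 .. 2**k - 1."""
    if k == 0:
        return [0], 0                        # B_0: a single node, its own parent
    left, r1 = binomial_tree(k - 1)
    size = len(left)
    right = [p + size for p in left]         # second copy of B_{k-1}, relabelled
    r2 = r1 + size
    parent = left + right
    parent[r1] = r2                          # Union: one root points to the other
    return parent, r2


def height(parent):
    """Longest chain of parent pointers from any node to its root."""
    best = 0
    for x in range(len(parent)):
        d = 0
        while parent[x] != x:
            x = parent[x]
            d += 1
        best = max(best, d)
    return best


# (k, height, weight) for the first few binomial trees.
measurements = [(k, height(binomial_tree(k)[0]), len(binomial_tree(k)[0]))
                for k in range(6)]
```

Each union of two equal copies adds one level and doubles the node count, which is the pattern the measurements show.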

Theorem: Weight balanced unions guarantee O(log n) depth
  Show $\mathrm{wt}(T) \ge 2^{\mathrm{ht}(T)}$
  Since $\mathrm{wt}(T) \le n$, this gives $\mathrm{ht}(T) \le \log_2 n$
  Show this is an invariant maintained by Union and Find operations
    Base case: true for a singleton node
    Trivial case: invariant for Find operations
    Interesting case: $T = \mathrm{Union}(T_1, T_2)$
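
  Given: $\mathrm{wt}(T_1) \ge 2^{\mathrm{ht}(T_1)}$, $\mathrm{wt}(T_2) \ge 2^{\mathrm{ht}(T_2)}$
  Show: $\mathrm{wt}(T) \ge 2^{\mathrm{ht}(T)}$
  Assume $\mathrm{wt}(T_2) \ge \mathrm{wt}(T_1)$, so the root of $T_1$ becomes a child of the root of $T_2$
  Idea: consider cases
    $\mathrm{ht}(T_1) \ge \mathrm{ht}(T_2)$
    $\mathrm{ht}(T_1) < \mathrm{ht}(T_2)$
  [Figure: $T_1$ attached beneath the root of $T_2$ to form $T$]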
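
Case 1: $\mathrm{ht}(T_1) \ge \mathrm{ht}(T_2)$
  Show $2^{\mathrm{ht}(T)} \le \mathrm{wt}(T)$
  Hint: what is ht(T)?
  Recall:
    $\mathrm{wt}(T_2) \ge \mathrm{wt}(T_1)$
    $2^{\mathrm{ht}(T_1)} \le \mathrm{wt}(T_1)$
    $2^{\mathrm{ht}(T_2)} \le \mathrm{wt}(T_2)$
  $2^{\mathrm{ht}(T)} = 2^{\mathrm{ht}(T_1)+1} \le 2\,\mathrm{wt}(T_1) \le \mathrm{wt}(T)$
  [Figure: $T_1$, $T_2$, and the merged tree $T$]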
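
Case 2: $\mathrm{ht}(T_1) < \mathrm{ht}(T_2)$
  Show $2^{\mathrm{ht}(T)} \le \mathrm{wt}(T)$
  Hint: what is ht(T)?
  Recall:
    $\mathrm{wt}(T_2) \ge \mathrm{wt}(T_1)$
    $2^{\mathrm{ht}(T_1)} \le \mathrm{wt}(T_1)$
    $2^{\mathrm{ht}(T_2)} \le \mathrm{wt}(T_2)$
  $2^{\mathrm{ht}(T)} = 2^{\mathrm{ht}(T_2)} \le \mathrm{wt}(T_2) \le \mathrm{wt}(T)$
  [Figure: $T_1$, $T_2$, and the merged tree $T$]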
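
The invariant from the two cases can also be checked empirically. The sketch below performs random weighted unions and verifies that wt(T) >= 2^ht(T) holds after every merge. The function name `check_invariant`, the height array `H` (kept only for the check, not needed by the algorithm), and the random workload are all assumptions of this example; passing the check illustrates the theorem but is not a proof.

```python
import random


def check_invariant(n, num_unions, seed=0):
    """Return True if wt(T) >= 2**ht(T) holds after every weighted union."""
    rng = random.Random(seed)
    P = list(range(n))
    W = [1] * n                      # weight of the tree rooted at each root
    H = [0] * n                      # height of the tree rooted at each root

    def find(x):
        while P[x] != x:
            x = P[x]
        return x

    for _ in range(num_unions):
        x1, y1 = find(rng.randrange(n)), find(rng.randrange(n))
        if x1 == y1:
            continue
        if W[x1] > W[y1]:
            x1, y1 = y1, x1          # x1 is now the lighter root (T1), y1 the heavier (T2)
        P[x1] = y1
        W[y1] += W[x1]
        H[y1] = max(H[y1], H[x1] + 1)    # Case 2 keeps ht(T2); Case 1 gives ht(T1) + 1
        if W[y1] < 2 ** H[y1]:
            return False
    return True


holds = check_invariant(1000, 5000)
```

The height update mirrors the case split in the proof: the merged tree is either as tall as T2 or one level taller than T1.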

Path Compression
  Long chains are only expensive if they are traversed multiple times
  Idea: compress chains when they are traversed
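
Path Compression Algorithms

  Full compression (two passes):

    Find(x) {
        y = x;
        while (P[x] != x)          // first pass: x climbs to the root
            x = P[x];
        while (P[y] != x){         // second pass: point each node on the path at the root
            t = P[y];              // save the old parent before overwriting it
            P[y] = x;
            y = t;
        }
        return x;
    }

  Path halving (one pass):

    Find(x){
        while (P[x] != x)
            x = P[x] = P[P[x]];    // point x at its grandparent, then step there
        return x;
    }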
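
Both variants can be sketched in Python and tried on a chain. The function names `find_compress` and `find_halve` and the 8-node chain are assumptions for this example. The one-pass version is written as two statements because Python's chained assignment binds targets left to right, so a literal `x = P[x] = P[P[x]]` would not behave like the pseudocode.

```python
def find_compress(P, x):
    """Two-pass find: locate the root, then point every node on the path at it."""
    root = x
    while P[root] != root:
        root = P[root]
    while P[x] != root:
        nxt = P[x]                   # remember the old parent before overwriting
        P[x] = root
        x = nxt
    return root


def find_halve(P, x):
    """One-pass find: each visited node is re-pointed at its grandparent."""
    while P[x] != x:
        P[x] = P[P[x]]
        x = P[x]
    return x


chain = [1, 2, 3, 4, 5, 6, 7, 7]     # 0 -> 1 -> ... -> 7, with 7 as the root

compressed = list(chain)
root_a = find_compress(compressed, 0)

halved = list(chain)
root_b = find_halve(halved, 0)
```

After the two-pass find every node on the path points straight at the root; after the one-pass find the path from node 0 is roughly half as long, and repeated finds keep shortening it.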

Amortized Complexity
  Cost of n intermixed Union-Find operations

    Basic algorithm                      $O(n^2)$
    Weighted union                       $O(n \log n)$
    Path compression                     $O(n \log n)$
    Weighted union + path compression    $O(\alpha(n)\, n)$

  ($\alpha(n)$ is the inverse Ackermann function)
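
To get a feel for these bounds, the sketch below counts parent-pointer hops for the four variants on one hand-picked workload: n - 1 unions that form a chain under the basic scheme, followed by n finds of the deepest item. The function `count_steps`, its flags, and the workload are assumptions of this example. A single workload only illustrates the gap between the variants; it does not establish the asymptotic bounds.

```python
def count_steps(n, weighted, compress):
    """Count parent-pointer hops made by find on a chain-building workload."""
    P = list(range(n))
    W = [1] * n
    steps = 0

    def find(x):
        nonlocal steps
        root = x
        while P[root] != root:       # climb to the root, counting each hop
            root = P[root]
            steps += 1
        if compress:
            while P[x] != root:      # second pass: re-point the path at the root
                nxt = P[x]
                P[x] = root
                x = nxt
        return root

    def union(x, y):
        x1, y1 = find(x), find(y)
        if x1 == y1:
            return
        if weighted and W[x1] > W[y1]:
            x1, y1 = y1, x1          # make x1 the lighter root
        P[x1] = y1
        W[y1] += W[x1]

    for i in range(n - 1):           # unions that build a chain in the basic scheme
        union(i, i + 1)
    for _ in range(n):               # then repeatedly look up item 0
        find(0)
    return steps


n = 1000
basic = count_steps(n, weighted=False, compress=False)
weighted_only = count_steps(n, weighted=True, compress=False)
compress_only = count_steps(n, weighted=False, compress=True)
both = count_steps(n, weighted=True, compress=True)
```

On this workload the basic scheme walks the full chain on every find, n(n - 1) hops in total, while each of the other three variants needs only a small multiple of n.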