Data Structures: Disjoint Sets
Disjoint Sets: the Union-Find tree
- Represents a collection of disjoint sets
- Joins sets together (union)
- Finds which set an element belongs to (find-set)
- Quickly tests whether two items are in the same set
Note:
- Not the same as a vector of sets, which is much slower to search.
- Does not support general set operations, mainly just union and find-set.
- That is sufficient for many applications, and the structure can be augmented or adjusted to perform other operations.
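To make the note about a vector of sets concrete, here is a minimal sketch of that slower alternative, using a plain Python list of sets. The names `find_naive` and `union_naive` are hypothetical and exist only for this illustration. Every lookup scans the sets one by one, and every merge copies elements.

```python
def find_naive(sets, x):
    """Return the index of the set containing x by scanning every set."""
    for i, s in enumerate(sets):
        if x in s:
            return i
    raise KeyError(x)


def union_naive(sets, a, b):
    """Merge the sets containing a and b by copying one into the other."""
    i, j = find_naive(sets, a), find_naive(sets, b)
    if i != j:
        sets[i] |= sets[j]   # copies every element of sets[j]
        del sets[j]          # shifts the remaining list entries


sets = [{x} for x in range(1, 6)]   # {1}, {2}, {3}, {4}, {5}
union_naive(sets, 1, 2)
union_naive(sets, 4, 5)
same = find_naive(sets, 1) == find_naive(sets, 2)
```

The union-find tree avoids both the scan and the copy by storing only a parent link per item.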
Implementing disjoint set
- Each set is a tree; store a forest of trees.
- A tree is identified by the ID at its root.
- Each node knows its parent; follow parent links to reach the root.
- Storage:
  - array of values
  - array of parent indices (initially, each item is its own parent)
  - array of tree heights (actually an upper bound on height, not the actual height; starts at 0)
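The storage layout above might be sketched as follows. This is an illustration under stated assumptions, not the book's implementation: items are the integers 0..n-1, so the separate array of values is omitted, and the names `make_sets`, `find_root`, and `rank` are chosen here for the example.

```python
def make_sets(n):
    """Create n singleton sets labelled 0..n-1."""
    parent = list(range(n))  # each item starts as its own parent (a root)
    rank = [0] * n           # upper bound on tree height, starts at 0
    return parent, rank


def find_root(parent, x):
    """Follow parent links until reaching a node that is its own parent."""
    while parent[x] != x:
        x = parent[x]
    return x


parent, rank = make_sets(5)
parent[3] = 1   # hand-link 3 under 1, and 1 under 0, just to show the walk
parent[1] = 0
root = find_root(parent, 3)   # walks 3 -> 1 -> 0
```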
Disjoint Set
[Figure: example forest over nodes 1 to 11, with a table of Node, Parent, and Size rows]
Union operation
- Merge two trees: set the root of one tree to have the root of the other tree as its parent.
- Attach the shorter tree to the taller one.
- If the trees are the same height, pick either, and increase the stored height of the new root by 1.
- Called "union-by-rank" (prevents trees from getting too tall).
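A minimal sketch of union-by-rank as described above, assuming items are the integers 0..n-1 and using a plain root walk without path compression so that the block stands on its own. The function names are illustrative.

```python
def find_root(parent, x):
    """Follow parent links to the root (no path compression here)."""
    while parent[x] != x:
        x = parent[x]
    return x


def union(parent, rank, a, b):
    """Merge the trees containing a and b; return the root of the result."""
    ra, rb = find_root(parent, a), find_root(parent, b)
    if ra == rb:
        return ra                # already in the same set
    if rank[ra] < rank[rb]:
        ra, rb = rb, ra          # make ra the taller (or equal) root
    parent[rb] = ra              # attach the shorter tree under the taller
    if rank[ra] == rank[rb]:
        rank[ra] += 1            # equal heights: the new root grows by 1
    return ra


parent = list(range(4))
rank = [0] * 4
union(parent, rank, 0, 1)   # equal heights: 1 goes under 0, rank[0] becomes 1
union(parent, rank, 2, 0)   # tree {2} is shorter, so 2 goes under 0
```

After these two calls, items 0, 1, and 2 share root 0, and item 3 is still on its own.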
Union 5 & 7
[Figure sequence: forest over nodes 1 to 11, with a table of Node, Parent, and Size rows]
- Step 1: find the smallest tree
- Step 2: merge the smaller into the larger
Union 7 & 11
[Figure sequence: forest over nodes 1 to 11, with a table of Node, Parent, and Size rows]
- Both are the same size
- Either can be adjusted
Find Operation
- Path compression: shorten the path to the root whenever possible.
- Traverse the path to the root recursively.
- After determining the root, set every node on the path to point directly to that root as the recursion unwinds.
- Over time, this compresses paths so that nodes point at or near the root, which makes queries fast when there are many of them.
- Book: see section 2.4.2 for an implementation.
- Note: non-recursive versions are possible, but require keeping a stack/queue/vector of the nodes along the search path to update.
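Both variants mentioned above can be sketched as follows. This is an illustration, not the book's code from section 2.4.2. The chain 8 -> 1 -> 2 -> 5 is taken from the "Find set of 8" walkthrough; leaving every other node as its own root is an assumption made to keep the example small, and the function names are chosen here.

```python
def find_set(parent, x):
    """Recursive find with path compression."""
    if parent[x] != x:
        parent[x] = find_set(parent, parent[x])  # re-point x at the root
    return parent[x]


def find_set_iterative(parent, x):
    """Non-recursive variant: remember the path, then re-point it."""
    path = []
    while parent[x] != x:
        path.append(x)
        x = parent[x]
    for node in path:
        parent[node] = x         # x is now the root
    return x


# Index 0 is unused so that indices match the 1-based labels on the slides.
parent = list(range(12))
parent[8], parent[1], parent[2] = 1, 2, 5   # chain 8 -> 1 -> 2 -> 5

root = find_set(parent, 8)
```

After the call, nodes 8, 1, and 2 all have 5 as their direct parent, so a later find on any of them takes one step.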
Find set of 8
[Figure sequence: forest over nodes 1 to 11, with a table of Node, Parent, and Size rows]
- Visit parent of 8 = 1
- Visit parent of 1 = 2
- Visit parent of 2 = 5
- Parent of 5 is 5, so found set
- Update parent of 2 to 5 (same)
- Update parent of 1 to 5
- Update parent of 8 to 5
Notice from example
- The Size value stored at 5 is still 4 (an overestimate): path compression shortens paths but does not update the stored value.
- Everything on the path from 8 to the root now points straight to the root.
- If item 8 had any children, they would also have shorter paths.
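The overestimate can be reproduced on a smaller example. The sketch below uses four items (0..3) rather than the slide's forest, and the helper `actual_height` is a hypothetical function written only to measure the real tree height for comparison with the stored rank.

```python
def find_set(parent, x):
    """Find with path compression."""
    if parent[x] != x:
        parent[x] = find_set(parent, parent[x])
    return parent[x]


def union(parent, rank, a, b):
    """Union by rank."""
    ra, rb = find_set(parent, a), find_set(parent, b)
    if ra == rb:
        return
    if rank[ra] < rank[rb]:
        ra, rb = rb, ra
    parent[rb] = ra
    if rank[ra] == rank[rb]:
        rank[ra] += 1


def actual_height(parent):
    """Longest parent-link walk from any node to its root."""
    def depth(x):
        d = 0
        while parent[x] != x:
            x = parent[x]
            d += 1
        return d
    return max(depth(x) for x in range(len(parent)))


parent, rank = list(range(4)), [0] * 4
union(parent, rank, 0, 1)   # tree: 1 -> 0
union(parent, rank, 2, 3)   # tree: 3 -> 2
union(parent, rank, 0, 2)   # tree: 3 -> 2 -> 0, rank[0] becomes 2

before = actual_height(parent)   # stored rank and real height agree here
find_set(parent, 3)              # compression re-points 3 directly at 0
after = actual_height(parent)    # real height shrinks, rank[0] is unchanged
```

The stored rank stays a valid upper bound, which is all union-by-rank needs in order to choose which root to attach.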