A Simpler Minimum Spanning Tree Verification Algorithm Valerie King July 31,1995
Agenda Abstract Introduction Boruvka tree property Komlós algorithm for full branching tree Implementation of Komlós’s algorithm Analysis
Abstract The Problem considered here is that of determining whether a given spanning tree is a minimal spanning tree. 1984, Komlós presented an algorithm which required only a linear number of comparisons, but nonlinear overhead to determine which comparisons to make. This paper simplifies Komlós’s algorithm, and gives a linear time procedure( use table lookup functions ) for its implementation.
Introduction(1/5) Paper research history Robert E. Tarjan --Department of Computer Science Princeton University --Achievements in the design and analysis of algorithms and data structures. --Turing Award
Introduction(2/5) DRT algorithm step1.Decompose--separates the tree into a large subtree and many “microtrees” step2.Vertify each root of adjacent microtrees has shortest path between them. step3.Find the MST in each microtrees MST algorithm of KKT randomized algorithm
Introduction(3/5) János Komlós --Department of Mathematics Rutgers, The State University of New Jersey Komlós’s algorithm was the first to use a linear number of comparisons, but no linear time method of deciding which comparisons to make has been known.
Introduction(4/5) A spanning tree is a minimum spanning tree iff the weight of each nontree edge {u,v} is at least the weight of the heaviest edge in the path in the tree between u and v. Query paths --”tree path” problem of finding the heaviest edges in the paths between specified pairs of nodes. Full branching tree
Introduction(5/5) If T is a spanning tree, then there is a simple O(n) algorithm to construct a full branching tree B with no more than 2n edges. The weight of the heaviest edge in T(x,y) is the weight of the heaviest edge in B(x,y).
Boruvka Tree Property (1/12) Let T be a spanning tree with n nodes. Tree B is the tree of the components that are formed when Boruvka algorithm is applied to T. Boruvak algorithm –Initially there are n blue trees consisting of the nodes of V and no edges. –Repeat edge contraction until there is one blue tree.
Boruvka Tree Property (2/12) Tree B is constructed with node set W and edge set F by adding nodes and edges to B after each phase of Boruvka algorithm. Initial phase –For each node v V, create a leaf f(v) of B.
Boruvka Tree Property (3/12) Adding phase –Let A be the set of blue trees which are joined into one blue tree t in a phase i. –Add a new node f(t) to W and add edge to F.
Boruvka Tree Property (4/12)
Boruvka Tree Property (5/12) t1t1 t2t2 t3t3 t4t
Boruvka Tree Property (6/12) t1t1 t2t2 t3t3 t4t t t1t1 t2t2 t3t3 t4t
Boruvka Tree Property (7/12) The number of blue trees drops by a factor of at least two after each phase. Note that B is a full branching tree –i.e., it is rooted and all leaves are on the same level and each internal node has at least two children.
Boruvka Tree Property (8/12) Theorem 1 –Let T be any spanning tree and let B be the tree constructed as described above. –For any pair of nodes x and y in T, the weight of the heaviest edge in T(x,y) equals the weight of the heaviest edge in B(f(x), f(y)).
Boruvka Tree Property (9/12) First, we prove that for every edge, there is an edge such that w(e ’ )≥w(e). Let e = {a,b} and let a be the endpoint of e. Then a = f(t) for some blue tree t which contains either x or y.
Boruvka Tree Property (10/12) Let e ’ be the edge in T(x,y) with exactly one endpoint in blue tree t. Since t had the option of selecting e ’, w(e ’ )≥w(e). a f(x) r b f(y) B Blue tree t a f(x) e e’e’ e ’’
Boruvka Tree Property (11/12) Claim: Let e be a heaviest edge in T(x,y). Then there is an edge of the same weight in B(f(x), f(y)). –If e is selected by a blue tree which contains x or y then an edge in B(f(x), f(y)) is labeled with w(e). –On the contrary, assume that e is selected by a blue tree which does not contain x or y. This blue tree contained one endpoint of e and thus one intermediate node on the path from x to y.
Boruvka Tree Property (12/12) –Therefore it is incident to at least two edges on the path. Then e is the heavier of two, giving a contradiction. t x y e i
Komlos’s Algorithm for a Full Branching Tree The goal is to find the heaviest edge on the path between each pair Break up each path into two half-paths extending from the leaf up to the lowest common ancestor of the pair and find the heaviest edge in each half-path The heaviest edge in each query path is determined with one additional comparison per path
Komlos’s Algorithm for a Full Branching Tree (cont’d) Let A(v) be the set of the paths which contain v and is restricted to the interval [root, v] intersecting the query paths Starting with the root, descend level by level to determine the heaviest edge in each path in the set A(v)
Komlos’s Algorithm for a Full Branching Tree (cont’d) Let p be the parent of v, and assume that the heaviest edge in each path in the set A(p) is known We need only to compare w({v,p}) to each of these weights in order to determine the heaviest edge in each path in A(v) Note that it can be done by binary search
Data Structure (1/4) Node Labels –Node: DFS (leaves 0, 1, 2, …; internal longest all 0’s suffix) –lgn bits Edge Tags –Edge: distance(v): v’s distance from the root. i(v): the index of the rightmost 1 in v’s label. Label Property –Given the tag of any edge e and the label of any node on the path from e to any leaf, e can be located in constant time.
Full Branching Tree 0(0000)2(0010)1(0001)5(0101)3(0011)4(0100)6(0110)7(0111)8(1000)9(1001)10(1010)11(1011) 0(0000) 4(0100) 6(0110) 8(1000) 10(1010) 8(1000)
Data Structure (2/4) LCA –LCA(v) is a vector of length wordsize whose ith bit is 1 iff there is a path in A(v) whose upper endpoint is at distance i from the root. –That is, there is a query path with exactly one endpoint contained in the subtree rooted at v, such that the lowest common ancestor of its two endpoints is at distance i form the root. –A(v)’s representation
Data Structure (3/4) BigLists & SmallLists –For any node v, the ith longest path in A(v) will be denoted by A i (v). –The weight of an edge e is denoted w(e). –V is big if |A(v)| > (wordsize/tagsize); O.W v is small. –For each big node v, we keep an ordered list whose ith element is the tag of the heaviest edge in A i (v) for i = 1, …, |A(v)|. This list is a referred to as bigList(v). BigList(v) is stored in ┌|A(v) / (wordsize/tagsize)|┐
Data Structure (4/4) BigLists & SmallLists –For each small v, let a be the nearest big ancestor of v. For each such v, we keep an ordered list, smallList(v), whose ith element is either the tag of the heaviest edge e in A i (v), or if e is in the interval [a, root], then the j such that A i (v|a) = A j (a). That is, j is a pointer to the entry of bigList(a) which contains the tag for e. Once a tag appear in a smallList, all the later entries in the list are tags. (a pointer to the first tag) SmallList is stored in a single word. (logn)
The Algorithm (1/6) The goal is to generate bigList(v) or smallList(v) in time proportional to log|A(v)|, so that time spent implementing Komlos’s algorithm at each node does not exceed the worst case number of comparisons needed at each node. We show that –If v is big O(loglogn) –If v is small O(1).
The Algorithm (2/6) 1.Initially, A(root) = Ø. We proceed down the tree, from the parent p to each of the children v. 2.Depending on |A(v)|, we generate either bigList(v|p) or smallList(v|p). 3.Compare w({v, p}) to the weights of these edges, by performing binary search on the list, and insert the tag of {v, p} in the appropriate places to form bigList(v) or smallList(v). 4.Continue until the leaves are reached.
The Algorithm (3/6) select r –Ex: select(010100, ) = (1,0) selectS r –Ex: selectS((01), (t 1, t 2 )) = (t 2 ) weight r –Ex: weight(011000) = 2 index r –Ex: index(011000) = (2, 3) (2, 3 are pointers) subword1 –( )
The Algorithm (4/6) Let v be any node, p is its parent, and a its nearest big ancestor. To compute A(v|p): –If v is small If p is small create smallList(v|p) from smallList(p) in O(1) time. –ex:Let LCA(v) = ( ), LCA(p) = ( ). Let smallList(p) be (t1, t2). Then L = select( , ) = (01); and smallList(v|p) = selectS((01), (t1, t2)) = (t2) If p is big create smallList(v|p) from LCA(v) and LCA(p) in O(1) time. –ex:Let LCA(p) = ( ), LCA(v) = ( ). Then smallList(v|p) = index(select( , )) = index(10100) = (1, 3).
The Algorithm (5/6) If (v has a big ancestor) –create bigList(v|a) from bigList(a), LCA(v), and LCA(a) in time O(lglgn). Ex: Let LCA(a) = ( ), LCA(v) = ( ), and let bigList(a) be (t1, t2, t3, t4, t5). Then L = (01010); L1 = (01), L2 = (01), and L3 = (0); b1 = (t1, t2), b2 = (t3, t4), b3 = (t5). Then (t2) = selectS((01), (t1, t2)); (t4) = selectS((01), (t3, t4)); and () = selectS(t5). Thus bigList = (t2, t4). –If (p != a) create bigList(v|p) from bigList(v|a) and smallList(p) in time O(lglgn). Ex: bigList(v|a) = (t2, t4), smallList(p) = (2, t’). Then bigList(v|p) = (t2, t’). If (v doesn’t have a big ancestor) –bigList(v|p) ← smallList(p).
The Algorithm (6/6) To insert a tag in its appropriate places in the list: –Let e = {v, p}, –And let i be the rank of w(e) compared with the heaviest edges of A(v|p). –Then we insert the tag for e in position i through |A(v)|, into our list data structure for v in time O(1) if v is small, or O(loglogn) if v is big. –Ex: Let smallList(v|p) = (1, 3). Then t is the tag of {v, p}. To put t into positions 1 to j = |A(v)| = 2, we compute t * subword1 = t * = (t, t) followed some extra 0 bits, which are discarded to get smallList(v|p) = (t, t).
Analysis When v is small –The cost of the overhead for performing the insertions by binary search is a constant. When v is big –|A(v)|/(wordsize/tagsize) = Ω(logn/loglogn), the cost of the overhead is O(lglgn). –Hence the implementation cost is O(lg|A(v)|), which is proportional to the number of comparisons needed by the Komlós algorithm to find the heaviest edges in 2m half-paths of the tree in the worst case. Summed over all nodes, this comes to O(nlog((m+n)/n)) as has Komlós shown.
Thank you