1 Minimum Spanning Tree in expected linear time. Epilogue: Top-in card shuffling.

2 2 The problem
Input:
–A connected n-node m-edge graph G with edge-weight function w.
Output:
–A spanning tree T of G with minimum total weight w(T).

3 3 Illustration [Figure: an example weighted graph whose edges are labeled with weights 1, 2, and 3.]

4 4 Inventor of MST
Otakar Borůvka
–Czech scientist
–Introduced the problem
–Gave an O(m log n)-time algorithm
–The original paper was written in Czech in 1926.
–The purpose was to efficiently provide electric coverage of Bohemia.

5 5 Bohemia – the western part of the Czech Republic

6 6 The competition
Unit-cost RAM model
–O(m) Fredman-Willard (FOCS 1990)
Deterministic comparison-based algorithms
–O(m log n) Borůvka, Prim, Dijkstra, Kruskal, …
–O(m log log n) Yao (1975), Cheriton-Tarjan (1976)
–O(m β(m, n)) Fredman-Tarjan (1987)
–O(m log β(m, n)) Gabow-Galil-Spencer-Tarjan (1986)
–O(m α(m, n)) Chazelle (JACM 2000)
–O(m) Holy grail

7 7 Today’s Topic Expected O(m)-time comparison-based algorithm for MST [Karger-Klein-Tarjan, JACM 1995]

8 8 Without loss of generality We may assume that all edge weights are distinct. Why?
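One common way to justify this assumption (a sketch, not necessarily the argument intended on the slide) is to break ties by edge index: compare edges by the pair (weight, index) instead of by weight alone. The helper name `make_weights_distinct` is hypothetical.

```python
def make_weights_distinct(edges):
    """Replace each weight w by the key (w, i), where i is the edge's index.

    edges: list of (u, v, w) tuples.
    Tuples compare lexicographically, so the original order of distinct
    weights is preserved and ties are broken by index. All keys are distinct.
    """
    return [(u, v, (w, i)) for i, (u, v, w) in enumerate(edges)]
```

A comparison-based algorithm only ever compares weights, so it can run unchanged on these keys; a minimum spanning tree under the keys is also a minimum spanning tree under the original weights.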

9 9 Warm-up: Fundamental Properties of MST (a) Cut Property (b) Cycle Property (c) Uniqueness Property

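10 10 Cut Property Why? For ANY cut of G, the edge crossing the cut with minimum weight must be in ANY minimum spanning tree of G. [Figure: a cut of G, with nodes u, v, x, y.]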

11 11 Cycle Property Why? For ANY cycle C of G, the edge on C with maximum weight cannot be in ANY minimum spanning tree of G.

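12 12 Uniqueness Property If all edge weights are distinct, the minimum spanning tree of G is unique. [Figure: two spanning trees T and T* of G, with nodes u, v, x, y.]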

13 13 Borůvka’s algorithm
Repeat the following procedure until the resulting graph becomes a single node.
–For each node u, mark its lightest incident edge.
–Now, the marked edges form a forest F. Add the edges of F into the set of edges to be reported.
–Contract each tree (connected component) of F into a single node.
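The procedure above can be sketched in Python as follows. This is a minimal illustration, assuming nodes are numbered 0..n-1 and edge weights are distinct; instead of contracting the graph explicitly it tracks contracted nodes with a union-find structure, which is an implementation choice of this sketch and not something stated on the slide. The function name `boruvka_mst` is hypothetical.

```python
def boruvka_mst(n, edges):
    """Return the MST edges of a connected graph on nodes 0..n-1.

    edges: list of (u, v, w) tuples with distinct weights w.
    """
    parent = list(range(n))  # union-find: one set per contracted node

    def find(x):
        # Find the representative of x's set, with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    components = n
    while components > 1:
        # Step 1: for each contracted node, mark its lightest incident edge.
        lightest = {}
        for u, v, w in edges:
            cu, cv = find(u), find(v)
            if cu == cv:
                continue  # self-loop after contraction, ignore
            for c in (cu, cv):
                if c not in lightest or w < lightest[c][2]:
                    lightest[c] = (u, v, w)
        if not lightest:
            raise ValueError("graph is not connected")
        # Steps 2 and 3: report the marked edges and contract along them.
        for u, v, w in lightest.values():
            cu, cv = find(u), find(v)
            if cu != cv:  # the same edge may be marked from both ends
                parent[cu] = cv
                mst.append((u, v, w))
                components -= 1
    return mst
```

For example, `boruvka_mst(4, [(0, 1, 1), (1, 2, 2), (2, 3, 3), (0, 3, 4), (0, 2, 5)])` reports the three edges of weight 1, 2, and 3 after a single round.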

14 14 Illustration [Figure: an example graph with edge weights ranging from 1 to 5.1.]

15 15 Running time = O(m log n): each round takes O(m) time and at least halves the number of nodes, so there are O(log n) rounds.

16 16 Karger-Klein-Tarjan

17 17 Question: What edges can be deleted without affecting the optimality of the output tree? Resorting to the cycle property!

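18 18 T-heavy edges An edge uv of G − T is T-heavy if it is heavier than every edge on the path of T between u and v. [Figure: a tree T and an edge uv of G − T.]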

19 19 The Heaviness Lemma

20 20 Illustration [Figure: an example graph with edge weights ranging from 1 to 5.1.]

21 21 Tool 1: Dixon-Rauch-Tarjan [SIAM J. Computing 1992] –Given a spanning tree T of G, it takes (deterministic) O(m) time to output all T-heavy edges of G.

22 22 Verifying MST is easier! It follows from Dixon-Rauch-Tarjan that verifying whether an input tree T is the minimum spanning tree of G can be done in (deterministic) O(m) time.
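To make the connection concrete, here is a naive Python sketch. It is NOT the Dixon-Rauch-Tarjan algorithm: it runs one tree traversal per distinct endpoint, so it can take O(mn) time and not O(m). It assumes distinct edge weights, that T is a spanning tree of G, and that the usual definition applies (a non-tree edge uv is T-heavy if it is heavier than every edge on the tree path between u and v). The function names are hypothetical.

```python
from collections import defaultdict


def t_heavy_edges(tree_edges, edges):
    """Return the edges of G - T that are T-heavy (naive, not linear time).

    tree_edges, edges: lists of (u, v, w) tuples; every tree edge must
    appear in `edges` as the same tuple.
    """
    adj = defaultdict(list)
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    def max_weight_on_paths_from(src):
        # Traverse T from src, recording for every node the maximum
        # edge weight on the tree path from src to that node.
        best = {src: float("-inf")}
        stack = [src]
        while stack:
            x = stack.pop()
            for y, w in adj[x]:
                if y not in best:
                    best[y] = max(best[x], w)
                    stack.append(y)
        return best

    tree = set(tree_edges)
    cache = {}
    heavy = []
    for e in edges:
        if e in tree:
            continue
        u, v, w = e
        if u not in cache:
            cache[u] = max_weight_on_paths_from(u)
        if w > cache[u][v]:
            heavy.append(e)
    return heavy


def is_mst(tree_edges, edges):
    """With distinct weights, T is the MST iff every non-tree edge is T-heavy."""
    tree = set(tree_edges)
    non_tree = [e for e in edges if e not in tree]
    return len(t_heavy_edges(tree_edges, edges)) == len(non_tree)
```

The verification step relies on the cycle property: if some non-tree edge is not T-heavy, swapping it for the heaviest edge on its tree path yields a lighter spanning tree.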

23 23 Tool 2: A Sampling Lemma

24 24 The (recursive) algorithm

25 25 Expected Running Time

26 26 Comments The original sampling lemma, which is slightly more complicated, is due to Karger, Klein, and Tarjan. The version we see is due to Timothy Chan [IPL 1998]. –The statement and its proof are both extremely simple!

27 27 Chan’s Proof

30 30 Shuffling cards

31 31 Top-In Shuffling Suppose that we are given a deck of n cards. In each iteration, we take the top card and insert it back into the deck at a random position: there are n positions, each chosen with probability 1/n.
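One iteration of this shuffle can be simulated with a short Python sketch. It assumes the deck is a list whose index 0 is the top card; the function names are hypothetical.

```python
import random


def top_in_shuffle_step(deck, rng=random):
    """Perform one iteration in place: move the top card to a random position."""
    card = deck.pop(0)                  # take the top card
    pos = rng.randrange(len(deck) + 1)  # n possible positions, each with prob. 1/n
    deck.insert(pos, card)              # pos 0 puts the card back on top


def top_in_shuffle(deck, iterations, rng=random):
    """Return a copy of the deck after the given number of iterations."""
    deck = list(deck)
    for _ in range(iterations):
        top_in_shuffle_step(deck, rng)
    return deck
```

For instance, `top_in_shuffle(range(52), 300)` returns a rearranged copy of a 52-card deck; passing a seeded `random.Random` instance as `rng` makes a run reproducible.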

32 32 Question How many iterations are required to make the deck random?

