A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees 黃則翰 R96922141 蘇承祖 R96922077 張紘睿 R96922136 許智程 D95922022 戴于晉 R96922171 David R. Karger.


1 A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees 黃則翰 R96922141 蘇承祖 R96922077 張紘睿 R96922136 許智程 D95922022 戴于晉 R96922171 David R. Karger Philip N. Klein Robert E. Tarjan

2 Outline Introduction Basic Property & Definition Algorithm Analysis

3 Outline Introduction Basic Property & Definition Algorithm Analysis

4 Introduction Borůvka [1926]: O(m log n) Gabow et al. [1984]: O(m log β(m,n)) ◦ β(m,n) = min{ i | log^(i) n ≤ m/n } Verification algorithm ◦ King [1993]: O(m) This paper: a randomized algorithm that runs in O(m) time with high probability

5 Outline Introduction Basic Property & Definition Algorithm Analysis

6 Cycle property For any cycle C in a graph, the heaviest edge in C does not appear in the minimum spanning forest. [Figure: example graph with edge weights 2, 3, 5, 6]

7 Cut property For any proper nonempty subset X of the vertices, the lightest edge with exactly one endpoint in X belongs to the minimum spanning forest. [Figure: example graph with edge weights 2, 3, 5, 6 and a vertex subset X]

8 Definition Let G be a graph with weighted edges. ◦ w(x,y): the weight of edge {x,y} If F is a forest in G: ◦ F(x,y): the path (if any) connecting x and y in F ◦ w_F(x,y): the maximum weight of an edge on F(x,y) ◦ w_F(x,y) = ∞ if x and y are not connected in F

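9 F-heavy & F-light An edge {x,y} is F-heavy if w(x,y) > w_F(x,y), and F-light otherwise. Edges of F are all F-light. [Figure: two example forests, one on vertices A, B, C, D and one on vertices E, F, G, H, with edge weights 2, 3, 5, 6] Examples: w(B,D)=6 > w_F(B,D)=max{2,3,5}=5, so {B,D} is F-heavy; w(F,H)=6 and w_F(F,H)=∞, so {F,H} is F-light; w(C,D)=5 and w_F(C,D)=5, so {C,D} is F-light.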
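The F-heavy / F-light test above can be sketched in a few lines of Python. This is only an illustration: the function names and the exact shape of the example forest (a path B-A-C-D with weights 2, 3, 5) are assumptions, since the slide figure did not survive in the transcript. Note that the vertex named "F" in the example is unrelated to the forest F.

```python
import math
from collections import defaultdict

def max_weight_on_path(forest_edges, x, y):
    """Return w_F(x, y): the heaviest edge weight on the path from x to y
    in the forest, or math.inf if x and y are not connected."""
    adj = defaultdict(list)
    for u, v, w in forest_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    # Depth-first search from x, carrying the heaviest weight seen so far.
    # A forest has no cycles, so skipping the parent is enough to avoid loops.
    stack = [(x, None, -math.inf)]
    while stack:
        node, parent, heaviest = stack.pop()
        if node == y:
            return heaviest
        for nxt, w in adj[node]:
            if nxt != parent:
                stack.append((nxt, node, max(heaviest, w)))
    return math.inf

def is_f_heavy(forest_edges, x, y, w):
    """An edge {x, y} of weight w is F-heavy iff w > w_F(x, y)."""
    return w > max_weight_on_path(forest_edges, x, y)

# Hypothetical forest consistent with the numbers on the slide.
forest = [("B", "A", 2), ("A", "C", 3), ("C", "D", 5)]
print(is_f_heavy(forest, "B", "D", 6))  # heavier than every edge on the path
print(is_f_heavy(forest, "C", "D", 5))  # a forest edge itself
print(is_f_heavy(forest, "F", "H", 6))  # endpoints not connected in the forest
```

This per-edge search is for illustration only; the algorithm in the talk relies on a linear-time verification algorithm to find all F-heavy edges at once.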

10 Observation No F-heavy edge can be in the minimum spanning forest of G (cycle property), so F-heavy edges can be discarded. Only F-light edges remain as candidate edges for the minimum spanning forest of G.

11 Outline Introduction Basic Property & Definition Algorithm Analysis

12 Borůvka step 1. For each vertex, select the minimum-weight edge incident to the vertex. 2. Replace by a single vertex each connected component defined by the selected edges. 3. Delete all resulting isolated vertices, loops, and all but the lowest-weight edge among each set of multiple edges.
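A single Borůvka step as described on this slide might look like the following sketch. The function name, the (weight, u, v) edge format, and the use of a union-find structure for contraction are assumptions made for illustration; edge weights are assumed to be distinct.

```python
def boruvka_step(vertices, edges):
    """One Boruvka step. edges is a list of (weight, u, v) with distinct weights.
    Returns (contracted_edges, new_vertices, new_edges)."""
    # 1. For each vertex, select the minimum-weight incident edge.
    best = {}
    for w, u, v in edges:
        if u == v:
            continue  # ignore loops
        for x in (u, v):
            if x not in best or w < best[x][0]:
                best[x] = (w, u, v)
    contracted = set(best.values())

    # 2. Contract each connected component of the selected edges (union-find).
    parent = {x: x for x in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for w, u, v in contracted:
        parent[find(u)] = find(v)

    # 3. Drop loops and keep only the lightest edge between two components.
    lightest = {}
    for w, u, v in edges:
        cu, cv = find(u), find(v)
        if cu == cv:
            continue
        key = frozenset((cu, cv))
        if key not in lightest or w < lightest[key][0]:
            lightest[key] = (w, cu, cv)
    new_edges = list(lightest.values())
    # Isolated vertices disappear: only endpoints of surviving edges remain.
    new_vertices = {x for _, a, b in new_edges for x in (a, b)}
    return contracted, new_vertices, new_edges
```

Every vertex with at least one incident edge is merged with a neighbor, so the number of non-isolated vertices at least halves per step.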

13 Algorithm Step1 Apply two successive Borůvka steps to the graph, thereby reducing the number of vertices by at least a factor of four.

14 Algorithm Step2 Choose a subgraph H by selecting each edge independently with probability ½. Apply the algorithm recursively to H, producing a minimum spanning forest F of H. Find all the F-heavy edges and delete them from the contracted graph.

15 Algorithm Step3 Apply the algorithm recursively to the remaining graph to compute its minimum spanning forest F’. Return those edges contracted in Step1 together with the edges of F’.
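Steps 1 to 3 can be put together in a compact Python sketch. It is not the linear-time algorithm from the talk: the F-heavy edges are found here by a naive path search per edge instead of a linear-time verification algorithm, so only the recursive structure is faithful. The function names, the (weight, u, v, edge_id) tuple format, and the assumption of distinct edge weights are all illustrative choices.

```python
import random
from collections import defaultdict

def boruvka_step(edges):
    """edges: list of (w, u, v, eid). Returns (chosen edge ids, contracted edges)."""
    best = {}
    for e in edges:
        if e[1] == e[2]:
            continue  # ignore loops
        for x in (e[1], e[2]):
            if x not in best or e[0] < best[x][0]:
                best[x] = e
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = set()
    for w, u, v, eid in best.values():
        chosen.add(eid)
        parent[find(u)] = find(v)
    lightest = {}  # lightest surviving edge between each pair of components
    for w, u, v, eid in edges:
        cu, cv = find(u), find(v)
        if cu != cv:
            key = frozenset((cu, cv))
            if key not in lightest or w < lightest[key][0]:
                lightest[key] = (w, cu, cv, eid)
    return chosen, list(lightest.values())

def path_max(adj, x, y):
    """w_F(x, y) in the forest given by adj, or infinity if not connected."""
    stack = [(x, None, float("-inf"))]
    while stack:
        node, par, heaviest = stack.pop()
        if node == y:
            return heaviest
        for nxt, w in adj[node]:
            if nxt != par:
                stack.append((nxt, node, max(heaviest, w)))
    return float("inf")

def randomized_msf(edges, rng=random):
    """Return the set of edge ids in the minimum spanning forest."""
    if not edges:
        return set()
    # Step 1: two Boruvka steps; g plays the role of the contracted graph G*.
    c1, g = boruvka_step(edges)
    c2, g = boruvka_step(g)
    contracted = c1 | c2
    if not g:
        return contracted
    # Step 2: sample H with p = 1/2, recurse to get F, delete F-heavy edges.
    h = [e for e in g if rng.random() < 0.5]
    f_ids = randomized_msf(h, rng)
    adj = defaultdict(list)
    for w, u, v, eid in g:
        if eid in f_ids:
            adj[u].append((v, w))
            adj[v].append((u, w))
    light = [e for e in g if e[0] <= path_max(adj, e[1], e[2])]
    # Step 3: recurse on the remaining (F-light) edges.
    return contracted | randomized_msf(light, rng)
```

Because edge ids are carried through the contractions, the returned set refers to edges of the original graph. With distinct weights the minimum spanning forest is unique, so the result can be compared directly against Kruskal's algorithm.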

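16 [Figure: one level of the recursion. Original problem G → (Borůvka × 2) → G*. Left sub-problem: sample G* with p = 0.5 to obtain H; the recursive call returns the minimum spanning forest F of H. Right sub-problem: delete the F-heavy edges from G* to obtain G’; the recursive call returns F’.]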

17 Correctness By the cut property, every edge contracted during Step1 is in the MSF. By the cycle property, the edges deleted in Step2 do NOT belong to the MSF. By the induction hypothesis, the MSF of the remaining graph is correctly determined in the recursive call of Step3.

18 Candidate Edges of the MST For every sample graph H with minimum spanning forest F, the F-light edges of G are the candidate edges for the MST. The expected number of F-light edges in G is at most n/p (negative binomial).

19 Random sampling Purpose: to help discard edges that cannot be in the minimum spanning tree. Construct the sample graph H and its forest F together: ◦ Process the edges in increasing order of weight ◦ To process an edge e: ◦ 1. Test whether both endpoints of e are in the same component of F (if so, e is F-heavy) ◦ 2. Include the edge in H with probability p ◦ 3. If e is in H and is F-light, add e to the forest F
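The edge-by-edge construction above can be simulated directly. The sketch below (function name and edge format are assumptions) builds F with a union-find structure and counts how many edges are F-light, which is the quantity the n/p bound talks about. An edge is F-light exactly when its endpoints are in different components of F at the moment it is processed, because every edge added to F later is heavier.

```python
import random

def sample_and_count_light(n, edges, p, rng):
    """Simulate the sampling process on vertices 0..n-1.
    edges: list of (weight, u, v) with distinct weights.
    Returns (forest, number_of_f_light_edges)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest, light = [], 0
    for w, u, v in sorted(edges):      # increasing order of weight
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                   # F-heavy: the coin flip does not matter
        light += 1                     # F-light edge
        if rng.random() < p:           # edge is sampled into H
            parent[ru] = rv            # ... and therefore joins F
            forest.append((w, u, v))
    return forest, light
```

Each F-light edge joins F with probability p, and F can hold at most n-1 edges, so the number of F-light edges behaves like the number of coin flips needed to see at most n-1 heads. Averaging `light` over many runs on a dense graph with p = 0.5 should therefore stay below about 2n.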

20 Random sampling [Figure: graph G on vertices A to G with edge weights 3, 4, 5, 6, 7, 9, 10, 11, 13, 14, its sample H, and the forest F of H] Examples: w(E,G)=14 > w_F(E,G)=max{5,6,9,13}=13, F-heavy; w(E,F)=11 > w_F(E,F)=max{5,6,9}=9, F-heavy; w(D,F)=9 and w_F(D,F)=9, F-light; w(A,B)=7 and w_F(A,B)=∞, F-light.

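21 Random sampling [Figure: the graph G shown twice, comparing two ways to obtain F] First view: 1. Randomly select edges into H. 2. Find the forest F of H. Second view: 1. Process the edges in increasing order. 2. If the edge is F-light, toss a coin and add the edge if it is selected. 3. Otherwise, toss the coin but do not add the edge.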

22 Observation No F-heavy edge can be in the minimum spanning forest of G (cycle property). F-light edges are the candidate edges for the minimum spanning forest of G. The forest F produced is the forest that Kruskal's algorithm would produce on H, and the F-light edges include every edge of the minimum spanning forest of G.

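23 Observation The size of F is at most n-1. The expected number of F-light edges in G is at most n/p (negative binomial): the mean number of trials needed for k successes with success probability p is k/p, and here k ≤ n-1, so the expected number of F-light edges is at most (n-1)/p < n/p.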

24 Outline Introduction Basic Property & Definition Algorithm Analysis

25 Analysis of the Algorithm The worst case. The expected running time. The probability of linear running time.

26 Running time analysis Total running time = the sum of the running times of all steps. Step 1: two Borůvka steps. Step 2: Dixon-Rauch-Tarjan verification algorithm. Each step takes time linear in the number of edges. ◦ So it suffices to estimate the total number of edges.

27 Observe the recursion tree G=(V,E), |V| = n, |E| = m. ◦ m ≥ n/2 since there are no isolated vertices. Each problem generates at most 2 subproblems. ◦ At depth d, there are at most 2^d nodes. ◦ Each node at depth d has at most n/4^d vertices. The depth d is at most log_4 n. ◦ There are at most Σ_{d≥0} 2^d · n/4^d = Σ_{d≥0} n/2^d = 2n vertices in the original problem and all subproblems.

28 The worst case Theorem 4.1 The worst-case running time of the minimum-spanning-forest algorithm is O(min{n^2, m log n}), the same as the bound for Borůvka’s algorithm. Proof: there are two ways to bound the total number of edges. 1. A subproblem at depth d contains at most (n/4^d)^2/2 edges. → The total number of edges in all subproblems is at most Σ_{d≥0} 2^d · (n/4^d)^2/2 = (n^2/2) · Σ_{d≥0} 1/8^d = O(n^2).

29 The worst case 2. Consider a subproblem G=(V,E). After Step 1 we have G’=(V’,E’) with |E’| ≤ |E| - |V|/2 and |V’| ≤ |V|/4. Edges in the left child = |H|. Edges in the right child ≤ |E’| - |H| + |F|. So the number of edges in the two subproblems is at most |H| + (|E’| - |H| + |F|) = |E’| + |F| ≤ |E| - |V|/2 + |V|/4 ≤ |E|. The two subproblems together contain at most |E| edges.

30 The worst case [Figure: recursion tree; each level contains at most m edges in total]

31 The worst case The depth is at most log_4 n and each level has at most m edges, so there are at most m log_4 n = O(m log n) edges in total. Combining both bounds, the worst-case running time of the minimum-spanning-forest algorithm is O(min{n^2, m log n}).

32 Analysis of the Algorithm The worst case. The expected running time. The probability of linear running time.

33 Analysis – Average Case (1/8) Theorem: the expected running time of the minimum spanning forest algorithm is O(m). ◦ Calculating the expected total number of edges over all left paths [Figure: recursion tree with the original problem, its left and right sub-problems, and their left and right sub-sub-problems]

34 Analysis – Average Case (2/8) Part 1: calculate the expected total number of edges on one left path starting at a problem with m’ edges. Part 2: evaluate the total number of edges in all right sub-problems. [Figure: a left path starting at a problem with m’ edges; expected total number of edges ≤ 2m’]

35 Analysis – Average Case (3/8) Calculating the expected total number of edges on one left path starting at a problem with m’ edges. [Figure: original problem G → (Borůvka × 2) → G*; sampling G* with p = 0.5 gives the left sub-problem H; G’ is the right sub-problem] 1. E[number of edges of H] = 0.5 × number of edges of G*. 2. ∵ Borůvka × 2 ∴ number of edges of G* ≤ number of edges of G. Hence E[number of edges of H] ≤ 0.5 × number of edges of G.

36 Analysis – Average Case (4/8) Calculating the expected total number of edges on one left path starting at a problem with m’ edges. [Figure: original problem G → (Borůvka × 2) → G*; sampling G* with p = 0.5 gives the left sub-problem H; G’ is the right sub-problem] Since E[number of edges of H] ≤ 0.5 × number of edges of G, the first problem on the path has m’ edges, the next has at most 0.5 × m’ edges in expectation, and so on. Expected total number of edges ≤ Σ_{i≥0} m’/2^i = 2m’.

37 Analysis – Average Case (5/8) Part 1: the expected total number of edges on one left path L starting at a problem with m’ edges ◦ Expected total number of edges on L ≤ 2m’ Part 2: evaluate the total number of edges in all right sub-problems ◦ To prove: E[total edges of all right sub-problems] ≤ n

38 Analysis – Average Case (6/8) Evaluating the total number of edges in all right sub-problems ◦ To prove: E[total edges of all right sub-problems] ≤ n [Figure: original problem G → (Borůvka × 2) → G*; sample with p = 0.5 to obtain H; the recursion returns the minimum spanning forest F of H; deleting the F-heavy edges from G* gives the right sub-problem G’] 1. ∵ Borůvka × 2 ∴ number of vertices of G* ≤ 0.25 × number of vertices of G. 2. Based on Lemma 2.1: E[number of edges of G’] ≤ 2 × number of vertices of G*. Hence E[number of edges of G’] ≤ 0.5 × number of vertices of G.

39 Analysis – Average Case (7/8) Evaluating the total number of edges in all right sub-problems ◦ To prove: E[total edges of all right sub-problems] ≤ n Apply E[number of edges of G’] ≤ 0.5 × number of vertices of G at each level of the recursion tree. The original problem has n vertices, and the sub-problems at depths 1, 2, 3, 4 have at most 2×n/4, 4×n/4^2, 8×n/4^3, 16×n/4^4 vertices in total. The right sub-problems created at successive levels therefore have at most n/2, 2×n/8, 4×n/(4^2×2), 8×n/(4^3×2), ... expected edges in total. Summing over all levels: n/2 + n/4 + n/8 + n/16 + ... = n.

40 Analysis – Average Case (8/9) Evaluating the total number of edges in all right sub-problems ◦ E[total vertices of all right sub-problems] ≤ n/2 ◦ To prove: E[processed edges of one sub-problem] ≤ 2 × number of vertices of this sub-problem [Figure: original problem G → (Borůvka × 2) → G*; sample with p = 0.5 to obtain the left sub-problem H; G’ is the right sub-problem] 1. E[processed edges of G] ≤ E[number of coin-toss trials] = number of vertices of G* / 0.5. 2. Number of vertices of G* ≤ number of vertices of G. Hence E[processed edges of G] ≤ 2 × number of vertices of G, and E[number of processed edges of all right sub-problems] ≤ n.

41 Analysis – Average Case (8/8) Part 1: the expected total number of edges on one left path starting at a problem with m’ edges ◦ Expected total number of edges on one left path ≤ 2m’ Part 2: the total number of edges in all right sub-problems ◦ E[total edges of all right sub-problems] ≤ n Combining both parts: E[processed edges in the original problem and all sub-problems] ≤ 2 × (m + n), which is O(m) since n ≤ 2m.
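The two parts of the average-case argument can be written out as one short derivation. This is a sketch that follows the bounds stated on the slides: m' is the number of edges at the start of a left path, d is the depth in the recursion tree, there are at most 2^{d-1} right sub-problems at depth d, and each has at most n/4^d vertices.

```latex
\begin{align*}
E[\text{edges on one left path}]
  &\le \sum_{i \ge 0} \frac{m'}{2^{i}} = 2m', \\
E[\text{edges in all right sub-problems}]
  &\le \sum_{d \ge 1} 2^{d-1} \cdot 2 \cdot \frac{n}{4^{d}}
   = \sum_{d \ge 1} \frac{n}{2^{d}} = n, \\
E[\text{edges in all problems}]
  &\le 2\,\bigl(m + E[\text{edges in all right sub-problems}]\bigr)
   \le 2(m + n).
\end{align*}
```

The last line uses the fact that every problem lies on exactly one left path that starts either at the original problem (m edges) or at some right sub-problem.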

42 Analysis of the Algorithm The worst case. The expected running time. The probability of linear running time.

43 The Probability of Linearity Theorem 4.3 ◦ The minimum spanning forest algorithm runs in O(m) time with probability 1 – exp(-Ω(m))

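44 The Probability of Linearity Chernoff bound (in a standard form): given i.i.d. 0/1 random variables x_1, ..., x_k with X = Σ x_i, μ = E[X], and 0 < δ < 1, we have Pr[X < (1-δ)μ] < exp(-μδ^2/2). Thus, the probability of fewer than s successes (each with chance p) within k trials, for s < pk, is at most exp(-(pk-s)^2/(2pk)).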

45 The Probability of Linearity Right subproblems ◦ The total number of vertices in all right subproblems is at most n/2 (shown in the proof of Theorem 4.2) ◦ So n/2 is an upper bound on the total number of heads in the nickel flips

46 Right Subproblems The probability that ◦ fewer than n/2 heads occur in a sequence of 3m nickel tosses (note that m + n ≤ 3m since n/2 ≤ m) is exp(-Ω(m)) by a Chernoff bound
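The claim above can be checked numerically for small m. The sketch below compares the exact binomial probability of seeing fewer than s heads in k fair tosses with the standard Chernoff lower-tail bound exp(-(pk - s)^2 / (2pk)). The function names are illustrative, and the choice s = m, k = 3m corresponds to the extreme case n = 2m.

```python
from math import comb, exp

def binom_lower_tail(k, p, s):
    """Exact P[X < s] for X ~ Binomial(k, p)."""
    return sum(comb(k, i) * p**i * (1 - p) ** (k - i) for i in range(s))

def chernoff_lower_tail(k, p, s):
    """Chernoff bound exp(-(pk - s)^2 / (2pk)) on P[X < s]; requires 0 <= s <= pk."""
    mu = p * k
    if not 0 <= s <= mu:
        raise ValueError("the bound is only stated for 0 <= s <= p*k")
    return exp(-((mu - s) ** 2) / (2 * mu))

for m in (10, 20, 40):
    exact = binom_lower_tail(3 * m, 0.5, m)
    bound = chernoff_lower_tail(3 * m, 0.5, m)
    print(m, exact, bound)
```

With p = 0.5, k = 3m and s = m the bound simplifies to exp(-m/12), which makes the exp(-Ω(m)) decay explicit; the exact tail is smaller still.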

47 The Probability of Linearity Left Subproblem ◦ Sequence: every sequence ends up with a tail, that is, HH…HHT ◦ The number of occurrences of tails is at most the number of sequences ◦ Assume that there are at most m’ edges in the root problem and in all right subproblems

48 Left Subproblems The probability that ◦ only m’ tails occur in a sequence of more than 3m’ coin tosses is exp(-Ω(m)) by a Chernoff bound

49 The Probability of Linearity Combining right & left subproblems ◦ The total number of edges is O(m) with probability at least 1 – exp(-Ω(m))

