Tirgul 8 Graph algorithms: Strongly connected components
Reminder – white path theorem
In a depth-first forest of a graph G, a vertex v is a descendant of a vertex u iff at time u.d, when DFS discovers u, there is a path from u to v consisting only of white vertices (vertices that have not been discovered yet).
Reminder – the parenthesis theorem
Let G be a graph (directed or undirected). After DFS on G, for every two vertices u and v exactly one of the following holds:
1. The intervals $[u.d, u.f]$ and $[v.d, v.f]$ are disjoint.
2. $[u.d, u.f] \subset [v.d, v.f]$ and u is a descendant of v.
3. $[v.d, v.f] \subset [u.d, u.f]$ and v is a descendant of u.
Immediate conclusion: a vertex v is a descendant of a vertex u in the depth-first forest iff $[v.d, v.f] \subseteq [u.d, u.f]$.
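The parenthesis structure can be checked on a small example. The following is a minimal Python sketch, assuming the graph is a dict that maps every vertex to a list of its out-neighbours. This representation, the function names and the example graph are illustrative assumptions, not part of the slides.

```python
def dfs_timestamps(graph):
    """Run DFS from every undiscovered vertex; return (d, f) timestamp dicts."""
    d, f = {}, {}
    time = 0

    def visit(u):
        nonlocal time
        time += 1
        d[u] = time                # u is discovered (turns gray)
        for v in graph[u]:
            if v not in d:         # v is still white
                visit(v)
        time += 1
        f[u] = time                # u is finished (turns black)

    for u in graph:
        if u not in d:
            visit(u)
    return d, f


def parenthesis_property_holds(d, f):
    """Check that every pair of intervals is either disjoint or nested."""
    for u in d:
        for v in d:
            if u == v:
                continue
            disjoint = f[u] < d[v] or f[v] < d[u]
            u_inside_v = d[v] < d[u] and f[u] < f[v]
            v_inside_u = d[u] < d[v] and f[v] < f[u]
            if [disjoint, u_inside_v, v_inside_u].count(True) != 1:
                return False
    return True


# Hypothetical example graph: a -> b -> c -> a, b -> d, d <-> e, e -> g
graph = {"a": ["b"], "b": ["c", "d"], "c": ["a"],
         "d": ["e"], "e": ["d", "g"], "g": []}
d, f = dfs_timestamps(graph)
print(parenthesis_property_holds(d, f))
```

The recursion mirrors the textbook DFS; for very deep graphs an explicit stack would be needed to stay under Python's recursion limit.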
Reminder – after DFS
[Figure: a directed graph after DFS. Each vertex is labeled with its time stamps, and the gray edges are the edges of the depth-first forest.]
[Figure: the depth-first forest of that graph, drawn above the parenthesis structure that corresponds to the discovery and finish times of each vertex.]
Strongly connected components
Definition: define a binary relation R on the vertices by: R(u, v) holds iff there is a path from u to v and a path from v to u. R is an equivalence relation, so it partitions the vertices of the graph into equivalence classes. These classes are the strongly connected components (SCCs) of the graph.
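The definition can be turned directly into a (slow) procedure: compute what each vertex can reach, then group vertices that reach each other. The sketch below assumes a dict-of-lists graph; the helper names `reachable_from` and `scc_by_definition` are made up for this illustration.

```python
def reachable_from(graph, source):
    """Return the set of vertices reachable from source (source included)."""
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def scc_by_definition(graph):
    """Group vertices into equivalence classes of the relation R."""
    reach = {u: reachable_from(graph, u) for u in graph}
    classes = []
    assigned = set()
    for u in graph:
        if u in assigned:
            continue
        # v is in the class of u iff u reaches v and v reaches u
        cls = {v for v in reach[u] if u in reach[v]}
        classes.append(cls)
        assigned |= cls
    return classes


# Hypothetical example graph with three SCCs
graph = {"a": ["b"], "b": ["c", "d"], "c": ["a"],
         "d": ["e"], "e": ["d", "g"], "g": []}
print(scc_by_definition(graph))
```

This runs one search per vertex, so it is only useful as a reference for checking faster algorithms on small inputs.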
SCC algorithm
STRONGLY-CONNECTED-COMPONENTS(G):
1. Call DFS(G) to compute the finishing time u.f of each vertex u.
2. Compute $G^T$ by reversing the direction of every edge in G.
3. Call DFS($G^T$), but in the main loop of DFS consider the vertices in order of decreasing u.f (as computed in step 1).
4. The vertices of each tree in the depth-first forest of the second search form an SCC.
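The four steps above can be sketched in Python as follows. This is a minimal illustration, assuming a dict-of-lists graph in which every vertex appears as a key; the function and variable names are not from the slides.

```python
def strongly_connected_components(graph):
    """Two-pass DFS sketch of the steps above; returns a list of components."""
    # Step 1: DFS on G, recording vertices in order of increasing finish time.
    visited = set()
    finish_order = []

    def visit(u):
        visited.add(u)
        for v in graph[u]:
            if v not in visited:
                visit(v)
        finish_order.append(u)     # u finishes after all its descendants

    for u in graph:
        if u not in visited:
            visit(u)

    # Step 2: build the transpose graph G^T by reversing every edge.
    transpose = {u: [] for u in graph}
    for u in graph:
        for v in graph[u]:
            transpose[v].append(u)

    # Step 3: DFS on G^T, taking roots in order of decreasing finish time.
    visited.clear()
    components = []

    def collect(u, component):
        visited.add(u)
        component.append(u)
        for v in transpose[u]:
            if v not in visited:
                collect(v, component)

    for u in reversed(finish_order):
        if u not in visited:
            component = []
            collect(u, component)  # Step 4: each tree is one SCC
            components.append(component)
    return components


# Hypothetical example graph with three SCCs
graph = {"a": ["b"], "b": ["c", "d"], "c": ["a"],
         "d": ["e"], "e": ["d", "g"], "g": []}
print(strongly_connected_components(graph))
```

Storing the finish order as a list avoids sorting by u.f explicitly: reversing the list yields the vertices in decreasing finish time.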
Lemma 1
Claim: Let u and v be vertices of the same SCC. Then every path from u to v is contained in this SCC.
Proof: Let u and v be vertices of the same SCC. By the definition of an SCC there is a path from u to v and a path from v to u.
Lemma 1 – cont.
Let w be a vertex on some path from u to v, $u \leadsto w \leadsto v$. By the choice of w there is a path from u to w. Since u and v belong to the same SCC, there is a path from v to u, $v \leadsto u$. Again by the choice of w there is a path from w to v, $w \leadsto v$. Concatenating the paths $w \leadsto v$ and $v \leadsto u$ gives a path from w to u. Hence u and w are in the same SCC.
Lemma 2
Claim: After DFS, all the vertices that belong to the same SCC belong to a single tree in the depth-first forest.
Proof: Let r be the first vertex of the SCC that DFS discovers. At the time DFS discovers r, all the other vertices of the SCC are white. There is a path from r to each of those vertices. By the previous lemma, this path is contained in the SCC and is therefore white. By the white path theorem, each vertex in the SCC becomes a descendant of r in the depth-first forest.
Forefather - definition
Definition: Let G be a graph with DFS finish times. Let $\varphi(u) = \arg\max_{w} \{\, w.f : u \leadsto w \,\}$. We say that $\varphi(u)$ is the forefather of u. In other words, the forefather of u is the vertex reachable from u that finished last in the DFS. Note: $\varphi(u) = u$ is possible.
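Forefather - properties
We will show that $\varphi(\varphi(u)) = \varphi(u)$.
For two vertices u, v it is clear that $u \leadsto v \Rightarrow \varphi(v).f \le \varphi(u).f$, since every vertex reachable from v is also reachable from u.
$u \leadsto \varphi(u)$ by definition. Thus, $\varphi(\varphi(u)).f \le \varphi(u).f$.
And of course $\varphi(u).f \le \varphi(\varphi(u)).f$ (by the definition of forefather, because $\varphi(u)$ is reachable from itself).
Therefore $\varphi(u).f = \varphi(\varphi(u)).f$, which implies $\varphi(\varphi(u)) = \varphi(u)$ because finish times are distinct.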
SCC representative
In the following two theorems we will prove that in each SCC there is one vertex which is the forefather of all the vertices in this SCC. This forefather is the "representative" of the SCC. It is the first vertex of the SCC to be discovered and the last to finish. In the DFS on $G^T$ it is the root of a depth-first tree.
Theorem 1: $\varphi(u)$ is an ancestor of u in the depth-first forest.
Theorem 2: Two vertices are in the same SCC iff they have the same forefather in the DFS on G.
Theorem 1
Theorem: $\varphi(u)$ is an ancestor of u in the depth-first forest.
Conclusion: u and $\varphi(u)$ belong to the same SCC (immediate: $u \leadsto \varphi(u)$ by definition, and an ancestor reaches its descendants).
Proof: If $\varphi(u) = u$ we are done. Otherwise, consider the color of $\varphi(u)$ at time u.d.
If $\varphi(u)$ is black then $\varphi(u).f < u.f$, which contradicts $\varphi(u).f \ge u.f$.
If $\varphi(u)$ is gray then it is an ancestor of u and we are done.
It remains to prove that $\varphi(u)$ cannot be white at time u.d.
Theorem 1 – cont.
Assume that $\varphi(u)$ is white at time u.d. Consider the vertices on the path from u to $\varphi(u)$.
Case 1: all the vertices on the path are white. Then, by the white path theorem, $\varphi(u)$ is a descendant of u in the depth-first forest, so $\varphi(u).f < u.f$, again a contradiction.
Case 2: there is a non-white vertex on the path. Let t be the last vertex on the path which is not white. t must be gray, since there is never an edge from a black vertex to a white vertex. But then there is a white path from t to $\varphi(u)$, so $\varphi(u)$ becomes a descendant of t. Thus $\varphi(u).f < t.f$. Since t is reachable from u, this contradicts the definition of $\varphi(u)$.
Theorem 2
Theorem: Two vertices u, v are in the same SCC iff $\varphi(u) = \varphi(v)$.
Proof: Assume that $\varphi(u) = \varphi(v)$. By the conclusion of the previous theorem, u and $\varphi(u)$ are in the same SCC, and v and $\varphi(v)$ are in the same SCC. By transitivity, u and v are in the same SCC.
Theorem 2 – cont.
Assume that u and v are in the same SCC. There is a path from u to v and vice versa. Thus, the set of vertices reachable from u is equal to the set of vertices reachable from v. By the definition of forefather, $\varphi(u) = \varphi(v)$.
Now we have enough tools to understand the SCC algorithm.
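The forefather properties and Theorem 2 can be checked by brute force on a small graph. The sketch below assumes a dict-of-lists graph and computes each forefather straight from the definition (the reachable vertex with the largest finish time). All names and the example graph are illustrative assumptions.

```python
def finish_times(graph):
    """DFS over all vertices; return the finish time u.f of every vertex."""
    finished = {}
    discovered = set()
    clock = 0

    def visit(u):
        nonlocal clock
        discovered.add(u)
        clock += 1                 # tick for the discovery of u
        for v in graph[u]:
            if v not in discovered:
                visit(v)
        clock += 1                 # tick for the finish of u
        finished[u] = clock

    for u in graph:
        if u not in discovered:
            visit(u)
    return finished


def reachable_from(graph, source):
    """Return the set of vertices reachable from source (source included)."""
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def forefathers(graph):
    """Map each vertex u to the reachable vertex with the largest finish time."""
    f = finish_times(graph)
    return {u: max(reachable_from(graph, u), key=f.get) for u in graph}


# Hypothetical example graph with three SCCs
graph = {"a": ["b"], "b": ["c", "d"], "c": ["a"],
         "d": ["e"], "e": ["d", "g"], "g": []}
phi = forefathers(graph)

# The forefather of a forefather is itself.
assert all(phi[phi[u]] == phi[u] for u in graph)

# Theorem 2: same SCC iff same forefather.
for u in graph:
    for v in graph:
        same_scc = v in reachable_from(graph, u) and u in reachable_from(graph, v)
        assert same_scc == (phi[u] == phi[v])
```

Because finish times are distinct, `max` has a unique answer and the forefather is well defined.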
DFS on $G^T$, why?
Consider the vertex r with the largest finishing time computed by the first DFS. r is a forefather: no vertex finishes later, so $\varphi(r) = r$. The SCC of r, as we proved, consists of exactly the vertices whose forefather is r. Since r has the largest finishing time, those are simply all the vertices from which r can be reached, or, in other words, all the vertices that can be reached from r in $G^T$.
Afterwards, the algorithm continues by choosing the vertex r', which is the vertex with the largest finishing time that is not in the same SCC as r. Again r' is the forefather of itself and the same process is repeated.
SCC correctness
Theorem: The procedure SCC(G) correctly computes the strongly connected components of a graph G.
Proof: By induction on the number of depth-first trees in the DFS on $G^T$. Each step proves that a depth-first tree constructed in this DFS is an SCC, assuming that all the previously constructed trees are SCCs.
Induction basis: trivial. The first tree has no previous trees, so the assumption holds vacuously.
SCC correctness – cont.
Consider a depth-first tree T with root r, created in the DFS on $G^T$. Denote $C(r) = \{\, v \mid \varphi(v) = r \,\}$. We now prove that $u \in T \iff u \in C(r)$.
Assume $u \in C(r)$. By Theorem 2, u is in the same SCC as r. By Lemma 2 (applied to the DFS on $G^T$, which has the same SCCs as G), u is in the same depth-first tree as r, so $u \in T$.
Assume $u \in T$. We will show that every vertex w with $\varphi(w) \ne r$ is not in T; since $u \in T$, it follows that $\varphi(u) = r$.
SCC correctness – cont.
If $\varphi(w).f > r.f$, then by the induction hypothesis w is already in the tree rooted at $\varphi(w)$ and cannot be in the current tree.
If $\varphi(w).f < r.f$: if w were in T, then $w \leadsto r$ in G, and by the definition of forefather $\varphi(w).f \ge r.f$, since r is reachable from w. Contradiction!