CSC401: Analysis of Algorithms 6-1 CSC401 – Analysis of Algorithms Chapter 6 Graphs Objectives: Introduce graphs and data structures Discuss the graph connectivity and biconnectivity Present the depth-first and breath-first search algorithms as well as algorithms for finding biconnected components Introduce directed graphs and algorithms performed on directed graphs: Reachability, transitive closure, DAG
CSC401: Analysis of Algorithms6-2 Graph A graph is a pair (V, E), where –V is a set of nodes, called vertices –E is a collection of pairs of vertices, called edges –Vertices and edges are positions and store elements Example: –A vertex represents an airport and stores the three-letter airport code –An edge represents a flight route between two airports and stores the mileage of the route ORD PVD MIA DFW SFO LAX LGA HNL
CSC401: Analysis of Algorithms6-3 Edge Types Directed edge –ordered pair of vertices (u,v) –first vertex u is the origin –second vertex v is the destination –e.g., a flight Undirected edge –unordered pair of vertices (u,v) –e.g., a flight route Directed graph –all the edges are directed –e.g., route network Undirected graph –all the edges are undirected –e.g., flight network ORDPVD flight AA 1206 ORDPVD 849 miles
CSC401: Analysis of Algorithms6-4 Terminology End vertices (or endpoints) of an edge –U and V are the endpoints of a Edges incident on a vertex –a, d, and b are incident on V Adjacent vertices –U and V are adjacent Degree of a vertex –X has degree 5 Parallel edges –h and i are parallel edges Self-loop –j is a self-loop XU V W Z Y a c b e d f g h i j
CSC401: Analysis of Algorithms6-5 P1P1 Terminology (cont.) Path –sequence of alternating vertices and edges –begins with a vertex –ends with a vertex –each edge is preceded and followed by its endpoints Simple path –path such that all its vertices and edges are distinct Examples –P 1 =(V,b,X,h,Z) is a simple path –P 2 =(U,c,W,e,X,g,Y,f,W,d,V) is a path that is not simple XU V W Z Y a c b e d f g hP2P2
CSC401: Analysis of Algorithms6-6 Terminology (cont.) Cycle –circular sequence of alternating vertices and edges –each edge is preceded and followed by its endpoints Simple cycle –cycle such that all its vertices and edges are distinct Examples –C 1 =(V,b,X,g,Y,f,W,c,U,a,) is a simple cycle –C 2 =(U,c,W,e,X,g,Y,f,W,d,V,a, ) is a cycle that is not simple C1C1 XU V W Z Y a c b e d f g hC2C2
CSC401: Analysis of Algorithms6-7 Properties Notation n number of vertices n number of vertices m number of edges m number of edges deg(v) degree of vertex v Property 1 v deg(v) 2m Proof: Proof: each edge is counted twice Property 2 In an undirected graph with no self-loops and no multiple edges m n (n 1) 2 Proof: Proof: each vertex has degree at most (n 1) What is the bound for a directed graph? Example –n 4 –m 6 –deg(v) 3
CSC401: Analysis of Algorithms6-8 Main Methods of the Graph ADT Vertices and edges –are positions –store elements Accessor methods –aVertex() –incidentEdges(v) –endVertices(e) –isDirected(e) –origin(e) –destination(e) –opposite(v, e) –areAdjacent(v, w) Update methods –insertVertex(o) –insertEdge(v, w, o) –insertDirectedEdge(v, w, o) –removeVertex(v) –removeEdge(e) Generic methods –numVertices() –numEdges() –vertices() –edges()
CSC401: Analysis of Algorithms6-9 Edge List Structure Vertex object –element –reference to position in vertex sequence Edge object –element –origin vertex object –destination vertex object –reference to position in edge sequence Vertex sequence –sequence of vertex objects Edge sequence –sequence of edge objects v u w ac b a z d uv w z bcd
CSC401: Analysis of Algorithms6-10 Adjacency List Structure Edge list structure Incidence sequence for each vertex –sequence of references to edge objects of incident edges Augmented edge objects –references to associated positions in incidence sequences of end vertices u v w ab a uv w b
CSC401: Analysis of Algorithms6-11 Adjacency Matrix Structure Edge list structure Augmented vertex objects –Integer key (index) associated with vertex 2D-array adjacency array –Reference to edge object for adjacent vertices –Null for non nonadjacent vertices The “old fashioned” version just has 0 for no edge and 1 for edge u v w ab 012 0 1 2 a uv w 01 2 b
CSC401: Analysis of Algorithms6-12 Asymptotic Performance n vertices, m edges n vertices, m edges no parallel edges no parallel edges no self-loops no self-loops Bounds are “big-Oh” Bounds are “big-Oh” Edge List Adjacency List Adjacenc y Matrix Space n m n2n2n2n2 incidentEdges( v ) m deg(v) n areAdjacent ( v, w ) m min(deg(v), deg(w)) 1 insertVertex( o ) 11 n2n2n2n2 insertEdge( v, w, o ) 111 removeVertex( v ) m deg(v) n2n2n2n2 removeEdge( e ) 111
CSC401: Analysis of Algorithms6-13 Trees and Forests A (free) tree is an undirected graph T such that –T is connected –T has no cycles This definition of tree is different from the one of a rooted tree A forest is an undirected graph without cycles The connected components of a forest are trees Tree Forest
CSC401: Analysis of Algorithms6-14 Spanning Trees and Forests A spanning tree of a connected graph is a spanning subgraph that is a tree A spanning tree is not unique unless the graph is a tree Spanning trees have applications to the design of communication networks A spanning forest of a graph is a spanning subgraph that is a forest Graph Spanning tree
CSC401: Analysis of Algorithms6-15 Depth-First Search Depth-first search (DFS) is a general technique for traversing a graph A DFS traversal of a graph G –Visits all the vertices and edges of G –Determines whether G is connected –Computes the connected components of G –Computes a spanning forest of G DFS on a graph with n vertices and m edges takes O(n m ) time DFS can be further extended to solve other graph problems –Find and report a path between two given vertices –Find a cycle in the graph Depth-first search is to graphs what Euler tour is to binary trees
CSC401: Analysis of Algorithms6-16 DFS Algorithm The algorithm uses a mechanism for setting and getting “labels” of vertices and edges Algorithm DFS(G, v) Input graph G and a start vertex v of G Output labeling of the edges of G in the connected component of v as discovery edges and back edges setLabel(v, VISITED) for all e G.incidentEdges(v) if getLabel(e) UNEXPLORED w opposite(v,e) if getLabel(w) UNEXPLORED setLabel(e, DISCOVERY) DFS(G, w) else setLabel(e, BACK) Algorithm DFS(G) Input graph G Output labeling of the edges of G as discovery edges and back edges for all u G.vertices() setLabel(u, UNEXPLORED) for all e G.edges() setLabel(e, UNEXPLORED) for all v G.vertices() if getLabel(v) UNEXPLORED DFS(G, v)
CSC401: Analysis of Algorithms6-17 Properties of DFS Property 1 DFS(G, v) visits all the vertices and edges in the connected component of v Property 2 The discovery edges labeled by DFS(G, v) form a spanning tree of the connected component of v DB A C E
CSC401: Analysis of Algorithms6-18 Analysis of DFS Setting/getting a vertex/edge label takes O(1) time Each vertex is labeled twice –once as UNEXPLORED –once as VISITED Each edge is labeled twice –once as UNEXPLORED –once as DISCOVERY or BACK Method incidentEdges is called once for each vertex DFS runs in O(n m) time provided the graph is represented by the adjacency list structure –Recall that v deg(v) 2m
CSC401: Analysis of Algorithms6-19 Path Finding We can specialize the DFS algorithm to find a path between two given vertices u and z using the template method pattern We call DFS(G, u) with u as the start vertex We use a stack S to keep track of the path between the start vertex and the current vertex As soon as destination vertex z is encountered, we return the path as the contents of the stack Algorithm pathDFS(G, v, z) setLabel(v, VISITED) S.push(v) if v z return S.elements() for all e G.incidentEdges(v) if getLabel(e) UNEXPLORED w opposite(v,e) if getLabel(w) UNEXPLORED setLabel(e, DISCOVERY) S.push(e) pathDFS(G, w, z) S.pop(e) else setLabel(e, BACK) S.pop(v)
CSC401: Analysis of Algorithms6-20 Cycle Finding We can specialize the DFS algorithm to find a simple cycle using the template method pattern We use a stack S to keep track of the path between the start vertex and the current vertex As soon as a back edge (v, w) is encountered, we return the cycle as the portion of the stack from the top to vertex w Algorithm cycleDFS(G, v, z) setLabel(v, VISITED) S.push(v) for all e G.incidentEdges(v) if getLabel(e) UNEXPLORED w opposite(v,e) S.push(e) if getLabel(w) UNEXPLORED setLabel(e, DISCOVERY) pathDFS(G, w, z) S.pop(e) else T new empty stack repeat o S.pop() T.push(o) until o w return T.elements() S.pop(v)
CSC401: Analysis of Algorithms6-21 Breadth-First Search Breadth-first search (BFS) is a general technique for traversing a graph A BFS traversal of a graph G –Visits all the vertices and edges of G –Determines whether G is connected –Computes the connected components of G –Computes a spanning forest of G BFS on a graph with n vertices and m edges takes O(n m ) time BFS can be further extended to solve other graph problems –Find and report a path with the minimum number of edges between two given vertices –Find a simple cycle, if there is one
CSC401: Analysis of Algorithms6-22 BFS Algorithm The algorithm uses a mechanism for setting and getting “labels” of vertices and edges Algorithm BFS(G, s) L 0 new empty sequence L 0.insertLast(s) setLabel(s, VISITED ) i 0 while L i.isEmpty() L i 1 new empty sequence for all v L i.elements() for all e G.incidentEdges(v) if getLabel(e) UNEXPLORED w opposite(v,e) if getLabel(w) UNEXPLORED setLabel(e, DISCOVERY ) setLabel(w, VISITED ) L i 1.insertLast(w) else setLabel(e, CROSS ) i i 1 Algorithm BFS(G) Input graph G Output labeling of the edges and partition of the vertices of G for all u G.vertices() setLabel(u, UNEXPLORED ) for all e G.edges() setLabel(e, UNEXPLORED ) for all v G.vertices() if getLabel(v) UNEXPLORED BFS(G, v)
CSC401: Analysis of Algorithms6-23 Properties Notation –G s : connected component of s Property 1 BFS(G, s) visits all the vertices and edges of G s Property 2 The discovery edges labeled by BFS(G, s) form a spanning tree T s of G s Property 3 For each vertex v in L i –The path of T s from s to v has i edges –Every path from s to v in G s has at least i edges CB A E D L0L0 L1L1 F L2L2 CB A E D F
CSC401: Analysis of Algorithms6-24 Analysis Setting/getting a vertex/edge label takes O(1) time Each vertex is labeled twice –once as UNEXPLORED –once as VISITED Each edge is labeled twice –once as UNEXPLORED –once as DISCOVERY or CROSS Each vertex is inserted once into a sequence L i Method incidentEdges is called once for each vertex BFS runs in O(n m) time provided the graph is represented by the adjacency list structure –Recall that v deg(v) 2m
CSC401: Analysis of Algorithms6-25 Applications Using the template method pattern, we can specialize the BFS traversal of a graph G to solve the following problems in O(n m) time –Compute the connected components of G –Compute a spanning forest of G –Find a simple cycle in G, or report that G is a forest –Given two vertices of G, find a path in G between them with the minimum number of edges, or report that no such path exists
CSC401: Analysis of Algorithms6-26 DFS vs. BFS CB A E D L0L0 L1L1 F L2L2 CB A E D F DFS BFS ApplicationsDFSBFS Spanning forest, connected components, paths, cycles Shortest paths Biconnected components DFS: Back edge (v,w) –w is an ancestor of v in the tree of discovery edges BFS: Cross edge (v,w) –w is in the same level as v or in the next level in the tree of discovery edges
CSC401: Analysis of Algorithms6-27 Separation Edges and Vertices Definitions -- Let G be a connected graph –A separation edge of G is an edge whose removal disconnects G –A separation vertex of G is a vertex whose removal disconnects G Applications –Separation edges and vertices represent single points of failure in a network and are critical to the operation of the network Example –DFW, LGA and LAX are separation vertices –(DFW,LAX) is a separation edge ORDPVD MIA DFW SFO LAX LGA HNL
CSC401: Analysis of Algorithms6-28 Biconnected Graph Equivalent definitions of a biconnected graph G –Graph G has no separation edges and no separation vertices –For any two vertices u and v of G, there are two disjoint simple paths between u and v (i.e., two simple paths between u and v that share no other vertices or edges) –For any two vertices u and v of G, there is a simple cycle containing u and v Example ORD PVD MIA DFW SFO LAX LGA HNL
CSC401: Analysis of Algorithms6-29 Biconnected Components Biconnected component of a graph G –A maximal biconnected subgraph of G, or –A subgraph consisting of a separation edge of G and its end vertices Interaction of biconnected components –An edge belongs to exactly one biconnected component –A nonseparation vertex belongs to exactly one biconnected component –A separation vertex belongs to two or more biconnected components Example of a graph with four biconnected components ORDPVD MIA DFW SFO LAX LGA HNL RDU
CSC401: Analysis of Algorithms6-30 Equivalence Classes Given a set S, a relation R on S is a set of ordered pairs of elements of S, i.e., R is a subset of S S An equivalence relation R on S satisfies the following properties Reflexive: (x,x) R Symmetric: (x,y) R (y,x) R Transitive: (x,y) R (y,z) R (x,z) R An equivalence relation R on S induces a partition of the elements of S into equivalence classes Example (connectivity relation among the vertices of a graph): –Let V be the set of vertices of a graph G –Define the relation C = {(v,w) V V such that G has a path from v to w} –Relation C is an equivalence relation –The equivalence classes of relation C are the vertices in each connected component of graph G
CSC401: Analysis of Algorithms6-31 Link Relation Edges e and f of connected graph G are linked if –e f, or –G has a simple cycle containing e and f Theorem: The link relation on the edges of a graph is an equivalence relation Proof Sketch: –The reflexive and symmetric properties follow from the definition –For the transitive property, consider two simple cycles sharing an edge a b g c j d e f i Equivalence classes of linked edges: {a} {b, c, d, e, f} {g, i, j} a b g c j d e f i
CSC401: Analysis of Algorithms6-32 Link Components The link components of a connected graph G are the equivalence classes of edges with respect to the link relation A biconnected component of G is the subgraph of G induced by an equivalence class of linked edges A separation edge is a single-element equivalence class of linked edges A separation vertex has incident edges in at least two distinct equivalence classes of linked edge ORDPVD MIA DFW SFO LAX LGA HNL RDU
CSC401: Analysis of Algorithms6-33 Auxiliary Graph Auxiliary graph B for a connected graph G –Associated with a DFS traversal of G –The vertices of B are the edges of G –For each back edge e of G, B has edges (e,f 1 ), (e,f 2 ), …, (e,f k ), where f 1, f 2, …, f k are the discovery edges of G that form a simple cycle with e –Its connected components correspond to the the link components of G –In the worst case, the number of edges of the auxiliary graph is proportional to nm Auxiliary graph B a d b c e h i j f g a b g c j d e f i DFS on graph G h i Auxiliary graph B DFS on graph G
CSC401: Analysis of Algorithms6-34 Proxy Graph Algorithm proxyGraph(G) Input connected graph G Output proxy graph F for G F empty graph DFS(G, s) { s is any vertex of G} for all discovery edges e of G F.insertVertex(e) setLabel(e, UNLINKED) for all vertices v of G in DFS visit order for all back edges e (u,v) F.insertVertex(e) repeat f discovery edge with dest. u F.insertEdge(e,f, ) if f getLabel(f) UNLINKED setLabel(f, LINKED) u origin of edge f else u v { ends the loop } until u v return F a b g c j d e f i Proxy graph F DFS on graph G a d b c e h i j f h g i
CSC401: Analysis of Algorithms6-35 Proxy Graph (cont.) Proxy graph F for a connected graph G –Spanning forest of the auxiliary graph B –Has m vertices and O(m) edges –Can be constructed in O(n m) time –Its connected components (trees) correspond to the the link components of G Given a graph G with n vertices and m edges, we can compute the following in O(n m) time –The biconnected components of G –The separation vertices of G –The separation edges of G a b g c j d e f i Proxy graph F DFS on graph G a d b c e h i j f h g i
CSC401: Analysis of Algorithms6-36 Digraphs A digraph is a graph whose edges are all directed –Short for “directed graph” Applications –one-way streets –flights –task scheduling Properties: A graph G=(V,E) A C E B D –Each edge goes in one direction: Edge (a,b) goes from a to b, but not b to a. –If G is simple, m < n*(n-1). –If we keep in-edges and out-edges in separate adjacency lists, we can perform listing of in-edges and out-edges in time proportional to their size.
CSC401: Analysis of Algorithms6-37 Directed DFS We can specialize the traversal algorithms (DFS and BFS) to digraphs by traversing edges only along their direction In the directed DFS algorithm, we have four types of edges –discovery edges –back edges –forward edges –cross edges A directed DFS starting at a vertex s determines the vertices reachable from s A C E B D
CSC401: Analysis of Algorithms6-38 Reachability DFS tree rooted at v: vertices reachable from v via directed paths A C E B D F A C ED A C E B D F Strong Connectivity Each vertex can reach all other vertices a d c b e f g
CSC401: Analysis of Algorithms6-39 Pick a vertex v in G. Perform a DFS from v in G. –If there’s a w not visited, print “no”. Let G’ be G with edges reversed. Perform a DFS from v in G’. –If there’s a w not visited, print “no”. –Else, print “yes”. Running time: O(n+m). Strong Connectivity Algorithm G: G’: a d c b e f g a d c b e f g
CSC401: Analysis of Algorithms6-40 Maximal subgraphs such that each vertex can reach all other vertices in the subgraph Can also be done in O(n+m) time using DFS, but is more complicated (similar to biconnectivity). Strongly Connected Components { a, c, g } { f, d, e, b } a d c b e f g
CSC401: Analysis of Algorithms6-41 Transitive Closure Given a digraph G, the transitive closure of G is the digraph G* such that –G* has the same vertices as G –if G has a directed path from u to v ( u v ), G* has a directed edge from u to v The transitive closure provides reachability information about a digraph B A D C E B A D C E G G*
CSC401: Analysis of Algorithms6-42 Computing the Transitive Closure Perform DFS starting at each vertex: O(n(n+m)) Dynamic programming: Floyd-Warshall Algorithm – –If there's a way to get from A to B and from B to C, then there's a way to get from A to C. –Idea #1: Number the vertices 1, 2, …, n. –Idea #2: Consider paths that use only vertices numbered 1, 2, …, k, as intermediate vertices: k j i Uses only vertices numbered 1,…,k-1 Uses only vertices numbered 1,…,k-1 Uses only vertices numbered 1,…,k (add this edge if it’s not already in)
CSC401: Analysis of Algorithms6-43 Floyd-Warshall’s Algorithm Floyd-Warshall’s algorithm numbers the vertices of G as v 1, …, v n and computes a series of digraphs G 0, …, G n –G 0 =G –G k has a directed edge (v i, v j ) if G has a directed path from v i to v j with intermediate vertices in the set {v 1, …, v k } We have that G n = G* In phase k, digraph G k is computed from G k 1 Running time: O(n 3 ), assuming areAdjacent is O(1) (e.g., adjacency matrix) Algorithm FloydWarshall(G) Input digraph G Output transitive closure G* of G i 1 for all v G.vertices() denote v as v i i i + 1 G 0 G for k 1 to n do G k G k 1 for i 1 to n (i k) do for j 1 to n (j i, k) do if G k 1.areAdjacent(v i, v k ) G k 1.areAdjacent(v k, v j ) if G k.areAdjacent(v i, v j ) G k.insertDirectedEdge(v i, v j, k) return G n
CSC401: Analysis of Algorithms6-44 Floyd-Warshall Example JFK BOS MIA ORD LAX DFW SFO v 2 v 1 v 3 v 4 v 5 v 6
CSC401: Analysis of Algorithms6-45 Floyd-Warshall, Iteration 1 JFK BOS MIA ORD LAX DFW SFO v 2 v 1 v 3 v 4 v 5 v 6
CSC401: Analysis of Algorithms6-46 Floyd-Warshall, Iteration 2 JFK BOS MIA ORD LAX DFW SFO v 2 v 1 v 3 v 4 v 5 v 6
CSC401: Analysis of Algorithms6-47 Floyd-Warshall, Iteration 3 JFK BOS MIA ORD LAX DFW SFO v 2 v 1 v 3 v 4 v 5 v 6
CSC401: Analysis of Algorithms6-48 Floyd-Warshall, Iteration 4 JFK BOS MIA ORD LAX DFW SFO v 2 v 1 v 3 v 4 v 5 v 6
CSC401: Analysis of Algorithms6-49 Floyd-Warshall, Iteration 5 JFK MIA ORD LAX DFW SFO v 2 v 1 v 3 v 4 v 5 v 6 BOS
CSC401: Analysis of Algorithms6-50 Floyd-Warshall, Iteration 6 JFK MIA ORD LAX DFW SFO v 2 v 1 v 3 v 4 v 5 v 6 BOS
CSC401: Analysis of Algorithms6-51 Floyd-Warshall, Conclusion JFK MIA ORD LAX DFW SFO v 2 v 1 v 3 v 4 v 5 v 6 BOS
CSC401: Analysis of Algorithms6-52 DAGs and Topological Ordering A directed acyclic graph (DAG) is a digraph that has no directed cycles A topological ordering of a digraph is a numbering v 1, …, v n of the vertices such that for every edge (v i, v j ), we have i j Example: in a task scheduling digraph, a topological ordering a task sequence that satisfies the precedence constraints Theorem A digraph admits a topological ordering if and only if it is a DAG B A D C E DAG G B A D C E Topological ordering of G v1v1 v2v2 v3v3 v4v4 v5v5
CSC401: Analysis of Algorithms6-53 Topological Sorting Number vertices, so that (u,v) in E implies u < v write c.s. program play wake up eat nap study computer sci. more c.s. work out sleep dream about graphs A typical student day make cookies for professors
CSC401: Analysis of Algorithms6-54 Note: This algorithm is different than the one in Goodrich-Tamassia Running time: O(n + m). How…? Algorithm for Topological Sorting Method TopologicalSort(G) H G// Temporary copy of G n G.numVertices() while H is not empty do Let v be a vertex with no outgoing edges Label v n n n - 1 Remove v from H
CSC401: Analysis of Algorithms6-55 Topological Sorting Algorithm using DFS Simulate the algorithm by using depth-first search O(n+m) time. Algorithm topologicalDFS(G, v) Input graph G and a start vertex v of G Output labeling of the vertices of G in the connected component of v setLabel(v, VISITED) for all e G.incidentEdges(v) if getLabel(e) UNEXPLORED w opposite(v,e) if getLabel(w) UNEXPLORED setLabel(e, DISCOVERY) topologicalDFS(G, w) else {e is a forward or cross edge} Label v with topological number n n n - 1 Algorithm topologicalDFS(G) Input dag G Output topological ordering of G n G.numVertices() for all u G.vertices() setLabel(u, UNEXPLORED) for all e G.edges() setLabel(e, UNEXPLORED) for all v G.vertices() if getLabel(v) UNEXPLORED topologicalDFS(G, v)