Graphs Part 1
Outline and Reading
Graphs (§13.1)
  Definition
  Applications
  Terminology
  Properties
  ADT
Data structures for graphs (§13.2)
  Edge list structure
  Adjacency list structure
  Adjacency matrix structure
Graph
Graphs, November 16, 2018
A graph is a pair (V, E), where
  V is a set of nodes, called vertices
  E is a collection of pairs of vertices, called edges
  Vertices and edges are positions and store elements
Graphs represent connectivity information between objects
Example:
  A vertex represents an airport and stores the three-letter airport code
  An edge represents a flight route between two airports and stores the mileage of the route
[Figure: flight-route graph on the airports PVD, ORD, SFO, LGA, HNL, LAX, DFW, MIA, with each edge labeled by its mileage]
Edge & Graph Types
Edge Types
  Directed edge
    ordered pair of vertices (u,v)
    first vertex u is the origin
    second vertex v is the destination
    e.g., a flight
  Undirected edge
    unordered pair of vertices (u,v)
    e.g., a flight route
  Weighted edge
Graph Types
  Directed graph (digraph): all the edges are directed
  Undirected graph: all the edges are undirected
  Weighted graph: all the edges are weighted
Applications
Electronic circuits
  Printed circuit board
  Integrated circuit
Transportation networks
  Highway network
  Flight network
Computer networks
  Local area network
  Internet
Databases
  Entity-relationship diagram
Terminology
End points (or end vertices) of an edge
  U and V are the endpoints of a
Edges incident on a vertex
  a, d, and b are incident on V
Adjacent vertices
  U and V are adjacent
Degree of a vertex
  X has degree 5
Parallel (multiple) edges
  h and i are parallel edges
Self-loop
  j is a self-loop
[Figure: undirected graph with vertices U, V, W, X, Y, Z and edges a through j]
Terminology (cont.)
Outgoing edges of a vertex
  h and b are the outgoing edges of X
Incoming edges of a vertex
  e, g, and i are the incoming edges of X
In-degree of a vertex
  X has in-degree 3
Out-degree of a vertex
  X has out-degree 2
[Figure: directed graph with vertices V, W, X, Y, Z and edges b, d, e, f, g, h, i, j]
Terminology (cont.)
Path
  sequence of alternating vertices and edges
  begins with a vertex
  ends with a vertex
  each edge is preceded and followed by its endpoints
Simple path
  path such that all its vertices and edges are distinct
Examples
  P1 = (V,b,X,h,Z) is a simple path
  P2 = (U,c,W,e,X,g,Y,f,W,d,V) is a path that is not simple
[Figure: graph with vertices U, V, W, X, Y, Z and edges a through h, with paths P1 and P2 highlighted]
Terminology (cont.)
Cycle
  circular sequence of alternating vertices and edges
  each edge is preceded and followed by its endpoints
Simple cycle
  cycle such that all its vertices and edges are distinct
Examples
  C1 = (V,b,X,g,Y,f,W,c,U,a,V) is a simple cycle
  C2 = (U,c,W,e,X,g,Y,f,W,d,V,a,U) is a cycle that is not simple
[Figure: graph with vertices U, V, W, X, Y, Z and edges a through h, with cycles C1 and C2 highlighted]
Exercise on Terminology
  Number of vertices?
  Number of edges?
  What type of graph is it?
  Show the end vertices of the edge with the largest weight
  Show the vertices of smallest degree and largest degree
  Show the edges incident to the vertices in the above question
  Identify the shortest simple path from HNL to PVD
  Identify the simple cycle with the most edges
[Figure: flight-route graph on the airports PVD, ORD, SFO, LGA, HNL, LAX, DFW, MIA, with each edge labeled by its mileage]
Exercise: Properties of Undirected Graphs
Notation
  n: number of vertices
  m: number of edges
  deg(v): degree of vertex v
Property 1 – Total degree
  $\sum_v \deg(v) = ?$
Property 2 – Total number of edges
  In an undirected graph with no self-loops and no multiple edges
  $m \le ?$ (upper bound)
Example: a graph with the given number of vertices (4) and the maximum number of edges
  n = ?
  m = ?
  deg(v) = ?
Exercise: Properties of Undirected Graphs
Notation
  n: number of vertices
  m: number of edges
  deg(v): degree of vertex v
Property 1 – Total degree
  $\sum_v \deg(v) = 2m$
  Proof: each edge is counted twice
Property 2 – Total number of edges
  In an undirected graph with no self-loops and no multiple edges
  $m \le n(n-1)/2$
  Proof: each vertex has degree at most (n - 1)
Example: a graph with the given number of vertices (4) and the maximum number of edges
  n = 4
  m = 6
  deg(v) = 3
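The two properties can be checked numerically on the 4-vertex example. This is only an illustrative sketch, not part of the original slides: the helper names `complete_graph_edges` and `degrees` and the vertex labels are assumptions made for the example.

```python
from itertools import combinations


def complete_graph_edges(vertices):
    """Every unordered pair of distinct vertices: no self-loops, no parallel edges."""
    return list(combinations(vertices, 2))


def degrees(vertices, edges):
    """Count, for each vertex, how many edge endpoints touch it."""
    deg = {v: 0 for v in vertices}
    for u, w in edges:
        deg[u] += 1
        deg[w] += 1
    return deg


# Hypothetical vertex labels for the n = 4 example.
vertices = ["a", "b", "c", "d"]
edges = complete_graph_edges(vertices)
deg = degrees(vertices, edges)
n, m = len(vertices), len(edges)

print(n, m, deg)                      # n = 4, m = 6, every degree is 3
print(sum(deg.values()) == 2 * m)     # Property 1: total degree is 2m
print(m == n * (n - 1) // 2)          # Property 2: the bound is met with equality
```

Because the graph is built from all unordered pairs, it reaches the upper bound of Property 2 exactly; any simple undirected graph on the same vertices has at most as many edges.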
Exercise: Properties of Directed Graphs
Notation
  n: number of vertices
  m: number of edges
  deg(v): degree of vertex v
Property 1 – Total in-degree and out-degree
  $\sum_v \text{in-deg}(v) = ?$
  $\sum_v \text{out-deg}(v) = ?$
Property 2 – Total number of edges
  In a directed graph with no self-loops and no multiple edges
  $m \le ?$ (upper bound)
Example: a graph with the given number of vertices (4) and the maximum number of edges
  n = ?
  m = ?
  deg(v) = ?
Exercise: Properties of Directed Graphs
Notation
  n: number of vertices
  m: number of edges
  deg(v): degree of vertex v
Property 1 – Total in-degree and out-degree
  $\sum_v \text{in-deg}(v) = m$
  $\sum_v \text{out-deg}(v) = m$
Property 2 – Total number of edges
  In a directed graph with no self-loops and no multiple edges
  $m \le n(n-1)$
Example: a graph with the given number of vertices (4) and the maximum number of edges
  n = 4
  m = 12
  deg(v) = 6
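The directed case can be checked the same way. The sketch below is illustrative only: the vertex labels and variable names are assumptions, and a directed edge is modeled as an (origin, destination) tuple.

```python
from itertools import permutations

# Hypothetical vertex labels for the n = 4 example.
vertices = ["a", "b", "c", "d"]

# Every ordered pair of distinct vertices: the maximum number of directed
# edges without self-loops or parallel edges.
edges = list(permutations(vertices, 2))

in_deg = {v: 0 for v in vertices}
out_deg = {v: 0 for v in vertices}
for origin, destination in edges:
    out_deg[origin] += 1
    in_deg[destination] += 1

n, m = len(vertices), len(edges)
print(n, m)                                   # n = 4, m = 12
print(sum(in_deg.values()), sum(out_deg.values()))   # both equal m
print({v: in_deg[v] + out_deg[v] for v in vertices}) # total degree 6 per vertex
```

Each directed edge contributes exactly one to some out-degree and one to some in-degree, which is why both sums equal m.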
Main Methods of the Graph ADT
Vertices and edges are positions that store elements
Accessor methods
  incidentEdges(v)
  adjacentVertices(v)
  degree(v)
  endVertices(e)
  opposite(v, e)
  areAdjacent(v, w)
  Specific to directed edges
    isDirected(e)
    origin(e)
    destination(e)
Update methods
  insertVertex(o)
  insertEdge(v, w, o)
  insertDirectedEdge(v, w, o)
  removeVertex(v)
  removeEdge(e)
Generic methods
  numVertices()
  numEdges()
  vertices()
  edges()
Exercise on ADT
  insertVertex(IAH)
  insertEdge(MIA, PVD, 1200)
  removeVertex(ORD)
  removeEdge((DFW,ORD))
  isDirected((DFW,LGA))
  origin((DFW,LGA))
  destination((DFW,LGA))
  incidentEdges(ORD)
  adjacentVertices(ORD)
  degree(ORD)
  endVertices((LGA,MIA))
  opposite(DFW, (DFW,LGA))
  areAdjacent(DFW, SFO)
[Figure: flight-route graph on the airports PVD, ORD, SFO, LGA, HNL, LAX, DFW, MIA, with each edge labeled by its mileage]
Edge List Structure
An edge list can be stored in a sequence, a vector, a list, or a dictionary such as a hash table
Edge list
  Edge         Weight
  (ORD, PVD)   849
  (ORD, DFW)   802
  (LGA, PVD)   142
  (LGA, MIA)   1099
  (DFW, LGA)   1387
  (DFW, MIA)   1120
Vertex sequence
  ORD, LGA, PVD, DFW, MIA
[Figure: flight graph on ORD, LGA, PVD, DFW, MIA with the edge weights listed above]
Exercise: Edge List Structure
Construct the edge list for the following graph
[Figure: graph with labels x, u, a, y, z, v]
Asymptotic Performance
Vertices and edges are positions that store elements
Accessor methods
  Accessing the vertex sequence
    degree(v)         O(1)
  Accessing the edge list
    endVertices(e)    O(1)
    opposite(v, e)    O(1)
    Specific to directed edges
      isDirected(e)   O(1)
      origin(e)       O(1)
      destination(e)  O(1)
Generic methods
  numVertices()       O(1)
  numEdges()          O(1)
  vertices()          O(n)
  edges()             O(m)
Edge list
  Edge         Weight   Directed
  (ORD, PVD)   849      False
  (ORD, DFW)   802      False
  (LGA, PVD)   142      False
  (LGA, MIA)   1099     False
  (DFW, LGA)   1387     False
  (DFW, MIA)   1120     False
Vertex sequence
  Vertex   Degree
  ORD      2
  LGA      3
  PVD      2
  DFW      3
  MIA      2
Asymptotic Performance of Edge List Structure
n vertices, m edges; no parallel edges; no self-loops; bounds are “big-Oh”
  Operation                               | Edge List
  Space                                   | n + m
  incidentEdges(v), adjacentVertices(v)   | m
  areAdjacent(v, w)                       |
  insertVertex(o)                         | 1
  insertEdge(v, w, o)                     |
  removeVertex(v)                         |
  removeEdge(e)                           |
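A minimal sketch of the edge list structure may help when reasoning about these bounds. It is an illustration under simplifying assumptions, not the slides' implementation: the class name `EdgeListGraph`, the snake_case method names, and the representation of an edge as a plain `(u, v, weight)` tuple are all choices made for this example. The data is the flight graph from the edge list slide.

```python
class EdgeListGraph:
    """Edge list structure: one sequence of vertices and one sequence of edges."""

    def __init__(self):
        self.vertex_seq = []   # vertex elements, e.g. airport codes
        self.edge_seq = []     # edges as (u, v, weight) tuples (an assumption)

    def insert_vertex(self, o):
        self.vertex_seq.append(o)

    def insert_edge(self, v, w, o):
        self.edge_seq.append((v, w, o))

    def incident_edges(self, v):
        # There is no per-vertex index, so in this sketch every edge
        # is inspected: time proportional to m.
        return [e for e in self.edge_seq if v in (e[0], e[1])]

    def are_adjacent(self, v, w):
        # Also a scan over the whole edge sequence in this sketch.
        return any({e[0], e[1]} == {v, w} for e in self.edge_seq)


g = EdgeListGraph()
for code in ["ORD", "LGA", "PVD", "DFW", "MIA"]:
    g.insert_vertex(code)
for u, v, miles in [("ORD", "PVD", 849), ("ORD", "DFW", 802), ("LGA", "PVD", 142),
                    ("LGA", "MIA", 1099), ("DFW", "LGA", 1387), ("DFW", "MIA", 1120)]:
    g.insert_edge(u, v, miles)

print(g.incident_edges("LGA"))
print(g.are_adjacent("ORD", "PVD"), g.are_adjacent("ORD", "MIA"))
```

The point of the sketch is that a query about one vertex has nothing to consult except the full edge sequence, which is where the m in the incidentEdges row comes from.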
Adjacency List Structure
Adjacency list
  ORD: (ORD, PVD), (ORD, DFW)
  LGA: (LGA, PVD), (LGA, MIA), (LGA, DFW)
  PVD: (PVD, ORD), (PVD, LGA)
  DFW: (DFW, ORD), (DFW, LGA), (DFW, MIA)
  MIA: (MIA, LGA), (MIA, DFW)
[Figure: flight graph on ORD, LGA, PVD, DFW, MIA with edge weights 849, 142, 802, 1387, 1099, 1120]
Exercise: Adjacency List Structure
Construct the adjacency list for the following graph
[Figure: graph with labels x, u, a, y, z, v]
Asymptotic Performance of Adjacency List Structure
n vertices, m edges; no parallel edges; no self-loops; bounds are “big-Oh”
  Operation                               | Adjacency List
  Space                                   | n + m
  incidentEdges(v), adjacentVertices(v)   | deg(v)
  areAdjacent(v, w)                       | min(deg(v), deg(w))
  insertVertex(o)                         | 1
  insertEdge(v, w, o)                     |
  removeVertex(v)                         | deg(v)*
  removeEdge(e)                           | 1*
Adjacency list
  ORD: (ORD, PVD), (ORD, DFW)
  LGA: (LGA, PVD), (LGA, MIA), (LGA, DFW)
  PVD: (PVD, ORD), (PVD, LGA)
  DFW: (DFW, ORD), (DFW, LGA), (DFW, MIA)
  MIA: (MIA, LGA), (MIA, DFW)
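The deg(v) and min(deg(v), deg(w)) bounds can be seen in a small sketch of the adjacency list structure. This is an illustration under assumptions, not the slides' implementation: the class name `AdjacencyListGraph`, the dictionary of Python lists, and the `(u, v, weight)` edge tuples are choices made for this example.

```python
class AdjacencyListGraph:
    """Adjacency list structure: each vertex keeps a sequence of its incident edges."""

    def __init__(self):
        self.incidence = {}    # vertex -> list of incident (u, v, weight) edges

    def insert_vertex(self, o):
        self.incidence.setdefault(o, [])

    def insert_edge(self, v, w, o):
        edge = (v, w, o)
        self.incidence[v].append(edge)   # the same edge is referenced
        self.incidence[w].append(edge)   # from both end vertices
        return edge

    def degree(self, v):
        return len(self.incidence[v])

    def incident_edges(self, v):
        return list(self.incidence[v])   # time proportional to deg(v)

    def opposite(self, v, e):
        return e[1] if e[0] == v else e[0]

    def are_adjacent(self, v, w):
        # Scan the shorter of the two incidence sequences:
        # time proportional to min(deg(v), deg(w)).
        if self.degree(v) > self.degree(w):
            v, w = w, v
        return any(self.opposite(v, e) == w for e in self.incidence[v])


g = AdjacencyListGraph()
for code in ["ORD", "LGA", "PVD", "DFW", "MIA"]:
    g.insert_vertex(code)
for u, v, miles in [("ORD", "PVD", 849), ("ORD", "DFW", 802), ("LGA", "PVD", 142),
                    ("LGA", "MIA", 1099), ("DFW", "LGA", 1387), ("DFW", "MIA", 1120)]:
    g.insert_edge(u, v, miles)

print(g.degree("LGA"), g.incident_edges("LGA"))
print(g.are_adjacent("DFW", "LGA"), g.are_adjacent("ORD", "LGA"))
```

Compared with the edge list, a query about one vertex now touches only that vertex's own incidence sequence rather than every edge in the graph.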
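Adjacency Matrix Structure
Vertex indices: 0:ORD, 1:LGA, 2:PVD, 3:DFW, 4:MIA
[Figure: adjacency matrix with rows and columns indexed 0 to 4, holding the edge weights 849, 802, 1387, 1099, 1120, 142]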
Exercise: Adjacency Matrix Structure
Construct the adjacency matrix for the following graph
[Figure: graph with labels x, u, a, y, z, v]
Asymptotic Performance of Adjacency Matrix Structure
n vertices, m edges; no parallel edges; no self-loops; bounds are “big-Oh”
  Operation                               | Adjacency Matrix
  Space                                   | n^2
  incidentEdges(v), adjacentVertices(v)   | n
  areAdjacent(v, w)                       | 1
  insertVertex(o)                         |
  insertEdge(v, w, o)                     |
  removeVertex(v)                         |
  removeEdge(e)                           |
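A short sketch of the adjacency matrix for the flight graph shows where the n^2, n, and 1 entries come from. It is illustrative only: the vertex indices follow the adjacency matrix slide (0:ORD, 1:LGA, 2:PVD, 3:DFW, 4:MIA), while the variable and function names, and the use of `None` for "no edge", are assumptions made for this example.

```python
index = {"ORD": 0, "LGA": 1, "PVD": 2, "DFW": 3, "MIA": 4}
routes = [("ORD", "PVD", 849), ("ORD", "DFW", 802), ("LGA", "PVD", 142),
          ("LGA", "MIA", 1099), ("DFW", "LGA", 1387), ("DFW", "MIA", 1120)]

n = len(index)
# n x n array: None for nonadjacent vertices, the mileage for adjacent ones.
matrix = [[None] * n for _ in range(n)]
for u, v, miles in routes:
    i, j = index[u], index[v]
    matrix[i][j] = miles
    matrix[j][i] = miles      # undirected edge: the matrix is symmetric


def are_adjacent(u, v):
    """A single array lookup, independent of n and m."""
    return matrix[index[u]][index[v]] is not None


def adjacent_vertices(u):
    """Scans one full row of the matrix: time proportional to n."""
    names = list(index)       # insertion order matches the indices 0..4
    row = matrix[index[u]]
    return [names[j] for j in range(n) if row[j] is not None]


print(are_adjacent("ORD", "PVD"), are_adjacent("ORD", "MIA"))
print(adjacent_vertices("LGA"))
```

The trade-off is visible in the allocation line: the array has n^2 cells no matter how few edges the graph has.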
Asymptotic Performance
n vertices, m edges; no parallel edges; no self-loops; bounds are “big-Oh”
  Operation                               | Edge List | Adjacency List        | Adjacency Matrix
  Space                                   | n + m     | n + m                 | n^2
  incidentEdges(v), adjacentVertices(v)   | m         | deg(v)                | n
  areAdjacent(v, w)                       |           | min(deg(v), deg(w))   | 1
  insertVertex(o)                         |           |                       |
  insertEdge(v, w, o)                     |           |                       |
  removeVertex(v)                         |           |                       |
  removeEdge(e)                           |           |                       |
Weight of an edge
Depth-First Search
[Figure: graph with vertices A, B, C, D, E]
Outline and Reading
Definitions (§13.1)
  Subgraph
  Connectivity
  Spanning trees and forests
Depth-first search (§13.3.1)
  Algorithm
  Example
  Properties
  Analysis
Applications of DFS
  Path finding
  Cycle finding
Subgraphs
A subgraph S of a graph G is a graph such that
  The vertices of S are a subset of the vertices of G
  The edges of S are a subset of the edges of G
A spanning subgraph of G is a subgraph that contains all the vertices of G
[Figures: a subgraph; a spanning subgraph]
Connectivity
A graph is connected if there is a path between every pair of vertices
A connected component of a graph G is a maximal connected subgraph of G
[Figures: a connected graph; a non-connected graph with two connected components]
Trees and Forests
A (free) tree is an undirected graph T such that
  T is connected
  T has no cycles
  This definition of tree is different from the one of a rooted tree
A forest is an undirected graph without cycles
The connected components of a forest are trees
[Figures: a tree; a forest]
Spanning Trees and Forests
A spanning tree of a connected graph is a spanning subgraph that is a tree
A spanning tree is not unique unless the graph is a tree
Spanning trees have applications to the design of communication networks
A spanning forest of a graph is a spanning subgraph that is a forest
[Figures: a graph; a spanning tree of the graph]
Depth-First Search
Depth-first search (DFS) is a general technique for traversing a graph
A DFS traversal of a graph G
  Visits all the vertices and edges of G
  Determines whether G is connected
  Computes the connected components of G
  Computes a spanning forest of G
DFS on a graph with n vertices and m edges takes O(n + m) time
DFS can be further extended to solve other graph problems
  Find and report a path between two given vertices
  Find a cycle in the graph
Depth-first search is to graphs what the Euler tour is to binary trees
Example
Legend: unexplored vertex, visited vertex, unexplored edge, discovery edge, back edge
Explore incident edges in order, and go along an unexplored edge
  1. Seeing a neighbor already explored -> check the next edge
  2. Reaching a dead end (no more edges to explore) -> go back to the parent
[Figure: first steps of the DFS traversal on the example graph, starting at A]
Example (cont.)
[Figure: further steps of the DFS traversal on the example graph]
Example (cont.)
A(G) = Φ
[Figure: final steps of the DFS traversal on the example graph]
DFS and Maze Traversal
The DFS algorithm is similar to a classic strategy for exploring a maze
  We mark each intersection, corner and dead end (vertex) visited
  We mark each corridor (edge) traversed
  We keep track of the path back to the entrance (start vertex) by means of a rope (recursion stack)
Use a rope to mark the visited route. Choose an unexplored way at each intersection
  1. Seeing the rope -> choose another way
  2. Reaching a dead end (no more ways to explore) -> go back to the last intersection
DFS Algorithm
The algorithm uses a mechanism for setting and getting “labels” of vertices and edges

Algorithm DFS(G)
  Input graph G
  Output labeling of the edges of G as discovery edges and back edges
  for all u ∈ G.vertices()
    setLabel(u, UNEXPLORED)
  for all e ∈ G.edges()
    setLabel(e, UNEXPLORED)
  for all v ∈ G.vertices()
    if getLabel(v) = UNEXPLORED
      DFS(G, v)

Algorithm DFS(G, v)
  Input graph G and a start vertex v of G
  Output labeling of the edges of G in the connected component of v as discovery edges and back edges
  setLabel(v, VISITED)
  for all e ∈ G.incidentEdges(v)
    if getLabel(e) = UNEXPLORED
      w ← opposite(v,e)
      if getLabel(w) = UNEXPLORED
        setLabel(e, DISCOVERY)
        DFS(G, w)
      else
        setLabel(e, BACK)
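The pseudocode can be turned into a short runnable sketch. The version below is an illustration under assumptions rather than the slides' implementation: the graph is a plain dictionary mapping each vertex to a list of neighbors, an undirected edge is identified by a `frozenset` of its two endpoints, labels live in dictionaries instead of being set through `setLabel`, and the example graph is hypothetical.

```python
UNEXPLORED, VISITED = "UNEXPLORED", "VISITED"
DISCOVERY, BACK = "DISCOVERY", "BACK"


def dfs(adj, v, vertex_label, edge_label):
    """DFS(G, v): label the edges in the connected component of v."""
    vertex_label[v] = VISITED
    for w in adj[v]:                       # neighbors stand in for incidentEdges(v)
        e = frozenset((v, w))
        if edge_label[e] == UNEXPLORED:
            if vertex_label[w] == UNEXPLORED:
                edge_label[e] = DISCOVERY
                dfs(adj, w, vertex_label, edge_label)
            else:
                edge_label[e] = BACK


def dfs_all(adj):
    """DFS(G): label every vertex and edge, and count connected components."""
    vertex_label = {u: UNEXPLORED for u in adj}
    edge_label = {frozenset((u, w)): UNEXPLORED for u in adj for w in adj[u]}
    components = 0
    for v in adj:
        if vertex_label[v] == UNEXPLORED:
            components += 1
            dfs(adj, v, vertex_label, edge_label)
    return vertex_label, edge_label, components


# Hypothetical example graph with edges A-B, A-C, A-D, B-C, C-D, C-E.
adj = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B", "D", "E"],
    "D": ["A", "C"],
    "E": ["C"],
}
vertex_label, edge_label, components = dfs_all(adj)
discovery = sorted(sorted(e) for e, lab in edge_label.items() if lab == DISCOVERY)
back = sorted(sorted(e) for e, lab in edge_label.items() if lab == BACK)
print(components, discovery, back)
```

Counting the top-level calls is one simple way to obtain the number of connected components; the recursion depth can reach n, so very large graphs would need an explicit stack instead.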
Exercise: DFS Algorithm
Perform DFS of the following graph, starting from vertex A
  Assume adjacent edges are processed in alphabetical order
  Number vertices in the order they are visited
  Label edges as discovery or back edges
[Figure: graph with vertices A, B, C, D, E, F]
Properties of DFS
Property 1
  DFS(G, v) visits all the vertices and edges in the connected component of v
Property 2
  The discovery edges labeled by DFS(G, v) form a spanning tree of the connected component of v
[Figure: graph with vertices A through G, with start vertices v1 and v2 marked]
Analysis of DFS
Setting/getting a vertex/edge label takes O(1) time
Each vertex is labeled twice
  once as UNEXPLORED
  once as VISITED
Each edge is labeled twice
  once as UNEXPLORED
  once as DISCOVERY or BACK
Function DFS(G, v) and the method incidentEdges are called once for each vertex
[Figure: graph with vertices A through G]
Analysis of DFS
DFS runs in O(n + m) time provided the graph is represented by the adjacency list structure
  Recall that $\sum_v \deg(v) = 2m$
Cost of each part of the algorithm (pseudocode as on the DFS Algorithm slide)
  Labeling all vertices UNEXPLORED: O(n)
  Labeling all edges UNEXPLORED: O(m)
  All calls DFS(G, v), including their incidentEdges calls: O(n + m)
Which data structure is the best for DFS?
What is the complexity of this algorithm if we use the edge list or the adjacency matrix?
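One way to work through the question above is to add up the cost of the incidentEdges calls, one per vertex. The derivation below is a sketch that assumes O(1) label operations and the incidentEdges costs given in the comparison table (deg(v) for the adjacency list, m for the edge list, n for the adjacency matrix); the symbol $T$ is introduced here only for illustration.

```latex
\begin{align*}
T_{\text{adjacency list}}
  &= O(n + m) + \sum_{v} O\bigl(1 + \deg(v)\bigr)
   = O(n + 2m) = O(n + m) \\
T_{\text{edge list}}
  &= O(n + m) + \sum_{v} O(1 + m)
   = O(n + nm), \text{ i.e. } O(nm) \text{ when } m \ge 1 \\
T_{\text{adjacency matrix}}
  &= O(n + m) + \sum_{v} O(n)
   = O(n^2), \text{ since } m \le n(n-1)/2
\end{align*}
```

The first term in each line is the cost of the two labeling loops; only the second term depends on the data structure.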
Path Finding
We can specialize the DFS algorithm to find a path between two given vertices v and z using the template method pattern
  We call DFS(G, v) with v as the start vertex
  We use a stack S to keep track of the path between the start vertex and the current vertex
  As soon as destination vertex z is encountered, we return the path as the contents of the stack

Algorithm pathDFS(G, v, z)
  setLabel(v, VISITED)
  S.push(v)
  if v = z
    return S.elements()
  for all e ∈ G.incidentEdges(v)
    if getLabel(e) = UNEXPLORED
      w ← opposite(v,e)
      if getLabel(w) = UNEXPLORED
        setLabel(e, DISCOVERY)
        S.push(e)
        pathDFS(G, w, z)
        S.pop()    {remove e}
      else
        setLabel(e, BACK)
  S.pop()    {remove v when backtracking}
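A runnable sketch of the same idea follows. It simplifies the pseudocode in ways that are assumptions of this example: the graph is a dictionary of neighbor lists, the stack holds vertices only (not edges), a `visited` set replaces the labels, and the function name `path_dfs` and the example graph (including the isolated vertex F) are hypothetical.

```python
def path_dfs(adj, v, z, visited=None, path=None):
    """Return the list of vertices on a DFS path from v to z, or None if z is unreachable."""
    if visited is None:
        visited, path = set(), []
    visited.add(v)
    path.append(v)                 # S.push(v)
    if v == z:
        return list(path)          # the contents of the stack
    for w in adj[v]:
        if w not in visited:
            found = path_dfs(adj, w, z, visited, path)
            if found is not None:
                return found       # propagate the result up the recursion
    path.pop()                     # backtrack: v is not on a path to z
    return None


# Hypothetical example graph; F is isolated, so it cannot be reached from A.
adj = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B", "D", "E"],
    "D": ["A", "C"],
    "E": ["C"],
    "F": [],
}
print(path_dfs(adj, "A", "E"))
print(path_dfs(adj, "A", "F"))
```

The path found this way is a simple path but not necessarily one with the fewest edges; here DFS reaches E through B and C even though A and C are directly connected.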
Breadth-First Search
[Figure: graph with vertices A through F arranged in levels L0, L1, L2]
Outline and Reading
Breadth-first search (§13.3.5)
  Algorithm
  Example
  Properties
  Analysis
  Applications
DFS vs. BFS
  Comparison of applications
  Comparison of edge labels
Breadth-First Search
Breadth-first search (BFS) is a general technique for traversing a graph
A BFS traversal of a graph G
  Visits all the vertices and edges of G
  Determines whether G is connected
  Computes the connected components of G
  Computes a spanning forest of G
BFS on a graph with n vertices and m edges takes O(n + m) time
BFS can be further extended to solve other graph problems
  Find and report a path with the minimum number of edges between two given vertices
  Find a simple cycle, if there is one
Example
Legend: unexplored vertex, visited vertex, unexplored edge, discovery edge, cross edge
[Figure: first steps of a BFS traversal from A on vertices A through F, showing levels L0 and L1]
Example (cont.)
Legend: unexplored vertex, visited vertex, unexplored edge, discovery edge, cross edge
[Figure: further steps of the BFS traversal, showing levels L0, L1, and L2]
Example (cont.)
Legend: unexplored vertex, visited vertex, unexplored edge, discovery edge, cross edge
[Figure: final steps of the BFS traversal, with levels L0 = {A}, L1 = {B, C, D}, L2 = {E, F}]
BFS Algorithm
The algorithm uses a mechanism for setting and getting “labels” of vertices and edges

Algorithm BFS(G)
  Input graph G
  Output labeling of the edges and partition of the vertices of G
  for all u ∈ G.vertices()
    setLabel(u, UNEXPLORED)
  for all e ∈ G.edges()
    setLabel(e, UNEXPLORED)
  for all v ∈ G.vertices()
    if getLabel(v) = UNEXPLORED
      BFS(G, v)

Algorithm BFS(G, s)
  L0 ← new empty sequence
  L0.insertLast(s)
  setLabel(s, VISITED)
  i ← 0
  while ¬Li.isEmpty()
    Li+1 ← new empty sequence
    for all v ∈ Li.elements()
      for all e ∈ G.incidentEdges(v)
        if getLabel(e) = UNEXPLORED
          w ← opposite(v,e)
          if getLabel(w) = UNEXPLORED
            setLabel(e, DISCOVERY)
            setLabel(w, VISITED)
            Li+1.insertLast(w)
          else
            setLabel(e, CROSS)
    i ← i + 1
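The level-by-level structure of BFS(G, s) can be sketched as follows. This is an illustration under assumptions, not the slides' implementation: the graph is a dictionary of neighbor lists, edges are identified by a `frozenset` of their endpoints, a set of already-seen edges replaces the UNEXPLORED edge label, and the function name `bfs_levels` and the example graph are hypothetical (the graph was chosen so that its levels match L0 = {A}, L1 = {B, C, D}, L2 = {E, F}).

```python
def bfs_levels(adj, s):
    """Return (levels, discovery_edges, cross_edges) for the component of s."""
    visited = {s}
    levels = [[s]]                 # levels[i] plays the role of the sequence Li
    seen_edges = set()             # edges that are no longer UNEXPLORED
    discovery, cross = [], []
    while levels[-1]:              # while the current level is not empty
        next_level = []
        for v in levels[-1]:
            for w in adj[v]:
                e = frozenset((v, w))
                if e in seen_edges:
                    continue
                seen_edges.add(e)
                if w not in visited:
                    visited.add(w)
                    discovery.append((v, w))
                    next_level.append(w)
                else:
                    cross.append((v, w))
        levels.append(next_level)
    levels.pop()                   # drop the trailing empty level
    return levels, discovery, cross


# Hypothetical example graph on vertices A through F.
adj = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "E"],
    "C": ["A", "B", "D", "E", "F"],
    "D": ["A", "C", "F"],
    "E": ["B", "C"],
    "F": ["C", "D"],
}
levels, discovery, cross = bfs_levels(adj, "A")
print(levels)
print(discovery)
print(cross)
```

Every cross edge in the output joins two vertices in the same level or in consecutive levels, and the discovery edges form a spanning tree of the component with n - 1 = 5 edges.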
Exercise: BFS Algorithm
Perform BFS of the following graph, starting from vertex A
  Assume adjacent edges are processed in alphabetical order
  Number vertices in the order they are visited and note the level they are in
  Label edges as discovery or cross edges
[Figure: graph with vertices A, B, C, D, E, F]
Properties
Notation
  Gs: connected component of s
Property 1
  BFS(G, s) visits all the vertices and edges of Gs
Property 2
  The discovery edges labeled by BFS(G, s) form a spanning tree Ts of Gs
Property 3
  For each vertex v in Li
    The path of Ts from s to v has i edges
    Every path from s to v in Gs has at least i edges
[Figure: graph on vertices A through F and its BFS tree with levels L0 = {A}, L1 = {B, C, D}, L2 = {E, F}]
Analysis
Setting/getting a vertex/edge label takes O(1) time
Each vertex is labeled twice
  once as UNEXPLORED
  once as VISITED
Each edge is labeled twice
  once as UNEXPLORED
  once as DISCOVERY or CROSS
Each vertex is inserted once into a sequence Li
Method incidentEdges() is called once for each vertex
BFS runs in O(n + m) time provided the graph is represented by the adjacency list structure
  Recall that $\sum_v \deg(v) = 2m$
Applications
Using the template method pattern, we can specialize the BFS traversal of a graph G to solve the following problems in O(n + m) time
  Compute the connected components of G
  Compute a spanning forest of G
  Find a simple cycle in G, or report that G is a forest
  Given two vertices of G, find a path in G between them with the minimum number of edges, or report that no such path exists
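The last application, a path with the minimum number of edges, can be sketched as below. Note the swapped-in technique: instead of the level sequences Li used in the BFS pseudocode, this version uses a single FIFO queue (`collections.deque`) and parent pointers, which visits vertices in the same level order. The function name `shortest_path`, the dictionary-of-neighbor-lists representation, and the example graph (including the isolated vertex G) are assumptions made for the example.

```python
from collections import deque


def shortest_path(adj, s, t):
    """Return a path from s to t with the fewest edges, or None if no path exists."""
    parent = {s: None}             # also serves as the set of visited vertices
    queue = deque([s])
    while queue:
        v = queue.popleft()
        if v == t:
            path = []
            while v is not None:   # follow parent pointers back to s
                path.append(v)
                v = parent[v]
            return path[::-1]
        for w in adj[v]:
            if w not in parent:
                parent[w] = v      # (v, w) is a discovery edge
                queue.append(w)
    return None


# Hypothetical example graph; G is isolated, so it cannot be reached from A.
adj = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "E"],
    "C": ["A", "B", "D", "E", "F"],
    "D": ["A", "C", "F"],
    "E": ["B", "C"],
    "F": ["C", "D"],
    "G": [],
}
print(shortest_path(adj, "A", "F"))
print(shortest_path(adj, "A", "G"))
```

The returned path follows discovery edges only, so by Property 3 its length equals the level of the target vertex.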
DFS vs. BFS
  Applications                                           | DFS | BFS
  Spanning forest, connected components, paths, cycles   | yes | yes
  Shortest paths                                         |     | yes
[Figures: the DFS tree and the BFS tree (levels L0, L1, L2) of the same graph on vertices A through F]
Cycle Finding
We can specialize the DFS algorithm to find a simple cycle using the template method pattern
  We use a stack S to keep track of the path between the start vertex and the current vertex
  As soon as a back edge (v, w) is encountered, we return the cycle as the portion of the stack from the top to vertex w

Algorithm cycleDFS(G, v, z)
  setLabel(v, VISITED)
  S.push(v)
  for all e ∈ G.incidentEdges(v)
    if getLabel(e) = UNEXPLORED
      w ← opposite(v,e)
      S.push(e)
      if getLabel(w) = UNEXPLORED
        setLabel(e, DISCOVERY)
        cycleDFS(G, w, z)
        S.pop()    {remove e}
      else
        T ← new empty stack
        repeat
          o ← S.pop()
          T.push(o)
        until o = w
        return T.elements()
  S.pop()    {remove v when backtracking}
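The cycle-finding idea can be sketched in runnable form as follows. The simplifications are assumptions of this example: the graph is a dictionary of neighbor lists, the stack holds vertices only, a set of used edges replaces the edge labels, and the function name `find_cycle` and both example graphs are hypothetical.

```python
def find_cycle(adj, start):
    """Return a simple cycle reachable from start as a vertex list, or None if there is none."""
    visited = set()
    path = []          # vertices on the current DFS path (the stack S)
    used = set()       # edges that are no longer UNEXPLORED

    def visit(v):
        visited.add(v)
        path.append(v)
        for w in adj[v]:
            e = frozenset((v, w))
            if e in used:
                continue
            used.add(e)
            if w not in visited:
                cycle = visit(w)           # (v, w) is a discovery edge
                if cycle is not None:
                    return cycle
            else:
                # Back edge (v, w): w is an ancestor of v, so it is on the path.
                return path[path.index(w):] + [w]
        path.pop()                         # backtrack from v
        return None

    return visit(start)


# Hypothetical graph containing cycles.
cyclic = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B", "D", "E"],
    "D": ["A", "C"],
    "E": ["C"],
}
# Hypothetical tree: no cycle exists.
tree = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

print(find_cycle(cyclic, "A"))
print(find_cycle(tree, "A"))
```

The lookup `path.index(w)` relies on a property of DFS in undirected graphs: an already visited vertex reached through an unexplored edge is always an ancestor on the current path.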
DFS vs. BFS (cont.)
Back edge (v,w) in DFS
  w is an ancestor of v in the tree of discovery edges
Cross edge (v,w) in BFS
  w is in the same level as v or in the next level in the tree of discovery edges
[Figures: the DFS tree and the BFS tree (levels L0, L1, L2) of the same graph on vertices A through F]
Adjacency Matrix Structure
Edge list structure
Augmented vertex objects
  Integer key (index) associated with vertex
2D-array adjacency array
  Reference to edge object for adjacent vertices
  Null for nonadjacent vertices
The “old fashioned” version just has 0 for no edge and 1 for edge
[Figure: graph with vertices u, v, w and edges a, b, with its indexed vertex objects and adjacency array]
Edge List Structure
Vertex object
  element
  reference to position in vertex sequence
Edge object
  origin vertex object
  destination vertex object
  reference to position in edge sequence
Vertex sequence
  sequence of vertex objects
Edge sequence
  sequence of edge objects
[Figure: graph with vertices u, v, w, z and edges a, b, c, d, with its vertex sequence and edge sequence]
Adjacency List Structure
Edge list structure
Incidence sequence for each vertex
  sequence of references to edge objects of incident edges
Augmented edge objects
  references to associated positions in incidence sequences of end vertices
[Figure: graph with vertices u, v, w and edges a, b, with the incidence sequence of each vertex]