CS 261 – Data Structures Graphs
Used in a variety of applications and algorithms
Graphs represent relationships or connections
Superset of trees (i.e., a tree is a restricted form of a graph):
  – A graph represents general relationships:
      Each node may have many predecessors
      There may be multiple paths (or no path) from one node to another
      Can have cycles or loops
      Examples: airline flight connections, friends, algorithmic flow, etc.
  – A tree has more restrictive relationships and topology:
      Each node other than the root has a single predecessor, its parent
      There is a single, unique path from the root to any node
      No cycles
      Example: less than or greater than in a binary search tree

Graphs: Vertices and Edges
A graph is composed of vertices and edges
Vertices (also called nodes):
  – Represent objects, states (i.e., conditions or configurations), positions, or simply placeholders
  – Set {v_1, v_2, …, v_n}: each vertex is unique, so no two vertices represent the same object/state
Edges (also called arcs):
  – Can be either directed or undirected
  – Can be either weighted (or labeled) or unweighted
  – An edge (v_i, v_j) between two vertices indicates that they are directly related, connected, etc.
  – If there is an edge from v_i to v_j, then v_j is a neighbor of v_i (if the edge is undirected, then v_i and v_j are neighbors of each other)

Graphs: Types of Edges
[Figure: four example graphs on vertices v_1 to v_4, arranged by undirected vs. directed and unweighted vs. weighted; the weighted graphs label their edges w_1,2, w_2,3, w_1,3, w_3,4, w_4,1 and w_1-2, w_2-3, w_1-3, w_3-4, w_1-4]

Graphs: Directed and Undirected
An undirected edge e = (v_i, v_j) indicates that the relationship, connection, etc. is bidirectional:
  – Can go from v_i to v_j (i.e., v_i is related to v_j) and vice versa
  – Example: friends – Steve and Alicia are friends
A directed edge (v_i, v_j) specifies a one-directional relationship or connection:
  – Can only go from v_i to v_j
  – Example: likes – George likes Mary
[Figure: an undirected edge between Steve and Alicia; a directed edge from George to Mary]

Graphs: Directed and Undirected (cont.)
A graph will have either directed or undirected edges, but typically not both.
Two directed edges can replace any undirected edge:
  – If there is an undirected edge e = (v_i, v_j), it is replaced with two directed edges (v_i, v_j) and (v_j, v_i)
  – If the graph is unweighted (or the weights for both directed edges are the same), then a shorthand (graphical) notation can be used
  – We will discuss only directed graphs (and write a directed edge simply as e)
[Figure: edges between v_i and v_j illustrating the replacement and the shorthand notation]

Graphs: Example
[Figure: directed graph over eight cities: Pendleton, Pierre, Pensacola, Princeton, Pittsburgh, Peoria, Pueblo, Phoenix]

Graphs: Representations
Two most common representations:
1. Adjacency matrix:
   Represents the graph as a 2-D matrix
   Vertices are used as indices for both the rows and columns of the matrix
     – Vertices must be numbered, or an integer value must be associated with each vertex
   A matrix entry in row i and column j of:
     1: indicates an edge from v_i to v_j
     0: no edge from v_i to v_j
   Weighted representation has weight w_i,j instead of 1 and ∞ instead of 0
   Requires O(V^2) space for V vertices
2. Edge list:
   Each vertex v_i lists the set of directly connected neighbors
   Weighted representation replaces the neighbor set with a Map:
     – key: vertex v_j
     – value: weight w_i,j
   Stores only the edges, so it is more space efficient for sparse graphs: O(V + E)

Graphs Representation: Adjacency Matrix
City            0  1  2  3  4  5  6  7
0: Pendleton    ?  0  0  1  0  0  0  1
1: Pensacola    0  ?  0  1  0  0  0  0
2: Peoria       0  0  ?  0  0  1  0  1
3: Phoenix      0  0  1  ?  0  1  0  1
4: Pierre       1  0  0  0  ?  0  0  0
5: Pittsburgh   0  1  0  0  0  ?  0  0
6: Princeton    0  0  0  0  0  1  ?  0
7: Pueblo       0  0  0  0  1  0  0  ?
[Figure: the city graph]
What about the diagonal matrix entries? Is a vertex connected to itself?

Graphs Representation: Adjacency Matrix
City            0  1  2  3  4  5  6  7
0: Pendleton    1  0  0  1  0  0  0  1
1: Pensacola    0  1  0  1  0  0  0  0
2: Peoria       0  0  1  0  0  1  0  1
3: Phoenix      0  0  1  1  0  1  0  1
4: Pierre       1  0  0  0  1  0  0  0
5: Pittsburgh   0  1  0  0  0  1  0  0
6: Princeton    0  0  0  0  0  1  1  0
7: Pueblo       0  0  0  0  1  0  0  1
[Figure: the city graph]
By convention, a vertex is usually connected to itself (though this is not always the case)

Graphs Representation: Edge List
[Figure: the city graph]
Pendleton:  {Pueblo, Phoenix}
Pensacola:  {Phoenix}
Peoria:     {Pueblo, Pittsburgh}
Phoenix:    {Pueblo, Peoria, Pittsburgh}
Pierre:     {Pendleton}
Pittsburgh: {Pensacola}
Princeton:  {Pittsburgh}
Pueblo:     {Pierre}

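Both representations can be sketched in Java as follows, using the edges listed on the edge-list slide. The class name GraphRepresentations and the helpers edgeList and toMatrix are illustrative assumptions, not part of the course code. The diagonal is set to 1, following the convention that a vertex is connected to itself.

```java
import java.util.*;

class GraphRepresentations {
    // City order follows the slides' numbering (0: Pendleton ... 7: Pueblo).
    static final String[] CITIES = {"Pendleton", "Pensacola", "Peoria", "Phoenix",
                                    "Pierre", "Pittsburgh", "Princeton", "Pueblo"};

    // Edge list: each city maps to the set of its direct neighbors.
    static Map<String, Set<String>> edgeList() {
        Map<String, Set<String>> g = new HashMap<>();
        g.put("Pendleton", Set.of("Pueblo", "Phoenix"));
        g.put("Pensacola", Set.of("Phoenix"));
        g.put("Peoria", Set.of("Pueblo", "Pittsburgh"));
        g.put("Phoenix", Set.of("Pueblo", "Peoria", "Pittsburgh"));
        g.put("Pierre", Set.of("Pendleton"));
        g.put("Pittsburgh", Set.of("Pensacola"));
        g.put("Princeton", Set.of("Pittsburgh"));
        g.put("Pueblo", Set.of("Pierre"));
        return g;
    }

    // Adjacency matrix built from the edge list: a[i][j] == 1 means an edge i -> j.
    static int[][] toMatrix(Map<String, Set<String>> g) {
        List<String> index = Arrays.asList(CITIES);
        int n = CITIES.length;
        int[][] a = new int[n][n];
        for (int i = 0; i < n; i++) {
            a[i][i] = 1;                                   // vertex connected to itself
            for (String nb : g.getOrDefault(CITIES[i], Set.of()))
                a[i][index.indexOf(nb)] = 1;
        }
        return a;
    }
}
```

The matrix always uses V × V entries, while the map stores one entry per edge, which is the space trade-off noted for sparse graphs.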
Reachability
A common question to ask about a graph is reachability:
Single-source:
  – Which vertices are “reachable” from a given vertex v_i?
  – Basic algorithm:
      Initialize set of reachable vertices with v_i and add v_i to a stack
      While stack is not empty
        Get and remove (pop) last vertex v from stack
        For all neighbors, v_j, of v
          If v_j is not in the set of reachable vertices, add it to the stack and to the reachable set
All-pairs:
  – For all pairs of vertices v_i and v_j, is v_j “reachable” from v_i?
  – Solves the single-source question for all vertices

All-Pairs Reachability: Adjacency Matrix
How do we compute all-pairs reachability?
Convert the adjacency matrix into a “reachability” matrix
  – A matrix entry in row i and column j of:
      1: indicates that there is a path (of zero or more edges) from v_i to v_j
      0: no path from v_i to v_j
Warshall’s algorithm:
  – Named after the computer scientist who discovered it
  – Three nested loops of order V, so O(V^3)
  – Key idea: each iteration of the outer loop (index k) adds any path of length 2 that has vertex v_k as its center
  – Since the adjacency and the reachability matrices are binary (0 or 1 entries), use bitwise operations

Warshall’s Algorithm
class Warshall {
    static void warshall(int[][] a) {              // Input: initial adjacency matrix.
        int n = a.length;                          // Number of vertices.
        for (int k = 0; k < n; k++)                // Add paths of length 2 through v_k.
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)        // Add path from v_i to v_j
                    a[i][j] |= a[i][k] & a[k][j];  // going through v_k.
    }

    static void matrixOutput(int[][] a) {          // Print matrix.
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a[i].length; j++)
                System.out.print(" " + a[i][j]);
            System.out.println(" ");
        }
    }
    ...
}

Warshall’s Algorithm: Initialization
class Warshall {
    ...
    public static void main(String[] args) {
        int[][] adjacency = {{1, 0, 0, 1, 0, 0, 0, 1},   // Initialize
                             {0, 1, 0, 1, 0, 0, 0, 0},   // adjacency
                             {0, 0, 1, 0, 0, 1, 0, 1},   // matrix.
                             {0, 0, 1, 1, 0, 1, 0, 1},
                             {1, 0, 0, 0, 1, 0, 0, 0},
                             {0, 1, 0, 0, 0, 1, 0, 0},
                             {0, 0, 0, 0, 0, 1, 1, 0},
                             {0, 0, 0, 0, 1, 0, 0, 1}};
        warshall(adjacency);       // Compute all-pairs reachability matrix.
        matrixOutput(adjacency);   // Print resulting reachability matrix.
    }
}

Warshall’s Algorithm: Initialization
[Table: matrix over cities 0: Pendleton, 1: Pensacola, 2: Peoria, 3: Phoenix, 4: Pierre, 5: Pittsburgh, 6: Princeton, 7: Pueblo] [Figure: the city graph]
Initial adjacency matrix

Warshall’s Algorithm: After Iteration 0
[Table: matrix after iteration 0] [Figure: the city graph]
Add all paths of length 2 that go through vertex 0 (Pendleton)
Define some paths of length 2 in this (the first) iteration

Warshall’s Algorithm: After Iteration 1
[Table: matrix after iteration 1] [Figure: the city graph]
Add all paths of length 2 that go through vertex 1 (Pensacola)
Could potentially define paths of length 3 in this (the second) iteration

Warshall’s Algorithm: After Iteration 2
[Table: matrix after iteration 2] [Figure: the city graph]
Add all paths of length 2 that go through vertex 2 (Peoria)
Edges already exist (doesn’t add anything new)
Could potentially define paths of length 4 in this (the third) iteration

Warshall’s Algorithm: After Iteration 3
[Table: matrix after iteration 3] [Figure: the city graph]
Add all paths of length 2 that go through vertex 3 (Phoenix)
Notice that it also includes the paths formed in iterations 0 and 1 (resulting, in this case, in new reachability paths of length 3)
Could potentially define paths of length 5 in this (the fourth) iteration

Warshall’s Algorithm: After Iteration 7
[Table: final reachability matrix] [Figure: the city graph]
Final result: every city can reach every other city except Princeton, which has no incoming edges (Princeton itself can reach all the others)
The final matrix has a 1 in row i and column j if vertex v_j is reachable from vertex v_i via some path

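The final result can be checked with a short sketch that runs the triple loop on the adjacency matrix from the initialization slide and lists every city that some other city cannot reach. The class WarshallCheck and the helper notReachableFromEverywhere are hypothetical names introduced here for illustration.

```java
import java.util.*;

class WarshallCheck {
    static final String[] CITIES = {"Pendleton", "Pensacola", "Peoria", "Phoenix",
                                    "Pierre", "Pittsburgh", "Princeton", "Pueblo"};

    // Adjacency matrix copied from the initialization slide.
    static int[][] cityMatrix() {
        return new int[][] {{1, 0, 0, 1, 0, 0, 0, 1},
                            {0, 1, 0, 1, 0, 0, 0, 0},
                            {0, 0, 1, 0, 0, 1, 0, 1},
                            {0, 0, 1, 1, 0, 1, 0, 1},
                            {1, 0, 0, 0, 1, 0, 0, 0},
                            {0, 1, 0, 0, 0, 1, 0, 0},
                            {0, 0, 0, 0, 0, 1, 1, 0},
                            {0, 0, 0, 0, 1, 0, 0, 1}};
    }

    // Same triple loop as on the slides: turns a into the reachability matrix.
    static void warshall(int[][] a) {
        int n = a.length;
        for (int k = 0; k < n; k++)
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    a[i][j] |= a[i][k] & a[k][j];
    }

    // Cities j whose column still contains a 0, i.e. some city i cannot reach j.
    static List<String> notReachableFromEverywhere(int[][] r) {
        List<String> result = new ArrayList<>();
        for (int j = 0; j < r.length; j++)
            for (int i = 0; i < r.length; i++)
                if (r[i][j] == 0) { result.add(CITIES[j]); break; }
        return result;
    }

    public static void main(String[] args) {
        int[][] r = cityMatrix();
        warshall(r);
        System.out.println(notReachableFromEverywhere(r));   // prints [Princeton]
    }
}
```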
Warshall’s Algorithm: Another Look
static void warshall(int[][] a) {              // Input: initial adjacency matrix.
    int n = a.length;                          // Number of vertices.
    for (int k = 0; k < n; k++)                // Add paths of length 2 through v_k.
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)        // Add path from v_i to v_j
                a[i][j] |= a[i][k] & a[k][j];  // going through v_k.
}
Notice that a[i][k] doesn’t change inside the inner loop:
  – If zero, then the inner loop does nothing
  – If one, don’t need to perform the bitwise AND (&)
  – If the graph is sparse, the inner loop is not needed very often (at least initially)
  – May get a performance improvement by checking a[i][k] prior to the third loop

Warshall’s Algorithm: Another Look (cont.)
Improved code:
static void warshall(int[][] a) {              // Input: initial adjacency matrix.
    int n = a.length;                          // Number of vertices.
    for (int k = 0; k < n; k++)                // Add paths of length 2 through v_k.
        for (int i = 0; i < n; i++)
            if (a[i][k] == 1)                  // If there is a path from v_i to v_k:
                for (int j = 0; j < n; j++)    // Add path from v_i to v_j
                    a[i][j] |= a[k][j];        // going through v_k.
}

Weighted Graph: All-Pairs Reachability
What if we have a weighted graph?
  – Example: the distance between cities
The all-pairs reachability question then becomes:
  – Is there a path between any two vertices v_i and v_k and, if so, what is the weight of the minimum-weight path?
Floyd’s algorithm:
  – Named after the computer scientist who discovered it
  – Very much the same as Warshall’s algorithm
  – Key difference: the adjacency matrix now has weights instead of binary values
      In place of bitwise AND, use addition
      In place of bitwise OR, use a minimum-value calculation

Floyd’s Algorithm: Initialization
City            0   1   2   3   4   5   6   7
0: Pendleton    0   ∞   ∞   4   ∞   ∞   ∞   8
1: Pensacola    ∞   0   ∞   5   ∞   ∞   ∞   ∞
2: Peoria       ∞   ∞   0   ∞   ∞   5   ∞   3
3: Phoenix      ∞   ∞   4   0   ∞  10   ∞   3
4: Pierre       2   ∞   ∞   ∞   0   ∞   ∞   ∞
5: Pittsburgh   ∞   4   ∞   ∞   ∞   0   ∞   ∞
6: Princeton    ∞   ∞   ∞   ∞   ∞   1   0   ∞
7: Pueblo       ∞   ∞   ∞   ∞   3   ∞   ∞   0
(∞: no edge)
[Figure: the city graph]
Initial adjacency matrix

Floyd’s Algorithm
class Floyd {
    static void floyd(double[][] a) {              // Input: initial adjacency matrix.
        int n = a.length;                          // Number of vertices.
        for (int k = 0; k < n; k++)                // Add paths of length 2 through v_k.
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++) {      // Check path from v_i to v_j
                    double m = a[i][k] + a[k][j];  // going through v_k.
                    if (m < a[i][j]) a[i][j] = m;  // Update if less.
                }
    }
    ...
}

Floyd’s Algorithm: Improved
Of course, we can apply an improvement similar to the one we made to Warshall’s algorithm
class Floyd {
    static void floyd(double[][] a) {                  // Input: initial adjacency matrix.
        int n = a.length;                              // Number of vertices.
        for (int k = 0; k < n; k++)                    // Add paths of length 2 through v_k.
            for (int i = 0; i < n; i++) {
                double ik = a[i][k];                   // If there is a path from v_i to v_k:
                if (ik < Double.POSITIVE_INFINITY)
                    for (int j = 0; j < n; j++) {      // Check path from v_i to v_j
                        double m = ik + a[k][j];       // going through v_k.
                        if (m < a[i][j]) a[i][j] = m;  // Update if less.
                    }
            }
    }
    ...
}

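As a usage sketch, the improved loop can be run on the weighted city matrix from the initialization slide, with Double.POSITIVE_INFINITY standing for "no edge". The class FloydDemo and the helper cityWeights are illustrative names assumed here; the weight of the Princeton to Pittsburgh edge is taken from the matrix slide.

```java
class FloydDemo {
    static final double INF = Double.POSITIVE_INFINITY;   // marks "no edge"

    // Floyd's algorithm with the a[i][k] check hoisted out of the inner loop.
    static void floyd(double[][] a) {
        int n = a.length;
        for (int k = 0; k < n; k++)
            for (int i = 0; i < n; i++) {
                double ik = a[i][k];
                if (ik == INF) continue;               // no path from i to k: nothing to add
                for (int j = 0; j < n; j++) {
                    double m = ik + a[k][j];           // weight of the path i -> k -> j
                    if (m < a[i][j]) a[i][j] = m;      // keep the minimum
                }
            }
    }

    // Rows and columns follow the slides' numbering (0: Pendleton ... 7: Pueblo).
    static double[][] cityWeights() {
        return new double[][] {
            {0,   INF, INF, 4,   INF, INF, INF, 8},
            {INF, 0,   INF, 5,   INF, INF, INF, INF},
            {INF, INF, 0,   INF, INF, 5,   INF, 3},
            {INF, INF, 4,   0,   INF, 10,  INF, 3},
            {2,   INF, INF, INF, 0,   INF, INF, INF},
            {INF, 4,   INF, INF, INF, 0,   INF, INF},
            {INF, INF, INF, INF, INF, 1,   0,   INF},
            {INF, INF, INF, INF, 3,   INF, INF, 0}
        };
    }

    public static void main(String[] args) {
        double[][] d = cityWeights();
        floyd(d);
        System.out.println("Pierre -> Pittsburgh: " + d[4][5]);   // 15.0
        System.out.println("Pierre -> Princeton:  " + d[4][6]);   // Infinity
    }
}
```

An entry that is still infinite after the loops means there is no path between the two cities, which is the weighted counterpart of a 0 in Warshall’s reachability matrix.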
Single Source Reachability: Edge-List
How do you determine reachability with an edge-list representation?
  – In this case we want to answer the single-source question: What is reachable from a given vertex v_i?
  – Should be faster than the all-pairs question
  – Basic algorithm:
      Initialize set of reachable vertices with v_i and add v_i to a stack
      While stack is not empty
        Get and remove (pop) last vertex v from stack
        For all neighbors, v_j, of v
          If v_j is not in the set of reachable vertices, add it to the stack and to the reachable set

Edge-List Representation: Vertex
Need a Vertex class

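One way such a Vertex class might look, together with the stack-based single-source algorithm, is sketched below. The fields and the methods addEdge and reachable are assumptions made for illustration; the slides do not show the course's actual class.

```java
import java.util.*;

class Vertex {
    final String name;
    final Set<Vertex> neighbors = new HashSet<>();   // targets of outgoing edges

    Vertex(String name) { this.name = name; }

    // Adds a directed edge from this vertex to the given vertex.
    void addEdge(Vertex to) { neighbors.add(to); }

    // Single-source reachability using an explicit stack.
    Set<Vertex> reachable() {
        Set<Vertex> reachable = new HashSet<>();
        Deque<Vertex> stack = new ArrayDeque<>();
        reachable.add(this);                  // the source reaches itself
        stack.push(this);
        while (!stack.isEmpty()) {
            Vertex v = stack.pop();
            for (Vertex w : v.neighbors)
                if (reachable.add(w))         // add() is true only if w was not yet in the set
                    stack.push(w);
        }
        return reachable;
    }
}
```

Each vertex is pushed at most once and each edge is examined at most once, so the work is proportional to V + E for the part of the graph that is reachable.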
Graphs: Example
[Figure: the city graph]
What cities are reachable from Peoria?

Graphs: Example
[Figure: the city graph]
Stack: Pittsburgh, Pueblo
Reachable: {Peoria, Pittsburgh, Pueblo}

Graphs: Example
[Figure: the city graph]
Pop stack, push next
Stack: Pensacola, Pueblo
Reachable: {Peoria, Pittsburgh, Pueblo, Pensacola}

Graphs: Example
[Figure: the city graph]
Pop stack, push next
Stack: Phoenix, Pueblo
Reachable: {Peoria, Pittsburgh, Pueblo, Pensacola, Phoenix}

Graphs: Example
[Figure: the city graph]
Pop stack, push next
Stack: Pueblo
Reachable: {Peoria, Pittsburgh, Pueblo, Pensacola, Phoenix}
Although there are 3 outgoing arcs from Phoenix, they all go to cities that are already known to be reachable.
Question: Would the algorithm still work if we used a queue instead of a stack? How would it be different?

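One way to explore the question is to run the same loop with either container and compare the order in which cities are removed. The sketch below assumes the edges of the edge-list slide, with neighbors pushed in the order listed there; StackVsQueue, visitOrder, and cityGraph are hypothetical names.

```java
import java.util.*;

class StackVsQueue {
    // Returns the order in which cities are removed from the container.
    // useQueue = false removes the most recently added city (stack);
    // useQueue = true removes the oldest one (queue).
    static List<String> visitOrder(Map<String, List<String>> g, String start, boolean useQueue) {
        Set<String> reachable = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        List<String> order = new ArrayList<>();
        reachable.add(start);
        pending.addLast(start);
        while (!pending.isEmpty()) {
            String v = useQueue ? pending.pollFirst() : pending.pollLast();
            order.add(v);
            for (String w : g.getOrDefault(v, List.of()))
                if (reachable.add(w)) pending.addLast(w);   // only cities not seen before
        }
        return order;
    }

    // Edges as listed on the edge-list slide.
    static Map<String, List<String>> cityGraph() {
        Map<String, List<String>> g = new HashMap<>();
        g.put("Pendleton", List.of("Pueblo", "Phoenix"));
        g.put("Pensacola", List.of("Phoenix"));
        g.put("Peoria", List.of("Pueblo", "Pittsburgh"));
        g.put("Phoenix", List.of("Pueblo", "Peoria", "Pittsburgh"));
        g.put("Pierre", List.of("Pendleton"));
        g.put("Pittsburgh", List.of("Pensacola"));
        g.put("Princeton", List.of("Pittsburgh"));
        g.put("Pueblo", List.of("Pierre"));
        return g;
    }
}
```

Starting from Peoria, both variants return the same seven cities, so the reachable set is unchanged. Only the order differs: the stack follows one chain of edges as far as it can before backing up, while the queue handles cities in order of their number of edges from the start.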
Weighted Graphs Representation: Edge List
[Figure: the city graph]
Pendleton:  {Pueblo:8, Phoenix:4}
Pensacola:  {Phoenix:5}
Peoria:     {Pueblo:3, Pittsburgh:5}
Phoenix:    {Pueblo:3, Peoria:4, Pittsburgh:10}
Pierre:     {Pendleton:2}
Pittsburgh: {Pensacola:4}
Princeton:  {Pittsburgh:2}
Pueblo:     {Pierre:3}
Instead of a Map of Sets, use a Map of Maps
First key is source, second key is destination, value is weight

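A Map of Maps along these lines could be written in Java as below. This is a sketch only: the class WeightedEdgeList and the helper weight are illustrative names, and returning positive infinity for a missing edge is an assumption chosen to match the ∞ convention of the weighted adjacency matrix.

```java
import java.util.*;

class WeightedEdgeList {
    // First key is the source city, second key the destination, value the weight.
    static Map<String, Map<String, Double>> cityDistances() {
        Map<String, Map<String, Double>> g = new HashMap<>();
        g.put("Pendleton", Map.of("Pueblo", 8.0, "Phoenix", 4.0));
        g.put("Pensacola", Map.of("Phoenix", 5.0));
        g.put("Peoria", Map.of("Pueblo", 3.0, "Pittsburgh", 5.0));
        g.put("Phoenix", Map.of("Pueblo", 3.0, "Peoria", 4.0, "Pittsburgh", 10.0));
        g.put("Pierre", Map.of("Pendleton", 2.0));
        g.put("Pittsburgh", Map.of("Pensacola", 4.0));
        g.put("Princeton", Map.of("Pittsburgh", 2.0));
        g.put("Pueblo", Map.of("Pierre", 3.0));
        return g;
    }

    // Weight of the edge from -> to, or positive infinity if there is no such edge.
    static double weight(Map<String, Map<String, Double>> g, String from, String to) {
        return g.getOrDefault(from, Map.of())
                .getOrDefault(to, Double.POSITIVE_INFINITY);
    }
}
```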
What about Weighted Graphs?
[Figure: the city graph]
Dijkstra’s algorithm. Use a priority queue instead of a stack. Return a map of (city, distance) pairs. The Pqueue orders values by shortest distance.
Dijkstra (String startCity, Map[String, Map[String, double]] distances)
  Make empty map of distances into variable reachable
  Put (startCity, 0) into Pqueue
  While Pqueue not empty
    pull new city from queue
    if not already in reachable, add to reachable
      add neighbors to queue, adding weight to distance from starting city
  When done with loop, return reachable map

Example: What is the distance from Pierre?
[Figure: the city graph]
Pqueue: Pierre: 0

Example: What is the distance from Pierre?
Pulled from Pqueue: Pierre (distance 0)
Pqueue: Pendleton: 2

Example: What is the distance from Pierre?
Reachable: Pierre: 0
Pulled from Pqueue: Pendleton (distance 2)
Pqueue: Phoenix: 6, Pueblo: 10
Notice how the distances have been added

Example: What is the distance from Pierre?
Reachable: Pierre: 0, Pendleton: 2
Pulled from Pqueue: Phoenix (distance 6)
Pqueue: Pueblo: 9, Peoria: 10, Pueblo: 10, Pittsburgh: 16
Notice how values are stored in the Pqueue in distance order

Example: What is the distance from Pierre?
Reachable: Pierre: 0, Pendleton: 2, Phoenix: 6
Pulled from Pqueue: Pueblo (distance 9)
Pqueue: Peoria: 10, Pueblo: 10, Pierre: 12, Pittsburgh: 16
Pierre gets put in the queue (at 9 + 3 = 12), although it is already known to be reachable

Example: What is the distance from Pierre?
Reachable: Pierre: 0, Pendleton: 2, Phoenix: 6, Pueblo: 9
Pulled from Pqueue: Peoria (distance 10)
Pqueue: Pueblo: 10, Pierre: 12, Pueblo: 13, Pittsburgh: 15, Pittsburgh: 16
Duplicates are only removed when they are pulled out of the queue

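The pseudocode can be turned into Java roughly as follows. This is a sketch under stated assumptions: the class DijkstraSketch, the helper class QueueItem, and cityDistances are names introduced here, and the weights are those of the weighted edge-list slide. As in the walkthrough, duplicates stay in the queue and are discarded only when pulled.

```java
import java.util.*;

class DijkstraSketch {
    // A queue entry: a city together with a candidate distance from the start.
    static final class QueueItem {
        final String city;
        final double dist;
        QueueItem(String city, double dist) { this.city = city; this.dist = dist; }
    }

    static Map<String, Double> dijkstra(String startCity,
                                        Map<String, Map<String, Double>> distances) {
        Map<String, Double> reachable = new HashMap<>();
        PriorityQueue<QueueItem> pqueue = new PriorityQueue<QueueItem>(
                Comparator.comparingDouble((QueueItem e) -> e.dist));   // shortest distance first
        pqueue.add(new QueueItem(startCity, 0.0));
        while (!pqueue.isEmpty()) {
            QueueItem cur = pqueue.poll();
            if (reachable.containsKey(cur.city)) continue;   // duplicate: already finalized
            reachable.put(cur.city, cur.dist);
            for (Map.Entry<String, Double> nb
                    : distances.getOrDefault(cur.city, Map.of()).entrySet())
                pqueue.add(new QueueItem(nb.getKey(), cur.dist + nb.getValue()));
        }
        return reachable;
    }

    // Weights as listed on the weighted edge-list slide.
    static Map<String, Map<String, Double>> cityDistances() {
        Map<String, Map<String, Double>> g = new HashMap<>();
        g.put("Pendleton", Map.of("Pueblo", 8.0, "Phoenix", 4.0));
        g.put("Pensacola", Map.of("Phoenix", 5.0));
        g.put("Peoria", Map.of("Pueblo", 3.0, "Pittsburgh", 5.0));
        g.put("Phoenix", Map.of("Pueblo", 3.0, "Peoria", 4.0, "Pittsburgh", 10.0));
        g.put("Pierre", Map.of("Pendleton", 2.0));
        g.put("Pittsburgh", Map.of("Pensacola", 4.0));
        g.put("Princeton", Map.of("Pittsburgh", 2.0));
        g.put("Pueblo", Map.of("Pierre", 3.0));
        return g;
    }
}
```

Calling dijkstra("Pierre", cityDistances()) returns a map with seven entries; Princeton is absent because no edge leads into it.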
Graphs: Traveling Salesman Problem
Famous graph problem:
  – Given a weighted graph of cities, with weights corresponding to distances between cities
  – Find the shortest tour (closed path) that visits all cities
No polynomial-time algorithm is known (the problem is NP-hard); known exact algorithms take exponential time in the worst case

Your Turn
Run Dijkstra’s algorithm starting from Pensacola