Graphs
[Figure: graph of Quebec, Toronto, Montreal, Ottawa, Winnipeg and New York with edge distances in km]
Graphs
A graph is a pair (V, E), where
– V is a set of nodes, called vertices
– E is a collection of pairs of vertices, called edges
– Vertices and edges are positions and store elements
Example:
– A vertex represents an airport and stores the three-letter airport code
– An edge represents a flight route between two airports and stores the mileage of the route
[Figure: flight network on the airports ORD, PVD, MIA, DFW, SFO, LAX, LGA, HNL]
Edge Types
Undirected edge
– unordered pair of vertices (u,v)
– e.g., a flight route
Undirected graph
– all the edges are undirected
– e.g., flight network
[Figure: undirected edge between ORD and PVD, 849 miles]
Edge Types
Directed edge
– ordered pair of vertices (u,v)
– first vertex u is the origin
– second vertex v is the destination
– e.g., a flight
Directed graph
– all the edges are directed
– e.g., route network
[Figure: directed edge from ORD to PVD, flight AA 1206]
Applications
Electronic circuits
– Printed circuit board
– Integrated circuit
Transportation networks
– Highway network
– Flight network
Computer networks
– Local area network
– Internet
– Web
Databases
– Entity-relationship diagram
Terminology
End vertices (or endpoints) of an edge
– U and V are the endpoints of a
Edges incident on a vertex
– a, d, and b are incident on V
Adjacent vertices
– U and V are adjacent
Degree of a vertex
– X has degree 5
Parallel edges
– h and i are parallel edges
Self-loop
– j is a self-loop
[Figure: graph on vertices U, V, W, X, Y, Z with edges a to j]
Terminology (cont.)
Path
– sequence of alternating vertices and edges
– begins with a vertex
– ends with a vertex
– each edge is preceded and followed by its endpoints
Simple path
– path such that all its vertices and edges are distinct
Examples
– P1 = (V,b,X,h,Z) is a simple path
– P2 = (U,c,W,e,X,g,Y,f,W,d,V) is a path that is not simple
[Figure: paths P1 and P2 on the graph with vertices U, V, W, X, Y, Z and edges a to h]
Terminology (cont.)
Cycle
– circular sequence of alternating vertices and edges
– each edge is preceded and followed by its endpoints
Simple cycle
– cycle such that all its vertices and edges are distinct
Examples
– C1 = (V,b,X,g,Y,f,W,c,U,a,V) is a simple cycle
– C2 = (U,c,W,e,X,g,Y,f,W,d,V,a,U) is a cycle that is not simple
[Figure: cycles C1 and C2 on the graph with vertices U, V, W, X, Y, Z and edges a to h]
Properties
Notation:
– n: number of vertices
– m: number of edges
– deg(v): degree of vertex v
Property 1: ∑_v deg(v) = 2m
– Proof: each edge is counted twice, once at each of its endpoints
Example
– n = 4, m = 6
– deg(v) = 3 for every vertex, so ∑_v deg(v) = 12 = 2m
Property 2: in an undirected graph with no self-loops and no multiple edges, m ≤ n(n − 1)/2
– Proof: each vertex has degree at most (n − 1), so 2m = ∑_v deg(v) ≤ n(n − 1)
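Both properties can be checked numerically. The sketch below assumes the slide's example (n = 4, m = 6, deg(v) = 3) is the complete graph on four vertices; the vertex names are made up for illustration.

```python
from collections import Counter

# Assumed example graph: the complete graph on 4 vertices (n = 4, m = 6).
edges = [("a", "b"), ("a", "c"), ("a", "d"),
         ("b", "c"), ("b", "d"), ("c", "d")]

# Every edge adds 1 to the degree of each of its two endpoints.
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

n, m = len(degree), len(edges)
total_degree = sum(degree.values())

print(total_degree == 2 * m)      # Property 1: sum of degrees equals 2m
print(m <= n * (n - 1) // 2)      # Property 2: bound for a simple graph
```

For the complete graph the bound of Property 2 is met with equality.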
Subgraphs
A subgraph S of a graph G is a graph such that
– The vertices of S are a subset of the vertices of G
– The edges of S are a subset of the edges of G
A spanning subgraph of G is a subgraph that contains all the vertices of G
[Figure: a subgraph and a spanning subgraph]
Trees and Forests
A (free) tree is an undirected graph T such that
– T is connected
– T has no cycles
A forest is an undirected graph without cycles (a collection of trees).
The connected components of a forest are trees.
[Figure: a tree and a forest]
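Connected Graphs
A (non-directed) graph G = (V, E) is connected if there exists a path between u and v for every pair of vertices u, v ∈ V.
Connected components: the maximal connected subgraphs of G.
[Figure: a connected graph G, and a graph G split into connected components]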
Main Methods of the Graph
Accessor methods
– aVertex()
– isEdge(u,v)
– incidentEdges(v)
– endVertices(e)
– isDirected(e)
– origin(e)
– destination(e)
– opposite(v, e)
– areAdjacent(v, w)
Update methods
– insertVertex(o)
– insertEdge(v, w, o)
– insertDirectedEdge(...)
– removeVertex(v)
– removeEdge(e)
Generic methods
– numVertices()
– numEdges()
– vertices()
– edges()
There could be other methods...
Representations
(n = number of nodes, m = number of edges)
– Edge List
– Adjacency List
– Adjacency Matrix
– Incidence Matrix
Representations: Edge List Structure
Two lists: a list of vertices and a list of edges
Vertex object
– element
– reference to position in the list of vertices
Edge object
– element
– origin vertex object
– destination vertex object
– reference to position in the list of edges
Space: O(n + m), where n = # of vertices and m = # of edges
[Figure: edge list structure for a graph with vertices u, v, w, z and edges a, b, c, d]
Representations: Adjacency List Structure
n lists: for each vertex, a list of incident edges
List of incidence for each vertex
– element
– list of references to incident edges
Augmented edge objects
– element
– references to both extreme vertices
– reference to position in the list of edges
Space: O(n + m), where n = # of vertices and m = # of edges
[Figure: adjacency lists holding the edges (1,2), (1,4), (3,2), (4,5), (5,1), (5,2)]
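A minimal sketch of the adjacency list idea, not the slides' own implementation: each vertex maps to a list of its incident edges. The class name, the method names and the choice of a tuple as the edge object are assumptions made for illustration. The vertex pairs reuse the ones from the slide's figure, treated here as undirected edges.

```python
from collections import defaultdict

class AdjacencyListGraph:
    """Undirected graph; each vertex keeps a list of its incident edges."""

    def __init__(self):
        # vertex -> list of edge tuples (u, v, element)
        self.incident = defaultdict(list)

    def insert_edge(self, u, v, element=None):
        edge = (u, v, element)
        self.incident[u].append(edge)   # O(1)
        self.incident[v].append(edge)   # O(1)
        return edge

    def incident_edges(self, v):
        return list(self.incident[v])   # O(deg(v))

    def opposite(self, v, edge):
        u, w, _ = edge
        return w if v == u else u

    def are_adjacent(self, u, v):
        # Scan the shorter incidence list: O(min(deg(u), deg(v))).
        if len(self.incident[u]) <= len(self.incident[v]):
            a, b = u, v
        else:
            a, b = v, u
        return any(self.opposite(a, e) == b for e in self.incident[a])

g = AdjacencyListGraph()
for u, v in [(1, 2), (1, 4), (3, 2), (4, 5), (5, 1), (5, 2)]:
    g.insert_edge(u, v)

print(g.are_adjacent(1, 2))        # vertices 1 and 2 share an edge
print(len(g.incident_edges(5)))    # degree of vertex 5
```

Storing the same edge object in both endpoint lists is what keeps the total space at O(n + m).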
Representations: Adjacency List Structure
[Figure: adjacency list structure for a graph with vertices u, v, w and edges a, b]
Representations: Adjacency Matrix Structure
If G is non-directed, its adjacency matrix is symmetric.
[Figure: a graph G and its adjacency matrix]
Representations: Adjacency Matrix Structure
Edge list structure
Augmented vertex objects
– Integer key (index) associated with vertex
2D-array adjacency array
– Reference to edge object for adjacent vertices
– Null for non-adjacent vertices
Space: O(n * n)
A lot of space is wasted if the matrix is sparse …
[Figure: adjacency matrix structure for a graph with vertices u, v, w (keys 0, 1, 2) and edges a, b]
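The adjacency matrix idea can be sketched as follows, assuming vertices are already identified by integer keys 0..n-1. The class and method names are illustrative, and the stored "edge object" is just a label here.

```python
class AdjacencyMatrixGraph:
    """Undirected graph on vertices 0..n-1 stored as an n x n array."""

    def __init__(self, n):
        self.n = n
        # None marks a pair of non-adjacent vertices.
        self.matrix = [[None] * n for _ in range(n)]

    def insert_edge(self, i, j, element):
        # O(1); both entries are set because the graph is undirected,
        # which keeps the matrix symmetric.
        self.matrix[i][j] = element
        self.matrix[j][i] = element

    def remove_edge(self, i, j):
        self.matrix[i][j] = None        # O(1)
        self.matrix[j][i] = None

    def are_adjacent(self, i, j):
        return self.matrix[i][j] is not None   # O(1)

    def neighbors(self, i):
        # O(n): the whole row is scanned, whatever the degree of i.
        return [j for j in range(self.n) if self.matrix[i][j] is not None]

g = AdjacencyMatrixGraph(3)
g.insert_edge(0, 1, "a")
g.insert_edge(1, 2, "b")
print(g.are_adjacent(0, 1), g.are_adjacent(0, 2))
print(g.neighbors(1))
```

The n x n array is allocated even for a graph with very few edges, which is the wasted space the slide mentions for sparse matrices.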
Representations: Incidence Matrix Structure
Space: O(n * m)
[Figure: a graph G on vertices v1 to v6 and its incidence matrix]
Is (v_i, v_j) an edge?
– Adjacency Matrix: look at entry [i][j], O(1)
– Adjacency List: min{O(deg(i)), O(deg(j))}
– Edge List: O(m)
[Figure: the three structures for a graph with vertices v1 to v6 and edges l1 to l9]
Which nodes are adjacent to v_i?
– Adjacency Matrix: scan row i, O(n)
– Adjacency List: O(deg(i))
– Edge List: O(m)
[Figure: the three structures for a graph with vertices v1 to v6 and edges l1 to l9]
Mark all Edges
– Adjacency Matrix: O(n²)
– Adjacency List and Edge List: O(m)
[Figure: the three structures for a graph with vertices v1 to v6 and edges l1 to l9]
Add an Edge (v_i, v_j)
– Adjacency Matrix: set entry [i][j] to 1, O(1)
[Figure: the same insertion on the adjacency list and the edge list, for a graph with vertices v1 to v6 and edges l1 to l9]
Remove an Edge (v_i, v_j)
– Adjacency Matrix: set entry [i][j] to 0, O(1)
[Figure: the same removal on the adjacency list and the edge list, for a graph with vertices v1 to v6 and edges l1 to l9]
Performance (n vertices, m edges, no parallel edges, no self-loops)

                       Edge List   Adjacency List         Adjacency Matrix
Space                  n + m       n + m                  n²
incidentEdges(v)       m           deg(v)                 n
areAdjacent(v, w)      m           min(deg(v), deg(w))    1
insertVertex(o)        1           1                      n²
insertEdge(v, w, o)    1           1                      1
removeVertex(v)        m           deg(v)                 n²
removeEdge(e)          1           1                      1
Special Graphs Bipartite Graphs Planar Graphs Cannot have
Bound for the number of edges (n = |V|, m = |E|)
Connected, non-directed
– n − 1 ≤ m ≤ n(n − 1)/2
– degree: 1 ≤ deg(i) ≤ n − 1
Connected, directed
– n − 1 ≤ m ≤ n(n − 1)
– OUT-degree: 1 ≤ deg(i) ≤ n − 1
Graph Traversals
[Figure: graph with vertices A, B, C, D, E]
Graph Traversals
A traversal of a graph G:
– Visits all the vertices and edges of G
– Determines whether G is connected
– Computes the connected components of G
– Computes a spanning forest of G
– Builds a spanning tree in a connected graph
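As a rough sketch of how a traversal yields the connected components: restart the traversal from every vertex that has not been reached yet. The function name, the dictionary-of-lists graph format and the example graph are assumptions made for illustration.

```python
def connected_components(adj):
    """Return the connected components of an undirected graph.

    adj maps each vertex to the list of its adjacent vertices.
    """
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        # One traversal from `start` collects exactly its component.
        seen.add(start)
        component = [start]
        frontier = [start]
        while frontier:
            v = frontier.pop()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    component.append(w)
                    frontier.append(w)
        components.append(component)
    return components

# Hypothetical graph with three components.
adj = {"A": ["B"], "B": ["A"], "C": ["D"], "D": ["C"], "E": []}
components = connected_components(adj)
is_connected = len(components) == 1
print(components, is_connected)
```

The graph is connected exactly when a single traversal reaches every vertex, that is, when one component is returned.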
Depth-First Search (DFS)
The idea: starting at an arbitrary vertex, follow along a simple path until you get to a vertex which has no unvisited adjacent vertices. Then start tracing back up the path, one vertex at a time, to find a vertex with unvisited adjacent vertices.
DFS is a graph traversal technique that:
– on a graph with n vertices and m edges takes O(n + m) time (which is O(m) when the graph is connected, since then m ≥ n − 1)
– can be further extended to solve other graph problems:
  – Find and report a path between two given vertices
  – Find a cycle in the graph, if there is one
Depth-First Search
[Figure: DFS on a graph with vertices A, B, C, D, E; edges are marked as tree edges or back edges]
DFS Algorithm – With a Stack
When we arrive at a node for the first time:
– We push all its incident edges onto the stack
– We add the edge we came through to the tree
To move to a new node, we pop the edge on top of the stack and move through it.
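The stack-based description above can be sketched as follows. This is one possible reading, not the slides' own code: edges are stored as (from, to) pairs, popped edges that lead to an already visited vertex are simply discarded, and the example graph is hypothetical.

```python
def dfs_stack(adj, start):
    """Iterative DFS that keeps a stack of edges.

    Returns the vertices in visiting order and the list of tree edges.
    """
    visited = {start}
    order = [start]
    tree_edges = []
    # Push the incident edges of the start vertex (reversed so that the
    # first neighbour in the adjacency list is explored first).
    stack = [(start, w) for w in reversed(adj[start])]
    while stack:
        u, v = stack.pop()
        if v in visited:
            continue                 # leads to a vertex we already know
        visited.add(v)               # first arrival at v
        order.append(v)
        tree_edges.append((u, v))    # the edge we came through
        stack.extend((v, w) for w in reversed(adj[v]))
    return order, tree_edges

# Hypothetical example graph.
adj = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B", "D", "E"],
    "D": ["A", "C"],
    "E": ["C"],
}
order, tree_edges = dfs_stack(adj, "A")
print(order)
print(tree_edges)
```

Each vertex pushes its incident edges once, so the number of stack operations is proportional to the sum of the degrees.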
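Complexity
Elementary operations: pop, push, and visits
– Number of pushes: at most 2m, since each vertex pushes its incident edges once and ∑_v deg(v) = 2m
– Number of pops: at most the number of pushes
– Visits of a node: n
If the graph is implemented with an adjacency list, the total is O(n + m), which is O(m) for a connected graph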
DFS Algorithm – Recursive version
DFS(v) {
    mark v visited
    for each vertex w adjacent to v
        if w is not visited {
            visit w
            DFS(w)
        }
}
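A direct Python transcription of the recursive version, given as a sketch: the set `visited`, the list `order` and the example graph are assumptions added for illustration.

```python
def dfs(adj, v, visited, order):
    """Recursive DFS from v; records vertices in the order they are visited."""
    visited.add(v)           # mark v visited
    order.append(v)
    for w in adj[v]:         # for each vertex w adjacent to v
        if w not in visited:
            dfs(adj, w, visited, order)

# Hypothetical example graph.
adj = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B", "D", "E"],
    "D": ["A", "C"],
    "E": ["C"],
}
visited, order = set(), []
dfs(adj, "A", visited, order)
print(order)
```

On long paths the recursion depth grows with the number of vertices, so for large graphs in Python an explicit stack is usually the safer choice.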
Properties of DFS
Property 1: DFS(G, v) visits all the vertices and edges in the connected component of v
Property 2: The discovery edges labeled by DFS(G, v) form a spanning tree of the connected component of v
[Figure: DFS tree on a graph with vertices A, B, C, D, E]
Conclusion
DFS runs in O(n + m) time provided the graph is represented by the adjacency list structure
– For a connected graph, O(n + m) = O(m), so with an adjacency list the complexity of DFS is O(m)
– Worst case: m = O(n²), when …
Question: what about with an adjacency matrix?
Path Finding
– Do not consider backtrack edges.
– As soon as destination vertex z is encountered, we return the path as the contents of the stack
Cycle Finding
– As soon as a back edge (v, w) is encountered, we return the cycle as the portion of the stack from the top to vertex w
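Both extensions can be sketched with a recursive DFS in which the list `path` plays the role of the stack. The function names and the example graph are hypothetical, and the cycle finder assumes an undirected graph without parallel edges or self-loops.

```python
def find_path(adj, v, z, path=None, visited=None):
    """Return a path from v to z as a list of vertices, or None."""
    if path is None:
        path, visited = [], set()
    visited.add(v)
    path.append(v)                       # push v
    if v == z:
        return list(path)                # the stack contents are the path
    for w in adj[v]:
        if w not in visited:
            found = find_path(adj, w, z, path, visited)
            if found is not None:
                return found
    path.pop()                           # backtrack
    return None

def find_cycle(adj, v, parent=None, path=None, visited=None):
    """Return a cycle as a list of vertices (first == last), or None."""
    if path is None:
        path, visited = [], set()
    visited.add(v)
    path.append(v)
    for w in adj[v]:
        if w == parent:
            continue                     # the tree edge we arrived by
        if w in visited:
            # Back edge (v, w): the cycle is the stack portion from w to v.
            return path[path.index(w):] + [w]
        found = find_cycle(adj, w, v, path, visited)
        if found is not None:
            return found
    path.pop()
    return None

# Hypothetical example graph.
adj = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B", "D", "E"],
    "D": ["A", "C"],
    "E": ["C"],
}
path_a_e = find_path(adj, "A", "E")
cycle = find_cycle(adj, "A")
print(path_a_e)
print(cycle)
```

The path returned is some path found by DFS, not necessarily one with the fewest edges.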
Breadth-First Search (BFS)
The idea: visit a vertex and then visit all unvisited vertices that are adjacent to it before visiting a vertex which is 2 nodes away from it.
BFS is a graph traversal technique that:
– on a graph with n vertices and m edges takes O(n + m) time (which is O(m) when the graph is connected, since then m ≥ n − 1)
– can be further extended to solve other graph problems:
  – Find and report a path with the minimum number of edges between two given vertices
  – Find a simple cycle, if there is one
Breadth-First Search
[Figure: BFS on a graph with vertices A, B, C, D, E, F, grouped into levels L0, L1, L2]
BFS Algorithm – With a Queue
When we arrive at a node for the first time:
– We add all its incident edges to the queue
To move to a new node:
– We take the next edge from the queue
– If the vertex at its other end is not visited, we mark it and add the traversed edge to the tree
– We move through the edge to that node
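A compact BFS sketch. Note that it queues vertices rather than edges, a common simplification of the edge-queue description above; the tree edge of each vertex is kept in `parent`. The names, the example graph and the level bookkeeping are assumptions made for illustration.

```python
from collections import deque

def bfs(adj, s):
    """BFS from s; returns the level and the tree parent of each reached vertex."""
    level = {s: 0}
    parent = {s: None}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in level:            # first arrival at w
                level[w] = level[v] + 1
                parent[w] = v             # (v, w) is a tree edge
                queue.append(w)
    return level, parent

def path_to(parent, v):
    """Follow tree edges back to the start vertex."""
    path = []
    while v is not None:
        path.append(v)
        v = parent[v]
    return path[::-1]

# Hypothetical example graph with three levels.
adj = {
    "A": ["B", "C", "D"],
    "B": ["A", "C", "E"],
    "C": ["A", "B", "D", "E", "F"],
    "D": ["A", "C", "F"],
    "E": ["B", "C"],
    "F": ["C", "D"],
}
level, parent = bfs(adj, "A")
print(level)
print(path_to(parent, "F"))
```

The tree path to a vertex in level i has exactly i edges, so `path_to` returns a path with the minimum number of edges.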
Properties of BFS
Notation: G_s is the connected component of s
Property 1: BFS(G, s) visits all the vertices and edges of G_s
Property 2: The discovery edges labeled by BFS(G, s) form a spanning tree T_s of G_s
Property 3: For each vertex v in L_i
– The path of T_s from s to v has i edges
– Every path from s to v in G_s has at least i edges
[Figure: BFS levels L0, L1, L2 on a graph with vertices A, B, C, D, E, F]
Complexity
– Method incidentEdges is called once for each vertex
– Recall that ∑_v deg(v) = 2m
– BFS runs in O(n + m) time if the graph is implemented with an adjacency list
DFS vs. BFS
DFS: back edge (v,w)
– w is an ancestor of v in the tree of discovery edges
BFS: cross edge (v,w)
– w is in the same level as v or in the next level in the tree of discovery edges
[Figure: DFS tree and BFS tree (levels L0, L1, L2) on the same graph with vertices A, B, C, D, E, F]
Shortest Path
[Figure: weighted graph with vertices A, B, C, D, E, F]
Weighted Graphs
– In a weighted graph, each edge has an associated numerical value, called the weight of the edge
– Edge weights may represent distances, costs, etc.
Example:
– In a flight route graph, the weight of an edge represents the distance in miles between the endpoint airports
[Figure: weighted flight route graph on ORD, PVD, MIA, DFW, SFO, LAX, LGA, HNL]
Shortest Path Problem
Given a weighted graph and two vertices u and v, we want to find a path of minimum total weight between u and v
Applications
– Flight reservations
– Driving directions
– Internet packet routing
[Figure: flight route graph on ORD, PVD, MIA, DFW, SFO, LAX, LGA, HNL; labels: Providence, Honolulu]
Shortest Path Properties
Property 1: A subpath of a shortest path is itself a shortest path
Property 2: There is a tree of shortest paths from a start vertex to all the other vertices
[Figure: tree of shortest paths in the flight route graph on ORD, PVD, MIA, DFW, SFO, LAX, LGA, HNL]
Dijkstra’s Algorithm
– The distance of a vertex v from a vertex s is the length of a shortest path between s and v
– Dijkstra’s algorithm computes the distances of all the vertices from a given start vertex s
Assumptions:
– the graph is connected
– the edges are undirected
– the edge weights are nonnegative
– We grow a “cloud” of vertices, beginning with s and eventually covering all the vertices
– At each vertex v we store d(v) = best distance of v from s in the subgraph consisting of the cloud and its adjacent vertices
[Figure: the cloud on a graph with vertices A, B, C, D, E, F]
At each step
– We add to the cloud the vertex u outside the cloud with the smallest distance label
– We update the labels of the vertices adjacent to u
[Figure: graph with vertices A, B, C, D, E, F; three labels improve to 3, 11 and 5 ("better way!")]
Update = Edge Relaxation
Consider an edge e = (u,z) such that
– u is the vertex most recently added to the cloud
– z is not in the cloud
The relaxation of edge e updates distance d(z) as follows:
d(z) ← min(d(z), d(u) + weight(e))
[Figure: relaxing e lowers d(z) from 75 to 60]
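The relaxation rule fits in a few lines. In the sketch below, only the values 75 and 60 for d(z) come from the slide's figure; d(u) = 50 and weight(e) = 10 are assumed so that the numbers work out, and the function name is made up.

```python
def relax(d, u, z, weight):
    """Relax edge (u, z): keep the better of the old and the new label for z.

    Returns True when d[z] was improved.
    """
    candidate = d[u] + weight
    if candidate < d[z]:
        d[z] = candidate
        return True
    return False

# Assumed values: d(u) = 50 and weight(e) = 10, so d(z) drops from 75 to 60.
d = {"s": 0, "u": 50, "z": 75}
improved = relax(d, "u", "z", 10)
print(improved, d["z"])
```

Relaxing the same edge a second time changes nothing, because d(z) already equals d(u) + weight(e).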
Example
[Figure: four successive steps of Dijkstra’s algorithm on a graph with vertices A, B, C, D, E, F]
Example (cont.)
[Figure: the last two steps of Dijkstra’s algorithm on the graph with vertices A, B, C, D, E, F]
Dijkstra’s Algorithm
We use a priority queue Q to store the vertices not in the cloud, where D[v] is the key of a vertex v in Q
while Q ≠ ∅ do
    {pull u into the cloud C}
    u ← Q.removeMinElement()                                  O(log n)
    for each vertex z adjacent to u such that z is in Q do    deg(u) of them
        {perform the relaxation operation on edge (u, z)}
        if D[u] + w((u, z)) < D[z] then
            D[z] ← D[u] + w((u, z))
            change the key value of z in Q to D[z]            O(log n)
The for loop costs O(deg(u) log n) for each vertex u.
Using a heap, and if the graph is implemented with an adjacency list:
∑_{u ∈ G} (1 + deg(u)) log n = O((n + m) log n) = O(m log n)
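A runnable sketch of the loop above using Python's `heapq`. One deliberate difference: `heapq` has no operation to change a key, so instead of updating the key of z in place this version pushes a new entry and skips stale ones when they are popped. The graph and its weights are assumed for illustration.

```python
import heapq

def dijkstra(adj, s):
    """Distances from s; adj maps a vertex to a list of (neighbour, weight)."""
    dist = {v: float("inf") for v in adj}
    dist[s] = 0
    heap = [(0, s)]
    cloud = set()
    while heap:
        d_u, u = heapq.heappop(heap)        # O(log n)
        if u in cloud:
            continue                        # stale entry, u was already pulled
        cloud.add(u)                        # pull u into the cloud
        for z, w in adj[u]:
            # Relaxation of edge (u, z).
            if z not in cloud and d_u + w < dist[z]:
                dist[z] = d_u + w
                heapq.heappush(heap, (dist[z], z))   # stands in for the key change
    return dist

# Hypothetical weighted graph.
adj = {
    "A": [("B", 8), ("C", 2), ("D", 4)],
    "B": [("A", 8), ("C", 7), ("E", 2)],
    "C": [("A", 2), ("B", 7), ("D", 1), ("E", 3), ("F", 9)],
    "D": [("A", 4), ("C", 1), ("F", 5)],
    "E": [("B", 2), ("C", 3)],
    "F": [("C", 9), ("D", 5)],
}
dist = dijkstra(adj, "A")
print(dist)
```

With lazy deletion the heap can hold up to m entries, which still gives O(m log n) since log m = O(log n).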
Priority queue implemented with an unsorted sequence:
– O(n) when we extract minimum elements, but fast key updates (O(1))
– There are only n − 1 extractions and m updates
– The running time is O(n² + m) = O(n²)
In conclusion:
– Heap: O(m log n)
– Unsorted sequence: O(n²)
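The unsorted-sequence variant can be sketched by replacing the heap with a linear scan over the vertices still outside the cloud. The function name and the example graph are assumptions made for illustration.

```python
def dijkstra_unsorted(adj, s):
    """Dijkstra with an unsorted collection as the priority queue: O(n^2)."""
    dist = {v: float("inf") for v in adj}
    dist[s] = 0
    outside = set(adj)                       # vertices not yet in the cloud
    while outside:
        # Extract-min by scanning every remaining vertex: O(n).
        u = min(outside, key=dist.__getitem__)
        outside.remove(u)
        for z, w in adj[u]:
            if z in outside and dist[u] + w < dist[z]:
                dist[z] = dist[u] + w        # key update in O(1)
    return dist

# Hypothetical weighted graph.
adj = {
    "A": [("B", 8), ("C", 2), ("D", 4)],
    "B": [("A", 8), ("C", 7), ("E", 2)],
    "C": [("A", 2), ("B", 7), ("D", 1), ("E", 3), ("F", 9)],
    "D": [("A", 4), ("C", 1), ("F", 5)],
    "E": [("B", 2), ("C", 3)],
    "F": [("C", 9), ("D", 5)],
}
dist = dijkstra_unsorted(adj, "A")
print(dist)
```

For dense graphs, where m is close to n², this bound is better than the O(m log n) of the heap version.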
Minimum Spanning Tree
[Figure: weighted graph on the airports ORD, PIT, ATL, STL, DEN, DFW, DCA]
Minimum Spanning Tree
Spanning subgraph
– Subgraph of a graph G containing all the vertices of G
Spanning tree
– Spanning subgraph that is itself a (free) tree
Minimum spanning tree (MST)
– Spanning tree of a weighted graph with minimum total edge weight
Applications
– Communications networks
– Transportation networks
[Figure: weighted graph on the airports ORD, PIT, ATL, STL, DEN, DFW, DCA]
Cycle Property
– Let T be a minimum spanning tree of a weighted graph G
– Let e be an edge of G that is not in T and let C be the cycle formed by adding e to T
– For every edge f of C, weight(f) ≤ weight(e)
Proof: by contradiction
– If weight(f) > weight(e), we can get a spanning tree of smaller weight by replacing f with e
[Figure: cycle C with edges e and f; replacing f with e yields a better spanning tree]
Cycle Property
In other words: take an MST; in the cycle that a non-tree edge (dotted line) forms with the tree, the non-tree edge has maximum weight.
[Figure: weighted graph on the airports ORD, PIT, ATL, STL, DEN, DFW, DCA]
Partition Property
Consider a partition of the vertices of G into subsets U and V. Let e be an edge of minimum weight across the partition. There is a minimum spanning tree of G containing edge e.
Proof:
– Let T be an MST of G
– If T does not contain e, consider the cycle C formed by e with T and let f be an edge of C, other than e, across the partition
– By the cycle property, weight(f) ≤ weight(e)
– Since e has minimum weight across the partition, weight(e) ≤ weight(f); thus, weight(f) = weight(e)
– We obtain another MST by replacing f with e
[Figure: partition into U and V with edges e and f across it; replacing f with e yields another MST]
Prim-Jarnik’s Algorithm
– Prim-Jarnik’s algorithm for computing an MST is similar to Dijkstra’s algorithm
– We assume that the graph is connected
– We pick an arbitrary vertex s and we grow the MST as a cloud of vertices, starting from s
– We store with each vertex v a label d(v) representing the smallest weight of an edge connecting v to any vertex in the cloud (as opposed to the total sum of edge weights on a path from the start vertex to v)
Prim-Jarnik’s Algorithm
At each step
– We add to the cloud the vertex u outside the cloud with the smallest distance label
– We update the labels of the vertices adjacent to u
Use a priority queue Q whose keys are D labels, and whose elements are vertex-edge pairs.
– Key: distance
– Element: vertex
Any vertex v can be the starting vertex.
We still initialize all the D[u] values to INFINITE, but we also initialize E[u] (the edge associated with u) to null.
Return the minimum spanning tree T.
We can reuse code from Dijkstra’s, and we only have to change a few things.
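A sketch of Prim-Jarnik along the lines described above, with D[u] and E[u] as dictionaries. As with the heap-based Dijkstra sketch, Python's `heapq` cannot change a key, so new entries are pushed and stale ones skipped; that substitution, the function name and the weighted example graph are assumptions.

```python
import heapq

def prim_jarnik(adj, s):
    """Return the MST edges of a connected graph as (u, z, weight) triples."""
    D = {v: float("inf") for v in adj}   # best edge weight into the cloud
    E = {v: None for v in adj}           # the edge achieving D[v]
    D[s] = 0
    heap = [(0, s)]
    cloud = set()
    tree = []
    while heap:
        _, u = heapq.heappop(heap)
        if u in cloud:
            continue                     # stale entry
        cloud.add(u)
        if E[u] is not None:
            tree.append(E[u])
        for z, w in adj[u]:
            # Unlike Dijkstra, the label is the edge weight alone, not D[u] + w.
            if z not in cloud and w < D[z]:
                D[z] = w
                E[z] = (u, z, w)
                heapq.heappush(heap, (w, z))
    return tree

# Hypothetical weighted graph.
adj = {
    "A": [("B", 8), ("C", 2), ("D", 4)],
    "B": [("A", 8), ("C", 7), ("E", 2)],
    "C": [("A", 2), ("B", 7), ("D", 1), ("E", 3), ("F", 9)],
    "D": [("A", 4), ("C", 1), ("F", 5)],
    "E": [("B", 2), ("C", 3)],
    "F": [("C", 9), ("D", 5)],
}
tree = prim_jarnik(adj, "A")
total_weight = sum(w for _, _, w in tree)
print(tree, total_weight)
```

The only change from the Dijkstra sketch is the comparison `w < D[z]` and the bookkeeping of the edge in E.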
Example
[Figure: four successive steps of Prim-Jarnik’s algorithm on a graph with vertices A, B, C, D, E, F]
Example (contd.)
[Figure: the last two steps of Prim-Jarnik’s algorithm on the graph with vertices A, B, C, D, E, F]
Analysis
Graph operations
– Method incidentEdges is called once for each vertex
Label operations
– We set/get the labels of vertex z O(deg(z)) times
– Setting/getting a label takes O(1) time
Priority queue operations
– Each vertex is inserted once into and removed once from the priority queue, where each insertion or removal takes O(log n) time
– The key of a vertex w in the priority queue is modified at most deg(w) times, where each key change takes O(log n) time
Prim-Jarnik’s algorithm runs in O((n + m) log n) time provided the graph is represented by the adjacency list structure
– Recall that ∑_v deg(v) = 2m
The running time is O(m log n) since the graph is connected