Fundamental Data Structures and Algorithms

15-211 Fundamental Data Structures and Algorithms
Graph Algorithms
Aleks Nanevski, March 18, 2004

Announcements
Homework Assignment #6 (due April 5): Building a Web Search Engine
  - Web Crawler (due March 25)
  - Web Reader (due March 25)
  - Web Search (due April 5)
Reading: Chapter 14

Graphs: an overview
[Figure: an undirected, weighted graph of airports BOS, DTW, SFO, PIT, JFK, and LAX. The airports are the vertices (aka nodes), the lines between them are the edges, and the number on each edge is its weight.]

Definitions
Graph G = (V,E)
  - Set V of vertices (nodes)
  - Set E of edges
  - Elements of E are pairs (v,w) where v,w ∈ V.
  - An edge (v,v) is a self-loop. (Usually assume no self-loops.)
Weighted graph
  - Elements of E are ((v,w),x) where x is a weight.
Directed graph (digraph)
  - The edge pairs are ordered.
Undirected graph
  - The edge pairs are unordered.
  - E is a symmetric relation: (v,w) ∈ E implies (w,v) ∈ E.
  - In an undirected graph, (v,w) and (w,v) are the same edge.

Representing graphs
Adjacency matrix
  - 2-dimensional array A
  - For each edge (u,v), set A[u][v] to true; all other entries are false.
Adjacency lists
  - For each vertex, keep a list of adjacent vertices.
[Figure: a 7-vertex example graph (vertices 1 to 7) shown both as an adjacency matrix, with an x marking each edge, and as adjacency lists.]
Q: How to represent weights?
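
One possible answer to the question about weights, as a minimal Python sketch: store the weight in the matrix cell instead of true/false, or store (neighbor, weight) pairs in each adjacency list. The 4-vertex graph and the names `edges`, `matrix`, and `adj` are hypothetical and chosen only for illustration.

```python
# Hypothetical directed graph with vertices 0..3; each edge is (u, v, weight).
edges = [(0, 1, 5), (0, 2, 2), (2, 1, 1), (1, 3, 4)]
n = 4

# Adjacency matrix: matrix[u][v] holds the weight, or None when there is no edge.
matrix = [[None] * n for _ in range(n)]
for u, v, w in edges:
    matrix[u][v] = w

# Adjacency lists: adj[u] is a list of (neighbor, weight) pairs.
adj = {u: [] for u in range(n)}
for u, v, w in edges:
    adj[u].append((v, w))

print(matrix[0][2])  # weight of edge (0, 2)
print(adj[0])        # outgoing edges of vertex 0
```

The matrix uses O(|V|^2) space regardless of how many edges exist, while the lists use O(|V|+|E|), which is one reason the later slides assume adjacency lists.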

Representing graphs We will assume adjacency list representation
For each vertex v, denote the adjacency list of v as adj(v)

Graph Traversal

Graph Traversals
Explore all vertices reachable from some vertex s.
One of the fundamental operations on graphs. A traversal can be used to:
  - Count the total edges
  - Output the content of each vertex
  - Identify connected components
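
As an illustration of the last use, the sketch below finds the connected components of an undirected graph by starting a fresh traversal from every vertex that no earlier traversal reached. The function name `connected_components` and the sample graph are assumptions made for this example, not part of the lecture.

```python
def connected_components(adj):
    """Return the connected components of an undirected graph as lists of vertices."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        # Everything reachable from `start` forms one component.
        component = []
        stack = [start]
        seen.add(start)
        while stack:
            x = stack.pop()
            component.append(x)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        components.append(component)
    return components

# Hypothetical undirected graph: every edge appears in both adjacency lists.
adj = {"a": ["b"], "b": ["a"], "c": ["d"], "d": ["c"], "e": []}
print(connected_components(adj))
```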

Graph Traversals
Before/during the tour, mark each vertex v as one of:
  - Completed (v and adj(v) are visited)
  - Frontier (v visited, but adj(v) not)
  - Not visited
Data structures
  - C: list of completed nodes
  - F: list representing the frontier

Graph Traversals
Both traversals covered here, depth-first search and breadth-first search, take time O(|V|+|E|).

Depth-first Search
    C = []     // empty list
    F = [s]    // singleton list
    while (F <> []) {
        x = head(F); F = tail(F);
        if (x not in C) {        // x may have been put on F more than once
            C = x, C;            // mark x as completed
            for each y in adj(x) do
                if (y not in C) then F = y, F
        }
    }
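
The loop above can be sketched in Python as follows. The adjacency lists are an assumption: they are a hypothetical directed graph on the slide's vertex names, ordered so that the completion order matches the trace on the following slides. A set is kept alongside the completed list so that membership tests take constant time.

```python
def dfs(adj, s):
    """Iterative depth-first traversal; returns vertices in completion order."""
    completed = []      # C in the pseudocode
    done = set()        # same contents as C, for O(1) membership tests
    frontier = [s]      # F, used as a stack (the top is the end of the list)
    while frontier:
        x = frontier.pop()
        if x in done:   # x was pushed more than once; skip the repeat
            continue
        completed.append(x)
        done.add(x)
        for y in adj[x]:
            if y not in done:
                frontier.append(y)
    return completed

# Hypothetical adjacency lists (an assumption, not taken from the slides).
adj = {
    "s": ["a", "b", "c"], "a": [], "b": ["d", "e"], "c": ["b", "f"],
    "d": ["a"], "e": [], "f": ["e"], "g": [],
}
print(dfs(adj, "s"))
```

Vertex g never appears in the result because it is not reachable from s.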

Depth-first Search (trace)
[Figure: example graph on vertices s, a, b, c, d, e, f, g, redrawn at each step.]
  Completed: empty              Frontier: s
  Completed: s                  Frontier: c b a
  Completed: s c                Frontier: f b a
  Completed: s c f              Frontier: e b a
  Completed: s c f e            Frontier: b a
  Completed: s c f e b          Frontier: d b a
  Completed: s c f e b d        Frontier: a b
  Completed: s c f e b d a      Frontier: empty

Depth-first Search
Notice: the frontier list is used like a stack.
Simpler implementation using recursion:
    dfs( vertex x ) {
        mark x as completed;       // put x in C
        forall y in adj(x) do
            if ( y not marked )
                dfs( y );          // explore edge
    }
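
A minimal Python rendering of the recursive version, under the assumption that the graph is a dictionary of adjacency lists. The `marked` and `order` parameters and the sample graph are hypothetical names introduced for this sketch.

```python
def dfs_recursive(adj, x, marked=None, order=None):
    """Recursive depth-first traversal; returns vertices in the order they are marked."""
    if marked is None:
        marked, order = set(), []
    marked.add(x)           # mark x as completed
    order.append(x)
    for y in adj[x]:
        if y not in marked:
            dfs_recursive(adj, y, marked, order)   # explore edge (x, y)
    return order

# Hypothetical adjacency lists (an assumption, not taken from the slides).
adj = {
    "s": ["c", "b", "a"], "a": [], "b": ["d", "e"], "c": ["f", "b"],
    "d": ["a"], "e": [], "f": ["e"], "g": [],
}
print(dfs_recursive(adj, "s"))
```

The call stack plays the role of the frontier list here. In CPython the default recursion limit is roughly a thousand frames, so the explicit-stack version is the safer choice for graphs with very long paths.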

Depth-first Search
Complexity:
  - Each node is marked on C only once, for a total of O(|V|) operations.
  - Each node is put on F once for each of its incoming edges, for a total of O(|E|) operations.
Overall cost: O(|E|+|V|)

Breadth-first Search
    C = []     // empty list
    F = [s]    // singleton list
    while (F <> []) {
        F’ = [];                   // the next frontier
        for each x in F {
            for each y in adj(x) do
                if (y not in C and y not in F and y not in F’)
                    F’ = F’, y
        }
        C = C, F                   // mark all of F as completed
        F = F’
    }
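
The level-by-level loop can be sketched in Python as below. The function name `bfs_layers` and the adjacency lists are assumptions for illustration. A single `seen` set stands in for the three membership tests on C, F, and the next frontier.

```python
def bfs_layers(adj, s):
    """Breadth-first traversal that returns the successive frontiers as a list of layers."""
    seen = {s}              # everything in C, F, or the next frontier
    frontier = [s]
    layers = []
    while frontier:
        next_frontier = []
        for x in frontier:
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    next_frontier.append(y)
        layers.append(frontier)     # all of F is now completed
        frontier = next_frontier
    return layers

# Hypothetical adjacency lists (an assumption, not taken from the slides).
adj = {
    "s": ["c", "b", "a"], "a": [], "b": ["e", "d"], "c": ["b", "f"],
    "d": ["a"], "e": [], "f": ["e"], "g": [],
}
print(bfs_layers(adj, "s"))
```

Layer i contains exactly the vertices whose shortest path from s uses i edges.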

Breadth-first Search (trace)
[Figure: example graph on vertices s, a, b, c, d, e, f, g, redrawn at each step.]
  Completed: empty              Frontier: s
  Completed: s                  Frontier: c b a
  Completed: s c                Frontier: b a f
  Completed: s c b              Frontier: a f e d
  Completed: s c b a            Frontier: f e d
  Completed: s c b a f          Frontier: e d
  Completed: s c b a f e        Frontier: d
  Completed: s c b a f e d      Frontier: empty

Breadth-first Search
Notice: the frontier list is used like a queue.
Implementation using a queue:
    F.enqueue( s );
    mark s;                      // put s into C
    while (!F.empty()) {
        x = F.dequeue();
        forall y in adj(x) do
            if ( y not marked ) {
                F.enqueue(y);
                mark y;          // put y into C
            }
    }
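
A short Python sketch of the queue-based version, using `collections.deque` from the standard library as the queue. The adjacency lists are a hypothetical graph chosen so that the dequeue order matches the completed order in the trace.

```python
from collections import deque

def bfs(adj, s):
    """Queue-based breadth-first traversal; returns vertices in dequeue order."""
    marked = {s}
    queue = deque([s])
    order = []
    while queue:
        x = queue.popleft()
        order.append(x)
        for y in adj[x]:
            if y not in marked:     # mark on enqueue so each vertex is enqueued once
                marked.add(y)
                queue.append(y)
    return order

# Hypothetical adjacency lists (an assumption, not taken from the slides).
adj = {
    "s": ["c", "b", "a"], "a": [], "b": ["e", "d"], "c": ["b", "f"],
    "d": ["a"], "e": [], "f": ["e"], "g": [],
}
print(bfs(adj, "s"))
```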

Breadth-first Search
Complexity:
  - Each node is enqueued on F only once, for a total of O(|V|) operations.
  - Each edge is touched at most twice (once for each of its end nodes), for a total of O(|E|) operations.
Overall cost: O(|E|+|V|)

Single-Source Shortest Paths

Airline routes
[Figure: weighted graph of airline routes among BOS, ORD, PVD, SFO, JFK, BWI, LAX, DFW, and MIA; each edge is labeled with a numeric weight.]

Single-source shortest path
Suppose we live in Baltimore (BWI) and want the shortest path to San Francisco (SFO). One way to solve this is to solve the single-source shortest path problem: Find the shortest path from BWI to every city.

Single Source Shortest Path Problem
Given: a directed graph G = (V,E) with weight(u,v) ≥ 0 for all edges (u,v) ∈ E
Find: the length of the shortest path from start vertex s to every other vertex in G
[Figure: example graph on vertices s, a, b, c, d, e, f, g with edge weights 1, 2, 4, and 5.]
Example: the shortest path from s to e has length 6 (it’s the path s-b-e).
Question: how to adapt for unweighted graphs?

Single-source shortest path
While we may be interested only in BWI-to-SFO, there are no known algorithms that are asymptotically faster than solving the single-source problem for BWI-to-every-city.

Many applications
Shortest paths model many useful real-world problems:
  - Minimization of latency in the Internet
  - Minimization of cost in power delivery
  - Job and resource scheduling
  - Route planning

Dijkstra’s algorithm
(see Weiss, Section 14.3)
Initialization
  a. Set D(s) = 0
  b. For all vertices v ∈ V, v ≠ s, set D(v) = ∞
  c. Insert all vertices into priority queue P, using distances as the keys
[Figure: the example graph; the queue initially holds s with key 0 and a, b, c, d, e, f, g with key ∞.]

Dijkstra’s algorithm
While P is not empty:
  1. Select the next vertex u to visit: u = P.deleteMin()
  2. Update D(w) for each vertex w in adj(u):
     If D(u) + weight(u,w) < D(w),
       a. D(w) = D(u) + weight(u,w)
       b. Update the priority queue to reflect the new distance for w
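
The steps above can be sketched in Python with the standard `heapq` module. Two things here are assumptions rather than the lecture's method. First, `heapq` has no operation for updating a key, so this sketch swaps in the common "lazy deletion" technique: push a new entry on every improvement and skip stale entries when they are popped. Second, the edge weights are hypothetical, chosen so that the resulting distances agree with the final slide of the trace.

```python
import heapq
import math

def dijkstra(adj, s):
    """Single-source shortest path lengths for non-negative edge weights."""
    dist = {v: math.inf for v in adj}
    dist[s] = 0
    heap = [(0, s)]                 # only the source starts in the heap
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)  # plays the role of P.deleteMin()
        if u in visited:
            continue                # stale entry left behind by an earlier update
        visited.add(u)
        for w, weight in adj[u]:
            if d + weight < dist[w]:
                dist[w] = d + weight
                heapq.heappush(heap, (dist[w], w))
    return dist

# Hypothetical weighted adjacency lists: adj[u] is a list of (neighbor, weight).
adj = {
    "s": [("a", 5), ("b", 2), ("c", 4)], "a": [], "b": [("d", 1), ("e", 4)],
    "c": [("f", 2)], "d": [("a", 1)], "e": [], "f": [], "g": [],
}
print(dijkstra(adj, "s"))
```

With lazy deletion the heap can hold up to |E| entries, so the bound becomes O(|E| log |E|), which is still O(|E| log |V|) because |E| is at most |V|^2.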

Dijkstra’s algorithm (trace)
[Figure: the example graph on vertices s, a, b, c, d, e, f, g, redrawn at each step.]
Each step lists the vertices visited so far and the queue contents as vertex (key).
  Visited: none
    Queue: s (0), a (∞), b (∞), c (∞), d (∞), e (∞), f (∞), g (∞)
  Visited: s (D = 0)
    Queue: b (2), c (4), a (5), d (∞), e (∞), f (∞), g (∞)
  Visited: s (D = 0), b (D = 2)
    Queue: d (3), c (4), a (5), e (6), f (∞), g (∞)
  Visited: s (D = 0), b (D = 2), d (D = 3)
    Queue: c (4), a (4), e (6), f (∞), g (∞)
  Visited: s (D = 0), b (D = 2), d (D = 3), c (D = 4)
    Queue: a (4), e (6), f (6), g (∞)
  Visited: s (D = 0), b (D = 2), d (D = 3), c (D = 4), a (D = 4)
    Queue: e (6), f (6), g (∞)
  ...
  Visited: s (D = 0), b (D = 2), d (D = 3), c (D = 4), a (D = 4), e (D = 6), f (D = 6), g (D = ∞)
    Queue: empty

Features of Dijkstra’s Algorithm
  - A greedy algorithm
  - “Visits” every vertex only once, when it becomes the vertex with minimal distance amongst those still in the priority queue
  - Distances may be revised multiple times: current values represent the ‘best guess’ based on our observations so far
  - Once a vertex is visited we are guaranteed to have found the shortest path to that vertex… why?

Correctness (via contradiction)
Assume u is the first vertex visited such that D(u) is not the length of a shortest path (thus the true shortest path to u must pass through some unvisited vertex).
Let x be the first unvisited vertex on the true shortest path to u.
[Figure: the true shortest path from s to u leaves the visited region at x.]
D(x) must be the length of a shortest path to x, because the vertex before x on that path was visited with a correct distance and its edge to x was examined at that time. Since weights are non-negative, D(x) ≤ Dshortest(u) < D(u).
However, Dijkstra’s always visits the vertex with the smallest distance next, so we can’t possibly visit u before we visit x.

Performance (using a heap)
Initialization of priority queue: O(|V|)
Visitation loop: |V| calls to deleteMin(), each O(log|V|)
Each edge is considered only once during the entire execution, for a total of |E| updates of the priority queue, each O(log|V|)
Overall cost: O(|V|log|V| + |E|log|V|)

Unweighted case
If all weights are 1, there is no need for a priority queue; an ordinary queue suffices, because newly enqueued nodes are never closer to s than the nodes already in the queue.
Visitation loop: |V| calls to dequeue(), each O(1)
Each edge is considered only once during the entire execution, for a total of at most |E| enqueues, each O(1).
Overall cost: O(|V| + |E|)
Question: which algorithm is obtained this way?
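
Replacing the priority queue with a plain queue gives breadth-first search with a distance recorded for each vertex. The sketch below assumes a dictionary of adjacency lists; the function name `bfs_distances` and the sample graph are hypothetical.

```python
from collections import deque

def bfs_distances(adj, s):
    """Number of edges on a shortest path from s to every reachable vertex."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:           # first time w is reached is via a shortest path
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

# Hypothetical adjacency lists (an assumption, not taken from the slides).
adj = {
    "s": ["a", "b", "c"], "a": [], "b": ["d", "e"], "c": ["b", "f"],
    "d": ["a"], "e": [], "f": ["e"], "g": [],
}
print(bfs_distances(adj, "s"))
```

Unreachable vertices, such as g here, are simply absent from the result, which corresponds to a distance of ∞.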

Representing shortest paths
We now have an algorithm to compute the length of the shortest path between s and w. But what if we actually want to find the vertices on the shortest path?
Fact: if s = s0, s1, ..., sn = w is the shortest path from s to w, then s = s0, s1, ..., sn-1 is the shortest path from s to sn-1.
Idea: With each D(w), remember the previous node P(w) = sn-1 on the shortest path.

Dijkstra with predecessors
While the priority queue P is not empty:
  1. Select the next vertex u to visit: u = P.deleteMin()
  2. Update D(w) for each vertex w adjacent to u:
     If D(u) + weight(u,w) < D(w),
       a. D(w) = D(u) + weight(u,w)
       b. P(w) = u
       c. Update the priority queue to reflect the new values for D(w) and P(w)
Note: P(w) denotes the predecessor of w and is distinct from the priority queue P.
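
To show how the predecessors are used, the sketch below records `pred[w]` on every improvement and then walks the predecessors backwards from the target. As assumptions of this sketch, it uses `heapq` with lazy deletion instead of a queue with key updates, and the edge weights are hypothetical values consistent with the example in which the shortest path from s to e is s-b-e with length 6.

```python
import heapq
import math

def dijkstra_with_pred(adj, s):
    """Return (dist, pred), where pred[w] is the vertex before w on a shortest path."""
    dist = {v: math.inf for v in adj}
    pred = {}
    dist[s] = 0
    heap = [(0, s)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue                    # stale heap entry
        visited.add(u)
        for w, weight in adj[u]:
            if d + weight < dist[w]:
                dist[w] = d + weight
                pred[w] = u             # P(w) = u in the slide's notation
                heapq.heappush(heap, (dist[w], w))
    return dist, pred

def shortest_path(adj, s, target):
    """Vertices on a shortest path from s to target, or None if target is unreachable."""
    dist, pred = dijkstra_with_pred(adj, s)
    if dist[target] == math.inf:
        return None
    path = [target]
    while path[-1] != s:
        path.append(pred[path[-1]])     # follow predecessors back to the source
    path.reverse()
    return path

# Hypothetical weighted adjacency lists: adj[u] is a list of (neighbor, weight).
adj = {
    "s": [("a", 5), ("b", 2), ("c", 4)], "a": [], "b": [("d", 1), ("e", 4)],
    "c": [("f", 2)], "d": [("a", 1)], "e": [], "f": [], "g": [],
}
print(shortest_path(adj, "s", "e"))
print(shortest_path(adj, "s", "a"))
```

Storing one predecessor per vertex costs O(|V|) extra space and represents a whole tree of shortest paths rooted at s.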

Negative weights?
Dijkstra’s greedy algorithm can only guarantee shortest paths for non-negative weights.
[Figure: the example graph with an edge of weight -3 added. After s is visited, the queue holds b (2), c (4), a (5), d (∞), e (∞), f (∞), g (∞).]
Visiting b next incorrectly produces a path of distance 2.
How can we address this problem?
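
The failure can be reproduced on a small hypothetical graph. In the sketch below, a distance is frozen once its vertex has been visited, which is one reading of "visited" in the slides and is an assumption of this example. The graph, with an edge from c to b of weight -3, is also an assumption chosen to mirror the situation described above.

```python
import heapq
import math

def dijkstra_frozen(adj, s):
    """Dijkstra variant in which a vertex's distance is final once it is visited."""
    dist = {v: math.inf for v in adj}
    dist[s] = 0
    heap = [(0, s)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        for w, weight in adj[u]:
            # A visited vertex is never revised, even if a shorter path shows up later.
            if w not in visited and d + weight < dist[w]:
                dist[w] = d + weight
                heapq.heappush(heap, (dist[w], w))
    return dist

# Hypothetical graph: the true shortest path to b is s-c-b with length 4 + (-3) = 1.
adj = {"s": [("b", 2), ("c", 4)], "b": [("d", 1)], "c": [("b", -3)], "d": []}
result = dijkstra_frozen(adj, "s")
print(result["b"], result["d"])
```

Vertex b is visited with distance 2 before c is visited, so the cheaper route through c is found too late, and the error carries over to d, which ends at 3 instead of 2.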

The Bellman-Ford algorithm
(see Weiss, Section 14.4)
Returns a boolean:
  - TRUE if and only if there is no negative-weight cycle reachable from the source, that is, no simple cycle <v0, v1,…,vk> with v0 = vk and Σ_{i=1..k} weight(v_{i-1}, v_i) < 0
  - FALSE otherwise
If it returns TRUE, it also produces the shortest paths.
Question: why do we avoid negative cycles?

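Bellman-Ford algorithm
Initialization
  a. Set D(s) = 0
  b. For all vertices v ∈ V, v ≠ s, set D(v) = ∞
[Figure: example graph on vertices s, a, b, c, d, e, f, g with edge weights 5, 4, 2, 1, -2, and -3; initially D(s) = 0 and every other distance is ∞.]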

Bellman-Ford algorithm
Path updates and negative cycle check:
  1. Do |V|-1 times:
       For each edge (u,v) in E,
         If D(u) + weight(u,v) < D(v)
           D(v) = D(u) + weight(u,v)
  2. For each edge (u,v) in E:
       If D(u) + weight(u,v) < D(v)
         return false
  3. Return true
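
A direct Python sketch of the three steps. It returns the distances alongside the boolean, which is a convenience of this example. The edge list is an assumption: the weights are reconstructed so that the result agrees with the distance estimates shown in the trace slides, and the second graph, with a cycle of total weight -1, is hypothetical.

```python
import math

def bellman_ford(vertices, edges, s):
    """Return (ok, dist); ok is False if a negative cycle is reachable from s."""
    dist = {v: math.inf for v in vertices}
    dist[s] = 0
    # Step 1: |V| - 1 passes over all edges.
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # Step 2: one more pass; any further improvement means a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            return False, dist
    # Step 3: no improvement was possible.
    return True, dist

vertices = ["s", "a", "b", "c", "d", "e", "f", "g"]
# Edges in lexicographic order; the weights are an assumption of this sketch.
edges = [
    ("b", "d", 1), ("b", "e", 4), ("c", "b", -3), ("c", "f", 2), ("d", "a", 1),
    ("f", "e", -2), ("s", "a", 5), ("s", "b", 2), ("s", "c", 4),
]
ok, dist = bellman_ford(vertices, edges, "s")
print(ok, dist)

# Hypothetical graph with the negative cycle x -> y -> x (total weight -1).
cyclic_edges = [("s", "x", 1), ("x", "y", 1), ("y", "x", -2)]
ok_cyclic, _ = bellman_ford(["s", "x", "y"], cyclic_edges, "s")
print(ok_cyclic)
```

Vertices that cannot be reached keep the distance ∞, because adding a finite weight to ∞ never compares as smaller than ∞.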

Bellman-Ford path updates
Assume edges are examined in lexicographic order, i.e., (b,d), (b,e), (c,b), (c,f), (d,a), (f,e), (s,a), (s,b), (s,c).
[Figure: example graph on vertices s, a, b, c, d, e, f, g with edge weights 5, 4, 2, 1, -2, and -3.]
Distance estimates as the passes proceed (D(s) = 0 throughout; vertices not listed are still at ∞):
  Iteration 1, end of pass:    a = 5, b = 2, c = 4
  Iteration 2, after (b,d):    a = 5, b = 2, c = 4, d = 3
  Iteration 2, after (b,e):    a = 5, b = 2, c = 4, d = 3, e = 6
  Iteration 2, after (c,b):    a = 5, b = 1, c = 4, d = 3, e = 6
  Iteration 2, after (d,a):    a = 4, b = 1, c = 4, d = 3, e = 6, f = 6
  Iteration 2, after (f,e):    a = 4, b = 1, c = 4, d = 3, e = 4, f = 6
  etcetera...

Bellman-Ford cycle check
After iteration 7: a = 3, b = 1, c = 4, d = 2, e = 4, f = 6, g = ∞ (with D(s) = 0).
The algorithm performs one final iteration over all edges. If any distance changes at this point, a negative cycle exists.
For this graph, the algorithm returns TRUE.

Key features If the graph contains no negative-weight cycles reachable from the source vertex, after |V| - 1 iterations all distance estimates represent shortest paths…why? We assumed edges were considered in the same order for each iteration. Would the algorithm still work if we changed the order for every iteration?
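
One way to explore the last question empirically is to shuffle the edge list before every pass and compare the results. This is a sketch under assumptions: the function name `bellman_ford_shuffled`, the seeds, and the edge weights are all hypothetical, and an experiment like this illustrates the claim rather than proving it.

```python
import math
import random

def bellman_ford_shuffled(vertices, edges, s, seed):
    """Bellman-Ford path updates with a different random edge order in every pass."""
    rng = random.Random(seed)
    dist = {v: math.inf for v in vertices}
    dist[s] = 0
    order = list(edges)
    for _ in range(len(vertices) - 1):
        rng.shuffle(order)              # new examination order for this pass
        for u, v, w in order:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

vertices = ["s", "a", "b", "c", "d", "e", "f", "g"]
# Hypothetical edge weights (an assumption of this sketch).
edges = [
    ("b", "d", 1), ("b", "e", 4), ("c", "b", -3), ("c", "f", 2), ("d", "a", 1),
    ("f", "e", -2), ("s", "a", 5), ("s", "b", 2), ("s", "c", 4),
]
outcomes = {
    tuple(sorted(bellman_ford_shuffled(vertices, edges, "s", seed).items()))
    for seed in range(20)
}
print(len(outcomes))  # number of distinct results across the 20 runs
```

The runs agree because the correctness argument only needs every edge to be examined at least once in each pass; the order within a pass affects how quickly the estimates settle, not where they end up.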

Correctness
Case 1: Graph G=(V,E) doesn’t contain any negative-weight cycles reachable from the source vertex s.
Consider a shortest path p = <s = v0, v1,..., vk>, which must have k ≤ |V| - 1 edges.
To show: after k iterations, D(vk) = length(p).
Thus, after |V|-1 passes, the algorithm correctly computes the length of all shortest paths.
The algorithm returns TRUE, because in the cycle-check phase, no distance changes.

Correctness
By induction on k:
For k = 0: D(s) = 0 after initialization.
For k = i > 0: Assume D(v_{i-1}) is the length of a shortest path after iteration (i-1).
At the i-th iteration:
  - D(v_{i-1}) is not changed
  - edge (v_{i-1}, v_i) is examined, so D(v_i) = D(v_{i-1}) + weight(v_{i-1}, v_i)
Thus, D(v_i) reflects the shortest path to v_i.

Correctness
Case 2: Graph G=(V,E) contains a negative-weight cycle <v0, v1,..., vk = v0> reachable from the source vertex s.
Proof by contradiction: Assume the algorithm returns TRUE.
Thus, D(v_{i-1}) + weight(v_{i-1}, v_i) ≥ D(v_i) for i = 1,…,k.
Summing the inequalities for the cycle:
  Σ_{i=1..k} D(v_{i-1}) + Σ_{i=1..k} weight(v_{i-1}, v_i) ≥ Σ_{i=1..k} D(v_i)
leads to a contradiction: the first sums on each side are equal (each vertex appears exactly once since vk = v0), so the weights would have to sum to at least 0, but the sum of weights on a negative-weight cycle must be less than 0.

Performance
Initialization: O(|V|)
Path update and cycle check: |V| passes, each checking |E| edges, O(|V||E|)
Overall cost: O(|V||E|)
