Fundamental Data Structures and Algorithms


15-211 Fundamental Data Structures and Algorithms: Shortest Paths
Ananda Guna, April 11, 2006

In this lecture
- Recap of the union/find algorithm
- Unweighted and weighted graphs
  - Graphs with no edge costs: simple BFS algorithm
  - Graphs with non-negative edge costs: Dijkstra’s algorithm
- Shortest path in a DAG
- Next: graphs with negative edge costs (Bellman-Ford algorithm)

Understanding Union-Find

Forest and trees
- Each set is a tree: {1}{2}{0,3}{4}{5}
- union(1,2) adds a new subtree to a root: {1,2}{0,3}{4}{5}
- union(0,1) adds a new subtree to a root: {1,2,0,3}{4}{5}
[Figure: the forest after each step.]

Forest and trees - Array Representation
Sets: {1,2,0,3}{4}{5}
find(2) = 1, find(4) = 4
Array representation:
    index:  0   1   2   3   4   5
    s[]:    3  -1   1   1  -1  -1
[Figure: the corresponding forest.]

Find Operation
Sets: {1,2,0,3}{4}{5}
find(0) = 1
    index:  0   1   2   3   4   5
    s[]:    3  -1   1   1  -1  -1
    public int find(int x) {
        if (s[x] < 0) return x;   // a negative entry marks a root
        return find(s[x]);        // otherwise follow the parent link
    }

Union Operation
union(0,2): {1,2}{0,3}{4}{5} becomes {1,2,0,3}{4}{5}
    index:   0   1   2   3   4   5
    before:  3  -1   1  -1  -1  -1
    after:   3  -1   1   1  -1  -1
    public void union(int x, int y) {
        int rx = find(x), ry = find(y);
        if (rx != ry) s[rx] = ry;   // guard: linking a root to itself would make find loop forever
    }

The problem
- Find must walk the path to the root.
- Unlucky combinations of unions can result in long paths.
[Figure: example tree on nodes 1 to 6.]

Path compression for find
- find flattens trees.
- Redirect nodes to point directly to the root.
- Do this while traversing the path from node to root.
[Figure: a tree before and after path compression.]

Path compression
- find flattens trees.
- Redirect nodes to point directly to the root.
- Do this while traversing the path from node to root.
    public int find(int x) {
        if (s[x] < 0) return x;
        return s[x] = find(s[x]);
    }

Union by size
Union-by-size
- Join lesser size to greater.
- Label with sum of sizes.
- Find (with or without path compression): no effect on sizes.
Representational trick
- Non-negative numbers: index of parent.
- Negative numbers: root, with size -s[x].
Performance
- Whenever a union increases the depth of a tree, the resulting tree is at least twice the size of that tree.
- Hence there are at most log(N) steps that increase depth.
- Hence the maximum time for find is O(log(N)).
[Figure: example union by size.]

Union by height
- Union shallow trees into deep trees.
- Tree depth increases only when depths are equal.
- Track path length to root.
    index:  0   1   2   3   4   5
    s[]:    3  -3   1   1  -1  -1
- Tree depth is at most O(log N).
[Figure: the corresponding forest.]

Union by height, details
Different heights
- Join lesser height to greater.
- Do not change height values.
Equal heights
- Join either tree to the other.
- Add one to height of result.
Find
- Without path compression: no effect on heights.
- With path compression: must recalculate height, which can involve looking at many subtrees.

Union by rank
Path compression is easy to implement when we use union-by-size. However, union-by-height is problematic with path compression.
Definition
- Rank of a node is initialized to 0.
- Rank is updated only during the union operation.
Union-by-rank
- Different ranks: join lesser rank to greater; do not change rank value.
- Equal ranks: join either to the other; add one to rank of result.
Find, with path compression
- Yields good performance.
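The rules above can be sketched in Java as follows. This is an illustrative version, not the lecture's code: the class name RankedUnionFind, the separate parent and rank arrays, and the rankOf helper are all assumptions made for the example (the lecture's own class packs everything into one array and uses sizes).

```java
// Sketch: disjoint sets with union by rank and path compression.
// All names here are hypothetical, chosen for illustration only.
class RankedUnionFind {
    private final int[] parent; // parent[i] == i means i is a root
    private final int[] rank;   // rank[i] is only meaningful while i is a root

    RankedUnionFind(int n) {
        parent = new int[n];
        rank = new int[n];      // every rank starts at 0
        for (int i = 0; i < n; i++) parent[i] = i;
    }

    // Find with path compression: each node on the path is re-pointed at the root.
    // Ranks are deliberately left untouched here.
    int find(int x) {
        if (parent[x] != x) parent[x] = find(parent[x]);
        return parent[x];
    }

    // Union by rank: ranks change only in this method.
    void union(int x, int y) {
        int rx = find(x), ry = find(y);
        if (rx == ry) return;            // already in the same set
        if (rank[rx] < rank[ry]) {
            parent[rx] = ry;             // lesser rank joins greater, rank unchanged
        } else if (rank[rx] > rank[ry]) {
            parent[ry] = rx;
        } else {
            parent[ry] = rx;             // equal ranks: join either way...
            rank[rx]++;                  // ...and add one to the rank of the result
        }
    }

    int rankOf(int x) { return rank[x]; }
}
```

Because find never modifies rank, a rank is an upper bound on a tree's height rather than its exact height, which is what makes it compatible with path compression.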

All the code
    class UnionFind {
        int[] u;   // u[i] >= 0: parent of i; u[i] < 0: i is a root of a set of size -u[i]
        UnionFind(int n) {
            u = new int[n];
            for (int i = 0; i < n; i++) u[i] = -1;
        }
        int find(int i) {
            int j, root;
            for (j = i; u[j] >= 0; j = u[j]) ;                    // walk up to the root
            root = j;
            while (u[i] >= 0) { j = u[i]; u[i] = root; i = j; }   // path compression
            return root;
        }
        void union(int i, int j) {
            i = find(i); j = find(j);
            if (i != j) {
                if (u[i] < u[j]) { u[i] += u[j]; u[j] = i; }      // i's set is larger
                else             { u[j] += u[i]; u[i] = j; }
            }
        }
    }

The UnionFind class
    class UnionFind {
        int[] u;
        UnionFind(int n) {
            u = new int[n];
            for (int i = 0; i < n; i++) u[i] = -1;
        }
        int find(int i) { ... }
        void union(int i, int j) { ... }
    }

Iterative find
    int find(int i) {
        int j, root;
        for (j = i; u[j] >= 0; j = u[j]) ;   // first pass: walk up to the root
        root = j;
        while (u[i] >= 0) {                  // second pass: point every node on the path at the root
            j = u[i]; u[i] = root; i = j;
        }
        return root;
    }

Union by size
    void union(int i, int j) {
        i = find(i); j = find(j);
        if (i != j) {
            if (u[i] < u[j]) { u[i] += u[j]; u[j] = i; }   // more negative means larger set
            else             { u[j] += u[i]; u[i] = j; }
        }
    }

Analysis of UnionFind

Analysis of Union-Find
The algorithm
- Union: by rank
- Find: with path compression
[Figure: example tree labeled with node ranks.]

Analysis - Rank tree size
Lemma. After a sequence of union instructions, a node of rank r will have at least 2^r descendants, including itself.
Proof. By induction on r.
- r = 0: 2^0 = 1, and the node is its own descendant.
- r > 0: Let T be the smallest rank-r tree and X be its root. Suppose T was the result of union(T1, T2) and X was the root of T1. The ranks of T1 and T2 must both be r-1: if either Ti had rank r, then T could not be the smallest rank-r tree, and since the union increased the rank, the Ti ranks must be equal. By the induction hypothesis, each Ti has at least 2^{r-1} descendants. The total must therefore be at least 2^r.
Note on path compression: path compression doesn’t affect rank, though it does affect height!

Analysis - Nodes of rank r
Lemma. The number of nodes of rank r is at most N/2^r.
Proof. Each node of rank r roots a subtree of at least 2^r nodes. No other node within that subtree can be of rank r, so all subtrees rooted at rank-r nodes are disjoint. Hence there are at most N/2^r such subtrees.
Examples:
- rank 0: at most N subtrees (i.e., every node is a root).
- rank log(N): at most 1 subtree (of size N).
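As a worked consequence of the two lemmas (spelled out here for convenience, taking logarithms base 2), the rank-size lemma caps the largest possible rank, which is why the last example stops at rank log(N):

```latex
\[
  2^{r} \;\le\; \#\{\text{descendants of a rank-}r\text{ node}\} \;\le\; N
  \quad\Longrightarrow\quad
  r \;\le\; \log_2 N .
\]
```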

Analysis - Ranks on a path Lemma. Node rank always increases from leaf to root. Proof. Obvious if no path compression. With path compression, nodes are promoted from lower levels and hence were of lesser rank.

Time bounds
Variables: M operations, N elements.
Algorithms
- Simple forest representation
  - Worst: find O(N); mixed operations O(MN).
  - Average: tricky.
- Union by height; union by size
  - Worst: find O(log N); mixed operations O(M log N).
  - Average: mixed operations O(M).
- Path compression in find
  - Worst: mixed operations “nearly linear” [analysis in 15-451].

Maze Generator figure 24.2 Initial state: All walls are up, and all cells are in their own sets.

Shortest Paths

Airline routes
[Figure: weighted graph of airline routes among BOS, ORD, PVD, SFO, JFK, BWI, LAX, DFW, and MIA, with edge weights in miles.]

Single-source shortest path
Suppose we live in Baltimore (BWI) and want the shortest path to San Francisco (SFO). Rather than a naïve approach, a better way to solve this is to solve the single-source shortest path problem: that is, find the shortest path from BWI to every city.

Why do we need to find ALL shortest paths?
While we may be interested only in BWI-to-SFO, there are no known algorithms for that single pair that are asymptotically faster than solving the single-source problem for BWI-to-every-city.

Shortest paths
What do we mean by “shortest path”?
- Minimize the number of layovers (i.e., fewest hops): the unweighted shortest-path problem.
- Minimize the total mileage (i.e., fewest frequent-flyer miles ;-): the weighted shortest-path problem.

Many applications
Shortest paths model many useful real-world problems:
- Minimization of latency in the Internet.
- Minimization of cost in power delivery.
- Job and resource scheduling.
- Route planning (MapQuest, Google Maps).

Unweighted Single-Source Shortest Path Algorithm

Unweighted shortest path In order to find the unweighted shortest path, we will mark vertices and edges so that: vertices can be marked with an integer, giving the number of hops from the source node, and edges can be marked as either explored or unexplored. Initially, all edges are unexplored.

Unweighted shortest path
Algorithm:
1. Set i to 0 and mark source node v with 0.
2. Put source node v into a queue L_0.
3. While L_i is not empty:
   - Create a new empty queue L_{i+1}.
   - For each w in L_i do:
     - For each unexplored edge (w,x) do:
       - Mark (w,x) as explored.
       - If x is not marked, mark it with i+1 and enqueue x into L_{i+1}.
   - Increment i.
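A minimal Java sketch of the level-by-level algorithm above. The adjacency-list representation, the class name UnweightedShortestPath, and the use of -1 for "not marked" are assumptions for the example. Explicit edge marking is left out: each vertex is expanded exactly once, so each adjacency entry is scanned only once anyway.

```java
import java.util.*;

// Sketch only: vertices are 0..n-1 and adj.get(w) lists the neighbours of w.
class UnweightedShortestPath {
    // Returns hops[x] = number of hops from source to x, or -1 if x is unreachable.
    static int[] hops(List<List<Integer>> adj, int source) {
        int[] hops = new int[adj.size()];
        Arrays.fill(hops, -1);                        // -1 plays the role of "not marked"
        hops[source] = 0;
        List<Integer> level = new ArrayList<>();      // L_0
        level.add(source);
        for (int i = 0; !level.isEmpty(); i++) {
            List<Integer> next = new ArrayList<>();   // L_{i+1}
            for (int w : level) {
                for (int x : adj.get(w)) {
                    if (hops[x] == -1) {              // x not marked yet
                        hops[x] = i + 1;
                        next.add(x);
                    }
                }
            }
            level = next;                             // move on to the next level
        }
        return hops;
    }
}
```

Every vertex enters a level list at most once and every adjacency entry is examined at most once, which is where the O(|V|+|E|) bound comes from.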

Breadth-first search
This algorithm is a form of breadth-first search. Performance: O(|V|+|E|). Why?
Q: Use this algorithm to find the shortest route (in terms of number of hops) from BWI to SFO.

Use of a queue It is very common to use a queue to keep track of: nodes to be visited next, or nodes that we have already visited. Typically, use of a queue leads to a breadth-first visit order. Breadth-first visit order is “cautious” in the sense that it examines every path of length i before going on to paths of length i+1.

Greedy Algorithms

Greedy Algorithms
In a greedy algorithm, during each phase, a decision is made that appears to be optimal, without regard for future consequences. This “take what you can get now” strategy is the source of the name for this class of algorithms. When a problem can be solved with a greedy algorithm, we are usually quite happy: greedy algorithms often match our intuition and make for relatively painless coding.

Greedy Algorithms
4 ingredients needed:
- Optimization problem: maximization or minimization.
- Can only proceed in stages: no direct solution available.
- Greedy-choice property: a locally optimal (greedy) choice will lead to a globally optimal solution.
- Optimal substructure: an optimal solution to the problem contains within it optimal solutions to subproblems.

Greedy Algorithms
Examples:
- Minimize number of coins
- Find Huffman code
- Prim’s and Kruskal’s algorithms
- Dijkstra’s algorithm for shortest path
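To make the first example concrete, here is a small sketch of greedy change-making. The class name GreedyCoins and the requirement that denominations arrive sorted in decreasing order are assumptions of the sketch. Note that the greedy choice is only optimal for some coin systems: with coins {4, 3, 1} and amount 6 it returns three coins (4, 1, 1) although two (3, 3) suffice, which is exactly a failure of the greedy-choice property.

```java
import java.util.*;

// Sketch: always take the largest coin that still fits.
class GreedyCoins {
    // denominations is assumed to be sorted in decreasing order.
    static List<Integer> makeChange(int amount, int[] denominations) {
        List<Integer> coins = new ArrayList<>();
        for (int d : denominations) {
            while (amount >= d) {    // greedy step: commit to coin d, never reconsider
                amount -= d;
                coins.add(d);
            }
        }
        if (amount != 0) throw new IllegalArgumentException("cannot make exact change");
        return coins;
    }
}
```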

Weighted Single-Source Shortest Path Algorithm (Dijkstra’s Algorithm)

Weighted shortest path Now suppose we want to minimize the total mileage. Breadth-first search does not work! Minimum number of hops does not mean minimum distance. Consider, for example, BWI-to-DFW:

Three 2-hop routes to DFW
[Figure: the airline route map, showing the three 2-hop routes from BWI to DFW.]

Intuition behind Dijkstra’s alg.
For our airline-mileage problem, we can start by guessing that every city is ∞ miles away. Mark each city with this guess. Find all cities one hop away from BWI, and check whether the mileage is less than what is currently marked for that city. If so, then revise the guess. Continue for 2 hops, 3 hops, etc.

Dijkstra’s: Greedy algorithm
Assume that every city is infinitely far away, i.e., every city is ∞ miles away from BWI (except BWI, which is 0 miles away). Now perform something similar to breadth-first search, and optimistically guess that we have found the best path to each city as we encounter it. If we later discover we are wrong and find a better path to a particular city, then update the distance to that city.

Dijkstra’s algorithm
Algorithm initialization:
- Label each node with the distance ∞, except the start node, which is labeled with distance 0. D[v] is the distance label for v.
- Put all nodes into a priority queue Q, using the distances as labels.

Dijkstra’s algorithm, cont’d
    While Q is not empty do:
        u = Q.removeMin
        for each node z one hop away from u do:
            if D[u] + miles(u,z) < D[z] then
                D[z] = D[u] + miles(u,z)
                change key of z in Q to D[z]
Note: use of a priority queue (heap) allows “finished” nodes to be found quickly (in O(log |V|) time).
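The pseudocode above can be sketched in Java roughly as follows. One deliberate substitution: java.util.PriorityQueue has no change-key operation, so instead of changing the key of z this sketch inserts a fresh entry for z and skips outdated entries when they are removed. The class name Dijkstra, the adjacency-list layout, and the use of Long.MAX_VALUE for ∞ are assumptions of the example.

```java
import java.util.*;

// Sketch only: adj.get(u) holds int[]{z, miles} for each edge u -> z,
// and all mileages are assumed to be non-negative.
class Dijkstra {
    static long[] shortestDistances(List<List<int[]>> adj, int source) {
        long[] dist = new long[adj.size()];
        Arrays.fill(dist, Long.MAX_VALUE);          // stands in for infinity
        dist[source] = 0;
        // Queue entries are {distance label, node}, ordered by distance label.
        PriorityQueue<long[]> queue =
            new PriorityQueue<>((a, b) -> Long.compare(a[0], b[0]));
        queue.add(new long[]{0, source});
        while (!queue.isEmpty()) {
            long[] entry = queue.poll();            // u = Q.removeMin
            int u = (int) entry[1];
            if (entry[0] > dist[u]) continue;       // outdated entry: a better label was already processed
            for (int[] edge : adj.get(u)) {
                int z = edge[0];
                long candidate = dist[u] + edge[1]; // D[u] + miles(u,z)
                if (candidate < dist[z]) {
                    dist[z] = candidate;
                    queue.add(new long[]{candidate, z}); // replaces "change key of z in Q"
                }
            }
        }
        return dist;
    }
}
```

With this substitution the queue can hold up to one entry per edge rather than one per node, which leaves the O((n+e) log n) bound unchanged.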

Shortest mileage from BWI
[Figure sequence: the airline route map redrawn after each step of Dijkstra’s algorithm. The distance labels shown in each frame are listed below; BWI is the source, at distance 0.]
    Frame   BOS   ORD   PVD   SFO    JFK   LAX    DFW    MIA
    1       ∞     ∞     ∞     ∞      ∞     ∞      ∞      ∞
    2       ∞     621   ∞     ∞      184   ∞      ∞      946
    3       371   621   328   ∞      184   ∞      1575   946
    4       371   621   328   ∞      184   ∞      1575   946
    5       371   621   328   3075   184   ∞      1575   946
    6       371   621   328   2467   184   ∞      1423   946
    7       371   621   328   2467   184   3288   1423   946
    8       371   621   328   2467   184   2658   1423   946
    9-11    371   621   328   2467   184   2658   1423   946

Find the Shortest Path from S
[Figure: exercise graph with vertices S, b, c, d, e, and g and weighted edges.]

Dijkstra’s Algorithm is greedy
- Optimization problem: of the many feasible solutions, finds the minimum or maximum solution.
- Can only proceed in stages: no direct solution available.
- Greedy-choice property: a locally optimal (greedy) choice will lead to a globally optimal solution.
- Optimal substructure: an optimal solution contains within it optimal solutions to subproblems.

Features of Dijkstra’s Algorithm
- “Visits” every vertex only once, when it becomes the vertex with minimal distance amongst those still in the priority queue.
- Distances may be revised multiple times: current values represent the “best guess” based on our observations so far.
- Once a vertex is visited, we are guaranteed to have found the shortest path to that vertex. Why?

Correctness (by contradiction)
Prove by induction on the stage k of the algorithm.
- Assume v is the vertex visited at stage k+1, and assume that dist(v) is not the length of a shortest path to v.
- Then the true shortest path to v must pass through a fringe vertex x.
  [Figure: source s inside the visited region, fringe vertices x and v, and the unreached region beyond.]
- By the inductive hypothesis, dist(x) must represent a shortest path to x, and so dist(x) ≤ dist_shortest(v) < dist(v).
- But Dijkstra’s always visits the vertex with the smallest distance next, so we can’t possibly visit v before we visit x. A contradiction.

Performance (using a heap)
- Initialization: O(n).
- Visitation loop: n calls to deleteMin(), each O(log n).
- Each edge is considered only once during the entire execution, for a total of e updates of the priority queue, each O(log n).
- Overall cost: O((n+e) log n).

Dijkstra’s summary
- Dijkstra’s algorithm is greedy.
- Dijkstra’s finds shortest paths from the origin to all nodes, even if we are interested only in the shortest path to a single node.
- As presented, Dijkstra’s finds only the length of the shortest path. It is possible to modify it to actually find the nodes on the shortest path.
- Dijkstra’s algorithm assumes that all distances are non-negative.
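One common way to make that modification (a sketch under assumptions, not necessarily the approach intended in the lecture) is to record, each time D[z] is lowered via u, that u is the predecessor of z. The path is then read off by walking predecessors back from the target. The class name ShortestPathTree and the convention that pred holds -1 for the source and for unreached nodes are hypothetical.

```java
import java.util.*;

// Sketch: rebuild a path from a predecessor array filled in during Dijkstra's algorithm.
class ShortestPathTree {
    // pred[z] is the node that last improved D[z]; -1 for the source and for unreached nodes.
    static List<Integer> pathTo(int[] pred, int source, int target) {
        LinkedList<Integer> path = new LinkedList<>();
        for (int v = target; v != -1; v = pred[v]) {
            path.addFirst(v);                      // walk backwards, building the path front-first
        }
        // If the walk did not end at the source, the target was never reached.
        if (path.getFirst() != source) return Collections.<Integer>emptyList();
        return path;
    }
}
```

For example, with pred = {-1, 0, 0, 1} and source 0, the path to node 3 is read back as 3, 1, 0 and returned as 0, 1, 3.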

Shortest Path in a DAG

Shortest Paths in a DAG How do we detect a graph is a DAG? Complexity? If we know the graph is a DAG, can we do better than Dijkstra?

The Idea
- Order the nodes in topological order.
- Arrows can only point left to right.
- Relax the edges in forward order.
- Never have to worry about ancestors.
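The idea can be sketched in Java as below. The topological order is computed here with Kahn's algorithm (repeatedly removing nodes of in-degree 0), which is one possible choice and also answers the detection question: if some node never reaches in-degree 0, the graph has a cycle. The class name DagShortestPath, the adjacency-list layout, and returning null for a cyclic graph are assumptions of the sketch.

```java
import java.util.*;

// Sketch only: adj.get(u) holds int[]{v, weight} for each edge u -> v.
class DagShortestPath {
    // Returns the distance labels from source, or null if the graph is not a DAG.
    static long[] shortestDistances(List<List<int[]>> adj, int source) {
        int n = adj.size();
        int[] indegree = new int[n];
        for (List<int[]> edges : adj)
            for (int[] e : edges) indegree[e[0]]++;

        // Kahn's algorithm: nodes become "ready" once all their predecessors are ordered.
        Deque<Integer> ready = new ArrayDeque<>();
        for (int v = 0; v < n; v++) if (indegree[v] == 0) ready.add(v);
        List<Integer> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            int u = ready.remove();
            order.add(u);
            for (int[] e : adj.get(u))
                if (--indegree[e[0]] == 0) ready.add(e[0]);
        }
        if (order.size() < n) return null;            // leftover nodes lie on or behind a cycle

        long[] dist = new long[n];
        Arrays.fill(dist, Long.MAX_VALUE);            // stands in for infinity
        dist[source] = 0;
        for (int u : order) {                         // relax edges in forward (left-to-right) order
            if (dist[u] == Long.MAX_VALUE) continue;  // u is not reachable from the source
            for (int[] e : adj.get(u))
                dist[e[0]] = Math.min(dist[e[0]], dist[u] + e[1]);
        }
        return dist;
    }
}
```

Both phases touch each node and each edge a constant number of times, so the whole computation is O(|V|+|E|), with no priority queue needed.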

Iteration 1

Iteration 2