CSE 331: Review
Main Steps in Algorithm Design
  Problem Statement: real-world problem
  Problem Definition: precise mathematical definition
  Algorithm
  “Implementation”: data structures
  Analysis: correctness / run time
Stable Matching Problem Gale-Shapley Algorithm
Stable Marriage problem
  Input: Set of men M and women W, with preferences (ranking of potential spouses)
  Output: Stable matching
  Matching: subset of M x W with no polygamy
  Perfect matching: everyone gets married
  Instability: matched pairs (m,w) and (m’,w’) where m prefers w’ to w and w’ prefers m to m’
  Stable matching = perfect matching + no instability
Gale-Shapley Algorithm
  Initially all men and women are free
  While there exists a free woman who can propose
    Let w be such a woman and m be the best man she has not proposed to
    w proposes to m
    If m is free
      (m,w) get engaged
    Else (m,w’) are engaged
      If m prefers w’ to w
        w remains free
      Else
        (m,w) get engaged and w’ is free
  Output the engaged pairs as the final output
  At most n^2 iterations, each with an O(1) time implementation
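The loop above can be sketched in Python as follows. This is a minimal sketch, not the course's reference implementation: the function name `gale_shapley`, the dictionary-based input format, and the use of a queue of free women are assumptions made for illustration. It assumes complete preference lists and equally many men and women. The precomputed `rank` table is one way to get the O(1) time per proposal mentioned on the slide.

```python
from collections import deque

def gale_shapley(women_prefs, men_prefs):
    """Women propose, as on the slide.

    women_prefs[w] is a list of men, best first; men_prefs[m] is a list
    of women, best first (hypothetical input format).
    Returns a dict mapping each man to the woman he ends up engaged to.
    """
    # rank[m][w] = position of w in m's list, so comparing two women is O(1).
    rank = {m: {w: i for i, w in enumerate(prefs)}
            for m, prefs in men_prefs.items()}
    next_choice = {w: 0 for w in women_prefs}  # next man each woman proposes to
    engaged_to = {}                            # man -> woman
    free_women = deque(women_prefs)

    while free_women:
        w = free_women.popleft()
        m = women_prefs[w][next_choice[w]]     # best man w has not proposed to
        next_choice[w] += 1
        if m not in engaged_to:                # m is free
            engaged_to[m] = w
        elif rank[m][w] < rank[m][engaged_to[m]]:
            free_women.append(engaged_to[m])   # w' becomes free
            engaged_to[m] = w
        else:
            free_women.append(w)               # w remains free
    return engaged_to
```

Each woman proposes to each man at most once, which is where the n^2 bound on iterations comes from.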
GS algorithm: Firefly Edition [Figure: example run on Mal, Wash, Simon and Inara, Zoe, Kaylee]
GS algo outputs a stable matching
  Lemma 1: GS outputs a perfect matching S
  Lemma 2: S has no instability
Proof technique du jour: Proof by contradiction
  Assume the negation of what you want to prove
  After some reasoning, arrive at a contradiction
  Image source: 4simpsons.wordpress.com
Two observations
  Obs 1: Once m is engaged he keeps getting engaged to “better” women
  Obs 2: If w proposes to m’ first and then to m (or never proposes to m) then she prefers m’ to m
Proof of Lemma 2
  By contradiction: assume there is an instability (m,w’)
  S contains the pairs (m,w) and (m’,w’)
  m prefers w’ to w
  w’ prefers m to m’
  w’ last proposed to m’
Contradiction by Case Analysis
  Depending on whether w’ had proposed to m or not
  Case 1: w’ never proposed to m
    By Obs 2, w’ prefers m’ to m
    Contradiction: we assumed w’ prefers m to m’
Case 2: w’ had proposed to m
  Case 2.1: m had accepted w’ proposal
    m is finally engaged to w
    Thus, by Obs 1, m prefers w to w’
    Contradiction: we assumed m prefers w’ to w
  Case 2.2: m had rejected w’ proposal
    m was engaged to w’’ (prefers w’’ to w’)
    m is finally engaged to w (by Obs 1, w is w’’ or m prefers w to w’’)
    Thus m prefers w to w’
    Contradiction: we assumed m prefers w’ to w
Overall structure of case analysis
  Did w’ propose to m?
    No: Case 1, contradiction
    Yes: Did m accept w’ proposal?
      Yes: Case 2.1, contradiction
      No: Case 2.2, contradiction
Graph Searching BFS/DFS
O(m+n) BFS Implementation
  Input graph as adjacency list
  BFS(s)
    Array CC[s] = T and CC[w] = F for every w ≠ s
    Set i = 0
    Set L0 = {s}
    While Li is not empty
      Linked list Li+1 = Ø
      For every u in Li
        For every edge (u,w)
          If CC[w] = F then
            CC[w] = T
            Add w to Li+1
      i++
  Version in KT also computes a BFS tree
An illustration [Figure: run of the algorithm on an example graph with nodes 1 to 8]
O(m+n) BFS implementation using a queue
  BFS(s)
    CC[s] = T and CC[w] = F for every w ≠ s      O(n)
    Initialize Q = {s}                           O(1)
    While Q is not empty                         repeated at most once for each vertex u
      Delete the front element u in Q            O(1)
      For every edge (u,w)                       repeated nu times, O(nu) in total
        If CC[w] = F then                        O(1)
          CC[w] = T                              O(1)
          Add w to the back of Q
  Total time for the loop: Σu O(nu) = O(Σu nu) = O(m)
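A minimal Python sketch of the queue-based search above. The function name `bfs` and the dict-of-lists adjacency format are assumptions for illustration; the set `discovered` plays the role of the CC array.

```python
from collections import deque

def bfs(adj, s):
    """adj maps each node to a list of its neighbors (adjacency list).

    Returns the nodes reachable from s in the order they are discovered.
    """
    discovered = {s}          # CC[s] = T, everything else implicitly F
    order = [s]
    queue = deque([s])        # Q = {s}
    while queue:
        u = queue.popleft()   # delete the front element, O(1)
        for w in adj[u]:      # scanned once per vertex u, so O(m) in total
            if w not in discovered:
                discovered.add(w)
                order.append(w)
                queue.append(w)   # add w to the back of Q
    return order
```

Using `deque` keeps both queue operations O(1); popping from the front of a plain Python list would cost O(n) per operation.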
A DFS run using an explicit stack [Figure: stack contents during a DFS run on an example graph with nodes 1 to 8]
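A DFS with an explicit stack can be sketched along the same lines. The function name `dfs` and the input format are assumptions; neighbors are pushed in reverse so that they are explored in adjacency-list order, which is one possible convention and not necessarily the one used in the slide's figure.

```python
def dfs(adj, s):
    """adj maps each node to a list of its neighbors.

    Returns the nodes reachable from s in the order they are first visited.
    """
    visited = set()
    order = []
    stack = [s]
    while stack:
        u = stack.pop()               # take the top of the stack
        if u in visited:
            continue                  # u was pushed more than once
        visited.add(u)
        order.append(u)
        for w in reversed(adj[u]):    # reversed so the first neighbor is popped first
            if w not in visited:
                stack.append(w)
    return order
```

A node can sit on the stack several times, but each edge causes at most one push per direction, so the run time stays O(m+n).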
Topological Ordering
Run of TopOrd algorithm
Greedy Algorithms
Interval Scheduling: Maximum Number of Intervals Schedule by Finish Time
End of Semester blues
  Can only do one thing on any day: what is the maximum number of tasks that you can do?
  Tasks: Write up a term paper, Party!, Exam study, 331 HW, Project
  [Figure: each task drawn as an interval over Monday to Friday]
Schedule by Finish Time
  Set A to be the empty set
  While R is not empty
    Choose i in R with the earliest finish time
    Add i to A
    Remove all requests that conflict with i from R
  Return A* = A
  Implementation
    O(n log n) time: sort intervals such that f(i) ≤ f(i+1)
    O(n) time: build array s[1..n] s.t. s[i] = start time for i
    Do the removal on the fly
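The greedy rule with "removal on the fly" can be sketched as a single pass over the sorted intervals. The function name and the `(start, finish)` tuple format are assumptions, and the sketch treats an interval that starts exactly when another finishes as non-conflicting.

```python
def schedule_by_finish_time(intervals):
    """intervals is a list of (start, finish) pairs.

    Returns a maximum-size list of pairwise non-conflicting intervals.
    """
    chosen = []
    last_finish = float("-inf")
    # Sorting by finish time is the O(n log n) step.
    for start, finish in sorted(intervals, key=lambda iv: iv[1]):
        # "Removal on the fly": an interval that starts before the last
        # chosen one finishes conflicts with it and is simply skipped.
        if start >= last_finish:
            chosen.append((start, finish))
            last_finish = finish
    return chosen
```

Skipping conflicting intervals during the scan replaces the explicit "remove from R" step, so the work after sorting is O(n).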
The final algorithm
  Order tasks by their END time
  [Figure: the tasks Write up a term paper, Party!, Exam study, 331 HW, Project drawn as intervals over Monday to Friday]
Proof of correctness uses “greedy stays ahead”
Interval Scheduling: Maximum Intervals Schedule by Finish Time
Scheduling to minimize lateness
  All the tasks have to be scheduled
  GOAL: minimize maximum lateness
  [Figure: the tasks Write up a term paper, Exam study, Party!, 331 HW, Project scheduled over Monday to Friday]
The Greedy Algorithm
  (Assume jobs sorted by deadline: d1 ≤ d2 ≤ … ≤ dn)
  f = s
  For every i in 1..n do
    Schedule job i from s(i) = f to f(i) = f + ti
    f = f + ti
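A short Python sketch of the earliest-deadline-first loop above. The function name, the `(t_i, d_i)` tuple format, and the returned maximum lateness are assumptions added for illustration; the lateness of a job is taken to be max(0, finish time minus deadline).

```python
def schedule_min_lateness(jobs, s=0):
    """jobs is a list of (t_i, d_i) pairs: processing time and deadline.

    Returns (schedule, max_lateness), where schedule lists
    (start, finish, deadline) for each job in the order it runs.
    """
    f = s                          # f = s
    schedule = []
    max_lateness = 0
    for t, d in sorted(jobs, key=lambda job: job[1]):   # sort by deadline
        schedule.append((f, f + t, d))   # job runs from s(i) = f to f(i) = f + t
        f += t
        max_lateness = max(max_lateness, f - d)
    return schedule, max_lateness
```

The schedule has no idle time by construction and no inversions because jobs run in deadline order.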
Proof of Correctness uses “Exchange argument”
Proved the following
  Any two schedules with 0 idle time and 0 inversions have the same max lateness
  Greedy schedule has 0 idle time and 0 inversions
  There is an optimal schedule with 0 idle time and 0 inversions
Shortest Path in a Graph: non-negative edge weights Dijkstra’s Algorithm
Shortest Path problem
  Input: Directed graph G=(V,E)
    Edge lengths, le for e in E
    “start” vertex s in V
  Output: All shortest paths from s to all nodes in V
  [Figure: example graph on nodes s, u, w with edge lengths 5, 15 and 100]
Dijkstra’s shortest path algorithm
  Input: Directed G=(V,E), le ≥ 0, s in V
  R = {s}, d(s) = 0
  While there is an x not in R with (u,x) in E, u in R
    Pick w that minimizes d’(w)
    Add w to R
    d(w) = d’(w)
  Here d’(w) = min over e=(u,w) in E, u in R of d(u) + le
  [Figure: example run on nodes s, u, w, x, y, z giving d(s) = 0, d(u) = 1, d(w) = 2, d(x) = 2, d(y) = 3, d(z) = 4, with the shortest paths highlighted]
Dijkstra’s shortest path algorithm (formal)
  Input: Directed G=(V,E), le ≥ 0, s in V
  S = {s}, d(s) = 0
  While there is a v not in S with (u,v) in E, u in S      at most n iterations
    Pick w that minimizes d’(w)                             O(m) time
    Add w to S
    d(w) = d’(w)
  O(mn) time bound is trivial
  O(m log n) time implementation is possible
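One way to reach the O(m log n) bound is a binary heap, sketched below with Python's `heapq`. The function name, the adjacency format, and the "lazy deletion" of stale heap entries are assumptions of this sketch, not something the slides specify.

```python
import heapq

def dijkstra(adj, s):
    """adj maps u to a list of (w, length) pairs with length >= 0.

    Returns a dict d with d[v] = shortest-path distance from s to v,
    for every v reachable from s.
    """
    d = {}                        # keys of d form the explored set S
    heap = [(0, s)]               # candidate (d'(w), w) pairs
    while heap:
        dist_u, u = heapq.heappop(heap)   # pick w minimizing d'(w)
        if u in d:
            continue              # stale entry: u was already added to S
        d[u] = dist_u             # d(w) = d'(w)
        for w, length in adj.get(u, []):
            if w not in d:
                heapq.heappush(heap, (dist_u + length, w))
    return d
```

Each edge triggers at most one push, so the heap holds O(m) entries and every heap operation costs O(log m) = O(log n).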
Proved that d’(v) is best when v is added
Minimum Spanning Tree Kruskal/Prim
Minimum Spanning Tree (MST)
  Input: A connected graph G=(V,E), ce > 0 for every e in E
  Output: A tree containing all V that minimizes the sum of edge weights
Kruskal’s Algorithm (Joseph B. Kruskal)
  Input: G=(V,E), ce > 0 for every e in E
  T = Ø
  Sort edges in increasing order of their cost
  Consider edges in sorted order
    If an edge can be added to T without adding a cycle then add it to T
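A minimal sketch of Kruskal's algorithm in Python. The slide does not say how to test whether an edge closes a cycle; this sketch uses a simple union-find structure for that, which is an assumption of the example, as are the function name and the `(cost, u, v)` edge format with nodes numbered 0 to n-1.

```python
def kruskal(n, edges):
    """n is the number of nodes (labelled 0..n-1); edges is a list of
    (cost, u, v) triples. Returns the list of edges in the spanning tree.
    """
    parent = list(range(n))       # union-find forest, one tree per component

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving keeps trees shallow
            x = parent[x]
        return x

    tree = []
    for cost, u, v in sorted(edges):        # increasing order of cost
        ru, rv = find(u), find(v)
        if ru != rv:                        # u and v are in different components,
            parent[ru] = rv                 # so the edge adds no cycle
            tree.append((cost, u, v))
    return tree
```

Two endpoints in the same component are already connected in T, so adding the edge would create a cycle; that is exactly the check `ru != rv` avoids.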
Prim’s algorithm (Robert Prim)
  Similar to Dijkstra’s algorithm
  Input: G=(V,E), ce > 0 for every e in E
  S = {s}, T = Ø
  While S is not the same as V
    Among edges e = (u,w) with u in S and w not in S, pick one with minimum cost
    Add w to S, e to T
  [Figure: example graph with edge costs 0.5, 1, 2, 3, 50 and 51]
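Prim's algorithm can be sketched with a heap of crossing edges, mirroring the Dijkstra structure. The function name, the adjacency format, and the heap are assumptions of this sketch; the graph is assumed to be connected and undirected, with each edge listed from both endpoints.

```python
import heapq

def prim(adj, s):
    """adj maps u to a list of (cost, w) pairs.

    Returns the tree edges as (cost, u, w) triples in the order they are added.
    """
    in_S = {s}                                    # S = {s}
    tree = []                                     # T = empty set
    heap = [(c, s, w) for c, w in adj[s]]         # edges leaving S
    heapq.heapify(heap)
    while heap and len(in_S) < len(adj):          # while S is not the same as V
        c, u, w = heapq.heappop(heap)             # cheapest candidate edge
        if w in in_S:
            continue                              # no longer a crossing edge
        in_S.add(w)                               # add w to S, e to T
        tree.append((c, u, w))
        for c2, x in adj[w]:
            if x not in in_S:
                heapq.heappush(heap, (c2, w, x))
    return tree
```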
Cut Property Lemma for MSTs
  Assumption: All edge costs are distinct
  Condition: S and V \ S are non-empty
  Cheapest crossing edge is in all MSTs
Divide & Conquer
Sorting Merge-Sort
Sorting
  Given n numbers, order them from smallest to largest
  Works for any set of elements on which there is a total order
Mergesort algorithm
  Input: a1, a2, …, an
  Output: Numbers in sorted order
  MergeSort( a, n )
    If n = 1 return the order a1
    If n = 2 return the order min(a1,a2); max(a1,a2)
    aL = a1,…, an/2
    aR = an/2+1,…, an
    return MERGE ( MergeSort(aL, n/2), MergeSort(aR, n/2) )
An example run
  Input:               51 1 | 19 100 | 2 8 | 4 3
  Sorted pairs:        1 51 | 19 100 | 2 8 | 3 4
  After first MERGE:   1 19 51 100 | 2 3 4 8
  After final MERGE:   1 2 3 4 8 19 51 100
Correctness of MergeSort: by induction on n
  Input: a1, a2, …, an
  Output: Numbers in sorted order
  MergeSort( a, n )
    If n = 1 return the order a1
    If n = 2 return the order min(a1,a2); max(a1,a2)
    aL = a1,…, an/2
    aR = an/2+1,…, an
    return MERGE ( MergeSort(aL, n/2), MergeSort(aR, n/2) )
  Inductive step follows from correctness of MERGE
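The pseudocode translates to Python roughly as follows. The function names are assumptions, and the sketch uses only the n ≤ 1 base case and integer division for the split, so it also handles lengths that are not powers of two.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list in O(n) time."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])     # at most one of these two is non-empty
    out.extend(right[j:])
    return out

def merge_sort(a):
    """Return a new sorted list with the elements of a."""
    if len(a) <= 1:          # base case: nothing to sort
        return list(a)
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))
```

By induction both recursive calls return sorted lists, so correctness reduces to `merge` being correct on sorted inputs.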
Counting Inversions Merge-Count
Mergesort-Count algorithm
  Input: a1, a2, …, an
  Output: Numbers in sorted order + #inversions
  MergeSortCount( a, n )
    If n = 1 return ( 0 , a1 )
    If n = 2 return ( a1 > a2, min(a1,a2); max(a1,a2) )
    aL = a1,…, an/2
    aR = an/2+1,…, an
    (cL, aL) = MergeSortCount(aL, n/2)
    (cR, aR) = MergeSortCount(aR, n/2)
    (c, a) = MERGE-COUNT(aL,aR)      O(n): counts #crossing-inversions + MERGE
    return (c+cL+cR, a)
  Run time: T(2) = c and T(n) = 2T(n/2) + cn, so O(n log n) time
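A Python sketch of the counting variant. The function names are assumptions. The key line is the crossing-inversion count: when an element of the right half is output before the remaining elements of the left half, it forms an inversion with each of them.

```python
def merge_count(left, right):
    """Merge two sorted lists and count pairs (x in left, y in right) with x > y."""
    out = []
    i = j = crossing = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
            crossing += len(left) - i   # right[j] beats every remaining left element
    out.extend(left[i:])
    out.extend(right[j:])
    return crossing, out

def merge_sort_count(a):
    """Return (number of inversions in a, sorted copy of a)."""
    if len(a) <= 1:
        return 0, list(a)
    mid = len(a) // 2
    c_left, a_left = merge_sort_count(a[:mid])
    c_right, a_right = merge_sort_count(a[mid:])
    c, merged = merge_count(a_left, a_right)
    return c + c_left + c_right, merged
```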
Closest Pair of Points Algorithm
Closest pairs of points
  Input: n 2-D points P = {p1,…,pn}; pi = (xi,yi)
  d(pi,pj) = ( (xi-xj)^2 + (yi-yj)^2 )^(1/2)
  Output: Points p and q that are closest
The algorithm
  Input: n 2-D points P = {p1,…,pn}; pi = (xi,yi)
  Sort P to get Px and Py                                   O(n log n)
  Closest-Pair (Px, Py)
    If n < 4 then find closest pair by brute-force
    Q is first half of Px and R is the rest
    Compute Qx, Qy, Rx and Ry
    (q0,q1) = Closest-Pair (Qx, Qy)
    (r0,r1) = Closest-Pair (Rx, Ry)
    δ = min ( d(q0,q1), d(r0,r1) )
    S = points (x,y) in P s.t. |x – x*| < δ
    return Closest-in-box (S, (q0,q1), (r0,r1))              assume can be done in O(n)
  Run time: every step outside the two recursive calls takes O(n), so T(< 4) = c and T(n) = 2T(n/2) + cn
  Total: O(n log n) + T(n), which is O(n log n) overall
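A compact Python sketch of the divide-and-conquer scheme above, under several assumptions: the points are distinct tuples, there are at least two of them, x* is taken to be the largest x-coordinate in Q, and the Closest-in-box step compares each point in the strip with the next 15 points in y-order (the bound used in the Kleinberg-Tardos analysis). The helper names `_brute` and `_closest` are hypothetical.

```python
import math

def _brute(pts):
    """Closest pair among a handful of points by checking all pairs."""
    best = (math.inf, None, None)
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            d = math.dist(pts[i], pts[j])
            if d < best[0]:
                best = (d, pts[i], pts[j])
    return best

def _closest(px, py):
    n = len(px)
    if n < 4:
        return _brute(px)
    mid = n // 2
    qx, rx = px[:mid], px[mid:]              # Q is the first half of Px
    x_star = qx[-1][0]                       # assumed meaning of x*
    in_q = set(qx)
    qy = [p for p in py if p in in_q]        # Qy and Ry in O(n), still y-sorted
    ry = [p for p in py if p not in in_q]
    best = min(_closest(qx, qy), _closest(rx, ry), key=lambda t: t[0])
    delta = best[0]
    strip = [p for p in py if abs(p[0] - x_star) < delta]   # S, sorted by y
    for i, p in enumerate(strip):            # Closest-in-box
        for q in strip[i + 1:i + 16]:        # only the next 15 points matter
            d = math.dist(p, q)
            if d < best[0]:
                best = (d, p, q)
    return best

def closest_pair(points):
    """points: list of at least two distinct (x, y) tuples.
    Returns (distance, p, q) for a closest pair."""
    px = sorted(points)                          # sorted by x
    py = sorted(points, key=lambda p: p[1])      # sorted by y
    return _closest(px, py)
```

`math.dist` requires Python 3.8 or later.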
Dynamic Programming
Weighted Interval Scheduling Scheduling Algorithm
Weighted Interval Scheduling
  Input: n jobs (si,ti,vi)
  Output: A schedule S s.t. no two jobs in S have a conflict
  Goal: max Σi in S vi
  Assume: jobs are sorted by their finish time
A recursive algorithm
  Compute-Opt(j)
    If j = 0 then return 0
    return max { vj + Compute-Opt( p(j) ), Compute-Opt( j-1 ) }
  Proof of correctness by induction on j
    Correct for j = 0
    By induction, Compute-Opt( p(j) ) = OPT( p(j) ) and Compute-Opt( j-1 ) = OPT( j-1 )
    OPT(j) = max { vj + OPT( p(j) ), OPT(j-1) }
Exponential Running Time
  Example: jobs 1 to 5 with p(j) = j-2
  Recursion tree: OPT(5) calls OPT(3) and OPT(4); OPT(4) calls OPT(2) and OPT(3); OPT(3) calls OPT(1) and OPT(2); and so on
  Only 5 OPT values!
  Formal proof: Ex.
Bounding # recursions
  M-Compute-Opt(j)
    If j = 0 then return 0
    If M[j] is not null then return M[j]
    M[j] = max { vj + M-Compute-Opt( p(j) ), M-Compute-Opt( j-1 ) }
    return M[j]
  Whenever a recursive call is made an M value is assigned
  At most n values of M can be assigned
  O(n) overall
Property of OPT
  OPT(j) = max { vj + OPT( p(j) ), OPT(j-1) }
  Given OPT(1), …, OPT(j-1), one can compute OPT(j)
Recursion + memory = Iteration
  Iteratively compute the OPT(j) values
  Iterative-Compute-Opt
    M[0] = 0
    For j = 1,…,n
      M[j] = max { vj + M[p(j)], M[j-1] }
  M[j] = OPT(j)
  O(n) run time
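The iterative table fill can be sketched in Python as below. Some assumptions: jobs are `(start, finish, value)` triples, p(j) is taken to be the largest index i < j whose job finishes no later than job j starts (the slides use p(j) without restating its definition), and p(j) is found by binary search. Sorting and the binary searches add an O(n log n) preprocessing cost on top of the O(n) loop.

```python
from bisect import bisect_right

def weighted_interval_scheduling(jobs):
    """jobs is a list of (start, finish, value) triples.

    Returns OPT(n), the maximum total value of a conflict-free schedule.
    """
    jobs = sorted(jobs, key=lambda job: job[1])    # sort by finish time
    finishes = [f for _, f, _ in jobs]
    n = len(jobs)
    M = [0] * (n + 1)                              # M[0] = 0
    for j in range(1, n + 1):
        start, _, value = jobs[j - 1]
        # p = number of earlier jobs finishing by `start`, which equals the
        # largest (1-indexed) compatible job index because jobs are sorted.
        p = bisect_right(finishes, start, 0, j - 1)
        M[j] = max(value + M[p], M[j - 1])
    return M[n]
```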
Knapsack Problem Knapsack Algorithm
Subset Sum Problem Maximize the weight packed into a bag Capacity: *ref: Images from http://minecraft.gamepedia.com
Subset Sum Problem
  Input: A set of n items each with weight wi > 0 and a capacity W
  Output: A subset of the n items with maximum sum of weights under the constraint that the sum of weights is ≤ W
Knapsack Problem Maximize the value packed into the bag Capacity: W=40 *ref: Images from http://minecraft.gamepedia.com
Knapsack Problem
  Input: A set of n items each with weight wi > 0 and value vi > 0, and a capacity W
  Output: A subset of the n items with maximum sum of values under the constraint that the sum of weights is ≤ W
Dynamic Programming Algorithm: Subset Sum
  OPT(i,w) = Maximum weight packed given the first i items and capacity w
  For each OPT(i,w), decide if item i should be packed or not:
    If item i can’t fit (wi > w) then OPT(i,w) = OPT(i-1,w)
    If item i can fit: OPT(i,w) = max { OPT(i-1,w), wi + OPT(i-1, w-wi) }
      OPT(i-1,w): don’t pack item i
      wi + OPT(i-1, w-wi): pack item i
  Output OPT(n,W). With some bookkeeping, can also output the packed set
Dynamic Programming Algorithm: Knapsack Problem
  OPT(i,w) = Maximum value packed given the first i items and capacity w
  Decide if item i should be packed or not:
    If item i can’t fit (wi > w) then OPT(i,w) = OPT(i-1,w)
    If item i can fit: OPT(i,w) = max { OPT(i-1,w), vi + OPT(i-1, w-wi) }
      OPT(i-1,w): don’t pack item i
      vi + OPT(i-1, w-wi): pack item i
  Output OPT(n,W). With some bookkeeping, can also output the packed set
Runtime
  OPT(i,w) = max { OPT(i-1,w), vi + OPT(i-1, w-wi) }
  OPT(i,w) is computed in constant time
  nW entries in OPT to be computed, for an O(nW) runtime
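The table fill behind the O(nW) bound can be sketched as follows. The function names are assumptions, and the sketch assumes integer weights and an integer capacity so that w can index a table. Subset Sum is handled as the special case where each item's value equals its weight.

```python
def knapsack(weights, values, W):
    """Return OPT(n, W): the maximum total value with total weight <= W."""
    n = len(weights)
    # OPT[i][w] for 0 <= i <= n and 0 <= w <= W; row 0 is all zeros.
    OPT = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        wi, vi = weights[i - 1], values[i - 1]
        for w in range(W + 1):
            OPT[i][w] = OPT[i - 1][w]                 # don't pack item i
            if wi <= w:                               # item i can fit
                OPT[i][w] = max(OPT[i][w], vi + OPT[i - 1][w - wi])  # pack item i
    return OPT[n][W]

def subset_sum(weights, W):
    """Subset Sum as Knapsack with v_i = w_i."""
    return knapsack(weights, weights, W)
```

Each of the (n+1)(W+1) entries is filled with a constant number of operations, matching the count on the slide.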
Shortest Path in a Graph Bellman-Ford
Shortest Path Problem
  Input: (Directed) graph G=(V,E) where every edge e has a cost ce (can be < 0), and t in V
  Output: Shortest path from every s to t
  Assume that G has no negative cycle
  [Figure: example graph from s to t with edge costs 1, 100, -1000 and 899; with a negative cycle, the shortest path has cost negative infinity]
Recurrence Relation
  OPT(i,u) = cost of shortest path from u to t with at most i edges
  OPT(i,u) = min { OPT(i-1,u), min over (u,w) in E of { cu,w + OPT(i-1,w) } }
    OPT(i-1,u): path uses ≤ i-1 edges
    min over (u,w) in E: best path through all neighbors
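The recurrence can be evaluated round by round, keeping only the previous row of the table. This is a minimal sketch: the function name and the `(u, w, cost)` edge format are assumptions, and it relies on the standard fact that, with no negative cycle, some shortest path uses at most n-1 edges, so n-1 rounds suffice.

```python
import math

def bellman_ford(nodes, edges, t):
    """nodes is a list of node labels; edges is a list of (u, w, cost)
    triples for directed edges u -> w. Assumes no negative cycle.

    Returns a dict mapping each u to the cost of a shortest u-t path
    (math.inf if t is unreachable from u).
    """
    opt = {u: math.inf for u in nodes}       # OPT(0, u)
    opt[t] = 0
    for _ in range(len(nodes) - 1):          # i = 1, ..., n-1
        new = dict(opt)                      # start from OPT(i-1, u)
        for u, w, cost in edges:
            if opt[w] + cost < new[u]:       # best path through neighbor w
                new[u] = opt[w] + cost
        opt = new
    return opt
```

Each round scans every edge once, so the sketch runs in O(mn) time.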
P vs NP
P vs NP question
  P: problems that can be solved by poly time algorithms
  NP: problems that have polynomial time verifiable witness to optimal solution
  Alternate NP definition: Guess witness and verify!
  Is P=NP?