CSE373: Data Structures & Algorithms Lecture 10: Amortized Analysis and Disjoint Sets Aaron Bauer Winter 2014.



Amortized
Recall our plain-old stack implemented as an array that doubles its size if it runs out of room
–How can we claim push is O(1) time if resizing is O(n) time?
–We can’t, but we can claim it’s an O(1) amortized operation
What does amortized mean?
When are amortized bounds good enough?
How can we prove an amortized bound?
Will just do two simple examples
–Text has more sophisticated examples and proof techniques
–Idea of how amortized describes average cost is essential

Amortized Complexity
If a sequence of M operations takes O(M f(n)) time, we say the amortized runtime is O(f(n))
Amortized bound: worst-case guarantee over sequences of operations
–Example: If any n operations take O(n), then amortized O(1)
–Example: If any n operations take O(n^3), then amortized O(n^2)
The worst-case time per operation can be larger than f(n)
–As long as the worst case is always “rare enough” in any sequence of operations
Amortized guarantee ensures the average time per operation for any sequence is O(f(n))

“Building Up Credit”
Can think of preceding “cheap” operations as building up “credit” that can be used to “pay for” later “expensive” operations
Because any sequence of operations must be under the bound, enough “cheap” operations must come first
–Else a prefix of the sequence, which is also a sequence, would violate the bound

Example #1: Resizing stack
A stack implemented with an array where we double the size of the array if it becomes full
Claim: Any sequence of push / pop / isEmpty is amortized O(1)
Need to show any sequence of M operations takes time O(M)
–Recall the non-resizing work is O(M) (i.e., M * O(1))
–The resizing work is proportional to the total number of element copies we do for the resizing
–So it suffices to show that: after M operations, we have done < 2M total element copies
(So the average number of copies per operation is bounded by a constant)

Amount of copying
After M operations, we have done < 2M total element copies
Let n be the size of the array after M operations
–Then we have done a total of: n/2 + n/4 + n/8 + … + INITIAL_SIZE < n element copies
–Because we must have done at least enough push operations to cause resizing up to size n: M ≥ n/2
–So 2M ≥ n > number of element copies
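The copy-counting argument can be checked empirically. The following is a minimal sketch, not the course's stack class: the class name `ResizingStack`, the `copies` counter, and the starting capacity of 4 are all assumptions made for illustration.

```java
import java.util.NoSuchElementException;

// Illustrative array-backed stack that doubles when full and counts element copies.
class ResizingStack {
    private static final int INITIAL_SIZE = 4; // assumed starting capacity
    private int[] data = new int[INITIAL_SIZE];
    private int size = 0;
    private long copies = 0; // total elements copied across all resizes

    void push(int x) {
        if (size == data.length) {
            // Array is full: allocate double the space and copy every element over.
            int[] bigger = new int[2 * data.length];
            for (int i = 0; i < size; i++) {
                bigger[i] = data[i];
                copies++;
            }
            data = bigger;
        }
        data[size++] = x;
    }

    int pop() {
        if (size == 0) throw new NoSuchElementException("stack is empty");
        return data[--size];
    }

    boolean isEmpty() { return size == 0; }

    long copies() { return copies; }

    public static void main(String[] args) {
        int m = 1_000_000;
        ResizingStack s = new ResizingStack();
        for (int i = 0; i < m; i++) s.push(i);
        // The claim from the slide: copies < 2M.
        System.out.println("operations = " + m + ", copies = " + s.copies());
    }
}
```

For any number M of pushes, `copies()` should stay below 2M, which is the bound the slide derives.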

Other approaches
If array grows by a constant amount (say 1000), operations are not amortized O(1)
–After O(M) operations, you may have done Θ(M^2) copies
If array shrinks when 1/2 empty, operations are not amortized O(1)
–Terrible case: pop once and shrink, push once and grow, pop once and shrink, …
If array shrinks when 3/4 empty, it is amortized O(1)
–Proof is more complicated, but basic idea remains: by the time an expensive operation occurs, many cheap ones occurred
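To see where the quadratic bound for constant growth comes from, here is a rough count. It assumes, purely for illustration, that the array starts with capacity c, grows by c each time it fills, and that all M operations are pushes; the symbols c and k do not appear on the slide.

```latex
\[
\text{copies after } M \text{ pushes}
  = \sum_{j=1}^{k} jc
  = c\,\frac{k(k+1)}{2},
\qquad k = \left\lfloor \frac{M-1}{c} \right\rfloor ,
\]
\[
\text{so copies} \approx \frac{M^2}{2c} = \Theta(M^2)
\quad \text{for any constant } c \text{ (e.g. } c = 1000\text{)}.
\]
```

The j-th resize copies jc elements, so the total grows quadratically in M, whereas doubling keeps the total below 2M.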

Example #2: Queue with two stacks
A clever and simple queue implementation using only stacks

    class Queue<E> {
        Stack<E> in = new Stack<E>();
        Stack<E> out = new Stack<E>();
        void enqueue(E x) { in.push(x); }
        E dequeue() {
            if (out.isEmpty()) {
                // out is empty: move everything from in onto out, reversing the order
                while (!in.isEmpty()) {
                    out.push(in.pop());
                }
            }
            return out.pop();
        }
    }

After enqueue A, B, C: in = A, B, C (C on top); out is empty

Example #2: Queue with two stacks (continued)
After dequeue (returns A): in is empty; out = C, B (B on top)

Example #2: Queue with two stacks (continued)
After enqueue D, E: in = D, E (E on top); out = C, B (B on top)

Example #2: Queue with two stacks (continued)
After dequeue twice (returns B, then C): in = D, E (E on top); out is empty

Example #2: Queue with two stacks (continued)
After dequeue again (returns D): in is empty; out = E

Correctness and usefulness
If x is enqueued before y, then x will be popped from in later than y and therefore popped from out sooner than y
–So it is a queue
Example:
–Wouldn’t it be nice to have a queue of t-shirts to wear instead of a stack (like in your dresser)?
–So have two stacks
in: stack where t-shirts go after you wash them
out: stack of t-shirts to wear
if out is empty, reverse in into out

Analysis
dequeue is not O(1) worst-case because out might be empty and in may have lots of items
But if the stack operations are (amortized) O(1), then any sequence of queue operations is amortized O(1)
–The total amount of work done per element is 1 push onto in, 1 pop off of in, 1 push onto out, 1 pop off of out
–When you reverse n elements, there were n earlier O(1) enqueue operations to average with
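The "at most four stack operations per element" argument can be made concrete by counting. This is a sketch under assumptions: the class name `TwoStackQueue` and the `stackOps` counter are invented here, and `ArrayDeque` stands in for the stack type (note that it does not accept null elements).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.NoSuchElementException;

// Queue built from two stacks; counts stack pushes and pops to illustrate the amortized bound.
class TwoStackQueue<E> {
    private final Deque<E> in = new ArrayDeque<>();
    private final Deque<E> out = new ArrayDeque<>();
    private long stackOps = 0; // hypothetical instrumentation, not part of the lecture code

    void enqueue(E x) {
        in.push(x);
        stackOps++;
    }

    E dequeue() {
        if (out.isEmpty()) {
            // Reverse in onto out: one pop and one push per element moved.
            while (!in.isEmpty()) {
                out.push(in.pop());
                stackOps += 2;
            }
        }
        if (out.isEmpty()) throw new NoSuchElementException("queue is empty");
        stackOps++;
        return out.pop();
    }

    long stackOps() { return stackOps; }

    public static void main(String[] args) {
        TwoStackQueue<String> q = new TwoStackQueue<>();
        for (String s : new String[]{"A", "B", "C"}) q.enqueue(s);
        System.out.println(q.dequeue()); // A
        q.enqueue("D");
        q.enqueue("E");
        for (int i = 0; i < 4; i++) System.out.println(q.dequeue()); // B, C, D, E
        System.out.println("stack operations: " + q.stackOps()); // 20 = 4 per element
    }
}
```

The demo replays the same sequence as the slides: five elements pass through the queue and exactly 4 × 5 = 20 stack operations are performed, even though individual dequeue calls vary in cost.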

Amortized useful?
When the average per operation is all we care about (i.e., sum over all operations), amortized is perfectly fine
–This is the usual situation
If we need every operation to finish quickly (e.g., in a web server), amortized bounds may be too weak
While amortized analysis is about averages, we are averaging cost-per-operation on worst-case input
–Contrast: Average-case analysis is about averages across possible inputs. Example: if all initial permutations of an array are equally likely, then quicksort is O(n log n) on average even though on some inputs it is O(n^2)

Not always so simple
Proofs for amortized bounds can be much more complicated
Example: Splay trees are dictionaries with amortized O(log n) operations
–No extra height field like AVL trees
–See Chapter 4.5 if curious
For more complicated examples, the proofs need much more sophisticated invariants and “potential functions” to describe how earlier cheap operations build up “energy” or “money” to “pay for” later expensive operations
–See Chapter 11 if curious
But complicated proofs have nothing to do with the code!

The plan
What are disjoint sets
–And how are they “the same thing” as equivalence relations
The union-find ADT for disjoint sets
Applications of union-find
Next lecture:
–Basic implementation of the ADT with “up trees”
–Optimizations that make the implementation much faster

Disjoint sets
A set is a collection of elements (no repeats)
Two sets are disjoint if they have no elements in common
–S1 ∩ S2 = ∅
Example: {a, e, c} and {d, b} are disjoint
Example: {x, y, z} and {t, u, x} are not disjoint

Partitions
A partition P of a set S is a set of sets {S1,S2,…,Sn} such that every element of S is in exactly one Si
Put another way:
–S1 ∪ S2 ∪ … ∪ Sn = S
–i ≠ j implies Si ∩ Sj = ∅ (sets are disjoint with each other)
Example:
–Let S be {a,b,c,d,e}
–One partition: {a}, {d,e}, {b,c}
–Another partition: {a,b,c}, ∅, {d}, {e}
–A third: {a,b,c,d,e}
–Not a partition: {a,b,d}, {c,d,e} (d appears in two sets)
–Not a partition of S: {a,b}, {e,c} (d is in no set)

Binary relations
S x S is the set of all pairs of elements of S
–Example: If S = {a,b,c} then S x S = {(a,a),(a,b),(a,c),(b,a),(b,b),(b,c),(c,a),(c,b),(c,c)}
A binary relation R on a set S is any subset of S x S
–Write R(x,y) to mean (x,y) is “in the relation”
–(Unary, ternary, quaternary, … relations defined similarly)
Examples for S = people-in-this-room
–Sitting-next-to-each-other relation
–First-sitting-right-of-second relation
–Went-to-same-high-school relation
–Same-gender relation
–First-is-younger-than-second relation

Properties of binary relations
A binary relation R over set S is reflexive means R(a,a) for all a in S
A binary relation R over set S is symmetric means R(a,b) if and only if R(b,a) for all a,b in S
A binary relation R over set S is transitive means if R(a,b) and R(b,c) then R(a,c) for all a,b,c in S
Examples for S = people-in-this-room
–Sitting-next-to-each-other relation
–First-sitting-right-of-second relation
–Went-to-same-high-school relation
–Same-gender relation
–First-is-younger-than-second relation

Equivalence relations
A binary relation R is an equivalence relation if R is reflexive, symmetric, and transitive
Examples
–Same gender
–Connected roads in the world
–Graduated from same high school?
–…

Punch-line
Every partition induces an equivalence relation
Every equivalence relation induces a partition
Suppose P = {S1,S2,…,Sn} is a partition
–Define R(x,y) to mean x and y are in the same Si
–Then R is an equivalence relation
Suppose R is an equivalence relation over S
–Consider a set of sets S1,S2,…,Sn where
(1) x and y are in the same Si if and only if R(x,y)
(2) Every x is in some Si
–This set of sets is a partition

Example
Let S be {a,b,c,d,e}
One partition: {a,b,c}, {d}, {e}
The corresponding equivalence relation:
(a,a), (b,b), (c,c), (a,b), (b,a), (a,c), (c,a), (b,c), (c,b), (d,d), (e,e)
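The correspondence in this example can be generated and checked mechanically. The sketch below is illustrative only: the class `PartitionRelation` and its methods are hypothetical helpers, pairs are encoded as strings such as "(a,b)" for simplicity, and `List.of` assumes Java 9 or later.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not from the lecture): turns a partition into its relation.
class PartitionRelation {
    // Collects every ordered pair "(x,y)" where x and y share a block of the partition.
    static Set<String> pairs(List<List<Character>> partition) {
        Set<String> r = new HashSet<>();
        for (List<Character> block : partition)
            for (char x : block)
                for (char y : block)
                    r.add("(" + x + "," + y + ")");
        return r;
    }

    // Checks reflexivity, symmetry, and transitivity of r over the elements of s.
    static boolean isEquivalence(Set<String> r, List<Character> s) {
        for (char a : s) {
            if (!r.contains("(" + a + "," + a + ")")) return false;          // reflexive
            for (char b : s) {
                boolean ab = r.contains("(" + a + "," + b + ")");
                if (ab != r.contains("(" + b + "," + a + ")")) return false; // symmetric
                for (char c : s)
                    if (ab && r.contains("(" + b + "," + c + ")")
                            && !r.contains("(" + a + "," + c + ")")) return false; // transitive
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Character> s = List.of('a', 'b', 'c', 'd', 'e');
        Set<String> r = pairs(List.of(List.of('a', 'b', 'c'), List.of('d'), List.of('e')));
        System.out.println(r.size() + " pairs, equivalence relation: " + isEquivalence(r, s));
    }
}
```

For the partition {a,b,c}, {d}, {e} this produces the same 11 pairs listed on the slide (9 from the three-element block plus one each for d and e), and the check confirms the relation is reflexive, symmetric, and transitive.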

The plan
What are disjoint sets
–And how are they “the same thing” as equivalence relations
The union-find ADT for disjoint sets
Applications of union-find
Next lecture:
–Basic implementation of the ADT with “up trees”
–Optimizations that make the implementation much faster

The operations
Given an unchanging set S, create an initial partition of a set
–Typically each item in its own subset: {a}, {b}, {c}, …
–Give each subset a “name” by choosing a representative element
Operation find takes an element of S and returns the representative element of the subset it is in
Operation union takes two subsets and (permanently) makes one larger subset
–A different partition with one fewer set
–Affects result of subsequent find operations
–Choice of representative element up to implementation

Example
Let S = {1,2,3,4,5,6,7,8,9}
Let the initial partition be (representative elements were highlighted in red on the slide):
{1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}
union(2,5):
{1}, {2, 5}, {3}, {4}, {6}, {7}, {8}, {9}
find(4) = 4, find(2) = 2, find(5) = 2
union(4,6), union(2,7):
{1}, {2, 5, 7}, {3}, {4, 6}, {8}, {9}
find(4) = 6, find(2) = 2, find(5) = 2
union(2,6):
{1}, {2, 4, 5, 6, 7}, {3}, {8}, {9}
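The example can be replayed with a deliberately naive implementation. This is only a sketch, not the up-tree implementation the next lecture covers: the class `NaiveDisjointSets` is hypothetical, its union takes any two elements and always keeps the first argument's representative, and it costs O(n) per union. Because the choice of representative is up to the implementation, this sketch reports find(4) = 4 after union(4,6) where the slide shows 6; which elements share a subset is the same.

```java
// Naive disjoint-sets sketch: rep[x] stores the representative of x's subset directly.
class NaiveDisjointSets {
    private final int[] rep; // index 0 is unused so that elements can be 1..n

    NaiveDisjointSets(int n) {
        rep = new int[n + 1];
        for (int i = 1; i <= n; i++) rep[i] = i; // each element starts in its own subset
    }

    // Returns the representative of the subset containing x.
    int find(int x) {
        return rep[x];
    }

    // Merges the subsets containing a and b, keeping a's representative (an arbitrary choice).
    void union(int a, int b) {
        int keep = rep[a];
        int drop = rep[b];
        if (keep == drop) return; // already in the same subset
        for (int i = 1; i < rep.length; i++) {
            if (rep[i] == drop) rep[i] = keep;
        }
    }

    public static void main(String[] args) {
        NaiveDisjointSets sets = new NaiveDisjointSets(9);
        sets.union(2, 5);
        sets.union(4, 6);
        sets.union(2, 7);
        sets.union(2, 6);
        // 2, 4, 5, 6, 7 now share one representative; 1, 3, 8, 9 are still alone.
        for (int i = 1; i <= 9; i++) {
            System.out.println("find(" + i + ") = " + sets.find(i));
        }
    }
}
```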

No other operations
All that can “happen” is sets get unioned
–No “un-union” or “create new set” or …
As always: trade-offs – implementations will exploit this small ADT
Surprisingly useful ADT: list of applications after one surprising example
–But not as common as dictionaries or priority queues

Example application: maze-building
Build a random maze by erasing edges
–Possible to get from anywhere to anywhere
Including “start” to “finish”
–No loops possible without backtracking
After a “bad turn” have to “undo”

Maze building
Pick start edge and end edge
[Figure: grid with Start and End marked]

Repeatedly pick random edges to delete
One approach: just keep deleting random edges until you can get from start to finish
[Figure: grid with Start and End marked]

Problems with this approach
1. How can you tell when there is a path from start to finish?
–We do not really have an algorithm yet
2. We have cycles, which a “good” maze avoids
–Want one solution and no cycles
[Figure: grid with Start and End marked]

Revised approach
Consider edges in random order
But only delete them if they introduce no cycles (how? TBD)
When done, will have one way to get from any place to any other place (assuming no backtracking)
Notice the funny-looking tree in red
[Figure: maze with Start and End marked and a tree drawn in red]

Cells and edges
Let’s number each cell
–36 total for 6 x 6
An (internal) edge (x,y) is the line between cells x and y
–60 total for 6 x 6: (1,2), (2,3), …, (1,7), (2,8), …
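As a quick check on the edge count, the total can be derived for a general grid. The symbols r (rows) and c (columns) are introduced here for illustration and are not on the slide.

```latex
\[
\underbrace{r\,(c-1)}_{\text{edges between horizontal neighbors}}
+ \underbrace{c\,(r-1)}_{\text{edges between vertical neighbors}}
\quad\Longrightarrow\quad
6\cdot 5 + 6\cdot 5 = 60 \quad \text{for } r = c = 6 .
\]
```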

The trick
Partition the cells into disjoint sets: “are they connected”
–Initially every cell is in its own subset
If an edge would connect two different subsets:
–then remove the edge and union the subsets
–else leave the edge because removing it makes a cycle

The algorithm
P = disjoint sets of connected cells, initially each cell in its own 1-element set
E = set of edges not yet processed, initially all (internal) edges
M = set of edges kept in maze (initially empty)

    while P has more than one set {
      –Pick a random edge (x,y) to remove from E
      –u = find(x)
      –v = find(y)
      –if u==v
         then add (x,y) to M   // same subset, do not create cycle
         else union(u,v)       // do not put edge in M, connect subsets
    }

Add remaining members of E to M, then output M as the maze
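The pseudocode above can be turned into a small runnable sketch. Several things here are assumptions rather than part of the lecture: the class and method names, the row-major cell numbering starting at 1, the use of a shuffled list in place of "pick a random edge", and a naive representative array standing in for a real union-find implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Sketch of the maze-building algorithm with a naive disjoint-sets array.
class MazeBuilder {
    // Returns M, the internal edges (walls) kept in a rows x cols maze.
    static List<int[]> build(int rows, int cols, long seed) {
        int n = rows * cols;
        int[] rep = new int[n + 1];              // rep[x] = representative of x's set
        for (int i = 1; i <= n; i++) rep[i] = i; // P: every cell in its own set
        int sets = n;

        // E: all internal edges, between horizontal and vertical neighbors.
        List<int[]> e = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                int cell = r * cols + c + 1;
                if (c + 1 < cols) e.add(new int[]{cell, cell + 1});
                if (r + 1 < rows) e.add(new int[]{cell, cell + cols});
            }
        }
        // Walking a shuffled list is equivalent to repeatedly removing a random edge.
        Collections.shuffle(e, new Random(seed));

        List<int[]> m = new ArrayList<>();
        int next = 0;
        while (sets > 1) {
            int[] edge = e.get(next++);
            int u = rep[edge[0]];                // find(x)
            int v = rep[edge[1]];                // find(y)
            if (u == v) {
                m.add(edge);                     // same subset: keep wall, avoid a cycle
            } else {
                for (int i = 1; i <= n; i++) {   // naive union(u, v)
                    if (rep[i] == v) rep[i] = u;
                }
                sets--;
            }
        }
        while (next < e.size()) m.add(e.get(next++)); // remaining edges stay in the maze
        return m;
    }

    public static void main(String[] args) {
        System.out.println("walls kept: " + build(6, 6, 373L).size());
    }
}
```

Connecting 36 cells into one set takes exactly 35 unions, so 35 edges are erased and 60 − 35 = 25 remain in M regardless of the random order.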

Example step
Pick (8,14)
P:
{1,2,7,8,9,13,19} {3} {4} {5} {6} {10} {11,17} {12} {14,20,26,27} {15,16,21} {18} {25} {28} {31} {22,23,24,29,30,32,33,34,35,36}
[Figure: partially built maze with Start and End marked]

Example step
P before:
{1,2,7,8,9,13,19} {3} {4} {5} {6} {10} {11,17} {12} {14,20,26,27} {15,16,21} {18} {25} {28} {31} {22,23,24,29,30,32,33,34,35,36}
Find(8) = 7
Find(14) = 20
Union(7,20)
P after:
{1,2,7,8,9,13,19,14,20,26,27} {3} {4} {5} {6} {10} {11,17} {12} {15,16,21} {18} {25} {28} {31} {22,23,24,29,30,32,33,34,35,36}

Add edge to M step
Pick (19,20)
P:
{1,2,7,8,9,13,19,14,20,26,27} {3} {4} {5} {6} {10} {11,17} {12} {15,16,21} {18} {25} {28} {31} {22,23,24,29,30,32,33,34,35,36}
Cells 19 and 20 are already in the same set, so edge (19,20) is added to M
[Figure: partially built maze with Start and End marked]

At the end
Stop when P has one set
Suppose green edges are already in M and black edges were not yet picked
–Add all black edges to M
P: {1,2,3,4,5,6,7,…,36}
[Figure: finished maze with Start and End marked]

Other applications
Maze-building is:
–Cute
–Homework 4
–A surprising use of the union-find ADT
Many other uses (which is why it is an ADT taught in CSE373):
–Road/network/graph connectivity (will see this again)
“connected components”, e.g., in a social network
–Partition an image by connected-pixels-of-similar-color
–Type inference in programming languages
Not as common as dictionaries, queues, and stacks, but valuable because implementations are very fast, so when applicable can provide big improvements