
More complexity analysis & Binary Search


1 More complexity analysis & Binary Search
Marcus Frean, Thomas Kuehne. School of Engineering and Computer Science, Victoria University of Wellington. 2016-T2, Lecture 09

2 RECAP-TODAY
RECAP: “Big O” notation, ArrayList costs
TODAY: ArraySet costs (get, set, contains); Binary search: the “findIndexOf” method of ArraySet

3 What about ArraySet?
Order is not significant:
- can add a new item anywhere
- can replace a removed element with any element
Duplicates not allowed.
Notes: Order doesn't matter, so we can move things if that helps. Adding: put new items at the end, which is O(1). But we do need to check whether the item is there already, and that's going to cost O(n). Remove looks easy: we can remove the item and put the last item in its place (no need to move everything over). BUT that's not all there is to it (see next slide).

4 ArraySet Algorithms (pseudocode)
Add(value):
    if not contains(value):
        place value at end (doubling array if necessary)
        increment size
Remove(value):
    search through array:
        if value equals item:
            replace item by item at end
            decrement size
            return
Contains(value):
    search through array:
        if value equals item:
            return true
    return false
Costs?
Notes:
Contains: if the item is there, the search looks at n/2 items on average. If it's not there, it always looks at all n. So it's O(n) either way.
Add: it's a set, so we have to check whether it contains the item already, which is O(n). If not, adding at the end (maybe increasing the array size) is O(1). So add is O(n): tough!
Remove: similarly, taking the item out is an easy O(1), but you have to find it first, so it's O(n).
So they're all slow! NB: if it were a bag, add would be fast, but contains and remove would still be slow. CAN WE DO BETTER?
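The pseudocode above could be turned into Java roughly as follows. This is a minimal sketch, not the course's actual ArraySet: the class name `SimpleArraySet`, the private `indexOf` helper and the initial capacity of 4 are assumptions made for illustration, and null items are assumed never to be added.

```java
import java.util.Arrays;

// Hypothetical minimal unsorted array set; names are illustrative only.
// Assumes items are never null.
class SimpleArraySet<E> {
    private Object[] data = new Object[4];
    private int count = 0;

    // O(n): linear search through the used part of the array.
    boolean contains(E value) {
        return indexOf(value) >= 0;
    }

    // O(n) overall: the duplicate check dominates; appending is O(1) amortised.
    boolean add(E value) {
        if (indexOf(value) >= 0) return false;        // duplicates not allowed
        if (count == data.length) {
            data = Arrays.copyOf(data, count * 2);    // double the array when full
        }
        data[count++] = value;
        return true;
    }

    // O(n) overall: finding the item dominates; filling the hole is O(1).
    boolean remove(E value) {
        int i = indexOf(value);
        if (i < 0) return false;
        data[i] = data[count - 1];   // order is not significant: move last item into the hole
        data[--count] = null;        // clear the now-unused last slot
        return true;
    }

    int size() {
        return count;
    }

    // Returns the index of value, or -1 if it is not in the set.
    private int indexOf(E value) {
        for (int i = 0; i < count; i++) {
            if (data[i].equals(value)) return i;
        }
        return -1;
    }
}
```

All three public operations go through `indexOf`, which is where the O(n) cost comes from.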

5 ArraySet Cost Summary
Which are the expensive (sub-)operations?
- can add a new item anywhere ⇒ at end: O(1), but also need to first search for duplicates: O(n)
- can replace removed element with any element ⇒ replace by last element: O(1), but first need to find the element to be removed: O(n)
All the cost is in the search! Can we speed up the search?

6 Divide & Conquer
How many guesses do you need to find my secret animal? ⇒ eliminate as many as possible in each step!
Mammal
    Feline: Toby Tiger, Lea Lion
    Canine: Bully Bulldog, Colin Collie
Egg Laying
    Bird: Tanja Tui, Kurt Kaka
    Reptile: Tim Turtle, Sally Snake

7 Divide & Conquer
What if there are no natural categories?
Cat, Gnu, Ant, Dog, Hen, Fox, Eel, Bee

8 Divide & Conquer
What if there are no natural categories? ⇒ can use lexicographical sorting for quick elimination!
≤ Dog ?   > Dog ?
≤ Bee ?   > Bee ?
≤ Fox ?   > Fox ?
Ant, Bee, Cat, Dog, Eel, Fox, Gnu, Hen

9 Making ArraySet faster
Binary Search: Finding “Fox”
Algorithm description. Look in the middle:
- if item is the middle item ⇒ return item position
- if item is before the middle item ⇒ look in left half
- if item is after the middle item ⇒ look in right half
(Figure: the sorted array Ant, Bee, Cat, Dog, Eel, Fox, Gnu, Hen, Ibis, with markers low, mid and hi.)
Only works if items are sorted!
Notes: You can do this recursively. Or iteratively (we'll do that).
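For comparison with the iterative version used in the lecture, the recursive variant mentioned in the notes might look like the sketch below. The class name `RecursiveBinarySearch`, the use of `String[]` and the choice to return -1 when the item is absent are assumptions for illustration; indices here are 0-based.

```java
// Hypothetical helper class; the name and signatures are illustrative.
class RecursiveBinarySearch {
    // Returns the index of item within sorted[low..high], or -1 if it is absent.
    // The array must be sorted in ascending order.
    static int search(String[] sorted, String item, int low, int high) {
        if (low > high) return -1;                 // empty range: item is not there
        int mid = low + (high - low) / 2;          // look in the middle
        int comp = item.compareTo(sorted[mid]);
        if (comp == 0) return mid;                 // item is the middle item
        if (comp < 0) return search(sorted, item, low, mid - 1);   // look in left half
        return search(sorted, item, mid + 1, high);                // look in right half
    }

    // Convenience entry point that searches the whole array.
    static int search(String[] sorted, String item) {
        return search(sorted, item, 0, sorted.length - 1);
    }
}
```

Each recursive call works on at most half of the previous range, which is the same elimination idea as the animal-guessing game.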

10 Binary Search
private int findIndexOf(Comparable<E> item) {
    int low = 0;              // min possible index of item
    int high = count - 1;     // max possible index of item
    while (low <= high) {
        int mid = low + (high - low) / 2;   // best guess; same as (low + high) / 2 but cannot overflow
        int comp = item.compareTo(data[mid]);
        if (comp == 0)
            return mid;       // found the item!
        if (comp > 0)
            low = mid + 1;    // item should be in [mid+1..high]
        else
            high = mid - 1;   // item should be in [low..mid-1]
    }
    return low;               // not found: return insertion position
}
With this helper, “contains” is trivial & fast!
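One way "contains" could be built on this helper is sketched below. Because findIndexOf returns the insertion position when the item is missing, contains has to check that the returned index is in range and really holds the item. The class `SortedLookup`, its constructor and the `E extends Comparable<E>` bound are assumptions made so the example is self-contained; the lecture's own contains method is not shown in the slides.

```java
// Hypothetical read-only wrapper around an already sorted array.
class SortedLookup<E extends Comparable<E>> {
    private final E[] data;   // assumed sorted ascending, with no duplicates
    private final int count;

    SortedLookup(E[] sortedData) {
        this.data = sortedData;
        this.count = sortedData.length;
    }

    // Index of item if present, otherwise the position where it would be inserted.
    int findIndexOf(E item) {
        int low = 0;
        int high = count - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            int comp = item.compareTo(data[mid]);
            if (comp == 0) return mid;
            if (comp > 0) low = mid + 1;
            else high = mid - 1;
        }
        return low;
    }

    // O(log n): one binary search plus one comparison.
    boolean contains(E item) {
        int index = findIndexOf(item);
        return index < count && data[index].compareTo(item) == 0;
    }
}
```

The `index < count` check matters when the item would belong after the last element, since the insertion position is then one past the end of the used range.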

11 Binary Search: Cost
What is the cost of searching if there are n items?
key step =
Iteration    Size of range
1            n
2            n/2
...          ...
k-1          2
k            1
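If each iteration halves the range, the number of iterations k can be worked out directly. The short derivation below assumes n is a power of two so that the halving is exact; for other n the result is rounded.

```latex
\[
\text{size of range at iteration } i = \frac{n}{2^{\,i-1}},
\qquad
\frac{n}{2^{\,k-1}} = 1
\;\Longrightarrow\;
2^{\,k-1} = n
\;\Longrightarrow\;
k = \log_2 n + 1 .
\]
```

So the worst-case number of iterations grows as O(log n).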

12 Log2(n)
The number of times you can divide a set of n things in half:
log2(1000) ≈ 10, log2(1,000,000) ≈ 20, log2(1,000,000,000) ≈ 30
Every time you double n, you add only one step to the cost!
Logarithms often arise in analysing algorithms, especially with “Divide and Conquer” algorithms.
(Diagram: Problem ⇒ Solve + Solve ⇒ Solution)
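These figures can be checked by simply counting halvings. The sketch below counts how many times n can be halved (with integer division) before nothing is left, which matches the worst-case number of loop iterations in binary search; the class and method names are made up for this example.

```java
// Hypothetical helper; the names are illustrative only.
class HalvingCount {
    // Counts how many times n can be halved (integer division) until it reaches 0.
    // For n >= 1 this equals floor(log2(n)) + 1.
    static int worstCaseIterations(long n) {
        int steps = 0;
        while (n > 0) {
            n /= 2;      // each iteration discards at least half of the range
            steps++;
        }
        return steps;
    }
}
```

For example, `HalvingCount.worstCaseIterations(1000)` gives 10 and `HalvingCount.worstCaseIterations(2000)` gives 11: doubling n adds one step.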

13 Summary: ArraySet vs SortedArraySet
ArraySet: unordered. All cost is in the searching: O(n)
- contains: O(n) // simple, linear search
- add: O(n) // cost of searching to see if there’s a duplicate
- remove: O(n) // cost of searching for the item to remove
SortedArraySet: with Binary Search. Binary Search is fast: O(log n)
- contains: O(log n) // uses binary search
- add: O(n) // cost of keeping it sorted
- remove: O(n) // cost of keeping it sorted
Most of the cost is caused by keeping it sorted!
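To see where the O(n) for add and remove comes from, here is a rough sketch of a sorted array set of Strings. The search is O(log n), but opening or closing a gap means shifting up to n items. The class `SortedStringSet`, its `toArray` method and the initial capacity are assumptions for illustration, not the course's SortedArraySet.

```java
import java.util.Arrays;

// Hypothetical sorted array set of Strings; names are illustrative only.
class SortedStringSet {
    private String[] data = new String[4];
    private int count = 0;

    // O(log n): index of item if present, else the position where it belongs.
    private int findIndexOf(String item) {
        int low = 0;
        int high = count - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            int comp = item.compareTo(data[mid]);
            if (comp == 0) return mid;
            if (comp > 0) low = mid + 1;
            else high = mid - 1;
        }
        return low;
    }

    boolean contains(String item) {
        int i = findIndexOf(item);
        return i < count && data[i].equals(item);
    }

    // O(n): finding the position is fast, shifting items right to open a gap is not.
    boolean add(String item) {
        int i = findIndexOf(item);
        if (i < count && data[i].equals(item)) return false;   // duplicate
        if (count == data.length) data = Arrays.copyOf(data, count * 2);
        System.arraycopy(data, i, data, i + 1, count - i);     // shift right by one
        data[i] = item;
        count++;
        return true;
    }

    // O(n): shifting items left to close the gap.
    boolean remove(String item) {
        int i = findIndexOf(item);
        if (i >= count || !data[i].equals(item)) return false; // not present
        System.arraycopy(data, i + 1, data, i, count - i - 1); // shift left by one
        data[--count] = null;
        return true;
    }

    // Copy of the items in sorted order.
    String[] toArray() {
        return Arrays.copyOf(data, count);
    }
}
```

Unlike the unordered set, remove cannot just drop the last item into the hole, because that would break the sorted order that binary search relies on.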

14 Making SortedArraySet fast
If you have to call add() and/or remove() many times, then SortedArraySet is no better than ArraySet: both are O(n). Either we pay to search, or we pay to keep it in order.
If set modifications are few but calls to contains() are frequent, then SortedArraySet is much better than ArraySet! SortedArraySet contains() is O(log n): finding 1-in-a-billion, if sorted, takes roughly the time that 1-in-30 would, if unsorted.
Can we efficiently construct a SortedArraySet from an unordered collection? Yes ⇒ use an efficient sorting algorithm in the constructor.
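A constructor along those lines could sort once and then drop duplicates in a single pass, instead of calling add() n times at O(n) each. The sketch below shows only that step as a static helper; `SortedSetBuilder` and `sortedUnique` are hypothetical names, and it assumes String items with no nulls.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical helper showing how a constructor might prepare its backing array.
class SortedSetBuilder {
    // Returns a sorted, duplicate-free array built from an unordered collection.
    // Arrays.sort is O(n log n); the duplicate-removal pass is O(n).
    static String[] sortedUnique(Collection<String> items) {
        String[] data = items.toArray(new String[0]);
        Arrays.sort(data);
        int count = 0;
        for (int i = 0; i < data.length; i++) {
            // After sorting, equal items are adjacent, so compare with the last kept item.
            if (count == 0 || !data[i].equals(data[count - 1])) {
                data[count++] = data[i];
            }
        }
        return Arrays.copyOf(data, count);
    }
}
```

The resulting array is ready for binary search, at a total cost of O(n log n) rather than the O(n²) of inserting the items one at a time.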

