More complexity analysis & Binary Search


Marcus Frean, Thomas Kuehne
School of Engineering and Computer Science, Victoria University of Wellington
2016-T2 Lecture 09

RECAP / TODAY
  RECAP
    "Big O" notation, ArrayList costs
  TODAY
    ArraySet costs (get, set, contains)
    Binary search: the "findIndexOf" method of ArraySet

What about ArraySet?
  Order is not significant
    can add a new item anywhere
    can replace removed element with any element
  Duplicates not allowed

Notes: Order doesn't matter, so we can move things if that helps. Adding: put new items at the end, which is O(1). But we first need to check whether the item is already there, and that costs O(n). Remove looks easy: take the item out and put the last item in its place (no need to move everything over). But that's not all there is to it (see next slide).

ArraySet Algorithms (pseudocode)

  Add(value)
    if not contains(value),
      place value at end (doubling array if necessary)
      increment size

  Remove(value)
    search through array
      if value equals item
        replace item by item at end
        decrement size
        return

  Contains(value)
    search through array
      if value equals item
        return true
    return false

  Costs?

Notes:
  Contains: if the item is there, the search takes n/2 comparisons on average. If it is not there, it always takes n. So it is O(n) either way.
  Add: it's a set, so we have to check whether it already contains the item, which is O(n). If not, adding at the end (maybe increasing the array size) is O(1). So add is O(n): tough!
  Remove: similarly, taking the item out is an easy O(1), but you have to find it first, so it is O(n).
  So they're all slow! NB: if it's a bag, then add is fast (no duplicate check), but contains and remove are still slow.
  CAN WE DO BETTER?
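The pseudocode above might translate to Java roughly as follows. This is only a sketch: the class name `SimpleArraySet`, the initial capacity of 4, and the decision to reject `null` are assumptions made for illustration, not taken from the lecture's ArraySet.

```java
import java.util.Arrays;

// Hypothetical minimal unordered, array-backed set (illustration only).
class SimpleArraySet<E> {
    private Object[] data = new Object[4]; // initial capacity is an arbitrary choice
    private int count = 0;

    // Linear search, O(n): returns the index of value, or -1 if it is absent.
    private int indexOf(Object value) {
        for (int i = 0; i < count; i++) {
            if (data[i].equals(value)) return i;
        }
        return -1;
    }

    public boolean contains(Object value) {
        return indexOf(value) >= 0;
    }

    // O(n) overall: the duplicate check dominates; the append itself is cheap.
    public boolean add(E value) {
        if (value == null || contains(value)) return false;
        if (count == data.length) data = Arrays.copyOf(data, 2 * data.length); // double the array
        data[count++] = value;
        return true;
    }

    // O(n) to find the item, then O(1) to fill the hole with the last item.
    public boolean remove(Object value) {
        int i = indexOf(value);
        if (i < 0) return false;
        data[i] = data[count - 1];
        data[--count] = null; // clear the vacated last slot
        return true;
    }

    public int size() {
        return count;
    }
}
```

All three public operations go through `indexOf`, which is where the O(n) cost comes from.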

ArraySet Cost Summary
  Which are the expensive (sub-)operations?
    can add a new item anywhere ⇒ at end: O(1)
      but first need to search for duplicates: O(n)
    can replace removed element with any element ⇒ replace by last element: O(1)
      but first need to find the element to be removed: O(n)
  All the cost is in the search! Can we speed up the search?

Divide & Conquer
  How many guesses do you need to find my secret animal?
  Eliminate as many as possible in each step!

  Mammal
    Feline: Toby Tiger, Lea Lion
    Canine: Bully Bulldog, Colin Collie
  Egg Laying
    Bird: Tanja Tui, Kurt Kaka
    Reptile: Tim Turtle, Sally Snake

Divide & Conquer
  What if there are no natural categories?
  Cat, Gnu, Ant, Dog, Hen, Fox, Eel, Bee

Divide & Conquer
  What if there are no natural categories?
  We can use lexicographical sorting for quick elimination!
  Sorted: Ant, Bee, Cat, Dog, Eel, Fox, Gnu, Hen
  Each question compares against one item and discards one side: at or before Dog, or > Dog? At or before Bee, or > Bee? At or before Fox, or > Fox?

Making ArraySet faster
  Binary Search: Finding "Fox"
  Sorted items: Ant, Bee, Cat, Dog, Eel, Fox, Gnu, Hen, Ibis (the slide marks the low, mid and hi positions)

  Algorithm description
    Look in the middle:
      if item is middle item ⇒ return item position
      if item is before middle item ⇒ look in left half
      if item is after middle item ⇒ look in right half

  Only works if items are sorted!

Notes: You can do this recursively. Or iteratively (we'll do that).

Binary Search

  private int findIndexOf(Comparable<E> item) {
      int low = 0;              // min possible index of item
      int high = count - 1;     // max possible index of item
      while (low <= high) {
          int mid = (low + high) / 2;            // calculate best guess
          int comp = item.compareTo(data[mid]);
          if (comp == 0)
              return mid;                        // found the item!
          if (comp > 0)
              low = mid + 1;                     // item should be in [mid+1..high]
          else
              high = mid - 1;                    // item should be in [low..mid-1]
      }
      return low;               // item not found: return insertion position
  }

  With this helper, "contains" is trivial & fast!
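To make the "contains is trivial" remark concrete, here is a small self-contained sketch. It is not the lecture's class: `SortedSearch` and its static methods over a `String[]` are hypothetical stand-ins for the `data` and `count` fields. The midpoint is computed as `low + (high - low) / 2`, a common variant that avoids integer overflow on very large arrays.

```java
// Hypothetical stand-alone version of the helper, working on a sorted String array.
class SortedSearch {
    // Returns the index of item if present, otherwise the position where it would be inserted.
    static int findIndexOf(String[] data, int count, String item) {
        int low = 0;
        int high = count - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;      // overflow-safe midpoint
            int comp = item.compareTo(data[mid]);
            if (comp == 0) return mid;             // found the item
            if (comp > 0) low = mid + 1;           // item can only be in [mid+1..high]
            else high = mid - 1;                   // item can only be in [low..mid-1]
        }
        return low;                                // not found: insertion position
    }

    // contains is one binary search plus one equality check: O(log n).
    static boolean contains(String[] data, int count, String item) {
        int idx = findIndexOf(data, count, item);
        return idx < count && data[idx].equals(item);
    }
}
```

The `idx < count` guard matters because the insertion position of an item larger than everything in the set is `count`, which is not a valid index.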

Binary Search: Cost
  What is the cost of searching if there are n items?
  key step = comparing the item with the middle element

  Iteration    Size of range
  1            n
  2            n/2
  ...          ...
  k-1          2
  k            1
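The relationship between the iteration number and the size of the range on this slide can be turned into a short calculation. This is a sketch that assumes every unsuccessful comparison halves the range exactly; with integer rounding the count can differ by one.

```latex
\[
\text{size of range at iteration } k \;=\; \frac{n}{2^{\,k-1}},
\qquad
\frac{n}{2^{\,k-1}} = 1
\;\Longrightarrow\; 2^{\,k-1} = n
\;\Longrightarrow\; k = \log_2 n + 1 .
\]
```

So the number of key steps grows as O(log n).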

Log2(n)
  The number of times you can divide a set of n things in half:
    log2(1000) ≈ 10, log2(1,000,000) ≈ 20, log2(1,000,000,000) ≈ 30
  Every time you double n, you add only one step to the cost!
  Logarithms often arise in analysing algorithms, especially with "Divide and Conquer" algorithms.
  (Diagram: a Problem is split into two parts, each part is solved, and the results are combined into the Solution.)
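The figures above can be checked by simply counting halvings. The sketch below (the class and method names are made up for illustration) counts how many comparisons a binary search needs in the worst case, assuming each unsuccessful comparison leaves at most floor(n/2) items.

```java
// Hypothetical demo: worst-case number of binary search comparisons for n items.
class HalvingDemo {
    static int worstCaseComparisons(long n) {
        int steps = 0;
        while (n > 0) {   // one comparison per non-empty range
            n /= 2;       // at most half of the items remain afterwards
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        long[] sizes = {1_000L, 2_000L, 1_000_000L, 1_000_000_000L};
        for (long n : sizes) {
            System.out.println(n + " items: " + worstCaseComparisons(n) + " comparisons");
        }
    }
}
```

For 1,000, 1,000,000 and 1,000,000,000 items this gives 10, 20 and 30 comparisons, and going from 1,000 to 2,000 items adds exactly one.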

Summary: ArraySet vs SortedArraySet

  ArraySet: unordered
    All the cost is in the searching: O(n)
      contains: O(n)      // simple, linear search
      add:      O(n)      // cost of searching to see if there's a duplicate
      remove:   O(n)      // cost of searching for the item to remove

  SortedArraySet: with Binary Search
    Binary Search is fast: O(log n)
      contains: O(log n)  // uses binary search
      add:      O(n)      // cost of keeping it sorted
      remove:   O(n)      // cost of keeping it sorted
    Most of the cost is caused by keeping it sorted!
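To see where the O(n) in the sorted version comes from, here is a minimal sketch of add and remove on a sorted array. The class `SortedStringSet` is hypothetical and fixed to `String` for brevity; it is not the lecture's SortedArraySet. Finding the position is O(log n), but shifting the items to open or close a gap is O(n).

```java
import java.util.Arrays;

// Hypothetical sorted, array-backed set of Strings (illustration only).
class SortedStringSet {
    private String[] data = new String[4]; // initial capacity is an arbitrary choice
    private int count = 0;

    // Binary search: index of item if present, otherwise its insertion position.
    private int findIndexOf(String item) {
        int low = 0, high = count - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            int comp = item.compareTo(data[mid]);
            if (comp == 0) return mid;
            if (comp > 0) low = mid + 1; else high = mid - 1;
        }
        return low;
    }

    public boolean contains(String item) {
        int idx = findIndexOf(item);
        return idx < count && data[idx].equals(item);
    }

    public boolean add(String item) {
        int idx = findIndexOf(item);                                  // O(log n)
        if (idx < count && data[idx].equals(item)) return false;      // duplicate
        if (count == data.length) data = Arrays.copyOf(data, 2 * data.length);
        System.arraycopy(data, idx, data, idx + 1, count - idx);      // O(n): shift right
        data[idx] = item;
        count++;
        return true;
    }

    public boolean remove(String item) {
        int idx = findIndexOf(item);                                  // O(log n)
        if (idx >= count || !data[idx].equals(item)) return false;    // not present
        System.arraycopy(data, idx + 1, data, idx, count - idx - 1);  // O(n): shift left
        data[--count] = null;
        return true;
    }

    public String[] toArray() {
        return Arrays.copyOf(data, count);
    }
}
```

Unlike the unordered set, remove cannot just drop the last item into the hole, because that would break the ordering that binary search relies on.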

Making SortedArraySet fast

  If you have to call add() and/or remove() many times, then SortedArraySet is no better than ArraySet.
    Both are O(n)
    Either we pay to search,
    or we pay to keep it in order

  If set modifications are few but calls to contains() are frequent, then SortedArraySet is much better than ArraySet!
    SortedArraySet contains() is O(log n)
    Finding 1-in-a-billion, if sorted, takes roughly the time that finding 1-in-30 would, if unsorted

  Can we efficiently construct a SortedArraySet from an unordered collection?
    Yes: use an efficient sorting algorithm in the constructor
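One way the last point could look in code is sketched below. The helper `sortedDistinct` is hypothetical, and it assumes the collection contains no `null` elements. It sorts once with `Arrays.sort`, which takes O(n log n), and then removes adjacent duplicates in a single pass, instead of calling an O(n) add() for every item, which would take O(n^2) in total.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical helper that builds the backing array for a sorted set in one go.
class SortedBuild {
    static String[] sortedDistinct(Collection<String> items) {
        String[] a = items.toArray(new String[0]);
        Arrays.sort(a);                      // O(n log n)
        int n = 0;                           // number of distinct items kept so far
        for (int i = 0; i < a.length; i++) {
            // After sorting, duplicates are adjacent, so compare with the last kept item.
            if (n == 0 || !a[i].equals(a[n - 1])) {
                a[n++] = a[i];
            }
        }
        return Arrays.copyOf(a, n);          // trim to the distinct items
    }
}
```

A constructor could store the returned array as its data and use its length as the count, after which binary search works immediately.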