Searching – Linear and Binary Searches
Comparing Algorithms
  Should we use Program 1 or Program 2?
  Is Program 1 “fast”? “Fast enough”?
You and Igor: the empirical approach
  Implement each candidate
  Run it
  Time it
  That could be lots of work, and it is also error-prone. Which inputs? What machine/OS?
Toward an analytic approach …
  How to solve “which algorithm” problems without machines, test data, or Igor!
The Big Picture
  [Figure: the same algorithm run on inputs of size n = 3, n = 4, and n = 8, each producing an output]
  How long does it take for the algorithm to finish?
Primitives
  Primitive operations:
    x = 4             assignment
    ... x + 5 ...     arithmetic
    if (x < y) ...    comparison
    x[4]              index an array
    *x                dereference (C)
    x.foo( )          calling a method
  Others:
    new/malloc        memory usage
How many foos?
    for (j = 1; j <= N; ++j) {
        foo( );
    }
  $\sum_{j=1}^{N} 1 = N$
How many foos?
    for (j = 1; j <= N; ++j) {
        for (k = 1; k <= M; ++k) {
            foo( );
        }
    }
  $\sum_{j=1}^{N} \sum_{k=1}^{M} 1 = NM$
How many foos?
    for (j = 1; j <= N; ++j) {
        for (k = 1; k <= j; ++k) {
            foo( );
        }
    }
  $\sum_{j=1}^{N} \sum_{k=1}^{j} 1 = \sum_{j=1}^{N} j = \frac{N(N + 1)}{2}$
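The three counts above can be checked by simply counting calls. The following Java sketch is only an illustration: the class `FooCounter`, the `calls` counter, and the method names are assumptions made for this example and do not come from the slides.

```java
// Illustrative sketch: count foo() calls for the three loop shapes.
// FooCounter, calls, and the method names are hypothetical.
class FooCounter {
    static long calls = 0;

    static void foo() {
        calls++;
    }

    // Single loop: foo() runs N times.
    static long singleLoop(int n) {
        calls = 0;
        for (int j = 1; j <= n; ++j) {
            foo();
        }
        return calls;
    }

    // Independent nested loops: foo() runs N * M times.
    static long nestedLoops(int n, int m) {
        calls = 0;
        for (int j = 1; j <= n; ++j) {
            for (int k = 1; k <= m; ++k) {
                foo();
            }
        }
        return calls;
    }

    // Inner bound depends on j: foo() runs 1 + 2 + ... + N = N(N + 1)/2 times.
    static long triangularLoops(int n) {
        calls = 0;
        for (int j = 1; j <= n; ++j) {
            for (int k = 1; k <= j; ++k) {
                foo();
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        int n = 10, m = 4;
        System.out.println("single:     " + singleLoop(n));       // N = 10
        System.out.println("nested:     " + nestedLoops(n, m));   // N * M = 40
        System.out.println("triangular: " + triangularLoops(n));  // N(N + 1)/2 = 55
    }
}
```

Counting this way only confirms the formulas for the inputs tried; the sums are what establish them for every N and M.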
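How many foos?
    for (j = 0; j < N; ++j) {
        for (k = 0; k < j; ++k) {
            foo( );
        }
        for (k = 0; k < M; ++k) {
            foo( );
        }
    }
  First inner loop: $\sum_{j=0}^{N-1} j = \frac{N(N - 1)}{2}$ (j starts at 0 and k < j is strict, so this is N(N - 1)/2 rather than N(N + 1)/2)
  Second inner loop: $NM$
  Total: $\frac{N(N - 1)}{2} + NM$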
How many foos?
    void foo(int N) {
        if (N <= 2) return;
        foo(N / 2);
    }
  T(0) = T(1) = T(2) = 1
  T(n) = 1 + T(n/2) if n > 2
  T(n) = 1 + (1 + T(n/4)) = 2 + T(n/4)
       = 2 + (1 + T(n/8)) = 3 + T(n/8)
       = 3 + (1 + T(n/16)) = 4 + T(n/16)
       …
  $T(n) \approx \log_2 n$
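The recurrence above can be mirrored directly in code. This Java sketch is an illustration under assumptions: `HalvingCount` and `calls` are hypothetical names, and `calls(n)` returns T(n), the total number of invocations that `foo(n)` would make, counting the first one.

```java
// Illustrative sketch: T(n) for the halving recursion, where
// T(n) = 1 for n <= 2 and T(n) = 1 + T(n/2) otherwise.
// HalvingCount and calls are hypothetical names.
class HalvingCount {
    static int calls(int n) {
        if (n <= 2) {
            return 1;              // base case: one call, no recursion
        }
        return 1 + calls(n / 2);   // this call plus the calls made on n/2
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 1024; n *= 2) {
            System.out.println("n = " + n + ", calls = " + calls(n));
        }
    }
}
```

For powers of two, n = 2^k with k >= 1, this returns exactly k, which matches the log2 n estimate.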
What is Big-O?
  Big-O describes the order of growth of an algorithm's runtime in relation to the number of items, n.
  Focus on the dominant term and ignore less significant terms and constant factors as n grows.
  I. O(1) - constant time
     The algorithm requires the same fixed number of steps regardless of the size of the task. Examples (assuming a reasonable implementation of the task):
     A. Push and Pop operations for a stack (containing n elements);
     B. Insert and Remove operations for a queue.
  II. O(n) - linear time
     The algorithm requires a number of steps proportional to the size of the task. Examples:
     A. Traversal of a list (a linked list or an array) with n elements;
     B. Finding the maximum or minimum element in a list, or sequential search in an unsorted list of n elements;
     C. Traversal of a tree with n nodes;
     D. Calculating n-factorial iteratively; finding the nth Fibonacci number iteratively.
  III. O(n^2) - quadratic time
     The number of operations is proportional to the size of the task squared. Examples:
     A. Some simpler sorting algorithms, for instance a selection sort of n elements;
     B. Comparing two two-dimensional arrays of size n by n;
     C. Finding duplicates in an unsorted list of n elements (implemented with two nested loops).
  IV. O(log n) - logarithmic time
     The number of steps is proportional to the logarithm of the size of the task. Examples:
     A. Binary search in a sorted list of n elements;
     B. Insert and Find operations for a balanced binary search tree with n nodes;
     C. Insert and Remove operations for a heap with n nodes.
  V. O(n log n) - "n log n" time
     A. More advanced sorting algorithms: mergesort, and quicksort on average (quicksort's worst case is O(n^2)).
What is Big-O?
  Big-O refers to the order of an algorithm's runtime growth in relation to the number of items.
  I. O(1) - constant time (push and pop elements on a stack)
  II. O(n) - linear time: the algorithm requires a number of steps proportional to the size of the task (finding the minimum of a list)
  III. O(n^2) - quadratic time: the number of operations is proportional to the size of the task squared (selection sort and insertion sort)
  IV. O(log n) - logarithmic time (binary search on a sorted list)
  V. O(n log n) - "n log n" time (merge sort, and quicksort on average)
How would you find the first locker with a chinchilla in it?
It’s the 6th one from the left
Searching
  Linear search (also called sequential search) examines the values in an array one at a time until it finds a match or reaches the end
  Number of visits for a linear search of an array of n elements:
    On average, a successful search visits about n/2 elements
    The maximum number of visits is n
  A linear search locates a value in an array in O(n) steps
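The linear search described above can be sketched in Java as follows. This is a minimal illustration, not a reference solution: `LinearSearchDemo`, `linearSearch`, and the sample words are assumptions, and the array is assumed to contain no null elements. The method returns -1 when the value is absent.

```java
// Illustrative sketch of a linear (sequential) search over a String array.
// LinearSearchDemo, linearSearch, and the sample data are hypothetical.
class LinearSearchDemo {
    // Returns the index of the first element equal to target, or -1 if none.
    // Assumes values contains no null elements.
    static int linearSearch(String[] values, String target) {
        for (int i = 0; i < values.length; i++) {
            if (values[i].equals(target)) {
                return i;   // match found after visiting i + 1 elements
            }
        }
        return -1;          // reached the end: all n elements were visited
    }

    public static void main(String[] args) {
        String[] words = {"apple", "banana", "cherry", "date"};
        int index = linearSearch(words, "cherry");
        if (index >= 0) {
            System.out.println("Your value was found at index " + index);
        } else {
            System.out.println("Your value was not found");
        }
    }
}
```

Strings are compared with `equals` rather than `==`, because `==` on objects compares references, not contents.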
Pick a Number
  I pick a number between 1 and 100. You want to determine my number in the fewest guesses possible. How do you do it?
Binary Search
  Locates a value in a sorted array by
    Determining whether the value occurs in the first or second half
    Then repeating the search in one of the halves
Binary Search 15 ≠ 17: we don't have a match
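A recursive binary search along these lines might look like the Java sketch below. It is an illustration under assumptions: `BinarySearchDemo`, `binarySearch`, and the sample list are hypothetical, and the list must already be sorted in ascending order.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of a recursive binary search over a sorted list.
// BinarySearchDemo, binarySearch, and the sample data are hypothetical.
class BinarySearchDemo {
    // Searches sorted[low..high] (inclusive) for target.
    // Returns the index of target, or -1 if it is not present.
    static int binarySearch(List<Integer> sorted, int target, int low, int high) {
        if (low > high) {
            return -1;                        // empty range: no match
        }
        int mid = low + (high - low) / 2;     // avoids overflow of low + high
        int midValue = sorted.get(mid);
        if (midValue == target) {
            return mid;
        }
        if (midValue < target) {
            return binarySearch(sorted, target, mid + 1, high);  // right half
        }
        return binarySearch(sorted, target, low, mid - 1);       // left half
    }

    // Convenience overload that searches the whole list.
    static int binarySearch(List<Integer> sorted, int target) {
        return binarySearch(sorted, target, 0, sorted.size() - 1);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 5, 8, 9, 12, 15, 17, 20, 24, 32);
        int index = binarySearch(values, 17);
        System.out.println(index >= 0
                ? "Your value was found in the array"
                : "Your value was not found in the array");
    }
}
```

With this sample list, searching for 17 visits 12, 20, 15, and then 17, so the non-matching comparison with 15 is one of the intermediate steps.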
Binary Search
  Count the number of visits to search a sorted array of size n
  We visit one element (the middle element), then search either the left or right subarray
  Thus: $T(n) = T(n/2) + 1$, where T(n) is the time (number of visits) to search an array of size n
  Replacing n with n/2: $T(n/2) = T(n/4) + 1$
  Substituting into the original equation: $T(n) = T(n/4) + 2$
  This generalizes to: $T(n) = T(n/2^k) + k$
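Binary Search
  Assume n is a power of 2, $n = 2^m$, where $m = \log_2 n$
  After m halvings one element is left, and searching it takes one visit: $T(1) = 1$
  Then: $T(n) = T(1) + m = 1 + \log_2 n$
  Binary search is an O(log n) algorithm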
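The result T(n) = 1 + log2(n) can be checked by counting visits in the worst case. The Java sketch below uses an iterative variant of binary search (not the recursive form) purely to make counting easy. `VisitCounter`, `visits`, and `worstCaseVisits` are hypothetical names, and the worst case is forced by searching for a value larger than every element.

```java
// Illustrative sketch: count elements visited by binary search.
// VisitCounter, visits, and worstCaseVisits are hypothetical names.
class VisitCounter {
    // Returns how many elements an iterative binary search visits
    // while looking for target in the sorted array.
    static int visits(int[] sorted, int target) {
        int low = 0;
        int high = sorted.length - 1;
        int count = 0;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            count++;                          // one visit: the middle element
            if (sorted[mid] == target) {
                return count;
            }
            if (sorted[mid] < target) {
                low = mid + 1;                // continue in the right half
            } else {
                high = mid - 1;               // continue in the left half
            }
        }
        return count;
    }

    // Builds the array 0, 1, ..., n - 1 and searches for n, which is absent
    // and larger than every element, so the search always goes right.
    static int worstCaseVisits(int n) {
        int[] sorted = new int[n];
        for (int i = 0; i < n; i++) {
            sorted[i] = i;
        }
        return visits(sorted, n);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 1024; n *= 2) {
            System.out.println("n = " + n + ", visits = " + worstCaseVisits(n));
        }
    }
}
```

For n = 1024 this reports 11 visits, which is 1 + log2(1024).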
Comparison of Sequential and Binary Searches
Assignment
  Program a linear/sequential search of an array of Strings
    Output the index where the element was found
    Output must be user friendly: “Your value was found at index 13”; “Your value was found at index -1” or “Your value was not found”
  Program a binary search of an ArrayList of Integer
    Recursive
    Output: “Your value was found in the array” or “Your value was not found in the array”
Assignment
  Program a linear/sequential search of an array of Strings
    Output the index where the element was found
    Output must be user friendly: “Your value was found at index 13”; “Your value was found at index -1” or “Your value was not found”
    Can do this in the main(); no tester class needed
    Build an array of at least 15 String elements
    Ask the user to input the word to search for