ADSA: IntroAlgs/ Advanced Data Structures and Algorithms Objective –introduce algorithm design using basic searching and sorting, and remind students about T() and Big-Oh running time. Semester 2, Intro. to Algorithms
Contents
1. Selection Sort
2. An Array Sublist
3. Sequential Search
4. Binary Search
5. System/Memory Efficiency
6. Running Time Analysis
7. Big-Oh Analysis
8. The Timing Class
Selection Sort
Go through an array position by position, starting at index 0. At the current position, select the smallest element in the sublist that runs from the current position to the end of the array. Swap it with the value in the current position.
An Example
Number of passes = size of array - 1
- e.g. 4 passes were made over the 5-element arr[]
- sorting stopped after comparing arr[3] and arr[4]
selectionSort()
    public static void selectionSort(int[] arr)
    {
        int smallIndex;   // index of smallest elem in sublist
        int n = arr.length;

        // idx has range 0 to n-2
        for (int idx = 0; idx < n-1; idx++) {
            // scan sublist starting at idx
            smallIndex = idx;

            // j goes through sublist from arr[idx+1] to arr[n-1]
            for (int j = idx+1; j < n; j++)
                // if smaller element found, assign smallIndex to that posn
                if (arr[j] < arr[smallIndex])
                    smallIndex = j;

            // swap next smallest elem into arr[idx]
            int temp = arr[idx];
            arr[idx] = arr[smallIndex];
            arr[smallIndex] = temp;
        }
    }  // end of selectionSort()
Usage Example
    // an integer array
    int[] arr = {66, 20, 33, 55, 53, 57, 69, 11, 67, 70};

    // use selectionSort() to order array
    selectionSort(arr);

    System.out.print("Sorted: ");
    for (int i=0; i < arr.length; i++)
        System.out.print(arr[i] + " ");

Output:
    Sorted: 11 20 33 53 55 57 66 67 69 70
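The rule that an n-element array needs n-1 passes can be checked with a small trace. The sketch below is an illustration only: the class name SelectionSortTrace, the method sortWithTrace() and the five array values are assumptions, not part of the course code. It repeats the selection sort logic, prints the array after each pass, and returns the number of passes.

```java
import java.util.Arrays;

class SelectionSortTrace {
    // Selection sort that prints the array after each pass and
    // returns the number of passes made (hypothetical helper).
    static int sortWithTrace(int[] arr) {
        int n = arr.length;
        int passes = 0;
        for (int idx = 0; idx < n - 1; idx++) {
            // find the smallest element in arr[idx .. n-1]
            int smallIndex = idx;
            for (int j = idx + 1; j < n; j++)
                if (arr[j] < arr[smallIndex])
                    smallIndex = j;

            // swap it into position idx
            int temp = arr[idx];
            arr[idx] = arr[smallIndex];
            arr[smallIndex] = temp;

            passes++;
            System.out.println("Pass " + idx + ": " + Arrays.toString(arr));
        }
        return passes;
    }

    public static void main(String[] args) {
        int[] arr = {50, 20, 40, 75, 35};   // illustrative values
        int passes = sortWithTrace(arr);
        System.out.println("Passes: " + passes);   // n-1 passes for n elements
    }
}
```

The last pass only compares the final two elements, which is why the outer loop stops at index n-2.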
An Array Sublist
An array sublist is a sequence of elements whose indices begin at index first and go up to, but not including, last.
The notation is [first, last).
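A short sketch may help fix the half-open convention. The class SublistDemo and its helpers sumRange() and size() are hypothetical names introduced only for this illustration. The point is that the loop test is i < last, so last itself is never visited, and the sublist holds last - first elements.

```java
class SublistDemo {
    // Sum the elements of arr in the half-open range [first, last).
    static int sumRange(int[] arr, int first, int last) {
        int sum = 0;
        for (int i = first; i < last; i++)   // index last is excluded
            sum += arr[i];
        return sum;
    }

    // Number of elements in [first, last); the sublist is empty when first >= last.
    static int size(int first, int last) {
        return Math.max(0, last - first);
    }

    public static void main(String[] args) {
        int[] arr = {6, 4, 2, 9, 5};
        System.out.println(sumRange(arr, 1, 4));   // adds arr[1], arr[2], arr[3]
        System.out.println(size(1, 4));            // three elements
        System.out.println(size(0, arr.length));   // the whole array is [0, arr.length)
    }
}
```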
Sequential Search
Begin with a target value and an index range [first, last).
Go through the sublist item by item, looking for target.
Return the index position of the match, or -1 if target is not in the sublist.
Examples
seqSearch()
    public static int seqSearch(int[] arr, int first, int last, int target)
    {
        // scan first <= i < last; return index for position if a match occurs
        for (int i = first; i < last; i++)
            if (arr[i] == target)
                return i;

        return -1;   // target not found
    }
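The following sketch shows how the index range affects the result. It copies the seqSearch() logic into a hypothetical class SeqSearchDemo so that it can run on its own; the array values are made up for the illustration.

```java
class SeqSearchDemo {
    // Same logic as seqSearch() above, repeated so the example is self-contained.
    static int seqSearch(int[] arr, int first, int last, int target) {
        for (int i = first; i < last; i++)
            if (arr[i] == target)
                return i;
        return -1;   // target not found
    }

    public static void main(String[] args) {
        int[] arr = {6, 4, 2, 9, 5, 3, 10, 7};   // illustrative values

        // search the whole array, [0, arr.length)
        System.out.println(seqSearch(arr, 0, arr.length, 3));

        // 3 is stored at index 5, which lies outside [0, 5)
        System.out.println(seqSearch(arr, 0, 5, 3));

        // the returned index is a position in arr, not an offset into the sublist
        System.out.println(seqSearch(arr, 2, 6, 9));
    }
}
```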
Binary Search
Binary search requires an ordered list, so that large sections of the list can be skipped during the search.
Calculate the midpoint of the current sublist [first, last).
If target matches the midpoint value, the search is finished.
If target is less than the midpoint value, look in the lower sublist; otherwise, look in the upper sublist.
Continue until target is found or the sublist size is 0.
Binary Search Case 1
target == midValue. The search is complete.
- mid is the index of the midpoint value

Binary Search Case 2
target < midValue
- so search the lower sublist
The index range becomes [first, mid). Set index last to be the end of the lower sublist (last = mid).

Binary Search Case 3
target > midValue
- so search the upper sublist
The index range becomes [mid+1, last), because the upper sublist starts to the right of mid. Set index first to be the front of the upper sublist (first = mid+1).

Binary Search Finish
The binary search stops when a match is found, or when the sublist is 'empty'.
- a sublist [first, last) is empty when first >= last.
A Successful Search (target = 23)
Step 1
Step 2: The sublist is roughly halved.
Step 3: The sublist is roughly halved.
Failure Example (target = 4)
Step 1
Step 2: The sublist is roughly halved.
Step 3: The sublist is roughly halved.
Step 4: Index range [2,2). first ≥ last, so the search fails. The return value is -1.
binSearch()
    public static int binSearch(int[] arr, int first, int last, int target)
    {
        int mid;        // index of midpoint
        int midValue;   // value from arr[mid]

        // test for nonempty sublist
        while (first < last) {
            mid = (first+last)/2;
            midValue = arr[mid];

            if (target == midValue)
                return mid;        // have a match
            // determine which sublist to search
            else if (target < midValue)
                last = mid;        // search lower sublist; set last
            else
                first = mid+1;     // search upper sublist; set first
        }
        return -1;   // target not found
    }  // end of binSearch()
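A quick way to check the three cases is to run a successful and a failing search on a small ordered array. The sketch below places the same logic in a hypothetical class BinSearchDemo; the nine array values are an assumption chosen for the illustration. One deliberate difference from binSearch() above: the midpoint is computed as first + (last - first)/2, which gives the same index but cannot overflow when first + last exceeds the int range.

```java
class BinSearchDemo {
    // Binary search over the half-open range [first, last) of an ordered array.
    static int binSearch(int[] arr, int first, int last, int target) {
        while (first < last) {                       // sublist is nonempty
            int mid = first + (last - first) / 2;    // overflow-safe midpoint
            int midValue = arr[mid];

            if (target == midValue)
                return mid;                          // case 1: match
            else if (target < midValue)
                last = mid;                          // case 2: lower sublist [first, mid)
            else
                first = mid + 1;                     // case 3: upper sublist [mid+1, last)
        }
        return -1;                                   // empty sublist: target not found
    }

    public static void main(String[] args) {
        int[] arr = {-7, 3, 5, 8, 12, 16, 23, 33, 55};   // must be in ascending order

        System.out.println(binSearch(arr, 0, arr.length, 23));   // present in arr
        System.out.println(binSearch(arr, 0, arr.length, 4));    // absent from arr
    }
}
```

With these values the search for 4 narrows the range to [0,4), then [0,2), then [2,2), at which point first >= last and the loop ends.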
System/Memory Efficiency
System efficiency is how fast an algorithm runs on a particular machine.
Memory efficiency is the amount of memory an algorithm uses.
- if an algorithm uses too much memory, it can be too slow, or may not execute at all, on a particular system.

Running Time Analysis
Machine-independent algorithm efficiency is measured in terms of the number of operations used in the code.
The complexity of the algorithm usually depends on some size measure.
- usually the size of the input data
min()
    // return the smallest elem. in arr[]
    public static int min(int[] arr)
    {
        int n = arr.length;
        if (n == 0) {
            System.out.println("Array has 0 size");
            return 0;
        }
        else {
            int min = arr[0];
            for (int i = 1; i < n; i++)
                if (arr[i] < min)
                    min = arr[i];
            return min;
        }
    }

The input data is the array, arr[].
Running Time
T(n) is the number of comparison operations required to find the smallest element in an n-element array: T(n) = n-1
- T() was explained in "Discrete Maths", part 4
- We could count all operations, but T() would still be linear in n.
Running Time: Selection Sort
Count the number of comparison operations used to sort an array of size n:
- there are n-1 passes altogether
- in the first pass there are n-1 comparisons
- in the 2nd pass, n-2 comparisons, ...
$T(n) = (n-1) + (n-2) + \cdots + 2 + 1 = \frac{n(n-1)}{2} = \frac{n^2}{2} - \frac{n}{2}$
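The count can be confirmed by instrumenting the sort. The sketch below is an illustration under assumptions: SelectionSortCount and sortAndCount() are hypothetical names, and the method is a copy of the selection sort logic with a counter added to the inner loop. The count depends only on n, not on the order of the input, because the inner loop never exits early.

```java
class SelectionSortCount {
    // Selection sort that returns the number of element comparisons performed.
    static long sortAndCount(int[] arr) {
        long comparisons = 0;
        int n = arr.length;
        for (int idx = 0; idx < n - 1; idx++) {
            int smallIndex = idx;
            for (int j = idx + 1; j < n; j++) {
                comparisons++;                       // one comparison per inner step
                if (arr[j] < arr[smallIndex])
                    smallIndex = j;
            }
            int temp = arr[idx];
            arr[idx] = arr[smallIndex];
            arr[smallIndex] = temp;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        for (int n : new int[]{5, 10, 100}) {
            int[] arr = new int[n];
            for (int i = 0; i < n; i++)
                arr[i] = n - i;                      // descending input
            long counted = sortAndCount(arr);
            long formula = (long) n * (n - 1) / 2;   // n(n-1)/2
            System.out.println("n=" + n + " counted=" + counted + " formula=" + formula);
        }
    }
}
```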
Running Time: seqSearch()
Best case: target found at index 0. $T(n) = 1$
Worst case: target found at index n-1, or not found at all. $T(n) = n$
Average case: the average of the number of comparisons needed to find a target at any position.
$T(n) = \frac{1 + 2 + \cdots + n}{n} = \frac{n(n+1)}{2} \cdot \frac{1}{n} = \frac{n+1}{2}$
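The average case can be checked by brute force: search for every stored value once and average the comparison counts. The names SeqSearchAverage, comparisons() and average() are hypothetical, and the sketch assumes every target is present exactly once, which is the assumption behind the (n+1)/2 formula.

```java
class SeqSearchAverage {
    // Number of comparisons a sequential search makes before finding target
    // (arr.length if target is absent).
    static int comparisons(int[] arr, int target) {
        int count = 0;
        for (int i = 0; i < arr.length; i++) {
            count++;
            if (arr[i] == target)
                break;
        }
        return count;
    }

    // Average comparisons over one search for each of the n stored values.
    static double average(int n) {
        int[] arr = new int[n];
        for (int i = 0; i < n; i++)
            arr[i] = i;                    // distinct values 0 .. n-1
        long total = 0;
        for (int target = 0; target < n; target++)
            total += comparisons(arr, target);
        return (double) total / n;
    }

    public static void main(String[] args) {
        for (int n : new int[]{9, 100, 1000})
            System.out.println("n=" + n + " measured=" + average(n)
                               + " formula=" + (n + 1) / 2.0);
    }
}
```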
Running Time: binSearch()
Best case: target found at the first midpoint. $T(n) = 1$
Worst case: the length of the sublist halves at each iteration until it is empty. $T(n) = \lfloor \log_2 n \rfloor + 1$
Average case: a more involved analysis shows $T(n) = \lfloor \log_2 n \rfloor$
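The worst-case figure can be reproduced by counting loop iterations for a target smaller than every element, which always takes the lower (larger) half. This is a sketch under assumptions: BinSearchWorstCase, iterations() and floorLog2() are hypothetical names, and T(n) is taken to count loop iterations rather than individual comparisons.

```java
class BinSearchWorstCase {
    // Number of while-loop iterations binary search makes for the given target.
    static int iterations(int[] arr, int target) {
        int first = 0, last = arr.length, count = 0;
        while (first < last) {
            count++;
            int mid = first + (last - first) / 2;
            if (target == arr[mid])
                break;
            else if (target < arr[mid])
                last = mid;
            else
                first = mid + 1;
        }
        return count;
    }

    // floor(log2 n) for n > 0, from the position of the highest set bit.
    static int floorLog2(int n) {
        return 31 - Integer.numberOfLeadingZeros(n);
    }

    public static void main(String[] args) {
        for (int n : new int[]{1, 8, 9, 1000, 1000000}) {
            int[] arr = new int[n];
            for (int i = 0; i < n; i++)
                arr[i] = i;                         // ordered values 0 .. n-1
            int measured = iterations(arr, -1);     // -1 is smaller than every element
            System.out.println("n=" + n + " iterations=" + measured
                               + " formula=" + (floorLog2(n) + 1));
        }
    }
}
```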
Big-Oh Notation
Big-Oh, O(), is a simpler version of T(n) that keeps only the 'biggest' term of the T(n) equation, without constants.
- e.g. if $T(n) = 8n^3 + 5n^2 - 11n + 1$, then T(n) is $O(n^3)$.
- Selection sort is $O(n^2)$.
- The average case for seqSearch() is $O(n)$.
- The worst case for binSearch() is $O(\log_2 n)$.
O() was explained in "Discrete Maths", part 4
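One way to see why only the biggest term matters is to bound the example polynomial directly. The constants used below (c = 14, n_0 = 1) are one valid choice, not the only one.

```latex
\begin{align*}
T(n) &= 8n^3 + 5n^2 - 11n + 1 \\
     &\le 8n^3 + 5n^3 + 0 + n^3 && \text{for } n \ge 1 \\
     &= 14n^3
\end{align*}
```

Each step uses $5n^2 \le 5n^3$, $-11n \le 0$ and $1 \le n^3$ for $n \ge 1$, so $T(n) \le c\,n^3$ with $c = 14$ for all $n \ge n_0 = 1$, which is what "T(n) is $O(n^3)$" means.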
Common Big-Oh's
Constant time: T(n) is $O(1)$ when its running time is independent of n.
- e.g. find the smallest value in an ordered n-element array

Linear: T(n) is $O(n)$ when the running time is proportional to n, the size of the data. If n doubles, T() doubles.
- e.g. find the smallest value in an unordered n-element array, as in min()

Quadratic: T(n) is $O(n^2)$. If n doubles, T() increases by a factor of 4.
- e.g. selection sort

Cubic: T(n) is $O(n^3)$. Doubling n increases T() by a factor of 8.
- e.g. standard multiplication of two n*n matrices

Logarithmic: T(n) is $O(\log_2 n)$ or $O(n \log_2 n)$.
- occurs when the algorithm repeatedly subdivides the data into sublists whose sizes are 1/2, 1/4, 1/8, ... of the original size n
- e.g. binary search is $O(\log_2 n)$
- e.g. quicksort is $O(n \log_2 n)$ on average

Exponential: T(n) is $O(a^n)$ for a constant a > 1.
- These algorithms deal with problems that require searching through a large number of potential solutions before finding an answer.
- e.g. the traveling salesman problem mentioned in the Discrete Maths subject
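The doubling claims in the list above can be checked numerically by computing T(2n)/T(n) for idealised running-time functions. The class DoublingRatio and its method ratio() are hypothetical, and the functions are pure formulas (n, n^2, n^3, log2 n), not measurements of real code.

```java
import java.util.function.DoubleUnaryOperator;

class DoublingRatio {
    // Ratio T(2n)/T(n) for a given running-time function T.
    static double ratio(DoubleUnaryOperator t, double n) {
        return t.applyAsDouble(2 * n) / t.applyAsDouble(n);
    }

    public static void main(String[] args) {
        double n = 1000;
        System.out.println("linear:      " + ratio(x -> x, n));
        System.out.println("quadratic:   " + ratio(x -> x * x, n));
        System.out.println("cubic:       " + ratio(x -> x * x * x, n));
        // log2(x) computed as ln(x)/ln(2); doubling n adds only 1 to log2 n
        System.out.println("logarithmic: " + ratio(x -> Math.log(x) / Math.log(2), n));
    }
}
```

For the logarithmic case the ratio is close to 1, which is why doubling the data size barely affects a binary search.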
T() Graphs
T() Equations
The Timing Class
A class in the Ford & Topp ds.time package.

Timing Class Example
    Timing sortTimer = new Timing();

    sortTimer.start();                     // start timing
    selectionSort(arr);                    // sort
    double timeInSec = sortTimer.stop();   // get sorting time in secs
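Readers without the Ford & Topp packages can get similar behaviour from the standard library. The class below, SimpleTimer, is a hypothetical stand-in written for this illustration. It is not the ds.time.Timing implementation, and it only mirrors the start()/stop() usage shown above, with stop() returning elapsed seconds. java.util.Arrays.sort() is used as the work being timed.

```java
import java.util.Arrays;
import java.util.Random;

class SimpleTimer {
    private long startNanos;   // time at which start() was last called

    // Begin timing.
    public void start() {
        startNanos = System.nanoTime();
    }

    // Return the seconds elapsed since the last call to start().
    public double stop() {
        return (System.nanoTime() - startNanos) / 1e9;
    }

    public static void main(String[] args) {
        int[] arr = new int[20000];
        Random rnd = new Random(42);       // fixed seed for repeatable data
        for (int i = 0; i < arr.length; i++)
            arr[i] = rnd.nextInt(1000000);

        SimpleTimer sortTimer = new SimpleTimer();
        sortTimer.start();                 // start timing
        Arrays.sort(arr);                  // the work being timed
        double timeInSec = sortTimer.stop();
        System.out.println("Sorting took " + timeInSec + " seconds");
    }
}
```

System.nanoTime() is preferred over System.currentTimeMillis() for measuring intervals because it is not affected by changes to the system clock. Single measurements are still noisy, so repeated runs give more trustworthy figures.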
Search Time Comparison
Compare sequential and binary search on the following problem: search for each value in the target list inside listSeq (sequential search) and inside listBin (binary search, after listBin has been sorted).
SearchTimes
    import java.util.Random;
    import java.text.DecimalFormat;
    import ds.util.Arrays;     // Ford & Topp packages
    import ds.time.Timing;

    public class SearchTimes
    {
        public static void main(String[] args)
        {
            int ARRAY_SIZE = ;
            int TARGET_SIZE = 50000;

            // arrays for searches
            int[] listSeq = new int[ARRAY_SIZE],
                  listBin = new int[ARRAY_SIZE],
                  targetList = new int[TARGET_SIZE];

            // use Timing object t to compute times
            Timing t = new Timing();

            // random number object
            Random rnd = new Random();

            // format real numbers with 3 dps
            DecimalFormat fmt = new DecimalFormat("#.000");

            // initialize arrays with random numbers
            for (int i = 0; i < ARRAY_SIZE; i++)
                listSeq[i] = listBin[i] = rnd.nextInt( );

            // initialize targetList with random numbers
            for (int i = 0; i < TARGET_SIZE; i++)
                targetList[i] = rnd.nextInt( );

            // time seq. search for targets in listSeq
            t.start();
            for (int i = 0; i < TARGET_SIZE; i++)
                Arrays.seqSearch(listSeq, 0, ARRAY_SIZE, targetList[i]);
            double seqTime = t.stop();
            System.out.println("Sequential Search takes " +
                               fmt.format(seqTime) + " seconds.");

            // sort listBin
            Arrays.selectionSort(listBin);

            // time binary search for targets in listBin
            t.start();
            for (int i = 0; i < TARGET_SIZE; i++)
                Arrays.binSearch(listBin, 0, ARRAY_SIZE, targetList[i]);
            double binTime = t.stop();
            System.out.println("Binary Search takes " +
                               fmt.format(binTime) + " seconds.");

            System.out.println("Ratio of sequential to binary search time is " +
                               fmt.format(seqTime/binTime));
        }  // end of main()
    }  // end of SearchTimes class
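Compilation and Execution
Compiling and running SearchTimes requires the Ford and Topp libraries.
The reported binary search time does not include the time taken to sort listBin, because the sort runs outside the timed section. The sorting time must be added in for a fair comparison.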