CPT: Search
Computer Programming Techniques, Semester 1, 1998

17. Searching

Objectives of these slides:
  – to discuss searching: its implementation, and some complexity analyses

Overview:
1. Searching Definition
2. External/Internal Searching
3. Simplifying the Search
4. Analysing Searching
5. Linear Search
6. Binary Search
7. Comparison of Searching Algorithms

1. Searching Definition
• We are given a collection of $n$ elements: $a_0, a_1, \ldots, a_{n-1}$
• Each $a_i$ consists of two parts: a unique ID, or key, and some data.
• Searching is the process of finding an $a_i$ whose key equals some specified key value, $k$.

2. External/Internal Searching
• Searching algorithms fall into two broad categories:
  – external searching (large amount of data on disk)
  – internal searching (small amount of data in memory)

External Searching
• The data is too large to be loaded into memory all at once.
• Scan the data files to find the requested item.
• Try to minimise the number of blocks read.

Internal Searching
• Traverse the data structure holding the items.
• Arrays, lists and trees can be used as data structures.
• Try to minimise the number of key comparisons.

3. Simplifying the Search
3.1. Internal Search (arrays)
3.2. Simplified Search Data Structure
3.3. Search Function Interface
3.4. Search Driver

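3.1. Internal Search (arrays)
• Array-based search data structure:

    #define SIZE 100

    /* data_item stands for whatever data type is stored;
       a string is used here as an example */
    typedef char *data_item;

    struct elem {
        int key;        /* unique ID */
        data_item s;    /* data stored with the key */
    };

    struct elem a[SIZE];
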
Pictorially:
[Figure: the array elements a[0], a[1], a[2], a[3], a[4], each holding a key and a data item (names such as jim, jane, bob, ann)]

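The keyed array above can be searched by comparing each element's key with the wanted value. The following is a minimal sketch, assuming the data item is a string; the helper name find_by_key and the sample keys in the usage note are made up for illustration and do not come from the slides.

```c
#include <stddef.h>

/* Assumed element layout, following the slide: an integer key plus a data
   item. Using a string as the data item is an assumption for illustration. */
struct elem {
    int key;
    const char *s;
};

/* Hypothetical helper: return a pointer to the element whose key equals k,
   or NULL if no element has that key. */
const struct elem *find_by_key(const struct elem a[], int size, int k)
{
    int i;

    for (i = 0; i < size; i++)
        if (a[i].key == k)
            return &a[i];
    return NULL;
}
```

For example, given `struct elem a[] = { {7, "jim"}, {3, "jane"}, {9, "bob"} };`, the call `find_by_key(a, 3, 3)` returns a pointer to the element holding "jane", and `find_by_key(a, 3, 4)` returns NULL.
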
3.2. Simplified Search Data Structure
• Remove keys
• Represent the data items as integers
• Each integer (data item) is unique: no duplicates
  – can use data items as keys
• e.g.
[Figure: an integer array a[0], a[1], a[2], a[3], ... holding one integer per element]

3.3. Search Function Interface

    int search(int array[], int key, int size)
    {
        /* ... */
    }

• The function examines the array looking for an item containing key.
  – Success: return array index
  – Failure: return -1

3.4. Search Driver

    /* Search an integer array */
    #include <stdio.h>

    #define SIZE 100

    int search(int [], int, int);

    int main(void)
    {
        int a[SIZE], x, searchkey, index;

        /* fill the array with the even numbers 0, 2, 4, ... */
        for (x = 0; x < SIZE; x++)
            a[x] = 2 * x;

        printf("Enter integer search key:\n");
        scanf("%d", &searchkey);

        index = search(a, searchkey, SIZE);
        if (index != -1)
            printf("Array index %d\n", index);
        else
            printf("Value not found\n");
        return 0;
    }

    int search(int array[], int key, int size)
    {
        /* search implementation */
    }

4. Analysing Search
• Searching algorithms should be both space and time efficient.
• With internal searching, the critical factor is the number of key comparisons, $C$.
• For each searching algorithm, consider its best, worst and average case performance in terms of $C$.

5. Linear Search
5.1. Linear Search Algorithm
5.2. Linear Search Function
5.3. Linear Search Program
5.4. Analysis of Linear Search
5.5. Sorted Linear Search
5.6. Recursive Linear Search

5.1. Linear Search Algorithm
• Move along the data structure, comparing the search key, $k$, to the key value of the current item until:
  – a match is found, or
  – the whole data structure has been considered

5.2. Linear Search Function

    int linear_search(int array[], int key, int size)
    {
        int n;

        for (n = 0; n < size; n++)
            if (array[n] == key)
                return n;       /* match: return its index */
        return -1;              /* no match */
    }

5.3. Linear Search Program (Fig 6.18)

    /* Linear search of an array */
    #include <stdio.h>

    #define SIZE 100

    int linear_search(int [], int, int);

    int main(void)
    {
        int a[SIZE], x, searchkey, index;

        /* fill the array with the even numbers 0, 2, 4, ... */
        for (x = 0; x < SIZE; x++)
            a[x] = 2 * x;

        printf("Enter integer search key:\n");
        scanf("%d", &searchkey);

        index = linear_search(a, searchkey, SIZE);
        if (index != -1)
            printf("Array index %d\n", index);
        else
            printf("Value not found\n");
        return 0;
    }

Execution

    Enter integer search key:
    36
    Array index 18

    Enter integer search key:
    37
    Value not found

5.4. Analysis of Linear Search
• Consider an array with $n$ items.
• In the best case, find a match at the start of the array: $C_{\min} = 1$
• In the worst case, all items are examined: $C_{\max} = n$
• In the average case, about half of the items are considered: $C_{\text{average}} = n/2 = O(n)$

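These counts can be checked by instrumenting the search. The sketch below is an illustration only: counted_linear_search and average_comparisons are hypothetical names, and the comparisons out-parameter is not part of the slides' linear_search. It assumes that every search succeeds and that each item is equally likely to be the one searched for.

```c
/* Linear search that also reports the number of key comparisons made.
   The comparisons out-parameter is an addition for illustration. */
int counted_linear_search(const int array[], int key, int size, int *comparisons)
{
    int n;

    *comparisons = 0;
    for (n = 0; n < size; n++) {
        (*comparisons)++;
        if (array[n] == key)
            return n;
    }
    return -1;
}

/* Average number of comparisons over one successful search per array item. */
double average_comparisons(const int array[], int size)
{
    int i, c;
    long total = 0;

    for (i = 0; i < size; i++) {
        counted_linear_search(array, array[i], size, &c);
        total += c;
    }
    return (double) total / size;
}
```

With the driver's array (a[x] = 2 * x, 100 items), searching for 0 takes 1 comparison, searching for the absent value 37 takes 100, and the average over the 100 successful searches is 50.5, that is (n + 1)/2, which is about n/2.
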
Meaning of O()
• Read O() as "about", where constants and small values are ignored. Concentrate on large changes.
• For example:
  – $5n + 2 = O(n)$
  – $5n^2 + n = O(n^2)$
  – $6 = O(1)$ (constant)
• O() is useful for giving rough estimates.

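As a worked instance of the second example, the smaller term can be absorbed into the larger one, after which only a constant factor remains, and O() ignores constant factors:

```latex
\begin{align*}
5n^2 + n &\le 5n^2 + n^2 = 6n^2 \quad \text{for } n \ge 1, \\
\text{so}\quad 5n^2 + n &= O(n^2).
\end{align*}
```
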
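5.5. Sorted Linear Search

    /* the array must be sorted in ascending order */
    int sl_search(int array[], int key, int size)
    {
        int n;

        for (n = 0; n < size; n++)
            if (array[n] == key)
                return n;
            else if (array[n] > key)
                return -1;      /* larger key found */
        return -1;
    }
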
• The actual speed of the function improves for some searches (an unsuccessful search can stop as soon as a larger item is found), but the average complexity remains $O(n)$.

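The early exit can be made visible by counting how many items are examined. This is a sketch for illustration: counted_sl_search is a hypothetical name and the examined out-parameter is not part of the slides' sl_search.

```c
/* Sorted linear search that also reports how many array items were examined.
   The examined out-parameter is an addition for illustration; the array must
   be sorted in ascending order. */
int counted_sl_search(const int array[], int key, int size, int *examined)
{
    int n;

    *examined = 0;
    for (n = 0; n < size; n++) {
        (*examined)++;
        if (array[n] == key)
            return n;
        else if (array[n] > key)
            return -1;          /* larger key found: key cannot be present */
    }
    return -1;                  /* key is larger than every item */
}
```

With the driver's array (a[x] = 2 * x, 100 items), a search for the absent value 37 stops after examining 20 items instead of all 100. A search for 500, which is larger than every item, still examines all 100, which is why the worst case stays at n.
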
5.6. Recursive Linear Search

    int rl_search(int array[], int key, int index, int size)
    {
        if (index >= size)          /* ran off the end: no match */
            return -1;
        if (array[index] == key)
            return index;
        else
            return rl_search(array, key, index + 1, size);
    }

Call from main():

    element = rl_search(a, searchkey, 0, SIZE);    /* 0 is initial index */

6. Binary Search
6.1. Binary Search Algorithm
6.2. Execution of Binary Search
6.3. Binary Search Function
6.4. Analysis of Binary Search
6.5. Recursive Binary Search Function

6.1. Binary Search Algorithm
• A divide-and-conquer algorithm
• Search for key value $k$:
  1. Take the middle item of the array segment: $a_{mid}$
  2. If $k$ equals $a_{mid}$'s key, then the search is successful
  3. Otherwise, if the range of array items under consideration is empty, then the search has failed
  4. If $k < a_{mid}$'s key, then restrict the search to the lower half of the array and go to step 1
  5. If $k > a_{mid}$'s key, then restrict the search to the upper half of the array and go to step 1

6.2. Execution of Binary Search
• Searching for 15 in 2, 4, 6, 7, 10, 11, 15, 17, 20, 29, 30
• Since 15 > the middle key (11), consider its upper half: 15, 17, 20, 29, 30
• Since 15 < the middle key (20), consider its lower half: 15, 17
• 15 == middle item, success.

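The run above can be reproduced with a binary search that prints each step. This is a sketch for illustration: the function name and the tracing output are additions, and it assumes the usual low/high/mid loop over an array sorted in ascending order.

```c
#include <stdio.h>

/* Binary search that prints each step. The tracing and the function name are
   additions for illustration; the array must be sorted in ascending order. */
int traced_binary_search(const int array[], int key, int size)
{
    int low = 0, high = size - 1, mid;

    while (low <= high) {
        mid = (low + high) / 2;
        printf("low=%d high=%d mid=%d middle item=%d\n",
               low, high, mid, array[mid]);
        if (key < array[mid])
            high = mid - 1;     /* continue in the lower half */
        else if (key > array[mid])
            low = mid + 1;      /* continue in the upper half */
        else
            return mid;         /* match */
    }
    return -1;                  /* range is empty: no match */
}
```

For the 11-item array above and key 15, the middle items examined are 11, 20 and 15, and the function returns index 6. When a range has an even number of items, such as 15, 17, integer division picks the lower of the two middle positions.
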
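6.3. Binary Search Function

    int binary_search(int array[], int key, int size)
    {
        int low = 0, high = size - 1, mid;

        while (low <= high) {
            mid = (low + high) / 2;
            if (key < array[mid])
                high = mid - 1;     /* search lower half */
            else if (key > array[mid])
                low = mid + 1;      /* search upper half */
            else                    /* found match */
                return mid;
        }
        return -1;                  /* no match */
    }
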
6.4. Analysis of Binary Search
• Consider an array with $2^k$ items.
• At each iteration:
  – either one or two key comparisons are made
  – the number of items under consideration is halved
• At an array range of size 1, either:
  – the item is found, or
  – the item is not in the array.
• It takes $k$ iterations to reach an array range of size 1:
    $2^k$ items $\to$ 1 item in $k$ steps
    so $k$ items $\to$ 1 item in $\log_2 k$ steps
• Thus, for an array of size $n$, it takes (about) $\log_2 n$ steps/iterations.

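The halving argument can be checked directly by counting how often a range of n items can be halved before a single item is left. The helper name halvings is hypothetical and the sketch is only an illustration of the counting, not part of the search itself.

```c
/* Count how many times n items can be halved before one item remains.
   halvings is a hypothetical helper name; integer division rounds down. */
int halvings(int n)
{
    int steps = 0;

    while (n > 1) {
        n /= 2;
        steps++;
    }
    return steps;
}
```

Here halvings(8) is 3 and halvings(1024) is 10, matching k for 2^k items. For a size that is not a power of two, such as 100, the result is 6, which is log2 100 rounded down.
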
Best Case
• Find a match in the first iteration: $C_{\min} = 2$ (a match is reported only after both the < test and the > test have failed)

Worst Case
• Key is not in the array
• Array is divided in half $\log_2 n$ times
• Each iteration requires roughly 2 key comparisons
• $C_{\max} \approx 2 \times \text{no. of iterations} \approx 2 \log_2 n = O(\log_2 n)$

Average Case
• Need to consider all possible cases:
  – matches at all positions
  – misses at all positions
• The result: $C_{\text{average}} \approx 1.8 \log n = O(\log_2 n)$
• Close to worst case performance

6.5. Recursive Binary Search Function

    int rb_search(int array[], int left, int right, int key)
    {
        int mid;

        if (left > right)           /* nowhere to look */
            return -1;
        mid = (left + right) / 2;
        if (array[mid] == key)
            return mid;
        if (array[mid] < key)       /* key can only be in the upper half */
            return rb_search(array, mid + 1, right, key);
        else                        /* key can only be in the lower half */
            return rb_search(array, left, mid - 1, key);
    }

The initial call from main():

    element = rb_search(a, 0, SIZE-1, searchkey);

• Average complexity remains at $O(\log_2 n)$

7. Comparison of Searching Algorithms
• Linear search is $O(n)$
• Binary search is $O(\log_2 n)$
• Plug in values of $n$ to see that binary search is better (faster).

• The actual speed of linear search can be improved by ordering the elements. However, it is still $O(n)$.
• For an array of sorted elements, binary search is preferable.
• If the array is unsorted, binary search cannot be used.

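To plug in values of n as suggested above, the worst-case work of the two searches can be tabulated. This is a rough sketch under stated assumptions: both helper names are hypothetical, the binary search is assumed to be the usual low/high/mid loop, and its worst case is simulated with a key larger than every item, so no array is needed.

```c
#include <stdio.h>

/* Worst-case loop iterations of a binary search over n sorted items,
   simulated for a key larger than every item (always go to the upper half).
   Hypothetical helper for illustration; no array is needed. */
int binary_worst_iterations(int n)
{
    int low = 0, high = n - 1, steps = 0;

    while (low <= high) {
        low = (low + high) / 2 + 1;   /* discard the lower half and the middle */
        steps++;
    }
    return steps;
}

/* Print worst-case item visits for linear search beside binary search. */
void print_comparison(void)
{
    static const int sizes[] = { 10, 100, 1000, 1000000 };
    int i;

    printf("%10s %15s %15s\n", "n", "linear (worst)", "binary (worst)");
    for (i = 0; i < 4; i++)
        printf("%10d %15d %15d\n", sizes[i], sizes[i],
               binary_worst_iterations(sizes[i]));
}
```

For n = 10, 100, 1000 and 1000000 the binary search needs at most 4, 7, 10 and 20 iterations, while linear search may examine all n items. Each binary search iteration makes up to two key comparisons, so the comparison counts are roughly double the iteration counts.
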