1
CISC101 Reminders
Last quiz this week in lab. Topics in slides from last week.
Last assignment due last day of class (April 6).
Winter 2018 CISC101 - Prof. Alan McLeod (12/2/2018)
2
Today
Algorithms, Cont. - Searching:
  Sequential Search.
  Binary Search.
Start Sorting?
3
Algorithms, Cont. - Searching
We will look at two algorithms:
  Sequential Search: brute force! Works on a dataset in any order.
  Binary Search: fast! Only works on sorted datasets.
4
Searching in Python
We already have searching methods as well as the keywords in and not in.
  Lists: count() and index() (lists have no find() method).
  Strings: various find(), count() and index() methods.
A search could result in a count of occurrences, a True or False, or just the location of the first match.
So, why do we need to write our own searching functions?
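As a quick illustration of the built-in tools mentioned on this slide, here is a minimal sketch. The sample list `nums` and string `text` are made up for the example and are not from the course materials.

```python
# Hypothetical sample data, chosen only for illustration.
nums = [7, 3, 9, 3, 5]
text = "binary search"

# The in and not in keywords give True or False.
has_nine = 9 in nums            # True
no_four = 4 not in nums         # True

# count() gives the number of occurrences.
threes = nums.count(3)          # 2

# index() gives the location of the first match
# and raises ValueError if there is no match.
first_three = nums.index(3)     # 1

# Strings also have find(), which returns -1 instead of raising.
pos = text.find("search")       # 7
missing = text.find("sort")     # -1
```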
5
Searching in Python, Cont.
You might need to search datasets in a programming language that does not have this code built-in. Or your dataset structure might not be amenable for use with the built-in methods. So, you need to know these algorithms!
6
Searching, Cont.
Sequential search pseudocode:
  Loop through the dataset, starting at the first element, until the value of target matches one of the elements.
  Return the location of the match.
  If a match is not found, raise a ValueError exception.
(Note that the list.index() method also raises a ValueError exception if the value is not located…)
7
Sequential Search
As code:

    def sequentialSearch(numsList, target):
        i = 0
        size = len(numsList)
        while i < size:
            if numsList[i] == target:
                return i
            i = i + 1
        raise ValueError("Target not found.")

Note how len(numsList) is done outside the loop.
8
Sequential Search, Version 2

    def sequentialSearch2(numsList, target):
        for i in range(len(numsList)):
            if numsList[i] == target:
                return i
        raise ValueError("Target not found.")

Using our trusty for loop. Is this one faster?
Can this be coded without using the index, i?
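One possible answer to the question about avoiding the index is sketched below. The name `sequentialSearch3` is hypothetical and is not one of the course's versions. Because the function must return a location, a position still has to be tracked somehow; here `enumerate()` does that bookkeeping.

```python
def sequentialSearch3(numsList, target):
    # enumerate() supplies each position and value together,
    # so the list is never indexed explicitly with numsList[i].
    for position, value in enumerate(numsList):
        if value == target:
            return position
    raise ValueError("Target not found.")


print(sequentialSearch3([4, 8, 15, 16], 15))  # 2
```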
9
Timing our Search
See: TimingSeqSearchDemo.py
Look at the normal case and the worst case. Note how the exception is raised and caught.
The farther the target is from the beginning of the dataset, the longer the search takes. Makes sense!
Our fastest sequential search is still about 4 times slower than index(). Why?
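TimingSeqSearchDemo.py is not reproduced here, so the following is only a rough sketch of how such a timing comparison might be set up. The helper `timeSearch`, the dataset size and the repeat count are all assumptions. Actual timings vary by machine, so no numbers are claimed. One likely reason index() wins in CPython is that it runs as compiled C code, while our loop is interpreted.

```python
import timeit


def sequentialSearch2(numsList, target):
    for i in range(len(numsList)):
        if numsList[i] == target:
            return i
    raise ValueError("Target not found.")


def timeSearch(searchFunction, numsList, target, repeats=20):
    # Hypothetical helper: times repeated calls, catching the exception
    # so that a missing target (the worst case) can be timed too.
    def attempt():
        try:
            searchFunction(numsList, target)
        except ValueError:
            pass
    return timeit.timeit(attempt, number=repeats)


data = list(range(10000))
cases = [("near start", 10), ("near end", 9990), ("missing", -1)]
for label, target in cases:
    ours = timeSearch(sequentialSearch2, data, target)
    builtIn = timeSearch(lambda lst, t: lst.index(t), data, target)
    print(label, "ours:", ours, "index():", builtIn)
```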
10
Other Searches
  Return True if a match exists.
  Return a count of how many values match.
  Return a list of locations that match (not built-in to Python).
  Return the location of the match searching from the end of the list.
  …
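Each of the variations listed above is a small change to the sequential search loop. The function names below are hypothetical, and these are sketches rather than the course's own solutions.

```python
def containsValue(numsList, target):
    # True as soon as any element matches.
    for value in numsList:
        if value == target:
            return True
    return False


def countMatches(numsList, target):
    # Must look at every element, so there is no early exit.
    count = 0
    for value in numsList:
        if value == target:
            count = count + 1
    return count


def allLocations(numsList, target):
    # Collects every matching index; an empty list means no match.
    locations = []
    for i in range(len(numsList)):
        if numsList[i] == target:
            locations.append(i)
    return locations


def lastLocation(numsList, target):
    # Walks backwards from the last index down to 0.
    i = len(numsList) - 1
    while i >= 0:
        if numsList[i] == target:
            return i
        i = i - 1
    raise ValueError("Target not found.")
```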
11
Aside – Comparing Real Numbers
Does equal 1?
To compare floating point numbers you need to know something about the accuracy of the numbers. Suppose that your numbers are accurate to 1 part in 10,000. Instead of comparing using ==, use something like:

    if abs(numsList[i] - target) <= 1e-4:
        return i

(This tests an absolute difference of 1e-4, which corresponds to 1 part in 10,000 only for values with a magnitude near 1.)
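A minimal sketch of a tolerance-based search follows. The function name `sequentialSearchFloat` and its `tolerance` parameter are assumptions, and the sum 0.1 + 0.2 is just a familiar illustration, not the expression from the slide. The standard library's `math.isclose()` offers a relative tolerance as an alternative.

```python
import math


def sequentialSearchFloat(numsList, target, tolerance=1e-4):
    # Accept a match when the two values differ by no more than tolerance.
    for i in range(len(numsList)):
        if abs(numsList[i] - target) <= tolerance:
            return i
    raise ValueError("Target not found.")


total = 0.1 + 0.2
print(total == 0.3)                             # False
print(abs(total - 0.3) <= 1e-4)                 # True
print(math.isclose(total, 0.3, rel_tol=1e-4))   # True
```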
12
Summary
The sequential search algorithm works for any kind of dataset in any order. It is a “brute force” technique.
You now know how to build your own sequential search function if Python’s built-in searches won’t do the trick.
Next: binary search.
13
Searching an Ordered Dataset
How do you find a name in a telephone book?
In Exercise 4, part 4.3 you (may have) coded the guessing game. What is the most effective way of guessing the unknown number?
14
Binary Search
Binary search pseudocode. Only works on ordered collections:
  a) Locate the midpoint of the dataset to search.
  b) Determine if target is in the lower half or the upper half of the dataset.
     If in the lower half, make this half the dataset to search.
     If in the upper half, make this half the dataset to search.
  c) Loop back to step a), unless the size of the dataset to search is one and this element does not match, in which case raise a ValueError exception.
15
Binary Search – Cont.

    def binarySearch(numsList, target):
        low = 0
        high = len(numsList) - 1
        while low <= high:
            mid = (low + high) // 2
            if target < numsList[mid]:
                high = mid - 1
            elif target > numsList[mid]:
                low = mid + 1
            else:
                return mid
        raise ValueError("Target not found.")
16
Binary Search Animation
See also:
17
Binary Search – Cont.
For the best case, the element matches right at the middle of the dataset, and the loop only executes once.
For the worst case, target will not be found and the maximum number of iterations will occur. Note that the loop will execute until there is only one element left that does not match.
Each time through the loop the number of elements left is halved, giving the progression below for the number of elements:
18
Binary Search – Cont.
Number of elements to be searched - progression:
  $n, \; n/2, \; n/2^2, \; n/2^3, \; \ldots, \; n/2^m$
The last comparison is for $n/2^m$, when the number of elements is one (worst case).
So, $n/2^m = 1$, or $n = 2^m$.
Or, $m = \log_2(n)$.
So, the algorithm loops about $\log_2(n)$ times for the worst case.
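The logarithmic estimate can be checked by counting loop passes directly. In this sketch, `countBinarySearchLoops` is a hypothetical variant of a binary search that returns the number of loop iterations instead of a location. With a `while low <= high` loop, the count for a missing target can come out one higher than log2(n), because the last single element still needs its own pass.

```python
import math


def countBinarySearchLoops(numsList, target):
    # Same loop structure as a standard binary search, but it returns
    # how many times the loop body ran rather than where the match is.
    low = 0
    high = len(numsList) - 1
    loops = 0
    while low <= high:
        loops = loops + 1
        mid = (low + high) // 2
        if target < numsList[mid]:
            high = mid - 1
        elif target > numsList[mid]:
            low = mid + 1
        else:
            return loops
    return loops


for n in [16, 1024, 100000]:
    data = list(range(n))
    # A target larger than every element is never found.
    loops = countBinarySearchLoops(data, n)
    print(n, "elements:", loops, "loops, log2(n) =", math.log2(n))
```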
19
Binary Search - Cont.
Binary search with log(n) iterations for the worst case is much better than n iterations for sequential search!
Major reason to sort datasets!
20
Binary Search – Timings
See TimingBothSearchesDemo.py
Does index() work any faster with a sorted() list? Can index() assume the list is sorted?
Compare comparison counts between sequential and binary.
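TimingBothSearchesDemo.py is not shown here, so the following is only a sketch of one way to probe the question about index() and sorted lists. The helper `timeMissingTarget` and the list size are assumptions. Searching for a value that is absent forces index() to scan the whole list in both cases, so one would expect broadly similar times. No specific timings are claimed.

```python
import random
import timeit

sortedData = list(range(100000))
shuffledData = sortedData[:]
random.shuffle(shuffledData)


def timeMissingTarget(numsList, repeats=20):
    # Hypothetical helper: -1 is not in either list, so every call
    # to index() has to examine all of the elements.
    def attempt():
        try:
            numsList.index(-1)
        except ValueError:
            pass
    return timeit.timeit(attempt, number=repeats)


sortedTime = timeMissingTarget(sortedData)
shuffledTime = timeMissingTarget(shuffledData)
print("sorted:", sortedTime, "shuffled:", shuffledTime)
```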
21
Summary
You sort datasets just so you can use binary search – it is so much faster!
The search functionality built into Python cannot assume that the dataset is sorted, so it will always take longer than our coded binary search.
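When you already know a list is sorted, Python's standard library does include a `bisect` module that performs a binary search for you. The wrapper below, `binarySearchWithBisect`, is a hypothetical name and only a sketch of how `bisect.bisect_left()` might be used to mimic a search that raises ValueError.

```python
import bisect


def binarySearchWithBisect(sortedList, target):
    # bisect_left returns the leftmost position where target could be
    # inserted while keeping the list sorted. If target is present,
    # that position holds its first occurrence.
    i = bisect.bisect_left(sortedList, target)
    if i < len(sortedList) and sortedList[i] == target:
        return i
    raise ValueError("Target not found.")


nums = [2, 3, 5, 7, 11, 13]
print(binarySearchWithBisect(nums, 7))  # 3
```

The caller is responsible for making sure the list really is sorted; `bisect` does not check.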
22
Sorting – Overview
We will look at three simple sorts:
  Insertion sort
  Selection sort
  Bubble sort
Insertion and Selection can be very useful in certain situations.
Bubble sort may be the *worst* sorting algorithm known! (See: …)
Any of these would be easy to code for smaller datasets.
23
Before We Start…
You need to learn these algorithms for the same reasons you needed to learn searching algorithms.
The sort() in Python is way faster. It uses Timsort, a hybrid of merge sort and insertion sort. Merge sort is usually written using recursion, and both topics are outside the scope of this course (see CISC121!).
Even if we coded a fast sort such as Quicksort ourselves, it would still be slower because of the interpreted vs. compiled issue.
24
Sorting – Overview – Cont.
The first step in sorting is to select the criteria used for the sort and the direction of the sort. It could be ascending numeric order, or alphabetic order by last name, etc.
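With Python's built-in `sorted()`, the criteria and the direction map onto the `key` and `reverse` arguments. The data and the `lastName` helper below are made up for illustration.

```python
# Hypothetical data for illustration only.
marks = [72, 95, 58, 81]
names = ["Ada Lovelace", "Alan Turing", "Grace Hopper"]

# Direction: ascending is the default, reverse=True gives descending.
ascending = sorted(marks)                  # [58, 72, 81, 95]
descending = sorted(marks, reverse=True)   # [95, 81, 72, 58]


def lastName(fullName):
    # Criterion: the final word of the name.
    return fullName.split()[-1]


byLastName = sorted(names, key=lastName)
print(byLastName)  # ['Grace Hopper', 'Ada Lovelace', 'Alan Turing']
```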
25
Sorting – Overview – Cont.
How to sort (which algorithm to use)?
  How large is the dataset?
  What will be critical: memory usage or execution time?
  Is it necessary that a new element be inserted in order, or can it be added at the end and the sort deferred?
  How often will the algorithm be asked to sort a dataset that is already in order, except for a few newly added elements? Or, will it always have to re-sort a completely disordered dataset?
26
Sorting – Overview – Cont.
Sorting algorithms can be compared on the basis of:
  The number of comparisons for a dataset of size n,
  The number of data movements (“swaps”) necessary, and
  How these measures change with n (Analysis of Complexity…).
We often need to consider these measures for the best case (data almost in order), average case (random order), and worst case (reverse order).
Some algorithms behave the same regardless of the state of the data, and others do better depending on how well the data is initially ordered.
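To make these measures concrete, here is a sketch of an insertion sort instrumented with counters. It is one common way to write insertion sort, not necessarily the version used in the course, and `insertionSortCounts` is a hypothetical name. It counts each shift of an element one place to the right as a data movement, which is an assumption about how to count.

```python
def insertionSortCounts(numsList):
    # Sorts a copy into ascending order and returns
    # (sortedCopy, comparisons, moves).
    nums = numsList[:]
    comparisons = 0
    moves = 0
    for i in range(1, len(nums)):
        temp = nums[i]
        j = i - 1
        while j >= 0:
            comparisons = comparisons + 1
            if nums[j] <= temp:
                break
            nums[j + 1] = nums[j]   # shift the larger element right
            moves = moves + 1
            j = j - 1
        nums[j + 1] = temp
    return nums, comparisons, moves


print(insertionSortCounts([1, 2, 3, 4, 5]))  # ([1, 2, 3, 4, 5], 4, 0)
print(insertionSortCounts([5, 4, 3, 2, 1]))  # ([1, 2, 3, 4, 5], 10, 10)
```

For this algorithm the already-sorted list needs only n - 1 comparisons and no moves, while the reversed list needs n(n - 1)/2 of each.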
27
Sorting – Overview – Cont.
If sorting simple values like integers or key values, then comparisons are easy to carry out and the comparison efficiency of the algorithm may not be as important as the number of data movements. However, if strings or objects are being compared then the number of comparisons would be best kept to a minimum.
Finally, the only real measure of what algorithm is the best is an actual measure of elapsed time. The initial choice can be based on theory alone, but the final choice for a time-critical application must be by actual experimental measurement.
28
Sorting – Overview – Cont.
I will be presenting code samples that sort lists of integers into ascending order because this is easiest to understand.
However, the logic of the algorithm can be applied directly to lists of strings or other objects, provided the sort criteria are specified.
Descending order usually only requires that you change the comparison from “>” to “<”.
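As a small illustration of both points, the sketch below uses a plain bubble sort (chosen only because it is short, and not necessarily coded the way the course does it) with a hypothetical `descending` flag that flips the comparison. The same function sorts integers and strings.

```python
def bubbleSort(items, descending=False):
    # Sorts the list in place by repeatedly swapping adjacent
    # elements that are out of order.
    n = len(items)
    for i in range(n - 1):
        for j in range(n - 1 - i):
            if descending:
                outOfOrder = items[j] < items[j + 1]
            else:
                outOfOrder = items[j] > items[j + 1]
            if outOfOrder:
                items[j], items[j + 1] = items[j + 1], items[j]


nums = [5, 2, 9, 1]
bubbleSort(nums)
print(nums)   # [1, 2, 5, 9]

words = ["pear", "apple", "fig"]
bubbleSort(words, descending=True)
print(words)  # ['pear', 'fig', 'apple']
```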