CMPT 120 Lecture 32 – Unit 5 – Internet and Big Data


1 CMPT 120 Lecture 32 – Unit 5 – Internet and Big Data
Algorithm – Searching and Complexity Analysis

2 Review - Selection Sort Algorithm
How it works: Repeatedly selects the next smallest (or largest) element from the unsorted section of the list and swaps it into the correct position in the already sorted section of the list:
    for index = 0 to len(data) - 2 do
        select: let x = location of element with smallest value in data from index to len(data) - 1
        if index != x
            swap: tempVar = data[index]
                  data[index] = data[x]
                  data[x] = tempVar

3 Review - Selection Sort Function
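The function shown on this slide did not survive in the transcript. As a stand-in, here is a minimal Python sketch of the selection sort pseudocode from the review slide. The function name `selection_sort` and the sample list are assumptions made for illustration, not the lecture's own code.

```python
def selection_sort(data):
    """Sort the list `data` in place in ascending order and return it."""
    for index in range(len(data) - 1):            # index = 0 .. len(data) - 2
        # select: find the location of the smallest value in data[index:]
        x = index
        for j in range(index + 1, len(data)):
            if data[j] < data[x]:
                x = j
        # swap it into position, only if it is not already there
        if index != x:
            data[index], data[x] = data[x], data[index]
    return data


print(selection_sort([9, 3, 21, 1, 7]))  # sample list is hypothetical
```

The tuple assignment replaces the three-line swap with `tempVar`; both do the same thing.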

4 Last Lecture – Little Activity
The story of Jean-Dominique Bauby: locked-in syndrome (due to a stroke) almost completely paralyzed him; he could only blink one eye. Yet Bauby, with the help of a speech therapist, “wrote” the book The Diving Bell and the Butterfly.

5 One possible solution
Bauby wants to “write” the word elephant.
Question 1: does the word start with a letter < M?
Answer 1: 1 blink -> Yes - So we can ignore ½ of the alphabet: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Question 2: does the word start with a letter < F?
Answer 2: 1 blink -> Yes - So we can ignore ½ of ½ of the alphabet: A B C D E F G H I J K L
Question 3: does the word start with a letter < C?
Answer 3: 2 blinks -> No - So we can ignore ½ of ½ of ½ of the alphabet: A B C D E
Question 4: does the word start with a letter < D?
Answer 4: 2 blinks -> No - So we can ignore ½ of ½ of ½ of ½ of the alphabet: C D E
Question 5: does the word start with the letter D?
Answer 5: 2 blinks -> No
Question 6: does the word start with the letter E?
Answer 6: 1 blink -> Yes - Bingo!
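The question-and-blink procedure above can be sketched in Python. This is only an illustration: the function name `count_questions` is made up, and the sketch always splits the remaining letters exactly in the middle, so its questions differ slightly from the ones on the slide (which starts with "< M").

```python
import string


def count_questions(letter):
    """Count the yes/no questions needed to pin down an uppercase `letter`
    by repeatedly halving the remaining candidates."""
    candidates = string.ascii_uppercase
    questions = 0
    while len(candidates) > 1:
        middle = len(candidates) // 2
        questions += 1                        # "Is the letter < candidates[middle]?"
        if letter < candidates[middle]:
            candidates = candidates[:middle]  # 1 blink (yes): keep the first part
        else:
            candidates = candidates[middle:]  # 2 blinks (no): keep the second part
    return questions


print(count_questions("E"))
```

With 26 letters, this halving never needs more than 5 questions, whereas asking "is it A? is it B? ..." could need up to 26.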

6 This solution has a name …
… Binary search algorithm

7 Another example
Suppose we have a sorted list: 1 3 4 7 9 11 12 14 21
Using the binary search algorithm, we can search for target = 7 without having to look at every element.

8 How binary search works – In a nutshell
Binary search uses the fact that the data is sorted!
Find the element in the middle -> 9
Since we are looking for 7, and 7 != 9 and 7 < 9, there is no need to search the second half of the list.
We can ignore half of the list right away!
Then we repeat the above steps on the remaining half.
Bottom line: using binary search, we do not need to look at every element to search a list.
So binary search is more time efficient than linear search.

9 Visualising Binary Search

10 Let’s look at some code!
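The code shown on this slide is not part of the transcript. As a stand-in, here is a minimal Python sketch of binary search on the sorted list from the earlier example. The function name `binary_search` and the convention of returning -1 when the target is absent are assumptions, not necessarily what was shown in lecture.

```python
def binary_search(data, target):
    """Return the index of `target` in the sorted list `data`, or -1 if absent."""
    low, high = 0, len(data) - 1
    while low <= high:
        middle = (low + high) // 2
        if data[middle] == target:
            return middle                 # found it
        elif target < data[middle]:
            high = middle - 1             # ignore the second half
        else:
            low = middle + 1              # ignore the first half
    return -1                             # nothing left to search


data = [1, 3, 4, 7, 9, 11, 12, 14, 21]
print(binary_search(data, 7))   # 7 is at index 3
```

The first middle element examined is 9 (index 4), matching the walkthrough: since 7 < 9, the second half is dropped immediately.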

11 Algorithm Complexity
The amount of resources (time and space) required to execute an algorithm

12 So, which is faster? Linear search or binary search?
In order to answer this question, we need to analyze the complexity of these two algorithms (or their code).
Once we have done so, we express it using something called the Big O notation.
In this course, we shall focus on the time required to run an algorithm.

13 How to do Complexity Analysis?
We analyze the time an algorithm requires to execute by counting the number of operations it performs in the worst-case scenario.
Actually, we are interested in one particular operation that is common to both algorithms. What on earth is that?

14 Worst-Case Scenario
Remember this slide from Lecture 29

15 How many comparison operations are performed in linear search?
Let’s give this a go!
Table columns: Test Case | # of Comparisons | n (size of data)
Test case: data = [1, 3, 4, 7, 9, 11, 12, 14, 21], target =
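One way to fill in such a table is to let the code count its own comparisons. The sketch below is an assumption-laden illustration: the function name `linear_search_count` is made up, and since the slide leaves the target blank, the targets 7 and 21 are chosen here only as examples.

```python
def linear_search_count(data, target):
    """Linear search that returns (index, comparisons); index is -1 if absent."""
    comparisons = 0
    for index, value in enumerate(data):
        comparisons += 1                  # one comparison of value with target
        if value == target:
            return index, comparisons
    return -1, comparisons


data = [1, 3, 4, 7, 9, 11, 12, 14, 21]
for target in (7, 21):                    # example targets, not from the slide
    print(target, linear_search_count(data, target))
```

Searching for the last element (or for a value that is not in the list) is the worst case: all n = 9 elements are compared.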

16 How many comparison operations are performed in binary search?
Let’s give this a go!
Table columns: Test Case | # of Comparisons | n (size of data)
Test case: data = [1, 3, 4, 7, 9, 11, 12, 14, 21], target =
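The same counting idea works for binary search. In this sketch (function name `binary_search_count` and the example targets are assumptions), one comparison is counted per iteration, that is, each time the middle element is compared with the target.

```python
def binary_search_count(data, target):
    """Binary search that returns (index, comparisons); index is -1 if absent.
    One comparison is counted each time a middle element is examined."""
    low, high = 0, len(data) - 1
    comparisons = 0
    while low <= high:
        middle = (low + high) // 2
        comparisons += 1                  # compare data[middle] with target
        if data[middle] == target:
            return middle, comparisons
        elif target < data[middle]:
            high = middle - 1
        else:
            low = middle + 1
    return -1, comparisons


data = [1, 3, 4, 7, 9, 11, 12, 14, 21]
for target in (9, 7, 21):                 # example targets, not from the slide
    print(target, binary_search_count(data, target))
```

For this list of n = 9 elements, no target needs more than 4 middle-element comparisons, compared with up to 9 for linear search.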

17 Log2 n Pattern
Binary Search Algorithm
At each iteration: we compare the middle element with target and, if target is not found, we partition the list in half, ignoring one half and searching the other.
Size of data in 1st iteration: n
Size of data in 2nd iteration: n/2
Size of data in 3rd iteration: n/4
Size of data in Tth iteration: 1
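The halving pattern above can be turned into a short derivation of the number of iterations T. This is a sketch under the simplifying assumption that n is a power of 2, so that every halving is exact; the iteration index k is introduced here for illustration.

```latex
% Size of the data in the k-th iteration (k = 1 gives n, k = 2 gives n/2, ...)
\[
  \text{size}_k = \frac{n}{2^{\,k-1}}
\]
% The search stops, at the latest, when only one element is left (iteration T):
\[
  \frac{n}{2^{\,T-1}} = 1
  \;\Longrightarrow\;
  2^{\,T-1} = n
  \;\Longrightarrow\;
  T = \log_2 n + 1
\]
```

For a general n, the same reasoning gives at most ⌊log2 n⌋ + 1 iterations; for example, n = 9 gives 3 + 1 = 4.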

18 Log2 n Pattern (cont’d)
Note, on this slide, N is used instead of n.

19 Big O Notation
Expresses the time complexity of algorithms (also called algorithm “efficiency”) as they execute worst-case scenarios

20 Big O Notation
O(n) - Linear search
O(log n) - Binary search
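To get a feel for the difference between O(n) and O(log n), the following sketch prints worst-case comparison counts for a few list sizes. The function name `worst_case_comparisons` and the chosen sizes are assumptions; the binary search count uses ⌊log2 n⌋ + 1, computed exactly with Python's `int.bit_length()`.

```python
def worst_case_comparisons(n):
    """Return (linear, binary) worst-case comparison counts for a list of size n >= 1."""
    linear = n                    # linear search may look at every element
    binary = n.bit_length()       # equals floor(log2(n)) + 1 for n >= 1
    return linear, binary


for n in (9, 1_000, 1_000_000):   # example sizes
    print(n, worst_case_comparisons(n))
```

Going from a thousand to a million elements multiplies the linear search work by a thousand, but adds only 10 comparisons to binary search.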

21 Let’s have some fun!
A smart music app
https://repl.it/repls/ShortEducatedSearchengine

22 End-of-Semester Party
The end-of-semester party is coming up! To prep this party, we need to build our playlist -> the most “danceable” songs.
Problem Statement: Let’s say we want to build an application that searches for the most danceable songs from our favourite musical artist.
FYI, you can create a YouTube playlist with this if you want.
Download the dataset: (sourced from

23 Spotify Music Data

