COMP 2402/2002 Abstract Data Types and Algorithms
Prof: Eduardo Mesa
Office: HP 5347
Office hours: Tuesday and Thursday 4:30pm – 5:30pm
Web Site: and also WebCT
Textbook: Open Data Structures (in Java). The pdf can be downloaded from the website.
TAs
- Andrew Trenholm
- Tawfic Abdul-Fatah
No office hours for TAs. Prompt answers to questions in:
- WebCT discussion
- Carleton Computer Science Society forum
Evaluation
- Assignments: 30% (3 assignments, 10% each)
- Midterms: 2 midterms, 15% each
- Final Exam: 40%
- Active Participation: 5% bonus
Assignments
1 Theory assignment
- Must be handed in first thing in class.
- All pages must be stapled.
2 Programming Assignments
- Must be uploaded on WebCT.
- Make a folder ( _ _ _ ).
- Put all the source files in the folder.
- Add a text file with your Id and full name.
- Zip the folder.
No late assignment will be accepted.
End of the Introduction. Beginning of the Lecture.
Data Type
A data storage format that can contain a specific type or range of values, characterized by a set of operations that satisfy a set of specific properties.
Example: Data Type Int
Range of values: $-2^{31}$ to $2^{31} - 1$
Operations: +, -, *, /
Properties:
- Commutativity (symmetry): a+b = b+a, a*b = b*a
- Associativity: a+(b+c) = (a+b)+c
- Definition: in a/b, b must not be 0
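The properties on this slide can be checked directly in Java. The sketch below is illustrative only: the class and method names are hypothetical, and it assumes the slide's Int corresponds to Java's 32-bit `int`.

```java
class IntDataTypeDemo {
    // True when addition and multiplication give the same result in either order.
    static boolean commutes(int a, int b) {
        return a + b == b + a && a * b == b * a;
    }

    // True when (a + b) + c equals a + (b + c). For Java ints this holds even on
    // overflow, because int arithmetic wraps around modulo 2^32.
    static boolean addsAssociatively(int a, int b, int c) {
        return (a + b) + c == a + (b + c);
    }

    // True when a / b is undefined, which Java signals with an ArithmeticException.
    static boolean divisionThrows(int a, int b) {
        try {
            int unused = a / b;
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("smallest int: " + Integer.MIN_VALUE); // -(2^31)
        System.out.println("largest int:  " + Integer.MAX_VALUE); // 2^31 - 1
        System.out.println("commutes: " + commutes(7, -3));
        System.out.println("associative: " + addsAssociatively(1, 2, 3));
        System.out.println("1/0 throws: " + divisionThrows(1, 0));
    }
}
```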
Data Structures
How to organize data so that operations on it can be performed efficiently.
A variable is the simplest data structure.
Example: Electronic Phone Book
Contains different DATA:
- names
- phone numbers
- addresses
Need to perform certain OPERATIONS:
- add
- delete
- look for a phone number
- look for an address
Example: Electronic Phone Book
How to organize the data so as to optimize the efficiency of the operations?
- List
- Binary Search Tree
- Dictionary
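As a rough illustration of how the choice of structure changes the cost of an operation, the sketch below looks up a phone number in a plain list and in a search tree. All names and sample entries are hypothetical, and `java.util.TreeMap` (a balanced search tree) stands in for the slide's Binary Search Tree.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PhoneBookDemo {
    // List organization: each entry is a {name, number} pair.
    // Lookup scans the entries one by one, so it takes O(n) comparisons.
    static String lookupInList(List<String[]> entries, String name) {
        for (String[] entry : entries) {
            if (entry[0].equals(name)) {
                return entry[1];
            }
        }
        return null; // name not in the book
    }

    // Search-tree organization: TreeMap keeps the names sorted,
    // so lookup takes O(log n) comparisons.
    static String lookupInTree(Map<String, String> book, String name) {
        return book.get(name);
    }

    public static void main(String[] args) {
        List<String[]> list = new ArrayList<>();
        Map<String, String> tree = new TreeMap<>();
        String[][] data = {{"Alice", "555-0101"}, {"Bob", "555-0102"}};
        for (String[] d : data) {
            list.add(d);
            tree.put(d[0], d[1]);
        }
        System.out.println(lookupInList(list, "Bob"));
        System.out.println(lookupInTree(tree, "Bob"));
    }
}
```

Both methods return the same answer; they differ only in how much work the lookup needs as the book grows.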
Example: Finding the best route for a message in a network
Contains DATA:
- Network + Traffic
Need to perform certain OPERATIONS:
- Find the best route
Example: Finding the best route for a message in a network
How to represent the data?
- Adjacency Matrix
- Adjacency List
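The two representations named on this slide can be sketched as follows. This is a minimal illustration assuming an undirected network whose nodes are numbered 0 to n-1; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class GraphRepresentations {
    // Adjacency matrix: matrix[u][v] is true when nodes u and v are linked.
    // Uses n*n cells, and checking a single link takes constant time.
    static boolean[][] toMatrix(int n, int[][] edges) {
        boolean[][] matrix = new boolean[n][n];
        for (int[] e : edges) {
            matrix[e[0]][e[1]] = true;
            matrix[e[1]][e[0]] = true; // undirected link
        }
        return matrix;
    }

    // Adjacency list: lists.get(u) holds the neighbours of u.
    // Uses space proportional to the number of nodes plus links.
    static List<List<Integer>> toLists(int n, int[][] edges) {
        List<List<Integer>> lists = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            lists.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            lists.get(e[0]).add(e[1]);
            lists.get(e[1]).add(e[0]);
        }
        return lists;
    }
}
```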
Abstract Data Type (ADT)
- ADTs (interfaces): define what operations can be done.
- Implementations (algorithms): define how each operation is performed.
The same ADT can have several different implementations.
Example: ADT (Shelf) with the operation GetBook(Book b)
- Lucy implements it with Binary Search
- Peter implements it with Random Search
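In Java, the split between an ADT and its implementations maps naturally onto an interface and the classes that implement it. The sketch below is one possible reading of the Shelf example: the interface, the two class names and the use of titles as strings are assumptions, not part of the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// The ADT: states WHAT can be done, not how.
interface Shelf {
    // Returns the position of the title on the shelf, or -1 if it is absent.
    int getBook(String title);
}

// Lucy's implementation: binary search over titles kept in sorted order.
class SortedShelf implements Shelf {
    private final String[] titles;

    SortedShelf(String[] titles) {
        this.titles = titles.clone();
        Arrays.sort(this.titles);
    }

    public int getBook(String title) {
        int pos = Arrays.binarySearch(titles, title);
        return pos >= 0 ? pos : -1;
    }
}

// Peter's implementation: probes the shelf positions in a random order.
class RandomProbeShelf implements Shelf {
    private final String[] titles;

    RandomProbeShelf(String[] titles) {
        this.titles = titles.clone();
    }

    public int getBook(String title) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < titles.length; i++) {
            order.add(i);
        }
        Collections.shuffle(order); // each position is probed exactly once
        for (int i : order) {
            if (titles[i].equals(title)) {
                return i;
            }
        }
        return -1;
    }
}
```

Code written against `Shelf` works with either class; only the running time differs.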
Data Structures
To perform the operations efficiently we need to:
- Identify your data
- Identify the operations you need to perform (and how often each operation is performed)
- Define efficiency
- Choose the best structure for your data
Analysis of Algorithms
[Figure: Input → Algorithm → Output]
An algorithm is a step-by-step procedure for solving a problem in a finite amount of time.
Analyze an algorithm = determine its efficiency
Efficiency?
- Running time
- Memory
- Quality of the result
- Simplicity
Generally, improving efficiency in one of these aspects diminishes it in the others.
Running time
- The running time depends on the input size.
- It also depends on the input data: different inputs can have different running times.
[Figure: running time against input size, with best case, average case and worst case curves]
Running Time of an algorithm
Average case time is often difficult to determine.
We focus on the worst case running time.
- Easier to analyze
- Crucial to applications such as games, finance and robotics
Example:
  If x is odd
    return x
  If x is even
    compute the sum S of the first x integers
    return S
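A direct Java rendering of this example makes the gap between best and worst case visible. It is a sketch that assumes "the first x integers" means 1, 2, ..., x; the class and method names are hypothetical.

```java
class OddEvenExample {
    // Odd x: the answer is returned immediately (best case, constant work).
    // Even x: the loop adds 1 + 2 + ... + x (worst case, about x additions).
    static long run(int x) {
        if (x % 2 != 0) {
            return x;
        }
        long s = 0;
        for (int i = 1; i <= x; i++) {
            s += i;
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(run(7)); // odd input, no loop
        System.out.println(run(4)); // even input, loop runs 4 times
    }
}
```

Two inputs of nearly the same size, such as 999999 and 1000000, therefore need very different amounts of work.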
Measuring the Running Time
How should we measure the running time of an algorithm?
Approach 1: Experimental Study
Beyond Experimental Studies
Experimental studies have several limitations:
- the algorithm must be implemented
- only a limited set of inputs can be tested
- results depend on the hardware and software environments
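An experimental study typically times the implementation on inputs of growing size. The sketch below shows one simple way to do this with `System.nanoTime()`; the workload `sumUpTo` and all names are hypothetical, and the measured times will vary between machines and runs, which is exactly the limitation listed above.

```java
import java.util.function.IntConsumer;

class TimingExperiment {
    // Result of the last workload run, kept so the work is not optimized away.
    static long lastSum;

    // Hypothetical workload: adds the integers 0, 1, ..., n-1.
    static void sumUpTo(int n) {
        long s = 0;
        for (int i = 0; i < n; i++) {
            s += i;
        }
        lastSum = s;
    }

    // Runs the task on input size n and returns the elapsed time in nanoseconds.
    static long timeNanos(IntConsumer task, int n) {
        long start = System.nanoTime();
        task.accept(n);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        for (int n = 1_000; n <= 1_000_000; n *= 10) {
            long elapsed = timeNanos(TimingExperiment::sumUpTo, n);
            System.out.println("n = " + n + ": " + elapsed + " ns");
        }
    }
}
```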
Theoretical Analysis
We need a general methodology that:
- Uses a high-level description of the algorithm (independent of implementation).
- Characterizes running time as a function of the input size.
- Takes into account all possible inputs.
- Is independent of the hardware and software environment.
Analysis of Algorithms
Primitive operations: low-level computations, independent of the programming language, that can be identified in pseudocode.
Examples:
- calling a method and returning from a method
- arithmetic operations (e.g. addition)
- comparing two numbers, etc.
By inspecting the pseudocode, we can count the number of primitive operations executed by an algorithm.
Example:
Algorithm arrayMax(A, n):
  Input: An array A storing n integers.
  Output: The maximum element in A.
  currentMax ← A[0]
  for i ← 1 to n-1 do
    if currentMax < A[i] then
      currentMax ← A[i]
  return currentMax
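The arrayMax pseudocode translates almost line for line into Java. This is a sketch that keeps the slide's names and, like the pseudocode, assumes n is at least 1.

```java
class ArrayMax {
    // Returns the maximum of the first n elements of A (requires n >= 1).
    static int arrayMax(int[] A, int n) {
        int currentMax = A[0];
        for (int i = 1; i <= n - 1; i++) {
            if (currentMax < A[i]) {
                currentMax = A[i];
            }
        }
        return currentMax;
    }

    public static void main(String[] args) {
        int[] A = {5, 13, 4, 7};
        System.out.println(arrayMax(A, A.length));
    }
}
```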
What are the primitive operations to count in arrayMax?
- Comparisons
- Assignments to currentMax
In the worst case:
  currentMax ← A[0]              1 assignment
  for i ← 1 to n-1 do
    if currentMax < A[i] then    n-1 comparisons
      currentMax ← A[i]          n-1 assignments (worst case)
  return currentMax
In the best case:
  currentMax ← A[0]              1 assignment
  for i ← 1 to n-1 do
    if currentMax < A[i] then    n-1 comparisons
      currentMax ← A[i]          0 assignments
  return currentMax
Summarizing:
- Worst case: n-1 comparisons, n assignments
- Best case: n-1 comparisons, 1 assignment
Computing the exact number of primitive operations can be difficult.
Instead, we compare the asymptotic behaviour of the running time as the size of the input grows.
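These counts can be confirmed by instrumenting the algorithm. The sketch below adds two counters to arrayMax; the class name and the counters are additions for illustration. An increasing array triggers the worst case and a decreasing array the best case.

```java
class ArrayMaxCounter {
    // Returns {comparisons, assignments to currentMax} performed by arrayMax on A.
    static int[] count(int[] A) {
        int comparisons = 0;
        int assignments = 0;
        int currentMax = A[0];
        assignments++; // the initial assignment
        for (int i = 1; i < A.length; i++) {
            comparisons++; // one comparison per loop iteration
            if (currentMax < A[i]) {
                currentMax = A[i];
                assignments++;
            }
        }
        return new int[] {comparisons, assignments};
    }

    public static void main(String[] args) {
        int[] worst = count(new int[] {1, 2, 3, 4, 5}); // n = 5, increasing
        int[] best = count(new int[] {5, 4, 3, 2, 1});  // n = 5, decreasing
        System.out.println("worst: " + worst[0] + " comparisons, " + worst[1] + " assignments");
        System.out.println("best:  " + best[0] + " comparisons, " + best[1] + " assignments");
    }
}
```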
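Big-Oh (upper bound)
Given two functions $f(n)$ and $g(n)$, we say that $f(n)$ is $O(g(n))$ if and only if there are positive constants $c$ and $n_0$ such that $f(n) \le c\,g(n)$ for $n \ge n_0$.
[Figure: plot of $f(n)$ and $c\,g(n)$ against $n$; $f(n)$ stays below $c\,g(n)$ to the right of $n_0$]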
What does $c\,g(n)$ mean?
Example: for $g(n) = n$, we have $2g(n) = 2n$ and $3g(n) = 3n$.
[Figure: plot of $g(n) = n$, $2g(n) = 2n$ and $3g(n) = 3n$ against $n$]
Graphical example: $f(n) = 2n+1$, $g(n) = n^2$
$f(n)$ is $O(n^2)$: $f(n) \le c\,g(n)$ for $n \ge n_0$, with $c = 1$ and $n_0 \approx 2.5$.
But also, with $g(n) = n$: $f(n) = 2n+1$ is not bounded by $g(n) = n$ or by $2g(n) = 2n$, but $f(n) \le 3g(n) = 3n$ for $n \ge 1$.
So $f(n) \le c\,g(n)$ for $n \ge n_0$ with $c = 3$ and $n_0 = 1$, and $f(n)$ is $O(n)$.
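A candidate pair $(c, n_0)$ can be sanity-checked numerically before attempting a proof. The sketch below tests the inequality on a finite range only, so a passing check is evidence rather than a proof; the class and method names are hypothetical.

```java
import java.util.function.LongUnaryOperator;

class BigOhWitness {
    // Checks f(n) <= c * g(n) for every integer n with n0 <= n <= limit.
    // This covers a finite range only; it does not prove the bound for all n.
    static boolean holds(LongUnaryOperator f, LongUnaryOperator g,
                         long c, long n0, long limit) {
        for (long n = n0; n <= limit; n++) {
            if (f.applyAsLong(n) > c * g.applyAsLong(n)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        LongUnaryOperator f = n -> 2 * n + 1;
        LongUnaryOperator linear = n -> n;
        System.out.println(holds(f, linear, 3, 1, 10_000)); // c = 3, n0 = 1
        System.out.println(holds(f, linear, 2, 1, 10_000)); // c = 2 is too small
    }
}
```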
On the other hand…
$n^2$ is not $O(n)$ because there is no $c$ and $n_0$ such that $n^2 \le cn$ for $n \ge n_0$ (no matter how large a $c$ is chosen, there is an $n$ big enough ($n > c$) that $n^2 > cn$).
[Figure: plot of $n^2$ against $n$, $2n$, $3n$ and $4n$]
Formal definition of big-Oh:
$O(g(n)) = \{ f(n) : \text{there exist positive constants } c \text{ and } n_0 \text{ such that } f(n) \le c\,g(n) \text{ for all } n \ge n_0 \}$
Notice: $O(g(n))$ is a set of functions.
When we say $f(n) = O(g(n))$ we really mean $f(n) \in O(g(n))$.
Example:
Prove that $f(n) = 60n^2 + 5n + 1$ is $O(n^2)$.
We must find a constant $c$ and a constant $n_0$ such that $60n^2 + 5n + 1 \le c\,n^2$ for all $n \ge n_0$.
$5n \le 5n^2$ for all $n \ge 1$
$1 \le n^2$ for all $n \ge 1$
$f(n) \le 60n^2 + 5n^2 + n^2$ for all $n \ge 1$
$f(n) \le 66n^2$ for all $n \ge 1$
With $c = 66$ and $n_0 = 1$, $f(n) = O(n^2)$.
Example:
Prove $f(n) = 5n\log_2 n + 8n = O(n\log_2 n)$.
$5n\log_2 n + 8n \le 5n\log_2 n + 8n\log_2 n$ for $n \ge 2$ (since $\log_2 n \ge 1$)
$5n\log_2 n + 8n\log_2 n = 13n\log_2 n$
$f(n) \le 13n\log_2 n$ for all $n \ge 2$
$f(n) \in O(n\log_2 n)$ [$c = 13$, $n_0 = 2$]
Some common relations
$O(n^{c_1}) \subset O(n^{c_2})$ for any $c_1 < c_2$
For any constants $a, b > 0$ and $c > 1$: $O(a) \subset O(\log n) \subset O(n^b) \subset O(c^n)$
We can multiply these to learn about other functions. Multiplying by $n$ gives: $O(an) = O(n) \subset O(n\log n) \subset O(n^{1+b}) \subset O(n c^n)$
These relations make analysis faster. Examples:
- $2\log_2 n + 2 = O(\log n)$
- $n + 2 = O(n)$
- $2n + 15n^{1/2} = O(n)$
- $O(n^{1/5}) \subset O(n^{1/5}\log n)$
Theorem: If $g(n)$ is $O(f(n))$, then for any constant $c > 0$, $g(n)$ is also $O(c\,f(n))$.
Theorem: $O(f(n) + g(n)) = O(\max(f(n), g(n)))$
Ex 1: $2n^3 + 3n^2 = O(\max(2n^3, 3n^2)) = O(2n^3) = O(n^3)$
Ex 2: $n^2 + 3\log n - 7 = O(\max(n^2, 3\log n - 7)) = O(n^2)$
Simple Big-Oh Rule:
Drop lower order terms and constant factors.
- $7n - 3$ is $O(n)$
- $8n^2\log n + 5n^2 + n$ is $O(n^2\log n)$
- $12n\,n^2 + 2n^4$ is $O(n^4)$
Other Big-Oh Rules:
- Use the smallest possible class of functions: say "$2n$ is $O(n)$" instead of "$2n$ is $O(n^2)$".
- Use the simplest expression of the class: say "$3n + 5$ is $O(n)$" instead of "$3n + 5$ is $O(3n)$".
Asymptotic Notation (terminology)
Special classes of algorithms:
- constant: $O(1)$
- logarithmic: $O(\log n)$
- linear: $O(n)$
- quadratic: $O(n^2)$
- cubic: $O(n^3)$
- polynomial: $O(n^k)$, $k > 0$
- exponential: $O(a^n)$, $a > 1$
Example of Asymptotic Analysis
An algorithm for computing prefix averages
The $i$-th prefix average of an array $X$ is the average of the first $(i + 1)$ elements of $X$:
$A[i] = \dfrac{X[0] + X[1] + \dots + X[i]}{i + 1}$
Example of Asymptotic Analysis
Algorithm prefixAverages1(X, n)
  Input: array X of n integers
  Output: array A of prefix averages of X
  A ← new array of n integers
  for i ← 0 to n - 1 do
    s ← X[0]
    for j ← 1 to i do
      s ← s + X[j]
    A[i] ← s / (i + 1)
  return A
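A Java version of prefixAverages1 might look like the sketch below. It follows the pseudocode step by step, with one deliberate change flagged as an assumption: the output array holds `double` values rather than integers, because averages are generally not whole numbers.

```java
class PrefixAverages1 {
    // Quadratic-time prefix averages: every prefix sum is recomputed from scratch.
    static double[] prefixAverages1(int[] X, int n) {
        double[] A = new double[n];
        for (int i = 0; i <= n - 1; i++) {
            double s = X[0];
            for (int j = 1; j <= i; j++) {
                s = s + X[j];
            }
            A[i] = s / (i + 1);
        }
        return A;
    }

    public static void main(String[] args) {
        int[] X = {5, 9, 4};
        for (double a : prefixAverages1(X, X.length)) {
            System.out.println(a);
        }
    }
}
```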
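[Figure: step-by-step trace of prefixAverages1; for each i, the inner sum covers X[j] for j = 0, then j = 0,1, then j = 0,1,2, and so on up to j = 0,1,…,i]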
Example of Asymptotic Analysis
Algorithm prefixAverages1(X, n)
  Input: array X of n integers
  Output: array A of prefix averages of X
                                      #operations
  A ← new array of n integers         n
  for i ← 0 to n - 1 do               n
    s ← X[0]                          n
    for j ← 1 to i do                 1 + 2 + … + (n - 1)
      s ← s + X[j]                    1 + 2 + … + (n - 1)
    A[i] ← s / (i + 1)                n
  return A                            1
The running time of prefixAverages1 is $O(1 + 2 + \dots + n)$.
The sum of the first $n$ integers is $n(n + 1)/2$.
- There is a simple visual proof of this fact.
Thus, algorithm prefixAverages1 runs in time $O(n(n + 1)/2)$, which is $O(n^2)$.
TO REMEMBER: $1 + 2 + \dots + n = \dfrac{n(n+1)}{2}$
Another Example: A better algorithm for computing prefix averages
Algorithm prefixAverages2(X):
  Input: An n-element array X of numbers.
  Output: An n-element array A of numbers such that A[i] is the average of elements X[0], ..., X[i].
  Let A be an array of n numbers.
  s ← 0
  for i ← 0 to n-1 do
    s ← s + X[i]
    A[i] ← s/(i+1)
  return array A
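The improved algorithm can be sketched in Java as follows. As an assumption for illustration, the input is an `int` array and the output is a `double` array; the slide only says "numbers".

```java
class PrefixAverages2 {
    // Linear-time prefix averages: a running sum s is reused across iterations.
    static double[] prefixAverages2(int[] X) {
        int n = X.length;
        double[] A = new double[n];
        double s = 0;
        for (int i = 0; i <= n - 1; i++) {
            s = s + X[i];
            A[i] = s / (i + 1);
        }
        return A;
    }

    public static void main(String[] args) {
        for (double a : prefixAverages2(new int[] {5, 9, 4})) {
            System.out.println(a);
        }
    }
}
```

The single loop does a constant amount of work per element, which is where the linear running time comes from.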
[Figure: step-by-step trace of prefixAverages2; the running sum s starts at 0 and grows by X[i] at each step]
Another Example: A better algorithm for computing prefix averages
                                      # operations
  Let A be an array of n numbers.
  s ← 0                               1
  for i ← 0 to n-1 do                 n
    s ← s + X[i]                      n
    A[i] ← s/(i+1)                    n
  return array A                      1
O(n) time
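big-Omega (lower bound)
$f(n)$ is $\Omega(g(n))$ if there exist $c > 0$ and $n_0 > 0$ such that $f(n) \ge c\,g(n)$ for all $n \ge n_0$
(thus, $f(n)$ is $\Omega(g(n))$ iff $g(n)$ is $O(f(n))$)
[Figure: plot of $f(n)$ and $c\,g(n)$ against $n$; $f(n)$ stays above $c\,g(n)$ to the right of $n_0$]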
big-Theta
$g(n)$ is $\Theta(f(n))$ if $g(n) \in O(f(n))$ AND $f(n) \in O(g(n))$
Example:
We have seen that $f(n) = 60n^2 + 5n + 1$ is $O(n^2)$,
but $60n^2 + 5n + 1 \ge 60n^2$ for $n \ge 1$.
So, with $c = 60$ and $n_0 = 1$: $f(n) \ge c\,n^2$ for all $n \ge 1$, and $f(n)$ is $\Omega(n^2)$.
$f(n)$ is $O(n^2)$ AND $f(n)$ is $\Omega(n^2)$, therefore $f(n)$ is $\Theta(n^2)$.
Intuition for Asymptotic Notation
- Big-Oh: $f(n)$ is $O(g(n))$ if $f(n)$ is asymptotically less than or equal to $g(n)$
- big-Omega: $f(n)$ is $\Omega(g(n))$ if $f(n)$ is asymptotically greater than or equal to $g(n)$
- big-Theta: $f(n)$ is $\Theta(g(n))$ if $f(n)$ is asymptotically equal to $g(n)$
Math You Need to Review: Logarithms and Exponents
Properties of logarithms:
- $\log_b(xy) = \log_b x + \log_b y$
- $\log_b(x/y) = \log_b x - \log_b y$
- $\log_b x^a = a\log_b x$
- $\log_{b^a} x = (1/a)\log_b x$
- $\log_b a = \log_x a / \log_x b$
Natural logarithm: $\ln k = \int_1^k (1/x)\,dx$
$e = \lim_{n \to \infty} (1 + 1/n)^n \approx 2.71828$
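The change-of-base rule is the one used most often in code, since Java's `Math.log` only computes the natural logarithm. The sketch below uses it to define a logarithm in an arbitrary base; the class and method names are hypothetical, and floating-point results are approximate.

```java
class Logs {
    // log base b of x, via the change-of-base rule: log_b x = ln x / ln b.
    static double log(double b, double x) {
        return Math.log(x) / Math.log(b);
    }

    public static void main(String[] args) {
        System.out.println(log(2, 1024));            // close to 10
        System.out.println(log(2, 8) + log(2, 4));   // product rule: close to log(2, 32)
        System.out.println(3 * log(2, 5));           // power rule: close to log(2, 125)
    }
}
```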
Math You Need to Review: Logarithms and Exponents
Properties of exponentials:
- $a^{(b+c)} = a^b a^c$
- $a^{bc} = (a^b)^c$
- $a^b / a^c = a^{(b-c)}$
- $b = a^{\log_a b}$
- $b^c = a^{c \log_a b}$
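More Math to Review
Floor: $\lfloor x \rfloor$ = the largest integer $\le x$
Ceiling: $\lceil x \rceil$ = the smallest integer $\ge x$
Summations:
- Arithmetic progression: $\sum_{i=0}^{n} d\,i = \frac{d}{2}\,n(n+1)$
- Geometric progression: $\sum_{i=0}^{n} r^i = \frac{r^{n+1} - 1}{r - 1}$ (for $r \ne 1$)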
More Math to Review: Arithmetic Progression
$S = \sum_{i=0}^{n} d\,i = 0 + d + 2d + \dots + nd$
$S = nd + (n-1)d + (n-2)d + \dots + 0$ (the same sum written in reverse)
$2S = nd + nd + nd + \dots + nd = (n+1)\,nd$
$S = \frac{d}{2}\,n(n+1)$
For $d = 1$: $S = \frac{1}{2}\,n(n+1)$
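The closed form can be compared against a term-by-term sum. This is a small sketch with hypothetical names; it assumes the values are small enough not to overflow a `long`.

```java
class ArithmeticProgression {
    // Adds 0 + d + 2d + ... + nd term by term.
    static long sumByLoop(long d, long n) {
        long s = 0;
        for (long i = 0; i <= n; i++) {
            s += d * i;
        }
        return s;
    }

    // Closed form S = (d / 2) * n * (n + 1). The product n * (n + 1) is always
    // even, so the integer division by 2 is exact.
    static long sumByFormula(long d, long n) {
        return d * (n * (n + 1) / 2);
    }

    public static void main(String[] args) {
        System.out.println(sumByLoop(1, 100) + " " + sumByFormula(1, 100));
        System.out.println(sumByLoop(3, 10) + " " + sumByFormula(3, 10));
    }
}
```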
More Math to Review: Geometric Progression
$S = \sum_{i=0}^{n} r^i = 1 + r + r^2 + \dots + r^n$
$rS = r + r^2 + \dots + r^n + r^{n+1}$
$rS - S = (r + r^2 + \dots + r^n + r^{n+1}) - (1 + r + r^2 + \dots + r^n)$
$rS - S = (r - 1)S = r^{n+1} - 1$
$S = (r^{n+1} - 1)/(r - 1)$ (for $r \ne 1$)
If $r = 2$, $S = 2^{n+1} - 1$
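As with the arithmetic progression, the geometric closed form is easy to check numerically. The sketch below uses hypothetical names, requires an integer ratio r other than 1, and assumes the values fit in a `long`.

```java
class GeometricProgression {
    // Adds 1 + r + r^2 + ... + r^n term by term.
    static long sumByLoop(long r, int n) {
        long s = 0;
        long term = 1; // r^0
        for (int i = 0; i <= n; i++) {
            s += term;
            term *= r;
        }
        return s;
    }

    // Closed form S = (r^(n+1) - 1) / (r - 1), valid only for r != 1.
    static long sumByFormula(long r, int n) {
        long power = 1;
        for (int i = 0; i <= n; i++) {
            power *= r; // after the loop, power = r^(n+1)
        }
        return (power - 1) / (r - 1);
    }

    public static void main(String[] args) {
        System.out.println(sumByLoop(2, 10) + " " + sumByFormula(2, 10));
        System.out.println(sumByLoop(3, 4) + " " + sumByFormula(3, 4));
    }
}
```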
Math You Need to Review: Randomization and Probability
Expected value: $E[X] = \sum_{x \in U} x \cdot \Pr\{X = x\}$
Properties:
- $E[X + Y] = E[X] + E[Y]$
- $E\left[\sum_{i=1}^{k} X_i\right] = \sum_{i=1}^{k} E[X_i]$
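A fair six-sided die gives a concrete instance of both the definition and the first property. The example is an illustration chosen here, not taken from the slides, and all names are hypothetical.

```java
class ExpectedValue {
    // E[X] = sum over all outcomes x of x * Pr{X = x}.
    static double expectation(double[] values, double[] probabilities) {
        double e = 0.0;
        for (int i = 0; i < values.length; i++) {
            e += values[i] * probabilities[i];
        }
        return e;
    }

    // E[X + Y] for two fair dice, computed directly over all 36 equally likely outcomes.
    static double twoDiceSumDirect() {
        double e = 0.0;
        for (int x = 1; x <= 6; x++) {
            for (int y = 1; y <= 6; y++) {
                e += (x + y) * (1.0 / 36);
            }
        }
        return e;
    }

    public static void main(String[] args) {
        double[] faces = {1, 2, 3, 4, 5, 6};
        double[] fair = {1.0 / 6, 1.0 / 6, 1.0 / 6, 1.0 / 6, 1.0 / 6, 1.0 / 6};
        double oneDie = expectation(faces, fair);
        System.out.println("E[X] = " + oneDie);
        System.out.println("E[X] + E[Y] = " + (oneDie + oneDie));
        System.out.println("E[X + Y] = " + twoDiceSumDirect());
    }
}
```

Up to floating-point rounding, the last two printed values agree, as linearity of expectation predicts.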