CSCI 3333 Data Structures Algorithm Analysis

CSCI 3333 Data Structures Algorithm Analysis by Dr. Bun Yue Professor of Computer Science yue@uhcl.edu http://sce.uhcl.edu/yue/ 2013

Acknowledgement Mr. Charles Moen Dr. Wei Ding Ms. Krishani Abeysekera Dr. Michael Goodrich

Algorithm “A finite set of instructions that specify a sequence of operations to be carried out in order to solve a specific problem or class of problems.”

Example Algorithm What do you think about this algorithm? The input, output, and side effects could be specified more clearly. There is no discussion of how it would actually be implemented in a given programming language.

Some Properties of Algorithms A deterministic sequential algorithm should have:
Finiteness: the algorithm must complete after a finite number of instructions.
Absence of ambiguity: each step must be clearly defined, having only one interpretation.
Definition of sequence: each step must have a uniquely defined preceding and succeeding step; the first step (start step) and last step (halt step) must be clearly noted.
Input/output: there must be a specified number of input values and result values.
Feasibility: it must be possible to perform each instruction.

Specifying Algorithms There is no universal formalism. Algorithms may be specified:
In English.
In pseudocode.
In a programming language (not usually desirable).

Specifying Algorithms Must include:
Input: type and meaning.
Output: type and meaning.
The instruction sequence.
May include:
Assumptions.
Side effects.
Pre-conditions and post-conditions.

Specifying Algorithms Like sub-programs, algorithms can be broken down into sub-algorithms to reduce complexity. High-level algorithms may rely more on English; low-level algorithms may rely more on pseudocode.

Pseudocode A high-level description of an algorithm:
More structured than English prose.
Less detailed than a program.
The preferred notation for describing algorithms.
Hides program design issues.

Example: ArrayMax

Algorithm arrayMax(A, n)
  Input: array A of n integers
  Output: maximum element value of A
  currentMax ← A[0]
  for i ← 1 to n - 1 do
    if A[i] > currentMax then
      currentMax ← A[i]
  return currentMax

What do you think? What happens when n is 0?
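
The arrayMax pseudocode above translates directly to Python; this sketch adds an explicit guard answering the "what happens when n is 0?" question.

```python
def array_max(a):
    """Return the maximum element of list a, mirroring arrayMax."""
    if not a:
        # The pseudocode reads A[0] unconditionally, so an empty
        # array (n = 0) is undefined; fail loudly instead.
        raise ValueError("arrayMax is undefined for an empty array")
    current_max = a[0]
    for i in range(1, len(a)):
        if a[i] > current_max:
            current_max = a[i]
    return current_max
```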

Example Algorithm: BogoSort

Algorithm BogoSort(A);
  do while (A is not sorted)
    randomize(A);
  end do;

What do you think?

Better Specification

Algorithm BogoSort(A)
  Input: A: an array of comparable elements.
  Output: none.
  Side effect: A is sorted.
  while not sorted(A)
    randomize(A);
  end while;

Sorted(A)

Algorithm Sorted(A)
  Input: A: an array of comparable elements.
  Output: a boolean value indicating whether A is sorted (in non-descending order).
  for (i = 0; i < A.size - 1; i++)
    if A[i] > A[i+1]
      return false;
  end for;
  return true;

Note the loop bound: i must reach A.size - 2 so that the last adjacent pair is also compared.
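
The BogoSort and Sorted specifications above can be sketched in Python as follows (randomize is realized with random.shuffle).

```python
import random

def is_sorted(a):
    """True if a is in non-descending order."""
    # Compare every adjacent pair; the loop must reach the last
    # pair (indices len(a) - 2 and len(a) - 1).
    for i in range(len(a) - 1):
        if a[i] > a[i + 1]:
            return False
    return True

def bogo_sort(a):
    """Side effect: a is sorted in place. Expected work grows like
    n * n!, so only ever try tiny inputs."""
    while not is_sorted(a):
        random.shuffle(a)
```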

BogoSort Is correct! Terrible performance!

Criteria for Judging Algorithms
Correctness.
Accuracy (especially for numerical methods and approximation problems).
Performance.
Simplicity.
Ease of implementation.

How Does BogoSort Fare?
Correctness: yes.
Accuracy: yes (not really applicable).
Performance: terrible, even for small arrays.
Simplicity: yes.
Ease of implementation: medium.

Summation Algorithm How does the obvious loop-based summation of 1 to N fare?
Correctness: OK.
Performance: bad, proportional to N.
Simplicity: not too bad.
Ease of implementation: not too bad.

Summation Algorithm A better summation algorithm:

Algorithm Summation(N)
  Input: N: a natural number.
  Output: the summation of 1 to N.
  return N * (N + 1) / 2;

Performance: constant time, independent of N.
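
For comparison, a Python sketch of both the loop-based summation and the closed-form version above:

```python
def summation_loop(n):
    # O(N): add 1 through N one term at a time.
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def summation_formula(n):
    # O(1): Gauss's closed form N(N + 1)/2.
    return n * (n + 1) // 2
```

Both return the same value; only the amount of work as N grows differs.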

Performance of Algorithms Performance of algorithms can be measured by runtime experiments, called benchmark analysis.

Experimental Studies
Write a program implementing the algorithm.
Run the program with inputs of varying size and composition.
Use a method like System.currentTimeMillis() to get an accurate measure of the actual running time.
Plot the results.
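
The procedure above can be sketched in Python, using time.perf_counter in place of Java's System.currentTimeMillis; the function, input generator, and sizes here are illustrative choices.

```python
import time

def benchmark(fn, make_input, sizes):
    """Run fn on an input of each size and record wall-clock times."""
    results = []
    for n in sizes:
        data = make_input(n)
        start = time.perf_counter()
        fn(data)
        elapsed = time.perf_counter() - start
        results.append((n, elapsed))
    return results

# Example: time the built-in max on lists of growing size.
timings = benchmark(max, lambda n: list(range(n)), [1000, 10000, 100000])
```

The resulting (size, time) pairs are what one would plot to inspect the growth rate.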

Limitations of Experiments
It is necessary to implement the algorithm, which may be difficult and may introduce implementation-related performance issues.
Results may not be indicative of the running time on inputs not included in the experiment.
In order to compare two algorithms, the same hardware and software environments must be used.

Theoretical Analysis
Uses a high-level description of the algorithm instead of an implementation.
Characterizes running time as a function of the input size (usually n).
Takes into account all possible inputs.
Allows us to evaluate the speed of an algorithm independent of the hardware/software environment.

Analysis Tools (Goodrich, 166) Asymptotic Analysis
Developed by computer scientists for analyzing the running time of algorithms.
Describes the running time of an algorithm in terms of the size of the input; in other words, how well it "scales" as the input gets larger.
The most common metric used is the upper bound of an algorithm, called the big-Oh of the algorithm, O(n).

Analysis Tools (Goodrich web page) Algorithm Analysis
Use asymptotic analysis.
Assume that we are using an idealized computer called the Random Access Machine (RAM):
A CPU with an unlimited amount of memory.
Accessing a memory cell takes one unit of time.
Primitive operations take one unit of time.
Perform the analysis on the pseudocode.

Analysis Tools (Goodrich, 48, 166) Pseudocode
High-level description of an algorithm.
Structured.
Not as detailed as a program.
Easier to understand (written for a human).

Algorithm arrayMax(A, n):
  Input: An array A storing n >= 1 integers.
  Output: The maximum element in A.
  currentMax ← A[0]
  for i ← 1 to n - 1 do
    if currentMax < A[i] then
      currentMax ← A[i]
  return currentMax

Analysis Tools (Goodrich, 48) Pseudocode Details
Declaration:
  Algorithm functionName(arg, ...)
  Input: ...
  Output: ...
Control flow:
  if ... then ... [else ...]
  while ... do ...
  repeat ... until ...
  for ... do ...
  Indentation instead of braces.
Return value:
  return expression
Expressions:
  ← for assignment (like = in Java)
  = for equality testing (like == in Java)

Analysis Tools (Goodrich, 48) Tips for Writing an Algorithm
Use the correct data structure.
Use the correct ADT operations.
Use object-oriented syntax.
Indent clearly.
Use a return statement at the end.

Analysis Tools (Goodrich, 164–165) Primitive Operations
To do asymptotic analysis, we can start by counting the primitive operations in an algorithm and adding them up (the count need not be very accurate).
Primitive operations that we assume take a constant amount of time:
Assigning a value to a variable.
Calling a function.
Performing an arithmetic operation.
Comparing two numbers.
Indexing into an array.
Returning from a function.
Following a pointer reference.

Analysis Tools (Goodrich, 166) Counting Primitive Operations
Inspect the pseudocode to count the primitive operations as a function of the input size n.

Algorithm arrayMax(A, n):
  currentMax ← A[0]
  for i ← 1 to n - 1 do
    if currentMax < A[i] then
      currentMax ← A[i]
  return currentMax

Operation                                    Count
Array indexing + assignment                  2
Initializing i                               1
Verifying i < n                              n
Array indexing + comparison                  2(n - 1)
Array indexing + assignment (worst case)     2(n - 1)
Incrementing the counter                     2(n - 1)
Returning                                    1

Best case: 2 + 1 + n + 4(n - 1) + 1 = 5n
Worst case: 2 + 1 + n + 6(n - 1) + 1 = 7n - 2
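
The best-case count 5n and worst-case count 7n - 2 can be checked mechanically with an instrumented Python sketch of arrayMax that charges each operation as in the table above:

```python
def array_max_op_count(a):
    """Count primitive operations charged by arrayMax on input a,
    using the per-line charges from the counting table."""
    n = len(a)
    ops = 2           # currentMax <- A[0]: array indexing + assignment
    current_max = a[0]
    ops += 1          # initializing i
    for i in range(1, n):
        ops += 1      # verifying i < n
        ops += 2      # array indexing + comparison
        if a[i] > current_max:
            current_max = a[i]
            ops += 2  # array indexing + assignment (worst case only)
        ops += 2      # incrementing the counter
    ops += 1          # the final loop test that fails (i = n)
    ops += 1          # returning
    return ops
```

A strictly increasing input triggers the assignment on every iteration (worst case); an input whose first element is the maximum never does (best case).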

Analysis Tools (Goodrich, 165) Best, Worst, or Average Case
A program may run faster on some input data than on others. Best case, worst case, and average case are terms that characterize the input data:
Best case: the data is distributed so that the algorithm runs fastest.
Worst case: the data distribution causes the slowest running time.
Average case: very difficult to calculate.
We will concentrate on analyzing algorithms by identifying the running time for worst-case data.

Analysis Tools (Goodrich, 166) Running Time of arrayMax
The worst-case running time of arrayMax is f(n) = 7n - 2 primitive operations.
The actual running time depends on the speed of the primitive operations; some of them are faster than others.
Let t be the time taken by the slowest primitive operation. Then the worst-case running time of arrayMax is bounded by f(n) = t(7n - 2).

Analysis Tools (Goodrich, 166) Growth Rate of arrayMax
The growth rate of arrayMax is linear.
Changing the hardware alters the value of t, so arrayMax will run faster on a faster computer, but the growth rate does not change:
Slow PC: 10(7n - 2)
Fast PC: 5(7n - 2)
Fastest PC: 1(7n - 2)

Analysis Tools Tests of arrayMax
Does it really work that way? We used the arrayMax algorithm from p. 166 of Goodrich and tested it on two PCs: a Pentium III at 866 MHz and a Pentium 4 at 2.4 GHz. Yes, the results on both show a linear growth rate.

Big-Oh Notation
Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and n0 such that f(n) <= c·g(n) for n >= n0.
Example: 2n + 10 is O(n):
2n + 10 <= cn
(c - 2)n >= 10
n >= 10/(c - 2)
Pick c = 3 and n0 = 10.
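
The definition can be spot-checked numerically: with the c = 3 and n0 = 10 from the example, 2n + 10 <= 3n holds at every sampled n >= 10. This Python sketch is a numeric check over a finite range, not a proof.

```python
def witnesses_big_oh(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c * g(n) for every integer n in [n0, n_max]."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))

# 2n + 10 is O(n), witnessed by c = 3, n0 = 10.
assert witnesses_big_oh(lambda n: 2 * n + 10, lambda n: n, c=3, n0=10)
```

Conversely, for n^2 versus n no constant works: any candidate c fails once n exceeds c.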

Big-Oh Meaning
Big-Oh expresses an upper bound on the growth rate of a function for sufficiently large values of n. It describes asymptotic behavior.
Example: 3n^2 + n lg n + 2000 is O(n^2); it is also O(n^3).

Big-Oh Example
Example: the function n^2 is not O(n):
n^2 <= cn implies n <= c, and this inequality cannot be satisfied for all n since c must be a constant.

Big-Oh Proof Example
50n^2 + 100n lg n + 1500 is O(n^2).
Proof: we need constants c and n0 such that 50n^2 + 100n lg n + 1500 <= c·n^2 for n >= n0. For example, pick c = 151 and n0 = 40 (you may complete the proof). You may also pick c = 1650 and n0 = 1.
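
Both claimed (c, n0) pairs can be checked numerically over a sampled range with a short Python sketch (a spot check, not a substitute for the proof):

```python
import math

def f(n):
    # The function from the example: 50n^2 + 100 n lg n + 1500.
    return 50 * n * n + 100 * n * math.log2(n) + 1500

# f(n) <= c * n^2 for c = 151 from n0 = 40 onward,
# and for c = 1650 from n0 = 1 onward (sampled up to 2000).
assert all(f(n) <= 151 * n * n for n in range(40, 2000))
assert all(f(n) <= 1650 * n * n for n in range(1, 2000))
```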

Big-Oh and Growth Rate
The big-Oh notation gives an upper bound on the growth rate of a function. The statement "f(n) is O(g(n))" means that the growth rate of f(n) is no more than the growth rate of g(n). We can use the big-Oh notation to rank functions according to their growth rate:

                   f(n) is O(g(n))    g(n) is O(f(n))
g(n) grows more    Yes                No
f(n) grows more    No                 Yes
Same growth        Yes                Yes

Big-O Theorems
For all the following theorems, assume that f(n) is a function of n and that K is an arbitrary constant.
Theorem 1: K is O(1).
Theorem 2: a polynomial is O(the term containing the highest power of n). Example: f(n) = 7n^4 + 3n^2 + 5n + 1000 is O(n^4).
Theorem 3: K·f(n) is O(f(n)) [that is, constant coefficients can be dropped]. Example: g(n) = 7n^4 is O(n^4).
Theorem 4: if f(n) is O(g(n)) and g(n) is O(h(n)), then f(n) is O(h(n)) [transitivity].

Big-O Theorems
Theorem 5: each of the following functions is strictly big-O of its successors (listed from smaller to larger):
K [constant]
logb(n) [assume log base 2 if no base is shown]
n
n·logb(n)
n^2
n to higher powers
2^n
3^n
larger constants to the n-th power
n! [n factorial]
n^n
For example: f(n) = 3n·log(n) is O(n·log(n)) and O(n^2) and O(2^n).

Big-O Theorems
Theorem 6: in general, f(n) is big-O of the dominant term of f(n), where "dominant" may usually be determined from Theorem 5. Examples:
f(n) = 7n^2 + 3n·log(n) + 5n + 1000 is O(n^2)
g(n) = 7n^4 + 3^n + 10^6 is O(3^n)
h(n) = 7n(n + log(n)) is O(n^2)
Theorem 7: for any base b, logb(n) is O(log(n)).

Asymptotic Algorithm Analysis
The asymptotic analysis of an algorithm determines its running time in big-Oh notation. To perform the asymptotic analysis:
Find the worst-case number of primitive operations executed as a function of the input size.
Express this function in big-Oh notation.
Example: we determined that algorithm arrayMax executes at most 7n - 2 primitive operations; we say that algorithm arrayMax "runs in O(n) time."
Since constant factors and lower-order terms are eventually dropped anyhow, we can disregard them when counting primitive operations.

Example: Computing Prefix Averages We further illustrate asymptotic analysis with two algorithms for prefix averages. The i-th prefix average of an array X is the average of the first (i + 1) elements of X: A[i] = (X[0] + X[1] + … + X[i]) / (i + 1). Computing the array A of prefix averages of another array X has applications to financial analysis.

Prefix Averages (Quadratic)
The following algorithm computes prefix averages in quadratic time by applying the definition directly.

Algorithm prefixAverages1(X, n)
  Input: array X of n integers
  Output: array A of prefix averages of X        #operations
  A ← new array of n integers                    n
  for i ← 0 to n - 1 do                          n
    s ← X[0]                                     n
    for j ← 1 to i do                            1 + 2 + … + (n - 1)
      s ← s + X[j]                               1 + 2 + … + (n - 1)
    A[i] ← s / (i + 1)                           n
  return A                                       1

Arithmetic Progression The running time of prefixAverages1 is O(1 + 2 + …+ n) The sum of the first n integers is n(n + 1) / 2 There is a simple visual proof of this fact Thus, algorithm prefixAverages1 runs in O(n2) time

Prefix Averages (Linear)
The following algorithm computes prefix averages in linear time by keeping a running sum.

Algorithm prefixAverages2(X, n)
  Input: array X of n integers
  Output: array A of prefix averages of X        #operations
  A ← new array of n integers                    n
  s ← 0                                          1
  for i ← 0 to n - 1 do                          n
    s ← s + X[i]                                 n
    A[i] ← s / (i + 1)                           n
  return A                                       1

Algorithm prefixAverages2 runs in O(n) time.
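
Both prefix-average algorithms can be sketched in Python; they compute the same array, but the second does only O(n) work by carrying the running sum across iterations.

```python
def prefix_averages_quadratic(x):
    # O(n^2): recompute each prefix sum from scratch.
    n = len(x)
    a = [0.0] * n
    for i in range(n):
        s = 0
        for j in range(i + 1):
            s += x[j]
        a[i] = s / (i + 1)
    return a

def prefix_averages_linear(x):
    # O(n): keep a running sum s instead of recomputing it.
    a = [0.0] * len(x)
    s = 0
    for i, value in enumerate(x):
        s += value
        a[i] = s / (i + 1)
    return a
```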

Math You Need to Review
Summations. Logarithms and exponents. Proof techniques. Basic probability.

Properties of logarithms:
logb(xy) = logb(x) + logb(y)
logb(x/y) = logb(x) - logb(y)
logb(x^a) = a·logb(x)
logb(a) = logx(a) / logx(b)

Properties of exponentials:
a^(b+c) = a^b · a^c
a^(bc) = (a^b)^c
a^b / a^c = a^(b-c)
b = a^(loga(b))
b^c = a^(c·loga(b))
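
The log and exponent identities are easy to spot-check numerically; this Python sketch uses arbitrary sample values for x, y, a, and b.

```python
import math

# Arbitrary sample values (any positive values != 1 work for bases).
x, y, a, b = 8.0, 32.0, 3.0, 2.0

# Properties of logarithms
assert math.isclose(math.log(x * y, b), math.log(x, b) + math.log(y, b))
assert math.isclose(math.log(x / y, b), math.log(x, b) - math.log(y, b))
assert math.isclose(math.log(x ** a, b), a * math.log(x, b))
assert math.isclose(math.log(a, b), math.log(a, x) / math.log(b, x))

# Properties of exponentials
assert math.isclose(a ** (b + 2), a ** b * a ** 2)
assert math.isclose(a ** (b * 2), (a ** b) ** 2)
assert math.isclose(a ** b / a ** 2, a ** (b - 2))
assert math.isclose(b, a ** math.log(b, a))
assert math.isclose(b ** a, a ** (a * math.log(b, a)))
```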

Relatives of Big-Oh
Big-Omega (lower bound): f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 >= 1 such that f(n) >= c·g(n) for n >= n0.
Big-Theta (order): f(n) is Θ(g(n)) if there are constants c' > 0 and c'' > 0 and an integer constant n0 >= 1 such that c'·g(n) <= f(n) <= c''·g(n) for n >= n0.

Intuition for Asymptotic Notation
Big-Oh: f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n).
Big-Omega: f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n).
Big-Theta: f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n).

Relatives of Big-Oh: Examples
5n^2 is Ω(n^2): f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n0 >= 1 such that f(n) >= c·g(n) for n >= n0; let c = 5 and n0 = 1.
5n^2 is also Ω(n): let c = 1 and n0 = 1.
5n^2 is Θ(n^2): f(n) is Θ(g(n)) if it is Ω(n^2) and O(n^2). We have already seen the former; for the latter, recall that f(n) is O(g(n)) if there is a constant c > 0 and an integer constant n0 >= 1 such that f(n) <= c·g(n) for n >= n0; let c = 5 and n0 = 1.

Algorithm Complexity
Complexity analysis (using big-O) applies to algorithms, not problems. It may be applied to measure the:
Best case: not very useful.
Worst case: most useful and common.
Average case: very useful, but can be very difficult since it depends on the input distribution.

What Can Be Analyzed? Resources consumed by primitive operations, such as:
CPU / primary memory operations
Primary memory usage
Secondary memory usage
I/O
Function calls
Communication
Number of processors used
…

Questions and Comments?