CSCI 256 Data Structures and Algorithm Analysis, Lecture 3. Some slides by Kevin Wayne, copyright © 2005 Pearson Addison-Wesley, all rights reserved; some by Iker Gondra.

Computational Tractability

A major focus of algorithm design is to find efficient algorithms for computational problems. What does it mean for an algorithm to be efficient?

"As soon as an Analytic Engine exists, it will necessarily guide the future course of the science. Whenever any result is sought by its aid, the question will arise - By what course of calculation can these results be arrived at by the machine in the shortest time?" - Charles Babbage (1864)

[Figure: Analytic Engine (schematic)]

Some Initial Attempts at Defining Efficiency

Proposed Definition of Efficiency (1): An algorithm is efficient if, when implemented, it runs quickly on real input instances.
– Where does it run? Even bad algorithms can run quickly when applied to small test cases on extremely fast processors
– How is it implemented? Even good algorithms can run slowly when they are coded sloppily
– What is a "real" input instance? Some instances can be much harder than others
– How well, or badly, does the running time scale as problem sizes grow to unexpected levels?
– We need a concrete definition that is platform-independent, instance-independent, and of predictive value with respect to increasing input sizes

Worst-Case Analysis

Worst-case running time: Obtain a bound on the largest possible running time of the algorithm on inputs of a given size N.
– Draconian view, but hard to find an effective alternative
– Generally captures efficiency in practice

Average-case running time: Obtain a bound on the running time of the algorithm averaged over random inputs, as a function of the input size N.
– Hard (or impossible) to accurately model real instances by random distributions
– An algorithm tuned for a certain distribution may perform poorly on other inputs

Brute-Force Search

OK, but what is a reasonable analytical benchmark that can tell us whether a running-time bound is impressive or weak?
– For many non-trivial problems there is a natural brute-force search algorithm that checks every possible solution (i.e., try all possibilities and see if any one works): n! possibilities for Stable Matching (n = number of pairs of men and women)
– Note that this is an intellectual cop-out; it provides us with absolutely no insight into the problem structure
– Thus, a first simple guide is comparison with brute-force search

Proposed Definition of Efficiency (2): An algorithm is efficient if it achieves qualitatively better worst-case performance than brute-force search.
– Still vague: what is "qualitatively better performance"?

Polynomial Time as a Definition of Efficiency

Desirable scaling property: Algorithms with polynomial running time have the property that increasing the problem size by a constant factor increases the running time by at most a constant factor.

An algorithm is polynomial-time if the above scaling property holds: there exist constants c > 0 and d > 0 such that on every input of size N, its running time is bounded by c·N^d steps.

Polynomial Time as a Definition of Efficiency

In the above, if we increase the input size from N to 2N, then the running time is c(2N)^d = c·2^d·N^d, a slowdown by a factor of 2^d, which is a constant since d is a constant.

Note: for the Stable Matching problem, the size N of the input is really 2n^2, where n = number of men.
– Why? There are n men and n women, and each has a preference list of size n, so the size of the input is N = 2n × n = 2n^2

Polynomial Time as a Definition of Efficiency

Proposed Definition of Efficiency (3): An algorithm is efficient if it has a polynomial running time. (n^100 is polynomial and n^(1 + 0.02 log n) is not; generally we would be happier with the second running time.) BUT: polynomial running time as a notion of efficiency really works in practice!
– Generally, polynomial time seems to capture the algorithms that are efficient in practice
– Although 6.02 × 10^23 × N^20 is technically polynomial-time, it would be useless in practice
– In practice, the polynomial-time algorithms that people develop almost always have low constants and low exponents
– Breaking through the exponential barrier of brute force typically exposes some crucial structure of the problem

Polynomial Time as a Definition of Efficiency One further reason why the mathematical formalism and the empirical evidence seem to line up well in the case of polynomial-time solvability is that the gulf between the growth rates of polynomial and exponential functions is enormous

Asymptotic Order of Growth

We could give a very concrete statement about the running time of an algorithm on inputs of size N, such as: on any input of size N, the algorithm runs for at most 1.62N^2 + 3.5N + 8 steps.
– Finding such a precise bound may be an exhausting activity, and more detail than we wanted anyway
– Extremely detailed statements about the number of steps an algorithm executes are often meaningless. Why?

Why Ignore Constant Factors?

Constant factors are arbitrary
– Depend on the implementation
– Depend on the details of the architecture being used to do the computing

Determining the constant factors is tedious

Determining constant factors provides little insight
– A more coarse-grained approach allows us to uncover similarities among different classes of algorithms

Why Emphasize Growth Rates?

The algorithm with the lower growth rate will be faster for all but a finite number of cases (small inputs)

Performance matters most for large problem sizes

As memory prices continue to fall, bigger problem sizes become feasible

Improving the growth rate often requires new techniques

Formalizing Growth Rates

Let T(n) = the worst-case running time of an algorithm on inputs of size n.

Upper bounds: T(n) is O(f(n)) (say: T(n) is order f(n), or T is asymptotically upper bounded by f(n)) if there exist constants c > 0 and n_0 ≥ 0 such that for all n ≥ n_0 we have T(n) ≤ c · f(n).

To express that an upper bound is the best possible:

Lower bounds: T(n) is Ω(f(n)) (say: T is asymptotically lower bounded by f(n)) if there exist constants c > 0 and n_0 ≥ 0 such that for all n ≥ n_0 we have T(n) ≥ c · f(n).

Formalizing Growth Rates

Tight bounds: T(n) is Θ(f(n)) (say: f(n) is an asymptotically tight bound for T(n)) if T(n) is both O(f(n)) and Ω(f(n)).

Ex: T(n) = 32n^2 + 17n + 37. Show:
– T(n) is O(n^2), O(n^3), Ω(n^2), Ω(n), and Θ(n^2)
– T(n) is not O(n), not Ω(n^3), not Θ(n), and not Θ(n^3)

Growth Rates ctd

– Note: for n ≥ 1, c = (32 + 17 + 37) = 86 works to show T(n) is O(n^2)
– How would we do this? Try induction!!!
– Note: for n ≥ 1, c = 32 works to show T(n) is Ω(n^2)
– You should be able to show the following:
– Clearly, if T(n) is not O(n), it is not Θ(n)
– Similarly, if T(n) is not Ω(n^3), it is not Θ(n^3)
– If T(n) is O(n), it is O(n^2)
– If T(n) is Ω(n^3), it is Ω(n^2)
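As a quick illustration (not a proof, and assuming the reconstructed example T(n) = 32n^2 + 17n + 37), a few lines of Python can spot-check these constants over a range of n:

    # Spot-check (not a proof): 32n^2 <= T(n) <= 86n^2 for all n >= 1,
    # witnessing c = 86 for the O(n^2) bound and c = 32 for the Omega(n^2) bound.
    def T(n):
        return 32 * n**2 + 17 * n + 37

    for n in range(1, 10001):
        assert 32 * n**2 <= T(n) <= 86 * n**2
    print("bounds hold for 1 <= n <= 10000")

The induction proof works the same way: 17n + 37 ≤ (17 + 37)n^2 for n ≥ 1, so the three coefficients can simply be summed.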

Properties of Asymptotic Growth Rates

Of course, in the previous problem the definition of lim as n → ∞ is the key!!! And what is that???

Properties of Asymptotic Growth Rates

Transitivity (proved in class)
– If f = O(g) and g = O(h), then f = O(h)
– If f = Ω(g) and g = Ω(h), then f = Ω(h)
– If f = Θ(g) and g = Θ(h), then f = Θ(h)

Additivity (proved in class)
– If f = O(h) and g = O(h), then f + g = O(h)
– If f = Ω(h) and g = Ω(h), then f + g = Ω(h)
– If f = Θ(h) and g = O(h), then f + g = Θ(h)
– If f_i = O(h) for i = 1, …, k, then f_1 + f_2 + … + f_k = O(h)

Recall: Properties of Exponentials

(a^x)^y = a^(xy)
a^(x+y) = a^x · a^y
a^(x−y) = a^x / a^y

Recall: Properties of Logs

Recall log_a m = x if and only if m = a^x
Ex: Show log_a(a^x) = x and a^(log_a x) = x

log_a(mn) = log_a m + log_a n
log_a(m/n) = log_a m − log_a n
log_a(m^x) = x log_a m
log_a a = 1

Change of base: log_a m = (log_a b)(log_b m), so log_a m / log_a b = log_b m

Here, a, b > 1; m, n, x > 0
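A numeric spot-check of these identities in Python (an illustration, not a proof; the constants a, b, m, n below are arbitrary choices):

    # Spot-check the log identities above with math.log(x, base).
    import math

    a, b, m, n = 3.0, 7.0, 5.0, 11.0
    assert math.isclose(math.log(m * n, a), math.log(m, a) + math.log(n, a))
    assert math.isclose(math.log(m / n, a), math.log(m, a) - math.log(n, a))
    assert math.isclose(math.log(m**4, a), 4 * math.log(m, a))
    # Change of base: log_a m = (log_a b)(log_b m)
    assert math.isclose(math.log(m, a), math.log(b, a) * math.log(m, b))
    print("all identities hold numerically")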

Asymptotic Bounds for Some Common Functions

Polynomials: a_0 + a_1 n + … + a_d n^d is Θ(n^d) if a_d > 0
– Show why, even when a_i < 0 for some i < d
– Polynomial time: running time is O(n^d) for some constant d independent of the input size n

Logarithms: O(log_a n) = O(log_b n) for any a, b > 1, so we can avoid specifying the base
– Show why
– For every x > 0, log n = O(n^x) (x any positive number): log grows more slowly than every polynomial

Exponentials: f(n) = r^n

Every exponential grows faster than every polynomial:
– For every r > 1 and every d > 0, n^d = O(r^n)
– In contrast to the situation for logs, to say a function is exponential is somewhat sloppy: if r > s > 1, it is never the case that r^n = Θ(s^n)
– But we do often say that a running time is exponential without specifying the base
– In general, logs grow more slowly than polynomials, which grow more slowly than exponentials

Exercise

Take the following list of functions and arrange them in ascending order of growth rate. That is, if function g(n) immediately follows function f(n) in your list, then it should be the case that f(n) is O(g(n)).
– f_1(n) = 10^n
– f_2(n) = n^(1/3)
– f_3(n) = n^n
– f_4(n) = log_2 n
– f_5(n) = 2^(n/100)

Common Running Times: Linear Time: O(n)

Linear time: Running time is at most a constant factor times the size of the input.
– E.g., Computing the maximum: compute the maximum of n numbers a_1, …, a_n

    max ← a_1
    for i = 2 to n {
        if (a_i > max)
            max ← a_i
    }
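A direct Python rendering of this pseudocode (a sketch; in practice Python's built-in max performs the same single O(n) pass):

    # O(n) maximum: one pass, constant work per element.
    def find_max(a):
        best = a[0]
        for x in a[1:]:
            if x > best:
                best = x
        return best

    print(find_max([3, 1, 4, 1, 5, 9, 2, 6]))  # prints 9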

Common Running Times: Linear Time: O(n)

– E.g., Merge: combine two sorted lists A = a_1, a_2, …, a_n and B = b_1, b_2, …, b_n into a sorted whole
– Claim: merging two lists of size n takes O(n) time
– Pf: After each comparison, the length of the output list increases by 1

    i = 1, j = 1
    while (both lists are nonempty) {
        if (a_i ≤ b_j)
            append a_i to output list and increment i
        else
            append b_j to output list and increment j
    }
    append remainder of nonempty list to output list
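The same merge in runnable Python (a sketch): each loop iteration appends exactly one element, so the total work is O(len(A) + len(B)).

    # Merge two sorted lists in linear time.
    def merge(A, B):
        out = []
        i = j = 0
        while i < len(A) and j < len(B):
            if A[i] <= B[j]:
                out.append(A[i]); i += 1
            else:
                out.append(B[j]); j += 1
        out.extend(A[i:])  # append remainder of the nonempty list
        out.extend(B[j:])  # (one of these two slices is empty)
        return out

    print(merge([1, 3, 5], [2, 4, 6]))  # [1, 2, 3, 4, 5, 6]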

Common Running Times: O(n log n) Time

O(n log n) time: Arises in divide-and-conquer algorithms (we will see that such an algorithm splits its input in half, solves each half recursively, and combines the pieces in linear time).

Sorting: Mergesort and heapsort are sorting algorithms that perform O(n log n) comparisons.

We frequently find algorithms with running time O(n log n) because in many algorithms the most expensive step is sorting the input. A minimal mergesort sketch follows.
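A minimal mergesort sketch in Python (assuming the merge function from the previous example): split in half, sort each half recursively, combine in linear time.

    # Mergesort: T(n) = 2T(n/2) + O(n), which solves to O(n log n).
    # Uses merge() from the previous sketch.
    def mergesort(a):
        if len(a) <= 1:
            return a
        mid = len(a) // 2
        return merge(mergesort(a[:mid]), mergesort(a[mid:]))

    print(mergesort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]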

Common Running Times: Quadratic Time: O(n^2)

Quadratic time:
– E.g., Closest pair of points: given a list of n points in the plane (x_1, y_1), …, (x_n, y_n), find the pair that is closest
– Here's an O(n^2) solution: try all pairs of points
– Remark: Ω(n^2) seems inevitable but, as we will see, this is just an illusion

    min ← (x_1 − x_2)^2 + (y_1 − y_2)^2
    for i = 1 to n {
        for j = i+1 to n {
            d ← (x_i − x_j)^2 + (y_i − y_j)^2
            if (d < min)
                min ← d
        }
    }
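The brute-force version in Python (a sketch; comparing squared distances avoids an unneeded square root):

    # Brute-force closest pair: examines all n(n-1)/2 pairs, so Theta(n^2).
    def closest_pair(points):
        best, best_d = None, float("inf")
        for i in range(len(points)):
            for j in range(i + 1, len(points)):
                (x1, y1), (x2, y2) = points[i], points[j]
                d = (x1 - x2) ** 2 + (y1 - y2) ** 2
                if d < best_d:
                    best_d, best = d, (points[i], points[j])
        return best

    print(closest_pair([(0, 0), (3, 4), (1, 1)]))  # ((0, 0), (1, 1))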

Common Running Times: Cubic Time: O(n^3)

Cubic time:
– E.g., Set disjointness: given n sets S_1, …, S_n, each of which is a subset of {1, 2, …, n}, is there some pair of these sets which are disjoint?
– O(n^3) solution: for each pair of sets, determine if they are disjoint (assume that each set S_i is represented in such a way that all elements of S_i can be listed in constant time per element, and we can also check in constant time whether a given number p belongs to S_i)

    foreach set S_i {
        foreach other set S_j {
            foreach element p of S_i {
                determine whether p also belongs to S_j
            }
            if (no element of S_i belongs to S_j)
                report that S_i and S_j are disjoint
        }
    }
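A Python sketch: built-in sets provide the constant-time membership test the analysis assumes, giving O(n^2) pairs × O(n) elements per check = O(n^3).

    # Cubic-time set disjointness over n sets drawn from {1, ..., n}.
    def disjoint_pairs(sets):
        pairs = []
        for i in range(len(sets)):
            for j in range(i + 1, len(sets)):
                # O(n) membership tests, each O(1) expected for a Python set
                if all(p not in sets[j] for p in sets[i]):
                    pairs.append((i, j))
        return pairs

    print(disjoint_pairs([{1, 2}, {2, 3}, {4}]))  # [(0, 2), (1, 2)]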

Common Running Times: Polynomial Time: O(n^k)

O(n^k) time: Occurs when we search over all subsets of size k (k is a constant).
– E.g., Independent set of size k: given a graph, are there k nodes such that no two are joined by an edge? To answer this we can enumerate all subsets of k nodes and check whether each is independent
– Checking whether S is an independent set takes O(k^2) time
– The number of k-element subsets is C(n, k) = n(n−1)⋯(n−k+1) / k! ≤ n^k / k!
– Total: O(k^2 · n^k / k!) = O(n^k); poly-time for k = 17, but not practical

    foreach subset S of k nodes {
        check whether S is an independent set
        if (S is an independent set)
            report S is an independent set
    }
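A sketch in Python using itertools.combinations to enumerate the C(n, k) ≤ n^k / k! subsets (the graph is assumed here to be a dictionary mapping each node to its adjacency set):

    # O(n^k) search for an independent set of size k (k a constant).
    from itertools import combinations

    def independent_set_of_size(adj, k):
        for S in combinations(adj, k):  # at most n^k / k! subsets
            # O(k^2) check: no edge between any pair of nodes in S
            if all(v not in adj[u] for u, v in combinations(S, 2)):
                return S
        return None

    adj = {1: {2}, 2: {1, 3}, 3: {2}, 4: set()}
    print(independent_set_of_size(adj, 3))  # (1, 3, 4)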

Common Running Times: Exponential Time

Exponential time: occurs when we wish to enumerate all subsets of an n-element set.
– E.g., Independent set: given a graph, what is the maximum size of an independent set?
– O(n^2 · 2^n) solution: enumerate all subsets and determine the max independent set

    S* ← ∅
    foreach subset S of nodes {
        check whether S is an independent set
        if (S is the largest independent set seen so far)
            update S* ← S
    }
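A Python sketch of the exhaustive search (same assumed adjacency-set representation as above): 2^n subsets, with an O(n^2) independence check per subset.

    # O(n^2 * 2^n): try every subset, keep the largest independent one.
    from itertools import combinations

    def max_independent_set(adj):
        nodes = list(adj)
        best = ()
        for size in range(len(nodes) + 1):
            for S in combinations(nodes, size):  # all 2^n subsets overall
                if all(v not in adj[u] for u, v in combinations(S, 2)):
                    if len(S) > len(best):
                        best = S
        return best

    adj = {1: {2}, 2: {1, 3}, 3: {2}, 4: set()}
    print(max_independent_set(adj))  # (1, 3, 4)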

Review assignment for the student

This chapter also reviews priority queues and uses heap operations to implement a priority queue - review pages 57-65 in the text.