Presentation transcript:

Algorithms and data structures: basic definitions

An algorithm is a precise set of instructions for solving a particular task. A data structure is any data type (or representation) together with its associated operations.

Example data structures:
– Primitive data types (such as int, double, char) are data structures, because they have built-in algorithms for comparison, arithmetic, etc.
– More typical data structures are meant to organize and structure collections of data items. A sorted list of integers stored in an array is an example of such a data structure. Classes in Java, where data items are defined by means of instance variables and the associated operations are implemented by class methods, are another example.

In many cases, the same operation can be carried out in different ways, by means of different algorithms. One of the most important tasks for the program designer at the initial stage of software development is to identify the most appropriate algorithm for each operation associated with the data structure.
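As a minimal sketch of the "sorted list of integers stored in an array" mentioned above, the class below pairs the data (a backing array) with its operations (insertion and lookup). The class name SortedIntList and its methods are hypothetical names chosen for this illustration; they do not come from the slides.

```java
import java.util.Arrays;

/** Hypothetical illustration: a sorted list of ints stored in an array. */
final class SortedIntList {
    private int[] items = new int[4]; // backing array; doubled when full
    private int size = 0;             // number of occupied slots

    /** Inserts value and keeps the array sorted by shifting larger items right. */
    void add(int value) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2);
        }
        int i = size - 1;
        while (i >= 0 && items[i] > value) {
            items[i + 1] = items[i];
            i--;
        }
        items[i + 1] = value;
        size++;
    }

    /** Membership test using binary search over the occupied prefix. */
    boolean contains(int value) {
        return Arrays.binarySearch(items, 0, size, value) >= 0;
    }

    int size() {
        return size;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return items[index];
    }
}
```

The same "contains" operation could also be implemented by a linear scan; choosing between such alternatives is exactly the design decision the slide describes.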

Introduction to algorithm analysis

Algorithms can be compared and evaluated based on different criteria, depending on the purposes of the analysis. Among them are:
– Execution (or running) time.
– Space (or memory) needed.
– Correctness.
– Clarity.
– Etc.

In the majority of cases, execution time and correctness are the most important criteria for deciding how good or bad an algorithm is for a particular case. This is why we must know how to analyze and classify the execution time of an algorithm, and how to demonstrate its correctness.

How does algorithm correctness relate to the quality of software implementation?

Fundamental implementation goals for any software are:
1. Robustness. The program must generate the correct output for any possible input, including inputs that are not explicitly defined.
2. Adaptability. The program needs to be able to evolve over time to reflect changes in hardware and software environments.
3. Reusability. The same code should be usable in different applications.

In object-oriented programming there are three more goals in addition to these:
4. Abstraction, i.e. identifying certain properties (both procedural and declarative) of an object and then using them to specify a new object which represents a more detailed embodiment of the original one. Example: a Java interface specifies what an object does, but not how it does it, while the class implementing it handles the “how” part.
5. Encapsulation, i.e. software components should implement an abstraction without revealing its internal details.
6. Modularity, i.e. dividing the program into separate functional components (classes).

What affects the execution time of an algorithm?

Execution time depends upon:
– the size of the input (the number of steps performed differs from one input to another);
– computer characteristics (mostly processor speed);
– implementation details (programming language, compiler, etc.).

Taking all of these characteristics into account makes it very hard to say how efficient a given algorithm is in general. Therefore, we want to ignore all machine- and problem-dependent considerations in our analysis, and focus on the analysis of the algorithm’s structure. The first step in this analysis is to identify a small number of operations that are executed most often and thus affect the execution time the most.

Example 1: In the following program, which operation affects the run time of the program the most?

class lec1ex1 {
    public static void main (String [] args) {
        int[][] table = {{1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}};
        int[] sum = new int[3];
        int total_sum = 0;
        for (int i = 0; i < table.length; i++) {
            sum[i] = 0;
            // compute the sum of all entries of a given row
            // as well as the total sum of all entries
            for (int j = 0; j < table[i].length; j++) {
                sum[i] = sum[i] + table[i][j];
                total_sum = total_sum + table[i][j];
            }
            System.out.println ("The sum of the entries of row " + i + " is " + sum[i]);
        }
        System.out.println ("The total sum of entries is " + total_sum);
    }
}

Example 1 (cont.): Consider the following change in the above algorithm.

....
for (int i = 0; i < table.length; i++) {
    sum[i] = 0;
    // compute the sum of all entries of a given row
    // as well as the total sum of all entries
    for (int j = 0; j < table[i].length; j++) {
        sum[i] = sum[i] + table[i][j];
    }
    System.out.println ("The sum of the entries of row " + i + " is " + sum[i]);
    total_sum = total_sum + sum[i];
....

Question 1: How does this change affect the run time of the algorithm?
Question 2: How significant is the difference?

Example 2: Compare the run times of the following two algorithms (performing linear and binary search in an array of integers).

public static boolean linearSearch (int[] list, int target) {
    boolean result = false;
    for (int i = 0; i < list.length; i++)
        if (list[i] == target)
            result = true;
    return result;
}

public static boolean binarySearch (int[] list, int target) {
    boolean result = false;
    int low = 0, high = list.length - 1, middle;
    while (low <= high) {
        middle = (low + high) / 2;
        if (list[middle] == target) {
            result = true;
            return result;
        }
        else if (list[middle] < target)
            low = middle + 1;
        else
            high = middle - 1;
    }
    return result;
}

That is, the run time of an algorithm can be determined by analyzing its structure and counting the number of operations affecting its performance. Mathematically, this can be expressed by an expression of the form

C_0 + C_1 * f_1(N) + C_2 * f_2(N) + ... + C_n * f_n(N)

Typically, one of the terms of this expression is much bigger than the other terms. This term is called the leading term, and it defines the run time; C_i is called a constant of proportionality, and in most cases it can be ignored.

In general, the run time behavior of an algorithm is dominated by its behavior in the loops. Therefore, by analyzing the loop structure we can determine the number and the type of operations that affect algorithm performance the most.

A majority of algorithms have a run time proportional to one of the following functions (defined by the leading term with the constant of proportionality ignored):

1       All instructions are executed only once or at most several times. In this case, we say that the algorithm has a constant execution time.
logN    If the algorithm solves the original problem by transforming it into a smaller problem, cutting the size of the input by some constant fraction, then the program gets only slightly slower as N grows. In this case we say that the algorithm has a logarithmic execution time.
N       If a small amount of processing is done on each input element, we say that the algorithm has a linear execution time.
NlogN   If the algorithm solves the original problem by breaking it into sub-problems which can be solved independently, and then combines those solutions to get the solution of the original problem, its execution time is said to be NlogN.
N^2     If the algorithm processes all input data in a double nested loop, it is said to have a quadratic execution time.
N^3     If the algorithm processes all input data in a triple nested loop, it is said to have a cubic execution time.
2^N     If the execution time squares when the input size doubles, we say that the algorithm has an exponential execution time.
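To make the differences between these functions concrete, the following sketch evaluates each of them at N = 16 and N = 32, so the effect of doubling the input size can be read off directly. The class GrowthRates and the string labels it uses are assumptions made for this illustration only.

```java
/** Hypothetical helper that evaluates the common growth functions. */
final class GrowthRates {
    static double log2(double n) {
        return Math.log(n) / Math.log(2);
    }

    /** Value of the growth function with the given label at input size n. */
    static double value(String name, double n) {
        switch (name) {
            case "1":     return 1;
            case "logN":  return log2(n);
            case "N":     return n;
            case "NlogN": return n * log2(n);
            case "N^2":   return n * n;
            case "N^3":   return n * n * n;
            case "2^N":   return Math.pow(2, n);
            default: throw new IllegalArgumentException("unknown function: " + name);
        }
    }

    public static void main(String[] args) {
        String[] names = {"1", "logN", "N", "NlogN", "N^2", "N^3", "2^N"};
        for (String name : names) {
            // One row per function: its value before and after doubling N.
            System.out.printf("%-6s N=16: %10.0f   N=32: %12.0f%n",
                    name, value(name, 16), value(name, 32));
        }
    }
}
```

Doubling N adds 1 to logN, doubles N, quadruples N^2, multiplies N^3 by eight, and squares 2^N.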

Example 1 (cont.) Define and compare the run times of the two versions of the “sum problem” (assume a table with N rows and N columns):

version 1

for (int i = 0; i < table.length; i++) {
    sum[i] = 0;
    for (int j = 0; j < table[i].length; j++) {
        sum[i] = sum[i] + table[i][j];
        total_sum = total_sum + table[i][j];
    }
}

Number of additions: 2 * N^2
Algorithm efficiency: N^2

version 2

for (int i = 0; i < table.length; i++) {
    sum[i] = 0;
    for (int j = 0; j < table[i].length; j++)
        sum[i] = sum[i] + table[i][j];
    total_sum = total_sum + sum[i];
}

Number of additions: N^2 + N
Algorithm efficiency: N^2
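The two addition counts can be checked by instrumenting the loops. The sketch below mirrors the loop structure of both versions but only counts additions on the table entries (loop-counter increments are ignored); the class and method names are hypothetical.

```java
/** Hypothetical counter for the additions performed by the two "sum problem" versions. */
final class SumCounts {
    /** Version 1: two additions per table entry (row sum and total sum). */
    static long additionsVersion1(int[][] table) {
        long additions = 0;
        for (int i = 0; i < table.length; i++) {
            for (int j = 0; j < table[i].length; j++) {
                additions += 2; // sum[i] + entry, total_sum + entry
            }
        }
        return additions;
    }

    /** Version 2: one addition per entry, plus one per row for the total. */
    static long additionsVersion2(int[][] table) {
        long additions = 0;
        for (int i = 0; i < table.length; i++) {
            for (int j = 0; j < table[i].length; j++) {
                additions += 1; // sum[i] + entry
            }
            additions += 1;     // total_sum + sum[i]
        }
        return additions;
    }
}
```

For a 10 x 10 table this gives 200 additions for version 1 and 110 for version 2: roughly half the work, but the same N^2 leading term.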

Example 2 (cont.) Define and compare the run times of the two versions of the “search problem” (N = list.length):

version 1

for (int i = 0; i < list.length; i++)
    if (list[i] == target)
        result = true;

Number of comparisons: N
Algorithm efficiency: N

version 2

while (low <= high) {
    middle = (low + high) / 2;
    if (list[middle] == target) {
        result = true;
        return result;
    }
    else if (list[middle] < target)
        low = middle + 1;
    else
        high = middle - 1;
}

Number of comparisons: about log N (base 2)
Algorithm efficiency: log N
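A rough way to confirm these counts is to instrument both searches. In the sketch below, linearComparisons counts the equality tests of the linear search exactly as written above (no early exit), and binaryProbes counts loop iterations of the binary search; both names are hypothetical, and the midpoint is computed as low + (high - low) / 2, which gives the same index as (low + high) / 2 but cannot overflow.

```java
/** Hypothetical instrumented versions of the two searches. */
final class SearchCounts {
    /** Comparisons made by the linear search above, which always scans the whole list. */
    static int linearComparisons(int[] list, int target) {
        int comparisons = 0;
        for (int i = 0; i < list.length; i++) {
            comparisons++; // one test of list[i] == target per element
        }
        return comparisons;
    }

    /** Loop iterations (probes of the middle element) made by binary search on a sorted list. */
    static int binaryProbes(int[] sortedList, int target) {
        int probes = 0;
        int low = 0, high = sortedList.length - 1;
        while (low <= high) {
            int middle = low + (high - low) / 2;
            probes++;
            if (sortedList[middle] == target) {
                return probes;
            } else if (sortedList[middle] < target) {
                low = middle + 1;
            } else {
                high = middle - 1;
            }
        }
        return probes;
    }
}
```

For a sorted list of 1024 elements, the linear search always makes 1024 comparisons, while the binary search never needs more than 11 probes.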

Average case and worst case analysis

In the search problem, it will take at most N steps (linear search) or about log N steps (binary search) to find the target or to show that the target is not on the list. These are the worst cases, and most often this is exactly the case in which we want to know the algorithm's efficiency; it is called the worst case run time efficiency.

In most cases, it will take fewer than N (or log N) steps for the algorithm to find the solution, provided that the search stops as soon as the target is found (it may even take just one step in the best case). How many fewer, however, is often difficult to determine. The average run time of an algorithm can only be an estimate, because it depends on the input. This is why it is a less important characteristic of algorithm efficiency.
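The following sketch illustrates best, worst and average case for a linear search that stops at the first match. This early-exit variant is an assumption of the example (the linearSearch shown earlier always scans the whole list), the average is taken only over targets that are present and equally likely, and the list is assumed to hold distinct values.

```java
/** Hypothetical early-exit linear search used to compare best, worst and average cases. */
final class LinearSearchCases {
    /** Comparisons made by a linear search that stops at the first match. */
    static int comparisonsUntilFound(int[] list, int target) {
        for (int i = 0; i < list.length; i++) {
            if (list[i] == target) {
                return i + 1; // found after examining i + 1 elements
            }
        }
        return list.length;   // target absent: every element was examined
    }

    /** Average number of comparisons when each element of the list is searched for once. */
    static double averageComparisons(int[] list) {
        long total = 0;
        for (int target : list) {
            total += comparisonsUntilFound(list, target);
        }
        return (double) total / list.length;
    }
}
```

For 100 distinct elements the best case is 1 comparison, the worst case is 100, and the average under these assumptions is 50.5, i.e. (N + 1) / 2.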

The big-O notation

To express the run time efficiency of an algorithm more precisely, we use the so-called big-O notation, which is defined as follows:

Definition: A function g(N) is said to be O(f(N)) if there exist constants C_0 and N_0 such that g(N) < C_0 * f(N) for all N > N_0.

Consider the summing problem. It takes (N^2 + N) steps for version 2 to find the two sums. Here, g(N) = N^2 + N < N^2 + N^2 = 2 * N^2 for all N > 1. Let C_0 = 2 and N_0 = 1. Then g(N) is O(N^2). Version 1 takes 2 * N^2 steps, which is also O(N^2) (take C_0 = 3). Therefore, for both versions the run time of the algorithm is O(N^2).

The goal of the efficiency analysis is to show that the running time of the algorithm under consideration is O(f(N)) for some f.
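The inequality used for the summing problem can be spot-checked numerically. The sketch below tests whether g(N) < C_0 * f(N) holds for every N with N_0 < N <= limit, where g(N) = N^2 + N and f(N) = N^2; the class name, method names and the finite limit are assumptions of this example, and a finite check of this kind illustrates the definition rather than proving the bound.

```java
/** Hypothetical numeric spot-check of a big-O bound for g(N) = N^2 + N. */
final class BigOCheck {
    /** g(N) = N^2 + N, the step count of version 2 of the summing problem. */
    static long g(long n) {
        return n * n + n;
    }

    /** f(N) = N^2, the candidate bounding function. */
    static long f(long n) {
        return n * n;
    }

    /** True if g(N) < c0 * f(N) for every N with n0 < N <= limit. */
    static boolean boundHolds(long c0, long n0, long limit) {
        for (long n = n0 + 1; n <= limit; n++) {
            if (g(n) >= c0 * f(n)) {
                return false;
            }
        }
        return true;
    }
}
```

With c0 = 2 and n0 = 1 the bound holds for every N tested, while c0 = 1 fails immediately, since N^2 + N is never smaller than N^2.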

Notes on big-O notation

1. The statement that the running time of an algorithm is O(f(N)) does not mean that the algorithm ever takes that long.
2. The input that causes the worst case may be unlikely to occur in practice.
3. Almost always the constants C_0 and N_0 are unknown and need not be small. These constants may hide implementation details which are important in practice.
4. For small N, there is usually little difference in the performance of different algorithms.
5. The constant of proportionality, C_0, makes a difference only when comparing algorithms with the same O(f(N)).

To illustrate these notes, consider the following actual algorithms and their efficiencies:

Run time efficiency of the five algorithms: 33N, 46N log N, 13N^2, 3N^3, 2^N

[Table: actual run time of each algorithm for increasing input sizes N, in seconds unless stated otherwise. The input sizes and most entries are missing from this transcript; the surviving entries are "*10^14 centuries", "hours", "min", "39 days", "min", "1.5 days" and "108 years".]

Efficiency of recursive algorithms

Example 3: Consider the following recursive version of the binary search algorithm.

public static boolean binarySearchR (int[] list, int target, int low, int high) {
    // check for an empty range first, so that list[middle] is never read
    // when there is nothing left to search (e.g. for an empty list)
    if (low > high)
        return false;
    int middle = (low + high) / 2;
    if (list[middle] == target)
        return true;
    else if (list[middle] < target)
        return binarySearchR (list, target, middle+1, high);
    else
        return binarySearchR (list, target, low, middle-1);
}

Efficiency of recursive algorithms (contd.)

Two factors define the efficiency of a recursive algorithm:
1. The number of levels to which recursive calls are made before reaching the condition which triggers the return.
2. The amount of space and time consumed at any given recursive level.

The number of levels can be explicated by a tree of recursive calls. For the binary search example, we have the following tree of recursive calls (assume a list with 15 elements):

LEVEL 0:  [low = 0, high = 14]
LEVEL 1:  [low = 0, high = 6] OR [low = 8, high = 14]
LEVEL 2:  [low = 0, high = 2] OR [low = 4, high = 6] OR [low = 8, high = 10] OR [low = 12, high = 14]
LEVEL 3:  [low = 0, high = 0] OR [low = 2, high = 2] OR [low = 4, high = 4] OR [low = 6, high = 6] OR [low = 8, high = 8] OR [low = 10, high = 10] OR [low = 12, high = 12] OR [low = 14, high = 14]

Efficiency of recursive binary search (contd.)

The efficiency of recursive binary search is:
- At each level, the work done is O(1);
- The overall efficiency is proportional to the number of levels, i.e. about log N + 1 levels, which is O(log N).

Assume that always middle = low. Then each call that does not find the target shrinks the range by only one element, and the tree of recursive calls becomes a chain:

[low = 0, high = 14]
[low = 1, high = 14]
[low = 2, high = 14]
...
[low = 14, high = 14]

That is, N levels, i.e. O(N) efficiency.
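The number of levels in both situations can be measured with a small instrumented recursion. This is a sketch only: the method levels and its middleIsLow flag are hypothetical additions used to switch between the usual midpoint and the degenerate choice middle = low.

```java
/** Hypothetical instrumented recursive binary search that reports its recursion depth. */
final class RecursionDepth {
    /**
     * Returns the number of recursive levels that examine an element
     * while searching for target in the sorted range list[low..high].
     */
    static int levels(int[] list, int target, int low, int high, boolean middleIsLow) {
        if (low > high) {
            return 0; // empty range: no element is examined at this level
        }
        int middle = middleIsLow ? low : low + (high - low) / 2;
        if (list[middle] == target) {
            return 1;
        }
        if (list[middle] < target) {
            return 1 + levels(list, target, middle + 1, high, middleIsLow);
        }
        return 1 + levels(list, target, low, middle - 1, middleIsLow);
    }
}
```

For a sorted list of 15 elements and the largest element as target, the usual midpoint needs 4 levels (levels 0 to 3 of the tree above), while middle = low needs 15.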

Efficiency of recursive binary search (contd.)

An alternative way to define the efficiency of a recursive algorithm is by means of the so-called recurrence relations. A recurrence relation is an equation that expresses the time or space efficiency of an algorithm for a data set of size N in terms of the efficiency of the algorithm on a smaller data set.

For recursive binary search, the recurrence relation is:

C(N) = C(N/2) + 1 for N >= 2, with C(1) = 0

To define the efficiency, we have to solve this relation. Assume N = 2^n. Then,

C(2^n) = C(2^(n-1)) + 1
       = C(2^(n-2)) + 2
       = C(2^(n-3)) + 3
       = ...
       = C(2^1) + (n - 1)
       = C(2^0) + n
       = 0 + n
       = log N

Efficiency of recursive binary search (contd.)

The recurrence relation for binary search with middle = low is:

C(N) = C(N-1) + 1 for N >= 2, with C(1) = 1

To define the efficiency, we have to solve this relation.

C(N) = C(N-1) + 1
     = C(N-2) + 2
     = C(N-3) + 3
     = ...
     = C(1) + (N - 1)
     = 1 + N - 1
     = N
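Both recurrence relations for binary search can also be evaluated directly, which gives a quick check on the closed forms log N and N. The sketch below assumes integer division for N/2; the class and method names are hypothetical.

```java
/** Hypothetical evaluators for the two binary search recurrences. */
final class SearchRecurrences {
    /** Evaluates C(N) = C(N/2) + 1 with C(1) = 0, using integer division. */
    static int halvingCost(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        return n == 1 ? 0 : halvingCost(n / 2) + 1;
    }

    /** Evaluates C(N) = C(N-1) + 1 with C(1) = 1, iteratively to avoid deep recursion. */
    static int decrementCost(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        int cost = 1;                  // C(1)
        for (int k = 2; k <= n; k++) {
            cost = cost + 1;           // C(k) = C(k-1) + 1
        }
        return cost;
    }
}
```

For N = 1024 = 2^10, halvingCost returns 10 (that is, log N), while decrementCost returns 1024 (that is, N).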

Efficiency of recursive algorithms (contd.)

Consider the “tower of Hanoi” algorithm:

public static void towerOfHanoi (int numberOfDisks, char from, char temp, char to) {
    if (numberOfDisks == 1)
        System.out.println ("Disk 1 moved from " + from + " to " + to);
    else {
        // move the top N-1 disks out of the way, move the largest disk,
        // then move the N-1 disks on top of it
        towerOfHanoi (numberOfDisks-1, from, to, temp);
        System.out.println ("Disk " + numberOfDisks + " moved from " + from + " to " + to);
        towerOfHanoi (numberOfDisks-1, temp, from, to);
    }
}

Efficiency of the “tower of Hanoi” algorithm (contd.)

Notice that two new recursive calls are initiated at each step. This suggests an exponential efficiency, i.e. O(2^N). This result is obvious from the tree of recursive calls, which for four disks is the following:

N = 4
N = 3 AND N = 3
N = 2 AND N = 2 | N = 2 AND N = 2
N = 1, N = 1 | N = 1, N = 1 | N = 1, N = 1 | N = 1, N = 1

Efficiency of the “tower of Hanoi” algorithm (contd.)

The recurrence relation describing this algorithm is the following:

C(N) = 2 * C(N-1) + 1 for N >= 2, with C(1) = 1

The solution of this relation gives the efficiency of the “tower of Hanoi” algorithm.

C(N) = 2 * C(N-1) + 1
     = 2 * (2 * C(N-2) + 1) + 1 = 2^2 * C(N-2) + 2 + 1
     = 2^2 * (2 * C(N-3) + 1) + 2 + 1 = 2^3 * C(N-3) + 2^2 + 2 + 1
     = 2^3 * (2 * C(N-4) + 1) + 2^2 + 2 + 1 = 2^4 * C(N-4) + 2^3 + 2^2 + 2 + 1
     = ...
     = 2^(N-1) * C(1) + 2^(N-2) + 2^(N-3) + ... + 2 + 1
     = 2^(N-1) + 2^(N-2) + 2^(N-3) + ... + 2 + 1
     = 2^N - 1
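The closed form 2^N - 1 can be compared against a direct count of the moves made by the recursive scheme. The following sketch replaces the print statements of the algorithm with a counter; the names countMoves and closedForm are hypothetical.

```java
/** Hypothetical move counter for the recursive "tower of Hanoi" scheme. */
final class HanoiMoves {
    /** Counts moves: move N-1 disks aside, move the largest disk, move N-1 disks back on top. */
    static long countMoves(int numberOfDisks) {
        if (numberOfDisks == 1) {
            return 1;
        }
        return countMoves(numberOfDisks - 1) + 1 + countMoves(numberOfDisks - 1);
    }

    /** The closed form 2^N - 1 obtained by solving the recurrence. */
    static long closedForm(int numberOfDisks) {
        return (1L << numberOfDisks) - 1;
    }
}
```

For four disks both methods give 15 moves, and each additional disk roughly doubles the count.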

A note on space efficiency

The amount of space used by a program, like the number of seconds, depends on a particular implementation. However, some general analysis of the space needed for a given program can be made by examining the algorithm. A program requires storage space for instructions, input data, constants, variables and objects.

If input data have one natural form (for example, an array), we can analyze the amount of extra space used, aside from the space needed for the program and its data. If the amount of extra space is constant w.r.t. the input size, the algorithm is said to work in place. If the input can be represented in different forms, then we must consider the space required for the input itself plus the extra space.
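As an illustration of working in place, the sketch below reverses an array in two ways: one uses a constant amount of extra space regardless of the array size, the other allocates a second array of the same size. Array reversal is an example chosen here, not one taken from the slides, and the names are hypothetical.

```java
/** Hypothetical example contrasting in-place and out-of-place array reversal. */
final class ReverseExamples {
    /** Reverses the array in place: extra space is two indices and one temporary. */
    static void reverseInPlace(int[] a) {
        for (int i = 0, j = a.length - 1; i < j; i++, j--) {
            int tmp = a[i];
            a[i] = a[j];
            a[j] = tmp;
        }
    }

    /** Returns a reversed copy: extra space grows linearly with the input size. */
    static int[] reversedCopy(int[] a) {
        int[] out = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            out[a.length - 1 - i] = a[i];
        }
        return out;
    }
}
```

Both methods take time proportional to N; they differ only in the extra space they need.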