E-mail: cs415spring11@gmail.com CS 415 Algorithm Analysis Instructor: B. (Ravi) Ravikumar Office: 116 I Darwin Hall Phone: 664 3335 E-mail: cs415spring11@gmail.com Course Web site: http://ravi.cs.sonoma.edu/cs415sp11 1
Text book: Published in 2006. Other references: Cormen et al. Introduction to Algorithms Das Gupta, Papadimitriou and Vazirani, Algorithms (free online version) Numerous other sources Please report any errors to Kevin Wayne (wayne@cs.princeton.edu). Thanks!
Course policies, grading etc. Short quizzes and class participation – 5 to 10 points (There will be about 8 quizzes, each roughly 10 minutes long. The quizzes will not require difficult problem solving, but will require a basic understanding of the topic being covered.) Home work assignments - 15 points. Most home work problems will be from the text. They are intended to deepen your understanding of the material presented in class and will play a key role in prepare you for the mid-term and final exams. You can discuss the home work problems with others, but you should write up the solution on your own. Two mid-semester tests - 30 points (The mid-semester tests will be open-book/notes, and will be in class.) Project - 15 points. This will be an implementation project. Details will be discussed in class. Final Examination - 25 points. This will be comprehensive and will take place at the pre-specified time slot. You can find the date from the university’s calendar. Maintenance of course folder – 5 points.
Review of CS 242 and CS 315 handout contains questions/problems on CS 242 and CS 315.
What is an algorithm? In fact our field computer science is often described as a study of algorithms. At least, algorithm is the viewed as equivalent to “software”. We all have a good intuitive understanding of what an algorithm is. But what is the precise definition? Algorithms are much older than computers. How old are they? They are the ideas behind software and so don’t need a hardware to run them. (Or, can be run by “human computers”.)
Rhind Mathematical Papyrus An old algorithm – more than 3500 years old Rhind Mathematical Papyrus British Museum, London One of the algorithms presented is for multiplying two numbers: 59 41 _________________ 1 41 x 2 82 x 4 164 8 328 x 16 656 x 32 1312 x _________________ 2419
An algorithm from Al-Khwarizmi’s book (about 1200 years old)
Historically famous algorithms Classical (~ 2000 years) Finding area of triangles and other shapes (Heron’s formula) Finding GCD of integers (Euclid’s algorithm) Solving linear system of equations, quadratic equation etc. Intermediate (~ 500 years old) logarithm tables Roots of polynomials, solution of differential equations etc. Applications? Dividing land (after floods) Calculating trajectories of planets architecture (buildings, dams etc.) navigation (finding direction in deep sea) town planning, governance (tax rate etc.)
Famous modern algorithms (20th century algorithms) quick-sort Dijkstra’s algorithm RSA (for encryption) Fast Fourier Transform Simplex algorithm Huffman coding algorithm, LZW algorithm (both for compression) Hamming code (corrects errors) page-rank algorithm (of Brin and Page) There are probably many other candidates.
Euclid’s algorithm for GCD computation Euclid’s algorithm for finding the GCD of two positive integers. What is the definition of GCD? Algorithm GCD input: x, y – positive integers output: m – GCD of x and y
Euclid’s algorithm for GCD computation To define GCD, we first need the definition of divisor, then common divisor and finally greatest common divisor. p is a divisor of q if p is a multiple of q, or the remainder when q is divided by p is 0. (We write p | q) Example: 4 | 12, 5 | 30, but 12 | 30. p is a common divisor of x and y if p | x and p | y. p is the GCD of x and y if p is the largest among the common divisors of x and y.
Euclid’s algorithm for GCD computation Algorithm GCD input: x, y – positive integers output: m – GCD of x and y Step 1: if y > x, swap x and y. Step 2: if y = 0, set m = x and stop. Step 3: r = remainder when x divided by y x = y y = r Step 4: go to step 2 Exercise: compute the GCD of 948 and 156.
Correctness of the algorithm Does the algorithm stop on all inputs? (i.e., can we show that there is no infinite loop?) if the algorithm has no loops, this is not a problem. But most algorithms do. (Euclid’s does!) Does it output the correct result for all inputs? Work out some small test cases and see if we get the correct result. Even this is tricky. We need to know a different way (than using the present algorithm) to get the answer so that we can cross-check. But proving correctness requires mathematical argument, not just some test cases.
Another algorithm for GCD Find the prime factorization of both numbers. i.e., keep factoring the number until it is written as a product of primes. If a prime appears in both lists, choose it and remove it from both lists and repeat the process until there are no common elements. The product of all the chosen numbers is the GCD. Question: How old is this algorithm? Example: 948 and 156 948 = 2 x 2 x 3 x 79 156 = 2 x 2 x 3 x 13 Common prime factors 2, 2 and 3. GCD = 2 x 2 x 3 = 12
Proving the correctness of Euclid’s algorithm Two steps involved in proving that an algorithm is correct. Show that Euclid’s algorithm always terminates. Will be done in class. Show that Euclid’s algorithm always produces the correct output.
Complexity of the algorithm Time complexity is one of the central issues that we will be discussing. For every algorithm, we want to measure its time complexity. Rough definition of time complexity: The number of operations performed by the algorithm. What operations? we could count every operation, e.g. assignment, comparison, addition etc. but what is an operation? E.g. % or / may require many CPU cycles, or machine instructions. one approach is to fix an instruction set, and architecture and count the number of machine instructions. another approach is to focus on the most expensive operation – e.g. in the case of GCD problem, division is the most expensive operation.
Complexity depends on the size of the input Measured as a function of N, the input size. For the GCD problem, the input is the total number of bits in the inputs. Example: for the input (23, 132), the input size is 5 + 8 since we need 5 bits to represent 23, and 8 bits to represent 132.
Complexity of the algorithm Many ways to measure complexity: best-case: of all the inputs of size N, the input that makes the algorithm perform the fewest number of divisions. What is the best case for Euclid’s algorithm? worst-case: The one of size N for which the algorithm performs the maximum number of operations. average-case: add up the complexity for all inputs of size N, and divide it by the number of such inputs. The most common one used in this course is the worst-case complexity.
Algorithm design techniques At least two approaches: problem-oriented shortest-path sorting linear programming etc. technique-oriented greedy method dynamic programming divide and conquer transformation etc.
Reduction – mapping one problem to another Effectiveness of problem-oriented method is if you know how to solve problem A, then you know how to solve problem B. most problems can be solved by breaking into sub problems each of which is one of the standard problems for which well-crafted solution exists. a powerful way to reduce one problem to another is “problem transformation”.
Transformation consider a two player game in which players alternately choose a number between 1 and 9. (Each number can be chosen at most once.) At any stage, if a player has three numbers that add to 15, he/she has won. What will be your strategy?
Transformation – Another example Weighted Interval Scheduling Input. Set of jobs with start times, finish times, and weights. Goal. Find maximum weight subset of mutually compatible jobs. 23 12 20 26 13 20 11 16 Time 1 2 3 4 5 6 7 8 9 10 11