Compsci 201, Mathematical & Empirical Analysis


Compsci 201, Mathematical & Empirical Analysis
Owen Astrachan, Jeff Forbes
September 27, 2017
9/22/17 Compsci 201, Fall 2017, Analysis+Markov

I is for …
Invariant: reasoning about your code
Interface Inheritance: MarkovModel implements MarkovInterface<String>
Inheritance: EfficientMarkov extends MarkovModel
Identity: You’re a computer scientist.
9/27/17 Compsci 201, Fall 2017, Analysis

Plan for the Week
Empirical & mathematical analysis of algorithms
Big-Oh basics
Calculations from code
Code in https://coursework.cs.duke.edu/201fall17/classwork
Towards Test #1

Computer Science Scientific Method
Observe some feature of the natural world.
Hypothesize a model that is consistent with the observations.
Predict events using the hypothesis.
Verify the predictions by making further observations.
Validate by repeating until the hypothesis and observations agree.
Principles: experiments we design must be reproducible; hypotheses must be falsifiable.
In CompSci 201: empirical & mathematical analysis

Scientific Method: Empirical & mathematical analysis
Analysis of algorithms: a framework for comparing algorithms and predicting performance.
The scientific method (observe, hypothesize, predict, verify, validate) applies to programs too, under the same principles: experiments must be reproducible and hypotheses must be falsifiable.

Dropping Glass Balls
Tower with 100 floors; given 2 glass balls.
Want to determine the lowest floor from which a ball can be dropped and will break.
How?
Is your algorithm the most efficient one?
Generalize to n floors.

Glass balls continued
http://bit.ly/CS201-f17-0927-0
Assume the number of floors is 100.
In the best case, how many balls will I have to drop to determine the lowest floor where a ball will break?
In the worst case, how many balls will I have to drop?
If there are n floors, how many balls will you have to drop? (roughly)
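One common way to reason about the worst case with two balls, offered here as a hedged sketch and not as the course's official answer: drop the first ball at floors spaced by shrinking gaps k, k-1, k-2, ..., so that k drops cover k(k+1)/2 floors. The class and method names (`GlassBalls`, `worstCaseDrops`) are illustrative and do not come from the course code.

```java
class GlassBalls {
    // Smallest k with k(k+1)/2 >= n. With two balls, k drops are enough in
    // the worst case if the first ball is dropped at gaps k, k-1, k-2, ...
    static int worstCaseDrops(int n) {
        int k = 0;
        long covered = 0;          // floors that k drops can distinguish
        while (covered < n) {
            k++;
            covered += k;
        }
        return k;
    }

    public static void main(String[] args) {
        for (int n : new int[] {100, 1000, 10000}) {
            System.out.println(n + " floors: " + worstCaseDrops(n)
                    + " drops; sqrt(2n) = " + Math.sqrt(2.0 * n));
        }
    }
}
```

Under this strategy the number of drops grows roughly like the square root of 2n, which is one candidate answer to the "roughly" question above.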

What is big-Oh about? (preview)
Intuition: avoid details when they don’t matter, and they don’t matter when the input size (N) is big enough.
For polynomials, use only the leading term and ignore coefficients:
    y = 3x      y = 6x - 2         y = 15x + 44
    y = x^2     y = x^2 - 6x + 9   y = 3x^2 + 4x
The first family is O(n), the second is O(n^2).
Intuition: a family of curves, generally the same shape.
More formally: O(f(n)) is an upper bound; when n is large enough, the expression c·f(n) is larger than the function being bounded.
Intuition: for a linear function, doubling the input doubles the time; for a quadratic function, doubling the input quadruples the time.
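To make the upper-bound idea concrete, here is a small sketch that checks one possible choice of constants for the slide's example y = 3x^2 + 4x. The constants c = 4 and n0 = 4, and the names `BigOhWitness` and `boundedBy`, are assumptions chosen for illustration; a finite check like this is evidence, not a proof.

```java
class BigOhWitness {
    // g(n) = 3n^2 + 4n, one of the quadratic examples from the slide.
    static long g(long n) { return 3 * n * n + 4 * n; }

    // f(n) = n^2, the claimed growth class.
    static long f(long n) { return n * n; }

    // True when g(n) <= c * f(n) for every n in [n0, limit].
    static boolean boundedBy(long c, long n0, long limit) {
        for (long n = n0; n <= limit; n++) {
            if (g(n) > c * f(n)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // 3n^2 + 4n <= 4n^2 exactly when 4n <= n^2, that is, when n >= 4.
        System.out.println("c = 4, n0 = 4: " + boundedBy(4, 4, 100_000));
        // No n0 rescues c = 3, because 4n is always positive.
        System.out.println("c = 3, n0 = 1: " + boundedBy(3, 1, 100_000));
    }
}
```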

More on O-notation, big-Oh
Big-Oh hides/obscures some empirical analysis, but is good for a general description of an algorithm.
It allows us to compare algorithms in the limit: 20N hours vs. N^2 microseconds, which is better?
O-notation is an upper bound: this means that N is O(N), but it is also O(N^2); we try to provide tight bounds.
Formally: g(N) ∈ O(f(N)) iff there exist constants c and n0 such that g(N) < c·f(N) for all N > n0.
[Figure: curves c·f(N) and g(N), with x = n0 marked]
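The "20N hours vs. N^2 microseconds" question can be worked out by converting both costs to the same unit. The sketch below does the arithmetic; the class name `Crossover` and its method names are illustrative only.

```java
class Crossover {
    static final double MICROS_PER_HOUR = 3600.0 * 1_000_000.0;

    // Cost of the linear algorithm, 20N hours, expressed in microseconds.
    static double linearMicros(double n) { return 20.0 * n * MICROS_PER_HOUR; }

    // Cost of the quadratic algorithm, N^2 microseconds.
    static double quadraticMicros(double n) { return n * n; }

    // N^2 = 20 * N * MICROS_PER_HOUR  =>  N = 20 * MICROS_PER_HOUR.
    static double crossoverN() { return 20.0 * MICROS_PER_HOUR; }

    public static void main(String[] args) {
        System.out.println("Costs are equal at N = " + crossoverN());
        for (double n : new double[] {1e6, 1e11}) {
            String faster = quadraticMicros(n) < linearMicros(n) ? "quadratic" : "linear";
            System.out.println("N = " + n + ": " + faster + " is faster");
        }
    }
}
```

The quadratic algorithm wins for every N below 7.2 × 10^10, so "better in the limit" and "better for the inputs you have" can be different answers.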

Rank orders of growth
http://bit.ly/201-f17-0927-1
n^4 grows faster than n^2: n^4 ∉ O(n^2)
0.001n^4 is in the same growth class as 1E6n^4: 0.001n^4, 1E6n^4 ∈ O(n^4)

Reasoning about growth
Consider a 3-tower.
How tall is a 5-tower? How tall is a 10-tower?
How many blocks in a 5-tower?
Which best captures the height of an n-tower?
http://bit.ly/201-f17-0927-2

Three-Sum
Given N integers, find triples that sum to 0. Deeply related to problems in computational geometry.
    public class ThreeSum {
        // Return the number of distinct triples (i, j, k) with i < j < k
        // such that a[i] + a[j] + a[k] == 0.
        public static int count(int[] a) {
            int N = a.length;
            int cnt = 0;
            for (int i = 0; i < N; i++)
                for (int j = i+1; j < N; j++)
                    for (int k = j+1; k < N; k++)
                        if (a[i] + a[j] + a[k] == 0) cnt++;
            return cnt;
        }
    }

Empirical Analysis
Empirical analysis. Run the program for various input sizes.
How much time for N = 4096 on my machine? How much could I do in a minute, hour, day?
    N       time (seconds) †
    512       0.03
    1024      0.26
    2048      2.16
    4096     17.18
    8192    136.76
† Running Linux on Sun-Fire-X4100 with 16GB RAM
Run ThreeSum.java in class from the command line and wait for it to finish, perhaps up to N = 4096. Ask students to make predictions.
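One way to make a prediction from timings like these is the doubling test: if T(N) is roughly a·N^b, then log2(T(2N)/T(N)) estimates b. The sketch below applies that to the five timings on the slide; the class name `DoublingRatio` is illustrative.

```java
class DoublingRatio {
    // For timings taken at N, 2N, 4N, ..., returns log2(T(2N) / T(N)) for
    // each consecutive pair, an estimate of b in T(N) ~ a * N^b.
    static double[] exponents(double[] times) {
        double[] result = new double[times.length - 1];
        for (int i = 1; i < times.length; i++) {
            result[i - 1] = Math.log(times[i] / times[i - 1]) / Math.log(2);
        }
        return result;
    }

    public static void main(String[] args) {
        // Timings from the slide for N = 512, 1024, 2048, 4096, 8192.
        double[] times = {0.03, 0.26, 2.16, 17.18, 136.76};
        for (double b : exponents(times)) {
            System.out.printf("estimated exponent: %.3f%n", b);
        }
    }
}
```

Each estimate comes out close to 3, which suggests that doubling N multiplies the running time by about 8.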

Empirical Analysis
Data analysis. Plot running time vs. input size N.
A log-log plot identifies power-law relationships.
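On a log-log plot a power law T(N) = a·N^b becomes a straight line with slope b. As a rough sketch of "fitting the curve", the code below computes the least-squares slope of log2(time) against log2(N); the name `LogLogFit` and the choice of ordinary least squares are assumptions for illustration.

```java
class LogLogFit {
    static double log2(double x) { return Math.log(x) / Math.log(2); }

    // Least-squares slope of log2(time) against log2(n).
    static double slope(double[] n, double[] time) {
        int m = n.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < m; i++) {
            meanX += log2(n[i]);
            meanY += log2(time[i]);
        }
        meanX /= m;
        meanY /= m;
        double sxy = 0, sxx = 0;
        for (int i = 0; i < m; i++) {
            double dx = log2(n[i]) - meanX;
            sxy += dx * (log2(time[i]) - meanY);
            sxx += dx * dx;
        }
        return sxy / sxx;
    }

    public static void main(String[] args) {
        // The ThreeSum timings (in seconds) from the Empirical Analysis slide.
        double[] n = {512, 1024, 2048, 4096, 8192};
        double[] time = {0.03, 0.26, 2.16, 17.18, 136.76};
        System.out.printf("fitted exponent: %.3f%n", slope(n, time));
    }
}
```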

Mathematical Analysis
Count up the frequency of execution of each instruction and weight by its execution time. How many times is each instruction executed?
One loop:
    int count = 0;
    for (int i = 0; i < N; i++)
        if (a[i] == 0) count++;
Two nested loops over all pairs:
    int count = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            if (a[i] + a[j] == 0) count++;
Two nested loops over distinct pairs (j starts at i+1):
    int count = 0;
    for (int i = 0; i < N; i++)
        for (int j = i+1; j < N; j++)
            if (a[i] + a[j] == 0) count++;
Goal: simplify terms without losing explanatory power.
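A quick way to check a frequency count is to instrument the loops and compare against a closed form. The sketch below counts how often the `if` test runs in each of the three loop shapes above; the class and method names are illustrative.

```java
class LoopCounts {
    // One loop: the test runs N times.
    static long single(int n) {
        long tests = 0;
        for (int i = 0; i < n; i++) tests++;
        return tests;
    }

    // Two nested loops over all pairs: the test runs N * N times.
    static long allPairs(int n) {
        long tests = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) tests++;
        return tests;
    }

    // j starts at i+1: the test runs N(N-1)/2 times.
    static long distinctPairs(int n) {
        long tests = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++) tests++;
        return tests;
    }

    public static void main(String[] args) {
        int n = 100;
        System.out.println(single(n) + " vs " + n);
        System.out.println(allPairs(n) + " vs " + (long) n * n);
        System.out.println(distinctPairs(n) + " vs " + (long) n * (n - 1) / 2);
    }
}
```

The last two counts differ by about a factor of two, yet both are quadratic; that constant factor is the kind of term that gets simplified away.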

Three Sum Analysis
Mathematical analysis. Focus on instructions in the "inner loop."
The running time is proportional to N^3.
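The N^3 claim can be made concrete by counting how often the innermost `if` in the triple loop of ThreeSum runs. The derivation below is standard combinatorics and is not taken from the slide itself: the loops visit each index triple i < j < k exactly once.

```latex
\sum_{i=0}^{N-1} \; \sum_{j=i+1}^{N-1} \; \sum_{k=j+1}^{N-1} 1
  = \binom{N}{3}
  = \frac{N(N-1)(N-2)}{6}
  = \frac{N^3}{6} - \frac{N^2}{2} + \frac{N}{3}
```

Keeping only the leading term gives N^3/6, so for N = 4096 the comparison runs about 1.1 × 10^10 times.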

Order of Growth Classifications
Observation. A small subset of mathematical functions suffices to describe the running time of many fundamental algorithms.
log2 N (example: binary search / 20 questions):
    while (N > 1) { N = N / 2; ... }
N:
    for (int i = 0; i < N; i++) ...
N log2 N (example: quicksort):
    public void g(int N) {
        if (N == 0) return;
        g(N/2);
        g(N/2);
        for (int i = 0; i < N; i++) ...
    }
N^2 (example: insertion sort / force calculation in N-body):
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) ...
2^N (example: towers of Hanoi):
    public void f(int N) {
        if (N == 0) return;
        f(N-1);
        f(N-1);
        ...
    }
Experiments: we don't have to launch rockets, crash cars, or kill rats; it is not difficult to perform and reproduce experiments.
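The classifications can be sanity-checked by counting steps directly. The sketch below assumes the recursive methods g and f each make two recursive calls per invocation, which is what the N log2 N and 2^N labels require; the class and method names are illustrative.

```java
class GrowthCounts {
    // Number of times N can be halved before reaching 1: about log2 N.
    static int halvings(int n) {
        int steps = 0;
        while (n > 1) {
            n = n / 2;
            steps++;
        }
        return steps;
    }

    // Loop iterations done by g(N) when g calls g(N/2) twice and then
    // loops N times: W(N) = 2 W(N/2) + N, with W(0) = 0.
    static long linearithmicWork(int n) {
        if (n == 0) return 0;
        return 2 * linearithmicWork(n / 2) + n;
    }

    // Invocations of f(N) when f calls f(N-1) twice:
    // C(N) = 1 + 2 C(N-1), with C(0) = 1, which is 2^(N+1) - 1.
    static long exponentialCalls(int n) {
        if (n == 0) return 1;
        return 1 + 2 * exponentialCalls(n - 1);
    }

    public static void main(String[] args) {
        System.out.println("halvings(1024) = " + halvings(1024));
        System.out.println("linearithmicWork(1024) = " + linearithmicWork(1024));
        System.out.println("exponentialCalls(10) = " + exponentialCalls(10));
    }
}
```

For N a power of two, 2^k, the recurrence for g works out to (k+1)·2^k, which is N log2 N plus a linear term.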

Big-Oh calculations from code
Search for an element in an array:
    for (int k = 0; k < a.length; k++)
        if (a[k].equals(target)) return true;
    return false;
What is the complexity of the code (using O-notation)?
If the array doubles in size, what happens to the time?
What is the complexity if we call it N times on an M-element vector?
What about the best case? Average case? Worst case?
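To explore the best-case and worst-case questions, the search loop can be wrapped in a method that counts calls to equals(). This is a sketch: the `LinearSearch` class and the static `comparisons` counter are illustrative additions, not part of the course code.

```java
class LinearSearch {
    // Number of equals() calls made by the most recent search.
    static int comparisons;

    static boolean contains(String[] a, String target) {
        comparisons = 0;
        for (int k = 0; k < a.length; k++) {
            comparisons++;
            if (a[k].equals(target)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        String[] a = {"a", "b", "c", "d"};
        contains(a, "a");   // best case: target is the first element
        System.out.println("best case comparisons: " + comparisons);
        contains(a, "z");   // worst case: target is absent
        System.out.println("worst case comparisons: " + comparisons);
    }
}
```

The worst case inspects all M elements, so calling the search N times costs at most N·M comparisons.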

IsomorphicWords
Consider code from the solution to IsomorphicWords:
    int total = 0;
    for (int j = 0; j < words.length; j++) {
        for (int k = j+1; k < words.length; k++) {
            if (isomorphic(words[j], words[k])) {
                total += 1;
            }
        }
    }
    return total;
What is the input size? What does the runtime depend on? What’s the big-Oh for the run-time?
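The slide does not show `isomorphic` itself, so the version below is a hypothetical stand-in written only to make the pair loop runnable: it treats two words as isomorphic when a one-to-one letter mapping turns one into the other. The class name `IsomorphicPairs` is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;

class IsomorphicPairs {
    // Hypothetical helper: true when a one-to-one mapping of letters turns
    // word a into word b. Takes time proportional to the word length.
    static boolean isomorphic(String a, String b) {
        if (a.length() != b.length()) return false;
        Map<Character, Character> forward = new HashMap<>();
        Map<Character, Character> backward = new HashMap<>();
        for (int i = 0; i < a.length(); i++) {
            char x = a.charAt(i), y = b.charAt(i);
            Character mappedTo = forward.putIfAbsent(x, y);
            Character mappedFrom = backward.putIfAbsent(y, x);
            if (mappedTo != null && mappedTo != y) return false;
            if (mappedFrom != null && mappedFrom != x) return false;
        }
        return true;
    }

    // Same pair loop as on the slide: n(n-1)/2 calls to isomorphic.
    static int countPairs(String[] words) {
        int total = 0;
        for (int j = 0; j < words.length; j++) {
            for (int k = j + 1; k < words.length; k++) {
                if (isomorphic(words[j], words[k])) total += 1;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countPairs(new String[] {"abca", "zbxz", "opqr"}));
    }
}
```

With this helper the runtime depends on both the number of words n and the word length L: about n(n-1)/2 calls, each doing work proportional to L.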

Array vs. ArrayList
Run the code ArrayVsArrayList.java:
https://coursework.cs.duke.edu/201fall17/classwork/blob/master/src/ArrayVsArrayList.java
Change the value of the argument.
Submit your data here (submit as many times as you want): http://bit.ly/201-f17-0927-3

Amortization: Expanding ArrayLists
Expand the capacity of the list when add() is called.
Calling add N times, doubling capacity as needed: what is the big-Oh of adding n elements?
What if we grow the size by one each time?
    Item #             Resizing cost   Cumulative cost   Resizing cost per item   Capacity after add
    1                  0               0                 0                        1
    2                  2               2                 1                        2
    3-4                4               6                 1.5                      4
    5-8                8               14                1.75                     8
    ...
    2^m+1 - 2^(m+1)    2^(m+1)         2^(m+2) - 2       around 2                 2^(m+1)
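The two growth policies can be compared with a small simulation that counts element copies. This is a simplified cost model, not how java.util.ArrayList is actually implemented: it charges one unit per existing element copied at each resize (the slide's table charges the new capacity instead, which changes the constants but not the growth rate). The names are illustrative.

```java
class GrowthCost {
    // Total elements copied while adding n items to an initially empty
    // list of capacity 1. When the list is full, capacity either doubles
    // or grows by one, and every existing element is copied once.
    static long copies(int n, boolean doubling) {
        long copied = 0;
        int capacity = 1;
        for (int size = 0; size < n; size++) {   // size before this add
            if (size == capacity) {
                copied += size;
                capacity = doubling ? capacity * 2 : capacity + 1;
            }
        }
        return copied;
    }

    public static void main(String[] args) {
        int n = 1024;
        System.out.println("doubling:    " + copies(n, true) + " copies");
        System.out.println("grow by one: " + copies(n, false) + " copies");
    }
}
```

Doubling keeps the total number of copies below 2n, so each add is O(1) amortized; growing by one copies about n^2/2 elements in total.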

Some helpful mathematics
1 + 2 + 3 + 4 + … + N = N(N+1)/2, exactly = N^2/2 + N/2, which is O(N^2). Why?
N + N + N + … + N (total of N times) = N*N = N^2, which is O(N^2)
N + N + N + … + N + … + N + … + N (total of 3N times) = 3N*N = 3N^2, which is O(N^2)
1 + 2 + 4 + … + 2^N = 2^(N+1) - 1 = 2 x 2^N - 1, which is O(2^N)
Impact of the last statement on adding 2^N + 1 elements to a vector:
1 + 2 + … + 2^N + 2^(N+1) = 2^(N+2) - 1 = 4 x 2^N - 1, which is O(2^N)
resizing + copy = total (let x = 2^N)
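The closed forms above are easy to spot-check by brute force. The sketch below sums the series directly and compares against the formulas; the class name `SumIdentities` is illustrative.

```java
class SumIdentities {
    // 1 + 2 + ... + n, computed term by term.
    static long arithmetic(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) sum += i;
        return sum;
    }

    // 1 + 2 + 4 + ... + 2^n, computed term by term (n at most 61).
    static long geometric(int n) {
        long sum = 0;
        for (int i = 0; i <= n; i++) sum += 1L << i;
        return sum;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 20; n++) {
            boolean a = arithmetic(n) == (long) n * (n + 1) / 2;
            boolean g = geometric(n) == (1L << (n + 1)) - 1;
            System.out.println("n = " + n + ": arithmetic " + a + ", geometric " + g);
        }
    }
}
```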

Running times @ 10^9 instructions/sec
(times in seconds unless noted)
    N               O(log N)     O(N)      O(N log N)   O(N^2)
    10              3E-9         1E-8      3.3E-8       0.0000001
    100             7E-9         1E-7      6.64E-7      0.0001
    1,000           1E-8         1E-6      0.00001      0.001
    10,000          1.3E-8       0.00001   0.0001329    0.102
    100,000         1.7E-8       0.0001    0.001661     10.008
    1,000,000       0.00000002   0.001     0.0199       16.7 min
    1,000,000,000   0.00000003   1.002     65.8         3.18 centuries
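Idealized running times for a table like this can be recomputed by dividing an instruction count by the machine speed. The sketch below assumes exactly one instruction per unit of the growth function and base-2 logarithms, so its output may differ from the figures on the slide; the names are illustrative.

```java
class RunningTimes {
    static final double INSTRUCTIONS_PER_SECOND = 1e9;

    static double log2(double n) { return Math.log(n) / Math.log(2); }

    // Seconds needed to execute the given number of instructions.
    static double seconds(double instructions) {
        return instructions / INSTRUCTIONS_PER_SECOND;
    }

    public static void main(String[] args) {
        double[] sizes = {10, 100, 1_000, 10_000, 100_000, 1_000_000, 1_000_000_000};
        System.out.println("N, log N, N, N log N, N^2 (seconds)");
        for (double n : sizes) {
            System.out.printf("%.0f, %.2e, %.2e, %.2e, %.2e%n",
                    n, seconds(log2(n)), seconds(n),
                    seconds(n * log2(n)), seconds(n * n));
        }
    }
}
```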

Analysis: Empirical vs. Mathematical
Empirical analysis.
Measure running times, plot, and fit curve.
Easy to perform experiments.
Model useful for predicting, but not for explaining.
Mathematical analysis.
Analyze algorithm to estimate # ops as a function of input size.
May require advanced mathematics.
Model useful for predicting and explaining.
Critical difference. Mathematical analysis is independent of a particular machine or compiler; applies to machines not yet built.