Published by Rebecca Brianna Watson. Modified over 6 years ago.
1
Compsci 201, Mathematical & Empirical Analysis
Owen Astrachan Jeff Forbes September 27, 2017 9/22/17 Compsci 201, Fall 2017, Analysis+Markov
2
I is for …
Invariant: reasoning about your code
Interface: MarkovModel implements MarkovInterface<String>
Inheritance: EfficientMarkov extends MarkovModel
Identity: You're a computer scientist.
9/27/17 Compsci 201, Fall 2017, Analysis
3
Plan for the Week
Empirical & mathematical analysis of algorithms
Big-Oh basics
Calculations from code
Code in
Towards Test #1
4
Computer Science Scientific Method
Observe some feature of the natural world.
Hypothesize a model that is consistent with the observations.
Predict events using the hypothesis.
Verify the predictions by making further observations.
Validate by repeating until the hypothesis and observations agree.
Principles: experiments we design must be reproducible; hypothesis must be falsifiable.
In CompSci 201: empirical & mathematical analysis
5
Scientific Method: Empirical & mathematical analysis
Analysis of algorithms. Framework for comparing algorithms and predicting performance.
Scientific method. Observe, hypothesize, predict, verify, and validate, as on the previous slide.
Principles. Experiments we design must be reproducible; hypothesis must be falsifiable.
6
Dropping Glass Balls
Tower with 100 floors
Given 2 glass balls
Want to determine the lowest floor from which a dropped ball will break
How? Is your algorithm the most efficient one?
Generalize to n floors
7
Glass balls continued http://bit.ly/CS201-f17-0927-0
Assume the number of floors is 100
In the best case, how many balls will I have to drop to determine the lowest floor where a ball will break?
In the worst case, how many balls will I have to drop?
If there are n floors, how many balls will you have to drop? (roughly)
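One way to explore these questions is to simulate a candidate strategy and count drops. The sketch below assumes a fixed-step strategy, which is not necessarily the most efficient one: drop a ball every `step` floors until it breaks, then walk up one floor at a time from the last safe floor. The class and method names are hypothetical, and the code assumes the ball always breaks at some floor between 1 and n.

```java
class GlassBalls {
    // Drops used by the fixed-step strategy when the lowest breaking
    // floor is breakFloor (assumed to be in 1..n).
    static int drops(int n, int step, int breakFloor) {
        int drops = 0;
        int lastSafe = 0;      // highest floor known not to break a ball
        int floor = step;
        // Phase 1: jump `step` floors at a time until a ball breaks.
        while (floor <= n) {
            drops++;
            if (floor >= breakFloor) break;   // ball broke at `floor`
            lastSafe = floor;
            floor += step;
        }
        // Phase 2: test each untested floor above lastSafe, one at a time.
        int upper = Math.min(floor, n + 1);   // first floor that needs no test
        for (int f = lastSafe + 1; f < upper; f++) {
            drops++;
            if (f >= breakFloor) break;
        }
        return drops;
    }

    // Worst case over every possible breaking floor.
    static int worstCase(int n, int step) {
        int worst = 0;
        for (int b = 1; b <= n; b++) {
            worst = Math.max(worst, drops(n, step, b));
        }
        return worst;
    }

    public static void main(String[] args) {
        for (int step : new int[] {1, 5, 10, 20, 50}) {
            System.out.println("step " + step + ": worst case "
                    + worstCase(100, step) + " drops");
        }
    }
}
```

For 100 floors, a step of 10 needs 19 drops in the worst case under these assumptions, compared with 100 drops for a step of 1.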
8
What is big-Oh about? (preview)
Intuition: avoid details when they don't matter, and they don't matter when input size (N) is big enough
For polynomials, use only the leading term, ignore coefficients
y = 3x, y = 6x, y = 15x + 44
y = x^2, y = x^2 - 6x + 9, y = 3x^2 + 4x
The first family is O(n), the second is O(n^2)
Intuition: family of curves, generally the same shape
More formally: O(f(n)) is an upper bound; when n is large enough, the expression c·f(n) is larger
Intuition: linear function: double input, double time; quadratic function: double input, quadruple the time
9
More on O-notation, big-Oh
Big-Oh hides/obscures some empirical analysis, but is good for a general description of an algorithm
Allows us to compare algorithms in the limit
20N hours vs N^2 microseconds: which is better?
O-notation is an upper bound: N is O(N), but it is also O(N^2); we try to provide tight bounds.
Formally: g(N) ∈ O(f(N)) iff there exist constants c and n0 such that g(N) < c·f(N) for all N > n0
[Figure: curves c·f(N) and g(N), with the crossover marked at x = n0]
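As a worked instance of the formal definition, take g(N) = 15N + 44 (one of the linear examples from the preview slide) and f(N) = N. The constants below are one valid choice, not the only one.

```latex
\[
15N + 44 \;<\; 15N + N \;=\; 16N \qquad \text{for all } N > 44,
\]
\[
\text{so } 15N + 44 \in O(N) \text{ with } c = 16 \text{ and } n_0 = 44 .
\]
```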
10
Rank orders of growth http://bit.ly/201-f17-0927-1
n^4 grows faster than n^2: n^4 ∉ O(n^2)
0.001n^4 is in the same growth class as 1E6n^4: 0.001n^4, 1E6n^4 ∈ O(n^4)
11
Reasoning about growth
Consider a 3-tower
How tall is a 5-tower? How tall is a 10-tower?
How many blocks in a 5-tower?
Which best captures the height of an n-tower?
12
Three-Sum
Given N integers, find triples that sum to 0.
Deeply related to problems in computational geometry.

    public class ThreeSum {
        // return number of distinct triples (i, j, k)
        // such that (a[i] + a[j] + a[k] == 0)
        public static int count(int[] a) {
            int N = a.length;
            int cnt = 0;
            for (int i = 0; i < N; i++)
                for (int j = i+1; j < N; j++)
                    for (int k = j+1; k < N; k++)
                        if (a[i] + a[j] + a[k] == 0) cnt++;
            return cnt;
        }
    }
13
Empirical Analysis
Empirical analysis. Run the program for various input sizes.
How much time for N = 4096 on my machine? How much could I do in a minute, hour, day?

N | time †
512 | 0.03
1024 | 0.26
2048 | 2.16
4096 | 17.18
8192 | 136.76

† Running Linux on Sun-Fire-X4100 with 16GB RAM
Run ThreeSum.java in class from the command line and wait for it to finish, perhaps up to N = 4096. Ask students to make predictions.
14
Empirical Analysis Data analysis. Plot running time vs. input size N.
log-log plot identifies power law relationships
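If running time follows a power law T(N) = a·N^b, the points fall on a straight line of slope b in a log-log plot. The sketch below estimates that slope from consecutive pairs of measurements in the ThreeSum timing table; the class and method names are hypothetical.

```java
class PowerLawFit {
    // Slope of the line through two points on a log-log plot,
    // i.e. the exponent b if T(N) = a * N^b holds exactly.
    static double slope(double n1, double t1, double n2, double t2) {
        return (Math.log(t2) - Math.log(t1)) / (Math.log(n2) - Math.log(n1));
    }

    public static void main(String[] args) {
        // Measurements copied from the ThreeSum timing table.
        int[] n = {512, 1024, 2048, 4096, 8192};
        double[] t = {0.03, 0.26, 2.16, 17.18, 136.76};
        for (int i = 1; i < n.length; i++) {
            System.out.printf("N = %d -> %d: slope = %.2f%n",
                    n[i - 1], n[i], slope(n[i - 1], t[i - 1], n[i], t[i]));
        }
    }
}
```

For the larger inputs the estimated slope comes out close to 3, which is consistent with a running time proportional to N^3.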
15
Mathematical Analysis
Count up frequency of execution of each instruction and weight by its execution time.
How many times is each instruction executed?

    int count = 0;
    for (int i = 0; i < N; i++)
        if (a[i] == 0) count++;

    int count = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            if (a[i] + a[j] == 0) count++;

    int count = 0;
    for (int i = 0; i < N; i++)
        for (int j = i+1; j < N; j++)
            if (a[i] + a[j] == 0) count++;

Goal: simplify terms without losing explanatory power.
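The frequency of the innermost statement in each of the three loops can be checked by counting directly. This is a small sketch with hypothetical names; it counts how often the `if` test runs rather than how often `count++` runs, so it does not depend on the array contents.

```java
class LoopCounts {
    // Single loop: the test runs N times.
    static long single(int n) {
        long tests = 0;
        for (int i = 0; i < n; i++) tests++;
        return tests;
    }

    // Full double loop: the test runs N * N times.
    static long full(int n) {
        long tests = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) tests++;
        return tests;
    }

    // Triangular double loop (j starts at i+1): N(N-1)/2 times.
    static long triangular(int n) {
        long tests = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++) tests++;
        return tests;
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println(single(n) + " " + full(n) + " " + triangular(n));
    }
}
```

The triangular loop does roughly half the work of the full loop, yet both are O(N^2): dropping the constant factor keeps the explanatory part of the count.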
16
Three Sum Analysis
Mathematical analysis. The running time is proportional to N^3. Focus on instructions in the "inner loop."
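The N^3 claim follows from counting how often the innermost `if` in ThreeSum executes: once for every choice of indices i < j < k. A short derivation:

```latex
\[
\binom{N}{3} \;=\; \frac{N(N-1)(N-2)}{6}
\;=\; \frac{N^3}{6} - \frac{N^2}{2} + \frac{N}{3}
\;\sim\; \frac{N^3}{6},
\]
\[
\text{so the inner-loop count, and hence the running time, is } O(N^3).
\]
```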
17
Order of Growth Classifications
Observation. A small subset of mathematical functions suffices to describe the running time of many fundamental algorithms.

log2 N:
    while (N > 1) { N = N / 2; ... }

N:
    for (int i = 0; i < N; i++) ...

N log2 N:
    public void g(int N) {
        if (N == 0) return;
        g(N/2);
        g(N/2);
        for (int i = 0; i < N; i++) ...
    }

N^2:
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) ...

2^N:
    public void f(int N) {
        if (N == 0) return;
        f(N-1);
        f(N-1);
        ...
    }

Examples:
2^n: towers of Hanoi
n^2: insertion sort / force calculation in N-body
n log n: quicksort
log n: binary search / 20 questions

Experiments: don't have to launch rockets, crash cars, or kill rats; not difficult to perform and reproduce experiments
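The counts behind these classifications can be checked directly. The sketch below uses hypothetical method names and assumes that g and f each make two recursive calls, since that is what the N log2 N and 2^N classifications require.

```java
class GrowthCounts {
    // Iterations of: while (N > 1) N = N / 2;   -> about log2 N
    static int halvingSteps(int n) {
        int steps = 0;
        while (n > 1) {
            n = n / 2;
            steps++;
        }
        return steps;
    }

    // Loop iterations performed by g(N), assuming two recursive calls
    // on N/2 followed by a loop of length N   -> about N log2 N
    static long gLoopSteps(int n) {
        if (n == 0) return 0;
        return gLoopSteps(n / 2) + gLoopSteps(n / 2) + n;
    }

    // Total calls made by f(N), assuming two recursive calls on N-1
    // -> 2^(N+1) - 1, which is O(2^N)
    static long fCalls(int n) {
        if (n == 0) return 1;
        return 1 + fCalls(n - 1) + fCalls(n - 1);
    }

    public static void main(String[] args) {
        System.out.println("halving 1024: " + halvingSteps(1024));
        System.out.println("g loop steps for 1024: " + gLoopSteps(1024));
        System.out.println("f calls for 20: " + fCalls(20));
    }
}
```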
18
Big-Oh calculations from code
Search for element in an array:
What is complexity of code (using O-notation)?
What if array doubles, what happens to time?

    for (int k = 0; k < a.length; k++)
        if (a[k].equals(target)) return true;
    return false;

Complexity if we call N times on M-element vector?
What about best case? Average case? Worst case?
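To make the best and worst cases concrete, the sketch below instruments the same loop to report how many `equals` calls it makes. The class and method names are hypothetical, and the array is assumed to hold non-null strings.

```java
class LinearSearchCount {
    // Number of equals() calls the linear search makes before it
    // returns, whether or not the target is found.
    static int comparisons(String[] a, String target) {
        int calls = 0;
        for (int k = 0; k < a.length; k++) {
            calls++;
            if (a[k].equals(target)) return calls;
        }
        return calls;
    }

    public static void main(String[] args) {
        String[] a = {"a", "b", "c", "d"};
        System.out.println("first element: " + comparisons(a, "a"));
        System.out.println("last element:  " + comparisons(a, "d"));
        System.out.println("missing:       " + comparisons(a, "z"));
    }
}
```

The best case is a single comparison and the worst case is one comparison per element, so one search is O(M) on an M-element array, and N worst-case searches cost on the order of N·M comparisons.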
19
IsomorphicWords
Consider code from the solution to IsomorphicWords:
What is the input size?
What does the runtime depend on?
What's the big-Oh for the run-time?

    int total = 0;
    for (int j = 0; j < words.length; j++) {
        for (int k = j+1; k < words.length; k++) {
            if (isomorphic(words[j], words[k])) {
                total += 1;
            }
        }
    }
    return total;
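The slide does not show `isomorphic`, so the version below is a hypothetical stand-in, assuming two words are isomorphic when the letters of one can be mapped one-to-one onto the letters of the other. With a helper like this, each call does work proportional to the word length L, so the pair-counting loop does on the order of W^2 · L work for W words.

```java
import java.util.HashMap;
import java.util.Map;

class IsomorphicCount {
    // Hypothetical helper (not from the slides): true if a one-to-one
    // letter mapping turns a into b.
    static boolean isomorphic(String a, String b) {
        if (a.length() != b.length()) return false;
        Map<Character, Character> forward = new HashMap<>();
        Map<Character, Character> backward = new HashMap<>();
        for (int i = 0; i < a.length(); i++) {
            char x = a.charAt(i);
            char y = b.charAt(i);
            // putIfAbsent returns the earlier mapping, or null if none.
            Character prevY = forward.putIfAbsent(x, y);
            Character prevX = backward.putIfAbsent(y, x);
            if (prevY != null && prevY != y) return false;
            if (prevX != null && prevX != x) return false;
        }
        return true;
    }

    // Same pair-counting loop as on the slide.
    static int countPairs(String[] words) {
        int total = 0;
        for (int j = 0; j < words.length; j++) {
            for (int k = j + 1; k < words.length; k++) {
                if (isomorphic(words[j], words[k])) {
                    total += 1;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        String[] words = {"abca", "zbxz", "opqr"};
        System.out.println(countPairs(words));
    }
}
```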
20
Array vs. ArrayList
Run the code ArrayVsArrayList.java
Change the value of argument
Submit your data here:
Submit as many times as you want
21
Amortization: Expanding ArrayLists
Expand capacity of list when add() called
Calling add N times, doubling capacity as needed
Big-Oh of adding n elements?
What if we grow size by one each time?

Item # | Resizing cost | Cumulative cost | Resizing cost per item | Capacity after add
1 | 0 | 0 | 0 | 1
2 | 2 | 2 | 1 | 2
3-4 | 4 | 6 | 1.5 | 4
5-8 | 8 | 14 | 1.75 | 8
...
2^m+1 to 2^(m+1) | 2^(m+1) | 2^(m+2)-2 | around 2 | 2^(m+1)
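The two growth policies can be compared by simulating the resizes. The sketch below is not how `java.util.ArrayList` is implemented; it is a hypothetical model that starts at capacity 1 and charges each resize its new capacity, which is the convention the slide's table appears to use.

```java
class ResizeCost {
    // Cumulative resizing cost of n calls to add(), starting from
    // capacity 1. Each resize is charged its new capacity.
    static long cumulativeCost(int n, boolean doubling) {
        long cost = 0;
        int capacity = 1;
        int size = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                // Full: grow by doubling, or by a single slot.
                capacity = doubling ? capacity * 2 : capacity + 1;
                cost += capacity;
            }
            size++;
        }
        return cost;
    }

    public static void main(String[] args) {
        for (int n : new int[] {8, 1024, 16384}) {
            System.out.println("n = " + n
                    + ": doubling " + cumulativeCost(n, true)
                    + ", grow by one " + cumulativeCost(n, false));
        }
    }
}
```

Under this model, doubling keeps the cumulative cost below 2n, so adding n elements is O(n) overall, while growing by one gives a cumulative cost of about n^2/2, which is O(n^2).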
22
Some helpful mathematics
1 + 2 + 3 + … + N = N(N+1)/2, exactly = N^2/2 + N/2, which is O(N^2). Why?
N + N + N + … + N (total of N times) = N*N = N^2, which is O(N^2)
N + N + N + … + N + … + N + … + N (total of 3N times) = 3N*N = 3N^2, which is O(N^2)
1 + 2 + 4 + … + 2^N = 2^(N+1) - 1 = 2 x 2^N - 1, which is O(2^N)
Impact of last statement on adding 2^N + 1 elements to a vector:
1 + 2 + 4 + … + 2^N + 2^(N+1) = 2^(N+2) - 1 = 4 x 2^N - 1, which is O(2^N)
resizing + copy = total (let x = 2^N)
23
Running times @ 10^9 instructions/sec
N | O(log N) | O(N) | O(N log N) | O(N^2)
10 | 3E-9 | 1E-8 | 3.3E-8 |
100 | 7E-9 | 1E-7 | 6.64E-7 | 0.0001
1,000 | | 1E-6 | | 0.001
10,000 | 1.3E-8 | | | 0.102
100,000 | 1.7E-8 | | | 10.008
1,000,000 | | | 0.0199 | 16.7 min
1,000,000,000 | | 1.002 | 65.8 | 3.18 centuries
24
Analysis: Empirical vs. Mathematical
Empirical analysis.
Measure running times, plot, and fit curve.
Easy to perform experiments.
Model useful for predicting, but not for explaining.
Mathematical analysis.
Analyze algorithm to estimate # ops as a function of input size.
May require advanced mathematics.
Model useful for predicting and explaining.
Critical difference. Mathematical analysis is independent of a particular machine or compiler; applies to machines not yet built.