Week 12: Running Time and Performance


1 Week 12: Running Time and Performance

2  Most of the programs you have written in this class run in a few seconds or less
 Some kinds of programs can take much longer:
 Chess algorithms (Deep Blue)
 Routing and scheduling algorithms (Traveling Salesperson, Purdue classroom scheduling)
 Medical imaging (massive amounts of data)
 Graphics processing (Pixar: Toy Story, The Incredibles)

3  You can time the program
 What if it takes 1,000 years to run?
 You can look at the program
 What are you looking for?
 Does the input matter?
 Sorting 100,000 numbers must take longer than sorting 10 numbers, right?

4  A sequence of steps used to solve a problem
 In CS, we take an algorithm and convert it into code -- Java in this class
 The study of algorithms is intended to:
 Make sure that the code you write gets the right answer
 Make sure that the code runs as quickly as possible
 Make sure that the code uses a reasonable amount of memory

5  You want to find the definition of the word “rheostat” in the dictionary
 The Oxford English Dictionary has 301,100 main entries
 What if you search for the word “rheostat” by starting at the beginning and checking each word to see if it is the one you are interested in?
 Given a random word, you’ll have to check about 150,550 entries before you find it
 Alternatively, you could jump to the correct letter of the alphabet, saving tons of time
 Searching for “zither” takes forever otherwise
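The jump-ahead idea, taken to its limit on a sorted list, is binary search: halve the remaining range with every comparison. A minimal sketch (the four-word list is a made-up stand-in for a dictionary, not the OED):

```java
public class DictionarySearch {
    // Returns the index of word in a sorted array, or -1 if it is absent.
    // Each comparison halves the remaining range, so it takes O(log n) comparisons.
    static int binarySearch(String[] sorted, String word) {
        int lo = 0, hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) / 2;
            int cmp = word.compareTo(sorted[mid]);
            if (cmp == 0) return mid;       // found it
            else if (cmp < 0) hi = mid - 1; // word is earlier in the alphabet
            else lo = mid + 1;              // word is later in the alphabet
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] words = { "apple", "banana", "rheostat", "zither" };
        System.out.println(binarySearch(words, "rheostat")); // prints 2
    }
}
```

With 301,100 entries, binary search needs at most about 19 comparisons instead of roughly 150,550 checks.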

6  Write a program to sum up the numbers between 1 and n
 Easy, right?
 A faster way is to use the summation equation: 1 + 2 + ⋯ + n = n(n + 1)/2
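The two approaches can be compared directly: the loop does n additions, while the closed form does a constant amount of arithmetic. A sketch:

```java
public class SumToN {
    // O(n): add each value from 1 to n in turn.
    static long sumLoop(long n) {
        long sum = 0;
        for (long i = 1; i <= n; i++) sum += i;
        return sum;
    }

    // O(1): the summation equation n(n + 1)/2.
    static long sumFormula(long n) {
        return n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        System.out.println(sumLoop(100));    // prints 5050
        System.out.println(sumFormula(100)); // prints 5050
    }
}
```

Both return the same answer, but for n in the billions the loop takes noticeable time while the formula is instantaneous.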

7  Memory usage is a problem too
 If you run out of memory, your program can crash
 Memory usage can have serious performance consequences too

8  Remember, there are multiple levels of memory on a computer
 Each next level is on the order of 100 times larger and 100 times slower:
 Cache: actually on the CPU; fast and expensive
 RAM: primary memory for a desktop computer; pretty fast and relatively expensive
 Hard drive: secondary memory for a desktop computer; slow and cheap

9  If you can do a lot of number crunching without leaving cache, that will be very fast
 If you have to fetch data from RAM, that will slow things down
 If you have to read and write data to the hard drive (unavoidable with large pieces of data like digital movies), you will slow things down a lot

10  Memory can be easier to estimate than running time
 Depending on your input, you will allocate a certain number of objects, arrays, and primitive data types
 It is possible to count the storage for each item allocated
 A reference to an object or an array typically costs an additional 4 bytes on top of the size of the object
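Counting storage can be sketched as simple arithmetic. The constants below (a 16-byte array header, 4 bytes per int) are illustrative assumptions; real values vary by JVM and platform:

```java
public class MemoryEstimate {
    // Rough estimate of the bytes used by an int[n].
    // HEADER and INT_SIZE are assumed values for illustration, not guarantees:
    // actual object header size depends on the JVM and its settings.
    static long intArrayBytes(int n) {
        final long HEADER = 16;  // assumed array header + length field
        final long INT_SIZE = 4; // a Java int is 4 bytes
        return HEADER + INT_SIZE * n;
    }

    public static void main(String[] args) {
        System.out.println(intArrayBytes(1000)); // prints 4016 under these assumptions
    }
}
```

The point is the method, not the constants: storage grows linearly in the number of elements, plus a fixed per-object overhead.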

11  A graph is a set of nodes and edges
 It’s a really simple way to represent almost any kind of relationship
 The numbers on the edges could be distances, costs, times, whatever
[Figure: an example graph with nodes A–E and weighted edges]
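A weighted graph like the one in the figure can be represented with an adjacency map from each node to its neighbors and edge weights. A minimal sketch (the node names and weights below are made up for illustration):

```java
import java.util.*;

public class WeightedGraph {
    // node -> (neighbor -> edge weight)
    private final Map<String, Map<String, Integer>> adj = new HashMap<>();

    // Adds an undirected weighted edge between a and b.
    void addEdge(String a, String b, int weight) {
        adj.computeIfAbsent(a, k -> new HashMap<>()).put(b, weight);
        adj.computeIfAbsent(b, k -> new HashMap<>()).put(a, weight);
    }

    // Returns the weight of edge a-b, or -1 if there is no such edge.
    int weight(String a, String b) {
        return adj.getOrDefault(a, Collections.emptyMap()).getOrDefault(b, -1);
    }

    public static void main(String[] args) {
        WeightedGraph g = new WeightedGraph();
        g.addEdge("A", "B", 5);
        g.addEdge("B", "C", 3);
        System.out.println(g.weight("A", "B")); // prints 5
    }
}
```

An adjacency matrix (a 2-D array of weights) is the other common choice; it uses more memory for sparse graphs but answers weight lookups in O(1).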

12  There’s a classic CS problem called the Traveling Salesperson Problem (TSP)
 Given a graph, find the shortest tour that visits every node and comes back to the starting point
 Like a lazy traveling salesperson who wants to visit all possible clients and then return home as soon as possible

13  Strategies:
 Always visit the nearest neighbor next
 Randomized approaches
 Try every possible combination
 How can you tell what’s the best solution?

14  We are tempted to always take the closest neighbor next, but there are pitfalls: grabbing a short edge now can force the tour onto a very long edge later
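The nearest-neighbor strategy can be sketched on a distance matrix. This is a greedy heuristic, not an exact solver: it runs in O(n²) but is not guaranteed to find the shortest tour. The 3-city matrix in main is a made-up example:

```java
public class NearestNeighbor {
    // Greedy nearest-neighbor tour over a symmetric distance matrix.
    // Returns the visiting order starting from `start`.
    static int[] tour(int[][] dist, int start) {
        int n = dist.length;
        boolean[] visited = new boolean[n];
        int[] order = new int[n];
        order[0] = start;
        visited[start] = true;
        for (int step = 1; step < n; step++) {
            int here = order[step - 1], best = -1;
            // Pick the closest not-yet-visited city.
            for (int next = 0; next < n; next++)
                if (!visited[next] && (best == -1 || dist[here][next] < dist[here][best]))
                    best = next;
            order[step] = best;
            visited[best] = true;
        }
        return order;
    }

    public static void main(String[] args) {
        int[][] dist = {
            { 0, 1, 4 },
            { 1, 0, 2 },
            { 4, 2, 0 }
        };
        System.out.println(java.util.Arrays.toString(tour(dist, 0))); // prints [0, 1, 2]
    }
}
```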

15  In a completely connected graph, we can try any sequence of nodes
 If there are n nodes, there are (n – 1)! tours
 For 30 cities, 29! = 8,841,761,993,739,701,954,543,616,000,000
 If we can check 1,000,000,000 tours in one second, it will only take about 20,000 times the age of the universe to check them all
 We will (eventually) get the best answer!
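The tour count (n – 1)! overflows even a long well before n = 30, so computing it honestly needs arbitrary-precision arithmetic. A sketch using BigInteger:

```java
import java.math.BigInteger;

public class TourCount {
    // Number of tours in a completely connected graph with n nodes: (n - 1)!
    static BigInteger tours(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n - 1; i++)
            result = result.multiply(BigInteger.valueOf(i));
        return result;
    }

    public static void main(String[] args) {
        // prints 8841761993739701954543616000000, i.e. 29!
        System.out.println(tours(30));
    }
}
```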

16  This and similar problems are useful to solve for people who need to plan paths:
 UPS
 Military commanders
 Pizza delivery people
 Internet routing algorithms
 No one knows a solution that always works in a reasonable amount of time
 We have been working on it for decades

17  The problem is so hard that you can win money if you solve it
 The Clay Mathematics Institute has offered a $1,000,000 prize
 You can do one of two things to collect it:
 Find an efficient solution to the general problem
 Prove that no efficient solution can exist

18  How do we measure algorithm running time and memory usage in general?
 We want to take into account the size of the problem (TSP with 4 cities is certainly easier than TSP with 4,000 cities)
 Some algorithms might work well in certain cases and badly in others
 How do we come up with a measuring scheme?

19  Ideally, we want to classify both problems and algorithms
 We want to figure out which problems take a really long time to solve
 We want to figure out how we can compare one algorithm to another
 We want to prove that some algorithm is the best possible algorithm to solve a particular problem

20  Computer scientists use Big-Oh notation to give an indication of how much time (or memory) a program will take
 Big-Oh notation is:
 Asymptotic: focuses on the behavior as the size of the input gets large
 Worst-case: focuses on the absolute longest the program can take

21  Add up the operations done by the following code:

int sum = 0;
for( int i = 0; i < n; i++ )
    sum += i;
System.out.println("Sum: " + sum);

 Initialization: 1 operation
 Loop: 1 initialization + n checks + n increments + n additions to sum = 3n + 1
 Output: 1 operation
 Total: 3n + 3

22  We could express the time taken by the code on the previous slide as a function of n: f(n) = 3n + 3
 This approach has a number of problems:
 We assumed that each line takes 1 time unit to accomplish, but the output line takes much longer than an integer operation
 This program is 4 lines long; a longer program is going to have a very messy running time function
 We can get nitpicky about details: does sum += i; take one operation or two if we count the addition and the store separately?

23  In short, this way of getting a running time function is almost useless because:
 It cannot be used to give us an idea of how long the program really runs in seconds
 It is complex and unwieldy
 The most important thing the analysis taught us is that the growth of the function is linear (3n + 3)
 A general description of how the running time grows as the input grows would be useful

24  Enter Big-Oh notation
 Big-Oh simplifies a complicated running time function into a simple statement about its worst-case growth rate
 All constant coefficients are ignored
 All low-order terms are ignored
 3n + 3 is O(n)
 Big-Oh is a statement that a particular running time is no worse than some function, with appropriate constants

25  147n³ + 2n² + 5n + 12083 is O(n³)
 15n² + 6n + 7 log n + 145 is O(n²)
 log n + 87829 is O(log n)
 Note: in CS, we use log base 2 unless stated otherwise

26  How long does it take to do multiplication by hand?

    123
  x 456
  -----
    738
   615
  492
  -----
  56088

 Let's assume that the length of the numbers is n digits
 (n multiplications + n carries) × n digits + (n + 1 digits) × n additions = 3n² + n
 Running time: O(n²)
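The hand method can be sketched directly in code. Digits are stored most significant first, each of the n × n digit products is computed once, and carries are propagated as in the paper layout, so the work is O(n²):

```java
public class LongMultiplication {
    // Grade-school multiplication of two numbers given as digit arrays
    // (most significant digit first). For n-digit inputs this does
    // n * n single-digit multiplications: O(n^2).
    static int[] multiply(int[] a, int[] b) {
        int[] result = new int[a.length + b.length];
        for (int i = a.length - 1; i >= 0; i--)
            for (int j = b.length - 1; j >= 0; j--) {
                int pos = i + j + 1;                  // column this product lands in
                int prod = a[i] * b[j] + result[pos]; // add to what is already there
                result[pos] = prod % 10;              // keep one digit
                result[pos - 1] += prod / 10;         // carry the rest left
            }
        return result;
    }

    public static void main(String[] args) {
        int[] product = multiply(new int[]{1, 2, 3}, new int[]{4, 5, 6});
        // prints [0, 5, 6, 0, 8, 8], i.e. 123 x 456 = 56088
        System.out.println(java.util.Arrays.toString(product));
    }
}
```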

27  How do we find the largest element in an array?

int largest = array[0];
for( int i = 1; i < array.length; i++ )
    if( array[i] > largest )
        largest = array[i];
System.out.println("Largest: " + largest);

 Running time: O(n), if n is the length of the array
 What if the array is sorted in ascending order? Then the largest element is simply the last one:

System.out.println("Largest: " + array[array.length - 1]);

 Running time: O(1)

28  Given two n × n matrices a and b, the code to multiply them is:

double[][] c = new double[n][n];
for( int i = 0; i < n; i++ )
    for( int j = 0; j < n; j++ )
        for( int k = 0; k < n; k++ )
            c[i][j] += a[i][k] * b[k][j];

 Running time: O(n³)
 Is there a faster way to multiply matrices?
 Yes (Strassen's algorithm and its successors), but it's complicated and has other problems

29  Here is some code that sorts an array in ascending order (bubble sort)
 What is its running time?

for( int i = 0; i < array.length; i++ )
    for( int j = 0; j < array.length - 1; j++ )
        if( array[j] > array[j + 1] ) {
            int temp = array[j];
            array[j] = array[j + 1];
            array[j + 1] = temp;
        }

 Running time: O(n²)

30  There is nothing better than constant time (O(1))
 Logarithmic time (O(log n)) means that the problem can become much larger and only take a little longer
 Linear time (O(n)) means that the time grows in proportion to the problem size
 Quadratic time (O(n²)) means that expanding the problem size significantly could make it impractical
 Cubic time (O(n³)) is about the reasonable maximum if we expect the problem to grow
 Exponential and factorial time mean that we cannot solve anything but the most trivial problem instances

31  Looking at the code and determining the order of complexity is one technique
 Sometimes this only captures worst-case behavior
 The code can be complicated and difficult to understand
 Another possibility is running experiments to see how the running time changes as the problem size increases

32  The doubling hypothesis states that it is often possible to determine the order of the running time of a program by progressively doubling the input size and measuring the changes in time
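A doubling experiment can be sketched as a small timing harness. The task timed here (summing an array) and the sizes are illustrative choices; real measurements vary with the machine, and JIT warm-up means the first ratios are often noisy, so repeated trials are advisable:

```java
import java.util.Arrays;

public class DoublingTest {
    // Time one run of an O(n) task (summing an int array of size n), in ms.
    static double timeRun(int n) {
        int[] data = new int[n];
        Arrays.fill(data, 1);
        long start = System.nanoTime();
        long sum = 0;
        for (int value : data) sum += value;
        long elapsed = System.nanoTime() - start;
        if (sum != n) throw new AssertionError(); // use sum so the loop is not optimized away
        return elapsed / 1e6;
    }

    public static void main(String[] args) {
        double previous = -1;
        // Double the input size and print the ratio of consecutive times.
        for (int n = 1 << 20; n <= 1 << 23; n *= 2) {
            double ms = timeRun(n);
            String ratio = previous < 0 ? "-" : String.format("%.2f", ms / previous);
            System.out.printf("%8d %10.4f ms %s%n", n, ms, ratio);
            previous = ms;
        }
    }
}
```

For an O(n) task the ratio should hover around 2; for O(n²) around 4; for O(n³) around 8, matching the tables on the next slides.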

33  Let’s use the doubling hypothesis to test the running time of the program that finds the largest element in an unsorted array

 Size   Time          Ratio
 1024   0.035829 ms   -
 2048   0.071727 ms   2.00
 4096   0.188013 ms   2.62
 8192   0.278877 ms   1.48

 Looks like a decent indication of O(n)

34  Let’s check matrix multiplication

 Size   Time              Ratio
 128    39.786341 ms      -
 256    229.902823 ms     5.78
 512    2945.396691 ms    12.81
 1024   25957.367417 ms   8.81

 A factor of 8 is expected for O(n³)
 Not too bad

35  Let’s try bubble sort

 Size   Time            Ratio
 1024   8.648306 ms     -
 2048   28.331604 ms    3.28
 4096   108.931582 ms   3.84
 8192   404.451683 ms   3.71

 Success! O(n²) is supposed to give a ratio of 4
 We should have done more trials to do this scientifically

