Performance Measurement

Slides:



Advertisements
Similar presentations
CHAPTER 2 ALGORITHM ANALYSIS 【 Definition 】 An algorithm is a finite set of instructions that, if followed, accomplishes a particular task. In addition,
Advertisements

1 Pipelining Part 2 CS Data Hazards Data hazards occur when the pipeline changes the order of read/write accesses to operands that differs from.
Lecture3: Algorithm Analysis Bohyung Han CSE, POSTECH CSED233: Data Structures (2014F)
Fundamentals of Python: From First Programs Through Data Structures
Introduction to Analysis of Algorithms
Performance Measurement Performance Analysis Paper and pencil. Don’t need a working computer program or even a computer.
Insertion Sort for (i = 1; i < n; i++) {/* insert a[i] into a[0:i-1] */ int t = a[i]; int j; for (j = i - 1; j >= 0 && t < a[j]; j--) a[j + 1] = a[j];
Time Complexity s Sorting –Insertion sorting s Time complexity.
1 Performance Measurement CSE, POSTECH 2 2 Program Performance Recall that the program performance is the amount of computer memory and time needed to.
Insertion Sort for (int i = 1; i < a.length; i++) {// insert a[i] into a[0:i-1] int t = a[i]; int j; for (j = i - 1; j >= 0 && t < a[j]; j--) a[j + 1]
Asymptotic Notation (O, Ω, )
Performance Measurement Performance Analysis Paper and pencil. Don’t need a working computer program or even a computer.
SNU IDB Lab. Ch4. Performance Measurement © copyright 2006 SNU IDB Lab.
Processor Structure and Function Chapter8:. CPU Structure  CPU must:  Fetch instructions –Read instruction from memory  Interpret instructions –Instruction.
ANALYSING COSTS COMP 103. RECAP  ArrayList: add(), ensuring capacity, iterator for ArrayList TODAY  Analysing Costs 2 RECAP-TODAY.
COSC 1P03 Data Structures and Abstraction 2.1 Analysis of Algorithms Only Adam had no mother-in-law. That's how we know he lived in paradise.
TAs Fardad Jalili: Aisharjya Sarkar: Soham Das:
1 Chapter 2 Program Performance. 2 Concepts Memory and time complexity of a program Measuring the time complexity using the operation count and step count.
Introduction to Analysing Costs 2013-T2 Lecture 10 School of Engineering and Computer Science, Victoria University of Wellington  Marcus Frean, Rashina.
CSC 108H: Introduction to Computer Programming Summer 2011 Marek Janicki.
LECTURE 9 CS203. Execution Time Suppose two algorithms perform the same task such as search (linear search vs. binary search) and sorting (selection sort.
Complexity Analysis (Part I)
CSC 427: Data Structures and Algorithm Analysis
May 17th – Comparison Sorts
Analysis of Algorithms
CS/COE 1541 (term 2174) Jarrett Billingsley
Lecture 16: Data Storage Wednesday, November 6, 2006.
William Stallings Computer Organization and Architecture 8th Edition
Recitation 13 Searching and Sorting.
Introduction to complexity
Big-O notation.
Introduction to Analysing Costs
Source: wikipedia
Introduction to Algorithms
Architecture Background
Source:
Algorithm Analysis CSE 2011 Winter September 2018.
Computer Science 210 Computer Organization
Performance Measurement
CSCI1600: Embedded and Real Time Software
Big-Oh and Execution Time: A Review
Algorithm Analysis (not included in any exams!)
Building Java Programs
Computer Science 210 Computer Organization
CSE 143 Lecture 5 Binary search; complexity reading:
Insertion Sort for (int i = 1; i < a.length; i++)
Quicksort analysis Bubble sort
Algorithm An algorithm is a finite set of steps required to solve a problem. An algorithm must have following properties: Input: An algorithm must have.
ITEC 2620M Introduction to Data Structures
CSC 413/513: Intro to Algorithms
Data Structures Review Session
CS 201 Fundamental Structures of Computer Science
Building Java Programs
Simple Sorting Methods: Bubble, Selection, Insertion, Shell
Sorting Rearrange a[0], a[1], …, a[n-1] into ascending order.
CSE 373 Data Structures and Algorithms
CSC 427: Data Structures and Algorithm Analysis
Measuring Experimental Performance
Insertion Sort for (int i = 1; i < n; i++)
Performance Measurement
Memory System Performance Chapter 3
Sum this up for me Let’s write a method to calculate the sum from 1 to some n public static int sum1(int n) { int sum = 0; for (int i = 1; i
Performance Measurement
Estimating Algorithm Performance
Insertion Sort for (int i = 1; i < n; i++)
CSCI1600: Embedded and Real Time Software
Performance Measurement
Complexity Analysis (Part I)
Modern Collections Classes & Generics
Complexity Analysis (Part I)
Presentation transcript:

Performance Measurement

Performance Analysis Paper and pencil. Don’t need a working computer program or even a computer. These could be considered some of the advantages of performance analysis over performance measurement.

Some Uses Of Performance Analysis determine practicality of algorithm predict run time on large instance compare 2 algorithms that have different asymptotic complexity e.g., O(n) and O(n2) If the complexity is O(n2), we may consider the algorithm practical even for large n. An O(2n) algorithm is practical only for small n, say n < 40. The worst-case run time of an O(n2) algorithm will quadruple with each doubling of n. So if the worst-case time is 10 sec when n = 100, it will be approximately 40 sec when n = 200.

Limitations of Analysis Doesn’t account for constant factors. but constant factor may dominate 1000n vs n2 and we are interested only in n < 1000

Limitations of Analysis Modern computers have a hierarchical memory organization with different access time for memory at different levels of the hierarchy.

Memory Hierarchy MAIN L2 L1 ALU R 8-32 32KB 512KB 512MB 1C 2C 10C 100C 1 cycle to add data that are in the registers 2 cycles to copy data from L1 cache to a register 10 cycles to copy from L2 to L1 and register L1 ALU R 8-32 32KB 512KB 512MB 1C 2C 10C 100C

Limitations of Analysis Our analysis doesn’t account for this difference in memory access times. Programs that do more work may take less time than those that do less work. A program that does 100 operations on the same data would take less time than A program that performs a single operation on say 50 different pieces of data (assuming, in a simple model) that each time data is fetched from main memory, only one piece of data is fetched. The first program would make one fetch at a cost of 100 cycles and perform 100 operations at a cost of 1 cycle each for a total of 200 cycles. The second program would need 50 * 100 cycles just to fetch the data.

Performance Measurement Measure actual time on an actual computer. What do we need?

Performance Measurement Needs programming language working program computer compiler and options to use gcc –O2

Performance Measurement Needs data to use for measurement worst-case data best-case data average-case data timing mechanism --- clock

Timing In C++ double clocksPerMillis = double(CLOCKS_PER_SEC) / 1000; // clock ticks per millisecond clock_t startTime = clock(); // code to be timed comes here double elapsedMillis = (clock() – startTime) / clocksPerMillis; // elapsed time in milliseconds

Shortcoming Clock accuracy assume 100 ticks Repeat work many times to bring total time to be >= 1000 ticks Preceding measurement code is acceptable only when the elapsed time is large relative to the accuracy of the clock.

Accurate Timing clock_t startTime = clock(); long numberOfRepetitions; do { numberOfRepetitions++; doSomething(); } while (clock() - startTime < 1000) double elapsedMillis = (clock()- startTime) / clocksPerMillis; double timeForCode = elapsedMillis/numberOfRepetitions;

Accuracy Now accuracy is 10%. first reading may be just about to change to startTime + 100 second reading (final value of clock())may have just changed to finishTime so finishTime - startTime is off by 100 ticks

Accuracy first reading may have just changed to startTime second reading may be about to change to finishTime + 100 so finishTime - startTime is off by 100 ticks

Accuracy Examining remaining cases, we get trueElapsedTime = finishTime - startTime +- 100 ticks To ensure 10% accuracy, require elapsedTime = finishTime – startTime >= 1000 ticks

What Went Wrong? clock_t startTime = clock(); long numberOfRepetitions; do { numberOfRepetitions++; insertionSort(a, n); } while (clock() - startTime < 1000) double elapsedMillis = (clock()- startTime) / clocksPerMillis; double timeForCode = elapsedMillis/numberOfRepetitions; Suppose we are measuring worst-case run time. The array a Initially is in descending order. After the first iteration, the data is sorted. Subsequent iterations no longer start with worst-case data. So the measured time is for one Worst-case sort plus counter-1 best-case sorts!

The Fix clock_t startTime = clock(); long numberOfRepetitions; do { // put code to initialize a[] here insertionSort(a, n) } while (clock() - startTime < 1000) double elapsedMillis = (clock()- startTime) / clocksPerMillis; Now the measured time is for counter number of worst-case sorts plus counter number of initializes. We can measure the time to initialize separately and subtract. Or, if We are to compare with other sort methods, we could ignore the initialize time as it is there in the measurements for all sort methods. Or we could argue that for large n the worst case time to sort using insertion sort (O(n^2)) overshadows the initialize time, which is O(n), and so we may ignore the initialize time.

Time Shared System UNIX time MyProgram Using System.currentTimeMillis() is a problem, because your program may run for only part of the physical time that has elapsed. The UNIX command time gives the time for which your program ran.

Bad Way To Time do { counter++; startTime = clock(); doSomething(); elapsedTime += clock() - startTime; } while (elapsedTime < 1000) Suppose the clock is updated every 100ms and doSomething takes 99ms and it takes another 1ms do do everything else in the loop. If the clock is updated exactly at the start of the loop then it is not updated between the startTime = and elapsedTime += statements. So elapsedTime is always 0.