Measuring Where CPU Time Goes

Profiling Code: Measuring Where CPU Time Goes

My Code Is Too Slow: Why?
- Look at the code -> what O() is it?
- Loading: O(N) is unavoidable and OK
- O(N^2) -> not good if N can get big
- Look-ups: if you'll do something N times, try to keep it O(1) or O(log N)
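To make the look-up advice concrete, here is a minimal sketch. It assumes a hypothetical table of street names and ids; the names `StreetTable`, `find_id_linear`, `build_index` and `find_id_hashed` are illustrative and not part of the course code. A linear scan costs O(N) per query, so N queries cost O(N^2). A `std::unordered_map` built once answers each query in O(1) on average.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical data: (street name, street id) pairs.
using StreetTable = std::vector<std::pair<std::string, int>>;

// O(N) per look-up: may scan the whole table before answering.
int find_id_linear(const StreetTable& table, const std::string& name) {
    for (const auto& entry : table) {
        if (entry.first == name) return entry.second;
    }
    return -1;  // not found
}

// Build the index once in O(N); it keeps the first id seen for each name.
std::unordered_map<std::string, int> build_index(const StreetTable& table) {
    std::unordered_map<std::string, int> index;
    for (const auto& entry : table) {
        index.emplace(entry.first, entry.second);
    }
    return index;
}

// O(1) on average per look-up once the index exists.
int find_id_hashed(const std::unordered_map<std::string, int>& index,
                   const std::string& name) {
    auto it = index.find(name);
    return it == index.end() ? -1 : it->second;
}
```

If the same table is queried many times, building the index up front usually pays for itself after a handful of look-ups.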

My Code Is Complex!
- Can't figure out O()
- Or O() looks OK, but the code is still not fast enough
- Profile! Measure where the time goes

Simple Profiling: Manual Random Sampling
- Run the program in the debugger
- Stop it with Debug -> Pause
- Look at the subroutine and line where you paused
- Examine the call stack to see how you got there
- Continue execution with Debug -> Continue
- More time in a routine -> higher probability of stopping there
- Usually stop in the same routine -> found the problem

Detailed Profiling: gprof Tool
- Randomly samples the function your program is in, about every 1 ms
- Also records how it got there (call stack / call graph)
- Then summarizes the output for you
How is this random sampling done?
- Program asks to be interrupted ~1000x / second by the operating system
- Each interrupt -> record the function you are in

gprof Tool: How to Use
- Compile your program with the right options
  - Select the profile configuration, or run: make CONF=profile
  - Adds the -pg option to the compiler -> instruments the exe
  - Turns off function inlining -> all function calls exist -> easier to interpret

gprof Tool: How to Use
- Run the program normally: ./mapper
  - Collects statistics, stores them in a big file (gmon.out)
  - Program runs only a little slower (~30%)
- Run gprof to summarize / interpret the output: gprof mapper > outfile.txt
  - Reads gmon.out, generates readable outfile.txt
- Even better: can visualize (graphics): gprof mapper gmon.out | gprof2dot.py -s | xdot -
See: ECE 297 Profiling Quick Start Guide

Example: Extract Negatives

    #include <vector>
    using namespace std;

    // Moves every negative value out of numbers and returns them.
    vector<int> extract_negatives (vector<int>& numbers) {
        vector<int> negatives;
        int i = 0;
        while (i < numbers.size()) {
            if (numbers[i] < 0) {
                negatives.push_back(numbers[i]);
                numbers.erase(numbers.begin() + i);  // index i now holds the next element
            } else {
                i++;  // Next element
            }
        }
        return negatives;
    }

Takes 30 s when given an 800,000 element vector. Too slow -> why?

Visualized Call Graph
- extract_negatives: called once by main, takes 71% of time
- push_back(): estimated 37% of time
  - This is an over-estimate: sample-based profiling isn't perfect
- erase(): called over 640,000 times, takes 53% of time -> biggest problem
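The profile points at `erase()`. Each call to `erase()` in the middle of a `std::vector` shifts every later element down by one, so hundreds of thousands of calls add up to O(N^2) work. One possible fix is sketched below; it is not necessarily the solution the course had in mind. It makes a single pass, copies negatives out, and compacts the remaining values toward the front, so each element is moved at most once.

```cpp
#include <cstddef>
#include <vector>

// O(N) variant: negatives are copied out, everything else is compacted
// toward the front of the vector, and the leftover tail is dropped once.
std::vector<int> extract_negatives(std::vector<int>& numbers) {
    std::vector<int> negatives;
    std::size_t keep = 0;  // next free slot for a value that stays in numbers
    for (std::size_t i = 0; i < numbers.size(); ++i) {
        if (numbers[i] < 0) {
            negatives.push_back(numbers[i]);
        } else {
            numbers[keep++] = numbers[i];
        }
    }
    numbers.resize(keep);  // remove the now-unused tail in one step
    return negatives;
}
```

The relative order of both the extracted negatives and the remaining values is preserved, matching the behaviour of the slow version.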

Milestone 4: Courier Company

Problem Definition
[Figure: map with intersections labelled A, B, C, D, each label appearing twice]
- Given N deliveries (pick up, drop off)
  - N = 4 here, all at intersections
- And given M courier truck depots
  - M = 3 here, all at intersections
- Return: a low travel time path
  - Starting and ending at some depot
  - And reaching all 2N delivery intersections
  - And always picking up a package before delivering it

Possible Solution?
[Figure: map with intersections labelled A, B, C, D]
No -> dropping off packages before they're picked up!

Legal Solution
[Figure: map with intersections labelled A, B, C, D]
Output: vector of street segment ids
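The rule that separates a legal solution from an illegal one (every package is picked up before it is dropped off) can be checked mechanically. The sketch below assumes a hypothetical representation in which a route is a list of stops, each naming a delivery index and whether it is the pick-up or the drop-off. The `Stop` struct and `is_legal_order` are illustrative names, not part of the milestone API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical representation of one stop on the route.
struct Stop {
    std::size_t delivery;  // index of the delivery, 0 .. N-1
    bool is_pickup;        // true = pick up, false = drop off
};

// Returns true if each of the N deliveries is picked up exactly once and
// dropped off exactly once, with the pick-up coming first.
bool is_legal_order(const std::vector<Stop>& order, std::size_t n_deliveries) {
    enum class State { Waiting, OnTruck, Delivered };
    std::vector<State> state(n_deliveries, State::Waiting);
    for (const Stop& s : order) {
        if (s.delivery >= n_deliveries) return false;  // unknown delivery
        State& st = state[s.delivery];
        if (s.is_pickup) {
            if (st != State::Waiting) return false;    // picked up twice
            st = State::OnTruck;
        } else {
            if (st != State::OnTruck) return false;    // dropped off before pick-up
            st = State::Delivered;
        }
    }
    for (State st : state) {
        if (st != State::Delivered) return false;      // something never delivered
    }
    return true;
}
```

The check is O(N) in the number of stops, so it is cheap compared with computing the paths between them.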

Simple Solution: Re-use m3 path-finder

    // Pseudocode: "path +=" appends the returned street segments to path.

    // Go from first depot to first package pick up
    path = find_path (depot[0], delivery[0].pickUp);

    // Complete deliveries, in order
    for (i = 0; i < N-1; i++) {
        path += find_path (delivery[i].pickUp, delivery[i].dropOff);
        path += find_path (delivery[i].dropOff, delivery[i+1].pickUp);
    }

    // Drop off last package
    path += find_path (delivery[N-1].pickUp, delivery[N-1].dropOff);

    // Go back to the first depot to drop off the truck
    path += find_path (delivery[N-1].dropOff, depot[0]);

Possible Solution C B D C A D 2 A B 1 Lots of wasted travel!

More Logical Solution
[Figure: map with intersections labelled A, B, C, D]
Need to optimize delivery order!

Exhaustive Algorithm?
- Try all possible delivery orders
- Pick the one with lowest travel time
How many combinations?
- M truck depots
- N deliveries -> 2N pick-up + drop-off intersections
- Pick one of M starting locations
- Then pick one of 2N pick-up/drop-off intersections
- Then one of 2N-1 for the second intersection
- ... (repeat until last delivery)
- Then M places to drop off truck
- M * 2N * (2N-1) * (2N-2) * ... * 1 * M = M^2 * (2N)!
- Some of these are illegal orders -> say the algorithm checks legality after generating the solution

Exhaustive Algorithm?
- Say M = 10, N = 100
- 10^2 * (2N)! = 100 * 200! ≈ 7.9 x 10^376 ≈ 10^377
- Invoke find_path () 2N+1 times to get the path between each pair of consecutive stops
- Say find_path takes 0.1 s (very good!)
- 7.9 x 10^376 * 201 * 0.1 s ≈ 1.6 x 10^378 s
- 5 x 10^370 years!
- Age of the universe: ~14 x 10^9 years!
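The numbers above are far too large for any built-in numeric type, but the estimate can be reproduced by working with logarithms. The sketch below uses `std::lgamma`, which returns ln((x-1)!) for a positive argument x, so `lgamma(2N + 1)` gives ln((2N)!). The function names are illustrative, and the year length of 365.25 days is an assumption.

```cpp
#include <cmath>

// log10 of M^2 * (2N)!, the number of orders the exhaustive search generates.
// Working in logs avoids overflow: lgamma(x + 1) equals ln(x!).
double log10_num_orders(int m_depots, int n_deliveries) {
    const double ln_factorial = std::lgamma(2.0 * n_deliveries + 1.0);
    return 2.0 * std::log10(static_cast<double>(m_depots))
         + ln_factorial / std::log(10.0);
}

// log10 of the total run time in years, given the seconds per find_path call
// and 2N+1 find_path calls per candidate order.
double log10_years(int m_depots, int n_deliveries, double seconds_per_call) {
    const double calls_per_order = 2.0 * n_deliveries + 1.0;
    const double seconds_per_year = 365.25 * 24.0 * 3600.0;  // assumed year length
    return log10_num_orders(m_depots, n_deliveries)
         + std::log10(calls_per_order * seconds_per_call / seconds_per_year);
}
```

For M = 10, N = 100 and 0.1 s per call, `log10_num_orders` comes out just under 377 and `log10_years` close to 370.7, which agrees with the figures of about 10^377 orders and 5 x 10^370 years.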

Traveling Salesman Problem
- We are solving a variation of the traveling salesman problem
- Computationally hard problem
  - For N deliveries, no known algorithm guarantees the optimal (lowest travel time) solution in polynomial time
  - i.e. known exact algorithms take more than O(N^k) time, for any k
  - In practice that means exponential time, O(2^N) or worse
- Need to use heuristics to solve
- Most research problems are computationally hard
  - Integrated circuit design, drug design, transportation network design, ...

Stephen Cook: Pioneer of Complexity Theory
- U of T professor in Computer Science
- NP-complete problems
  - No polynomial time solution known on conventional computers
  - Proved: if a method is found to solve any one of these problems in polynomial time, then all of these problems can be solved in polynomial time
- P vs. NP: most famous open problem in computer science
- 1970: denied tenure at UC Berkeley
- 1971: published his most famous paper
- 1982: won the Turing award

Heuristic Algorithms
Not guaranteed to find best answer, but run in a reasonable time

Heuristic 1 Ideas?

    Go from any depot to nearest pickup, p          // O(M*N)
    while (packages to deliver) {                   // N iterations
        drop off package at delivery[p].dropOff
        p = nearest remaining pickup                // O(N)
    }
    Go to nearest depot                             // O(M)

Overall: O(M*N) + O(N*N) + O(M) = O(M*N + N^2)
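A minimal C++ sketch of this greedy heuristic follows. It rests on several assumptions: intersections are plain 2-D points, and `travel_time` is a straight-line distance standing in for the real travel time from the path-finder. `Point`, `Delivery`, `GreedyRoute` and `greedy_route` are illustrative names, not the milestone API. The function returns only the visiting order, and the actual street segments would still come from `find_path`.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Point { double x, y; };
struct Delivery { Point pickUp, dropOff; };

// Assumption: straight-line distance stands in for real travel time.
double travel_time(const Point& a, const Point& b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

struct GreedyRoute {
    std::size_t start_depot;
    std::vector<std::size_t> delivery_order;  // one package carried at a time
    std::size_t end_depot;
};

// Precondition: depots and deliveries are both non-empty.
GreedyRoute greedy_route(const std::vector<Point>& depots,
                         const std::vector<Delivery>& deliveries) {
    const double inf = std::numeric_limits<double>::infinity();
    GreedyRoute route{0, {}, 0};
    std::size_t p = 0;
    double best = inf;
    // O(M*N): closest (depot, pick-up) pair decides where to start.
    for (std::size_t d = 0; d < depots.size(); ++d) {
        for (std::size_t i = 0; i < deliveries.size(); ++i) {
            const double t = travel_time(depots[d], deliveries[i].pickUp);
            if (t < best) { best = t; route.start_depot = d; p = i; }
        }
    }
    std::vector<bool> done(deliveries.size(), false);
    // N iterations, each with an O(N) scan for the nearest remaining pick-up.
    for (std::size_t k = 0; k < deliveries.size(); ++k) {
        done[p] = true;
        route.delivery_order.push_back(p);
        const Point here = deliveries[p].dropOff;
        best = inf;
        for (std::size_t i = 0; i < deliveries.size(); ++i) {
            if (done[i]) continue;
            const double t = travel_time(here, deliveries[i].pickUp);
            if (t < best) { best = t; p = i; }
        }
    }
    // O(M): nearest depot to the last drop-off.
    const Point last = deliveries[route.delivery_order.back()].dropOff;
    best = inf;
    for (std::size_t d = 0; d < depots.size(); ++d) {
        const double t = travel_time(last, depots[d]);
        if (t < best) { best = t; route.end_depot = d; }
    }
    return route;
}
```

The three loops correspond to the O(M*N), O(N*N) and O(M) terms above. Counting each travel-time query as O(1) is itself a simplification, since a real `find_path` call costs far more than a distance computation.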