Vague idea “groping around” experiences Hypothesis Model Initial observations Experiment Data, analysis, interpretation Results & final Presentation Experimental.


1 Experimental Lifecycle
[Diagram stages: vague idea; “groping around” experiences; hypothesis; model; initial observations; experiment; data, analysis, interpretation; results & final presentation]

2 A Systematic Approach
1. Understand the problem, frame the questions, articulate the goals. A problem well-stated is half-solved.
   – Must remain objective
   – Be able to answer “why” as well as “what”
2. Select metrics that will help answer the questions.
3. Identify the parameters that affect behavior
   – System parameters (e.g., HW config)
   – Workload parameters (e.g., user request patterns)
4. Decide which parameters to study (vary).

3 Experimental Lifecycle (diagram as on slide 1), annotated with step 1:
1. Understand the problem, frame the questions, articulate the goals. A problem well-stated is half-solved.

4 What can go wrong at this stage?
– Never understanding the problem well enough to crisply articulate the goals / questions / hypothesis.
– Getting invested in some solution before making sure a real problem exists.
– Getting invested in any desired result.
– Not being unbiased enough to follow proper methodology.
   – Any biases should be working against yourself.
– Fishing expeditions (groping around forever).
– Having no goals, but building the apparatus first.
   – Swiss Army knife of simulators?

5 An Example Vague idea: there should be “interesting” interactions between DVS (dynamic voltage scaling of the CPU) and memory, especially PADRAM (power-aware memory) –DVS: in soft real-time applications, slow down CPU speed and reduce supply voltage so as to just meet the deadlines. –PADRAM: when there are no memory accesses pending, transition memory chip into lower power state –Intuition: DVS will affect the length of memory idle gaps

6 Back of the Envelope
What information do you need to know?
– XScale range: 50 MHz, 0.65 V, 15 mW to 1 GHz, 1.75 V, 2.2 W
– Fully active memory: 300 mW
– Nap: 30 mW, with 60 ns extra latency
– E = P * t
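A minimal sketch of this kind of back-of-the-envelope calculation, using E = P * t with the power figures quoted on the slide. The task size (4 million cycles), the 100 ms period, the function name `energy_joules`, and the simplifications (idle CPU draws nothing, memory is fully active whenever the CPU is busy, nap wake-up latency ignored) are all assumptions made for illustration, not part of the original study.

```python
# Back-of-the-envelope energy estimate using E = P * t.
# CPU operating points and memory powers are the figures quoted on the slide.
# The task size (cycles) and period are made-up values for illustration only.

CPU_POINTS = {
    "slow": {"freq_hz": 50e6, "power_w": 0.015},  # 50 MHz, 0.65 V, 15 mW
    "fast": {"freq_hz": 1e9, "power_w": 2.2},     # 1 GHz, 1.75 V, 2.2 W
}
MEM_ACTIVE_W = 0.300  # fully active memory
MEM_NAP_W = 0.030     # nap state (the 60 ns wake-up latency is ignored here)


def energy_joules(point, cycles, period_s, nap_when_idle):
    """Energy for one period: CPU busy energy plus memory energy."""
    busy_s = cycles / point["freq_hz"]
    if busy_s > period_s:
        raise ValueError("deadline missed at this speed")
    idle_s = period_s - busy_s
    cpu_j = point["power_w"] * busy_s  # assumption: idle CPU draws nothing
    mem_idle_w = MEM_NAP_W if nap_when_idle else MEM_ACTIVE_W
    mem_j = MEM_ACTIVE_W * busy_s + mem_idle_w * idle_s
    return cpu_j + mem_j


if __name__ == "__main__":
    for name, point in CPU_POINTS.items():
        for nap in (False, True):
            e = energy_joules(point, cycles=4e6, period_s=0.1, nap_when_idle=nap)
            print(f"{name:4s} nap={nap!s:5s} {e * 1e3:.2f} mJ")
```

Under these invented inputs, memory energy is a large share of the total, so the estimate is sensitive to whether idle memory can nap. That is a reason to run the experiment, not a result.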

7 Power Aware Memory: RDRAM Power States
State        Power    Extra latency before a read/write transaction
Active       300 mW   (none)
Standby      180 mW   +6 ns
Nap          30 mW    +60 ns
Power Down   3 mW     +6000 ns

8 Example Hypothesis: the best speed/voltage choice for DVS to minimize energy consumption, when idle memory can power down, is the lowest speed that still meets the deadline (i.e., the same conclusion reached by most DVS studies without memory).

9 CPU Energy

10 Execution Time

11 A Systematic Approach
1. Understand the problem, frame the questions, articulate the goals. A problem well-stated is half-solved.
   – Must remain objective
   – Be able to answer “why” as well as “what”
2. Select metrics that will help answer the questions.
3. Identify the parameters that affect behavior
   – System parameters (e.g., HW config)
   – Workload parameters (e.g., user request patterns)

12 Experimental Lifecycle (diagram as on slide 1), annotated with steps 2 and 3:
2. Select metrics that will help answer the questions.
3. Identify the parameters that affect behavior
   – System parameters
   – Workload parameters

13 What can go wrong at this stage?
– Wrong metrics (they don’t address the questions at hand)
   – What everyone else uses.
   – Easy to get.
– Not clear about where the “system under test” boundaries are.
– Unrepresentative workload.
   – Not predictive of real usage.
   – Just what everyone else uses (adopted blindly), or NOT what anyone else uses (no comparison possible)
– Overlooking significant parameters that affect the behavior of the system.

14 An Example System under test: CPU and memory. Metrics: total energy used by CPU + memory, CPU energy, memory energy, execution time

15 Parameters Affecting Behavior
Hardware parameters
– CPU voltage/speed settings
– Processor model (e.g., in-order, out-of-order, issue width)
– Cache organization
– Number of memory chips and data layout across them
– Memory power state transitioning policy
   – Threshold values
– Power levels of power states
– Transitioning times in and out of power states
Workload: periods, miss ratio, memory access pattern

16 A Systematic Approach
4. Decide which parameters to study (vary).
5. Select technique:
   – Measurement of prototype implementation: How invasive? Can we quantify interference of monitoring? Can we directly measure what we want?
   – Simulation: how detailed? Validated against what?
   – Repeatability
6. Select workload
   – Representative?
   – Community acceptance
   – Availability

17 Experimental Lifecycle (diagram as on slide 1), annotated with steps 4 to 6:
4. Decide which parameters to vary
5. Select technique
6. Select workload

18 What can go wrong at this stage?
– Choosing the wrong values for parameters you aren’t going to vary.
   – Not considering the effect of other values (sensitivity analysis)
– Not choosing to study the parameters that matter most (the factors)
– Wrong technique
– Wrong level of detail

19 An Example
– Choice of workload: MediaBench applications (later iterations will also use a synthetic benchmark in which the miss ratio can be varied)
– Technique: simulation using SimpleScalar augmented with RDRAM memory, PowerAnalyzer
– Factors to study:
   – CPU speed/voltage
   – Comparing nap memory policy with base case

20 A Systematic Approach
7. Run experiments
   – How many trials?
   – How many combinations of parameter settings?
   – Sensitivity analysis on other parameter values.
8. Analyze and interpret data
   – Statistics, dealing with variability, outliers
9. Data presentation
10. Where does it lead us next?
   – New hypotheses, new questions, a new round of experiments
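To make "how many combinations of parameter settings?" concrete, here is a small sketch that enumerates a full-factorial design and counts the runs it implies. The factor names, levels, trial count, and the helper `full_factorial` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical factors and levels; none of these values come from the slides.
factors = {
    "cpu_mhz": [50, 200, 600, 1000],
    "memory_policy": ["base", "nap"],
    "workload": ["app_a", "app_b", "app_c"],
}
TRIALS_PER_SETTING = 5  # assumed number of repeated runs per setting


def full_factorial(factors):
    """Return one dict per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


settings = full_factorial(factors)
total_runs = len(settings) * TRIALS_PER_SETTING
print(len(settings), "settings,", total_runs, "runs")
```

The run count is the product of the number of levels per factor times the number of trials, which is why the choice of which parameters to vary has to be made before the experiments start.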

21 Experimental Lifecycle (diagram as on slide 1), annotated with steps 7 to 9:
7. Run experiments
8. Analyze and interpret data
9. Data presentation

22 What can go wrong at this stage?
– One trial: data from a single run when variation can arise.
– Multiple runs: reporting average but not variability
– Tricks of statistics
– No interpretation of what the results mean.
– Ignoring errors and outliers
– Overgeneralizing conclusions: omitting assumptions and limitations of study.
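One way to avoid the "average but not variability" pitfall is to report the spread alongside the mean. The sketch below uses Python's standard `statistics` module; the trial values and the helper name `summarize` are made up for illustration.

```python
import math
import statistics


def summarize(samples):
    """Return count, mean, sample standard deviation, and standard error."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two trials to estimate variability")
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)  # sample (n - 1) standard deviation
    return {"n": n, "mean": mean, "stdev": stdev, "stderr": stdev / math.sqrt(n)}


# Hypothetical execution times (seconds) from five repeated trials.
trial_times_s = [10.2, 9.8, 10.5, 10.1, 9.9]
summary = summarize(trial_times_s)
print(f"mean={summary['mean']:.2f} s, stdev={summary['stdev']:.2f} s, n={summary['n']}")
```

A single trial raises an error here on purpose: with one run there is no way to say how much of a difference is noise.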

23 Our Example

24 A Systematic Approach
7. Run experiments
   – How many trials?
   – How many combinations of parameter settings?
   – Sensitivity analysis on other parameter values.
8. Analyze and interpret data
   – Statistics, dealing with variability, outliers
9. Data presentation
10. Where does it lead us next?
   – New hypotheses, new questions, a new round of experiments

25 Experimental Lifecycle (diagram as on slide 1), annotated with step 10:
10. What next?

26 An Example New Hypothesis: There is one “best” controller policy across all different speed settings –Vary miss ratio of synthetic benchmark –Vary speed/voltage

27 Our Example

28 Metrics
Criteria to compare performance
– Quantifiable, measurable
– Relevant to goals
– Complete set reflects all possible outcomes:
   – Successful: responsiveness (latency), productivity rate (throughput), resource utilization (%)
   – Unsuccessful: availability (probability of failure modes) or mean time to failure
   – Error: reliability (probability of error class) or mean time between errors
Must relate to the statement of hypothesis
– If “System x makes programming easier” is the claim, what is a metric? Lines of code, development time?

29 Common Performance Metrics (Successful Operation)
– Response time
– Utilization (% busy)
– Throughput (requests per unit of time): MIPS, bps, TPS
[Figure: request timeline with events “request starts”, “request ends”, “service begins”, “service completes”, “response back”, “request starts” and intervals “reaction”, “think”, “response time”]
[Figure: throughput vs. load curve marking nominal capacity, knee, and usable capacity]
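As a rough illustration of how these three metrics relate, the sketch below derives mean response time, throughput, and utilization from a request log for a single server. The log format, the sample timestamps, the observation window, and the function name `summarize_requests` are assumptions for this example.

```python
def summarize_requests(requests, window_s):
    """Summarize a single-server log.

    requests: list of (arrival, service_start, completion) times in seconds.
    window_s: length of the observation window in seconds.
    """
    n = len(requests)
    response_times = [done - arrived for arrived, _, done in requests]
    busy_s = sum(done - start for _, start, done in requests)
    return {
        "mean_response_s": sum(response_times) / n,  # includes queueing delay
        "throughput_per_s": n / window_s,            # completed requests per second
        "utilization": busy_s / window_s,            # fraction of time busy
    }


# Hypothetical log: the second request waits one second for the server.
log = [(0.0, 0.0, 2.0), (1.0, 2.0, 3.0), (4.0, 4.0, 7.0)]
stats = summarize_requests(log, window_s=10.0)
print(stats)
```

Response time is measured from arrival rather than from the start of service, so it captures waiting as well as service.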

30 Issues
– Individual vs. system-wide (global)
– Ideal set of metrics should have low variability, nonredundancy, completeness (all outcomes represented)
– Higher better (HB), lower better (LB), or nominal better (NB)
– Counts of events, durations (all types of time), rates (normalized to common basis)

31 “Good” Metrics
– Intuitive (linear with perceived or observed behaviors)
   – e.g., double physical memory -> double page hit rate
   – nice, but not required
– Predictive
   – e.g., MIPS(A) > MIPS(B) but exectime(A) > exectime(B)
– Repeatable (no nondeterminism embedded in measurement)
   – e.g., wall clock time contains lots of junk
– Easy to use or measure
– Comparable across alternatives
   – e.g., MIPS on RISC vs. MIPS on CISC
– Unbiased or independent of alternatives
   – e.g., MFLOPS biased toward those processors with floating-point units
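The MIPS example can be worked through numerically. In the sketch below, the two machines, their MIPS ratings, and their instruction counts are invented; the only point is that a higher instruction rate does not imply a shorter execution time when the instruction counts differ.

```python
def exec_time_s(instruction_count, mips):
    """Execution time = instructions / (millions of instructions per second)."""
    return instruction_count / (mips * 1e6)


# Hypothetical machines running the same program, compiled to different
# instruction counts (for example, simpler vs. more complex instructions).
machines = {
    "A": {"mips": 2000, "instructions": 3.0e9},
    "B": {"mips": 1500, "instructions": 1.5e9},
}

times = {
    name: exec_time_s(m["instructions"], m["mips"]) for name, m in machines.items()
}
for name, m in machines.items():
    print(f"{name}: {m['mips']} MIPS, {times[name]:.2f} s")
```

Machine A has the higher MIPS rating but the longer execution time, so MIPS fails as a predictive metric in this made-up case.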

32 Means vs. Ends Metrics
– Means-based metrics measure what was done (counts of page faults, clock frequency)
– Ends-based metrics measure progress towards goal
Means        Ends
Clock rate   Execution time

33 For Discussion Next Tuesday Survey the types of metrics used in your proceedings (10 papers). © 2003, Carla Ellis

34 Hints about Metrics Discussion
– Be precise
– Categories of metrics, e.g., performance. There are many precisely defined performance metrics.
   – Rates (bandwidth, IPC, power)
   – Durations (latency, response time, overflow)
   – Counts (deadlines missed, faults)
– Normalized data: normalized to what?
– Ratios can be dangerous (misleading, confusing)
   – Improvements (speedups)
   – Percentages (efficiency, utilization)
   – Rates (cache miss rate)
– Beware ratios of ratios of …
– What would % improvement in average miss rate mean?
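A small sketch of why "average miss rate" is ambiguous: the mean of per-benchmark miss rates and the aggregate miss rate over all accesses can differ widely. The benchmark names and counts below are made up for illustration.

```python
# Hypothetical per-benchmark cache statistics.
runs = {
    "bench_small": {"accesses": 1_000, "misses": 100},      # 10% miss rate
    "bench_large": {"accesses": 100_000, "misses": 1_000},  # 1% miss rate
}


def mean_of_rates(runs):
    """Unweighted mean of each benchmark's own miss rate."""
    rates = [r["misses"] / r["accesses"] for r in runs.values()]
    return sum(rates) / len(rates)


def aggregate_rate(runs):
    """Total misses divided by total accesses across all benchmarks."""
    total_misses = sum(r["misses"] for r in runs.values())
    total_accesses = sum(r["accesses"] for r in runs.values())
    return total_misses / total_accesses


print(f"mean of rates:  {mean_of_rates(runs):.4f}")
print(f"aggregate rate: {aggregate_rate(runs):.4f}")
```

The two "averages" here differ by about a factor of five, so a percentage improvement in one says little about the other unless the definition is stated.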

35 Not Metrics
– Intuitive goals: “I know it when I see it” (like great art or the “right” behavior), e.g., fairness, ease of programming.
– Analysis methods: cumulative distribution function (CDF). Ask: of what data?
– Presentation approach: pie chart. Again ask: of what data?

