© 2003, Carla Ellis Vague idea “groping around” experiences Hypothesis Model Initial observations Experiment Data, analysis, interpretation Results & final.

Presentation transcript:

Experimental Lifecycle
[Diagram: Vague idea; “groping around” experiences; Hypothesis; Model; Initial observations; Experiment; Data, analysis, interpretation; Results & final presentation]

A Systematic Approach
1. Understand the problem, frame the questions, articulate the goals. A problem well-stated is half-solved.
   – Must remain objective
   – Be able to answer “why” as well as “what”
2. Select metrics that will help answer the questions.
3. Identify the parameters that affect behavior
   – System parameters (e.g., HW config)
   – Workload parameters (e.g., user request patterns)
4. Decide which parameters to study (vary).

Experimental Lifecycle
1. Understand the problem, frame the questions, articulate the goals. A problem well-stated is half-solved.
[Diagram: Vague idea; “groping around” experiences; Hypothesis; Model; Initial observations; Experiment; Data, analysis, interpretation; Results & final presentation]

An Example
Vague idea: there should be “interesting” interactions between DVS (dynamic voltage scaling of the CPU) and PADRAM (power-aware memory)
– DVS: in soft real-time applications, slow down CPU speed and reduce supply voltage so as to just meet the deadlines.
– PADRAM: when there are no memory accesses pending, transition memory chip into lower power state
– Intuition: DVS will affect the length of memory idle gaps
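The DVS rule described on this slide (run at the slowest setting that still meets the deadline) can be sketched in a few lines of Python. This is only an illustration: the list of operating frequencies, the cycle count, and the deadline are hypothetical values, and the function name is made up for this sketch. Only the selection rule itself comes from the slide.

```python
def lowest_speed_meeting_deadline(freqs_hz, cycles, deadline_s):
    """Return the slowest frequency that finishes `cycles` within `deadline_s`.

    Returns None if even the fastest setting misses the deadline.
    """
    for freq in sorted(freqs_hz):
        # Execution time at this setting is cycles / frequency.
        if cycles / freq <= deadline_s:
            return freq
    return None


# Hypothetical operating points (Hz) and a hypothetical soft real-time task.
FREQS_HZ = [50e6, 200e6, 400e6, 600e6, 800e6, 1e9]
chosen = lowest_speed_meeting_deadline(FREQS_HZ, cycles=30e6, deadline_s=0.1)
print(f"chosen frequency: {chosen / 1e6:.0f} MHz")
```

The hypothesis developed later in the deck questions whether this choice also minimizes energy once memory power states are taken into account.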

Back of the Envelope
What information do you need to know?
– XScale range: 50 MHz, 0.65 V, 15 mW to 1 GHz, 1.75 V, 2.2 W
– Memory: fully active 300 mW; nap 30 mW with 60 ns extra latency
– E = P * t
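A back-of-the-envelope estimate with E = P * t can be sketched as below, using the power figures from this slide. Everything else is an assumption made for illustration: the task size and period are hypothetical, the CPU is assumed to draw no power while idle, memory is assumed to be fully active for the whole time the CPU is busy and in nap for the rest of the period, and the 60 ns wake-up latency is ignored. The numbers it produces are arithmetic under those assumptions, not results from the study.

```python
# Power figures taken from the slide (converted to watts).
CPU_POINTS = {
    "50 MHz": (50e6, 0.015),  # (frequency in Hz, CPU power in W)
    "1 GHz": (1e9, 2.2),
}
MEM_ACTIVE_W = 0.300
MEM_NAP_W = 0.030


def energy_per_period(freq_hz, cpu_power_w, cycles, period_s):
    """Return (cpu_joules, mem_joules, total_joules) for one period.

    Assumes: CPU idle power is zero, memory is active while the CPU is busy
    and naps otherwise, and transition latencies are negligible.
    """
    busy_s = cycles / freq_hz
    if busy_s > period_s:
        raise ValueError("task does not finish within the period")
    idle_s = period_s - busy_s
    cpu_j = cpu_power_w * busy_s  # E = P * t
    mem_j = MEM_ACTIVE_W * busy_s + MEM_NAP_W * idle_s
    return cpu_j, mem_j, cpu_j + mem_j


# Hypothetical task: 50 million cycles due once per second.
for name, (freq_hz, power_w) in CPU_POINTS.items():
    cpu_j, mem_j, total_j = energy_per_period(freq_hz, power_w, 50e6, 1.0)
    print(f"{name}: CPU {cpu_j:.4f} J, memory {mem_j:.4f} J, total {total_j:.4f} J")
```

Under these crude assumptions the slow setting spends little CPU energy but keeps memory active for the whole period, so its total is higher than at the fast setting. That is the kind of rough signal that justifies turning the vague idea into a testable hypothesis.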

Power Aware Memory: RDRAM Power States
[State diagram; read/write transactions take place in the Active state]
– Active: 300 mW
– Standby: 180 mW, +6 ns
– Nap: 30 mW, +60 ns
– Power Down: 3 mW (latency value lost in the transcript)

Example
Hypothesis: the best speed/voltage choice for DVS to minimize energy consumption when idle memory can power down is not necessarily the lowest speed that is able to meet deadline – counter to the assumption made by most DVS studies.

Example
Restate hypothesis to disprove: the best speed/voltage choice for DVS to minimize energy consumption when idle memory can power down is still the lowest speed that is able to meet deadline – the assumption made by most DVS studies.

What can go wrong at this stage?
– Never understanding the problem well enough to crisply articulate the goals / questions / hypothesis.
– Getting invested in some solution before making sure a real problem exists.
– Getting invested in any desired result. Not being unbiased enough to follow proper methodology.
– Fishing expeditions (groping around forever).
– Having no goals but building the apparatus first.

A Systematic Approach
1. Understand the problem, frame the questions, articulate the goals. A problem well-stated is half-solved.
   – Must remain objective
   – Be able to answer “why” as well as “what”
2. Select metrics that will help answer the questions.
3. Identify the parameters that affect behavior
   – System parameters (e.g., HW config)
   – Workload parameters (e.g., user request patterns)

Experimental Lifecycle
2. Select metrics that will help answer the questions.
3. Identify the parameters that affect behavior
   – System parameters
   – Workload parameters
[Diagram: Vague idea; “groping around” experiences; Hypothesis; Model; Initial observations; Experiment; Data, analysis, interpretation; Results & final presentation]

An Example
System under test: CPU and memory.
Metrics: total energy used by CPU + memory, CPU energy, memory energy, leakage, execution time, average memory gap

Parameters Affecting Behavior
Hardware parameters
– CPU voltage/speed settings
– Processor model (e.g., in-order, out-of-order, issue width)
– Cache organization
– Number of memory chips and data layout across them
– Memory power state transitioning policy
  – Threshold values
– Power levels of power states
– Transitioning times in & out of power states
Workload: periods, miss ratio, memory access pattern

What can go wrong at this stage?
– Wrong metrics (they don’t address the questions at hand)
  – What everyone else uses.
  – Easy to get.
– Not clear about where the “system under test” boundaries are.
– Unrepresentative workload.
  – Not predictive of real usage.
  – Just what everyone else uses (adopted blindly) – or NOT what anyone else uses (no comparison possible)
– Overlooking significant parameters that affect the behavior of the system.

A Systematic Approach
4. Decide which parameters to study (vary).
5. Select technique:
   – Measurement of prototype implementation
     – How invasive? Can we quantify interference of monitoring? Can we directly measure what we want?
   – Simulation – how detailed? Validated against what?
   – Repeatability
6. Select workload
   – Representative?
   – Community acceptance
   – Availability

Experimental Lifecycle
4. Decide which parameters to vary
5. Select technique
6. Select workload
[Diagram: Vague idea; “groping around” experiences; Hypothesis; Model; Initial observations; Experiment; Data, analysis, interpretation; Results & final presentation]

An Example
– Choice of workload: MediaBench applications (later iterations will use a synthetic benchmark as well in which miss ratio can be varied)
– Technique: simulation using SimpleScalar augmented with RDRAM memory, PowerAnalyzer
– Factors to study:
  – CPU speed/voltage
  – Comparing nap memory policy with base case

What can go wrong at this stage?
– Choosing the wrong values for parameters you aren’t going to vary. Not considering the effect of other values (sensitivity analysis)
– Not choosing to study the parameters that matter most – factors
– Wrong technique
– Wrong level of detail

A Systematic Approach
7. Run experiments
   – How many trials?
   – How many combinations of parameter settings?
   – Sensitivity analysis on other parameter values.
8. Analyze and interpret data
   – Statistics, dealing with variability, outliers
9. Data presentation
10. Where does it lead us next?
   – New hypotheses, new questions, a new round of experiments
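The question “how many combinations of parameter settings?” can be answered up front by enumerating the design. The sketch below counts the runs in a full factorial design, in which every level of every factor is combined with every level of the others. The factor names, levels, and trial count are hypothetical placeholders, not the settings used in the study.

```python
from itertools import product

# Hypothetical factors and levels, for illustration only.
FACTORS = {
    "cpu_speed_mhz": [50, 200, 400, 600, 800, 1000],
    "memory_policy": ["base", "nap"],
    "benchmark": ["app_a", "app_b", "app_c"],
}
TRIALS_PER_CONFIG = 5  # repeated runs per configuration to capture variability


def full_factorial(factors):
    """Return one dict per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


configs = full_factorial(FACTORS)
total_runs = len(configs) * TRIALS_PER_CONFIG
print(f"{len(configs)} configurations, {total_runs} runs in total")
```

Because the count is the product of the number of levels, adding one more factor or a few more levels multiplies the cost. That is one reason to decide deliberately which parameters become factors and which are held fixed.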

Experimental Lifecycle
7. Run experiments
8. Analyze and interpret data
9. Data presentation
[Diagram: Vague idea; “groping around” experiences; Hypothesis; Model; Initial observations; Experiment; Data, analysis, interpretation; Results & final presentation]

An Example
[Figure]

What can go wrong at this stage?
– One trial – data from a single run when variation can arise.
– Multiple runs – reporting average but not variability
– Tricks of statistics
– No interpretation of what the results mean.
– Ignoring errors and outliers
– Overgeneralizing conclusions – omitting assumptions and limitations of study.
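To avoid reporting an average without its variability, each configuration's repeated trials can be summarized with a mean, a sample standard deviation, and a confidence interval. The sketch below uses Python's standard `statistics` module and a small table of two-sided 95% Student-t critical values. It assumes the trials are independent and roughly normally distributed, and the sample data are hypothetical.

```python
import math
import statistics

# Two-sided 95% Student-t critical values, indexed by degrees of freedom.
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}


def summarize(samples):
    """Return mean, sample standard deviation, and a 95% confidence interval."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two trials to estimate variability")
    if n - 1 not in T_95:
        raise ValueError("this small table only covers 2 to 10 trials")
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)  # sample (n - 1) standard deviation
    half_width = T_95[n - 1] * stdev / math.sqrt(n)
    return {"n": n, "mean": mean, "stdev": stdev,
            "ci95": (mean - half_width, mean + half_width)}


# Hypothetical energy measurements (joules) from five runs of one configuration.
energy_j = [0.152, 0.155, 0.149, 0.158, 0.151]
summary = summarize(energy_j)
print(f"mean {summary['mean']:.4f} J, stdev {summary['stdev']:.4f} J, "
      f"95% CI {summary['ci95'][0]:.4f} to {summary['ci95'][1]:.4f} J")
```

A single trial is rejected outright here, since it gives no information about run-to-run variation.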

A Systematic Approach
7. Run experiments
   – How many trials?
   – How many combinations of parameter settings?
   – Sensitivity analysis on other parameter values.
8. Analyze and interpret data
   – Statistics, dealing with variability, outliers
9. Data presentation
10. Where does it lead us next?
   – New hypotheses, new questions, a new round of experiments

Experimental Lifecycle
10. What next?
[Diagram: Vague idea; “groping around” experiences; Hypothesis; Model; Initial observations; Experiment; Data, analysis, interpretation; Results & final presentation]

An Example
New Hypothesis: different controller policies are appropriate at different speed settings
– Vary miss ratio of synthetic benchmark
– Vary speed/voltage

Metrics
Criteria to compare performance
– Quantifiable, measurable
– Relevant to goals
– Complete set reflects all possible outcomes:
  – Successful – responsiveness, productivity rate (throughput), resource utilization
  – Unsuccessful – availability (probability of failure mode) or mean time to failure
  – Error – reliability (probability of error class) or mean time between errors

Common Performance Metrics (Successful Operation)
– Response time
– Throughput (requests per unit of time): MIPS, bps, TPS
[Timeline figure: request starts; request ends; service begins; service completes; response back; next request starts. Intervals labeled: reaction, response time, think]
[Plot: throughput vs. load, marking nominal capacity, knee, and usable capacity]
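Both metrics can be computed from a simple request log. The sketch below assumes one common definition of response time (from the moment the request has been fully submitted until the response comes back) and defines throughput as completed requests per second over an observation window. The log format and the sample timestamps are hypothetical.

```python
from statistics import mean

# Hypothetical log entries: (request fully submitted, response back), in seconds.
LOG = [(0.0, 0.4), (1.0, 1.3), (2.0, 2.9), (3.0, 3.2)]


def mean_response_time(log):
    """Average time from the end of a request to the arrival of its response."""
    return mean([done - submitted for submitted, done in log])


def throughput(log, window_s):
    """Requests completed per second within the first `window_s` seconds."""
    if window_s <= 0:
        raise ValueError("observation window must be positive")
    completed = sum(1 for _, done in log if done <= window_s)
    return completed / window_s


print(f"mean response time: {mean_response_time(LOG):.2f} s")
print(f"throughput: {throughput(LOG, window_s=4.0):.2f} requests/s")
```

Other definitions are possible, for example measuring response time up to the start of the response rather than its completion, so a study should state which one it uses.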

Discussion: Sampling of Metrics from Literature

Discussion Next Time: Destination Initial Hypothesis
Pre-proposal 1: Sketch out what information you would need to collect (or have already gathered) in a “groping around” phase to get from a vague idea to the hypothesis stage for your planned project
[Diagram: Vague idea; “groping around” experiences; Initial observations; Hypothesis]