Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 Introduction to Experiment Design Shiv Kalyanaraman Rensselaer Polytechnic Institute

Slides:



Advertisements
Similar presentations
G. Alonso, D. Kossmann Systems Group
Advertisements

Copyright © 2005 Department of Computer Science CPSC 641 Winter PERFORMANCE EVALUATION Often in Computer Science you need to: – demonstrate that.
Silberschatz, Galvin and Gagne  2002 Modified for CSCI 399, Royden, Operating System Concepts Operating Systems Lecture 19 Scheduling IV.
Multiple Regression [ Cross-Sectional Data ]
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 ECSE-4690: Experimental Networking Informal Quiz: Prob/Stat Shiv Kalyanaraman:
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 ECSE-6600: Internet Protocols Informal Quiz #05: SOLUTIONS Shivkumar Kalyanaraman: GOOGLE: “Shiv.
Friday, May 14, 2004 ISYS3015 Analytical Methods for IS Professionals School of IT, The University of Sydney 1 Factorial Designs Week 9 Lecture 2.
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 Basic Ideas in Probability and Statistics for Experimenters: Part II: Quantitative Work, Regression.
AQM for Congestion Control1 A Study of Active Queue Management for Congestion Control Victor Firoiu Marty Borden.
1 PERFORMANCE EVALUATION H Often one needs to design and conduct an experiment in order to: – demonstrate that a new technique or concept is feasible –demonstrate.
Statistics CSE 807.
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 Term Projects Shiv Kalyanaraman, Rensselaer Polytechnic Institute
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 Review of Networking Concepts (Part 2) Shivkumar Kalyanaraman Rensselaer Polytechnic Institute.
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 TCP Congestion Control: AIMD and Binomial Shivkumar Kalyanaraman Rensselaer Polytechnic Institute.
L Berkley Davis Copyright 2009 MER301: Engineering Reliability Lecture 14 1 MER301: Engineering Reliability LECTURE 14: Chapter 7: Design of Engineering.
OS Fall ’ 02 Performance Evaluation Operating Systems Fall 2002.
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 ECSE-4963: Experimental Networking Exam 1: SOLUTIONS Time: 60 min (strictly enforced) Points:
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 Introduction to Experiment Design Shiv Kalyanaraman Rensselaer Polytechnic Institute
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 Understanding Linux Kernel to Build Software Routers (Qualitative Discussion) Shiv Kalyanaraman,
Performance Evaluation
Copyright © 1998 Wanda Kunkle Computer Organization 1 Chapter 2.1 Introduction.
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 Introduction to Simulation Shiv Kalyanaraman Rensselaer Polytechnic Institute
1 PERFORMANCE EVALUATION H Often in Computer Science you need to: – demonstrate that a new concept, technique, or algorithm is feasible –demonstrate that.
OS Fall ’ 02 Performance Evaluation Operating Systems Fall 2002.
Ns Simulation Final presentation Stella Pantofel Igor Berman Michael Halperin
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 00-1 ECSE-4730: Computer Communications Networks (CCN): Introduction Shivkumar Kalyanaraman Rensselaer.
Objectives of Multiple Regression
Analysis of Simulation Results Andy Wang CIS Computer Systems Performance Analysis.
AZIZ KUSTIYO DEPARTEMEN ILMU KOMPUTER FMIPA IPB. INTRODUCTION TO EXPERIMENTAL DESIGN  The goal of a proper experimental design is to obtain the maximum.
One-Factor Experiments Andy Wang CIS 5930 Computer Systems Performance Analysis.
Experimental Design Tutorial Presented By Michael W. Totaro Wireless Research Group Center for Advanced Computer Studies University of Louisiana at Lafayette.
CPE 619 2k-p Factorial Design
Ratio Games and Designing Experiments Andy Wang CIS Computer Systems Performance Analysis.
Raj Jain The Ohio State University R1: Performance Analysis of TCP Enhancements for WWW Traffic using UBR+ with Limited Buffers over Satellite.
Modeling and Performance Evaluation of Network and Computer Systems Introduction (Chapters 1 and 2) 10/4/2015H.Malekinezhad1.
A Really Bad Graph. For Discussion Today Project Proposal 1.Statement of hypothesis 2.Workload decisions 3.Metrics to be used 4.Method.
Introduction to Experimental Design
Analysis of Simulation Results Chapter 25. Overview  Analysis of Simulation Results  Model Verification Techniques  Model Validation Techniques  Transient.
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 A TCP Friendly Traffic Marker for IP Differentiated Services Feroz Azeem, Shiv Kalyanaraman,
ICOM 6115: Computer Systems Performance Measurement and Evaluation August 11, 2006.
1 IEEE Meeting July 19, 2006 Raj Jain Modeling of BCN V2.0 Jinjing Jiang and Raj Jain Washington University in Saint Louis Saint Louis, MO
© Copyright McGraw-Hill Correlation and Regression CHAPTER 10.
CPE 619 Experimental Design Aleksandar Milenković The LaCASA Laboratory Electrical and Computer Engineering Department The University of Alabama in Huntsville.
Rassul Ayani 1 Performance of parallel and distributed systems  What is the purpose of measurement?  To evaluate a system (or an architecture)  To compare.
Chapter 3 System Performance and Models Introduction A system is the part of the real world under study. Composed of a set of entities interacting.
1 Summarizing Performance Data Confidence Intervals Important Easy to Difficult Warning: some mathematical content.
Experiment Design Overview Number of factors 1 2 k levels 2:min/max n - cat num regression models2k2k repl interactions & errors 2 k-p weak interactions.
ETM 607 – Output Analysis: Estimation of Relative Performance Output comparison between two or more alternative systems Common Random Numbers (CRN) Comparison.
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 ECSE-6600: Internet Protocols Informal Quiz #02 SOLUTIONS (part I) Shivkumar Kalyanaraman: GOOGLE:
OPERATING SYSTEMS CS 3530 Summer 2014 Systems and Models Chapter 03.
Sample Size Determination
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 ECSE-6600: Internet Protocols Exam 3 Time: 90 min (strictly enforced) [Hint: spend time roughly.
Designs for Experiments with More Than One Factor When the experimenter is interested in the effect of multiple factors on a response a factorial design.
Introduction Dispersion 1 Central Tendency alone does not explain the observations fully as it does reveal the degree of spread or variability of individual.
Introduction to Experiment Design
Common Mistakes in Performance Evaluation The Art of Computer Systems Performance Analysis By Raj Jain Adel Nadjaran Toosi.
Overview Modern chip designs have multiple IP components with different process, voltage, temperature sensitivities Optimizing mix to different customer.
OPERATING SYSTEMS CS 3502 Fall 2017
Ratio Games and Designing Experiments
Introduction to Simulation
Correlation and Regression
CHAPTER 29: Multiple Regression*
Computer Systems Performance Evaluation
Incremental Partitioning of Variance (aka Hierarchical Regression)
Replicated Binary Designs
Computer Systems Performance Evaluation
One-Factor Experiments
Introduction to Experiment Design
CS533 Modeling and Performance Evalua ion of Network and Computer Systems Experimental Design (Chapters 16-17)
Presentation transcript:

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 Introduction to Experiment Design Shiv Kalyanaraman Rensselaer Polytechnic Institute Refs: Chap of Raj Jain’s book Adapted from Prof. Raj Jain’s slides

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 2 Why Experiment design? q Problem: validation and results only as good as your test cases! q How to design a critical set of test cases ? q Idea: Parameterize and use black-box strategy System Parameters or Factors Metrics

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 3 Performance, Metrics and Parameters q Performance questions: q Absolute: How fast does computer A run MY program ? q Relative: Is machine A faster than machine B, and if so, how much faster ? q Parameters: factors or inputs q Eg: clock rate, poisson inter-arrivals q Metrics: functions of factors or parameters q Eg: throughput, response time, queue length... q Metric should characterize the design tradeoffs adequately q Metrics are usually functions of many factors. Use of one factor alone may be misleading.

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 4 Choose metrics to reflect appropriate tradeoffs q Network users: services and performance that their applications need, e.g., guarantee that each message it sends will be delivered without error within a certain amount of time q Network designers: cost-effective design e.g., that network resources are efficiently utilized and fairly allocated to different users q Users + designer perspectives enough to meet 3 factors of interface/architecture design. But... q Network providers: system that is easy to administer and manage e.g., that faults can be easily isolated and it is easy to account for usage q Require management tools and interfaces. q Often considered to the basic protocol interface design

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 5 Goals of Experiment Design q Design a proper set of (simulation or measurement) experiments: maximize information gained with minimum experiments! q Develop a (regression) model that best describes the data obtained q Estimate the contribution of each factor and its values (I.e alternatives) to the performance variation q Isolate measurement errors q Estimate confidence intervals for model parameters q Check if the alternatives are significantly different q Check if the model is adequate

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 6 Example: TCP/AQM Design Problem q TCP versions: Reno or SACK q Max Segment sizes: 100 B vs 1000 B q Buffer Size: 10 pkts vs 100 pkts q AQM: Drop tail vs RED vs REM q AQM parameters … q Workload: FTP q Configuration parameters: 3 flows vs 10 flows q Metrics: total throughput (efficiency), C.o.V of per-flow throughputs (fairness)

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 7 Terminology q Response Variable: Outcome. Eg: response time q Factors: Variables that affect the response variable q Eg: buffer size, RED parameters q Levels: The values that a factor can assume q Eg: TCP type has 2 levels: SACK or Reno q Primary Factors: The “interesting” factors whose effects need to be quantified. q Secondary Factors: Factors whose impact need not be quantified q Eg: We have excluded minor factors like receiver window size, or delay ack (whose effects we roughly know and/or consider minor). Some workload parameters also are secondary factors q Replication: Repetition of all or some experiments q Design: The number of experiments, the factor level, and number of replications for each experiment q Eg: Full Factorial Design with replications q Interaction: Effect of one factor depends upon other factors!

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 8 Experiment Design Problem q A system with k parameters and n i level for parameter x i and a performance metric f q f=s(x 1, x 2,…,x k ) can be evaluated by an experiment, I.e. simulation or measurement or analytical formula q Find the effect of each parameter or parameter interaction on the system performance, the best parameter combination, etc. System Parameters or Factors Metrics

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 9 Simple Experiment Design q Start with a typical configuration q Vary one parameter at a time q Number of experiments: q Disadvantage: only good for the system with no interaction between parameters q Wrong conclusions if the factors have interaction q Not statistically efficient

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 10 Full factorial Design q Try every possible combination of all the parameters. q Number of experiments q Disadvantage: too many experiments

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 11 Reduce experiment number?  Reduce the number of levels for each parameter, e.g., reduce n i to 2 levels for each parameter  Reduce the number of parameters  Use fractional factorial design: q Try a subset of the all possible parameter combinations  Less information  May not get all interactions  Not a problem if negligible interactions

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 12 2 k Full Factorial Design  Using a nonlinear regression model (assuming 2 parameters, x 1, x 2 ) y = q 0 + q 1 x 1 + q 2 x 2 + q 3 x 3 + q 12 x 1 x 2 + q 23 x 2 x 3 + q 13 x 1 x 3 + q 123 x 1 x 2 x 3 q Goal: Run 2 k experiment (e.g., take the two extreme values of each parameter) and solve the above equation for the regression parameters: q 0, q 1, q 2, q 12, q 23, q 13, q 123. q Disadvantage: only good for systems where effects of parameters or interactions are monotonous q i.e., the system has to be consistent with the model

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 13 2 k-p Fractional Factorial Design q Use a simplified regression model, ignore some interactions (especially high-order interactions since their effect are usually small) q For example, design: y=q 0 + q 1 x 1 + q 2 x 2 + q 3 x 3 All interactions are ignored, only 4 unknowns, only experiments are need to solve them

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 14 Simple Full Factorial Problem Qn: What is the effect of memory & cache on workstation performance ?

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 15 Underlying Regression Model q Interpretation: Mean performance = 40 MIPS q Effect of memory = 20 MIPS q Effect of cache = 10 MIPS q Interaction between memory and cache = 5 MIPS

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 16 Computation of Effects (I.e. coeffs)

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 17 Computation of Effects (cont’d) q Note: effects (q j ) are linear combinations of responses (y i ) q Sum of the coefficients is zero: aka “contrasts” q Note that the above equations can be expressed in terms of column A, B etc ! => Sign table method!

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 18 Sign Table Method q Each column multiplies the responses (y i ) ; q Sum the multiples, and q Divide the sums by 2 k

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 19 Allocation of Variation q A.k.a. which factors and alternatives matter! q Importance of a factor: proportion of variation explained by that factor q Note: variation is not variance!

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 20 Allocation of Variation: 2 2 design q Fractions explained by B and interaction between A & B can be obtained similarly

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 21 Allocation of Variation: Example 2 2 design Note: A is most impt factor!

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 22 General 2 k Factorial Designs EXAMPLE

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 23 2 k Factorial Design Example

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 24 Allocation of Variation q Factor C (I.e. number of processors) is the most important (accounts for 71% of performance variation!)

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 25 Exercise

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 26 Experiment Designer Package q Run 2 k full factorial or 2 (k-p) fractional factorial design q Executable: edesign –s q Experiment results are in designer.res edesign Configuration File Experiment Script Designer.res

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 27 Example Configuration File ScriptFile my_script.tcl LogOn y Designer Factorial P0 [parameters] #namemin max x1-2 2 x2-2 2 [/parameters]

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 28 Example Configuration File Farmer localhost:6666:farmer:any Worker localhost:worker [deployer]./get_effects.pl designer.res [/deployer] edesignfarmer worker … E E E R R R

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 29 Example Experiment Script #SETPARAMETERS return_result [expr $x1*$x1+$x2*$x2] edesign Original Experiment Script Generate a set of parameter values Experiment Script Farmer-worker System Return experiment result

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 30 Example Output(design.res) q Format of designer.res 1 st parameter, 2 nd parameter,…, metric q Example -1,-1,-1,-1, ,-1,-1,-1, ,1,-1,-1,1.001 q get-effects.pl will process this file and output the effect and variation percentage of each parameter (interaction).

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 31 Example get-effects.pl Output The parameter index represents: 1 x1 2 x2 Parameter(s) Effect Variation Percentage none % % %

Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 32 What you need to do? q Finish the tcl script q Open a terminal window, run start_system.sh q Open another terminal windows, run: edesign –s ex_ff.conf q Sit and wait to see the output of get-effects.pl q Any problem, ask Yong or me.