Statistics CSE 807.

Experimental Design and Analysis
How to:
- Design a proper set of experiments for measurement or simulation.
- Develop a model that best describes the data obtained.
- Estimate the contribution of each alternative to the performance.
- Isolate the measurement errors.
- Estimate confidence intervals for model parameters.
- Check if the alternatives are significantly different.
- Check if the model is adequate.

Example: Personal Workstation Design
- Processor: 68000, Z80, or 8086.
- Memory size: 512 KB, 2 MB, or 8 MB.
- Number of disks: one, two, three, or four.
- Workload: secretarial, managerial, or scientific.
- User education: high school, college, or post-graduate level.

Terminology
- Response variable: the outcome, e.g., throughput or response time.
- Factors: variables that affect the response variable, e.g., CPU type, memory size, number of disk drives, workload used, and the user's educational level. Also called predictor variables or predictors.
- Levels: the values that a factor can assume, e.g., the CPU type has three levels (68000, Z80, or 8086) and the number of disk drives has four levels. Also called treatments.

Terminology (cont'd)
- Primary factors: the factors whose effects need to be quantified, e.g., only the CPU type, memory size, and number of disk drives.
- Secondary factors: factors whose impact need not be quantified, e.g., the workloads.
- Replication: repetition of all or some experiments.

Terminology (cont'd)
- Design: the number of experiments, the factor levels, and the number of replications for each experiment. E.g., a full factorial design with 5 replications: 3 x 3 x 4 x 3 x 3 = 324 experiments, each repeated five times.
- Experimental unit: any entity that is used for experiments, e.g., users. Generally there is no interest in comparing the units; the goal is to minimize the impact of variation among the units.

Terminology (cont'd)
- Interaction: the effect of one factor depends upon the level of the other. (The original slide illustrates this with small tables of non-interacting and interacting factors.)
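As an illustration with made-up numbers (not from the original slides): in the non-interacting case, moving from A1 to A2 changes the response by the same amount at both levels of B; in the interacting case, the size of the change depends on B.

    Non-interacting factors        Interacting factors
             B1    B2                       B1    B2
      A1      3     5                A1      3     5
      A2      6     8                A2      6     9

    (A1 -> A2 adds 3 at both       (A1 -> A2 adds 3 at B1
     levels of B)                   but 4 at B2)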

Common Mistakes in Experimentation
1. The variation due to experimental error is ignored.
2. Important parameters are not controlled.
3. Effects of different factors are not isolated.
4. Simple one-factor-at-a-time designs are used.
5. Interactions are ignored.
6. Too many experiments are conducted. Better: conduct the study in two phases.

Types of Experimental Designs
Simple designs: vary one factor at a time.
# of experiments = 1 + Σ_{i=1..k} (n_i − 1), where n_i is the number of levels of factor i.
Not statistically efficient. Wrong conclusions if the factors have interaction. Not recommended.
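As a worked check using the earlier workstation example (and assuming the count formula above), a one-factor-at-a-time design needs 1 + (3−1) + (3−1) + (4−1) + (3−1) + (3−1) = 12 experiments, compared with 324 for the full factorial design.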

Types of Experimental Designs (cont'd)
Full factorial design: all combinations of all levels of all factors.
# of experiments = n_1 × n_2 × ... × n_k, the product of the number of levels of each factor.
Can find the effect of all factors, but takes too much time and money. May try a 2^k design first.

Types of Experimental Designs (cont'd)
Fractional factorial designs: use only a fraction of the full factorial combinations. Save time and expense, but give less information and may not capture all interactions. Not a problem if the interactions are negligible.

A Sample Fractional Factorial Design.

Exercise
The performance of a system being designed depends upon the following three factors:
a. CPU type: 68000, 8086, 80286
b. Operating system type: CPM, MS-DOS, UNIX
c. Disk drive type: A, B, C
How many experiments are required to analyze the performance if:
a. there is significant interaction among the factors?
b. there is no interaction among the factors?
c. the interactions are small compared to the main effects?

2^k Factorial Designs
k factors, each at two levels. Easy to analyze. Helps in sorting out the impact of factors. Good at the beginning of a study. Valid only if the effect of each factor is unidirectional, e.g., memory size or the number of disk drives.

2^2 Factorial Design
Two factors, each at two levels. Performance in MIPS:

                    Memory Size
  Cache Size      4 MB      16 MB
  1 KB             15         45
  2 KB             25         75

xA = -1 if 4 MB memory,  1 if 16 MB memory
xB = -1 if 1 KB cache,   1 if 2 KB cache

Model
y = q0 + qA·xA + qB·xB + qAB·xA·xB
Substituting the four observations:
15 = q0 - qA - qB + qAB
45 = q0 + qA - qB - qAB
25 = q0 - qA + qB - qAB
75 = q0 + qA + qB + qAB
Solving: y = 40 + 20·xA + 10·xB + 5·xA·xB
Interpretation: mean performance = 40 MIPS; effect of memory = 20 MIPS; effect of cache = 10 MIPS; interaction between memory and cache = 5 MIPS.

Computation of Effects
Model: y = q0 + qA·xA + qB·xB + qAB·xA·xB
Substitution gives four equations in four unknowns:
y1 = q0 - qA - qB + qAB
y2 = q0 + qA - qB - qAB
y3 = q0 - qA + qB - qAB
y4 = q0 + qA + qB + qAB

Computation of Effects (cont'd)
Solution:
q0  = 1/4 ( y1 + y2 + y3 + y4)
qA  = 1/4 (-y1 + y2 - y3 + y4)
qB  = 1/4 (-y1 - y2 + y3 + y4)
qAB = 1/4 ( y1 - y2 - y3 + y4)
Notice that the effects are linear combinations of the responses, and the sum of the coefficients in each combination is zero => the combinations are contrasts.
Notice also:
qA  = 1/4 (Column A · Column y)
qB  = 1/4 (Column B · Column y)
qAB = 1/4 (Column A × Column B · Column y)

Sign Table Method

  I    A    B   AB      y
  1   -1   -1    1     15
  1    1   -1   -1     45
  1   -1    1   -1     25
  1    1    1    1     75

Multiply each sign column by the y column, sum, and divide by 4 to obtain q0, qA, qB, and qAB.
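A minimal Python sketch of the sign-table computation (not part of the original slides), using the memory-cache data above:

```python
# Sign-table method for a 2^2 design (memory-cache example).
# Rows follow the order (xA, xB) = (-1,-1), (+1,-1), (-1,+1), (+1,+1).
y = [15, 45, 25, 75]            # observed performance in MIPS
A = [-1, 1, -1, 1]              # xA: memory size (-1 = 4 MB, +1 = 16 MB)
B = [-1, -1, 1, 1]              # xB: cache size  (-1 = 1 KB, +1 = 2 KB)

def effect(signs, y):
    """Multiply a sign column with the y column, sum, and divide by 2^2."""
    return sum(s * yi for s, yi in zip(signs, y)) / len(y)

q0  = effect([1, 1, 1, 1], y)                    # 40.0
qA  = effect(A, y)                               # 20.0
qB  = effect(B, y)                               # 10.0
qAB = effect([a * b for a, b in zip(A, B)], y)   # 5.0
print(q0, qA, qB, qAB)
```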

Allocation of Variation
Importance of a factor = proportion of the total variation that it explains.
Sample variance of y: s_y^2 = Σ (y_i − ȳ)^2 / (2^2 − 1)
Variation of y = Σ (y_i − ȳ)^2, i.e., the numerator of the variance = total sum of squares (SST).

Allocation of Variation (cont'd)
For a 2^2 design:
SST = 2^2·qA^2 + 2^2·qB^2 + 2^2·qAB^2
Variation due to A: SSA = 4·qA^2; variation due to B: SSB = 4·qB^2; variation due to the interaction: SSAB = 4·qAB^2.
SST = SSA + SSB + SSAB
Fraction of variation explained by A = SSA / SST. Note: variation is not the same as variance.

Derivation
Model: y_i = q0 + qA·xAi + qB·xBi + qAB·xAi·xBi
Notice:
1. The sum of the entries in each column is zero: Σ xAi = Σ xBi = Σ xAi·xBi = 0
2. The sum of the squares of the entries in each column is 4: Σ xAi^2 = Σ xBi^2 = Σ (xAi·xBi)^2 = 4

Derivation (cont'd)
3. The columns are orthogonal (the inner product of any two columns is zero):
Σ xAi·xBi = 0,  Σ xAi·(xAi·xBi) = 0,  Σ xBi·(xAi·xBi) = 0

Derivation (cont'd)
Sample mean: ȳ = 1/4 Σ y_i = q0, since the xA, xB, and xA·xB columns each sum to zero.

Derivation (cont'd)
Variation of y: Σ (y_i − ȳ)^2 = Σ (qA·xAi + qB·xBi + qAB·xAi·xBi)^2 = 4·qA^2 + 4·qB^2 + 4·qAB^2
The cross-product terms vanish because the columns are orthogonal.

Example: Memory-Cache Study
Total variation = Σ (y_i − ȳ)^2 = 2100
Variation due to memory = 1600 (76%)
Variation due to cache = 400 (19%)
Variation due to the interaction = 100 (5%)
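These numbers follow directly from the effects computed earlier: SSA = 4·(20)^2 = 1600, SSB = 4·(10)^2 = 400, SSAB = 4·(5)^2 = 100, so SST = 1600 + 400 + 100 = 2100, and the fractions are 1600/2100 ≈ 76%, 400/2100 ≈ 19%, and 100/2100 ≈ 5%.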

Case Study: Interconnection Networks
Memory interconnection networks: Omega and Crossbar.
Memory reference patterns: random and matrix.
Fixed factors:
1. The number of processors was fixed at 16.
2. Queued requests were blocked rather than buffered.
3. Circuit switching instead of packet switching.
4. Random arbitration instead of round robin.
5. Infinite interleaving of memory => no memory bank contention.

2^2 Design for Interconnection Networks
Factors used in the interconnection network study: A = network type (Crossbar or Omega) and B = memory reference pattern (random or matrix), each at two levels, with the response variables recorded for each combination.

Interconnection Network Study (cont'd)

  Parameter |        Mean Estimate         |   Variation Explained
            |     T        N        R      |     T       N       R
  q0        |  0.5725     3.5      1.871   |
  qA        |  0.0595    -0.5     -0.145   |  17.2%     20%    10.9%
  qB        | -0.1257     1.0      0.413   |  77.0%     80%    87.8%
  qAB       | -0.0346     0.0      0.051   |   5.8%      0%     1.3%
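As a check, the variation-explained columns follow from the effect estimates: for the throughput T, the fraction explained by A is qA^2 / (qA^2 + qB^2 + qAB^2) = 0.0595^2 / (0.0595^2 + 0.1257^2 + 0.0346^2) ≈ 0.0035 / 0.0205 ≈ 17.2%, with 77.0% for B and 5.8% for AB computed the same way.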

Interpretation of Results
Average throughput = 0.5725.
Most effective factor: B, the reference pattern => the address patterns chosen are very different. The reference-pattern effect is -0.1257 and explains 77% of the variation.
Effect of network type = 0.0595: Omega networks = average + 0.0595, Crossbar networks = average - 0.0595, so the difference between the two = 0.119.
Slight interaction (-0.0346) between reference pattern and network type.

General 2^k Factorial Designs
k factors, each at two levels. 2^k experiments and 2^k effects:
- k main effects
- C(k,2) two-factor interactions
- C(k,3) three-factor interactions
- ...
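For example, with k = 3 there are 2^3 = 8 experiments and 8 effects: the mean q0, 3 main effects, C(3,2) = 3 two-factor interactions, and 1 three-factor interaction.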

2^k Design Example
Three factors in designing a machine: cache size, memory size, and number of processors.

2^k Design Example (cont'd)

  Cache Size   Processors   Memory 4 MB   Memory 16 MB
  1 KB         1 Proc            14            10
  1 KB         2 Proc            46            50
  2 KB         1 Proc            22            34
  2 KB         2 Proc            58            86

Analysis
Fraction of variation explained: 18% (A: cache size) + 4% (B: memory size) + 71% (C: number of processors) + 4% (AB) + 1% (AC) + 2% (BC) + 0% (ABC) = 100%.
The number of processors (C) is the most important factor.
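A minimal Python sketch (not from the original slides) that reproduces this analysis from the table above; the sign coding of the factor levels is an assumption:

```python
from itertools import product

# 2^3 cache/memory/processor example; observations keyed by (xA, xB, xC), where
# A = cache size (-1 = 1 KB, +1 = 2 KB), B = memory size (-1 = 4 MB, +1 = 16 MB),
# C = number of processors (-1 = 1, +1 = 2). Values taken from the table above.
y = {(-1, -1, -1): 14, (-1, +1, -1): 10, (-1, -1, +1): 46, (-1, +1, +1): 50,
     (+1, -1, -1): 22, (+1, +1, -1): 34, (+1, -1, +1): 58, (+1, +1, +1): 86}

effects = {}
for sa, sb, sc in product((0, 1), repeat=3):      # which factors enter this effect
    name = "".join(n for n, s in zip("ABC", (sa, sb, sc)) if s) or "I"
    total = sum(val * (a**sa) * (b**sb) * (c**sc) for (a, b, c), val in y.items())
    effects[name] = total / 8                     # divide by 2^3

# Allocation of variation: SS_x = 2^3 * q_x^2, excluding the mean term q0.
sst = 8 * sum(q**2 for n, q in effects.items() if n != "I")
for name, q in effects.items():
    if name != "I":
        print(f"q{name} = {q:5.1f}   explains {100 * 8 * q**2 / sst:4.1f}% of variation")
```

Running this prints qA = 10, qB = 5, qC = 20, qAB = 5, qAC = 2, qBC = 3, qABC = 1, with explained fractions of roughly 18%, 4%, 71%, 4%, 1%, 2%, and 0%, matching the slide.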

Exercise
Analyze the following 2^3 design:

               A1               A2
             B1    B2         B1    B2
  C1        100    40        120    20
  C2         15    30         10    50

a. Quantify the main effects and all interactions.
b. Quantify the percentages of variation explained.
c. Sort the variables in order of decreasing importance.