Design and Analysis of Experiments Dr. Tai-Yue Wang Department of Industrial and Information Management National Cheng Kung University Tainan, TAIWAN, ROC

Analysis of Variance

Outline (1/2)
Example
The ANOVA
Analysis of the Fixed Effects Model
Model Adequacy Checking
Practical Interpretation of Results
Determining Sample Size

Outline (2/2)
Discovering Dispersion Effects
Nonparametric Methods in the ANOVA

What If There Are More Than Two Factor Levels? The t-test does not directly apply. There are lots of practical situations where there are either more than two levels of interest, or several factors of simultaneous interest. The ANalysis Of VAriance (ANOVA) is the appropriate analysis "engine" for these types of experiments. The ANOVA was developed by Fisher in the early 1920s and was initially applied to agricultural experiments. It is used extensively today for industrial experiments.

An Example (1/6)

An Example (2/6) An engineer is interested in investigating the relationship between the RF power setting and the etch rate for a plasma etching tool. The objective of an experiment like this is to model the relationship between etch rate and RF power, and to specify the power setting that will give a desired target etch rate. The response variable is etch rate.

An Example (3/6) She is interested in a particular gas (C2F6) and gap (0.80 cm), and wants to test four levels of RF power: 160 W, 180 W, 200 W, and 220 W. She decides to test five wafers at each level of RF power, so the experiment is replicated 5 times, with the 20 runs made in random order.

An Example – Data (data table shown on the slide)

An Example – Data Plot (scatter plot of etch rate versus RF power shown on the slide)

An Example – Questions Does changing the power change the mean etch rate? Is there an optimum level for power? We would like to have an objective way to answer these questions. The t-test really doesn't apply here – more than two factor levels.

The Analysis of Variance In general, there will be a levels of the factor, or a treatments, and n replicates of the experiment, run in random order…a completely randomized design (CRD) with a total of N = an runs. We consider the fixed effects case…the random effects case will be discussed later. The objective is to test hypotheses about the equality of the a treatment means.
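As a sketch of how a randomized run order for this CRD might be generated (the power levels come from the example; the seed is an arbitrary choice, used only for reproducibility):

```python
import random

# Four RF-power levels (watts) from the example, five replicates each
levels = [160, 180, 200, 220]
n = 5

# Build the list of N = a*n = 20 runs, then randomize the run order (CRD)
runs = [lvl for lvl in levels for _ in range(n)]
random.seed(1)   # arbitrary seed, only so the order is reproducible
random.shuffle(runs)
print(runs)      # the randomized test sequence
```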

The Analysis of Variance

The Analysis of Variance The name "analysis of variance" stems from a partitioning of the total variability in the response variable into components that are consistent with a model for the experiment.

The Analysis of Variance The basic single-factor ANOVA model is y_ij = μ + τ_i + ε_ij, i = 1, 2, …, a; j = 1, 2, …, n, where μ is the overall mean, τ_i is the ith treatment effect, and ε_ij is the random error.

Models for the Data There are several ways to write a model for the data. Means model: y_ij = μ_i + ε_ij, where μ_i = μ + τ_i. Also known as the one-way or single-factor ANOVA model.

Models for the Data Fixed or random factor? The a treatments could have been specifically chosen by the experimenter. In this case, the results may apply only to the levels considered in the analysis.  fixed effects model

Models for the Data The a treatments could be a random sample from a larger population of treatments. In this case, we should be able to extend the conclusions to all treatments in the population.  random effects model

Analysis of the Fixed Effects Model Recall the single-factor ANOVA for the fixed effects model. Define y_i. = Σ_j y_ij and ȳ_i. = y_i./n as the total and average of the observations under the ith treatment, and y.. and ȳ.. = y../N as the grand total and grand average.

Analysis of the Fixed Effects Model Hypotheses: H0: μ1 = μ2 = … = μa versus H1: μi ≠ μj for at least one pair (i, j)

Analysis of the Fixed Effects Model Thus, the equivalent hypotheses: H0: τ1 = τ2 = … = τa = 0 versus H1: τi ≠ 0 for at least one i

Analysis of the Fixed Effects Model – Decomposition Total variability is measured by the total sum of squares: SS_T = Σ_i Σ_j (y_ij − ȳ..)². The basic ANOVA partitioning is SS_T = SS_Treatments + SS_E.

Analysis of the Fixed Effects Model – Decomposition In detail, Σ_i Σ_j (y_ij − ȳ..)² = n Σ_i (ȳ_i. − ȳ..)² + Σ_i Σ_j (y_ij − ȳ_i.)², since the cross-product term Σ_i Σ_j (ȳ_i. − ȳ..)(y_ij − ȳ_i.) = 0.

Analysis of the Fixed Effects Model – Decomposition Thus SS_T = SS_Treatments + SS_E, where SS_Treatments is the sum of squares due to treatments (between treatments) and SS_E is the sum of squares due to error (within treatments).
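To make the partition concrete, here is a small sketch that verifies SS_T = SS_Treatments + SS_E numerically. The etch-rate values are the ones commonly tabulated for this example, so treat them as illustrative:

```python
# Etch rate (Å/min) at each RF power level; values as commonly
# tabulated for this example -- treat them as illustrative.
data = {160: [575, 542, 530, 539, 570],
        180: [565, 593, 590, 579, 610],
        200: [600, 651, 610, 637, 629],
        220: [725, 700, 715, 685, 710]}

all_y = [y for ys in data.values() for y in ys]
N = len(all_y)
grand = sum(all_y) / N                       # grand average

# Total, treatment (between), and error (within) sums of squares
ss_t = sum((y - grand) ** 2 for y in all_y)
ss_treat = sum(len(ys) * ((sum(ys) / len(ys)) - grand) ** 2
               for ys in data.values())
ss_e = sum((y - sum(ys) / len(ys)) ** 2
           for ys in data.values() for y in ys)

print(ss_t, ss_treat, ss_e)   # the partition: ss_t == ss_treat + ss_e
```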

Analysis of the Fixed Effects Model – Decomposition A large value of SS_Treatments reflects large differences in treatment means; a small value of SS_Treatments likely indicates no differences in treatment means. Formal statistical hypotheses are H0: μ1 = μ2 = … = μa versus H1: at least one mean differs.

Analysis of the Fixed Effects Model – Decomposition For SS_E, recall SS_E = Σ_i Σ_j (y_ij − ȳ_i.)² = Σ_i (n − 1)S_i², where S_i² is the sample variance in the ith treatment.

Analysis of the Fixed Effects Model – Decomposition Combining the a sample variances, SS_E/(N − a) = Σ_i (n − 1)S_i² / Σ_i (n − 1) is a pooled estimate of the common variance within each of the a treatments.

Analysis of the Fixed Effects Model – Mean Squares Define MS_Treatments = SS_Treatments/(a − 1) and MS_E = SS_E/(N − a), dividing each sum of squares by its degrees of freedom.

Analysis of the Fixed Effects Model – Mean Squares Taking expectations, E(MS_E) = σ² and E(MS_Treatments) = σ² + n Σ_i τ_i²/(a − 1). That is, MS_E estimates σ²; if there are no differences in treatment means, MS_Treatments also estimates σ².

Analysis of the Fixed Effects Model – Statistical Analysis Cochran's Theorem: Let Z_i be NID(0, 1) for i = 1, 2, …, ν and suppose Σ_i Z_i² = Q1 + Q2 + … + Qs, where s ≤ ν and Q_i has ν_i degrees of freedom. Then Q1, Q2, …, Qs are independent chi-square random variables with ν1, ν2, …, νs degrees of freedom, respectively, if and only if ν = ν1 + ν2 + … + νs.

Analysis of the Fixed Effects Model – Statistical Analysis Cochran's Theorem implies that SS_Treatments/σ² and SS_E/σ² are independently distributed chi-square random variables. Thus, if the null hypothesis is true, the ratio F0 = MS_Treatments/MS_E is distributed as F with a − 1 and N − a degrees of freedom.


Analysis of the Fixed Effects Model – Summary Table The reference distribution for F0 is the F distribution with a − 1 and a(n − 1) degrees of freedom. Reject the null hypothesis (equal treatment means) if F0 > F(α; a − 1, a(n − 1)).
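A sketch of the F test on the etch-rate data (values as commonly tabulated for this example); the critical value F(0.05; 3, 16) = 3.24 is taken from tables:

```python
# Etch-rate data as commonly tabulated for this example (illustrative)
data = {160: [575, 542, 530, 539, 570],
        180: [565, 593, 590, 579, 610],
        200: [600, 651, 610, 637, 629],
        220: [725, 700, 715, 685, 710]}

a = len(data)
N = sum(len(ys) for ys in data.values())
grand = sum(y for ys in data.values() for y in ys) / N

ss_treat = sum(len(ys) * ((sum(ys) / len(ys)) - grand) ** 2
               for ys in data.values())
ss_e = sum((y - sum(ys) / len(ys)) ** 2
           for ys in data.values() for y in ys)

ms_treat = ss_treat / (a - 1)   # treatment mean square, a-1 = 3 df
ms_e = ss_e / (N - a)           # error mean square, N-a = 16 df
f0 = ms_treat / ms_e

F_CRIT = 3.24                   # F(0.05; 3, 16), from tables
reject = f0 > F_CRIT
print(f0, reject)
```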

Analysis of the Fixed Effects Model – Example Recall the etch rate example. Hypotheses: H0: μ160 = μ180 = μ200 = μ220 versus H1: at least one mean differs.

Analysis of the Fixed Effects Model – Example ANOVA table:

Source of Variation     SS        df   MS         F0
RF Power              66,870.55    3   22,290.18  66.80
Error                  5,339.20   16      333.70
Total                 72,209.75   19

Analysis of the Fixed Effects Model – Example Rejection region: reject H0 if F0 > F(0.05; 3, 16) = 3.24. Since F0 = 66.80 > 3.24, reject H0.

Analysis of the Fixed Effects Model – Example P-value: the P-value corresponding to F0 = 66.80 on 3 and 16 degrees of freedom is far smaller than 0.0001, so the result is highly significant.

Analysis of the Fixed Effects Model – Example Minitab:

One-way ANOVA: Etch Rate versus Power

Source  DF     SS     MS      F      P
Power    3  66871  22290  66.80  0.000
Error   16   5339    334
Total   19  72210

S = 18.27   R-Sq = 92.61%   R-Sq(adj) = 91.22%

                      Individual 95% CIs For Mean Based on Pooled StDev
Level  N    Mean  StDev
160    5  551.20  20.02  (--*---)
180    5  587.40  16.74       (--*---)
200    5  625.40  20.53             (--*---)
220    5  707.00  15.25                          (--*---)

Pooled StDev = 18.27

Analysis of the Fixed Effects Model – Statistical Analysis Coding the observations (e.g., subtracting a constant or rescaling) will not change the results. Without the normality assumption, the ANOVA F test can be viewed as an approximation to the randomization test.

Analysis of the Fixed Effects Model – Estimation of the Model Parameters Reasonable estimates of the overall mean and the treatment effects for the single-factor model are μ̂ = ȳ.. and τ̂_i = ȳ_i. − ȳ.., i = 1, 2, …, a.

Analysis of the Fixed Effects Model – Estimation of the Model Parameters Confidence interval for μ_i: ȳ_i. ± t(α/2; N − a) √(MS_E/n). Confidence interval for μ_i − μ_j: ȳ_i. − ȳ_j. ± t(α/2; N − a) √(2MS_E/n).
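A sketch of these intervals for the 220 W mean and the 220 W − 160 W difference, using MS_E = 333.70, n = 5, the table value t(0.025; 16) = 2.120, and the treatment means as commonly tabulated for this example:

```python
import math

ms_e, n = 333.70, 5
t = 2.120                        # t(0.025; 16), from tables
ybar_220, ybar_160 = 707.0, 551.2   # example means (illustrative)

# 95% CI for a single treatment mean
half = t * math.sqrt(ms_e / n)
ci_220 = (ybar_220 - half, ybar_220 + half)

# 95% CI for the difference of two treatment means
half_d = t * math.sqrt(2 * ms_e / n)
ci_diff = (ybar_220 - ybar_160 - half_d, ybar_220 - ybar_160 + half_d)

print(ci_220, ci_diff)
```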

Analysis of the Fixed Effects Model – Unbalanced Data For unbalanced data (n_i observations under treatment i, N = Σ_i n_i), the sums of squares become SS_T = Σ_i Σ_j y_ij² − y..²/N and SS_Treatments = Σ_i y_i.²/n_i − y..²/N.

A little (very little) humor… (cartoon shown on the slide)

Model Adequacy Checking Assumptions on the model: errors are normally and independently distributed with mean zero and constant but unknown variance σ². Define the residual e_ij = y_ij − ŷ_ij, where ŷ_ij = ȳ_i. is the estimate of y_ij.

Model Adequacy Checking – Normality Normal probability plot (shown on the slide)

Model Adequacy Checking – Normality Four-in-one residual plots (shown on the slide)

Model Adequacy Checking – Plot of Residuals in Time Sequence Residuals vs run order (shown on the slide)

Model Adequacy Checking – Residuals vs Fitted Residuals vs fitted values (shown on the slide)

Model Adequacy Checking – Residuals vs Fitted Defect patterns to look for: horn (funnel) shape, moon shape. Test for equal variances: Bartlett's test.

Model Adequacy Checking – Test for Equal Variances Bartlett's test:

Test for Equal Variances: Etch Rate versus Power

95% Bonferroni confidence intervals for standard deviations
(interval columns lost in extraction)

Bartlett's Test (Normal Distribution)
Test statistic = 0.43, p-value = 0.934

Levene's Test (Any Continuous Distribution)
Test statistic = 0.20, p-value = 0.898
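Bartlett's statistic can be computed directly from the four sample variances. A sketch on the etch-rate data (values as commonly tabulated for this example), comparing against the table value χ²(0.05; 3) = 7.81:

```python
import math

# Etch-rate data as commonly tabulated for this example (illustrative)
data = {160: [575, 542, 530, 539, 570],
        180: [565, 593, 590, 579, 610],
        200: [600, 651, 610, 637, 629],
        220: [725, 700, 715, 685, 710]}

a = len(data)
n_i = [len(ys) for ys in data.values()]
N = sum(n_i)

def svar(ys):                         # sample variance S_i^2
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys) / (len(ys) - 1)

s2 = [svar(ys) for ys in data.values()]
sp2 = sum((k - 1) * v for k, v in zip(n_i, s2)) / (N - a)   # pooled

q = (N - a) * math.log(sp2) - sum((k - 1) * math.log(v)
                                  for k, v in zip(n_i, s2))
c = 1 + (1 / (3 * (a - 1))) * (sum(1 / (k - 1) for k in n_i)
                               - 1 / (N - a))
chi2_0 = q / c                        # Bartlett's test statistic

CHI2_CRIT = 7.81                      # chi-square(0.05; 3), from tables
print(chi2_0, chi2_0 > CHI2_CRIT)     # small statistic: variances look equal
```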

Model Adequacy Checking – Test for Equal Variances Bartlett's test (graphical output shown on the slide)

Model Adequacy Checking – Variance-Stabilizing Transformations To deal with nonconstant variance: if observations follow a Poisson distribution  square root transformation; if observations follow a lognormal distribution  logarithmic transformation.

Model Adequacy Checking – Variance-Stabilizing Transformations If observations are binomial data  arcsine transformation. For other transformations  examine the relationship between the treatment standard deviations and the treatment means.
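A minimal sketch of the three transformations on made-up values (the numbers below are purely illustrative, not from the example):

```python
import math

counts = [4, 9, 16, 25]           # hypothetical Poisson-type counts
skewed = [1.2, 3.5, 10.8, 33.1]   # hypothetical lognormal-type values
props = [0.10, 0.50, 0.90]        # hypothetical binomial proportions

sqrt_t = [math.sqrt(y) for y in counts]              # square root
log_t = [math.log(y) for y in skewed]                # logarithmic
arcsin_t = [math.asin(math.sqrt(p)) for p in props]  # arcsine (angular)

print(sqrt_t, log_t, arcsin_t)
```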

Practical Interpretation of Results – Regression Model The one-way ANOVA model is a regression model, similar to y = β0 + β1x + ε.

Practical Interpretation of Results – Regression Model Computer results:

Regression Analysis: Etch Rate versus Power

The regression equation is
Etch Rate = 137.62 + 2.527 Power

(coefficient table, regression ANOVA, and unusual-observation columns lost in extraction)

R-Sq = 88.4%   R-Sq(adj) = 87.8%

R denotes an observation with a large standardized residual.

The Regression Model (fitted line plot shown on the slide)

Practical Interpretation of Results – Comparison of Means The analysis of variance tests the hypothesis of equal treatment means. Assume that residual analysis is satisfactory. If that hypothesis is rejected, we don't know which specific means are different. Determining which specific means differ following an ANOVA is called the multiple comparisons problem.

Practical Interpretation of Results – Comparison of Means There are lots of ways to do this. We will use pairwise t-tests on means…sometimes called Fisher's Least Significant Difference (or Fisher's LSD) method.
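A sketch of Fisher's LSD on the example means (MS_E = 333.70, n = 5 per level; t(0.025; 16) = 2.120 from tables; means as commonly tabulated for this example):

```python
import math
from itertools import combinations

means = {160: 551.2, 180: 587.4, 200: 625.4, 220: 707.0}  # illustrative
ms_e, n = 333.70, 5
t = 2.120                                   # t(0.025; 16), from tables

lsd = t * math.sqrt(2 * ms_e / n)           # least significant difference

# A pair of means differs significantly if |difference| > LSD
significant = {(i, j): abs(means[i] - means[j]) > lsd
               for i, j in combinations(sorted(means), 2)}
print(round(lsd, 2), significant)
```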

Practical Interpretation of Results – Comparison of Means

Fisher 95% Individual Confidence Intervals
All Pairwise Comparisons among Levels of Power
Simultaneous confidence level = 81.11%

Power = 160 subtracted from:
Power    Lower   Center    Upper
180      11.71    36.20    60.69  (--*-)
200      49.71    74.20    98.69  (-*--)
220     131.31   155.80   180.29  (--*-)

Practical Interpretation of Results – Comparison of Means

Power = 180 subtracted from:
Power    Lower   Center    Upper
200      13.51    38.00    62.49  (--*-)
220      95.11   119.60   144.09  (-*-)

Power = 200 subtracted from:
Power    Lower   Center    Upper
220      57.11    81.60   106.09  (-*--)

Practical Interpretation of Results – Graphical Comparison of Means (plot shown on the slide)

Practical Interpretation of Results – Contrasts A contrast is a linear combination of parameters Γ = Σ_i c_i μ_i with Σ_i c_i = 0. So the hypothesis becomes H0: Γ = 0 versus H1: Γ ≠ 0.

Practical Interpretation of Results – Contrasts Examples: H0: μ1 = μ2 corresponds to c = (1, −1, 0, 0); H0: μ1 + μ2 = μ3 + μ4 corresponds to c = (1, 1, −1, −1).

Practical Interpretation of Results – Contrasts Testing with a t-test. Contrast average: C = Σ_i c_i ȳ_i.. Contrast variance: V(C) = (σ²/n) Σ_i c_i². Test statistic: t0 = Σ_i c_i ȳ_i. / √((MS_E/n) Σ_i c_i²).

Practical Interpretation of Results – Contrasts Testing with an F-test. Test statistic: F0 = t0² = SS_C/MS_E, where the contrast sum of squares is SS_C = (Σ_i c_i ȳ_i.)² / ((1/n) Σ_i c_i²).

Practical Interpretation of Results – Contrasts Testing with an F-test: reject the hypothesis if F0 > F(α; 1, N − a).
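A sketch of a contrast F test on the example means, using the hypothetical contrast c = (1, 1, −1, −1) (low powers versus high powers); MS_E = 333.70, n = 5, and F(0.05; 1, 16) = 4.49 is a table value:

```python
means = [551.2, 587.4, 625.4, 707.0]   # 160, 180, 200, 220 W (illustrative)
c = [1, 1, -1, -1]                     # hypothetical contrast: low vs. high
ms_e, n = 333.70, 5

C = sum(ci * m for ci, m in zip(c, means))        # contrast estimate
ss_c = n * C ** 2 / sum(ci ** 2 for ci in c)      # contrast SS (1 df)
f0 = ss_c / ms_e

F_CRIT = 4.49                          # F(0.05; 1, 16), from tables
print(C, ss_c, f0 > F_CRIT)
```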

Practical Interpretation of Results – Contrasts Confidence interval: Σ_i c_i ȳ_i. ± t(α/2; N − a) √((MS_E/n) Σ_i c_i²).

Practical Interpretation of Results – Contrasts Standardized contrast: standardize the contrasts when more than one contrast is of interest, dividing the coefficients by √((1/n) Σ_i c_i²) so that each contrast is on a common scale.

Practical Interpretation of Results – Contrasts Unequal sample sizes: the contrast must satisfy Σ_i n_i c_i = 0, and the t-statistic becomes t0 = Σ_i c_i ȳ_i. / √(MS_E Σ_i (c_i²/n_i)).

Practical Interpretation of Results – Contrasts Unequal sample sizes: the contrast sum of squares is SS_C = (Σ_i c_i ȳ_i.)² / Σ_i (c_i²/n_i).

Practical Interpretation of Results – Orthogonal Contrasts Two contrasts with coefficients {c_i} and {d_i} are orthogonal if Σ_i c_i d_i = 0, or, for an unbalanced design, Σ_i n_i c_i d_i = 0.

Practical Interpretation of Results – Orthogonal Contrasts Why use orthogonal contrasts? For a treatments, a set of a − 1 orthogonal contrasts partitions the sum of squares due to treatments into a − 1 independent single-degree-of-freedom components, so tests performed on orthogonal contrasts are independent.
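A sketch using the standard orthogonal-polynomial contrasts for four equally spaced levels (linear, quadratic, cubic): their single-df sums of squares add up to SS_Treatments (means and sample size from the example as commonly tabulated):

```python
means = [551.2, 587.4, 625.4, 707.0]   # 160, 180, 200, 220 W (illustrative)
n = 5

# Orthogonal polynomial contrasts for 4 equally spaced levels
contrasts = {"linear":    [-3, -1,  1, 3],
             "quadratic": [ 1, -1, -1, 1],
             "cubic":     [-1,  3, -3, 1]}

# Pairwise orthogonality check: sum of coefficient products is zero
names = list(contrasts)
for u in range(len(names)):
    for v in range(u + 1, len(names)):
        dot = sum(a * b for a, b in zip(contrasts[names[u]],
                                        contrasts[names[v]]))
        assert dot == 0

# Single-degree-of-freedom sums of squares for each contrast
ss = {name: n * sum(ci * m for ci, m in zip(c, means)) ** 2
            / sum(ci ** 2 for ci in c)
      for name, c in contrasts.items()}
print(ss, sum(ss.values()))   # the three SS add up to SS_Treatments
```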

Practical Interpretation of Results – Orthogonal Contrasts Example (shown on the slide)

Practical Interpretation of Results – Orthogonal Contrasts Example for contrasts (shown on the slide)

Practical Interpretation of Results – Scheffé's Method for Comparing All Contrasts Scheffé's method allows comparing any and all possible contrasts between treatment means. Suppose a set of m contrasts in the treatment means, Γ_u = c_1u μ1 + c_2u μ2 + … + c_au μa, u = 1, 2, …, m, is of interest. The corresponding contrast in the treatment averages is C_u = c_1u ȳ1. + c_2u ȳ2. + … + c_au ȳa..

Practical Interpretation of Results – Scheffé's Method for Comparing All Contrasts The standard error of this contrast is S_Cu = √(MS_E Σ_i c_iu²/n_i). The critical value against which C_u should be compared is S_α,u = S_Cu √((a − 1) F(α; a − 1, N − a)). If |C_u| > S_α,u, the hypothesis that the contrast Γ_u equals zero is rejected.
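A sketch of Scheffé's critical value for the hypothetical contrast c = (1, 1, −1, −1) on the example means; F(0.05; 3, 16) = 3.24 is a table value:

```python
import math

means = [551.2, 587.4, 625.4, 707.0]   # illustrative example means
c = [1, 1, -1, -1]                     # hypothetical contrast of interest
ms_e, n, a = 333.70, 5, 4

C_u = sum(ci * m for ci, m in zip(c, means))
s_cu = math.sqrt(ms_e * sum(ci ** 2 / n for ci in c))   # std. error

F_CRIT = 3.24                          # F(0.05; 3, 16), from tables
s_alpha = s_cu * math.sqrt((a - 1) * F_CRIT)

reject = abs(C_u) > s_alpha            # Scheffe test of H0: contrast = 0
ci_low, ci_high = C_u - s_alpha, C_u + s_alpha
print(s_alpha, reject, (ci_low, ci_high))
```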

Practical Interpretation of Results – Scheffé's Method for Comparing All Contrasts The simultaneous confidence intervals with type I error α are C_u − S_α,u ≤ Γ_u ≤ C_u + S_α,u.

Practical Interpretation of Results – Example for Scheffé's Method Contrasts of interest and their numerical values (shown on the slide)

Practical Interpretation of Results – Example for Scheffé's Method Standard errors and one percent critical values are computed as above. Since |C1| > S(0.01, 1) and |C2| > S(0.01, 2), both contrast hypotheses should be rejected.

Practical Interpretation of Results – Comparing Pairs of Treatment Means Tukey's test; Fisher's least significant difference (LSD) method; Hsu's MCB method.

Practical Interpretation of Results – Comparing Pairs of Treatment Means: Computer Output The one-way ANOVA output (One-way ANOVA: Etch Rate versus Power; S = 18.27, R-Sq = 92.61%, R-Sq(adj) = 91.22%, Pooled StDev = 18.27) is repeated here for reference.

Practical Interpretation of Results – Comparing Pairs of Treatment Means: Computer Output

Grouping Information Using Tukey Method

Power  N    Mean  Grouping
220    5  707.00  A
200    5  625.40    B
180    5  587.40      C
160    5  551.20        D

Means that do not share a letter are significantly different.

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of Power
Individual confidence level = 98.87%

Power = 160 subtracted from:
Power    Lower   Center    Upper
180       3.11    36.20    69.29  (---*--)
200      41.11    74.20   107.29  (--*---)
220     122.71   155.80   188.89  (---*--)
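The Tukey intervals correspond to a half-width T = q(0.05; 4, 16) √(MS_E/n); a sketch with the studentized-range table value q = 4.05 and the example means as commonly tabulated:

```python
import math
from itertools import combinations

means = {160: 551.2, 180: 587.4, 200: 625.4, 220: 707.0}  # illustrative
ms_e, n = 333.70, 5
q = 4.05                               # q(0.05; 4, 16), from tables

T = q * math.sqrt(ms_e / n)            # Tukey half-width

# Pairs whose mean difference exceeds T are significantly different
significant = {(i, j): abs(means[i] - means[j]) > T
               for i, j in combinations(sorted(means), 2)}
print(round(T, 2), significant)        # all pairs differ: groups A, B, C, D
```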

Practical Interpretation of Results – Comparing Pairs of Treatment Means: Computer Output

Power = 180 subtracted from:
Power    Lower   Center    Upper
200       4.91    38.00    71.09  (---*--)
220      86.51   119.60   152.69  (--*--)

Power = 200 subtracted from:
Power    Lower   Center    Upper
220      48.51    81.60   114.69  (--*--)

Practical Interpretation of Results – Comparing Pairs of Treatment Means: Computer Output

Hsu's MCB (Multiple Comparisons with the Best)

Family error rate = 0.05
Critical value = 2.23

Intervals for level mean minus largest of other level means
(numeric interval columns lost in extraction)

Practical Interpretation of Results – Comparing Pairs of Treatment Means: Computer Output

Grouping Information Using Fisher Method

Power  N    Mean  Grouping
220    5  707.00  A
200    5  625.40    B
180    5  587.40      C
160    5  551.20        D

Means that do not share a letter are significantly different.

Fisher 95% Individual Confidence Intervals
All Pairwise Comparisons among Levels of Power
Simultaneous confidence level = 81.11%

Power = 160 subtracted from:
Power    Lower   Center    Upper
180      11.71    36.20    60.69  (--*-)
200      49.71    74.20    98.69  (-*--)
220     131.31   155.80   180.29  (--*-)

Practical Interpretation of Results – Comparing Pairs of Treatment Means: Computer Output

Power = 180 subtracted from:
Power    Lower   Center    Upper
200      13.51    38.00    62.49  (--*-)
220      95.11   119.60   144.09  (-*-)

Power = 200 subtracted from:
Power    Lower   Center    Upper
220      57.11    81.60   106.09  (-*--)

Practical Interpretation of Results – Comparing Treatment Means with a Control Dunnett's method. Control: the treatment to which the others are compared; there are a − 1 comparisons in total.

Dunnett's comparisons with a control

Family error rate = 0.05
Individual error rate = (value lost in extraction)
Critical value = 2.59

Control = level (160) of Power

Intervals for treatment mean minus control mean
Level    Lower   Center    Upper
180       6.28    36.20    66.12  (-----*-----)
200      44.28    74.20   104.12  (-----*-----)
220     125.88   155.80   185.72  (-----*-----)
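The Dunnett intervals correspond to a half-width d(0.05; 3, 16) √(2 MS_E/n); a sketch using the critical value 2.59 from the output and the example means as commonly tabulated:

```python
import math

means = {160: 551.2, 180: 587.4, 200: 625.4, 220: 707.0}  # illustrative
ms_e, n = 333.70, 5
d = 2.59                      # Dunnett critical value d(0.05; 3, 16)
control = 160                 # the control level

half = d * math.sqrt(2 * ms_e / n)
intervals = {lvl: (means[lvl] - means[control] - half,
                   means[lvl] - means[control] + half)
             for lvl in means if lvl != control}

# An interval excluding zero means the level differs from the control
differs = {lvl: lo > 0 or hi < 0 for lvl, (lo, hi) in intervals.items()}
print(round(half, 2), intervals, differs)
```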

Determining Sample Size – Minitab Stat  Power and Sample Size  One-way ANOVA: number of levels = 4, target power = 0.9, maximum difference = 75, α = 0.01, assumed standard deviation = 25.

One-way ANOVA

Alpha = 0.01   Assumed standard deviation = 25
Factors: 1   Number of levels: 4

Maximum     Sample  Target  Actual
Difference  Size    Power   Power
75          (values lost in extraction)

The sample size is for each level.

Dispersion Effects ANOVA is typically used to study location effects (differences in means). When different factor levels affect variability, we have dispersion effects. Example: the average and standard deviation are measured for a response variable.

Dispersion Effects ANOVA on the averages found no location effects. Transform the standard deviation to y = ln(s) and analyze the transformed response. (The table of observations for the ratio control algorithms is shown on the slide.)

Dispersion Effects ANOVA on y = ln(s) found dispersion effects:

One-way ANOVA: y = ln(s) versus Algorithm

Source     DF  SS  MS  F  P
Algorithm  (numeric columns lost in extraction)
Error
Total

R-Sq = 76.71%   R-Sq(adj) = 73.22%

Nonparametric Methods in the ANOVA When the normality assumption is invalid, use the Kruskal-Wallis test: rank the observations y_ij in ascending order and replace each observation by its rank R_ij; in case of ties, assign the average rank to each tied observation; then compute the test statistic.

Regression and ANOVA The regression output (Regression Analysis: Etch Rate versus Power; R-Sq = 88.4%, R-Sq(adj) = 87.8%) is repeated here for comparison with the ANOVA results.

Nonparametric Methods in the ANOVA Kruskal-Wallis test statistic: H = (1/S²) [ Σ_i R_i.²/n_i − N(N + 1)²/4 ], where R_i. is the sum of the ranks in the ith treatment and S² = (1/(N − 1)) [ Σ_i Σ_j R_ij² − N(N + 1)²/4 ]. With no ties, this reduces to H = (12/(N(N + 1))) Σ_i R_i.²/n_i − 3(N + 1).
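A sketch of the Kruskal-Wallis statistic on the etch-rate data (values as commonly tabulated for this example), using average ranks for ties and the table value χ²(0.05; 3) = 7.81:

```python
from collections import defaultdict

# Etch-rate data as commonly tabulated for this example (illustrative)
data = {160: [575, 542, 530, 539, 570],
        180: [565, 593, 590, 579, 610],
        200: [600, 651, 610, 637, 629],
        220: [725, 700, 715, 685, 710]}

vals = [y for ys in data.values() for y in ys]
N = len(vals)

# Average ranks: tied values get the mean of the ranks they occupy
positions = defaultdict(list)
for idx, v in enumerate(sorted(vals), start=1):
    positions[v].append(idx)
rank = {v: sum(p) / len(p) for v, p in positions.items()}

# Rank sums per treatment and the tie-corrected statistic
R = {g: sum(rank[y] for y in ys) for g, ys in data.items()}
s2 = (sum(rank[y] ** 2 for ys in data.values() for y in ys)
      - N * (N + 1) ** 2 / 4) / (N - 1)
H = (sum(R[g] ** 2 / len(data[g]) for g in data)
     - N * (N + 1) ** 2 / 4) / s2

CHI2_CRIT = 7.81              # chi-square(0.05; 3), from tables
print(round(H, 2), H > CHI2_CRIT)
```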

Nonparametric Methods in the ANOVA Kruskal-Wallis test: if H > χ²(α; a − 1), the null hypothesis is rejected.