1 Chapter 5.8 What if We Have More Than Two Samples?


2 Example. A manufacturer of paper used for making grocery bags is interested in improving the tensile strength of the product. Product engineering thinks that tensile strength is a function of the hardwood concentration in the pulp and that the range of hardwood concentrations of practical interest is between 5% and 20%. A team of engineers responsible for the study decides to investigate four levels of hardwood concentration: 5%, 10%, 15%, and 20%. They decide to make up six test specimens at each concentration level, using a pilot plant. All 24 specimens are tested on a laboratory tensile tester, in random order. The data from this experiment are as follows.

3 Factor: hardwood concentration. Factor levels (treatments): 4 (5%, 10%, 15%, 20%). Replicates: 6 for each level. Randomization: all 4 × 6 = 24 runs are performed in random order (completely randomized design). This is a single-factor layout with 4 levels. Terminology: single-factor ANOVA, or one-way ANOVA (analysis of variance).
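The randomization step can be sketched in a few lines of Python. This is a minimal illustration, not the procedure the engineers used: the function name `crd_run_order` and the fixed seed are assumptions made here so the example is reproducible.

```python
import random

def crd_run_order(levels, replicates, seed=None):
    """Return a randomized run order for a completely randomized design."""
    # One entry per run: each level appears `replicates` times.
    runs = [level for level in levels for _ in range(replicates)]
    rng = random.Random(seed)  # seed is only for reproducibility (assumption)
    rng.shuffle(runs)          # every ordering of the 24 runs is equally likely
    return runs

# Four hardwood concentrations (%), six specimens each, as on the slide.
order = crd_run_order([5, 10, 15, 20], replicates=6, seed=1)
print(order)
```

Shuffling the full list of 24 runs, rather than shuffling within each level, is what makes the design completely randomized.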

4

5 Notation. $a$: number of levels of the single factor. $y_{ij}$: the $j$th observation taken under factor level $i$. $n_i$: number of replicates for factor level $i$. If $n_1 = n_2 = \cdots = n_a = n$, the design is “balanced”; otherwise it is “unbalanced”.

6 The runs are made in random order: a completely randomized design (CRD). For a balanced design there are $N = an$ total runs. Objective: test hypotheses about the equality of the $a$ treatment means.

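7 Model: $Y_{ij} = \mu + \tau_i + \epsilon_{ij}$, $i = 1, \ldots, a$, $j = 1, \ldots, n$, where $Y_{ij}$ is the response variable, $\mu_i = \mu + \tau_i$ is the $i$th treatment mean, and $\epsilon_{ij} \sim \text{iid } N(0, \sigma^2)$. Assumptions: 1) normality, 2) constant variance, 3) independence.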

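8 Hypotheses. The formal statistical hypotheses are $H_0: \mu_1 = \mu_2 = \cdots = \mu_a$ versus $H_1: \mu_i \neq \mu_j$ for at least one pair $(i, j)$. Alternatively, in terms of treatment effects, $H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0$ versus $H_1: \tau_i \neq 0$ for at least one $i$.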

9 Analysis of Variance. The name “analysis of variance” stems from a partitioning of the total variability in the response variable ($Y_{ij}$) into components that are consistent with a model for the experiment. Total variability is measured by the total sum of squares: $SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$, where $\bar{y}_{..}$ is the grand mean of all observations.
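The partition behind the name can be written out explicitly. What follows is the standard identity for a balanced one-way layout, with $\bar{y}_{i.}$ denoting the mean of level $i$ and $\bar{y}_{..}$ the grand mean (this notation is assumed here):

```latex
\begin{aligned}
SS_T &= \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 \\
     &= \sum_{i=1}^{a}\sum_{j=1}^{n}
        \left[ (\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.}) \right]^2 \\
     &= n \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2
        + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2 \\
     &= SS_{Treatments} + SS_E
\end{aligned}
```

The cross-product term drops out because $\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.}) = 0$ within every level $i$.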

10 A large value of $SS_{Treatments}$ reflects large differences in treatment means. A small value of $SS_{Treatments}$ likely indicates no differences in treatment means. While sums of squares cannot be directly compared to test the hypothesis of equal means, mean squares can be compared. A mean square is a sum of squares divided by its degrees of freedom: $MS_{Treatments} = SS_{Treatments} / (a - 1)$ and $MS_E = SS_E / [a(n - 1)]$.

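11 We can show that $E(MS_E) = \sigma^2$ and $E(MS_{Treatments}) = \sigma^2 + n \sum_{i=1}^{a} \tau_i^2 / (a - 1)$. Comparing $MS_{Treatments}$ and $MS_E$: if $MS_{Treatments} \approx MS_E$, we do not reject $H_0$; if $MS_{Treatments} \gg MS_E$, we reject $H_0$. Testing: the statistic $F_0 = MS_{Treatments} / MS_E$ follows the $F_{a-1,\, a(n-1)}$ distribution when $H_0$ is true, so we reject $H_0$ if $f_0 > f_{\alpha,\, a-1,\, a(n-1)}$.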

12 ANOVA Table
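As a rough sketch of how the entries of a one-way ANOVA table are obtained, the Python function below computes the treatment and error sums of squares, their degrees of freedom, the mean squares, and the F ratio. The function name `one_way_anova` and the small data set are hypothetical and chosen only so the arithmetic can be checked by hand; they are not the data from the slides.

```python
def one_way_anova(groups):
    """Return the rows of a one-way ANOVA table for a list of samples."""
    a = len(groups)                                  # number of factor levels
    N = sum(len(g) for g in groups)                  # total number of runs
    grand_mean = sum(sum(g) for g in groups) / N
    means = [sum(g) / len(g) for g in groups]        # level means
    # Between-level variability, weighted by the replicates in each level.
    ss_treat = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-level variability around each level mean.
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_treat, df_error = a - 1, N - a
    ms_treat, ms_error = ss_treat / df_treat, ss_error / df_error
    return {
        "Treatments": {"df": df_treat, "SS": ss_treat, "MS": ms_treat,
                       "F": ms_treat / ms_error},
        "Error": {"df": df_error, "SS": ss_error, "MS": ms_error},
        "Total": {"df": N - 1, "SS": ss_treat + ss_error},
    }

# Hypothetical data: 3 levels, 3 replicates each.
data = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
table = one_way_anova(data)
for source, row in table.items():
    print(source, row)
```

For these made-up numbers the F ratio is 21 on (2, 6) degrees of freedom, well above the tabulated 5% point of about 5.14, so equality of means would be rejected.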

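13 Estimation of Model Parameters. Point estimation: $\hat{\mu} = \bar{y}_{..}$ and $\hat{\tau}_i = \bar{y}_{i.} - \bar{y}_{..}$, so that $\hat{\mu}_i = \bar{y}_{i.}$. Confidence interval on $\mu_i$ (the $i$th treatment mean): $\bar{y}_{i.} \pm t_{\alpha/2,\, a(n-1)} \sqrt{MS_E / n}$.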

14 Which Means Differ? Post-ANOVA Comparison. If the hypothesis of equal means is rejected, we still do not know which specific means differ. Determining which specific means differ following an ANOVA is called the multiple comparisons problem: $H_0: \mu_i = \mu_j$ versus $H_1: \mu_i \neq \mu_j$ ($i \neq j$). There are many methods for multiple comparisons (Fisher’s LSD method, Tukey’s test, Dunnett’s method, etc.). Fisher’s LSD (least significant difference) method: with $LSD = t_{\alpha/2,\, a(n-1)} \sqrt{2 MS_E / n}$, we conclude that the population means $\mu_i$ and $\mu_j$ differ if $|\bar{y}_{i.} - \bar{y}_{j.}| > LSD$. Drawback: the overall type I error rate is inflated. If $\alpha_0$ is used for each of the three tests $H_0: \mu_1 = \mu_2$, $H_0: \mu_1 = \mu_3$, and $H_0: \mu_2 = \mu_3$, and the tests are treated as independent, the experimentwise error rate is $\alpha = 1 - (1 - \alpha_0)^3$.
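The inflation of the type I error rate is easy to check numerically. The sketch below evaluates the slide's formula for three comparisons and also computes a least significant difference for a balanced design; the function names and the inputs to `fisher_lsd` (a t value of 2.0, an error mean square of 8.0, 4 replicates) are hypothetical placeholders, not values from the slides.

```python
import math

def experimentwise_error(alpha0, m):
    """Chance of at least one false rejection in m independent tests at level alpha0."""
    return 1 - (1 - alpha0) ** m

def fisher_lsd(t_crit, ms_error, n):
    """Least significant difference for a balanced design with n replicates."""
    return t_crit * math.sqrt(2 * ms_error / n)

# Three pairwise tests, each at alpha0 = 0.05.
rate = experimentwise_error(0.05, 3)
print(round(rate, 6))

# Placeholder inputs (assumptions): t_crit would come from a t table
# with the error degrees of freedom of the actual experiment.
lsd = fisher_lsd(t_crit=2.0, ms_error=8.0, n=4)
print(lsd)
```

With three comparisons at 0.05 each, the experimentwise rate is about 0.143, nearly three times the nominal level.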

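15 Tukey’s Method. We conclude that the population means $\mu_i$ and $\mu_j$ differ if $|\bar{y}_{i.} - \bar{y}_{j.}| > q_\alpha(a, f) \sqrt{MS_E / n}$, where $q_\alpha(a, f)$ is the upper-$\alpha$ percentage point of the studentized range statistic for $a$ means with $f$ error degrees of freedom. It is tabulated in some textbooks, or the test can be run in Minitab. The overall experimentwise error rate is $\alpha$.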
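A minimal sketch of the pairwise decision rule for a balanced design follows. The function name `tukey_pairs`, the level means, the error mean square, and the studentized range value `q_crit=3.5` are all hypothetical; in practice the critical value has to be looked up in a table (or obtained from software) for the actual number of means and error degrees of freedom.

```python
import math
from itertools import combinations

def tukey_pairs(means, ms_error, n, q_crit):
    """Flag pairs of treatment means whose difference exceeds Tukey's threshold."""
    # One common threshold for every pair in a balanced design.
    threshold = q_crit * math.sqrt(ms_error / n)
    return {
        (i, j): abs(means[i] - means[j]) > threshold
        for i, j in combinations(sorted(means), 2)
    }

# Hypothetical level means, MS_E, replicates, and tabulated q value.
means = {"A": 10.0, "B": 12.0, "C": 20.0}
result = tukey_pairs(means, ms_error=4.0, n=4, q_crit=3.5)
for pair, differs in result.items():
    print(pair, "differ" if differs else "no evidence of a difference")
```

Because a single threshold is applied to all pairs, the procedure controls the error rate for the whole family of comparisons rather than for each test separately.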

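16 Model Checking. Checking the assumptions (normality, constant variance, independence) is important. They are checked by examining residuals, and residual plots are very useful. Residual: the difference between an observation and its fitted (estimated) value from the model, $e_{ij} = y_{ij} - \hat{y}_{ij}$; in the one-way model the fitted value is the level mean, $\hat{y}_{ij} = \bar{y}_{i.}$.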
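For the one-way model the fitted value of every observation is the mean of its factor level, so the residuals can be computed directly. The sketch below uses a hypothetical function name and made-up data.

```python
def residuals(groups):
    """Residuals e_ij = y_ij - ybar_i for a one-way layout."""
    out = []
    for g in groups:
        fitted = sum(g) / len(g)            # fitted value = level mean
        out.append([y - fitted for y in g])
    return out

# Hypothetical data: 3 levels, 3 replicates each.
res = residuals([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
print(res)
```

By construction the residuals within each level sum to zero, which is a quick sanity check on the computation.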

17 1. Normal Probability Plot of the Residuals (normality check). If the plotted points fall approximately along a straight line, the normality assumption is reasonable.
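The coordinates of such a plot can be produced without plotting software. The sketch below pairs each ordered residual with a standard normal quantile; the plotting position $(k - 0.5)/n$ is one common convention and is an assumption here, as are the function name and the residual values.

```python
from statistics import NormalDist

def normal_plot_points(resid):
    """Pair each ordered residual with its theoretical standard normal quantile."""
    ordered = sorted(resid)
    n = len(ordered)
    # Plotting positions (k - 0.5) / n for k = 1..n (an assumed convention).
    quantiles = [NormalDist().inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]
    return list(zip(quantiles, ordered))

# Hypothetical residuals.
points = normal_plot_points([0.3, -1.2, 0.8, -0.1, 0.2])
for q, r in points:
    print(f"{q:7.3f}  {r:6.2f}")
```

Plotting the second column against the first gives the normal probability plot; roughly linear points support the normality assumption.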

18 2. Residuals vs. Time. Plotting the residuals in time sequence helps detect correlation between the residuals. A tendency to have runs of positive and negative residuals indicates positive correlation. Error variance that changes over time indicates nonconstant variance.
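One simple numerical companion to this plot is to count the runs of same-signed residuals in run order. The helper below is a hypothetical illustration, not a formal runs test: it only counts runs, and the residual sequences are made up.

```python
def count_sign_runs(resid_in_time_order):
    """Count runs of consecutive residuals sharing the same sign (zeros skipped)."""
    signs = [r > 0 for r in resid_in_time_order if r != 0]
    if not signs:
        return 0
    # A new run starts whenever the sign changes from one residual to the next.
    return 1 + sum(1 for prev, cur in zip(signs, signs[1:]) if prev != cur)

# Hypothetical residuals listed in the order the runs were made.
drifting = [1.2, 0.8, 0.5, -0.4, -0.9, -0.3]
alternating = [1.0, -1.0, 1.0, -1.0]
print(count_sign_runs(drifting), count_sign_runs(alternating))
```

A sequence with very few, long runs (such as the first one) is the pattern the slide associates with positive correlation.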

19 3. Residuals vs. Fitted Values. A plot of residuals versus fitted values should not reveal any obvious pattern. When a pattern appears, it suggests the need for a transformation. Error variance that changes with the magnitude of the fitted values may indicate a violation of the constant-variance assumption.

20 Example. In IC manufacturing, wafers are coated with a layer of material such as silicon dioxide or a metal. The unwanted material is then selectively removed by etching through a mask, creating circuit patterns. A plasma etching process is widely used for this step. Figure 3-1. A single-wafer plasma etching tool.

21 The response variable is the etch rate for this tool. The experimenter wants to specify the RF power setting that will give a desired target etch rate. RF power can vary between 160 W and 220 W. The experimenter chooses 4 levels of RF power: 160 W, 180 W, 200 W, and 220 W. The experiment is replicated 5 times, and the 20 runs are made in random order.

22 Figure. Box plots and scatter diagram of the etch rate data.

23 1. Construct the ANOVA table. Minitab result: One-way ANOVA: Etch Rate versus Power (columns: Source, DF, SS, MS, F, P; rows: Power, Error, Total). Determine whether the RF power setting affects the etch rate. Use $\alpha = 0.05$. Since the p-value < 0.05, we reject the null hypothesis that all treatment means are equal. That is, we conclude that the RF power setting affects the etch rate.

24 3. Find the 95% confidence interval on the mean etch rate at 220 W of RF power. Minitab output: Individual 95% CIs For Mean Based on Pooled StDev (columns: Level, N, Mean, StDev; one interval row for each of the levels 160W, 180W, 200W, 220W). Pooled StDev = 18.27.
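The width of such an interval can be reproduced from the quantities that do appear on the slides: the pooled standard deviation of 18.27, 5 replicates per level, and 20 - 4 = 16 error degrees of freedom. In the sketch below, `t_crit = 2.120` is the tabulated $t_{0.025,\,16}$ value, and `ybar = 700.0` is a hypothetical placeholder because the level means are not preserved in the transcript; the function name is also an assumption.

```python
import math

def mean_ci(ybar, pooled_sd, n, t_crit):
    """Two-sided CI for one treatment mean using the pooled standard deviation."""
    half_width = t_crit * pooled_sd / math.sqrt(n)
    return ybar - half_width, ybar + half_width

# pooled_sd and n come from the slides; t_crit is a table lookup for 16 df;
# ybar is a hypothetical placeholder, NOT the observed 220 W mean.
low, high = mean_ci(ybar=700.0, pooled_sd=18.27, n=5, t_crit=2.120)
print(f"half-width = {(high - low) / 2:.2f}")
print(f"interval   = ({low:.2f}, {high:.2f})")
```

Whatever the observed mean is, the interval extends about 17.3 units on either side of it, since the half-width depends only on the pooled standard deviation, the replicate count, and the t value.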

25 5. Check the model assumptions

26 The Randomized Complete Block Design (RCBD). Definition: a nuisance factor is a design factor that probably has some effect on the response but is of no interest to the experimenter; however, the variability it transmits to the response needs to be minimized. Examples: batches of raw material, operators, pieces of test equipment, time (shifts, days, etc.), different experimental units. Dealing with nuisance factors: 1. Unknown and uncontrollable: use randomization to guard against it. 2. Known but uncontrollable: use analysis of covariance (see Chapter 14). 3. Known and controllable: use “blocking” to eliminate its effect.

27 EXAMPLE. Consider a hardness testing machine that presses a rod with a pointed tip into a metal test coupon with a known force. By measuring the depth of the depression caused by the tip, the hardness of the coupon is determined.
■ We wish to determine whether four different tips produce different readings on a hardness testing machine.
■ Structure of a completely randomized single-factor experiment: 4 × 4 = 16 runs, requiring 16 coupons.
■ Nuisance factor: test coupons. If the metal coupons differ slightly in their hardness, they will contribute to the variability observed in the hardness data: experimental error = random error + variability between coupons. We want to remove the variability between coupons, because we want the experimental error to be as small as possible.
■ Alternatively, the experimenter may want to test the tips across coupons of various hardness levels. => “Blocking”

28 Randomized Complete Block Design (RCBD)
■ To conduct this experiment as an RCBD, assign all 4 tips to each coupon.
■ Each coupon is called a “block”; that is, it is a more homogeneous experimental unit on which to test the tips.
■ Variability between blocks can be large, but variability within a block should be relatively small.
■ In general, a block is a specific level of the nuisance factor.
■ “Complete” means each block contains all the treatments (tips 1, 2, 3, 4).
■ A block represents a restriction on randomization.
■ All runs within a block are randomized.
■ The RCBD improves the accuracy of the comparison among tips by eliminating the variability among coupons.
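The restricted randomization of an RCBD can be sketched as follows: every block receives all treatments, and the order is shuffled separately inside each block. The function name, the labels, the seed, and the choice of four coupons are assumptions made for illustration.

```python
import random

def rcbd_layout(treatments, blocks, seed=None):
    """Randomize the order of all treatments independently within each block."""
    rng = random.Random(seed)  # seed is only for reproducibility (assumption)
    layout = {}
    for block in blocks:
        order = list(treatments)   # "complete": every treatment in every block
        rng.shuffle(order)         # randomization is restricted to the block
        layout[block] = order
    return layout

tips = ["tip1", "tip2", "tip3", "tip4"]
coupons = ["coupon1", "coupon2", "coupon3", "coupon4"]  # assumed 4 blocks
layout = rcbd_layout(tips, coupons, seed=7)
for coupon, order in layout.items():
    print(coupon, order)
```

Compare this with a completely randomized design, where all 16 runs would be shuffled together and a tip could end up on any coupon.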

29 Notice the two-way structure of the experiment. Once again, we are interested in testing the equality of treatment means, but now we have to remove the variability associated with the nuisance factor (the blocks).

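30 Extension of the ANOVA to the RCBD. Suppose that there are $a$ treatments (factor levels) and $b$ blocks. A statistical model (effects model) for the RCBD is $y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij}$, $i = 1, \ldots, a$, $j = 1, \ldots, b$, where $\mu$ is the overall mean, $\tau_i$ is the $i$th treatment effect, $\beta_j$ is the $j$th block effect, and $\epsilon_{ij}$ is the random error, i.i.d. $N(0, \sigma^2)$.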

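31 Figure 4-1. The randomized complete block design. Block 1 contains $y_{11}, y_{21}, y_{31}, \ldots, y_{a1}$; Block 2 contains $y_{12}, y_{22}, y_{32}, \ldots, y_{a2}$; and so on up to Block $b$, which contains $y_{1b}, y_{2b}, y_{3b}, \ldots, y_{ab}$.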

32 ANOVA partitioning of total variability: $SS_T = SS_{Treatments} + SS_{Blocks} + SS_E$. The degrees of freedom for the sums of squares above are as follows: $ab - 1 = (a - 1) + (b - 1) + (a - 1)(b - 1)$. Ratios of sums of squares to their degrees of freedom result in mean squares.

33 The ratio of the mean square for treatments to the error mean square is an F statistic. If $H_0$ is true, the test statistic $F_0 = MS_{Treatments} / MS_E \sim F_{a-1,\,(a-1)(b-1)}$.
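The F statistic for the RCBD can be sketched in Python as below. The function name `rcbd_anova` and the 3 × 3 data set are hypothetical; the numbers were chosen only so that every sum of squares comes out as a whole number and can be verified by hand.

```python
def rcbd_anova(y):
    """ANOVA quantities for an RCBD; y[i][j] is treatment i observed in block j."""
    a, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * b)
    treat_means = [sum(row) / b for row in y]
    block_means = [sum(y[i][j] for i in range(a)) / a for j in range(b)]
    ss_total = sum((v - grand) ** 2 for row in y for v in row)
    ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
    ss_block = a * sum((m - grand) ** 2 for m in block_means)
    ss_error = ss_total - ss_treat - ss_block      # what is left after blocking
    df_treat, df_error = a - 1, (a - 1) * (b - 1)
    ms_treat, ms_error = ss_treat / df_treat, ss_error / df_error
    return {"SS_T": ss_total, "SS_Treatments": ss_treat, "SS_Blocks": ss_block,
            "SS_E": ss_error, "df_error": df_error, "F0": ms_treat / ms_error}

# Hypothetical data: rows are 3 treatments, columns are 3 blocks.
y = [[2, 2, 2],
     [3, 7, 5],
     [7, 9, 8]]
out = rcbd_anova(y)
print(out)
```

The block sum of squares is removed from the error term, which is how blocking sharpens the comparison of treatment means.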

34 For manual computing, use
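A commonly used set of computing formulas for the RCBD is given below as a sketch. The notation is assumed here: $y_{i.}$ is the total for treatment $i$, $y_{.j}$ the total for block $j$, $y_{..}$ the grand total, and $N = ab$.

```latex
\begin{aligned}
SS_T &= \sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij}^2 - \frac{y_{..}^2}{N} \\
SS_{Treatments} &= \frac{1}{b}\sum_{i=1}^{a} y_{i.}^2 - \frac{y_{..}^2}{N} \\
SS_{Blocks} &= \frac{1}{a}\sum_{j=1}^{b} y_{.j}^2 - \frac{y_{..}^2}{N} \\
SS_E &= SS_T - SS_{Treatments} - SS_{Blocks}
\end{aligned}
```

The error sum of squares is obtained by subtraction, which matches the partition $SS_T = SS_{Treatments} + SS_{Blocks} + SS_E$.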