Analysis of Variance (ANOVA)


A single-factor ANOVA can be used to compare more than two means. For example, suppose a manufacturer of paper used for grocery bags is concerned about the tensile strength of the paper. Product engineers believe that tensile strength is a function of the hardwood concentration and want to test several concentrations for the effect on tensile strength.

If there are 2 different hardwood concentrations (say, 5% and 15%), then a z-test or t-test is appropriate:

$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$

EGR 252 Spring 2009 - Ch.13 Part 1

Comparing More Than Two Means

What if there are 3 different hardwood concentrations (say, 5%, 10%, and 15%)?

$H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$
$H_0: \mu_1 = \mu_3$ vs. $H_1: \mu_1 \neq \mu_3$
$H_0: \mu_2 = \mu_3$ vs. $H_1: \mu_2 \neq \mu_3$

How about 4 different concentrations (say, 5%, 10%, 15%, and 20%)? All of the above, plus:

$H_0: \mu_1 = \mu_4$ vs. $H_1: \mu_1 \neq \mu_4$
$H_0: \mu_2 = \mu_4$ vs. $H_1: \mu_2 \neq \mu_4$
$H_0: \mu_3 = \mu_4$ vs. $H_1: \mu_3 \neq \mu_4$

What about 5 concentrations? 10?

5 concentrations: 5!/(2! 3!) = 10 tests
10 concentrations: 10!/(2! 8!) = 45 tests
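The test counts above come from counting unordered pairs of means. A minimal Python sketch of that count follows; the function name `pairwise_tests` is illustrative and does not come from the slides.

```python
from math import comb


def pairwise_tests(levels: int) -> int:
    """Number of two-sample tests needed to compare every pair of means."""
    return comb(levels, 2)


for levels in (3, 4, 5, 10):
    print(f"{levels} concentrations -> {pairwise_tests(levels)} pairwise tests")
```

The count grows roughly with the square of the number of levels, which is why pairwise testing quickly becomes impractical.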

Comparing Multiple Means - Type I Error

Suppose α = 0.05. Then for a single test P(Type I error) = 0.05 and (1 − α) = P(do not reject H0 | H0 is true) = 0.95.

Conducting multiple t-tests increases the probability of a Type I error. The greater the number of t-tests, the greater the error probability. If k tests are treated as independent, the probability that none of them produces a Type I error is $(0.95)^k$:

k = 4 tests: $(0.95)^{4} = 0.814506$
k = 5 tests: $(0.95)^{5} = 0.773781$
k = 10 tests: $(0.95)^{10} = 0.598737$

Note that k counts tests, not concentrations: comparing 5 concentrations pairwise already requires 10 tests, so the chance of at least one Type I error is about 1 − 0.599 = 0.401.

Making the comparisons simultaneously (as in an ANOVA) reduces the error back to 0.05.
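The probabilities quoted above are powers of 0.95. The sketch below reproduces them under the assumption that the tests are independent, which is a simplification because pairwise t-tests that share groups are not strictly independent. The names `ALPHA` and `prob_no_type1_error` are illustrative.

```python
ALPHA = 0.05  # per-test significance level used on the slide


def prob_no_type1_error(num_tests: int, alpha: float = ALPHA) -> float:
    """P(no Type I error in any of num_tests tests), assuming independence."""
    return (1.0 - alpha) ** num_tests


for k in (4, 5, 10):
    keep = prob_no_type1_error(k)
    print(f"{k:2d} tests: P(no Type I error) = {keep:.6f}, "
          f"P(at least one) = {1.0 - keep:.6f}")
```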

Analysis of Variance (ANOVA) Terms

Independent variable: that which is varied
    Treatment
    Factor
    Level: the selected categories of the factor
        In a single-factor experiment there are a levels
Dependent variable: the measured result
    Observations
    Replicates (N observations in the total experiment)
Randomization: performing experimental runs in random order so that other factors don’t influence results.

The Experimental Design

Suppose a manufacturer is concerned about the tensile strength of the paper used to produce grocery bags. Product engineers believe that tensile strength is a function of the hardwood concentration and want to test several concentrations for the effect on tensile strength. Six specimens were made at each of the 4 hardwood concentrations (5%, 10%, 15%, and 20%). The 24 specimens were tested in random order on a tensile test machine.

Terms
    Factor: Hardwood Concentration
    Levels: 5%, 10%, 15%, 20%
    a = 4
    N = 24

The Results and Partial Analysis

The experimental results consist of 6 observations at each of 4 levels for a total of N = 24 items. To begin the analysis, we calculate the average and total for each level.

Hardwood                Observations
Concentration     1    2    3    4    5    6    Totals   Averages
5%                7    8   15   11    9   10      60      10.00
10%              12   17   13   18   19   15      94      15.67
15%              14   18   19   17   16   18     102      17.00
20%              19   25   22   23   18   20     127      21.17
                                                 383      15.96

In general notation (a = 4, n = 6), $y_{ij}$ is observation j at level i, $y_{i\cdot}$ and $\bar{y}_{i\cdot}$ are the total and average for level i, and $y_{\cdot\cdot}$ and $\bar{y}_{\cdot\cdot}$ are the grand total and grand average:

Hardwood                Observations
Concentration (%)   1     2     3     4     5     6     Totals   Averages
5                  y11   y12   y13   y14   y15   y16    y1•      y1• (bar)
10                 y21   y22   y23   y24   y25   y26    y2•      y2• (bar)
15                 y31   y32   y33   y34   y35   y36    y3•      y3• (bar)
20                 y41   y42   y43   y44   y45   y46    y4•      y4• (bar)
                                                        y••      y•• (bar)
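The totals and averages can be checked with a few lines of Python. This is a sketch only: the 24 observations below are assumed to be the full data set (they reproduce the row totals 60, 94, 102, and 127 and the grand total 383 given on the slide), and the variable names are illustrative.

```python
# Tensile-strength observations by hardwood concentration.
# Assumed complete data set; each row sums to the total listed on the slide.
observations = {
    "5%": [7, 8, 15, 11, 9, 10],
    "10%": [12, 17, 13, 18, 19, 15],
    "15%": [14, 18, 19, 17, 16, 18],
    "20%": [19, 25, 22, 23, 18, 20],
}

totals = {level: sum(values) for level, values in observations.items()}
averages = {level: totals[level] / len(values)
            for level, values in observations.items()}

grand_total = sum(totals.values())
n_total = sum(len(values) for values in observations.values())
grand_average = grand_total / n_total

for level in observations:
    print(f"{level:>4}: total = {totals[level]:3d}, average = {averages[level]:.2f}")
print(f"grand total = {grand_total}, grand average = {grand_average:.2f}")
```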

To determine if there is a difference in the response at the 4 levels …

Calculate sums of squares
Calculate degrees of freedom
Calculate mean squares
Calculate the F statistic
Organize the results in the ANOVA table
Conduct the hypothesis test

Calculate the sums of squares

$SS_{total} = (7^2 + 8^2 + 15^2 + \dots + 23^2 + 18^2 + 20^2) - 383^2/24 = 512.9583$
$SS_{treat} = (60^2 + 94^2 + 102^2 + 127^2)/6 - 383^2/24 = 382.7917$
$SS_E = SS_{total} - SS_{treat} = 130.1667$
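The same sums of squares can be reproduced numerically. The sketch below assumes the 24 observations listed in the code (they match the level totals 60, 94, 102, and 127 used in the formula for the treatment sum of squares); the variable names are illustrative.

```python
# Rows are the 5%, 10%, 15%, and 20% hardwood concentrations (assumed full data).
observations = [
    [7, 8, 15, 11, 9, 10],
    [12, 17, 13, 18, 19, 15],
    [14, 18, 19, 17, 16, 18],
    [19, 25, 22, 23, 18, 20],
]

a = len(observations)        # number of levels
n = len(observations[0])     # replicates per level
N = a * n                    # total number of observations

grand_total = sum(sum(row) for row in observations)
correction = grand_total ** 2 / N  # the 383^2 / 24 term

ss_total = sum(y * y for row in observations for y in row) - correction
ss_treat = sum(sum(row) ** 2 for row in observations) / n - correction
ss_error = ss_total - ss_treat

print(f"SS_total = {ss_total:.4f}")
print(f"SS_treat = {ss_treat:.4f}")
print(f"SS_E     = {ss_error:.4f}")
```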

Additional Calculations

Calculate degrees of freedom:
$df_{treat} = a - 1 = 3$
$df_E = a(n - 1) = 20$
$df_{total} = an - 1 = 23$

Calculate mean squares, MS = SS/df:
$MS_{treat} = SS_{treat}/df_{treat} = 382.7917/3 = 127.5972$
$MS_E = SS_E/df_E = 130.1667/20 = 6.508333$

Calculate the F statistic:
$F = MS_{treat}/MS_E = 127.5972/6.508333 = 19.6052 \approx 19.61$
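As a quick check of the arithmetic, the degrees of freedom, mean squares, and F statistic can be computed from the sums of squares reported on the slides. This is a minimal sketch with illustrative variable names.

```python
a, n = 4, 6                              # levels and replicates per level
ss_treat, ss_error = 382.7917, 130.1667  # sums of squares from the slides

df_treat = a - 1
df_error = a * (n - 1)
df_total = a * n - 1

ms_treat = ss_treat / df_treat
ms_error = ss_error / df_error
f_stat = ms_treat / ms_error

print(f"df: treat = {df_treat}, error = {df_error}, total = {df_total}")
print(f"MS_treat = {ms_treat:.4f}, MS_E = {ms_error:.6f}, F = {f_stat:.4f}")
```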

Organizing the Results

Build the ANOVA table
Determine significance
    fixed α-level: compare to $F_{\alpha,\,a-1,\,a(n-1)}$
    p-value: find p associated with this F with degrees of freedom a − 1, a(n − 1)

ANOVA
Source of Variation      SS      df     MS        F      P-value    F crit
Treatment              382.79     3   127.6     19.6     3.6E-06     3.1
Error                  130.17    20     6.5083
Total                  512.96    23

F = 19.60521
$F_{0.05,\,3,\,20} = 3.10$
p-value = 3.6E-06
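The critical value and p-value would normally be read from an F table or a statistics package. As a rough, standard-library-only sketch, they can also be approximated by integrating the beta density numerically with Simpson's rule and solving for the critical value by bisection. This is not how the slides obtained the numbers; the function names `f_survival` and `f_critical` are illustrative.

```python
from math import exp, lgamma


def f_survival(f_value: float, df1: int, df2: int, steps: int = 2000) -> float:
    """Approximate P(F > f_value) for an F(df1, df2) variable.

    Uses P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 * f), where I_x
    is the regularized incomplete beta function, evaluated by Simpson's rule.
    Sketch intended for df2 >= 2; steps must be even.
    """
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f_value)
    log_beta = lgamma(a) + lgamma(b) - lgamma(a + b)

    def density(t: float) -> float:
        return t ** (a - 1.0) * max(0.0, 1.0 - t) ** (b - 1.0)

    h = x / steps
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3.0 / exp(log_beta)


def f_critical(alpha: float, df1: int, df2: int) -> float:
    """Find f with P(F > f) = alpha by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if f_survival(mid, df1, df2) > alpha:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2.0


f_stat = 19.60521  # F statistic from the ANOVA table
p_value = f_survival(f_stat, 3, 20)
f_crit = f_critical(0.05, 3, 20)
print(f"F crit = {f_crit:.2f}, p-value = {p_value:.1e}")
```

Both approaches to significance lead to the same decision here: the F statistic is far above the critical value, and the p-value is far below α = 0.05.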

Conduct the Hypothesis Test

Null Hypothesis: The mean tensile strength is the same for each hardwood concentration.
Alternate Hypothesis: The mean tensile strength differs for at least one hardwood concentration.

Compare Fcrit to Fcalc
Draw the graphic
State your decision with respect to the null hypothesis
State your conclusion based on the problem statement

Hypothesis Test Results

Null Hypothesis: The mean tensile strength is the same for each hardwood concentration.
Alternate Hypothesis: The mean tensile strength differs for at least one hardwood concentration.

Fcrit (3.10) is less than Fcalc (19.61)
Decision: Reject the null hypothesis
Conclusion: The mean tensile strength differs for at least one hardwood concentration; that is, there is a difference in tensile strength as a function of hardwood concentration.

Post-hoc Analysis: “Hand Calculations”

Calculate and check residuals, $e_{ij} = O_{ij} - E_{ij}$ (observed minus expected; for this model the expected value is the level average, so $e_{ij} = y_{ij} - \bar{y}_{i\cdot}$)
    plot residuals vs. treatments
    normal probability plot
Perform ANOVA and determine if there is a difference in the means
If the decision is to reject the null hypothesis, identify which means are different using Tukey’s procedure

Model: $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
Note: α in the model refers to the treatment effect (not the significance level).
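To make the model terms concrete, the sketch below estimates μ by the grand average and each treatment effect α_i by the level average minus the grand average, then computes residuals for the 5% level from its six observations. Variable names are illustrative; only values given on the slides are used.

```python
n = 6  # replicates per level
level_totals = {"5%": 60, "10%": 94, "15%": 102, "20%": 127}

grand_mean = sum(level_totals.values()) / (n * len(level_totals))  # estimate of mu
level_means = {level: total / n for level, total in level_totals.items()}

# Estimated treatment effects: level average minus grand average.
effects = {level: mean - grand_mean for level, mean in level_means.items()}

# Residuals for the 5% level: observation minus its level average.
row_5 = [7, 8, 15, 11, 9, 10]
residuals_5 = [y - level_means["5%"] for y in row_5]

for level, effect in effects.items():
    print(f"{level:>4}: mean = {level_means[level]:.2f}, effect = {effect:+.2f}")
print("5% residuals:", residuals_5)
```

The estimated effects sum to zero, and the residuals within a level also sum to zero, which is a useful sanity check before plotting them.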

Graphical Methods - Computer

Individual 95% CIs For Mean Based on Pooled StDev

Level    N     Mean    StDev
5%       6   10.000    2.828
10%      6   15.667    2.805
15%      6   17.000    1.789
20%      6   21.167    2.639

[Character plot of the four confidence intervals; axis ticks at 8.0, 12.0, 16.0, 20.0]
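Intervals of this kind can be approximated by hand as mean ± t × sqrt(MSE/n), using the pooled error estimate. The sketch below assumes t(0.025, 20) = 2.086 from a standard t table, a value that is not given on the slides, and uses MSE = 6.508333 from the ANOVA calculations. It may not match the software's output digit for digit.

```python
from math import sqrt

MSE = 6.508333       # pooled variance estimate (mean square error)
N_PER_LEVEL = 6      # observations per concentration
T_CRIT = 2.086       # t(0.025, 20) from a standard t table (assumed value)

means = {"5%": 10.000, "10%": 15.667, "15%": 17.000, "20%": 21.167}

half_width = T_CRIT * sqrt(MSE / N_PER_LEVEL)
intervals = {level: (mean - half_width, mean + half_width)
             for level, mean in means.items()}

for level, (lower, upper) in intervals.items():
    print(f"{level:>4}: ({lower:6.3f}, {upper:6.3f})")
```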

Numerical Methods - Computer

Tukey’s test
Duncan’s Multiple Range test
Easily performed in Minitab

Tukey 95% Simultaneous Confidence Intervals (partial results)

10% subtracted from:
         Lower    Center    Upper
15%     -2.791     1.333    5.458
20%      1.376     5.500    9.624

[Character plot of the two intervals; axis ticks at -7.0, 0.0, 7.0, 14.0]
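The partial Tukey output can be approximated by hand as (difference in means) ± q × sqrt(MSE/n). The sketch below assumes the studentized range value q(0.05; 4, 20) = 3.96 from a standard table, which is not stated on the slides, so the results agree with the software output only to about two decimal places.

```python
from math import sqrt

MSE = 6.508333       # mean square error from the ANOVA
N_PER_LEVEL = 6      # observations per concentration
Q_CRIT = 3.96        # studentized range q(0.05; 4, 20), table value (assumed)

means = {"5%": 10.000, "10%": 15.667, "15%": 17.000, "20%": 21.167}

half_width = Q_CRIT * sqrt(MSE / N_PER_LEVEL)

# Differences from the 10% level, matching the partial output on the slide.
tukey = {}
for level in ("15%", "20%"):
    center = means[level] - means["10%"]
    tukey[level] = (center - half_width, center, center + half_width)

for level, (lower, center, upper) in tukey.items():
    differs = lower > 0 or upper < 0  # interval excludes zero
    print(f"{level} - 10%: ({lower:.3f}, {center:.3f}, {upper:.3f}) "
          f"differs = {differs}")
```

An interval that excludes zero indicates a difference between that pair of means; here the 20% interval excludes zero while the 15% interval does not.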