Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. Analysis of Variance Chapter 16.


ANOVA
Analysis of variance: a tool for analyzing how the mean value of a quantitative response variable is affected by one or more categorical explanatory factors.
If one categorical variable: one-way ANOVA.
If two categorical variables: two-way ANOVA.

Comparing Means with an ANOVA F-Test
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
$H_a$: The means are not all equal.
F-statistic: $F = \dfrac{\text{variation among sample means}}{\text{natural variation within groups}}$

Variation among sample means is 0 if all k sample means are equal and gets larger the more spread out they are.
If F is large enough => evidence that at least one population mean differs from the others => reject the null hypothesis.
The p-value is found using an F-distribution (more later).

Example 16.1 Seat Location and GPA
Q: Do the best students sit in the front of a classroom?
Data on seat location and GPA for n = 384 students: 88 sit in front, 218 in the middle, 78 in back.
Students sitting in the front generally have slightly higher GPAs than others.

Example 16.1 Seat Location and GPA (cont)
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_a$: The means are not all equal.
The F-statistic is 6.69 and the p-value is small => reject $H_0$ and conclude there are differences among the means.

Example 16.1 Seat Location and GPA (cont)
95% confidence intervals for the 3 population means: the interval for “front” does not overlap with the other two intervals => significant difference between the mean GPA for front-row sitters and the mean GPA for other students.

Notation for Summary Statistics
k = number of groups.
$\bar{x}_i$, $s_i$, and $n_i$ are the mean, standard deviation, and sample size for the i-th sample group.
N = total sample size ($N = n_1 + n_2 + \dots + n_k$).
Example 16.1 Seat Location and GPA (cont): three seat locations => k = 3; $n_1 = 88$, $n_2 = 218$, $n_3 = 78$; N = 88 + 218 + 78 = 384.

Assumptions for the F-Test
Samples are independent random samples.
The distribution of the response variable is a normal curve within each population.
Different populations may have different means.
All populations have the same standard deviation, $\sigma$.
(Figure: how k = 3 populations might look.)

Conditions for Using the F-Test
The F-statistic can be used if the data are not extremely skewed, there are no extreme outliers, and the group standard deviations are not markedly different.
Tests based on the F-statistic are valid for data with skewness or outliers if sample sizes are large.
A rough criterion for standard deviations: the largest sample standard deviation should be no more than twice the smallest sample standard deviation.
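The rough criterion for standard deviations is easy to check mechanically. The sketch below is only an illustration: the function name `sd_ratio_ok` and the `max_ratio` parameter are made up for this example, and the two values passed in are the largest and smallest sample standard deviations quoted for Example 16.4 (5.02 and 2.92).

```python
def sd_ratio_ok(sample_sds, max_ratio=2.0):
    """Rough check: is the largest sample sd at most max_ratio times the smallest?

    sd_ratio_ok and max_ratio are hypothetical names used for illustration only.
    """
    smallest = min(sample_sds)
    largest = max(sample_sds)
    return largest <= max_ratio * smallest


# Largest and smallest sample standard deviations quoted for Example 16.4.
supercar_check = sd_ratio_ok([5.02, 2.92])
print(supercar_check)
```

The check is a rule of thumb, not a formal test; boxplots are still needed to look for skewness and outliers.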

Example 16.1 Seat Location and GPA (cont)
The boxplot showed two outliers in the group of students who typically sit in the middle of a classroom, but there are 218 students in that group, so these outliers don’t have much influence on the results.
The standard deviations for the three groups are nearly the same.
The data do not appear to be skewed.
The necessary conditions for the F-test seem satisfied.

The Family of F-Distributions
Skewed distributions with a minimum value of 0.
A specific F-distribution is indicated by two parameters called degrees of freedom: numerator degrees of freedom and denominator degrees of freedom.
In one-way ANOVA, numerator df = k – 1 and denominator df = N – k.

Determining the p-Value
Statistical software reports the p-value in its output.
Table A.4 provides critical values for the 1% and 5% significance levels.
If the F-statistic is greater than the 5% critical value, the p-value < 0.05.
If the F-statistic is greater than the 1% critical value, the p-value < 0.01.
If the F-statistic is between the 1% and 5% critical values, the p-value is between 0.01 and 0.05.

Example 16.2 Testosterone and Occupation
Study: compare mean testosterone levels for k = 7 occupational groups.
Reported F-statistic was F = 2.5 and p-value < 0.05.
N = 66 men: num df = k – 1 = 7 – 1 = 6; den df = N – k = 66 – 7 = 59.
Table A.4 with df of (6, 60): the 5% critical value is 2.25 and the F-statistic was larger, so the p-value < 0.05.
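When no table or statistical package is at hand, the degrees of freedom and the upper-tail area can be approximated directly. The following is a minimal sketch, not the textbook's method: `anova_df` and `f_upper_tail` are hypothetical helper names, the tail area is obtained by Simpson's rule on the equivalent beta integral, and the code assumes the denominator df is at least 2.

```python
import math


def anova_df(k, n_total):
    """Numerator and denominator degrees of freedom for one-way ANOVA."""
    return k - 1, n_total - k


def f_upper_tail(f_stat, df_num, df_den, steps=2000):
    """Approximate P(F > f_stat) for an F(df_num, df_den) distribution.

    Uses the identity P(F > f) = I_z(df_den/2, df_num/2) with
    z = df_den / (df_den + df_num * f), and integrates the beta density
    on [0, z] with Simpson's rule (steps must be even).
    Assumes df_den >= 2 so the integrand is finite at 0.
    """
    if f_stat <= 0:
        return 1.0
    a, b = df_den / 2.0, df_num / 2.0
    z = df_den / (df_den + df_num * f_stat)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        return math.exp(-log_beta) * t ** (a - 1) * (1 - t) ** (b - 1)

    h = z / steps
    total = density(0.0) + density(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3


# Example 16.2: k = 7 groups, N = 66 men, reported F = 2.5.
df_num, df_den = anova_df(7, 66)
p_value = f_upper_tail(2.5, df_num, df_den)
print(df_num, df_den, round(p_value, 3))
```

Statistical software will normally be more accurate and more convenient; the point here is only that the p-value is the area to the right of the observed F under the F-distribution with df (k – 1, N – k).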

Multiple Comparisons
Multiple comparisons: two or more comparisons are made to examine the specific pattern of differences among means. Most common: all pairwise comparisons.
Ways to make inferences about each pair of means:
Significance test to assess whether the two means differ significantly.
Confidence interval for the difference: if 0 is not in the interval, there is a statistically significant difference.

Multiple Comparisons (cont)
Many statistical tests done => increased risk of making at least one type I error (erroneously rejecting a null hypothesis).
Several procedures exist to control the overall family type I error rate or the overall family confidence level.
Family error rate for a set of significance tests: the probability of making one or more type I errors when more than one significance test is done.
Family confidence level for a procedure used to create a set of confidence intervals: the proportion of times all intervals in the set capture their true parameter values.
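To see how quickly the family error rate grows, the sketch below computes it under a simplifying assumption that is not made in the slides: the tests are treated as independent, each run at level alpha. Pairwise comparisons among the same groups are not actually independent, so the numbers are illustrative only. Both function names are hypothetical.

```python
def num_pairwise_comparisons(k):
    """Number of distinct pairs among k group means."""
    return k * (k - 1) // 2


def family_error_rate(alpha, num_tests):
    """P(at least one type I error) if num_tests independent tests each use level alpha.

    Independence is an assumption made for illustration.
    """
    return 1 - (1 - alpha) ** num_tests


pairs_three_groups = num_pairwise_comparisons(3)
rate_three_groups = family_error_rate(0.05, pairs_three_groups)
print(pairs_three_groups, round(rate_three_groups, 4))
```

With k = 3 groups there are 3 pairwise comparisons, and under the independence assumption the chance of at least one false rejection at the 0.05 level is already about 14%, which is why family-wise procedures are used.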

Example 16.1 Seat Location and GPA (cont)
Pairwise Comparison Output:
Tukey: family confidence level of 0.95.
Fisher: 0.95 level for each individual interval.
Here, both give the same conclusions: only 1 interval covers 0, the one for $\mu_{\text{Middle}} - \mu_{\text{Back}}$.
It appears that population mean GPAs differ for front and middle students and for front and back students.

Details of One-Way Analysis of Variance
Fundamental concept: the variation among the data values in the overall sample can be separated into:
(1) differences between group means
(2) natural variation among observations within a group
Total variation = Variation between groups + Variation within groups
The ANOVA table displays this information.

Measuring Variation Between Groups
Sum of squares for groups: $\text{SS Groups} = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2$, where $\bar{x}$ is the mean of all N observations.
Numerator of F-statistic = mean square for groups: $\text{MS Groups} = \dfrac{\text{SS Groups}}{k - 1}$

Measuring Variation within Groups
Sum of squared errors: $\text{SS Error} = \sum_{i=1}^{k} (n_i - 1) s_i^2$
Denominator of F-statistic = mean square error: $\text{MSE} = \dfrac{\text{SS Error}}{N - k}$
Pooled standard deviation: $s_p = \sqrt{\text{MSE}}$

Measuring Total Variation
Total sum of squares: $\text{SS Total} = \text{SSTO} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2$, where $x_{ij}$ is the j-th observation in group i.
SS Total = SS Groups + SS Error
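The decomposition SS Total = SS Groups + SS Error can be verified numerically. The sketch below uses the usual definitions (between-group sum of squares weighted by group size, within-group squared deviations from each group mean); the function name `one_way_anova` and the three tiny groups are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math


def one_way_anova(groups):
    """Sums of squares, mean squares, F and pooled sd for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group variation: group means around the grand mean.
    ss_groups = sum(len(g) * (m - grand_mean) ** 2
                    for g, m in zip(groups, group_means))
    # Within-group variation: observations around their own group mean.
    ss_error = sum((x - m) ** 2
                   for g, m in zip(groups, group_means) for x in g)
    # Total variation: observations around the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)

    ms_groups = ss_groups / (k - 1)
    ms_error = ss_error / (n_total - k)
    return {
        "ss_groups": ss_groups,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "ms_groups": ms_groups,
        "ms_error": ms_error,
        "f": ms_groups / ms_error,
        "pooled_sd": math.sqrt(ms_error),
    }


# Hypothetical data: three groups of three observations each.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

For these made-up numbers the group means are 2, 3 and 6, so most of the total variation sits between groups and the F-statistic is large relative to 1.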

General Format of a One-Way ANOVA Table
Source | df | SS | MS | F
Between groups | k – 1 | SS Groups | MS Groups | F = MS Groups / MSE
Error | N – k | SS Error | MSE |
Total | N – 1 | SS Total | |

Example 16.3 Comparison of Weight Loss Programs
Program 3 appears to have the highest weight loss overall.

Example 16.3 Comparison of Weight Loss Programs (cont)

Example 16.3 Comparison of Weight Loss Programs (cont)
“Factor” is used instead of Groups because the groups (weight-loss programs) form an explanatory factor for the response.
Note: Pooled StDev is $\sqrt{\text{MS Error}}$.

Example 16.4 Top Speeds of Supercars
Data: top speeds for six runs on each of five supercars. Source: Kitchens (1998, p. 783).

Example 16.4 Top Speeds (cont)

Example 16.4 Top Speeds (cont)
Based on the F-statistic and its p-value in the ANOVA output => reject the null hypothesis that population mean speeds are the same for all five cars.
Conditions are satisfied: data are not skewed and there are no extreme outliers. The largest sample standard deviation (5.02, Viper) is not more than twice as large as the smallest (2.92, Acura).
MS Error = 14.5 is an estimate of the variance of top speed for the hypothetical distribution of all possible runs with one car. The estimated standard deviation for each car is $\sqrt{14.5} \approx 3.81$.
Based on sample means and CIs: Porsche and Ferrari seem to be significantly faster than the other cars.

Confidence Intervals for the Population Means
In one-way analysis of variance, a confidence interval for a population mean $\mu_i$ is $\bar{x}_i \pm t^* \dfrac{s_p}{\sqrt{n_i}}$, where $s_p = \sqrt{\text{MSE}}$ and $t^*$ is such that the confidence level is the probability between $-t^*$ and $t^*$ in a t-distribution with df = N – k.
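As a worked sketch of such an interval, assume it takes the usual pooled form, sample mean ± t* · s_p / √n_i with s_p = √(MS Error). The numbers below combine values quoted for Example 16.4 (MS Error = 14.5, six runs per car, five cars, so df = 30 – 5 = 25) with two assumptions: the sample mean of 180.0 is hypothetical, and t* = 2.060 is the standard table value for 95% confidence with 25 df. The function name `pooled_mean_ci` is also made up for this example.

```python
import math


def pooled_mean_ci(sample_mean, n_i, ms_error, t_star):
    """Interval sample_mean +/- t_star * s_p / sqrt(n_i), with s_p = sqrt(ms_error)."""
    s_p = math.sqrt(ms_error)
    half_width = t_star * s_p / math.sqrt(n_i)
    return sample_mean - half_width, sample_mean + half_width


# sample_mean=180.0 is a hypothetical value; ms_error and n_i come from Example 16.4.
low, high = pooled_mean_ci(sample_mean=180.0, n_i=6, ms_error=14.5, t_star=2.060)
print(round(low, 2), round(high, 2))
```

Because every group shares the same pooled standard deviation, groups with equal sample sizes get intervals of equal width; only the centers differ.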

Other Methods
When data are skewed or extreme outliers are present, it is better to analyze the median instead of the mean.
Two such tests are:
1. Kruskal-Wallis Test
2. Mood’s Median Test
These are also called nonparametric tests.
$H_0$: Population medians are equal.
$H_a$: Population medians are not all equal.
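To give a feel for how a rank-based test works, here is a minimal sketch of the Kruskal-Wallis H statistic. It is a simplified illustration under stated assumptions: tied values receive average ranks but no tie-correction factor is applied, the p-value uses the large-sample chi-square approximation (shown only for the three-group case, where the chi-square tail with 2 df is exp(-H/2)), and the data and function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # average of the 1-based ranks pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = avg
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), without tie correction."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Hypothetical data: three groups with no overlap at all.
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
p_approx = math.exp(-h_stat / 2)  # chi-square upper tail, valid for k - 1 = 2 df only
print(round(h_stat, 3), round(p_approx, 4))
```

Because the statistic depends only on ranks, a single extreme observation cannot move it much, which is the reason these tests are preferred for skewed data or data with outliers.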

Example 16.5 Drinks and Seat Location
Data: seat location and number of alcoholic drinks per week.
Students sitting in the back report drinking more.
Data appear skewed, and sample standard deviations differ.

Example 16.5 Drinks and Seat (cont)
The p-value is small => strong evidence that the population median numbers of drinks per week are not all equal.

Example 16.5 Drinks and Seat (cont)
The p-value is small => the null hypothesis of equal population medians can be rejected.

Two-Way ANOVA (CD Topic S4)
Two-way analysis of variance: used to examine how two categorical explanatory variables affect the mean of a quantitative response variable.
Main effect: the overall effect of a single explanatory variable.
Interaction: the effect of one explanatory variable on the response variable depends upon the specific value or level of the other explanatory variable.
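One way to make the idea of interaction concrete is a difference of differences in a 2 × 2 table of cell means. The sketch below uses entirely hypothetical cell means and labels (a "face"/"no face" treatment crossed with a two-level gender factor); it shows only the descriptive contrast, not the F-test for interaction.

```python
# Hypothetical cell means (for illustration only; not data from any example).
cell_means = {
    ("female", "face"): 33.0,
    ("female", "no face"): 28.0,
    ("male", "face"): 18.0,
    ("male", "no face"): 21.0,
}


def simple_effect(means, gender):
    """Effect of the treatment within one level of the other factor."""
    return means[(gender, "face")] - means[(gender, "no face")]


def interaction_contrast(means):
    """Difference between the two simple effects; 0 means no interaction."""
    return simple_effect(means, "female") - simple_effect(means, "male")


effect_female = simple_effect(cell_means, "female")
effect_male = simple_effect(cell_means, "male")
contrast = interaction_contrast(cell_means)
print(effect_female, effect_male, contrast)
```

In these made-up numbers the treatment effect is positive at one level and negative at the other, so the contrast is far from 0. Lines in an interaction plot would cross rather than run parallel, and a single overall "main effect" of the treatment would be misleading.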

Example 16.6 Happy Faces and Tips
Q: Does drawing a happy face on the restaurant bill increase the average tip to the server?
The effect of drawing a happy face depended on gender.
It was speculated that customers felt a happy face was not gender appropriate for males.

Example 16.7 You’ve Got to Have Heart
Response: weight gain in infants.
Explanatory: heartbeat status (yes or no) and initial weight (low, med, high).
Weight gain is generally greater for the heartbeat group, so there is a main effect for heartbeat status.
Approximately parallel lines => little/no interaction.

Example 16.6 Faces and Tips (cont)
Two-way ANOVA: three F-statistics are computed, one for each main effect and one for the interaction.
Since the interaction effect is significant => it is difficult to interpret the main effects.