Chapter 13: Comparing Several Means (One-Way ANOVA)


In Chapter 13:
13.1 Descriptive Statistics
13.2 The Problem of Multiple Comparisons
13.3 Analysis of Variance
13.4 Post Hoc Comparisons
13.5 The Equal Variance Assumption
13.6 Introduction to Nonparametric Tests

Illustrative Example: Data
Pets as moderators of a stress response. This chapter follows the analysis of data from a study in which the heart rates (bpm) of participants were monitored after exposure to a psychological stressor. Participants were randomized to one of three groups:
Group 1: monitored in the presence of a pet dog
Group 2: monitored in the presence of a human friend
Group 3: monitored with neither a dog nor a friend present


SPSS Data Table
Most computer programs require the data in two columns:
One column for the explanatory variable (group)
One column for the response variable (hrt_rate)

13.1 Descriptive Statistics
Data are described and explored before moving to inferential calculations. Summary statistics are computed by group, as in the sketch below.
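A minimal sketch in pandas of the two-column layout and the per-group summaries; the heart-rate values here are invented for illustration and are not the study's data:

```python
# Long-format layout: one row per participant, with the explanatory
# variable (group) in one column and the response (hrt_rate) in the
# other. Values are invented for illustration.
import pandas as pd

df = pd.DataFrame({
    "group":    ["pet"] * 3 + ["friend"] * 3 + ["control"] * 3,
    "hrt_rate": [69.8, 70.2, 75.5, 88.0, 91.5, 94.1, 80.3, 84.7, 79.2],
})

# Summary statistics by group
print(df.groupby("group")["hrt_rate"].agg(["count", "mean", "std"]))
```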

Exploring Group Differences
John W. Tukey (1915-2000) taught us the importance of exploratory data analysis (EDA). EDA techniques that apply here include:
Stemplots
Boxplots
Dotplots

Side-by-Side Stemplots

Side-by-Side Boxplots

§13.2 The Problem of Multiple Comparisons
Consider a comparison of three groups. There are three possible t tests:
(1) H0: μ1 = μ2 versus Ha: μ1 ≠ μ2
(2) H0: μ1 = μ3 versus Ha: μ1 ≠ μ3
(3) H0: μ2 = μ3 versus Ha: μ2 ≠ μ3
However, we do not perform separate t tests without modification; doing so would identify too many random differences.

Problem of Multiple Comparisons
Family-wise error rate = probability of at least one false rejection of H0.
Assume all three null hypotheses are true. At α = 0.05, Pr(retain all three H0s) = (1 − 0.05)³ = 0.857. Therefore, Pr(reject at least one) = 1 − 0.857 = 0.143. This is the family-wise error rate.
The family-wise error rate is much greater than the intended α. This is "The Problem of Multiple Comparisons."

Problem of Multiple Comparisons
The more comparisons you make, the greater the family-wise error rate becomes, as the computation below demonstrates.
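A few lines of Python reproduce the magnitude of the problem, computing 1 − (1 − α)^c for increasing numbers of independent comparisons c (independence is the same simplifying assumption the slide's calculation makes):

```python
# Family-wise error rate for c independent comparisons at alpha = 0.05:
# Pr(at least one false rejection) = 1 - (1 - alpha)**c
alpha = 0.05
for c in (1, 2, 3, 6, 10, 20):
    fwer = 1 - (1 - alpha) ** c
    print(f"{c:2d} comparisons: family-wise error rate = {fwer:.3f}")
# 3 comparisons give 0.143, as above; 20 comparisons give 0.642
```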

Mitigating the Problem of Multiple Comparisons
Two-step approach:
1. Test for overall significance using a technique called "Analysis of Variance."
2. Do post hoc comparisons on individual groups.

13.3 Analysis of Variance
One-way ANalysis Of VAriance (ANOVA):
Categorical explanatory variable
Quantitative response variable
Tests the group means for a significant difference
Statistical hypotheses:
H0: μ1 = μ2 = … = μk
Ha: at least one of the μi differs
Method: compare the variability between groups to the variability within groups (the F statistic).

Analysis of Variance Overview, cont.
The F in the F statistic stands for "Fisher" (R. A. Fisher, 1890-1962).

Variability Between Groups
Variability of the group means around the grand mean provides a "signal" of group differences. It is quantified by a statistic called the Mean Square Between (MSB).
Notation:
SSB ≡ sum of squares between
dfB ≡ degrees of freedom between
k ≡ number of groups
x̄ ≡ grand mean
x̄i ≡ mean of group i

Mean Square Between: Formula
Sum of Squares Between [Groups]: SSB = Σ ni(x̄i − x̄)²
Degrees of Freedom Between: dfB = k − 1
Mean Square Between: MSB = SSB / dfB

Mean Square Between: Graphically

Mean Square Between: Example

Variability Within Groups
Variability of the data points within groups quantifies the random "noise." It is quantified by a statistic called the Mean Square Within (MSW).
Notation:
SSW ≡ sum of squares within
dfW ≡ degrees of freedom within
N ≡ sample size, all groups combined
ni ≡ sample size, group i
si² ≡ variance of group i

Mean Square Within: Formula
Sum of Squares Within: SSW = Σ (ni − 1)si²
Degrees of Freedom Within: dfW = N − k
Mean Square Within: MSW = SSW / dfW
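Putting the MSB and MSW formulas together, a short sketch shows how the pieces are assembled from per-group summaries; the sample sizes, means, and standard deviations below are placeholders, not the study's actual values:

```python
# Placeholder per-group summaries (n_i, x-bar_i, s_i) for k = 3 groups
n    = [15, 15, 15]
mean = [73.5, 91.3, 82.5]
sd   = [9.0, 8.3, 9.2]

N = sum(n)                  # total sample size, all groups combined
k = len(n)                  # number of groups
grand_mean = sum(ni * mi for ni, mi in zip(n, mean)) / N

# SSB = sum of n_i * (x-bar_i - x-bar)^2; SSW = sum of (n_i - 1) * s_i^2
ssb = sum(ni * (mi - grand_mean) ** 2 for ni, mi in zip(n, mean))
ssw = sum((ni - 1) * si ** 2 for ni, si in zip(n, sd))

msb = ssb / (k - 1)         # df_B = k - 1
msw = ssw / (N - k)         # df_W = N - k
print(f"MSB = {msb:.1f}, MSW = {msw:.1f}, F = {msb / msw:.2f}")
```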

Mean Square Within: Graphically

Mean Square Within: Example

The F Statistic and ANOVA Table
The results are arranged to form an ANOVA table. The F statistic is the ratio of the MSB to the MSW:
Fstat = MSB / MSW
a "signal-to-noise" ratio.

Fstat and P-value
The Fstat has numerator and denominator degrees of freedom, df1 and df2 respectively (corresponding to dfB and dfW). Convert the Fstat to a P-value with a computer program or Table D. The P-value corresponds to the area in the right tail beyond the Fstat.

Table D ("F Table")
The F table has limited listings for df2. You often must round down to the next available df2 (rounding down is preferable because it gives a conservative estimate). Wedge the Fstat between listings to find the approximate P-value.
Here df1 = 2 and df2 = 42; Table D does not have df2 = 42, so use the next lowest listing, df2 = 30. Fstat = 14.08 falls beyond the critical value for P = 0.001.

Fstat and P-value P < 0.001

ANOVA Example (Summary)
Hypotheses: H0: μ1 = μ2 = μ3 vs. Ha: at least one of the μi differs
Statistics: Fstat = 14.08 with 2 and 42 degrees of freedom
P-value = .000021 (via SPSS), providing highly significant evidence against H0; conclude that the heart rates (an indicator of the effects of stress) differed among the groups
Significance level (optional): results are significant at α = .00005

Computation
Because of the complexity of the computations, ANOVA statistics are usually calculated by computer, as in the sketch below.
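For instance, SciPy's f_oneway performs the whole calculation in one call; the three samples below are invented stand-ins for the heart-rate data:

```python
# One-way ANOVA with SciPy; lists are illustrative, not the study data
from scipy.stats import f_oneway

pet     = [69.8, 70.2, 75.5, 72.1, 68.4]
friend  = [88.0, 91.5, 94.1, 90.2, 89.7]
control = [80.3, 84.7, 79.2, 83.1, 81.6]

fstat, pvalue = f_oneway(pet, friend, control)
print(f"Fstat = {fstat:.2f}, P = {pvalue:.4g}")
```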

ANOVA and the t Test (Optional)
ANOVA for two groups is equivalent to the equal variance (pooled) t test (§12.4). Both address H0: μ1 = μ2.
dfW = df for the t test = N − 2
MSW = s²pooled
Fstat = (tstat)²
F1,df,α = (tdf,1−α/2)²
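A quick numerical check of this equivalence, using arbitrary illustrative data:

```python
# For two groups, the pooled t test and one-way ANOVA agree:
# Fstat = tstat**2 and the P-values are identical.
from scipy.stats import f_oneway, ttest_ind

a = [4.1, 5.3, 6.0, 5.5, 4.8]
b = [6.2, 7.1, 6.8, 7.5, 6.4]

t, p_t = ttest_ind(a, b, equal_var=True)  # equal-variance (pooled) t test
f, p_f = f_oneway(a, b)                   # two-group ANOVA

print(f"t^2 = {t ** 2:.4f}  F = {f:.4f}")     # same value
print(f"P(t) = {p_t:.5f}  P(F) = {p_f:.5f}")  # same value
```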

13.4 Post Hoc Comparisons
The ANOVA Ha says "at least one population mean differs" but does not delineate which ones. Post hoc comparisons are pursued after rejection of the ANOVA H0 to delineate the differences.

SPSS Post Hoc Comparison Procedures Many post hoc comparison procedures exist. We cover the LSD and Bonferroni methods.

Least Significant Difference (LSD) Procedure
Do this after a significant ANOVA to protect against the problem of multiple comparisons.
A. Hypotheses. H0: μi = μj vs. Ha: μi ≠ μj for each pair of groups i and j
B. Test statistic. tstat = (x̄i − x̄j) / SE with dfW degrees of freedom, where SE = √(MSW · (1/ni + 1/nj))
C. P-value. Use a t table or software.

LSD Procedure: Example
For the "pets" illustrative data, we test H0: μ1 = μ2 by hand; the other tests will be done by computer.
A. Hypotheses. H0: μ1 = μ2 against Ha: μ1 ≠ μ2
B. Test statistic. Computed from the group means and the MSW, as in the formula above.
C. P-value. P = 0.0000039, highly significant evidence of a difference.

LSD Procedure, SPSS
Results for the illustrative "pets" data, testing H0: μ1 = μ2.

95% Confidence Interval, Mean Difference, LSD Method
(x̄i − x̄j) ± (tdfW,0.975)(SE), where SE = √(MSW · (1/ni + 1/nj))

95% CI, LSD Method, Example Comparing Group 1 to Group 2:

Bonferroni Procedure
The Bonferroni procedure multiplies the P-value from the LSD procedure by the number of post hoc comparisons, c.
A. Hypotheses. H0: μ1 = μ2 against Ha: μ1 ≠ μ2
B. Test statistic. Same as for the LSD method.
C. P-value. The LSD method produced P = .0000039 (two-tailed). Since there were three post hoc comparisons, PBonf = 3 × .0000039 = .000012.

Bonferroni Confidence Interval
Let c represent the number of post hoc comparisons. The Bonferroni interval is formed like the LSD interval but with the t quantile adjusted for the c comparisons:
(x̄i − x̄j) ± (tdfW,1−α/(2c))(SE), where SE = √(MSW · (1/ni + 1/nj))
For Group 1 versus Group 2, the interval appears in the SPSS output (next slide).

Bonferroni Procedure, SPSS
P-values from the Bonferroni method are higher, and its confidence intervals are broader, than those from the LSD method, reflecting its more conservative approach.
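As a rough sketch of the Bonferroni idea (note: this uses pairwise pooled t tests rather than the MSW-based LSD statistic SPSS reports, so the numbers will differ slightly; the data are invented):

```python
# All pairwise t tests, each P-value multiplied by the number of
# comparisons c and capped at 1 (the Bonferroni correction).
from itertools import combinations
from scipy.stats import ttest_ind

groups = {
    "pet":     [69.8, 70.2, 75.5, 72.1, 68.4],
    "friend":  [88.0, 91.5, 94.1, 90.2, 89.7],
    "control": [80.3, 84.7, 79.2, 83.1, 81.6],
}

pairs = list(combinations(groups, 2))
c = len(pairs)  # number of post hoc comparisons (3 for k = 3 groups)

for g1, g2 in pairs:
    t, p = ttest_ind(groups[g1], groups[g2], equal_var=True)
    print(f"{g1} vs {g2}: P = {p:.5f}, Bonferroni P = {min(1.0, c * p):.5f}")
```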

§13.5 The Equal Variance Assumption
Conditions for ANOVA:
1. Sampling independence
2. Normal sampling distributions of the means
3. Equal variance within the population groups
Let us focus on condition 3, since conditions 1 and 2 are covered elsewhere. Equal variance is called homoscedasticity (unequal variance = heteroscedasticity). Homoscedasticity allows us to pool the group variances to form the MSW.

Assessing "Equal Variance"
1. Graphical exploration. Compare spreads visually with side-by-side plots.
2. Descriptive statistics. If one group's standard deviation is more than twice that of another, be alert to possible heteroscedasticity.
3. Test the variances. A statistical test can be applied (next slide).

Levene's Test of Variances
A. Hypotheses. H0: σ1² = σ2² = … = σk² vs. Ha: at least one σi² differs
B. Test statistic. The test is performed by computer. The test statistic is a particular type of Fstat based on the rank-transformed deviations (see p. 283 for details).
C. P-value. The Fstat is converted to a P-value by the computational program. Interpretation of P is routine: a small P provides evidence against H0, suggesting heteroscedasticity.

Levene's Test, Example ("pets" data)
A. Hypotheses. H0: σ1² = σ2² = σ3² versus Ha: at least one σi² differs
B. Test statistic. SPSS output (below): Fstat = 0.059 with 2 and 42 df
C. P-value. P = 0.943. Very weak evidence against H0, so retain the assumption of homoscedasticity.
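Outside SPSS, the same test is available in SciPy (its levene function works from absolute deviations around a group center: mean, median, or trimmed mean); the data below are invented for illustration:

```python
# Levene's test of equal variances across three groups
from scipy.stats import levene

pet     = [69.8, 70.2, 75.5, 72.1, 68.4]
friend  = [88.0, 91.5, 94.1, 90.2, 89.7]
control = [80.3, 84.7, 79.2, 83.1, 81.6]

w, p = levene(pet, friend, control, center="mean")
print(f"W = {w:.3f}, P = {p:.3f}")  # small P suggests heteroscedasticity
```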

Analyzing Groups with Unequal Variance
Stay descriptive: use summary statistics and EDA methods to compare the groups.
Remove outliers, if appropriate (p. 287).
Mathematically transform the data to compensate for heteroscedasticity (e.g., a long right tail can be pulled in with a log transform).
Use robust nonparametric methods.

13.6 Intro to Nonparametric Methods
Many nonparametric procedures are based on rank-transformed data ("rank tests"). The Kruskal-Wallis test below is one example.

The Kruskal-Wallis Test
Let us explore the Kruskal-Wallis test as an example of a nonparametric test. The Kruskal-Wallis test is the nonparametric analogue of one-way ANOVA. It does not require the Normality or equal variance conditions for inference. It is based on rank-transformed data and asks whether the mean ranks of the groups differ significantly.

Kruskal-Wallis Test
The K-W hypotheses can be stated in terms of means or medians (depending on the assumptions made about the population shapes). Let us use the latter. Let Mi ≡ the median of population i, with k groups:
H0: M1 = M2 = … = Mk
Ha: at least one Mi differs

Kruskal-Wallis, Example Alcohol and income. Data from a survey on alcohol consumption and income are presented.

Kruskal-Wallis Test, Example We wish to test whether the means differ significantly but find graphical and hypothesis testing evidence that the population variances are unequal.

Kruskal-Wallis Test, Example, cont.
A. Hypotheses. H0: M1 = M2 = M3 = M4 = M5 vs. Ha: at least one Mi differs
B. Test statistic. Some computer programs report a chi-square statistic based on a Normal approximation. SPSS derives a chi-square statistic of 7.793 with 4 df (next slide).

Kruskal-Wallis Test, Example, cont.
C. P-value. P = 0.099, providing marginally significant evidence against H0.
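A minimal sketch of the same procedure in SciPy, with invented samples standing in for the five income groups:

```python
# Kruskal-Wallis H test: rank-based analogue of one-way ANOVA.
# H is referred to a chi-square distribution with k - 1 df.
from scipy.stats import kruskal

g1 = [2, 5, 8, 4, 7]
g2 = [6, 9, 3, 8, 5]
g3 = [10, 12, 7, 9, 11]
g4 = [4, 6, 5, 8, 7]
g5 = [9, 13, 10, 8, 12]

h, p = kruskal(g1, g2, g3, g4, g5)
print(f"H = {h:.3f} with {5 - 1} df, P = {p:.3f}")
```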