Chapter 13: Multiple Comparisons (for Explaining Psychological Statistics, 4th ed., by B. Cohen)


Experimentwise Alpha (α_EW)
– The probability that an experiment will produce any Type I errors among multiple tests (a variation is called familywise alpha).
– If, instead of an ANOVA, all of the possible t tests are performed for a multigroup experiment, α_EW will be greater than α_pc (alpha per comparison), the alpha used for each individual t test.
– Without protection, α_EW increases as the number of groups in a one-way ANOVA increases, because there are more opportunities to commit Type I errors.

How Large Can α_EW Get?
– First find the probability of making no Type I errors at all.
– For one test: usually α_pc = .05; therefore, p(no Type I error) = 1 − α_pc = 1 − .05 = .95.
– For j tests, if each test is independent of all the others, the probability of no errors requires multiplying j times: p(no Type I errors) = (1 − α_pc)^j.
– The probability of one or more Type I errors occurring among the multiple tests is the complement of the probability of no errors. Thus, α_EW = 1 − (1 − α_pc)^j.
– The possible pairwise comparisons following an ANOVA are not all mutually independent, but the above equation does give a reasonable indication of how large α_EW can get if α_pc is not adjusted when performing multiple comparisons.
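A minimal sketch of this calculation, assuming the tests are independent, so that α_EW = 1 − (1 − α_pc)^j, and that every possible pair among k groups is tested, which gives j = k(k − 1)/2 comparisons. The function names are illustrative and do not come from the text.

```python
from math import comb


def experimentwise_alpha(alpha_pc: float, j: int) -> float:
    """alpha_EW for j independent tests, each run at alpha_pc."""
    return 1.0 - (1.0 - alpha_pc) ** j


def n_pairwise(k: int) -> int:
    """Number of possible pairwise comparisons among k groups."""
    return comb(k, 2)


# Show how quickly alpha_EW grows when alpha_pc stays at .05.
for k in (3, 4, 5, 6):
    j = n_pairwise(k)
    print(f"k={k}  j={j}  alpha_EW={experimentwise_alpha(0.05, j):.3f}")
```

Because pairwise comparisons after an ANOVA are not fully independent, these figures are best read as an indication of the size of the problem, not as exact error rates.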

Fisher’s Protected t Tests
– “Protected” because the F for the ANOVA must be significant before proceeding.
– If we assume homogeneity of variance for the experiment, MS_W is the best estimate of the common error variance and can therefore be used in place of s²_p in all of the pairwise comparisons.
– Unequal ns: t = (X̄_i − X̄_j) / sqrt(MS_W (1/n_i + 1/n_j)).
– Equal ns: t = (X̄_i − X̄_j) / sqrt(2 MS_W / n).
Fisher’s Least Significant Difference (LSD) Test
– For equal ns, any two means that differ by more than LSD = t_crit sqrt(2 MS_W / n) are significantly different, where t_crit is based on df_W.
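As a sketch of the protected t and LSD calculations, assuming the standard forms t = (X̄_i − X̄_j) / sqrt(MS_W (1/n_i + 1/n_j)) and LSD = t_crit sqrt(2 MS_W / n): the critical t is passed in as a number looked up from a t table with df_W, and all function names and example values are hypothetical.

```python
from math import sqrt


def protected_t(mean_i: float, mean_j: float, ms_w: float,
                n_i: int, n_j: int) -> float:
    """t for one pairwise comparison, using MS_W as the error variance.

    Works for unequal ns; with n_i == n_j it reduces to the equal-n form.
    """
    return (mean_i - mean_j) / sqrt(ms_w * (1.0 / n_i + 1.0 / n_j))


def lsd_equal_n(t_crit: float, ms_w: float, n: int) -> float:
    """Least significant difference between two means for equal ns."""
    return t_crit * sqrt(2.0 * ms_w / n)


# Hypothetical numbers for illustration only (not taken from the slides).
t_crit, ms_w, n = 2.0, 8.0, 4
print(lsd_equal_n(t_crit, ms_w, n))          # smallest significant difference
print(protected_t(10.0, 6.0, ms_w, n, n))    # compare against t_crit
```

These tests are meant to be run only after the omnibus ANOVA has reached significance; the code does not enforce that step.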

Complete vs. Partial Null Hypothesis
– Complete null: hypothesizes the equality of all the population means.
– Fisher’s protected t tests control α_EW only with respect to experiments for which the complete H_0 is true, or when the study involves only three groups.
– Partial null: hypothesizes the equality of some, but not all, of the population means.
– If a partial null is true in a study with more than three populations, the ANOVA can attain significance because of one mean that differs from the others. That allows multiple comparisons among groups whose population means are equal, which in turn increases the chances of committing Type I errors.
– Thus, Fisher’s procedure is not protected against partial null hypotheses.

Tukey’s Honestly Significant Difference (HSD)
– Maintains good control over α_EW even for partial nulls. It is therefore more conservative than Fisher’s LSD test.
– Uses the studentized range statistic (q) to obtain its critical values.
– The q statistic is based on the fact that the greater the number of samples drawn from the same population, the larger the difference between the smallest and largest of the sample means tends to be.
– The size of all of the samples is assumed to be the same (n).
– The size of q increases as the number of groups (the “range”) increases, but decreases as the size (n) of those groups increases (in a manner similar to Student’s t distribution, which is why q is said to be “studentized”).

Tukey’s HSD Formula
– The formula for Tukey’s HSD is as follows: HSD = q_crit sqrt(MS_W / n), where n is the size of each sample, and q_crit is found from the number of groups and the number of degrees of freedom associated with MS_W (i.e., df_W).
– The number 2, which appears in the LSD formula, is missing from the HSD formula, because its value has been incorporated into the table of the q statistic (i.e., the original q values were multiplied by the square root of 2).
– The q statistic is based on equal sample sizes, but if the ns differ only slightly and accidentally, it is reasonable to use the harmonic mean of all of the ns to obtain the value of n for the HSD formula.
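A short sketch of the HSD calculation, assuming the standard form HSD = q_crit sqrt(MS_W / n). The critical q is supplied by the caller from a studentized range table; the function names and the numbers are hypothetical. The last lines illustrate the point about the missing 2: for two groups, q corresponds to t multiplied by the square root of 2, so HSD and LSD coincide.

```python
from math import sqrt
from statistics import harmonic_mean


def tukey_hsd(q_crit: float, ms_w: float, n: float) -> float:
    """Honestly significant difference for samples of (effective) size n."""
    return q_crit * sqrt(ms_w / n)


def effective_n(sample_sizes: list[int]) -> float:
    """Harmonic mean of the ns, for sizes that differ slightly by accident."""
    return harmonic_mean(sample_sizes)


# Hypothetical values for illustration only.
print(tukey_hsd(q_crit=3.0, ms_w=16.0, n=4))
print(effective_n([10, 15]))

# With q = t * sqrt(2), HSD reproduces LSD = t * sqrt(2 * MS_W / n).
t = 2.0
print(tukey_hsd(t * sqrt(2), ms_w=8.0, n=4), t * sqrt(2 * 8.0 / 4))
```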

Properties of Tukey’s HSD
– Advantages:
  1) α_EW is kept from rising above the value chosen for the test (usually .05), regardless of how many pairs are compared, or whether any partial H_0 is true.
  2) It is easy to find confidence intervals for the difference of any two population means.
– Disadvantages:
  1) α_EW usually turns out to be below the alpha chosen for the test (e.g., about .02 or .03 if .05 is chosen). Therefore, there are more powerful alternatives to HSD (i.e., less strict about α_EW) that are nonetheless sufficiently conservative.
  2) The sample sizes must be equal, or nearly equal.
– HSD does not require the one-way ANOVA to be significant in order to be “protected.”
– It is possible for a pair of means to be significantly different (i.e., to differ by more than HSD) even when the ANOVA is not significant.
– It is possible for the ANOVA to be significant and yet for no pair of means to differ significantly (i.e., by more than HSD).

Confidence Intervals for Tukey’s HSD Test
– HSD is considered a simultaneous rather than a sequential comparison method; CIs can be created easily for simultaneous methods.
– The CI for any pair of means in the study is given by the following formula: μ_i − μ_j = (X̄_i − X̄_j) ± q_crit sqrt(MS_W / n), where q_crit is based on the total number of groups in the study and n is the size of any one group.
– If the groups differ slightly and accidentally in size, the harmonic mean of the sample sizes can be used in place of n in the preceding formula.
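The interval can be sketched as follows, assuming the standard form (X̄_i − X̄_j) ± q_crit sqrt(MS_W / n), i.e. the observed difference plus or minus HSD. The critical q is taken as an input from a table, and the name `tukey_ci` and the example numbers are hypothetical.

```python
from math import sqrt


def tukey_ci(mean_i: float, mean_j: float, q_crit: float,
             ms_w: float, n: float) -> tuple[float, float]:
    """Simultaneous CI for mu_i - mu_j based on Tukey's HSD."""
    diff = mean_i - mean_j
    half_width = q_crit * sqrt(ms_w / n)  # this half-width is HSD itself
    return diff - half_width, diff + half_width


# Hypothetical values for illustration only.
low, high = tukey_ci(10.0, 6.0, q_crit=3.0, ms_w=16.0, n=4)
print(low, high)

# The pair differs significantly by HSD only if the interval excludes zero.
print("significant" if low > 0 or high < 0 else "not significant")
```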

Try This Example
– Does diet affect bicycle riding speed? The DV is time (in minutes) to ride 6 miles.
– Compare the results of the LSD and HSD tests. Which diets differ at the .05 level?
– LSD = 1.65; HSD = 2.02.
– Only the normal and vegetarian diets differ by HSD, but with LSD both the normal and organic diets differ significantly from the vegetarian diet.
– HSD is overly conservative with three groups.

Other Procedures for Post Hoc Pairwise Comparisons
– The main difference among the many tests that can be used for multiple pairwise comparisons following an ANOVA is how conservative they are (i.e., how strict in controlling Type I errors).
– The more conservative the test, the less powerful it is (i.e., it leads to a greater rate of Type II errors).
– Newman-Keuls Test (also known as the Student-Newman-Keuls or SNK test):
  Critical values come from the studentized range statistic.
  Arrange the means in order and use the range between any two of them to look up critical q values (instead of the number of groups).
  It is easier to find significance, so the test is more powerful than HSD.
  It used to be one of the most popular post hoc procedures, until it was discovered that its extra power came from allowing α_EW to rise above the level that was set for it. The greater the number of groups, the worse the problem.
  This test is no longer considered acceptable.

Other Procedures for Post Hoc Pairwise Comparisons (cont.)
– Modified LSD (Fisher-Hayter) Test:
  Requires significance of the one-way ANOVA to proceed.
  If the ANOVA is significant, calculate a modified HSD value by finding the critical q with the number of groups set to k − 1, rather than k.
  This test is acceptably conservative, more powerful than Tukey’s HSD, and easy to understand and calculate.
  Unfortunately, this test is not well known, and is therefore rarely used.
– Dunnett’s Test:
  Applies to the specific situation in which several groups are being compared to the same reference (e.g., control) group. It is the optimal test for that situation.
– REGWQ Test:
  Modifies Tukey’s test to be more powerful without allowing α_EW to rise above the set value.
  Like Dunnett’s test, it is readily available from SPSS, but rarely used.

Planned Comparisons: Bonferroni Correction
– Based on the following inequality: α_EW ≤ j · α_pc, where j equals the number of comparisons being planned, and α_pc is the α used for each comparison.
– The following formula can be used to adjust α_pc accordingly: α_pc = α_EW / j.
– Very conservative; not recommended for post hoc comparisons.
– Very simple and flexible procedure when used for planned comparisons.
– Works best when only a relatively small proportion of the possible tests in the study have been planned.
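A brief sketch of the adjustment, assuming the usual form α_pc = α_EW / j. The second function reuses the independent-tests formula 1 − (1 − α_pc)^j to show that the adjusted α_pc keeps α_EW at or below the target; the function names are illustrative only.

```python
def bonferroni_alpha_pc(alpha_ew: float, j: int) -> float:
    """Per-comparison alpha that keeps alpha_EW at or below alpha_ew."""
    return alpha_ew / j


def alpha_ew_if_independent(alpha_pc: float, j: int) -> float:
    """alpha_EW that results from j independent tests run at alpha_pc."""
    return 1.0 - (1.0 - alpha_pc) ** j


target = 0.05
for j in (2, 5, 10):
    alpha_pc = bonferroni_alpha_pc(target, j)
    achieved = alpha_ew_if_independent(alpha_pc, j)
    print(f"j={j}  alpha_pc={alpha_pc:.4f}  alpha_EW={achieved:.4f}")
```

The achieved α_EW falls slightly under the target, which is one way to see why the correction is described as conservative.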

Complex Comparisons
– You might want to compare the average of two groups to the average of three others: (μ_1 + μ_2)/2 − (μ_3 + μ_4 + μ_5)/3.
– The general format for a linear contrast is as follows: L = c_1 μ_1 + c_2 μ_2 + … + c_k μ_k = Σ c_i μ_i.
– Applied to this particular example, the linear contrast in the population looks like this: L = (1/2)μ_1 + (1/2)μ_2 − (1/3)μ_3 − (1/3)μ_4 − (1/3)μ_5.
– Note that the coefficients (cs) add up to zero, which must be the case for the comparison to be considered a linear contrast.
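As an illustration of how such coefficients can be built, the sketch below assigns +1/(size of first set) to one set of groups and −1/(size of second set) to the other, then checks that the coefficients sum to zero. The helper name `average_vs_average` and the zero-based group indices are assumptions made for the example; exact fractions are used so the sum is exactly zero.

```python
from fractions import Fraction


def average_vs_average(k: int, first: list[int],
                       second: list[int]) -> list[Fraction]:
    """Coefficients comparing the mean of `first` groups to the mean of `second`.

    Groups are identified by zero-based index; unused groups get 0.
    """
    coeffs = [Fraction(0)] * k
    for i in first:
        coeffs[i] = Fraction(1, len(first))
    for i in second:
        coeffs[i] = Fraction(-1, len(second))
    return coeffs


# Average of two groups versus average of three others, five groups in all.
coeffs = average_vs_average(5, first=[0, 1], second=[2, 3, 4])
print([str(c) for c in coeffs])
print("sums to zero:", sum(coeffs) == 0)
```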

Sample Estimates for Linear Contrasts
– A linear contrast can be reduced to a single difference score, though it may involve many group means.
– The estimate of a linear contrast from sample means looks like this: L = c_1 X̄_1 + c_2 X̄_2 + … + c_k X̄_k = Σ c_i X̄_i.
– When only two means are involved, it is called a pairwise comparison.
– When more than two means are involved, it is called a complex comparison.
– The sum of squares associated with a linear contrast involving equal-sized samples can be found from this formula: SS_contrast = n L² / Σ c_i².

Testing a Planned Contrast for Significance
– Because the contrast is based on only one df, MS_contrast = SS_contrast.
– Its error term is just MS_W from the ANOVA, so for equal ns, the contrast can be tested as: F = MS_contrast / MS_W = n L² / (Σ c_i² · MS_W).
– The critical F for a planned contrast is based on one df for the numerator, and df_W from the ANOVA for the denominator.
– Note that this critical F will be larger than the critical F for the one-way ANOVA, which has df_bet for the numerator. This is a potential disadvantage of testing a planned contrast.
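The contrast test can be sketched in a few lines, assuming equal ns and the standard forms L = Σ c_i X̄_i, SS_contrast = n L² / Σ c_i², and F = SS_contrast / MS_W. The group means, n, and MS_W below are hypothetical, and the function names are illustrative.

```python
def contrast_estimate(means: list[float], coeffs: list[float]) -> float:
    """L: the weighted sum of the sample means."""
    return sum(c * m for c, m in zip(coeffs, means))


def ss_contrast(means: list[float], coeffs: list[float], n: int) -> float:
    """Sum of squares for a contrast with equal sample sizes n."""
    big_l = contrast_estimate(means, coeffs)
    return n * big_l ** 2 / sum(c * c for c in coeffs)


def f_contrast(means: list[float], coeffs: list[float],
               n: int, ms_w: float) -> float:
    """F ratio for the contrast (1 df in the numerator, df_W in the denominator)."""
    return ss_contrast(means, coeffs, n) / ms_w


# Hypothetical data: three groups of n = 4, with MS_W = 8.
means, n, ms_w = [10.0, 6.0, 2.0], 4, 8.0

# Group 1 versus the average of groups 2 and 3, written two equivalent ways.
print(f_contrast(means, [1.0, -0.5, -0.5], n, ms_w))
print(f_contrast(means, [2.0, -1.0, -1.0], n, ms_w))
```

Both coefficient sets give the same F, because rescaling the coefficients changes L² and Σ c_i² by the same factor.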

Testing Post Hoc Complex Comparisons
– Scheffé’s test adjusts the critical F of the omnibus one-way ANOVA by multiplying it by df_bet: F_S = (k − 1) · F_crit(k − 1, N_T − k), where N_T = total number of subjects and k = number of groups.
– If the one-way ANOVA was not statistically significant, there is no point in finding a post hoc complex contrast; its F ratio will not exceed F_S as defined above.
– Advantages of Scheffé’s test:
  Adequately conservative for complex comparisons (but overly conservative for pairwise comparisons).
  Requires no special tables (just the F tables).
  Doesn’t require equal ns.
  Leads to easy creation of CIs.
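A minimal sketch of Scheffé’s adjustment, assuming F_S = (k − 1) · F_crit, where F_crit is the omnibus critical F looked up by the reader with k − 1 and N_T − k degrees of freedom. The function names are illustrative, and the value 3.0 used for F_crit is a placeholder, not a value from an F table.

```python
def scheffe_critical_f(f_crit_omnibus: float, k: int) -> float:
    """Scheffé's critical F: the omnibus critical F multiplied by df_bet = k - 1."""
    return (k - 1) * f_crit_omnibus


def scheffe_significant(f_contrast: float, f_crit_omnibus: float, k: int) -> bool:
    """True if a post hoc contrast's F exceeds Scheffé's critical F."""
    return f_contrast > scheffe_critical_f(f_crit_omnibus, k)


# Placeholder numbers for illustration only.
k, f_crit = 4, 3.0
print(scheffe_critical_f(f_crit, k))
print(scheffe_significant(12.0, f_crit, k))
print(scheffe_significant(5.0, f_crit, k))
```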

Orthogonal Contrasts
– Comparisons that represent mutually independent pieces of information.
– The maximum number of orthogonal contrasts in a set is related to the number of groups in the experiment.
– If there are k groups, there can be, at most, k − 1 mutually orthogonal contrasts.
– The maximum number of contrasts is the same as df_bet for the ANOVA. Each orthogonal contrast represents 1 df.
– The SSs for a complete set of orthogonal contrasts will add up to SS_bet.
– If the sum of the cross products of the coefficients of two linear contrasts is zero, those contrasts are orthogonal (for equal ns).
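Both properties can be checked numerically. The sketch below assumes equal ns, the standard SS_contrast = n L² / Σ c_i², and hypothetical group means; it tests two contrasts for orthogonality by summing the cross products of their coefficients, then confirms that their SSs add up to SS_bet. All names are illustrative.

```python
def are_orthogonal(c1: list[int], c2: list[int]) -> bool:
    """Two contrasts are orthogonal if their coefficient cross products sum to 0."""
    return sum(a * b for a, b in zip(c1, c2)) == 0


def ss_contrast(means: list[float], coeffs: list[int], n: int) -> float:
    """Sum of squares for a contrast with equal sample sizes n."""
    big_l = sum(c * m for c, m in zip(coeffs, means))
    return n * big_l ** 2 / sum(c * c for c in coeffs)


def ss_between(means: list[float], n: int) -> float:
    """Between-groups sum of squares for equal sample sizes n."""
    grand_mean = sum(means) / len(means)
    return n * sum((m - grand_mean) ** 2 for m in means)


# Hypothetical data: k = 3 groups of n = 4, so at most k - 1 = 2 orthogonal contrasts.
means, n = [10.0, 6.0, 2.0], 4
c1 = [1, -1, 0]   # group 1 versus group 2
c2 = [1, 1, -2]   # groups 1 and 2 together versus group 3

print(are_orthogonal(c1, c2))
print(ss_contrast(means, c1, n) + ss_contrast(means, c2, n), ss_between(means, n))
```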

Properties of Complex Comparisons
– The more closely your contrast matches the actual pattern of sample means, the larger the proportion of SS_bet that will be captured by SS_contrast.
– When the sample means do not fall in the pattern you expect, your F_contrast may be smaller than the ANOVA F.
– It is typically not acceptable to gain power through choosing a planned contrast if there is no possibility of losing power by making the wrong choice.
– Planned contrasts are analogous to one-tailed tests (in a two-group design): they are valid only when you predict the pattern of means before looking at your results.
– Post hoc complex comparisons are only allowed when the omnibus ANOVA is significant and Scheffé’s test is used.