Tests after a significant F


Tests after a significant F (Lecture 16)

1. The F test is only a preliminary analysis
2. Planned comparisons vs. post-hoc comparisons
3. What goes in the denominator of our test?
4. What happens to α when we make multiple comparisons among means?
5. t-test for planned comparisons
6. Tukey’s HSD test for post-hoc comparisons
7. Newman-Keuls test for post-hoc comparisons

An aside

We have a set of treatment means, e.g.: X1, X2, X3, X4, X5. From this set, we can form a number of pairs for comparisons of treatment means – here are just a few examples of the possible pairs: X1 vs. X2, X3 vs. X5, X2 vs. X4.

The F test is only a preliminary

You have a number of treatments (levels of the independent variable). Each treatment produces a treatment mean. The significant F tells you only that there is a difference among these means somewhere. Pairwise comparisons of the means are then necessary to pinpoint exactly where your effect is.

Planned comparisons

Planned comparisons are tests of differences among the treatment means that you designed your experiment to make possible. Is Xi different from Xj? We usually don’t do all possible comparisons among the entire set of treatment means. We choose a few specific comparisons on the basis of a theory of the behavior being studied.

Planned comparisons

Doing only a few comparisons is important for two reasons:

1. With α = .05, we would expect to reject H0 by mistake once in 20 tests. If you do all possible comparisons, you might do 20 tests for one experiment – so the odds are good that one of them will be “significant” by chance.

Planned comparisons

2. When you select a few comparisons out of the set of all possible comparisons, you put your theory in jeopardy. Such specific predictions (of differences between means) are unlikely to be correct by chance. If you put your theory in jeopardy and it survives, you have more confidence in your theory. If it doesn’t survive, at least you know the theory was wrong.

Planned comparisons

Because we only do a few comparisons when using planned comparisons, we do not need to “adjust α.” We do not correct for a higher probability of Type 1 error when doing a small number of planned comparisons.

The denominator of our t-test

Completely Randomized Design: a planned comparison uses an independent-groups t-test. The t-test requires an estimate of σ² for the denominator. Where should that estimate come from?

The denominator of our t-test

Previously, to estimate σ², we used a pooled variance based on the two sample variances (s²p). In the CRD ANOVA, each sample variance gives an independent estimate of σ², but the average of the sample variances gives a better estimate of σ².

The denominator of our t-test

In the ANOVA design, we have multiple samples, so we have multiple sample variances. We can use all of these sample variances to compute an estimate of σ². In fact, we have already computed such an estimate – the Mean Square Error produced for the ANOVA.

Planned comparisons t-test

tobt = (Xi – Xj) / √[MSE (1/ni + 1/nj)]

1. Choose the pair of means you want to test
2. Find MSE in the ANOVA summary table
3. Feed these values into the equation above
4. Evaluate tobt against tα (df = df for MSE)
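The steps above can be sketched in a few lines of Python (a minimal illustration, not part of the original lecture; the function name and the numbers plugged in are hypothetical):

```python
import math

def planned_t(mean_i, mean_j, mse, n_i, n_j):
    """Planned-comparison t, using the ANOVA's MSE as the pooled
    error-variance estimate; evaluate against t with df = df for MSE."""
    se = math.sqrt(mse * (1.0 / n_i + 1.0 / n_j))
    return (mean_i - mean_j) / se

# Hypothetical values for illustration: means 24.0 and 19.0,
# MSE = 25.0, 5 observations per group
t_obt = planned_t(24.0, 19.0, 25.0, 5, 5)
print(round(t_obt, 3))  # 5 / sqrt(25 * 0.4) ≈ 1.581
```

Note that MSE replaces the two-sample pooled variance s²p of the ordinary independent-groups t-test.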

Post-hoc tests

Post-hoc tests are also tests of differences among treatment means. Here, you decide which means you want to test post hoc – that is, after looking at the data. “Post hoc” means “after the fact” – after collecting and looking at the data. “A priori” comparisons are those decided on before data collection – differences predicted on the basis of theory.

Post-hoc tests

The problem for post-hoc tests is α. If you do one test with α = .05, the “long-run” probability of a Type 1 error is .05. But when you do many such comparisons, the probability of at least one Type 1 error is no longer .05. It is roughly (.05 × k), where k = # of comparisons.
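The α × k figure is an upper bound; for k independent tests the exact familywise error rate is 1 − (1 − α)^k. A short sketch comparing the two (illustration only, not from the original slides):

```python
alpha = 0.05
for k in (1, 5, 10, 20):
    exact = 1 - (1 - alpha) ** k      # P(at least one Type 1 error)
    bound = alpha * k                 # the slide's rough approximation
    print(f"k={k:2d}  exact={exact:.3f}  alpha*k={bound:.2f}")
```

For k = 20 the exact rate is about .64 – far above the nominal .05, which is precisely the problem post-hoc procedures are designed to control.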

Post-hoc tests

IMPORTANT POINT: Even if you do not explicitly do all possible comparisons among a set of means – if you just test the biggest difference among all the pairs of means – you have implicitly tested all the others. This means that the problem alluded to on the previous slide always exists for post-hoc tests.

Two types of post-hoc tests

1. Tukey’s Honestly Significant Difference (HSD)
- compares all possible pairs of means
- maintains the Type 1 error rate at α for the entire set of comparisons

Qobt = (Xi – Xj) / √(MSE/n)    (n = sample size)

Tukey’s HSD test

To evaluate Qobt, get Qcrit from the table. You will need:
- df = df for MSE
- k = # of samples in the experiment
- α

In Tukey’s HSD test, use the same Qcrit for all the comparisons in the experiment.
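Instead of a printed table, Qcrit can be obtained from SciPy’s studentized-range distribution (a sketch, not part of the original lecture; assumes SciPy ≥ 1.7, where scipy.stats.studentized_range is available):

```python
from scipy.stats import studentized_range

# Critical value of the studentized range: Q(alpha; k, df),
# where k = number of samples and df = df for MSE
q_crit = studentized_range.ppf(1 - 0.05, 3, 12)
print(round(q_crit, 2))  # tabled value for alpha = .05, k = 3, df = 12 is 3.77
```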

Tukey’s HSD test

NOTE: If sample sizes are not equal, use the harmonic mean of the sample sizes:

ñ = k / Σ(1/ni)    (k = # of samples)
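The harmonic mean is easy to compute directly (a small sketch; the function name is mine):

```python
def harmonic_n(sizes):
    """Harmonic mean of the sample sizes: n~ = k / sum(1/n_i)."""
    return len(sizes) / sum(1.0 / n for n in sizes)

print(harmonic_n([4, 4, 4, 4]))  # equal ns: 4.0
print(harmonic_n([3, 6]))        # 2 / (1/3 + 1/6) = 4.0
```

Note the harmonic mean is pulled toward the smaller samples: for ns of 3 and 6 it is 4, not 4.5.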

Two types of post-hoc tests

2. Newman-Keuls test

The N-K is like Tukey’s HSD in that it makes all possible comparisons among the sample means, and in that it uses the Q statistic. N-K differs from HSD in that Qcrit varies for different comparisons.

Newman-Keuls test

As with HSD, Qobt = (Xi – Xj) / √(MSE/n)    (n = sample size)

Evaluate Qobt against Qcrit obtained from the table, using df, α, and r. r may vary for different comparisons.

Newman-Keuls test

To find r for a given comparison, begin by ordering the sample means from highest to lowest. r is then the number of means spanned by the comparison you want to make, counting both endpoints. For example, with:

X1 = 77, X3 = 74, X2 = 72.5, X4 = 58.75

X1 vs. X4 spans all four means (r = 4); X1 vs. X3 spans two (r = 2); X1 vs. X2 spans three (r = 3).
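The “span” rule can be expressed as a short helper (a sketch; the function name is mine, and the means are those that appear in Example 1 below):

```python
def nk_r(means, i, j):
    """Newman-Keuls r: number of means spanned (inclusive) by the
    comparison of means i and j, after ordering all means high-to-low."""
    ordered = sorted(means, reverse=True)
    return abs(ordered.index(means[i]) - ordered.index(means[j])) + 1

means = [77, 72.5, 74, 58.75]   # X1, X2, X3, X4
print(nk_r(means, 0, 3))  # X1 vs. X4 spans all four means: 4
print(nk_r(means, 0, 2))  # X1 vs. X3 are adjacent once ordered: 2
print(nk_r(means, 0, 1))  # X1 vs. X2 spans X1, X3, X2: 3
```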

Example 1

1. Students taking summer school courses sometimes attempt to take more than one course at the same time and/or hold a full-time job on top of their course(s). To study the effect these situations may have on a student’s performance, four randomly selected students in each of four conditions are compared on their final exam grades in the statistics course they all took.

Example 1

a. Prior to data collection, it was predicted that students taking just one course (no job) would obtain a significantly higher mean final exam grade than students in the two-courses-plus-job group. It was also predicted that the mean final exam grade of students in the two-courses (no job) group would not differ significantly from that of students in the one-course-plus-job group. Perform the necessary analyses to determine whether these predictions are borne out by the data, using α = .01 for each prediction.

Example 1a

Notice these words: “Prior to data collection, it was predicted that …” That means this question calls for a planned comparison – so to answer the question, you do not have to do the ANOVA first, as you would if this were a post-hoc test. But you do need MSE.

Example 1

We have the raw data, so we can use the computational formulas learned last week:

CM = (ΣXi)² / n = 1129² / 16 = 79665.0625

SSTotal = ΣXi² – CM
SSTreat = Σ(Ti²/ni) – CM
SSE = SSTotal – SSTreat

Example 1

The data:

     S only   S + C.S.   S + Job   S + C.S. + J
       78        67         74          59
       69        72         63          62
       86        74         81          68
       75        77         78          46
Ti:   308       290        296         235
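The sums of squares that follow can be verified in a few lines of Python (a sketch reproducing the computational formulas on the raw data above; variable names are mine):

```python
groups = {
    "S only":        [78, 69, 86, 75],
    "S + C.S.":      [67, 72, 74, 77],
    "S + Job":       [74, 63, 81, 78],
    "S + C.S. + J":  [59, 62, 68, 46],
}
scores = [x for g in groups.values() for x in g]
n, p = len(scores), len(groups)               # 16 observations, 4 groups
cm = sum(scores) ** 2 / n                     # correction for the mean
ss_total = sum(x * x for x in scores) - cm
ss_treat = sum(sum(g) ** 2 / len(g) for g in groups.values()) - cm
ss_error = ss_total - ss_treat
mse = ss_error / (n - p)
# cm = 79665.0625, ss_treat = 786.1875, ss_error = 647.75, mse ≈ 53.979
print(cm, ss_treat, ss_error, round(mse, 3))
```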

Example 1

SSE = ΣXi² – Σ(Ti²/ni)

ΣXi² = 78² + 69² + … + 46² = 81099

Example 1

SSE = SSTotal – SSTreat
    = (ΣXi² – CM) – (Σ(Ti²/ni) – CM)
    = ΣXi² – Σ(Ti²/ni) – CM + CM
    = ΣXi² – Σ(Ti²/ni)

Example 1

SSE = 81099 – 80451.25 = 647.75

MSE = SSE/df = SSE/(n – p) = 647.75/12 = 53.979

Now we’re ready to make the comparisons…

Example 1

H0: μ1 = μ4
HA: μ1 > μ4

Rejection region: tobt > tn–p,α = t12,.01 = 2.681
Reject H0 if tobt > 2.681

Example 1

1 vs. 4: t = (77 – 58.75) / √(53.979/4 + 53.979/4) = 18.25/5.195 = 3.513. Reject H0. (The prediction is supported.)

Note the similarity of the denominator of this test to that of the independent-groups t-test. In both cases, we’re using measures of error variability averaged across all the samples available.
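As a numerical check of this comparison (a sketch, not from the original slides; MSE and n are the values computed above):

```python
import math

mse, n = 53.979, 4
se = math.sqrt(mse / n + mse / n)        # ≈ 5.195
t_obt = (77 - 58.75) / se
print(round(se, 3), round(t_obt, 3))     # 5.195 3.513
t_crit = 2.681                           # t(12, .01), one-tailed, from the table
print(t_obt > t_crit)                    # True -> reject H0
```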

Example 1

H0: μ2 = μ3
HA: μ2 ≠ μ3

Rejection region: |tobt| > tn–p,α/2 = t12,.005 = 3.055
Reject H0 if |tobt| > 3.055

Example 1

2 vs. 3: t = (72.5 – 74)/5.195 = –0.29. Do not reject H0.

Example 1

b. After data collection, it was decided to compare the mean final exam grades of the one-course (no job) and two-courses (no job) groups, and also to compare the mean grade of the one-course-plus-job group with that of the two-courses-plus-job group. Each comparison was to be tested with α = .05. Perform the appropriate procedures.

Example 1b

Notice these words: “After data collection, it was decided to compare…” This is a post-hoc test. That means we have to do the ANOVA first (by definition – the ANOVA is the “hoc” this test is “post”).

Example 1

H0: μ1 = μ2 = μ3 = μ4
HA: At least two means differ significantly

Rejection region: Fobt > F3,12,.05 = 3.49

SSTreat = 80451.25 – 79665.0625 = 786.1875
SSTotal = 81099 – 79665.0625 = 1433.9375

Example 1

Source      df   SS          MS         F
Treatment    3   786.1875    262.0625   4.85
Error       12   647.75       53.979
Total       15   1433.9375

Decision: Reject H0… now, do the post-hoc tests.
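The F in the summary table can be checked against SciPy’s one-way ANOVA routine (a sketch using the raw data from earlier in the example; assumes SciPy is installed):

```python
from scipy.stats import f_oneway

f_stat, p_value = f_oneway(
    [78, 69, 86, 75],   # S only
    [67, 72, 74, 77],   # S + C.S.
    [74, 63, 81, 78],   # S + Job
    [59, 62, 68, 46],   # S + C.S. + J
)
print(round(f_stat, 2), p_value < 0.05)  # F ≈ 4.85, p < .05 -> reject H0
```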

Example 1

Using the Newman-Keuls procedure, order the means:

X1 = 77, X3 = 74.0, X2 = 72.5, X4 = 58.75

Comparison 1: one course, no job vs. two courses, no job (X1 vs. X2) – spans X1, X3, X2, so r = 3
Comparison 2: one course plus job vs. two courses plus job (X3 vs. X4) – spans X3, X2, X4, so r = 3

Example 1

H0: μi = μj
HA: μi ≠ μj

Rejection region: Qobt > Qr,n–p,α = Q3,12,.05 = 3.77

Note: this Qcrit applies to both of the following tests, because both span 3 means.

Example 1

1 vs. 2: Qobt = (77 – 72.5) / √(53.979/4) = 4.5/3.674 = 1.22. Do not reject H0.

Example 1

3 vs. 4: Qobt = (74 – 58.75) / √(53.979/4) = 15.25/3.674 = 4.15. Reject H0.
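Both Q statistics can be reproduced directly (a sketch, not from the original slides; MSE and n are the values from the ANOVA above):

```python
import math

mse, n = 53.979, 4
se = math.sqrt(mse / n)       # ≈ 3.674
q_12 = (77 - 72.5) / se       # one vs. two courses, no job
q_34 = (74 - 58.75) / se      # one vs. two courses, plus job
print(round(q_12, 2), round(q_34, 2))  # 1.22 4.15; compare to Qcrit = 3.77
```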

Example 2a

H0: μ1 = μ2 = μ3
HA: At least two means differ significantly

Rejection region: Fobt > F2,87,.05 ≈ F2,60,.05 = 3.15

Note: We cannot use the computational formulas because we do not have raw data. So we’ll use the conceptual formulas.

Example 2

1. Compute XG (the Grand Mean). Since the ns are all equal:

XG = (10.5 + 18.0 + 21.1)/3 = 16.533

Example 2

SSTreat = Σ ni(Xi – XG)²
        = 30 [(10.5 – 16.53)² + (18.0 – 16.53)² + (21.1 – 16.53)²]
        = 1782.2

Now we can create the summary table…
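With only group means and ns available (no raw data), the conceptual formula is all we need; as a check (a sketch, not from the original slides):

```python
ns = [30, 30, 30]
means = [10.5, 18.0, 21.1]
# Grand mean as the weighted average of the group means
grand = sum(n * m for n, m in zip(ns, means)) / sum(ns)
ss_treat = sum(n * (m - grand) ** 2 for n, m in zip(ns, means))
print(round(grand, 3), round(ss_treat, 1))  # 16.533 1782.2
```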

Example 2

Source      df   SS        MS      F
Treatment    2   1782.2    891.1   32.7
Error       87   2370.75    27.25
Total       89   4152.95

(SSE = 87 × 27.25 = 2370.75; df Total = 2 + 87 = 89.)

Decision: Reject H0 – rotation skill differs significantly across the grades.

Example 2b

H0: μ8 = μ4
HA: μ8 > μ4

Rejection region: tobt > t87,.05 ≈ t29,.05 = 1.699
Reject H0 if tobt > 1.699

Example 2

8 vs. 4: t = (18.0 – 10.5) / √(27.25/30 + 27.25/30) = 7.5/1.348 = 5.56. Reject H0. (The prediction is supported.)