Psych 5500/6500 ANOVA: Single-Factor Independent Means Fall, 2008

ANOVA ANOVA is short for 'Analysis of Variance'; it is also known as the F test. It is applicable in a variety of experimental designs that involve comparing group means to determine whether or not the independent variable had an effect. We will begin with the 'single-factor independent means' ANOVA.

'Single-Factor' This is similar to the 't test for independent means'. As in the t test, there is one dependent variable and one independent variable (a 'factor' is an independent variable, so 'single-factor' means one IV). Unlike the t test, however, the F test allows two or more levels of the IV (i.e. two or more groups in the experiment).

Independent Means The ANOVA (F test) we will begin with assumes that the scores are independent across groups. It can therefore be used in a true experimental design, a quasi-experimental design, or a static group design (just like the t test for independent means).

Example 1: True Experimental Design Does Vitamin C affect the length of people's colds? Randomly divide subjects who are in their first day of a cold into 4 groups, then give each group a different level of Vitamin C. Measure how long it takes each person to get over their cold (DV). IV = Group 1: 0 mg, Group 2: 100 mg, Group 3: 500 mg, Group 4: 5000 mg.

Example 2: Quasi-Experimental Design Do three specific therapies differ in their ability to treat depression? Let subjects select the type of therapy they want (three different kinds are available), then measure their level of depression (DV) after 2 months of therapy. Note the IV is still manipulated by the experimenter; what makes the design quasi-experimental is that subjects are not randomly assigned to groups. IV = Group 1: Behavior Modification, Group 2: Gestalt, Group 3: Client-Centered.

Example 3: Static Group Design Does the size of a city affect the cancer rate in that city? Randomly select several small cities, several medium-sized cities, and several large cities, then measure their cancer rate per 10,000 citizens (DV). Note the IV is not manipulated; instead it is the criterion for assigning cities to groups. IV = Group 1: Small Cities, Group 2: Medium Cities, Group 3: Large Cities.

Relationship of t and F If you have two groups (i.e. two levels of your IV) you can use either a t test or an F test to analyze the results, and if you are testing a two-tailed hypothesis the two are equivalent (in fact F = t², as the sketch below illustrates). The F test cannot test a directional (one-tailed) hypothesis, so if you want to do a one-tailed test use t. The t test, in turn, cannot be used if you have more than two groups; then you must use F.
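
A minimal sketch of that equivalence, using made-up scores for two groups (the data and variable names are illustrative only):

```python
import numpy as np
from scipy import stats

# Made-up scores for two independent groups
g1 = np.array([4.0, 6.0, 5.0, 7.0, 6.0])
g2 = np.array([8.0, 7.0, 9.0, 10.0, 8.0])

t, p_t = stats.ttest_ind(g1, g2)   # two-tailed t test, pooled variance
F, p_F = stats.f_oneway(g1, g2)    # one-way ANOVA on the same two groups

print(t**2, F)    # F equals t squared
print(p_t, p_F)   # the two-tailed p values are identical
```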

Hypotheses If you have three levels of your IV then: H0: μ1 = μ2 = μ3 (one μ for each group). You are saying that all the populations in the experiment have the same mean, and that any differences among the group sample means are just due to chance. So what is HA? HA: μ1 ≠ μ2 ≠ μ3?

H0 and HA No, HA: μ1 ≠ μ2 ≠ μ3 doesn't work, as H0 and HA together must cover every possibility (e.g. what about μ1 = μ2 ≠ μ3?). So the correct answer is: H0: μ1 = μ2 = μ3; HA: at least one μ is different than the rest.

Test Statistic We need a statistic whose value we know if H0 is true. With the t test for independent groups the way we tested whether μ1 = μ2 was by using the difference between the two sample means; if H0 is true then we expect Ȳ1 − Ȳ2 ≈ 0. But what if we have three or more groups? If H0 is μ1 = μ2 = μ3, what would we expect if H0 is true?

We need a statistic that will measure whether several group means are about the same (H0 true and the means differing only due to chance) or whether they differ more than you would expect if only chance were involved (i.e. if the independent variable made the populations, and thus the group means, more different than random error alone would produce). What statistic do we know that measures how much a bunch of numbers (in this case group means, but that doesn't matter) differ from each other?

Analysis of Variance The essence of the F test for the single-factor independent groups ANOVA is that it examines the variance of the group means to determine whether the group means differ more than you would expect if H0 were true. The logic of how we will do that is based upon 'partitioning the Sums of Squares'.

Setup We will begin with a simple experiment with three groups and three scores in each group. Group 1: Y1, Y2, Y3. Group 2: Y4, Y5, Y6. Group 3: Y7, Y8, Y9.

Symbols Each score is symbolized as Y, the mean of group j as Ȳj, and the mean of all of the scores as ȲTotal; Nj is the number of scores in group j.

Group 1: Y1, Y2, Y3 (N1 = 3). Group 2: Y4, Y5, Y6 (N2 = 3). Group 3: Y7, Y8, Y9 (N3 = 3).

Partitioning the Deviation We begin by looking at how far each score is from the mean of all of the scores: (Y − ȲTotal). Then we break (partition) that distance into two pieces, how far the score is from the mean of its group and how far the mean of that group is from the total mean: (Y − ȲTotal) = (Y − ȲGroup) + (ȲGroup − ȲTotal).

Partitioning the SS Now we use those deviations to create three sums of squares: SS Total = SS WithinGroups + SS BetweenGroups. SS Total = Σ(Y − ȲTotal)² measures the squared deviations of the scores from the mean of all of the scores. SS Within = Σ(Y − ȲGroup)² measures the squared deviations of the scores from the mean of the group they are in. SS Between = Σ Nj(Ȳj − ȲTotal)² measures the squared deviations of the group means from the mean of all of the scores (weighted by group size).

SS's Sums of squares are a way of measuring variability. Consequently: SS Total reflects how much all of the scores differ from each other (if all the scores were the same they would all equal the total mean and the squared distances would all be zero). SS Within reflects how much the scores differ from other scores in the same group (if all the scores in a group were the same they would all equal the mean of their group and the squared distances would all be zero). SS Between reflects how much the group means differ from each other (if all of the group means were the same they would all equal the total mean and the squared distances would all be zero).
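
A minimal sketch of the partition, using made-up data for three groups (the numbers are illustrative, not the handout's example):

```python
import numpy as np

# Made-up data: three groups, three scores each
groups = [np.array([2.0, 3.0, 4.0]),
          np.array([5.0, 6.0, 7.0]),
          np.array([8.0, 9.0, 10.0])]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

# SS Total: squared deviations of every score from the grand mean
ss_total = ((all_scores - grand_mean) ** 2).sum()

# SS Within: squared deviations of each score from its own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# SS Between: squared deviations of each group mean from the grand
# mean, weighted by the number of scores in the group
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

print(ss_total, ss_within + ss_between)  # the two quantities are equal
```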

Example… Refer to the handout on partitioning SS.

Partitioning the df The total df for the experiment is the total number of scores − 1. We are also going to partition that: df Total = df Within + df Between, where df Between = (number of groups) − 1 and df Within = N Total − (number of groups).

What are d.f.s? (Discuss in class)…

Mean Squares A mean square is a Sum of Squares divided by its degrees of freedom.

What is a Mean Square? MS Total is not normally computed, as we won't be needing it, but to make a conceptual point let's look at it: MS Total = SS Total / df Total = Σ(Y − ȲTotal)² / (N Total − 1). Does that look familiar? It is the formula for estimating the variance of a population from a sample.

Error Variance The term error variance refers to the variance of the population from which the scores were originally sampled. The use of the term 'error' will be clearer next semester; it refers to the error of using the mean to predict each score. For now just think of error variance as the variance of the population from which we sampled. The ANOVA assumes that each population in the study has the same variance.

Mean Square Within Groups MS Within is an estimate of error variance based upon how much the scores differ inside of each group. Essentially, it uses each group to estimate error variance, then pools those different estimates into one good estimate. If the N’s of each group are the same then MS Within is literally the mean of the variance estimates from each group.

Mean Square Between Groups MS Between is an estimate of error variance based upon how much the group means differ from each other. Remember that the variance of the population affects the variance of the sample means (the standard error); it also works the other way: the variance of the sample means tells us something about the variance of the population from which those means were drawn (see the sketch below).
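
A minimal sketch of both ideas, assuming equal group sizes and made-up data: MS Within is the mean of the per-group variance estimates, and MS Between is the per-group N times the variance estimate of the group means:

```python
import numpy as np

# Made-up data: equal group sizes
groups = [np.array([4.0, 5.0, 6.0]),
          np.array([6.0, 7.0, 8.0]),
          np.array([7.0, 9.0, 11.0])]
n = len(groups[0])  # scores per group (equal n)

# MS Within: the mean (pool) of the per-group variance estimates
ms_within = np.mean([g.var(ddof=1) for g in groups])

# MS Between: n times the variance estimate of the group means.
# Since Var(sample mean) = sigma^2 / n, multiplying the variance of
# the sample means by n recovers an estimate of sigma^2 itself.
means = np.array([g.mean() for g in groups])
ms_between = n * means.var(ddof=1)

print(ms_within, ms_between)  # two estimates of error variance
```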

F MS Between and MS Within are two independent estimates of the same thing... error variance. The F statistic is their ratio: F = MS Between / MS Within.

Logic of the ANOVA When H0 is true: MS Between and MS Within are two independent estimates of error variance, so F should be close to 1. When H0 is false: the independent variable makes the group means differ more than they would if only chance were involved, which makes MS Between larger. The independent variable, however, does not affect the variance inside of each group, so MS Within is not affected and F tends to be greater than 1.

Example IV = type of therapy (control group vs. behavior modification vs. psychoanalysis vs. client-centered vs. gestalt). DV = level of depression after 2 months. H0: μC = μBM = μPA = μCC = μG. HA: at least one μ is different than the rest.

Data [Table of depression scores for the Control, Behavior Modification, Psychoanalysis, Client-Centered, and Gestalt groups, three scores per group; the individual scores did not survive transcription.]

Bar Graph [Bar graph of the five group means.]

Computations SS Total = 54.93, SS Within = 15.33, SS Between = 39.60. df Total = 14, df Within = 10, df Between = 4. MS Within = 1.53, MS Between = 9.90. F obt = 6.46. F critical (α = .05, df1 = 4, df2 = 10) = 3.48.
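
A sketch reproducing these computations from the SS and df values with scipy's F distribution (the variable names are mine; the printed values should match the slide up to rounding):

```python
from scipy import stats

ss_between, df_between = 39.60, 4
ss_within, df_within = 15.33, 10

ms_between = ss_between / df_between   # 9.90
ms_within = ss_within / df_within      # about 1.53
F_obt = ms_between / ms_within         # about 6.46

F_crit = stats.f.ppf(0.95, df_between, df_within)  # about 3.48
p = stats.f.sf(F_obt, df_between, df_within)       # about .0078

print(F_obt, F_crit, p)
```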

We reject H0 (F obt = 6.46 exceeds the critical value of 3.48).

p Values & Expressing Results The exact p value can be obtained either by performing the analysis in SPSS or by using my F tool and inputting the df's and the value of F obt. In this example p = .0078. The results are commonly expressed as F(df1, df2) = F obt, p = ... In our example: F(4,10) = 6.46, p = .0078.

Summary Table Another common way of expressing the results of the analysis is in a 'Summary Table':

Source   | SS    | df | MS   | F    | p
Between  | 39.60 |  4 | 9.90 | 6.46 | .0078
Within   | 15.33 | 10 | 1.53 |      |
Total    | 54.93 | 14 |      |      |

Decision H0: μC = μBM = μPA = μCC = μG. HA: at least one μ is different than the rest. We have rejected H0, which means we can conclude that at least one of the population means is different than the rest. It is tempting to say, for example, that the control group (which had a mean level of depression of 8) was more depressed than the Gestalt group (which had a mean level of depression of 3), but we cannot be that specific; we can only say that at least one group was different than the rest. We will learn in a future lecture how to make more specific tests among the group means.

Effect Size Cohen's d, which compares exactly two means (d = (Ȳ1 − Ȳ2) / est. σ), is not capable of expressing an overall effect of the independent variable when there are more than two groups, as we can't expect d to equal zero when H0 is true (i.e. when the independent variable has no effect).

Effect Size (cont.) For the overall effect of the independent variable we will have to turn to measures of association, which examine how much knowing which group a score is in helps us predict its value on the dependent variable. We will be covering that next semester. The measure we will be looking at then is called R², and to get a general idea of how it works...

[Two illustrative tables of three groups each. In the first, every group contains exactly the same set of scores: R² = 0 (knowing which group the score is in doesn't help at all). In the second, all the scores within each group are identical: R² = 1.00 (knowing which group the score is in allows us to know exactly what the score will be).] You can see that R² will always be between 0 and 1.

Computing R² R² = SS Between / SS Total. In our example: R² = 39.60 / 54.93 = .72.

Cohen's f GPower uses Cohen's f to express effect size. While R² is bounded between 0 and 1, f expands that out to between 0 and infinity: f = √(R² / (1 − R²)). The conventions for relating f to effect size are: .10 = small effect, .25 = medium effect, .40 = large effect.

Our Example f = √(.72 / .28) ≈ 1.60. A whopping big effect size (because we are in Oakley land rather than using real data).

GPower and Cohen's f GPower will compute f for you if you give it the N's of each group, the means of each group, and the standard deviation (the one you assume each group has in common), but the equation on the Cohen's f slide is much simpler. With this information you can then compute the power a priori or post hoc as you did with the t test. In our example power = 0.98 (ridiculously large for 3 scores per group, due to the big effect of the IV and the small amount of within-group variance, a byproduct of my making up the data).
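
A sketch of the whole effect-size chain with scipy, as an alternative to GPower. The noncentrality parameter λ = f²·N Total is an assumption on my part about the convention GPower follows, and the printed power may differ slightly from 0.98 through rounding:

```python
import numpy as np
from scipy import stats

ss_between, ss_total = 39.60, 54.93
df_between, df_within = 4, 10
n_total = 15  # 5 groups, 3 scores each

r_squared = ss_between / ss_total          # about .72
f = np.sqrt(r_squared / (1 - r_squared))   # Cohen's f, about 1.60

# Post hoc power: probability that F exceeds its critical value when
# F follows a noncentral F with noncentrality lambda = f^2 * N
lam = f**2 * n_total
F_crit = stats.f.ppf(0.95, df_between, df_within)
power = stats.ncf.sf(F_crit, df_between, df_within, lam)

print(r_squared, f, power)  # power comes out very high, near .98
```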

Assumptions of This Use of the F test 1. Independence of scores (important). 2. All the populations are normally distributed (the F test is 'robust' to violations of this assumption, particularly if N's are large and roughly equal across groups). 3. Homogeneity of variance (this assumption can be violated without serious consequences if you have roughly equal N's across the groups). Levene's test will evaluate this.

Homogeneity of Variance If N's are not equal, then the effects of violating this assumption are: 1. If larger sample sizes are associated with larger variances then alpha decreases (biased towards not making a Type 1 error, but at the expense of power). 2. If larger sample sizes are associated with smaller variances then alpha increases (biased towards making a Type 1 error). If this is the case then either select a smaller significance level (e.g. .01 rather than .05) or 'transform' your data.

Levene's Test Levene's test for the inequality of variances can be used to test whether a difference exists somewhere among the population variances. In our example with five types of therapy the hypotheses for Levene's test would be: H0: σ²1 = σ²2 = σ²3 = σ²4 = σ²5. Ha: at least one σ² is different than the rest.

Now that we know how ANOVA works it is easy to describe how Levene's test works. Let's begin with a simple study with just two groups of four scores each: Group 1 has Mean = 10 and S² = 0.5; Group 2 has Mean = 30 and S² = 10. [The individual scores did not survive transcription.]

The two groups have very different variance estimates (0.5 vs. 10); we want to test whether it is reasonable to conclude that the populations these groups came from have different variances (our assumption is about populations). H0: σ²1 = σ²2. Ha: σ²1 ≠ σ²2.

The first thing we do is transform the original scores into deviation scores that reflect how far each score is from the mean of its group. Then we take the absolute values of those deviations. [Table: the original scores alongside the absolute deviations from each group's mean.]

We have changed the data into a measure of how much each score differed from its group mean. In other words, each score is now a measure of variability. We can see in the absolute deviations that the original scores in group 1 did not vary much from their group mean (and thus didn't vary much from each other).

Levene's test simply performs an ANOVA (with only 2 groups you could use a t test) to see whether the mean of the absolute deviations differs significantly across the groups, which tells you whether the variances of the original scores differ significantly. H0: μ1 = μ2. Ha: μ1 ≠ μ2. In our example the mean absolute deviation is .5 in Group 1 and 3 in Group 2.

Result of the ANOVA on the absolute deviation scores: F(1,6) = 15.00, p = .008, so we can conclude that the difference in variances between the groups (original scores) was statistically significant.

Our Example For the five therapy groups, the means of the absolute deviation scores are: Control = .67, Behavior Modification = .89, Psychoanalysis = .44, Client-Centered = .67, Gestalt = 1.33. For the |deviations|, F(4,10) = 0.782, p = .562.

Levene's: Considerations The previous example would actually have a problem in real life: Levene's test is not accurate when there are very small N's in each group. The problem becomes negligible when you have 10 or more scores per group. Also, you should know that since Levene's procedure involves simply applying ANOVA to the absolute mean deviation scores, Levene's test itself assumes that the absolute deviation scores are normally distributed and that the groups have equal variances.

Levene's: Assumptions Levene's test is fairly robust to violations of the ANOVA assumptions. A study by Brown and Forsythe (1974), however, suggests that if the populations (of the original scores) are fat-tailed you should use the '10 percent trimmed mean' instead of the mean when finding the absolute deviations, and that if the populations are skewed you should find the absolute deviations from the median rather than from the mean. To find the 10 percent trimmed mean, first chop off the 10% highest scores and the 10% lowest scores, then find the mean of the remaining scores.

Levene's: Assumptions The problem with using either the 10% trimmed mean or the median is that SPSS will do Levene's for you only using the mean; if you want to use the 10 percent trimmed mean or the median you will have to do it yourself, in a similar fashion to what I did in demonstrating how Levene's works: find the correct deviations and then do an ANOVA on them (see the sketch below). Brown, M. B., & Forsythe, A. B. (1974). Robust tests for the equality of variances. Journal of the American Statistical Association, 69, 364-367.
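
A minimal sketch of "doing it yourself" on made-up two-group data: compute absolute deviations from the mean (or median), then run an ANOVA on them. Note that scipy's stats.levene offers the Brown-Forsythe variants directly through its center argument, so in Python the manual route may not be needed at all:

```python
import numpy as np
from scipy import stats

# Made-up scores for two groups (illustration only)
g1 = np.array([9.0, 10.0, 10.0, 11.0])
g2 = np.array([26.0, 29.0, 31.0, 34.0])

def abs_deviations(group, center="mean"):
    """Absolute deviations from the group's mean or median."""
    c = np.median(group) if center == "median" else group.mean()
    return np.abs(group - c)

# Levene's test by hand: ANOVA on the absolute deviation scores
F, p = stats.f_oneway(abs_deviations(g1), abs_deviations(g2))
print(F, p)

# The same test directly, including the Brown-Forsythe variants:
# center='median', or center='trimmed' with proportiontocut
print(stats.levene(g1, g2, center="mean"))
print(stats.levene(g1, g2, center="median"))
print(stats.levene(g1, g2, center="trimmed", proportiontocut=0.1))
```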