Psych 706: Stats II Class #2
Today's AGENDA
Homework #1 due via Blackboard by 5pm on 2/23
Significance Testing
T-Tests
Independent samples (2 groups, 1 condition)
Paired samples (1 group, 2 conditions)
Analysis of Variance (ANOVA)
One-way (3 or more groups, 1 condition)
A priori group contrasts
Post-hoc group tests
Significance testing
Test statistic = (variance explained by the model) / (variance not explained by the model) = effect / error
Set up the null (no effect!) and alternative (effect!) hypotheses
Assume that the null hypothesis is true
Fit a statistical model to the data representing the alternative hypothesis and see how well it fits (how much variance it accounts for)
Calculate the probability (p) of getting our model if the null hypothesis is true
If p < .05, we conclude that our model fits the data well and reject the null hypothesis (gaining confidence in the alternative!)
Significance testing
H0 = Null hypothesis = Samples come from the same population! Means should be similar in value!
Ha = Alternative hypothesis = Samples come from different populations! Means should be different!
Choices: REJECT or RETAIN
Errors WHEN TESTING HYPOTHESES
Type I error: rejecting the null hypothesis H0 when it is actually true
α-level typically set at .05
5 out of 100 times we think there is an effect when there isn't!
Type II error: failing to reject the null hypothesis H0 when it is actually false
β-level should be at most .20
20 out of 100 times we miss out on real effects!
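Not on the original slide, but a quick way to make the α-level concrete is to simulate it. The sketch below (Python with scipy, made-up numbers) draws two samples from the same population many times and counts how often the t-test comes out "significant" anyway; roughly 5 in 100 of those tests are Type I errors.

```python
# A minimal simulation (hypothetical, not course material) of the Type I error rate:
# when both samples truly come from the same population, about 5% of t-tests
# still come out "significant" at alpha = .05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05
n_experiments = 10_000
false_positives = 0

for _ in range(n_experiments):
    a = rng.normal(loc=0, scale=1, size=12)   # both groups drawn from the SAME population
    b = rng.normal(loc=0, scale=1, size=12)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_positives += 1

print(false_positives / n_experiments)   # close to 0.05, the Type I error rate
```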
SIGNIFICANCE TESTING
The goal is to minimize both Type I and Type II errors!
Correct for multiple comparisons when you're computing a lot of tests on the same sample(s)
Power (next class): the ability of a test to find an effect, given a particular sample size. This is something grant reviewers and dissertation committees want you to calculate before you begin a study so you don't waste your time
Effect size (next class): the magnitude (size) of an observed effect. If your group sample sizes are large enough, you can get p < .05, but is it a meaningful difference? Journals want you to calculate this to compare to findings of other papers
One-tailed or two-tailed TESTS?
Field argues that we should always do two-tailed tests: that way, if we get the opposite of what we predicted, we are still allowed to discuss the finding. Otherwise, technically we can't say anything about what happens at the other tail if we ran only a one-tailed test!
Significance testing
H0 = Null hypothesis = Samples come from the same population! Means should be similar in value!
Ha = Alternative hypothesis = Samples come from different populations! Means should be different!
Choices: REJECT or RETAIN
Calculate the difference in SAMPLE MEANS
Use standard error (SE) as a gauge of variability between sample means
SE small = samples in the distribution will have similar means!
SE large = samples in the distribution may have different means!
Distributions for 2 sample means: are they from the same or different populations?
Abelson (1997) and Cohen (1994) What was the take-home message of each article? This will be included on the in-class Exam #1!
Assumptions of t and F tests
Normality (Kolmogorov-Smirnov, Shapiro-Wilk)
Homogeneity of variance and homoscedasticity (Levene's test)
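The slides run these checks in SPSS; purely as an illustration, here is a minimal scipy sketch of the same two assumption checks on hypothetical group scores (the data and variable names are mine, not the course dataset).

```python
# Assumption checks sketched in Python instead of SPSS (hypothetical data).
import numpy as np
from scipy import stats

group1 = np.array([3, 1, 5, 4, 6, 4, 6, 2, 0, 5, 4, 5], dtype=float)
group2 = np.array([4, 3, 6, 6, 8, 5, 9, 2, 3, 6, 5, 7], dtype=float)

# Normality: Shapiro-Wilk test on each group (p > .05 means no evidence against normality)
print(stats.shapiro(group1))
print(stats.shapiro(group2))

# Homogeneity of variance: Levene's test across the groups
print(stats.levene(group1, group2))
```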
Independent samples t-test
Between-subjects
Two groups (e.g., Control, n = 12, vs. Experimental, n = 12)
One condition (e.g., Invisibility Cloak: Yes OR No)
Do the groups differ on the dependent variable (e.g., number of mischievous acts)?
Mean group difference divided by the standard error of both groups
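As an illustration only (scipy rather than the course's SPSS workflow, and made-up scores), an independent-samples t-test on two groups of 12:

```python
# Independent-samples t-test on hypothetical "mischievous acts" scores,
# no-cloak group vs. cloak group (n = 12 each).
import numpy as np
from scipy import stats

no_cloak = np.array([3, 1, 5, 4, 6, 4, 6, 2, 0, 5, 4, 5], dtype=float)
cloak    = np.array([4, 3, 6, 6, 8, 5, 9, 2, 3, 6, 5, 7], dtype=float)

t, p = stats.ttest_ind(no_cloak, cloak)   # pooled-variance test (equal variances assumed)
print(f"t({len(no_cloak) + len(cloak) - 2}) = {t:.2f}, p = {p:.3f}")

# Welch's version, for when homogeneity of variance is doubtful
print(stats.ttest_ind(no_cloak, cloak, equal_var=False))
```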
INDEPENDENT SAMPLES T-TEST (EQUAL SAMPLE SIZES)
t = [(X̄1 – X̄2) – (μ1 – μ2)] / √(s1²/n1 + s2²/n2)
(μ1 – μ2): we don't know these population values; under the null hypothesis we assume this difference is ZERO
Denominator: add together the standard error of both sample means (s/√n is the standard error formula for one sample mean; each term under the square root is that standard error squared)
INDEPENDENT SAMPLES T-TEST (UNEQUAL SAMPLE SIZES)
t = [(X̄1 – X̄2) – (μ1 – μ2)] / √(sp²/n1 + sp²/n2), where sp² = [(n1 – 1)s1² + (n2 – 1)s2²] / (n1 + n2 – 2)
(μ1 – μ2): we don't know these population values; under the null hypothesis we assume this difference is ZERO
Compute the pooled variance sp²: each sample's variance weighted by its degrees of freedom (sample size minus 1), divided by the total degrees of freedom
sp²/n takes the place of the (squared) standard error formula for one sample mean
INDEPENDENT SAMPLES T-TEST (UNEQUAL SAMPLE SIZES)
Only difference from the other formula: pooled variance
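To make the pooled variance concrete, the sketch below (hypothetical unequal-n data) computes it by hand and checks the resulting t against scipy's pooled t-test:

```python
# Pooled-variance t statistic by hand vs. scipy (hypothetical data, unequal n).
import numpy as np
from scipy import stats

group1 = np.array([3, 1, 5, 4, 6, 4, 6, 2, 0, 5], dtype=float)        # n1 = 10
group2 = np.array([4, 3, 6, 6, 8, 5, 9, 2, 3, 6, 5, 7], dtype=float)  # n2 = 12

n1, n2 = len(group1), len(group2)
s1_sq, s2_sq = group1.var(ddof=1), group2.var(ddof=1)

# Each sample's variance weighted by its df (n - 1), divided by the total df (n1 + n2 - 2)
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

t_by_hand = (group1.mean() - group2.mean()) / np.sqrt(sp_sq / n1 + sp_sq / n2)
t_scipy, p = stats.ttest_ind(group1, group2)   # default equal_var=True uses the pooled variance

print(t_by_hand, t_scipy, p)   # the two t values match
```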
Hypotheses
H0 = Control (No Invisibility Cloak) and Experimental (Yes Invisibility Cloak) groups are drawn from the same population and will not differ on number of mischievous acts
Ha = Control and Experimental groups are likely drawn from different populations and differ on number of mischievous acts
SPSS RESULTS
Degrees of freedom (df) = total sample size minus the number of groups
Since the 95% CI contains zero, retain the null hypothesis
Since p > .05, retain the null hypothesis
PAIRED SAMPLES T-TEST
Within-subjects
One group (Experimental, n = 12)
Two conditions (Invisibility Cloak: Yes AND No)
Do the conditions differ on the dependent variable (e.g., number of mischievous acts)?
Mean condition difference divided by the standard error of the difference between conditions
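Again only as an illustration (hypothetical scores, scipy rather than SPSS): the paired test below is equivalent to a one-sample t-test on the difference scores.

```python
# Paired-samples t-test: the same 12 subjects measured without and with the cloak.
import numpy as np
from scipy import stats

no_cloak = np.array([3, 1, 5, 4, 6, 4, 6, 2, 0, 5, 4, 5], dtype=float)
cloak    = np.array([4, 3, 6, 6, 8, 5, 9, 2, 3, 6, 5, 7], dtype=float)

t, p = stats.ttest_rel(no_cloak, cloak)     # df = n - 1 = 11
print(f"t({len(no_cloak) - 1}) = {t:.2f}, p = {p:.3f}")

# Equivalent by hand: one-sample t-test of the difference scores against zero
diff = cloak - no_cloak
print(stats.ttest_1samp(diff, 0.0))
```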
Hypotheses
H0 = Yes and No Invisibility Cloak conditions are drawn from the same population and will not differ on number of mischievous acts
Ha = Conditions are likely drawn from different populations and differ on number of mischievous acts
Paired samples t-test
t = D̄ / (sD / √N): the mean difference between conditions (D̄) divided by the standard error of that difference (sD / √N)
SPSS RESULTS
Degrees of freedom (df) = total sample size minus 1
Since the 95% CI does not contain zero, reject the null hypothesis
Since p < .05, reject the null hypothesis
HEY! Wait a second…
We ran the exact same experiment twice: once between-subjects (independent samples: subjects either wore the invisibility cloak or they didn't) and once within-subjects (paired samples: every subject wore the cloak once and went without it once)
Why did we obtain significant results for one analysis and not the other?
Within-subjects designs are more POWERFUL than between-subjects designs
When you use the same subjects multiple times, you can eliminate error variance due to particular individuals rather than the actual manipulation you are trying to test!
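A small simulation (assumed effect size and subject variability, not course data) shows the power difference: the very same correlated scores are analyzed once as if they were independent groups and once as paired conditions.

```python
# Why within-subjects designs are more powerful: stable individual differences add
# noise that the paired analysis removes but the independent analysis cannot.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_subjects, effect, n_sims = 12, 1.0, 2000
hits_between = hits_within = 0

for _ in range(n_sims):
    baseline = rng.normal(0, 2, n_subjects)              # each subject's personal level
    no_cloak = baseline + rng.normal(0, 1, n_subjects)
    cloak    = baseline + effect + rng.normal(0, 1, n_subjects)

    # Same scores treated as two separate groups (between-subjects analysis)...
    hits_between += stats.ttest_ind(no_cloak, cloak).pvalue < .05
    # ...versus as two conditions from the same subjects (within-subjects analysis)
    hits_within += stats.ttest_rel(no_cloak, cloak).pvalue < .05

print("power, between-subjects analysis:", hits_between / n_sims)
print("power, within-subjects analysis: ", hits_within / n_sims)   # noticeably higher
```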
Questions?
One-way ANOVA
Three or more groups
H0 = means are the same
Ha = means are different
The F test tells us whether the group means are different
We have to do planned contrasts and/or post-hoc tests to tell us WHICH means are different from each other
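As an illustration (hypothetical libido scores, scipy instead of SPSS), a one-way ANOVA for the three-group design used in the slides that follow:

```python
# One-way ANOVA with three groups (hypothetical Placebo / Low Dose / High Dose scores).
import numpy as np
from scipy import stats

placebo   = np.array([3, 2, 1, 1, 4], dtype=float)
low_dose  = np.array([5, 2, 4, 2, 3], dtype=float)
high_dose = np.array([7, 4, 5, 3, 6], dtype=float)

f, p = stats.f_oneway(placebo, low_dose, high_dose)
print(f"F(2, 12) = {f:.2f}, p = {p:.3f}")   # df model = 3 - 1, df residual = 15 - 3
```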
SS Total: difference between each score and the grand mean
SS Model: difference between each group mean and the grand mean
SS Residual: difference between each score and its group mean
(Figure: individual scores shown as pink, green, and blue points; the three group means as pink, green, and blue lines; the grand mean, the overall mean of all scores regardless of group, as a black line)
Sums of squares (SS)
PROBLEM: these are all summed scores, so the more observations you have, the larger these numbers will get!
SS Total: variance of all scores. Difference between each score and the grand mean; square each difference; sum up the values
SS Model: between-subjects variance your model can explain. Difference between each group mean and the grand mean; square each difference and multiply it by that group's sample size; sum up the values
SS Residual: error variance your model cannot explain. Difference between each score and the group mean it belongs to; square each difference and sum up the values (equivalently, each group's variance multiplied by its n − 1, summed across groups)
Mean squares (MS)
SOLUTION: divide each SS value by its degrees of freedom to scale it down!
SS Total: variance of all scores (difference between each score and the grand mean, squared and summed)
MS Model = SS Model / df Model: between-subjects variance your model can explain (difference between each group mean and the grand mean, squared, multiplied by that group's sample size, summed, then divided by df)
MS Residual = SS Residual / df Residual: error variance your model cannot explain (difference between each score and the group mean it belongs to, squared and summed, then divided by df)
F = MS Model / MS Residual
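To see where F comes from, this sketch (same hypothetical three-group data as above) computes the sums of squares, mean squares, and F by hand and checks the result against scipy:

```python
# SS Total, SS Model, SS Residual, MS, and F computed by hand (hypothetical data).
import numpy as np
from scipy import stats

groups = [
    np.array([3, 2, 1, 1, 4], dtype=float),   # placebo
    np.array([5, 2, 4, 2, 3], dtype=float),   # low dose
    np.array([7, 4, 5, 3, 6], dtype=float),   # high dose
]
all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()
k, n_total = len(groups), len(all_scores)

ss_total    = np.sum((all_scores - grand_mean) ** 2)                     # each score vs. grand mean
ss_model    = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups) # group means vs. grand mean
ss_residual = sum(np.sum((g - g.mean()) ** 2) for g in groups)           # each score vs. its group mean

df_model, df_residual = k - 1, n_total - k
ms_model, ms_residual = ss_model / df_model, ss_residual / df_residual
f = ms_model / ms_residual

print(ss_total, ss_model + ss_residual)        # SS Total = SS Model + SS Residual
print(f, stats.f_oneway(*groups).statistic)    # matches scipy's F
```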
Degrees of freedom
Degrees of freedom (df) are the number of values that are free to vary after estimating population parameters (means)
Independent samples t-test: df = total sample size (both groups combined) minus 2, because we estimate each group's mean
Paired samples t-test: df = sample size (one group) minus 1
One-way ANOVA:
SS Model (effect / between-subjects / explained variance): df = number of groups minus 1, because we estimate the grand mean
SS Residual (within-subjects / unexplained error variance): df = sample size minus the number of groups, because we estimate all of the group means
Types of follow-up tests
Multiple t-tests: inflate the Type I error rate. DON'T DO IT!
Orthogonal contrasts/comparisons: hypothesis driven, planned a priori
Post hoc tests: not planned (no hypothesis); compare all pairs of means
Planned contrasts
The variability explained by the Model (the experimental manipulation, SS Model) is due to participants being assigned to different groups
This variability can be broken down further to test specific hypotheses about which groups might differ
We break down the variance according to hypotheses made a priori (before the experiment)
It's like cutting up a cake, but instead it's variance!
RULES FOR CHOOSING CONTRASTS
Contrasts must be independent: they cannot interfere with each other (they must test unique hypotheses)
Each contrast should compare only 2 chunks of variation (why?)
You should always end up with one fewer contrast than the number of groups
Generating hypotheses
Example: testing the effects of Viagra on libido using three groups:
Placebo (sugar pill)
Low Dose Viagra
High Dose Viagra
The dependent variable (DV) was an objective measure of libido
What might we expect to happen?
How do we choose contrasts?
In most experiments we usually have one or more control groups
The logic of control groups dictates that we expect them to be different from the groups that we've manipulated
The first contrast will always compare any control groups (chunk 1) with any experimental conditions (chunk 2)
Planned contrast hypotheses
Hypothesis 1: People who take Viagra will have a higher libido than those who don't. Contrast 1: Placebo vs. (Low, High)
Hypothesis 2: People taking a high dose of Viagra will have a greater libido than those taking a low dose. Contrast 2: Low vs. High
2 planned contrasts: 3 groups
Coding planned contrasts: rules
Rule 1: Groups coded with positive weights are compared to groups coded with negative weights
Rule 2: The sum of weights for a comparison should be zero
Rule 3: If a group is not involved in a comparison, assign it a weight of zero
Rule 4: For a given contrast, the weight assigned to the group(s) in one chunk of variation should be equal to the number of groups in the opposite chunk of variation
Rule 5: If a group is singled out in a comparison, then that group should not be used in any subsequent contrasts
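A sketch of these rules applied to the Placebo / Low / High example (the weights follow the rules above; the group means are hypothetical):

```python
# Two planned contrasts for three groups, coded with weights, plus an orthogonality check.
group_means = {"placebo": 2.2, "low": 3.2, "high": 5.0}   # hypothetical means

# Contrast 1: control vs. the two experimental groups
# (weights sum to zero, Rule 2; each weight equals the number of groups in the opposite chunk, Rule 4)
c1 = {"placebo": -2, "low": 1, "high": 1}

# Contrast 2: low vs. high dose
# (placebo was singled out in contrast 1, so it gets a weight of zero, Rules 3 and 5)
c2 = {"placebo": 0, "low": -1, "high": 1}

# Independence check: the products of the two sets of weights sum to zero
print(sum(c1[g] * c2[g] for g in group_means))              # 0 -> contrasts are orthogonal

# The quantity each contrast tests: the weighted sum of the group means
for c in (c1, c2):
    print(sum(c[g] * group_means[g] for g in group_means))
```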
Defining contrasts using weights
SPSS Output
Post Hoc Tests
Compare each mean against all others
In general terms, they use a stricter criterion to accept an effect as significant, and hence control the familywise error rate
Simplest example is the Bonferroni method: divide the α-level by the number of comparisons (e.g., .05 / 3 comparisons ≈ .0167 per test)
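As an illustration (hypothetical p-values, not course output), the Bonferroni correction by hand and via statsmodels:

```python
# Bonferroni correction: by hand and with statsmodels' multipletests helper.
import numpy as np
from statsmodels.stats.multitest import multipletests

p_values = np.array([0.004, 0.030, 0.200])   # e.g., three pairwise comparisons
alpha = 0.05

# By hand: each test must beat alpha divided by the number of comparisons (.05 / 3)
print(p_values < alpha / len(p_values))      # [ True False False ]

# statsmodels equivalent, which also returns Bonferroni-adjusted p-values
reject, p_adjusted, _, _ = multipletests(p_values, alpha=alpha, method="bonferroni")
print(reject, p_adjusted)
```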
SPSS Post Hoc Test Recommendations
Assumptions met: REGWQ or Tukey HSD
Safe but conservative option: Bonferroni
Unequal sample sizes: Gabriel's (small n), Hochberg's GT2 (large n)
Unequal variances: Games-Howell
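For one of the recommended options, Tukey HSD, here is a minimal statsmodels sketch on the same hypothetical libido data (SPSS's post hoc menu would give comparable output):

```python
# Tukey HSD post hoc test with statsmodels (hypothetical Placebo / Low / High data).
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

scores = np.array([3, 2, 1, 1, 4,  5, 2, 4, 2, 3,  7, 4, 5, 3, 6], dtype=float)
labels = ["placebo"] * 5 + ["low"] * 5 + ["high"] * 5

result = pairwise_tukeyhsd(endog=scores, groups=labels, alpha=0.05)
print(result)   # pairwise mean differences, adjusted p-values, and reject decisions
```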
SPSS output
SPSS tutorial up next!