PSY2004 Research Methods PSY2005 Applied Research Methods Week Six.


1 PSY2004 Research Methods PSY2005 Applied Research Methods Week Six


3 Last Week General principles how it works, what it tells you etc. Today Extra bits and bobs assumptions, follow-on analyses, effect sizes

4 last week independent groups (IG) ANOVA can also use ANOVA with repeated measures (RM) designs e.g., same participants tested after three different time periods

5 [last week] comparison of drug treatments for offenders: 12-step programme, cognitive-behavioural, motivational intervention, standard care DV: no. of days drugs taken in a month

6 [this week] effect of drug treatments over time [ignoring, for now, different treatments] 1 month 6 months 12 months DV: no. of days drugs taken in a month

7 repeated measures ANOVA same general principle as IG ANOVA some computational differences [don’t worry about these for now] somewhat different ‘assumptions’ [more of this later] generally (all other things being equal) more sensitive, more ‘powerful’

8 [revision] why stats? variability in the data lots of different, random sources of variability

9 [revision] we’re trying to see if changes in the Independent Variable [treatment type] affect scores on the Dependent Variable [no. of days drugs taken]

10 [revision] lots of other things affect the DV individual differences time of day mood level of attention etc etc lots of random, unsystematic sources of variation, unrelated to the IV ‘noise’

11 [revision] trying to see any effect due to the Independent Variable on the Dependent Variable through the ‘noise’

12 multiple scores from the same participants allow you to identify & remove the ‘noise’ in the data due to individual differences
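The gain in sensitivity from removing individual differences can be seen in a quick sketch (scores are made up, and a paired vs independent t-test stands in for the RM vs IG contrast, purely for illustration):

```python
import numpy as np
from scipy import stats

# hypothetical before/after scores: large individual differences,
# but a small, consistent treatment effect for every participant
before = np.array([30, 12, 25, 18, 22, 15])
after = before - np.array([3, 2, 4, 3, 2, 3])  # consistent small drop

_, p_ind = stats.ttest_ind(before, after)  # treats scores as independent
_, p_rel = stats.ttest_rel(before, after)  # pairs scores, removing the 'noise'
print(p_ind, p_rel)
```

The paired test, which strips out the individual-differences ‘noise’, detects the effect easily; the independent-groups test, swamped by that noise, does not.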

13 assumptions characteristics our data should possess for statistical tests to be valid & accurate tests ‘assume’ such characteristics if data don’t meet the assumptions then the outcome of the test is likely to be wrong need to know what they are and check they are met

14 IG ANOVA assumptions independence of observations [no correlations across groups] normally distributed scores [for population, within groups] homogeneity of variance [HoV] [all groups come from populations with same variance] only difference (if any) in terms of means

15 Levene’s test for HoV what it doesn’t do: does not test for differences between means not relevant to our hypotheses about or interest in any differences between means

16 Levene’s test for HoV what it does: compares variances of the different groups to make inference about whether the variances of the populations are different

17 Levene’s test for HoV another NHST: H0 – population variances are all the same H1 – population variances not all the same if statistically significant [i.e., p < 0.05] reject H0 and so accept H1

18 Levene’s test for HoV a statistically significant Levene’s test suggests we should not assume HoV should ‘correct’ the F ratio to reduce likelihood of error two commonly used options [Welch, Brown-Forsythe] available within SPSS impacts on choice of post-hoc test as well [more on this later]

19 Levene’s test for HoV This test is unusual in that we don’t want it to be statistically significant! if Levene’s test NOT significant then you can usually assume you have HoV but be careful if your sample size is low your Levene’s test [like any other NHST] might not have enough ‘power’ to detect differences in variance
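In SPSS Levene’s test appears in the ANOVA output; as a sketch of the same idea, SciPy’s `levene` function returns the statistic and p value (the three groups’ scores below are made up for illustration):

```python
from scipy import stats

# hypothetical days-of-drug-use scores for the three treatment groups
twelve_step = [12, 10, 14, 9, 11, 13]
cog_behav = [8, 7, 9, 6, 10, 8]
standard = [15, 14, 16, 13, 17, 15]

stat, p = stats.levene(twelve_step, cog_behav, standard)
if p < 0.05:
    print("variances differ: consider a corrected F (Welch / Brown-Forsythe)")
else:
    print("no evidence against HoV")
```

Note the small samples here illustrate the power caveat above: with n = 6 per group, a real difference in variances could easily go undetected.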

20 ‘Robustness’ ANOVA generally quite ‘robust’ not-so-serious breaking of assumptions doesn’t increase the probability of error very much as long as sample sizes are equal

21 RM ANOVA assumptions ‘sphericity’ equality of variances for differences between conditions e.g., measures taken at 3 points in time (1 month, 6 months, 12 months) variance 1-6 = variance 6-12 = variance 1-12
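The quantities sphericity concerns can be computed directly: the variances of the pairwise difference scores between conditions (participant scores below are made up for illustration):

```python
import numpy as np

# hypothetical scores for 5 participants at 1, 6 and 12 months
m1 = np.array([20, 18, 22, 19, 21])
m6 = np.array([15, 14, 16, 13, 17])
m12 = np.array([10, 8, 11, 9, 12])

var_1_6 = (m1 - m6).var(ddof=1)
var_6_12 = (m6 - m12).var(ddof=1)
var_1_12 = (m1 - m12).var(ddof=1)
print(var_1_6, var_6_12, var_1_12)  # sphericity holds if roughly equal
```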

22 Mauchly’s test for sphericity another NHST H0 = variances of differences between conditions are equal H1 = variances of differences between conditions are not equal if statistically significant [i.e., p < 0.05] reject H0 and so accept H1

23 Mauchly’s test for sphericity a statistically significant Mauchly’s test suggests we should not assume sphericity increased probability of making a type I error adjust degrees of freedom to control this [various options – covered in lab class] impacts on choice of post-hoc test as well [more on this later]

24 Mauchly’s test for sphericity This test is unusual in that we don’t want it to be statistically significant! if Mauchly’s test NOT significant then you can usually assume you have sphericity but be careful if your sample size is low your Mauchly’s test [like any other NHST] might not have enough ‘power’ to detect differences in variance

25 Effect Size NHST tests limited to binary decision reject H0 or not evidence for an effect or not doesn’t distinguish between small and big effects

26 With a large enough sample i.e., high power or sensitivity even a very very small effect can reach statistical significance only rejecting hypothesis that effect is exactly equal to zero non-zero effect can be statistically significant but small enough to be trivial
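A quick simulation sketch of this point (the sample size, effect size and seed are arbitrary choices for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(0.00, 1.0, 100_000)
b = rng.normal(0.05, 1.0, 100_000)  # true effect: 0.05 SD — trivially small

t, p = stats.ttest_ind(a, b)
print(p)  # 'statistically significant', yet the effect is tiny
```

With 100,000 scores per group the test has enormous power, so even a 0.05 SD difference yields a minuscule p value; significance alone says nothing about whether the effect matters.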

27 interesting and important to also consider the size of any effect PSY1017 – correlation coefficients magnitude of coefficient strength of relationship – i.e., size of effect ‘benchmarks’: .1 = small, .3 = medium, .5 = large

28 want something similar for ANOVA a statistically significant F ratio suggests some kind of effect need measure of effect size to tell us how big (or small) the effect is

29 various different effect size measures we shall use partial eta-squared (partial η²) proportion of total variability in the sample ‘explained’ by the effect [i.e., the IV] (and which is not explained by any other variable in the analysis)

30 partial η² = between-groups variability / (between- & within-groups variability) NB – measure of between-group variability used as basis for estimate of population variance [see last week] but isn’t equal to it. NB 2 – for one-way ANOVA [only one IV] you don’t have to worry about the “and which is not explained by any other variable in the analysis” bit because there aren’t any other variables! But things are different when you’ve got 2 or more IVs [covered later in the module]
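For a one-way design that ratio can be computed from the sums of squares directly (group scores below are made up for illustration):

```python
import numpy as np

# hypothetical days-of-drug-use scores for three treatment groups
groups = [np.array([12, 10, 14, 9]),
          np.array([8, 7, 9, 6]),
          np.array([15, 14, 16, 13])]

grand_mean = np.concatenate(groups).mean()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# proportion of total variability 'explained' by the IV
partial_eta_sq = ss_between / (ss_between + ss_within)
print(round(partial_eta_sq, 2))
```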

31 F = between-groups variability / within-groups variability [both are estimates of the population variance]

32 why partial eta-squared (partial η²)? it’s probably the most widely used SPSS can compute it for us [this is a rubbish reason] rough ‘benchmarks’ exist: .01 = small; .06 = medium; .14 = large

33 [last week] ANOVA is another NHST probability of getting F-ratio (or more extreme) if H0 true If p < 0.05, reject H0 H0 – all the population means are the same and so accept H1 – the population means are not all the same NB this doesn’t say anything about which means are different to which other ones
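As a sketch, SciPy’s `f_oneway` runs the one-way IG ANOVA (group scores made up for illustration):

```python
from scipy import stats

# hypothetical scores for the three treatment groups
g1 = [12, 10, 14, 9]
g2 = [8, 7, 9, 6]
g3 = [15, 14, 16, 13]

f, p = stats.f_oneway(g1, g2, g3)
print(f, p)  # a significant F says *some* means differ, not which ones
```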

34 Follow-on analyses if you have very specific hypotheses only interested in a few specific comparisons can specify these in advance (a priori) Planned Comparisons (Contrasts) not following this route in lab classes see Field section 10.2.11 (3rd ed); 11.4 (4th ed)

35 if interested in all comparisons can’t specify those of interest in advance only looking after study completed (post hoc) multiple comparisons / post hoc tests 12-step vs cog-behavioural 12-step vs standard care cog-behavioural vs standard care

36 [last week] Moving from comparing two means to considering three has complicated matters we seem to face either increased type I error rate or increased type II error rate [lower power]

37 [last week] This [finally] is where ANOVA comes in It can help us detect any difference between our 3 (or more) group means without increasing type I error rate or reducing power

38 so, I hear you cry, if we end up doing the very multiple comparisons ANOVA was meant to spare us the pitfalls of, why bother with the ANOVA at all?

39 only bother with post hocs if you have a statistically significant ANOVA the ANOVA serves the purpose of detecting that there is some kind of effect in the first place [controlling Type I error rate & without cost in terms of power] corrected multiple comparisons (post hocs) [with their reduced power] might have missed that effect [ANOVA can do other stuff; more on that later in the module]

40 Many different post hoc procedures [18 available in SPSS] Bonferroni correction very popular easy to understand very good control over Type I error rate pays a heavy price in terms of power
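The Bonferroni idea itself is simple enough to sketch by hand: divide the alpha level by the number of comparisons (group scores below are made up for illustration):

```python
from itertools import combinations
from scipy import stats

# hypothetical scores for the three treatment groups
groups = {
    "12-step": [12, 10, 14, 9],
    "cog-behav": [8, 7, 9, 6],
    "standard care": [15, 14, 16, 13],
}

pairs = list(combinations(groups, 2))
alpha = 0.05 / len(pairs)  # Bonferroni: divide alpha by no. of comparisons

results = {}
for a, b in pairs:
    _, p = stats.ttest_ind(groups[a], groups[b])
    results[(a, b)] = p < alpha
print(results)
```

Note how the stricter per-comparison alpha (.05 / 3 ≈ .0167) is exactly where the power cost comes from: comparisons that would pass .05 can fail the corrected threshold.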

41 procedures vary in terms of particular balance they strike between Type I error & Type II error [power] those that favour controlling Type I error over concern with power [‘conservative’] those that relax Type I error control so as to maintain power [‘liberal’] ‘robustness’ accuracy when sample sizes unequal and/or small, when you haven’t got HoV or Sphericity

42 See Field 10.2.12 (3rd ed); 11.5 (4th ed) need to make a judgment nature of your data (assumptions, sample sizes) which type of error worst in your situation no single correct answer, but has to be justified Can’t take a ‘cookbook’ approach

