Statistical vs. Practical Significance
Statistical Significance A significant difference (i.e., rejecting the null hypothesis) means that the difference in group means is not likely due to sampling error. The problem is that statistically significant differences can be found even for very small differences if the sample size is large enough.
Statistical Significance In fact, the difference between almost any two sample means will be statistically significant if the samples are large enough. For example, men and women have different average IQs.
Practical Significance Practical (or clinical) significance asks the larger question about differences: “Are the differences between samples big enough to have real meaning?” Although men and women undoubtedly have different average IQs, is that difference large enough to have any practical implication?
Practical Significance The fifth edition of the APA (2001) Publication Manual states that it is almost always necessary to include some index of effect size or strength of relationship in your Results section. … The general principle to be followed … is to provide the reader not only with information about statistical significance but also with enough information to assess the magnitude of the observed effect or relationship (pp. 25–26).
Practical Significance Practical significance is generally assessed with some measure of effect size. Effect sizes can be grouped into two categories: difference measures and variance-accounted-for measures.
Difference effect sizes Simple mean difference: Suppose you design a control group experiment to evaluate the effects of CBT on depression. Experimental group posttest mean = 18, control group posttest mean = 16, difference = 18 − 16 = 2.
Difference effect sizes Problems with the simple mean difference: it depends on the scale of measurement and ignores normal variation in scores. For example, if the preceding example were based on a scale with an SD of 15 points, a 2-point difference would be small; treatment would affect depression by only .13 SDs. If the example were based on a scale with an SD of 1 point, a 2-point difference would be very large; treatment would have a 2 SD effect.
Difference effect sizes We can overcome this problem by standardizing the mean difference. One such measure, proposed by Gene Glass, is Δ = (Mean_tx − Mean_control) / SD_control. Other SDs may be used, such as a pooled (combined) SD from the Tx and control groups. If the group variances are equal, the pooled SD is appropriate; if the variances are unequal, the control group SD (Glass's Δ) is typically preferred.
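A minimal sketch of both standardizations in Python, using made-up posttest scores for the CBT example (the score lists are hypothetical, not data from the source):

```python
import statistics

# Hypothetical posttest scores for the CBT example (illustrative values only).
tx_scores = [20, 17, 19, 18, 16, 18, 19, 17]       # experimental (CBT) group
control_scores = [17, 15, 16, 18, 14, 16, 17, 15]  # control group

mean_tx = statistics.mean(tx_scores)
mean_control = statistics.mean(control_scores)

# Glass's delta: standardize the mean difference by the control group SD.
sd_control = statistics.stdev(control_scores)
glass_delta = (mean_tx - mean_control) / sd_control

# Cohen's d: standardize by a pooled SD from both groups.
n1, n2 = len(tx_scores), len(control_scores)
sd_tx = statistics.stdev(tx_scores)
sd_pooled = (((n1 - 1) * sd_tx**2 + (n2 - 1) * sd_control**2) / (n1 + n2 - 2)) ** 0.5
cohens_d = (mean_tx - mean_control) / sd_pooled

print(f"Glass's delta = {glass_delta:.2f}, Cohen's d = {cohens_d:.2f}")
```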
Difference effect sizes: Interpreting Cohen proposed general guidelines for interpreting these types of effect sizes: d = .2 is a small effect, d = .5 a medium effect, and d = .8 a large effect. These are guidelines only; you need to interpret effect sizes in the context of the research.
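A small helper that applies Cohen's benchmarks as cutoffs (treating values below .2 as "negligible" is an illustrative convention, not part of the source):

```python
def label_cohens_d(d: float) -> str:
    """Return Cohen's rough verbal label for a standardized mean difference."""
    d = abs(d)
    if d < 0.2:
        return "negligible"
    if d < 0.5:
        return "small"
    if d < 0.8:
        return "medium"
    return "large"

print(label_cohens_d(0.13))  # the 2-point difference on an SD = 15 scale
print(label_cohens_d(2.0))   # the 2-point difference on an SD = 1 scale
```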
Variance accounted for measures When comparing variables, variance-accounted-for measures tell us how well one variable predicts another, or the magnitude of the relation. R² is one such measure from correlational or regression analysis. Eta squared (η²) is often used in ANOVA as a measure of shared variance. Omega squared (ω²) is also used with ANOVA.
Variance accounted for measures: Interpreting Correlations can be judged as: r = .1 small, r = .3 moderate, r = .5 large. For measures based on a squared value (R², η², ω²), take the square root to get a correlation for interpretation.
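A minimal sketch of η² computed from ANOVA sums of squares, with hypothetical group scores (the data are invented for illustration):

```python
import statistics

# Hypothetical scores for three groups (illustrative values only).
groups = {
    "A": [4, 5, 6, 5, 4],
    "B": [6, 7, 8, 7, 6],
    "C": [5, 5, 6, 6, 5],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = statistics.mean(all_scores)

# Between-groups and total sums of squares.
ss_between = sum(len(scores) * (statistics.mean(scores) - grand_mean) ** 2
                 for scores in groups.values())
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# Eta squared = proportion of total variance accounted for by group membership.
eta_squared = ss_between / ss_total
# Taking the square root puts the value back on a correlation-like scale.
print(f"eta^2 = {eta_squared:.2f}, sqrt(eta^2) = {eta_squared ** 0.5:.2f}")
```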
Confidence Intervals Statistics are used to estimate true population values. When reporting a statistic (an estimate of a population value), it is useful to provide a range of values that is likely to include the true population value. Confidence intervals are calculated with the standard error of the statistic.
Confidence Intervals for means Confidence interval = mean ± z(SEM), where z = 1.96 for a 95% confidence interval (you can approximate with z = 2). If the mean of a sample = 100 and the SEM = 2, then the 95% confidence interval is 100 ± 1.96(2) = 100 ± 3.92, or roughly 100 ± 2(2) = 100 ± 4, which is close enough for government work.
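A minimal sketch of the calculation in Python, using the slide's numbers (the function name is just for illustration):

```python
def confidence_interval_95(mean: float, sem: float) -> tuple[float, float]:
    """95% confidence interval for a mean, using the normal approximation z = 1.96."""
    margin = 1.96 * sem
    return (mean - margin, mean + margin)

# The slide's example: sample mean = 100, SEM = 2.
low, high = confidence_interval_95(100, 2)
print(f"95% CI: ({low:.2f}, {high:.2f})")  # (96.08, 103.92), i.e., 100 ± 3.92
```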
Confidence Intervals Use confidence intervals when you want to show where the true value is likely to be, for example when reporting test results.