Effect size calculation in educational and behavioral research
Wim Van den Noortgate
'Power training'
Faculty of Psychology and Educational Sciences, K.U.Leuven
Leuven, October
1. Applications
2. A measure for each situation
3. Some specific topics
Applications
1. Expressing size of association
2. Comparing size of association
3. Determining power
Application 1: Expressing size of association

Example: μ_M = 8; μ_F = 8.5; σ_M = σ_F = 1.5
=> δ = (8.5 - 8) / 1.5 = 0.33

[Table: for a simulated sample, the group standard deviations, the two-sided p-value, and the observed effect size g (*); the numerical values were lost in extraction.]
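To make the estimation step concrete, here is a minimal simulation sketch (not from the slides): it draws one sample per group from normal populations with the example's means and SD, and computes the standardized mean difference g.

```python
import numpy as np

rng = np.random.default_rng(42)

# Population values from the example (assumed normal populations)
mu_m, mu_f, sigma = 8.0, 8.5, 1.5   # delta = (8.5 - 8) / 1.5 = 0.33
n = 20                              # per-group sample size (arbitrary choice)

# Draw one simulated study and compute the standardized mean difference g
males = rng.normal(mu_m, sigma, n)
females = rng.normal(mu_f, sigma, n)
s_pooled = np.sqrt(((n - 1) * males.var(ddof=1) +
                    (n - 1) * females.var(ddof=1)) / (2 * n - 2))
g = (females.mean() - males.mean()) / s_pooled
print(f"observed g = {g:.2f} (population delta = 0.33)")
```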
[Figure: the population effect size δ and the sampling distribution of the observed effect size g; details lost in extraction.]
Suppose the simulated data come from 10 studies that are exact replications of one another:

[Table: for each of the 10 simulated studies, the sample standard deviations, the p-value, the observed effect size g, and its confidence interval; only the confidence intervals survived extraction:]
Study  1: [ 0.17; 1.43]
Study  2: [-0.63; 0.62]
Study  3: [-0.06; 1.20]
Study  4: [-0.28; 0.98]
Study  5: [-0.57; 0.68]
Study  6: [ 0.04; 1.30]
Study  7: [-0.77; 0.49]
Study  8: [-0.61; 0.65]
Study  9: [-0.59; 0.67]
Study 10: [-0.51; 0.75]
Comparing individual study results with the combined result:

Individual studies:
1. Observed effect sizes may be negative, small, moderate, or large.
2. Confidence intervals are relatively wide.
3. Zero is often included in the confidence intervals.

Combined analysis:
4. The combined effect size is close to the population effect size.
5. The confidence interval is relatively narrow.
6. Zero is not included in the confidence interval.
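The combined result referred to above is typically an inverse-variance weighted average of the study effect sizes; a minimal fixed-effect sketch with hypothetical study values (not the simulated data above):

```python
import numpy as np

# Hypothetical study results: observed g and its standard error
g  = np.array([0.80, -0.01, 0.57, 0.35, 0.06])
se = np.array([0.32,  0.32, 0.32, 0.32, 0.32])

# Fixed-effect (inverse-variance) weighted combination
w = 1 / se**2
g_combined = np.sum(w * g) / np.sum(w)
se_combined = np.sqrt(1 / np.sum(w))
ci = (g_combined - 1.96 * se_combined, g_combined + 1.96 * se_combined)
print(f"combined g = {g_combined:.2f}, CI = [{ci[0]:.2f}; {ci[1]:.2f}]")
```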
Meta-analysis: Gene Glass (Educational Researcher, 1976, p.3): “Meta-analysis refers to the analysis of analyses”
Application 2: Comparing the size of association

Example: Raudenbush & Bryk (2002)

[Table: for each study, the number of weeks of previous contact, the observed effect size g, and its standard error SE; the numerical values were lost in extraction. Studies: Rosenthal et al. (1974); Conn et al. (1968); Jose & Cody (1971); Pellegrini & Hicks (1972); Evans & Rosenthal (1969); Fielder et al. (1971); Claiborn (1969); Kester & Letchworth (1972); Maxwell (1970); Carter (1970); Flowers (1966); Keshock (1970); Henrickson (1970); Fine (1972); Greiger (1970); Rosenthal & Jacobson (1968); Fleming & Anttonen (1971); Ginsburg (1970).]
Results of the meta-analysis:
1. The variation between the observed effect sizes is larger than could be expected on the basis of sampling variance alone: the population effect size probably is not the same for all studies.
2. The effect depends on the amount of previous contact.
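The slides do not name the heterogeneity test behind the first result; a standard choice is Cochran's Q:

```latex
Q = \sum_{i=1}^{k} w_i \,(g_i - \bar{g})^2,
\qquad w_i = \frac{1}{SE_i^2},
\qquad \bar{g} = \frac{\sum_i w_i g_i}{\sum_i w_i}
```

Under the null hypothesis of one common population effect size, Q approximately follows a chi-square distribution with k - 1 degrees of freedom.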
Application 3: Power calculations

Power = the probability of rejecting H0

Power depends on:
- δ
- α
- N
'Powerful' questions:
1. Suppose the population effect size is small (δ = 0.20): how large should my sample size (N) be to have a high probability (say, .80) of concluding that there is an effect (power), when testing at an α-level of .05?
2. I did not find an effect, but perhaps the chance of finding an effect (power) with such a small sample was low anyway? (Take N and α from the study, and assume for instance that δ = g.)
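Both questions can be answered with off-the-shelf power routines; a sketch using statsmodels (our package choice, not the slides') for a two-sided independent-samples t-test:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Question 1: required per-group N for delta = 0.20, power = .80, alpha = .05
n_required = analysis.solve_power(effect_size=0.20, alpha=0.05, power=0.80,
                                  alternative='two-sided')
print(f"required n per group: {n_required:.0f}")   # roughly 394 per group

# Question 2: power achieved with the study's own N (say 25 per group),
# assuming the population effect size equals the observed g of 0.20
achieved = analysis.solve_power(effect_size=0.20, nobs1=25, alpha=0.05,
                                alternative='two-sided')
print(f"achieved power with n = 25: {achieved:.2f}")  # quite low
```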
A measure for each situation
Dichotomous independent - dichotomous dependent variable

Example: passing a predictive test versus passing the final exam

                    Final exam
Predictive test   Pass          Fail         Total
Pass              130 (87 %)    20 (13 %)    150 (100 %)
Fail               30 (60 %)    20 (40 %)     50 (100 %)

1. Risk difference: .87 - .60 = .27
2. Relative risk: .87 / .60 = 1.45
3. Phi: (130 x 20 - 20 x 30) / sqrt(150 x 50 x 160 x 40) = .29
4. Odds ratio: (130 x 20) / (20 x 30) = 4.33
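The same four measures computed directly from the count matrix (a sketch; the pass/fail cell labels are our reading of the slide):

```python
import numpy as np

# 2x2 table: rows = predictive test (pass, fail), cols = final exam (pass, fail)
table = np.array([[130, 20],
                  [ 30, 20]])
(a, b), (c, d) = table
n1, n2 = a + b, c + d            # row totals: 150 and 50

risk_diff  = a / n1 - c / n2                                         # .87 - .60 = .27
rel_risk   = (a / n1) / (c / n2)                                     # 1.45
phi        = (a * d - b * c) / np.sqrt(n1 * n2 * (a + c) * (b + d))  # .29
odds_ratio = (a * d) / (b * c)                                       # 4.33
print(risk_diff, rel_risk, phi, odds_ratio)
```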
Dichotomous independent - continuous dependent variable
1. Independent groups, homogeneous variance
2. Independent groups, heterogeneous variance
3. Repeated measures (one group)
4. Repeated measures (independent groups)
5. Nonparametric measures
6. r_pb (point-biserial correlation)
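The formulas that accompanied this list did not survive extraction; the standard definitions they presumably showed are:

```latex
\text{1. } g = \frac{\bar{Y}_E - \bar{Y}_C}{s_{pooled}},
\qquad s_{pooled} = \sqrt{\frac{(n_E - 1)s_E^2 + (n_C - 1)s_C^2}{n_E + n_C - 2}}

\text{2. } \Delta = \frac{\bar{Y}_E - \bar{Y}_C}{s_C}
\qquad \text{(standardize by one group's SD)}

\text{3. } g_{gain} = \frac{\bar{Y}_{post} - \bar{Y}_{pre}}{s_{pre}}

\text{4. } g = \frac{(\bar{Y}_{post,E} - \bar{Y}_{pre,E}) - (\bar{Y}_{post,C} - \bar{Y}_{pre,C})}{s_{pre,pooled}}
```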
Nominal independent - nominal dependent variable
1. Contingency measures, e.g.:
   a. Pearson's contingency coefficient
   b. Cramér's V
   c. Phi coefficient
2. Goodman-Kruskal tau
3. Uncertainty coefficient
4. Cohen's kappa
Example: three tables with the same cell counts, which contingency measures cannot distinguish:

              Illness
              Better   Same   Worse
Experimental      10      5       2
Control            4      7       3

              Illness
              Better   Same   Worse
Control           10      5       2
Experimental       4      7       3

              Illness
              Same   Better   Worse
Experimental     10       5       2
Control           4       7       3
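A sketch showing that a count-based measure such as Cramér's V takes the same value for all three tables, because it sees only the counts, not the labels or the category order:

```python
import numpy as np
from scipy.stats import chi2_contingency

# All three slides show the same count matrix; only row/column labels differ
table = np.array([[10, 5, 2],
                  [ 4, 7, 3]])

chi2, p, dof, expected = chi2_contingency(table, correction=False)
n = table.sum()
cramers_v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))
print(f"Cramér's V = {cramers_v:.3f}")  # same value for every relabeled table
```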
Nominal independent - continuous dependent variable
1. ANOVA: multiple g's
2. η²
3. Intraclass correlation (ICC)
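The slide lists these measures without formulas; their standard definitions (our addition) are:

```latex
\eta^2 = \frac{SS_{between}}{SS_{total}},
\qquad
\rho_I = \frac{\sigma^2_{between}}{\sigma^2_{between} + \sigma^2_{within}}
```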
Continuous independent - continuous dependent variable
1. r
2. Non-normal data: Spearman's ρ
3. Ordinal data: Kendall's τ, Somers' D, gamma coefficient
4. Weighted kappa
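A sketch computing the first three measures with scipy on hypothetical paired scores:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 0.5 * x + rng.normal(size=100)   # hypothetical paired scores

print(stats.pearsonr(x, y))    # r: linear association
print(stats.spearmanr(x, y))   # Spearman rho: rank-based, robust to non-normality
print(stats.kendalltau(x, y))  # Kendall tau: ordinal concordance
```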
More complex situations
1. Two or more independent variables
   a. Regression models
1. Y continuous: Y_i = a + b X_i + e_i
   a. X continuous: b is estimated by the least-squares slope
   b. X dichotomous (1 = experimental, 0 = control): b is estimated by the mean difference between the groups
2. Y dichotomous: logit(P(Y = 1)) = a + bX
   If X is dichotomous, b is estimated by the log odds ratio
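A sketch with simulated data (variable names are ours) showing that the OLS slope for a dummy-coded X equals the mean difference, and that the logistic slope is the log odds ratio:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = np.repeat([0, 1], 50)                      # 0 = control, 1 = experimental
y = 8 + 0.5 * x + rng.normal(0, 1.5, 100)      # continuous outcome

ols = sm.OLS(y, sm.add_constant(x)).fit()
print(ols.params[1])                           # slope for X
print(y[x == 1].mean() - y[x == 0].mean())     # identical to the mean difference

y_bin = rng.binomial(1, np.where(x == 1, 0.7, 0.5))   # dichotomous outcome
logit = sm.Logit(y_bin, sm.add_constant(x)).fit(disp=0)
print(logit.params[1])                         # log odds ratio for X
```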
More complex situations
1. Two or more independent variables
   a. Regression models
   b. Stratification
   c. Contrast analyses in factorial designs (Rosenthal, Rosnow & Rubin, 2000)
Example: a 3 x 4 factorial design: dose of medication (100, 50, or 0 mg) crossed with number of treatments weekly (0, 1, 2, or 3); N = 120 (12 cells x 10)

[Table: cell means and marginal means per dose and number of weekly treatments; the numerical values were lost in extraction.]

[ANOVA table: Source (Between: Treatments, Dose, Treatment x Dose; Within; Total) with SS, df, MS, F, and p; the numerical values were lost in extraction.]
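For contrast analyses of such a design, Rosenthal, Rosnow and Rubin work with correlational effect sizes; one commonly cited form (our reconstruction; the slide's own numbers are lost) is:

```latex
r_{contrast} = \sqrt{\frac{F_{contrast}}{F_{contrast} + df_{within}}}
```

where F_contrast is the 1-df F-value for the planned contrast and df_within comes from the ANOVA table above.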
More complex situations
1. Two or more independent variables
   a. Regression models
   b. Stratification
   c. Contrast analyses in factorial designs
2. Multilevel models
3. Two or more dependent variables
4. Single-case studies
Single-case designs: regression models

Y_i = b_0 + b_1 phase_i + e_i

Y_i = b_0 + b_1 time_i + b_2 phase_i + b_3 (time_i x phase_i) + e_i
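A sketch fitting the second model to a hypothetical AB single-case series (10 baseline and 10 treatment observations; all names and values are ours):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical AB single-case data
time = np.arange(20)
phase = (time >= 10).astype(int)            # 0 = baseline (A), 1 = treatment (B)
rng = np.random.default_rng(2)
y = 5 + 0.1 * time + 2.0 * phase + rng.normal(0, 1, 20)

X = sm.add_constant(np.column_stack([time, phase, time * phase]))
fit = sm.OLS(y, X).fit()
print(fit.params)   # b0, b1 (trend), b2 (level change), b3 (slope change)
```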
Specific topics
Comparability of effect sizes

Example: g_IG (independent groups) vs. g_gain (standardized gain scores):
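The worked comparison on this slide was lost in extraction; the usual point is that the two statistics standardize the same raw difference by different denominators. Assuming equal variances and pre-post correlation ρ:

```latex
g_{IG} = \frac{\bar{Y}_E - \bar{Y}_C}{s_{pooled}},
\qquad
g_{gain} = \frac{\bar{D}}{s_D},
\qquad
s_D = s\sqrt{2(1 - \rho)}
```

so the same raw difference yields different standardized values unless ρ = .5: the two estimate different population parameters.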
Comparability of effect sizes
1. Estimating different population parameters (e.g., g_IG vs. g_gain above)
2. Estimating with different precision (e.g., g vs. Glass's Δ)
Choosing a measure
1. Design and measurement level
2. Assumptions
3. Popularity
4. Simplicity of the sampling distribution, e.g.:
   - Fisher's Z = 0.5 log[(1 + r)/(1 - r)]
   - log odds ratio
   - ln(RR)
5. Directional effect size
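These transformations are convenient because their sampling distributions are approximately normal; a sketch of a confidence interval for r via Fisher's Z (numpy's arctanh is the same transformation):

```python
import numpy as np

r, n = 0.45, 50                  # hypothetical correlation and sample size
z = np.arctanh(r)                # Fisher's Z = 0.5 * log((1 + r) / (1 - r))
se = 1 / np.sqrt(n - 3)          # approximate standard error of Z
lo, hi = np.tanh(z - 1.96 * se), np.tanh(z + 1.96 * se)
print(f"95% CI for r: [{lo:.2f}; {hi:.2f}]")
```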
Threats to effect sizes
1. 'Bad data'
2. Measurement error
3. Artificial dichotomization
4. Imperfect construct validity
5. Range restriction
6. Bias