Funded through the ESRC’s Researcher Development Initiative Department of Education, University of Oxford Sessions 1.2-1.3: Effect Size Calculation.

Presentation transcript:

Funded through the ESRC’s Researcher Development Initiative Department of Education, University of Oxford Sessions 1.2-1.3: Effect Size Calculation

2

• The effect size makes meta-analysis possible
• It is based on the “dependent variable” (i.e., the outcome)
• It standardizes findings across studies such that they can be directly compared
• Any standardized index can be an “effect size” (e.g., standardized mean difference, correlation coefficient, odds-ratio), but it must
  ◦ be comparable across studies (standardization)
  ◦ represent the magnitude and direction of the relationship
  ◦ be independent of sample size
• Different studies in the same meta-analysis can be based on different statistics, but each has to be transformed to a standardized effect size that is comparable across different studies

5 Sample size, significance and d effect size
ESRC RDI One Day Meta-analysis workshop (Marsh, O’Mara, Malmberg)

6 Simulate ds on homemade calculator (ES.xls)
• Change direction of effects
• Change Ns (equal or unequal?)
• Change SDs
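The spreadsheet exercise can be approximated in a few lines of Python. This is a minimal sketch, not the workshop's ES.xls: it assumes two independent groups and uses the standard relation t = d · sqrt(n1·n2 / (n1 + n2)); the function name `t_from_d` is hypothetical. It illustrates that a fixed d produces a larger t (and so a smaller p) as the group sizes grow.

```python
import math

def t_from_d(d, n1, n2):
    """t statistic implied by a standardized mean difference d
    for two independent groups of size n1 and n2."""
    return d * math.sqrt(n1 * n2 / (n1 + n2))

# Same effect size, increasing (equal) group sizes: t grows with n.
for n in (10, 40, 160, 640):
    print(n, round(t_from_d(0.3, n, n), 2))
```

Changing the sign of d flips the sign of t, and unequal group sizes give a smaller t than equal groups with the same total N.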

7 Effect size as proportion in the Treatment group doing better than the average Control group person
[Figure: three pairs of overlapping Control and Treatment distributions, with 57%, 69% and 79% of the Treatment group above the Control mean]
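Percentages of this kind can be reproduced under an assumption the slide does not state explicitly: that both groups are normally distributed with equal variance. In that case the share of the Treatment group above the Control mean is the standard normal CDF evaluated at d. The function name below is hypothetical.

```python
from statistics import NormalDist

def proportion_above_control_mean(d):
    """Share of the Treatment group scoring above the Control mean,
    assuming normal distributions with equal variance."""
    return NormalDist().cdf(d)

for d in (0.2, 0.5, 0.8):
    print(d, round(proportion_above_control_mean(d), 3))
```

The values for d = 0.2, 0.5 and 0.8 come out close to the percentages shown in the figure.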

8 Effect size as proportion of success in the Treatment versus Control group (Binomial Effect Size Display = BESD)
[Figure: three Control/Treatment displays]
• Success: 55% of T, 45% of C
• Success: 62% of T, 38% of C
• Success: 68% of T, 32% of C
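The BESD is usually built from a correlation r by placing the two success rates symmetrically around 50%. The sketch below follows that standard definition; the r values used are illustrative assumptions, not figures taken from the slide.

```python
def besd(r):
    """Binomial Effect Size Display: (treatment success rate, control success rate)
    implied by a correlation r, centred on 50%."""
    return 0.5 + r / 2, 0.5 - r / 2

for r in (0.10, 0.24, 0.36):
    treatment, control = besd(r)
    print(f"r = {r:.2f}: {treatment:.0%} of T, {control:.0%} of C")
```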

Why effect size?
• Research long focused on the significance level (safeguarding against Type I (α) error); today the focus is on practical and meaningful significance.
• Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997–1003.

10 A short history of the effect size (Huberty, 2002; see also Olejnik & Algina, 2000 for review of effect sizes)

11 Power and effect size
• Power: “Finding what is out there”
• Type II (β) error: “not finding what is out there”
• Power (1 − β): the probability of rejecting a false null hypothesis (H0)
• Power of .80 or .90 in primary research

12 Power, sought effect size, at significance level α = .05 in primary research (prior to conducting study)
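As a rough illustration of how power, the sought effect size and α interact, the sketch below estimates the per-group sample size for a two-group comparison. It uses a normal approximation, n ≈ 2(z₁₋α/₂ + z_power)² / d², which is an assumption of this example; exact t-based calculations give slightly larger numbers. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-group comparison
    (normal approximation, equal group sizes)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for two-sided alpha
    z_power = z.inv_cdf(power)           # quantile matching the desired power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)

for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d))
```

Smaller sought effect sizes require sharply larger samples, which is why the sought effect size has to be fixed before the study is run.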

13 How meaningful is a “small” effect size?
• A small effect size changed the course of an RCT in 1987: placebo group participants were given aspirin instead (see Rosenthal, 1994, p. 242)

14
• Within a single meta-analysis, one can include studies based on any combination of statistical analyses (e.g., t-tests, ANOVA, correlation, odds-ratio, chi-square, etc.).
• The “art” of meta-analysis is how to compute effect sizes based on non-standard designs and studies that do not supply complete data (see Lipsey&Wilson_AppB.pdf).
• Convert all effect sizes into a common metric based on the “natural” metric given research in the area, e.g. d, r, OR.

15
• Standardized mean difference
  ◦ Group contrast research (treatment groups; naturally occurring groups)
  ◦ Inherently continuous construct
• Correlation coefficient
  ◦ Association between inherently continuous constructs
• Odds-ratio
  ◦ Group contrast research (treatment or naturally occurring groups)
  ◦ Inherently dichotomous construct
• Regression coefficients and other multivariate effects
  ◦ Requires access to covariance-variance (correlation) matrices for each included study

16 Calculating ds (1)
Almost all test statistics can be transformed into a standardized effect size “d”.
[Diagram: means and standard deviations, correlations, p-values, F-statistics, t-statistics and “other” test statistics all converting to d]

17 Calculating ds (1)
• Represents a standardized group contrast on an inherently continuous measure
• Uses the pooled standard deviation
• Commonly called “d”

18 Various contrast effect sizes
• Cohen’s d
• Hedges’ g
• Glass’s Δ

19 Calculating d (1) using Ms, SDs and ns
Remember to code treatment effect in positive direction!
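The calculation from means, standard deviations and group sizes can be sketched as follows. This is an illustrative implementation of the standard pooled-SD formula, not the workshop's ES calculator; the function names and the example numbers are hypothetical. Glass's Δ is included for contrast because it divides by the control-group SD instead of the pooled SD.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))

def cohens_d(m_t, sd_t, n_t, m_c, sd_c, n_c):
    """Standardized mean difference, treatment minus control,
    so a positive d means the treatment group did better."""
    return (m_t - m_c) / pooled_sd(sd_t, n_t, sd_c, n_c)

def glass_delta(m_t, m_c, sd_c):
    """Glass's delta: mean difference scaled by the control-group SD only."""
    return (m_t - m_c) / sd_c

# Hypothetical study: treatment M = 105, control M = 100, both SD = 10, n = 50 each.
print(cohens_d(105, 10, 50, 100, 10, 50))
print(glass_delta(105, 100, 10))
```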

20 ES_calculator.xls

21 Calculating d (2) using ES calculator, using Ms, ns, and t-value

22 Calculating d (3) using ES calculator, using ns, and t-value
• The treatment group scored higher than the control group at Time 2 (t[28] = 4.11; p < .001).
• From the sample description we learn that n1 = n2
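When only t and the group sizes are reported, d can be recovered with the standard conversion d = t · sqrt((n1 + n2) / (n1 · n2)). The sketch below applies it to the reported result, assuming n1 = n2 = 15, which follows from df = 28 and equal groups; the function name is hypothetical.

```python
import math

def d_from_t(t, n1, n2):
    """Standardized mean difference from an independent-samples t value."""
    return t * math.sqrt((n1 + n2) / (n1 * n2))

# t[28] = 4.11 with equal groups implies n1 = n2 = 15.
print(round(d_from_t(4.11, 15, 15), 3))
```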

23 Calculating d (3) correcting for small sample bias
• Hedges proposed a correction for small sample size bias (ns < 20)
• Must be applied before analysis
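The correction is commonly written as d′ = d · (1 − 3 / (4N − 9)), where N is the total sample size. The sketch below assumes that approximation (the slide's own formula is not preserved in the transcript) and uses a hypothetical function name.

```python
def hedges_correction(d, n_total):
    """Small-sample bias correction: shrinks d slightly towards zero.
    n_total is the combined sample size of both groups."""
    return d * (1 - 3 / (4 * n_total - 9))

# Example: d = 1.5008 from two groups of 15 (N = 30).
print(round(hedges_correction(1.5008, 30), 3))
```

The adjustment is noticeable for small N and becomes negligible as N grows.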

24 Calculating d (4) using ES calculator, using ns, and F-value
Remember: in a two-group ANOVA, F = t²
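Because F = t² in a two-group ANOVA, the t-based conversion carries over once the square root is taken. F carries no sign, so the direction of the effect has to be supplied from the reported means; the `treatment_higher` flag below is a hypothetical way of doing that.

```python
import math

def d_from_f(f, n1, n2, treatment_higher=True):
    """Standardized mean difference from a two-group (1 df) F value.
    F is always positive, so the sign must come from the reported means."""
    d = math.sqrt(f * (n1 + n2) / (n1 * n2))
    return d if treatment_higher else -d

# F = 4.11 squared gives the same d as t = 4.11 with the same group sizes.
print(round(d_from_f(4.11 ** 2, 15, 15), 3))
```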

25 Calculating d (5) using ES calculator, using p-value “The mean-level comparison was not significant (p =.53)”

26 T-test table
df = (n1 + n2 − 2)
Sometimes authors only report e.g., p < .01 (n = 22). If so, use a conservative approach to reading the t-test table.
NOTE: When p = n.s. some researchers code d = 0 in the database
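The table lookup can be mimicked in code. The sketch below is only a large-sample approximation: it converts a two-tailed p-value to a standard normal z and treats it as t, whereas an exact conversion would use the t distribution with df = n1 + n2 − 2 (as the t-test table does). Treating "p < .05" as p = .05 corresponds to the conservative reading. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def d_from_p_approx(p_two_tailed, n1, n2):
    """Approximate |d| from a two-tailed p-value (normal approximation to t).
    Returns a magnitude only; the sign must come from the study's description."""
    z = NormalDist().inv_cdf(1 - p_two_tailed / 2)
    return z * math.sqrt((n1 + n2) / (n1 * n2))

# Hypothetical study: "p < .05" with 50 participants per group, read as p = .05.
print(round(d_from_p_approx(0.05, 50, 50), 3))
```

For small samples the normal approximation understates d, because the critical t exceeds the critical z.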

27 Example dataset so far (1) (ES_enter.sav):

28 Use all available tools for calculating the following 5 effect sizes
• ES 6: M_T = 21, M_C = 20, n_T = 60, n_C = 60, t = .55
• ES 7: M_T = 103.5, M_C = 100, SD_T = 22.0, SD_C = 18.5, n_T = 45, n_C = 35
• ES 8: n_T = 45, n_C = 40, p < .05
• ES 9: n_T = 100, n_C = 120, F = 8.73
• ES 10: n_T = 200, n_C = 160, t = 5.66
(see electronic document: “Correct ds for 5 effect sizes.doc”)

29 Example dataset so far (2) (ES_enter.sav):

30 Calculating d (11) using ES calculator, using number of successful outcomes per group

31 Calculating d (11) using ES calculator, using number of successful outcomes per group

32 Calculating d (12) using ES calculator, using proportion of successes per group (53% vs. 48.5%)
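Several conversions from two proportions to d are in common use, and the transcript does not say which one the ES calculator applies. As an illustration only, the sketch below shows two of them: the probit difference and the arcsine difference. Function names are hypothetical.

```python
import math
from statistics import NormalDist

def d_probit(p_t, p_c):
    """Difference between the standard normal quantiles of the two success rates."""
    z = NormalDist()
    return z.inv_cdf(p_t) - z.inv_cdf(p_c)

def d_arcsine(p_t, p_c):
    """Difference between arcsine-transformed success rates."""
    return 2 * math.asin(math.sqrt(p_t)) - 2 * math.asin(math.sqrt(p_c))

# Success rates from the slide: 53% (treatment) versus 48.5% (control).
print(round(d_probit(0.53, 0.485), 3))
print(round(d_arcsine(0.53, 0.485), 3))
```

Counts of successful outcomes per group can be handled the same way after dividing each count by its group size.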

33 Calculating d (13) using paired t-test (only one experimental group; “each person their own control”)
Don’t use the SD of the change score!
r = correlation between Time 1 and Time 2

34 Calculating d (14) using paired t-test (only one experimental group)
• n (pairs) = 90, t-value = 6.5, r = .70
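A commonly used conversion for a paired t-test is d = t · sqrt(2(1 − r) / n), which brings in the Time 1/Time 2 correlation so that d is scaled by the raw-score SD and not by the SD of the change score. The sketch below assumes that formula (the slide's own equation is not preserved in the transcript) and applies it to the numbers above; the function name is hypothetical.

```python
import math

def d_from_paired_t(t, n_pairs, r):
    """Standardized gain from a paired t value, the number of pairs,
    and the correlation r between Time 1 and Time 2 scores."""
    return t * math.sqrt(2 * (1 - r) / n_pairs)

print(round(d_from_paired_t(6.5, 90, 0.70), 3))
```

Ignoring r (that is, using t / sqrt(n) directly) would standardize by the change-score SD, which the previous slide warns against.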

35 Calculating d (15)
• “The 20 participants increased .84 z-scores between time 1 and time 2 (p < .01)”
• ES = .84
• Correct for small sample bias

36 Example dataset so far 3 (ES_enter.sav): Method difference: mean contrast and gain scores

37 Summary of equations from Lipsey & Wilson (2001) (for more formulae see Lipsey & Wilson, Appendix B)

38 Weighting for mean-level differences
• The effect sizes are weighted by the inverse of the variance to give more weight to effects based on larger sample sizes
• Variance for mean-level comparison is calculated as [equation image not captured in the transcript]
• The standard error of each effect size is given by the square root of the sampling variance: SE = √v_i
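A minimal sketch of the weighting step for mean-level differences follows. It assumes the widely used sampling variance v = (n1 + n2) / (n1 · n2) + d² / (2(n1 + n2)); that formula is an assumption here, since the slide's equation is not preserved. Function names and example values are hypothetical.

```python
import math

def variance_of_d(d, n1, n2):
    """Sampling variance of a standardized mean difference (assumed formula)."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))

def standard_error_of_d(d, n1, n2):
    """Standard error: square root of the sampling variance."""
    return math.sqrt(variance_of_d(d, n1, n2))

def inverse_variance_weight(d, n1, n2):
    """Weight used in the meta-analysis: 1 / variance."""
    return 1 / variance_of_d(d, n1, n2)

# Hypothetical effect: d = 0.5 with 50 participants per group.
print(variance_of_d(0.5, 50, 50))
print(standard_error_of_d(0.5, 50, 50))
print(inverse_variance_weight(0.5, 50, 50))
```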

39 Enter_w.xls

40 Weighting for gain scores
• SE for gain scores
• Inverse variance for gain scores
T1 and T2 scores are dependent, so we need to get the correlation between T1 and T2 into the equation (not always reported)
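For gain scores, a commonly cited pair of formulas is SE = sqrt(2(1 − r)/n + d²/(2n)) and w = 2n / (4(1 − r) + d²), where r is the T1/T2 correlation and n the number of pairs. The sketch below assumes these forms, since the slide's equations are not preserved in the transcript; function names are hypothetical.

```python
import math

def gain_se(d, n_pairs, r):
    """Standard error of a standardized gain score (assumed formula)."""
    return math.sqrt(2 * (1 - r) / n_pairs + d ** 2 / (2 * n_pairs))

def gain_weight(d, n_pairs, r):
    """Inverse-variance weight for a standardized gain score."""
    return 2 * n_pairs / (4 * (1 - r) + d ** 2)

# Example values: d = 0.5307, 90 pairs, r = .70.
print(gain_se(0.5307, 90, 0.70))
print(gain_weight(0.5307, 90, 0.70))
```

The two functions are consistent with each other: the weight equals 1 / SE².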

41 Enter_w.xls

42 Compute the weighted mean ES and s.e. of the ES in SPSS (var_ofES.sps) (1)

43 Compute the weighted mean ES and s.e. of the ES in SPSS (var_ofES.sps) (2)

44 Compute the weighted mean ES and s.e. of the ES
• Weight each ES by the inverse of its squared s.e. (i.e., the inverse of its variance)
• The average ES
• Standard error of the ES
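The three steps can be sketched outside SPSS as follows. This is a fixed-effect, inverse-variance calculation written for illustration; it is not a port of var_ofES.sps, and the function name and example values are hypothetical.

```python
import math

def weighted_mean_es(effect_sizes, variances):
    """Inverse-variance weighted mean effect size and its standard error."""
    weights = [1 / v for v in variances]                 # w_i = 1 / v_i
    total_weight = sum(weights)
    mean_es = sum(w * es for w, es in zip(weights, effect_sizes)) / total_weight
    se_mean = math.sqrt(1 / total_weight)                # s.e. of the weighted mean
    return mean_es, se_mean

# Three hypothetical studies: the most precise one (smallest variance) dominates.
mean_es, se_mean = weighted_mean_es([0.2, 0.5, 0.8], [0.04, 0.02, 0.01])
print(mean_es, se_mean)
```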

45 Enter_w.xls

46

47 Funnel plot for x = sample size, y = ES
• Does the average ES converge toward the ES of the largest (n) study?
• 95% C.I. = ±1.96 * s.e.
• 99% C.I. = ±2.58 * s.e.
• 99.9% C.I. = ±3.29 * s.e.
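The multipliers 1.96, 2.58 and 3.29 are standard normal quantiles, so the interval limits can be computed for any confidence level. A small sketch with a hypothetical function name:

```python
from statistics import NormalDist

def confidence_interval(es, se, level=0.95):
    """Normal-theory confidence interval around an effect size."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # 1.96 for 95%, 2.58 for 99%, 3.29 for 99.9%
    return es - z * se, es + z * se

for level in (0.95, 0.99, 0.999):
    print(level, confidence_interval(0.5, 0.1, level))
```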

48 Funnel plot including s.e. of ES
• The ES in a smaller sample has a larger standard error (s.e.)

49 Interval estimates
Population: N = ‘size’, μ = ‘mean’, δ = ‘effect size’
Sample: n = ‘size’, m = ‘mean’, d = ‘effect size’
The “likely” population parameter is the sample parameter ± uncertainty
• Standard errors (s.e.)
• Confidence intervals (C.I.)

50 Calculating rs
Almost all test statistics can be transformed into a standardized effect size “r”.
[Diagram: means and standard deviations (d), χ², p-values, F-statistics, t-statistics and “other” test statistics all converting to r]

51 Correlations / relationships between variables
• r_xy: Pearson’s product moment coefficient (continuous × continuous)
• r_pb: point-biserial correlation (dichotomous × continuous)
• χ² (dichotomous × dichotomous)
• r_s: Spearman’s rank-order coefficient (ordinal × ordinal)
And others, e.g.,
• φ coefficient, Odds-Ratio (OR)
• Cramer’s V, Contingency coefficient C
• Tetrachoric and polychoric correlations
… (etc.)

52 Bias when dichotomising continuous variables
• X and Y are both “truly” continuous, but in the study one of them is dichotomised: with X continuous and Y split 50/50, r_pb is 80% of the value it would have had if Y had been left continuous
• X and Y are both “truly” continuous, but both are dichotomised: the maximum value of φ if X = 30/70 split and Y = 50/50 split is φ = .33
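The 80% figure for a 50/50 split can be checked numerically. Under the assumption of bivariate normality, dichotomising one variable at proportion p multiplies the correlation by φ(z_p) / sqrt(p(1 − p)), where φ is the standard normal density and z_p the cut point. The function name below is hypothetical.

```python
import math
from statistics import NormalDist

def dichotomisation_factor(p):
    """Factor by which r shrinks when a normal variable is split at proportion p."""
    z = NormalDist()
    cut = z.inv_cdf(p)                       # cut point on the latent normal scale
    return z.pdf(cut) / math.sqrt(p * (1 - p))

for p in (0.5, 0.3, 0.1):
    print(p, round(dichotomisation_factor(p), 3))
```

The loss is smallest for a median split and grows as the split becomes more uneven.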

53 Calculating rs from d (1)
r can be used in all situations where d can, but d cannot be used in all situations where r is appropriate
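A commonly used conversion is r = d / sqrt(d² + 1 / (p(1 − p))), where p is the proportion of the total sample in one group; with equal groups this reduces to r = d / sqrt(d² + 4). The sketch below assumes that formula, since the slide's own equation is not preserved; the function name is hypothetical.

```python
import math

def r_from_d(d, n1=None, n2=None):
    """Convert a standardized mean difference to a (point-biserial) r.
    If group sizes are omitted, equal groups are assumed."""
    if n1 is None or n2 is None:
        p = 0.5
    else:
        p = n1 / (n1 + n2)
    return d / math.sqrt(d ** 2 + 1 / (p * (1 - p)))

for d in (0.2, 0.5, 0.8):
    print(d, round(r_from_d(d), 3))
```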

54 Calculating r pb (2)
• If X and Y are inherently continuous, mean-contrast is a better option than r_pb

55 Calculating r (3) from t-value
• Appropriate for both independent and dependent samples t-test values
Calculating r (4) from χ²-value
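The two conversions named on this slide are usually written r = t / sqrt(t² + df) and, for a 2 × 2 table (1 df), r = sqrt(χ² / N). The sketch below assumes those standard forms; function names are hypothetical.

```python
import math

def r_from_t(t, df):
    """Correlation from a t value and its degrees of freedom (keeps the sign of t)."""
    return t / math.sqrt(t ** 2 + df)

def r_from_chi2(chi2, n_total):
    """Correlation (phi) from a 1-df chi-square and the total sample size.
    Chi-square has no sign, so direction must be taken from the table itself."""
    return math.sqrt(chi2 / n_total)

print(round(r_from_t(4.11, 28), 3))
print(r_from_chi2(4.0, 100))
```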

56 Sources of error
• Cf. Structural Equation Model (circle = latent/unobserved construct, rectangle = manifest/observed variable)
[Diagram: latent (unobserved) X and Y, related by r_x*y*, each measured by a manifest (observed) variable x and y with reliabilities r_xx and r_yy; the observed correlation is r_xy]
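The labels in the diagram correspond to the classical correction for attenuation, r_x*y* = r_xy / sqrt(r_xx · r_yy). Whether the workshop applied this correction is not stated in the transcript, so the sketch below is offered only as an illustration of how the four quantities relate; the function name and example values are hypothetical.

```python
import math

def disattenuate(r_xy, r_xx, r_yy):
    """Estimate the latent correlation from the observed correlation
    and the reliabilities of the two measures."""
    return r_xy / math.sqrt(r_xx * r_yy)

# Observed r = .30 with reliabilities .80 and .90.
print(round(disattenuate(0.30, 0.80, 0.90), 3))
```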

57 Alternatively: transform rs into Fisher’s Z_r-transformed rs, which are more normally distributed
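Fisher's transformation is z = ½ ln((1 + r) / (1 − r)), which is the inverse hyperbolic tangent of r, with sampling variance 1 / (n − 3). A short sketch, with hypothetical function names, including the back-transformation needed to report results as r:

```python
import math

def fisher_z(r):
    """Fisher's Z transformation of a correlation."""
    return math.atanh(r)

def fisher_z_variance(n):
    """Sampling variance of Fisher's Z for a sample of size n."""
    return 1 / (n - 3)

def z_to_r(z):
    """Back-transform a (mean) Fisher's Z to the correlation metric."""
    return math.tanh(z)

z = fisher_z(0.5)
print(z, fisher_z_variance(103), z_to_r(z))
```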

58 rr.xls

59

60 Calculating OR (chi2.sps)
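For readers without the SPSS syntax file, the odds ratio from a 2 × 2 table can be sketched directly. This is not a port of chi2.sps; the cell layout (a = treatment success, b = treatment failure, c = control success, d = control failure) and the function names are assumptions of this example. Meta-analyses usually combine the log odds ratio, whose standard error is sqrt(1/a + 1/b + 1/c + 1/d).

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table:
    a = treatment success, b = treatment failure,
    c = control success,   d = control failure."""
    return (a * d) / (b * c)

def log_odds_ratio(a, b, c, d):
    """Natural log of the odds ratio (the metric usually analysed)."""
    return math.log(odds_ratio(a, b, c, d))

def log_odds_ratio_se(a, b, c, d):
    """Standard error of the log odds ratio."""
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# Hypothetical table: 20 of 30 succeed under treatment, 10 of 30 under control.
print(odds_ratio(20, 10, 10, 20), log_odds_ratio(20, 10, 10, 20), log_odds_ratio_se(20, 10, 10, 20))
```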

61

62

63 Pearson’s 5 studies escaping Enteric Fever (1904)

64

65

66

67

68 Each study is one line in the database
[Data table columns: effect size, duration, sample sizes, reliability of the instrument, variance of the effect size]

69 Organising effect sizes within study (1) “Flat dataset”

70 Organising effect sizes within study (2) “hierarchical dataset” (effect sizes nested within study)

71 Organising effect sizes within study (3) “hierarchical dataset”, with one construct per DV per study

72 Organising effect sizes within study (4) “hierarchical dataset”, with one DV per study
NOTE: alternative to aggregating ESs within study: multilevel meta-analysis
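Aggregating effect sizes within study turns a hierarchical dataset (several rows per study) into a flat one (one row per study). The sketch below uses a simple unweighted mean per study, which is an assumption made for illustration: it ignores the dependence between effect sizes from the same sample, which is one reason multilevel meta-analysis is offered as the alternative. The data layout and function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def aggregate_within_study(rows):
    """Collapse (study_id, effect_size) rows to one mean effect size per study."""
    by_study = defaultdict(list)
    for study_id, effect_size in rows:
        by_study[study_id].append(effect_size)
    return {study_id: mean(values) for study_id, values in by_study.items()}

# Hypothetical hierarchical dataset: study A reports two effect sizes, study B one.
rows = [("A", 0.2), ("A", 0.4), ("B", 0.5)]
print(aggregate_within_study(rows))
```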

73 Exercise: effect size calculation (4 method/result extracts from journals)
• Do boys have higher general (global) self-concept (self-worth) than girls?
• Decide which effect size to use (d, r, OR)
• Calculate appropriate effect sizes

74 Effect size literature
• Cohen, J. (1969). Statistical Power Analysis for the Behavioral Sciences, 1st Edition, Lawrence Erlbaum Associates, Hillsdale (2nd Edition, 1988).
• Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997–1003.
• Gwet, K. (2001). Handbook of interrater reliability: How to estimate the level of agreement between two or multiple raters. Gaithersburg: STATAXIS Publishing.
• Huberty, C. J. (2002). A history of effect size indices. Educational and Psychological Measurement, 62,
• McCartney, K., & Rosenthal, R. (2000). Effect size, practical importance, and social policy for children. Child Development, 71,
• Olejnik, S., & Algina, J. (2000). Measures of effect size for comparative studies: Applications, interpretations, and limitations. Contemporary Educational Psychology, 25,