Estimating the reproducibility of psychological science: accounting for the statistical significance of the original study
Robbie C. M. van Aert & Marcel A. L. M. van Assen
Tilburg University & Utrecht University

Social Sciences Meta-Research Group
http://www.metaresearch.nl/

The Problem
Example (Maxwell et al., 2015, in American Psychologist): independent-samples t-test
Original: d = 0.5, t(78) = 2.24, p = 0.028
Replication (power = .8 at d = 0.5): d = 0.23, t(170) = 1.50, p = 0.135
Conclusion?!?
Omnipresent and relevant problem: 61% in RPP
Questions considered relevant
1) Does the effect exist? (0 or not)
2) What is the effect? (best guess)
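The test statistics in this example can be approximately reproduced from the reported effect sizes. The sketch below is only an illustration: it assumes two equal groups (40 per group for the original and 86 per group for the replication, inferred from the degrees of freedom and not stated on the slide) and evaluates the two-sided p-value by numerically integrating the t density. The function names are hypothetical.

```python
import math

def t_from_d(d, n1, n2):
    """t statistic of an independent-samples t-test implied by Cohen's d."""
    return d * math.sqrt(n1 * n2 / (n1 + n2))

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def p_two_sided(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)

# Assumed group sizes: 40 + 40 (df = 78) and 86 + 86 (df = 170).
for label, d, n in [("original", 0.5, 40), ("replication", 0.23, 86)]:
    t = t_from_d(d, n, n)
    df = 2 * n - 2
    print(f"{label}: t({df}) = {t:.2f}, p = {p_two_sided(t, df):.3f}")
```

Because d = 0.23 is itself rounded, the replication values recomputed this way can differ slightly in the last digit from those on the slide.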

Problem and Solution
Problem: How to evaluate the results of an original and a replication study?
Solution: Accurate estimation of effect size, taking the statistical significance of the original study into account

The Message
(1) Methods should take the statistical significance of the original study into account
(2) We developed such methods (frequentist and Bayesian): easy, natural, insightful
(3) Need huge sample sizes (N~1,000) to distinguish 0 from a small effect
→ With current sample sizes in Psychology, one or two studies are not sufficient to accurately estimate effect size
(4) Apply methods to Reproducibility Project (2015)
→ For only a few nonsignificant replications is the best guess a zero effect

Overview
1. Publication bias and Reproducibility
2. Why we should take significance of original study into account
3. Bayesian method
4. Analytical results Bayesian method
5. Application: Reproducibility Project Psychology
6. Conclusion and discussion

1. Publication bias and Reproducibility
Publication bias is ‘the selective publication of studies with a statistically significant outcome’
Evidence of publication bias is HUMONGOUS
97% of published original studies in psychology are significant (97% in RPP), but average power is much lower (estimates: 8%, about 20%, 35%, 50%)
→ So… convinced?

1. Publication bias and Reproducibility
Publication bias is the 800-lb gorilla in psychology’s living room (Ferguson and Heene, 2012)

1. Publication bias and Reproducibility
But... psychologists do not see the gorilla?!?
‘Shock’ after RPP (97% → 36%)

2. Why we should take significance of original study into account Assume researcher’s goal: replicate significant original i.Selection of high score ii.Score subject to (sampling) error  Regression to the mean: original overestimates, replication accurate ! Holds irrespective of publication bias ! 10
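The regression-to-the-mean argument can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' analysis: effect estimates are drawn from a normal distribution around a hypothetical true effect of d = 0.2 with standard error 0.2236 (roughly two groups of 40), and only originals that are significant in the positive direction are selected for replication.

```python
import random
from statistics import mean

def simulate_selection(true_effect, se, n_pairs=20000, seed=1):
    """Return (mean selected original, mean replication) estimates.

    Originals are kept only if significant at two-sided alpha = .05 in the
    positive direction; every kept original gets one unselected replication.
    """
    rng = random.Random(seed)
    crit = 1.96 * se  # smallest estimate that reaches significance
    originals, replications = [], []
    while len(originals) < n_pairs:
        estimate = rng.gauss(true_effect, se)
        if estimate > crit:  # selection of a "high score"
            originals.append(estimate)
            replications.append(rng.gauss(true_effect, se))
    return mean(originals), mean(replications)

# Hypothetical values chosen for illustration only.
orig_mean, rep_mean = simulate_selection(true_effect=0.2, se=0.2236)
print(f"mean significant original: {orig_mean:.2f}")
print(f"mean replication:          {rep_mean:.2f}")
```

In this setup the selected originals overestimate the true effect while the replications average close to it, even though no study was withheld from the replicator other than by the significance filter.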

3. Bayesian method [Snapshot Bayesian Hybrid Method]
Assumptions
– Original study is statistically significant
– Both studies estimate the same effect (fixed-effect)
– No questionable research practices
Basic idea
1) Assume 4 effect sizes (0, small, medium, large [Cohen]) = snapshots
2) Compute posterior probability of the four effects = Bayesian
3) Take statistical significance of original study into account = hybrid

3. Bayesian method
Basic idea
[Figure: likelihoods of the replication study]

3. Bayesian method
Basic idea
[Figure: likelihoods of the original study]

3. Bayesian method
Basic idea
Applied to example Maxwell et al. (2015)
Evidence of 0 and small effect increased; best guess = small effect
Advantages of method
Easy, natural, insightful
Easy (re)computation of the posterior for a prior other than uniform
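The three steps of the basic idea can be sketched in a few lines. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions that are not on the slides: the test statistics are converted to correlations, likelihoods use the normal approximation on the Fisher z scale, the four snapshots are ρ = 0, 0.1, 0.3, 0.5, and the original's likelihood is divided by the probability of a significant result in the positive direction. The function names are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist

SNAPSHOTS = {"zero": 0.0, "small": 0.1, "medium": 0.3, "large": 0.5}
STD = NormalDist()

def t_to_r(t, df):
    """Convert a t statistic to a correlation coefficient."""
    return t / sqrt(t * t + df)

def snapshot_hybrid(r_orig, n_orig, r_rep, n_rep, prior=None, alpha=0.05):
    """Posterior probability of each snapshot given both studies."""
    z_orig, z_rep = atanh(r_orig), atanh(r_rep)
    se_orig, se_rep = 1 / sqrt(n_orig - 3), 1 / sqrt(n_rep - 3)
    crit = STD.inv_cdf(1 - alpha / 2) * se_orig  # significance threshold
    if prior is None:
        prior = {name: 1 / len(SNAPSHOTS) for name in SNAPSHOTS}  # uniform
    weights = {}
    for name, rho in SNAPSHOTS.items():
        theta = atanh(rho)
        lik_rep = NormalDist(theta, se_rep).pdf(z_rep)
        # Hybrid step: condition the original on having been significant.
        power = 1 - NormalDist(theta, se_orig).cdf(crit)
        lik_orig = NormalDist(theta, se_orig).pdf(z_orig) / power
        weights[name] = prior[name] * lik_rep * lik_orig
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Example from the slides: t(78) = 2.24 and t(170) = 1.50.
posterior = snapshot_hybrid(t_to_r(2.24, 78), 80, t_to_r(1.50, 170), 172)
for name, prob in posterior.items():
    print(f"{name:>6}: {prob:.3f}")
```

Under these assumptions the small effect receives the highest posterior probability, in line with the slide's best guess. A non-uniform prior can be passed through the `prior` argument as a dictionary keyed by snapshot name.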

4. Analytical results Bayesian method
Independent variables:
ρ = 0, 0.1, 0.3, 0.5 [0, Small, Medium, Large]
N of both original and replication: 31, 55, 96, 300, and 1,000
Dependent variables:
Expected posterior probability
Probability of strong evidence (posterior > .75 or Bayes factor > 3)

4. Analytical results Bayesian method
Expected posterior probability (hybrid)
Need huge sample sizes (N~1,000) to distinguish 0 from small effect
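The dependence on sample size can be explored with a Monte Carlo approximation of the expected posterior probability. The slides report analytical results; the sketch below is a simulation stand-in under assumptions of my own: Fisher z normal approximation, equal N in both studies, a uniform prior, and an original study sampled conditional on being significant in the positive direction. The function name is hypothetical.

```python
import random
from math import atanh, sqrt
from statistics import NormalDist

RHOS = {"zero": 0.0, "small": 0.1, "medium": 0.3, "large": 0.5}
STD = NormalDist()

def expected_posterior(true_rho, n, reps=2000, alpha=0.05, seed=1):
    """Average snapshot posterior over simulated original/replication pairs."""
    rng = random.Random(seed)
    se = 1 / sqrt(n - 3)
    crit = STD.inv_cdf(1 - alpha / 2) * se
    dists = {k: NormalDist(atanh(r), se) for k, r in RHOS.items()}
    power = {k: 1 - d.cdf(crit) for k, d in dists.items()}
    true_dist = NormalDist(atanh(true_rho), se)
    lo = true_dist.cdf(crit)  # probability mass below the threshold
    sums = {k: 0.0 for k in RHOS}
    for _ in range(reps):
        # Inverse-CDF draw of a significant original (truncated normal).
        u = lo + (1 - lo) * rng.random()
        u = min(max(u, 1e-12), 1 - 1e-12)
        z_orig = true_dist.inv_cdf(u)
        z_rep = rng.gauss(true_dist.mean, se)  # replication is unselected
        w = {k: d.pdf(z_rep) * d.pdf(z_orig) / power[k]
             for k, d in dists.items()}
        total = sum(w.values())
        for k in w:
            sums[k] += w[k] / total
    return {k: s / reps for k, s in sums.items()}

# True effect zero: how much posterior mass lands on "zero" at each N?
for n in (31, 96, 1000):
    print(n, round(expected_posterior(0.0, n)["zero"], 2))
```

The sketch is meant for true effects that coincide with one of the four snapshots; for other values the likelihoods can underflow at large N.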

4. Analytical results Bayesian method
Expected posterior probability (WRONG method)
Uncorrected for publication bias → overestimation

4. Analytical results Bayesian method
Expected posterior probability (hybrid)
Easier to distinguish medium and large effect sizes

4. Analytical results Bayesian method
Probability of strong evidence (hybrid)
High sample size needed for 0 and small effect

5. Application: Reproducibility Project Psychology
100 studies from JPSP, Psych Science, JEP → 67 could be included
Evidence according to posterior probabilities > .25 (0 = zero, S = small, M = medium, L = large)
Strong evidence (posterior probability > .75)
Only few studies have strong evidence for zero effect (13.4%)

6. Conclusion and discussion
Messages
(1) Methods should take the statistical significance of the original study into account
(2) We developed such methods (frequentist and Bayesian)
(3) Need huge sample sizes (N~1,000) to distinguish 0 from a small effect
→ With current sample sizes in Psychology, one or two studies are not sufficient to accurately estimate effect size
(4) Apply methods to Reproducibility Project (2015)
→ For only a few nonsignificant replications is the best guess a zero effect

6. Conclusion and discussion
Other
Apply methods to the reproducibility project in economics
Power analysis (how large should N of the replication be for an 80% chance of strong evidence?)
Unequal sample sizes of original and replication
Discarding original studies, i.e. using only the replication, is optimal for estimation in some conditions
→ Start all over again in some fields?!?
App / user-friendly program

Thank you for your attention

1. Publication bias and Reproducibility But.. Psychologists do not see the gorilla ?!? ‘Shock’ after RPP (97%  36%) Denial of results by some psychologist and methodologists Bad replication Not generalizable to other settings/people/time Statistical evaluation of results not right ( e.g. Maxwell et al.) ALL critics true to some extent !  NEED accurate methods! 24

2. Why we should take significance of original study into account Assume researcher’s goal: replicate significant original i.Selection of high score ii.Score subject to (sampling) error  Regression to the mean: expected value of replication is smaller than of original ! Holds irrespective of publication bias ! Assume researcher’s goal: replicate original (was sign.) No researcher’s selection of high score, but… Selection of high score through publication bias  regression to the mean still holds, and should still take significance original study into account 25

Other very important messages we would like to convey, but really have no time for
1. Analysis shows that discarding original studies, i.e. using only the replication, is optimal for estimation if...
(i) true effect size is zero to small, and
(ii) N replication > N original
→ Start all over again in some fields?!?
2. Using Bayesian analysis, or confidence intervals, rather than frequentist statistics is not the solution...
→ Using larger sample sizes is part of the solution
