1 Preregistration
David Mellor @EvoMellor david@cos.io

2 The combination of a strong bias toward statistically significant findings and flexibility in data analysis results in irreproducible research

3 The combination of a strong bias toward statistically significant findings and flexibility in data analysis results in irreproducible research. Fanelli D (2010). “Positive” Results Increase Down the Hierarchy of the Sciences. PLoS ONE 5(4).

4 The Garden of Forking Paths
The combination of a strong bias toward statistically significant findings and flexibility in data analysis results in irreproducible research. A single question (“Does X affect Y?”) forks into many defensible analyses: Control for time? Exclude outliers? Median or mean? (Gelman and Loken, 2013)
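A small simulation makes the cost of this flexibility concrete. This is a sketch, not anything from the slides: three common forks (mean versus median comparison, outlier exclusion) are tried on data with no true effect, and only the smallest p-value is “reported.”

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def forked_p_values(x, y):
    """p-values from three defensible analysis choices on the same data."""
    ps = [stats.ttest_ind(x, y).pvalue]              # compare means
    ps.append(stats.mannwhitneyu(x, y).pvalue)       # compare medians instead
    kx = np.abs(x - x.mean()) < 2 * x.std()          # "exclude outliers" fork
    ky = np.abs(y - y.mean()) < 2 * y.std()
    ps.append(stats.ttest_ind(x[kx], y[ky]).pvalue)
    return ps

n_sims = 2000
hits = sum(
    min(forked_p_values(rng.normal(size=50), rng.normal(size=50))) < 0.05
    for _ in range(n_sims)
)
print(f"false-positive rate when reporting the best fork: {hits / n_sims:.1%}")
```

Even three strongly correlated forks push the false-positive rate well above the nominal 5%; with the dozens of forks a real analysis offers, the inflation is far worse.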

5 The combination of a strong bias toward statistically significant findings and flexibility in data analysis results in irreproducible research
p-values in original studies versus replications: 97% of original studies were “significant,” but only 37% of replications were.

6 What is preregistration?
A time-stamped, read-only version of your research plan, created before the study.
Study plan: hypothesis; data collection procedures; manipulated and measured variables.
Analysis plan: statistical model; inference criteria.
When the research plan undergoes peer review before results are known, the preregistration becomes part of a Registered Report.
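To make the analysis-plan component above concrete, here is a minimal sketch of what “statistical model plus inference criteria” can look like when frozen in code before data collection. The hypothesis, test choice, alpha level, and every name here are hypothetical illustrations, not part of the slides:

```python
from scipy import stats

# Hypothetical plan, written and time-stamped before any data exist:
#   H1: treatment scores exceed control scores.
#   Statistical model: Welch's two-sample t-test, one-sided.
#   Inference criterion: reject H0 if p < .05, with n = 120 per group.
ALPHA = 0.05

def preregistered_test(treatment, control):
    """Run exactly the confirmatory analysis named in the registration."""
    res = stats.ttest_ind(treatment, control,
                          equal_var=False,           # Welch correction
                          alternative="greater")     # directional hypothesis
    return res.pvalue, res.pvalue < ALPHA
```

Once the data arrive, this function is run once; anything analyzed beyond it is reported as exploratory.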

9 What problems does preregistration fix?
The file drawer.
P-hacking: unreported flexibility in data analysis.
HARKing: Hypothesizing After Results are Known, i.e., deriving the hypothesis from the dataset (Kerr, 1998).
Preregistration makes the distinction between confirmatory (hypothesis-testing) and exploratory (hypothesis-generating) research clearer.

14 Confirmatory versus exploratory analysis
Context of confirmation: traditional hypothesis testing; results held to the highest standards of rigor; the goal is to minimize false positives; p-values are interpretable.
Context of discovery: pushes knowledge into new areas (data-led discovery); finds unexpected relationships; the goal is to minimize false negatives; p-values are meaningless.
“In statistics, hypotheses suggested by a given dataset, when tested with the same dataset that suggested them, are likely to be accepted even when they are not true. This is because circular reasoning (double dipping) would be involved: something seems true in the limited data set, therefore we hypothesize that it is true in general, therefore we (wrongly) test it on the same limited data set, which seems to confirm that it is true. Generating hypotheses based on data already observed, in the absence of testing them on new data, is referred to as post hoc theorizing (from Latin post hoc, ‘after this’). The correct procedure is to test any hypothesis on a data set that was not used to generate the hypothesis.”
Presenting exploratory results as confirmatory increases the publishability of results at the expense of their credibility.
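The circularity described in the quote is easy to demonstrate. In this sketch (all data synthetic, all names hypothetical), the strongest of twenty null predictors is “discovered,” then tested both on the data that suggested it and on fresh data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
X, y = rng.normal(size=(100, 20)), rng.normal(size=100)   # 20 null predictors

# Hypothesis suggested by the data: the predictor most correlated with y.
best = max(range(20), key=lambda j: abs(stats.pearsonr(X[:, j], y)[0]))

# Double dipping: test on the same data that suggested the hypothesis.
_, p_same = stats.pearsonr(X[:, best], y)

# Correct procedure: test on data not used to generate the hypothesis.
X2, y2 = rng.normal(size=(100, 20)), rng.normal(size=100)
_, p_new = stats.pearsonr(X2[:, best], y2)

print(f"same data: p = {p_same:.3f}; new data: p = {p_new:.3f}")
```

Run repeatedly, the same-data p-value falls below .05 far more often than 5% of the time, while the new-data p-value behaves like an honest null p-value.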

18 Example Workflow 1
Theory-driven, a priori expectations → Collect new data → Confirmation phase (hypothesis testing) → Discovery phase (exploratory research, hypothesis generating).

21 Example Workflow 2
Few a priori expectations → Collect data → Split the data (keep one half secret!) → Discovery phase (exploratory research, hypothesis generating) on one half → Confirmation phase (hypothesis testing) on the held-out half.
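A sketch of Workflow 2 in code; the 50/50 split, sample sizes, and variable names are assumptions for illustration. The confirmation half is never touched during discovery:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
X, y = rng.normal(size=(400, 10)), rng.normal(size=400)   # stand-in dataset

# Split once, up front; the confirmation half stays "secret."
idx = rng.permutation(len(y))
explore, confirm = idx[:200], idx[200:]

# Discovery phase: dredge the exploratory half however you like.
corrs = [stats.pearsonr(X[explore, j], y[explore])[0] for j in range(10)]
best = int(np.argmax(np.abs(corrs)))   # the hypothesis the data suggested

# Confirmation phase: one preregistered test on the held-out half.
r, p = stats.pearsonr(X[confirm, best], y[confirm])
print(f"variable {best}: r = {r:.2f}, p = {p:.3f}")
```

Because the confirmation half played no role in generating the hypothesis, its p-value retains its usual interpretation.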

24 Incentives to Preregister
cos.io/prereg cos.io/badges

25 How do you preregister?

26 https://osf.io/prereg


30 Tips for writing up preregistered work
Include a link to your preregistration. Report the results of ALL preregistered analyses. ANY unregistered analyses must be transparently disclosed as such.

31 FAQ: Does preregistration work?
Franco, A., Malhotra, N., & Simonovits, G. (2015). Underreporting in Political Science Survey Experiments: Comparing Questionnaires to Published Results.

                         Reported tests (122)   Unreported tests (147)
Median p-value           .02                    .35
Median effect size (d)   .29                    .13
Share with p < .05       63%                    23%

34 FAQ: Does preregistration work?
Positive results dropped from 57% to 8% after preregistration was required (Kaplan & Irvin, 2015, Likelihood of Null Effects of Large NHLBI Clinical Trials Has Increased over Time).

35 FAQ: Can’t someone “scoop” my ideas?
Date-stamped preregistrations make your claim verifiable, and by the time you have preregistered you are ahead of any would-be scooper. You can also embargo your preregistration.

37 FAQ: Isn’t it easy to cheat?
Possible ways: creating a “preregistration” after conducting the study, or making multiple preregistrations and citing only the one that “worked.” Both are fairly easy to do, but preregistration makes such fraud harder and more clearly intentional, and it helps keep you honest with yourself.

39 Registered Reports

40 Registered Reports: Stage 1 review
Are the hypotheses well founded?
Are the methods and proposed analyses feasible and sufficiently detailed?
Is the study well powered? (≥90%; see the sketch after this list)
Have the authors included sufficient positive controls to confirm that the study will provide a fair test?
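For the power requirement, Stage 1 reviewers typically expect an explicit sample-size justification. A minimal sketch using statsmodels; the effect size d = 0.4 is an assumed illustration, not a recommendation:

```python
import math
from statsmodels.stats.power import TTestIndPower

# Per-group n for a two-sided, two-sample t-test at alpha = .05
# with the 90% power a Registered Report asks for, assuming d = 0.4.
n = TTestIndPower().solve_power(effect_size=0.4, alpha=0.05, power=0.90)
print(f"n per group: {math.ceil(n)}")   # ~133 per group
```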

41 Registered Reports: Stage 2 review
Did the authors follow the approved protocol?
Did the positive controls succeed?
Are the conclusions justified by the data?

42 None of these things matter
Chambers, 2017

43 Registered Reports are currently accepted by 49 journals. Learn more at cos.io/rr


46 Make registrations discoverable across all registries
Provide tools for communities to create and manage their own registries, allowing them to devote resources to community building and standards creation.

47 Thank you! Find this presentation at: https://osf.io/pw28e/
Contact me at david@cos.io. Our mission is to provide expertise, tools, and training to help researchers create and promote open science within their teams and institutions. Promoting these practices within the research funding and publishing communities accelerates scientific progress.

48 Literature cited
Chambers, C. D., Feredoes, E., Muthukumaraswamy, S. D., & Etchells, P. (2014). Instead of “playing the game” it is time to change the rules: Registered Reports at AIMS Neuroscience and beyond. AIMS Neuroscience, 1(1), 4–17.
Chambers, C. D. (2017). The Seven Deadly Sins of Psychology: A Manifesto for Reforming the Culture of Scientific Practice. Princeton University Press.
Fanelli, D. (2010). “Positive” Results Increase Down the Hierarchy of the Sciences. PLoS ONE, 5(4).
Franco, A., Malhotra, N., & Simonovits, G. (2014). Publication bias in the social sciences: Unlocking the file drawer. Science, 345(6203), 1502–1505.
Franco, A., Malhotra, N., & Simonovits, G. (2015). Underreporting in Political Science Survey Experiments: Comparing Questionnaires to Published Results. Political Analysis, 23(2), 306–312.
Gelman, A., & Loken, E. (2013). The garden of forking paths: Why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time. Department of Statistics, Columbia University.
Kaplan, R. M., & Irvin, V. L. (2015). Likelihood of Null Effects of Large NHLBI Clinical Trials Has Increased over Time. PLoS ONE, 10(8).
Kerr, N. L. (1998). HARKing: Hypothesizing After the Results are Known. Personality and Social Psychology Review, 2(3), 196–217.
Nosek, B. A., Spies, J. R., & Motyl, M. (2012). Scientific Utopia: II. Restructuring Incentives and Practices to Promote Truth Over Publishability. Perspectives on Psychological Science, 7(6), 615–631.
Ratner, K., Burrow, A. L., & Thoemmes, F. (2016). The effects of exposure to objective coherence on perceived meaning in life: a preregistered direct replication of Heintzelman, Trent & King (2013). Royal Society Open Science, 3(11).

