1
Incentives for a more #openscience
David Mellor @EvoMellor. Find this presentation at https://osf.io/9kyqt/
2
Our mission is to increase openness, integrity, and reproducibility of research.
3
Open science… …improves the practice of research.
…exposes our own biases and increases the credibility of our results. …can help us where we need it most: with publications and credit.
4
Fear: People will discover my lab isn’t perfect
5
Reality: Opening our workflow makes our science better
6
OPEN WORKFLOW
Increases process transparency. Increases accountability.
Facilitates reproducibility. Facilitates metascience.
Fosters collaboration. Fosters inclusivity. Fosters innovation.
Protects against lock-in: Open + Accessible.
7
Free, open source scientific commons
OPEN SCIENCE FRAMEWORK Free, open source scientific commons
8
Collaboration, Documentation,
It is a collaboration, documentation, and archiving system. It is easy to curate a project, add contributors, and share, and there is also an activity log. To users, that log is either meaningless or an easy way to stay up to date; to us, it is provenance: important information about how science happens. Collaboration, Documentation, Archiving, Sharing
9
Manage access and permissions
10
Put data, materials, and code on the OSF
Quite simply, the OSF is a file repository that allows any sort of file to be stored and many types to be rendered in the browser without any special software. This is very important for increasing the accessibility of research.
11
Automatic file versioning
The OSF has built-in versioning for file changes, making it easy to keep your research up to date. When you or any collaborator adds a new version of a file with the same name to your OSF project, the OSF automatically updates the file and maintains an accessible record of all previous versions stored in OSF Storage and many of our add-ons. The versions of one such file (Bowman.ACS) are highlighted on this slide. To the user, it is easy backup: drag-and-drop version control. For our ideals, it is important provenance.
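As an aside not on the slide: programmatic uploads get the same automatic versioning. Below is a rough sketch using the third-party osfclient Python package; the constructor and upload call are written from memory and should be checked against osfclient's documentation, and the token, project id, and file name are placeholders.

```python
# Rough sketch (assumptions: the third-party `osfclient` package, token-based auth,
# and a Storage.create_file upload call; verify against osfclient's docs).
# Re-uploading a file with the same name creates a new version on the OSF
# rather than overwriting the old one.
from osfclient import OSF

osf = OSF(token="YOUR_OSF_TOKEN")        # personal access token (placeholder)
project = osf.project("abc12")           # hypothetical 5-character project id
storage = project.storage("osfstorage")  # the default OSF Storage provider

with open("data.csv", "rb") as fp:       # placeholder file name
    storage.create_file("data.csv", fp, force=True)  # re-upload -> new version
```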
12
Incentives for individual success are focused on getting it published, not getting it right
Nosek, Spies, & Motyl, 2012
13
Solutions, Option A: Mandates. Culture of distrust. Continued antagonism.
14
Reward actions that are idealized scientific practice.
Solutions, Option B: Reward actions that are idealized scientific practice. The "truth" is versioned. Pointing out mistakes enhances the reputations of all parties.
15
Getting to Option B: Allow peers to receive recognition for ideal, optional behaviors. Identify biases in analysis and publication, and build tools and workflows that take them into account.
16
Signals: Making Behaviors Visible Promotes Adoption
Badges: Open Data, Open Materials, Preregistration. Adopted by Psychological Science (Jan 2014). Kidwell et al., 2016
17
[Chart: % of articles reporting data available in a repository]
20
The combination of a strong bias toward statistically significant findings and flexibility in data analysis results in irreproducible research
22
The Garden of Forking Paths
The Garden of Forking Paths: "Does X affect Y?" Control for time? Exclude outliers? Median or mean? Gelman and Loken, 2013
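To make the forking paths concrete, here is a small simulation added for illustration (it is not from the talk): with no true effect, one pre-specified test holds the nominal 5% false-positive rate, while reporting the best of four defensible analysis choices does not. The specific choices and sample sizes are assumptions.

```python
# Illustrative simulation (not from the talk): no true group difference exists,
# yet picking the best-looking of several "reasonable" analysis paths inflates
# the false-positive rate well above the nominal 5%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_sims, n_per_group = 2000, 30

def drop_outliers(x):
    z = (x - x.mean()) / x.std()
    return x[np.abs(z) < 2.5]

def analysis_paths(a, b):
    """p-values from four defensible analysis choices (the forking paths)."""
    return [
        stats.ttest_ind(a, b).pvalue,                                # compare means, all data
        stats.ttest_ind(drop_outliers(a), drop_outliers(b)).pvalue,  # means, outliers excluded
        stats.mannwhitneyu(a, b).pvalue,                             # rank-based test, all data
        stats.mannwhitneyu(drop_outliers(a), drop_outliers(b)).pvalue,
    ]

prereg_hits = forked_hits = 0
for _ in range(n_sims):
    a = rng.normal(size=n_per_group)  # both groups drawn from the same distribution
    b = rng.normal(size=n_per_group)
    ps = analysis_paths(a, b)
    prereg_hits += ps[0] < 0.05       # one pre-specified analysis
    forked_hits += min(ps) < 0.05     # report whichever path "worked"

print(f"False positives, single pre-specified test: {prereg_hits / n_sims:.1%}")
print(f"False positives, best of four forking paths: {forked_hits / n_sims:.1%}")
```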
23
"Many analysts, one dataset: Making transparent how variations in analytical choices affect results." Image via 538.com
24
p-values in original studies vs. replications: 97% of original studies "significant," 37% of replications "significant."
25
Is there a Reproducibility Crisis?
Baker, 2016
26
Is there a Reproducibility Crisis?
Nature survey of 1,576 researchers Baker, 2016
27
Preregistration increases credibility by specifying in advance how data will be analyzed, thus preventing biased reasoning from affecting data analysis. cos.io/prereg
28
What is a preregistration?
A time-stamped, read-only version of your research plan, created before the study.
Study plan: hypothesis; data collection procedures; manipulated and measured variables.
Analysis plan: statistical model; inference criteria.
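One lightweight way to pin down an analysis plan is to write the confirmatory analysis as a script before any data exist and register it alongside the study plan. The sketch below is an illustration added here; the column names, test, and inference criterion are placeholders, not part of the talk.

```python
# Hypothetical sketch of a frozen, pre-specified analysis (written before data
# collection). Column names, the test, and the alpha level are placeholders.
import pandas as pd
from scipy import stats

ALPHA = 0.05  # inference criterion, fixed in advance
PREDICTION = "treatment scores higher than control on 'outcome'"

def confirmatory_test(df: pd.DataFrame) -> dict:
    """Run exactly the pre-specified test; anything else is exploratory."""
    treatment = df.loc[df["condition"] == "treatment", "outcome"]
    control = df.loc[df["condition"] == "control", "outcome"]
    t, p = stats.ttest_ind(treatment, control)
    return {"prediction": PREDICTION, "t": t, "p": p,
            "supported": (p < ALPHA) and (t > 0)}

# After data collection:
# results = confirmatory_test(pd.read_csv("study_data.csv"))
```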
29
What problems does preregistration fix?
The file drawer
30
What problems does preregistration fix?
The file drawer P-Hacking: Unreported flexibility in data analysis
31
What problems does preregistration fix?
The file drawer. P-Hacking: unreported flexibility in data analysis. HARKing: Hypothesizing After Results are Known (Dataset → Hypothesis; Kerr, 1998)
33
What problems does preregistration fix?
The file drawer. P-Hacking: unreported flexibility in data analysis. HARKing: Hypothesizing After Results are Known (Kerr, 1998). Preregistration makes the distinction between confirmatory (hypothesis-testing) and exploratory (hypothesis-generating) research clearer.
34
Confirmatory versus exploratory analysis
“In statistics, hypotheses suggested by a given dataset, when tested with the same dataset that suggested them, are likely to be accepted even when they are not true. This is because circular reasoning (double dipping) would be involved: something seems true in the limited data set, therefore we hypothesize that it is true in general, therefore we (wrongly) test it on the same limited data set, which seems to confirm that it is true. Generating hypotheses based on data already observed, in the absence of testing them on new data, is referred to as post hoc theorizing (from Latin post hoc, "after this"). The correct procedure is to test any hypothesis on a data set that was not used to generate the hypothesis.”
35
Confirmatory versus exploratory analysis
Context of confirmation: traditional hypothesis testing; results held to the highest standards of rigor; goal is to minimize false positives; p-values interpretable.
36
Confirmatory versus exploratory analysis
Context of discovery: pushes knowledge into new areas / data-led discovery; finds unexpected relationships; goal is to minimize false negatives; p-values meaningless.
37
Confirmatory versus exploratory analysis
Presenting exploratory results as confirmatory increases the publishability of results at the expense of the credibility of results.
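The circularity described in the quote can be demonstrated with a short simulation (added here for illustration, not from the slides): pick the most promising of 20 pure-noise predictors in one sample, then "test" that hypothesis on the same data versus on new data. Sample sizes and counts are arbitrary assumptions.

```python
# Illustration of double dipping (not from the talk): with 20 noise predictors
# and no true effects, the variable that looks best in one dataset usually
# "confirms" on the same data but rarely on fresh data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k, n_sims = 50, 20, 1000
same_data_sig = new_data_sig = 0

for _ in range(n_sims):
    y = rng.normal(size=n)
    X = rng.normal(size=(n, k))  # 20 candidate predictors, all pure noise
    # "Discovery": pick the predictor most correlated with y in this sample.
    best = max(range(k), key=lambda j: abs(stats.pearsonr(X[:, j], y)[0]))
    # Double dipping: test the suggested hypothesis on the SAME data.
    same_data_sig += stats.pearsonr(X[:, best], y)[1] < 0.05
    # Correct procedure: measure the chosen predictor and outcome in NEW data.
    new_x, new_y = rng.normal(size=n), rng.normal(size=n)
    new_data_sig += stats.pearsonr(new_x, new_y)[1] < 0.05

print(f"'Significant' when tested on the same data: {same_data_sig / n_sims:.0%}")
print(f"'Significant' when tested on new data:      {new_data_sig / n_sims:.0%}")
```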
38
What problems does preregistration fix?
The file drawer. P-Hacking: unreported flexibility in data analysis. HARKing: Hypothesizing After Results are Known. Registered Reports also address publication bias, and they improve study design by getting peer review into the process sooner.
39
Example Workflow 1
Theory-driven, a priori expectations → Collect New Data
40
Example Workflow 1
Theory-driven, a priori expectations → Collect New Data → Confirmation Phase (hypothesis testing)
41
Example Workflow 1
Theory-driven, a priori expectations → Collect New Data → Confirmation Phase (hypothesis testing) → Discovery Phase (exploratory research, hypothesis generating)
42
Example Workflow 2
Few a priori expectations → Collect Data
43
Example Workflow 2
Few a priori expectations → Collect Data → Split Data (keep the confirmation set secret!) → Discovery Phase (exploratory research, hypothesis generating)
44
Example Workflow 2
Few a priori expectations → Collect Data → Split Data (keep the confirmation set secret!) → Discovery Phase (exploratory research, hypothesis generating) → Confirmation Phase (hypothesis testing)
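A minimal sketch of Workflow 2's split-data step, added here for illustration; the file names, the 50/50 split, and the seed are assumptions.

```python
# Hypothetical sketch of the "Split Data" step in Example Workflow 2.
# File names and the 50/50 split proportion are placeholders.
import pandas as pd

data = pd.read_csv("all_data.csv")

# Reproducible random split: half for exploration, half held out for confirmation.
explore = data.sample(frac=0.5, random_state=2024)
confirm = data.drop(explore.index)

explore.to_csv("exploration_set.csv", index=False)
confirm.to_csv("confirmation_set.csv", index=False)  # keep these data secret!

# Discovery phase: explore exploration_set.csv freely and generate hypotheses.
# Then preregister the hypotheses and analysis plan (e.g. via https://osf.io/prereg)
# before running the pre-specified analysis once on confirmation_set.csv.
```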
45
How do you preregister?
46
https://osf.io/prereg
50
Tips for writing up preregistered work
Include a link to your preregistration. Report the results of ALL preregistered analyses. ANY unregistered analyses must be transparently labeled as such.
51
What is a Registered Report?
When the research plan undergoes peer review before results are known, the preregistration becomes part of a Registered Report cos.io/rr
52
Registered Reports
53
Registered Reports
Are the hypotheses well founded? Are the methods and proposed analyses feasible and sufficiently detailed? Is the study well powered (≥90%)? (See the power sketch below.) Have the authors included sufficient positive controls to confirm that the study will provide a fair test?
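For the power question above, reviewers typically expect an a priori power analysis. The sketch below is an added illustration using statsmodels; the assumed effect size is a placeholder, and only the ≥90% power criterion comes from the slide.

```python
# Hedged sketch of an a-priori power analysis for a two-group design.
# The smallest effect size of interest (d = 0.4) is an assumed placeholder.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.4,          # smallest effect of interest (assumption)
    alpha=0.05,
    power=0.90,               # the >=90% criterion from the slide
    alternative="two-sided",
)
print(f"Participants needed per group: {n_per_group:.0f}")
```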
54
Registered Reports Did the authors follow the approved protocol?
Did positive controls succeed? Are the conclusions justified by the data?
55
None of these things matter
Chambers, 2017
56
FAQ: Does preregistration work?
Underreporting in Political Science Survey Experiments: Comparing Questionnaires to Published Results. Franco, A., Malhotra, N., & Simonovits, G. (2015).
57
FAQ: Does preregistration work?
Reported tests (122): median p-value = .02; median effect size (d) = .29; 63% with p < .05. Underreporting in Political Science Survey Experiments: Comparing Questionnaires to Published Results. Franco, A., Malhotra, N., & Simonovits, G. (2015).
58
FAQ: Does preregistration work?
Reported tests (122): median p-value = .02; median effect size (d) = .29; 63% with p < .05.
Unreported tests (147): median p-value = .35; median effect size (d) = .13; 23% with p < .05.
Underreporting in Political Science Survey Experiments: Comparing Questionnaires to Published Results. Franco, A., Malhotra, N., & Simonovits, G. (2015).
59
FAQ: Can’t someone “scoop” my ideas?
60
FAQ: Can’t someone “scoop” my ideas?
Date-stamped preregistrations make your claim verifiable. By the time you've preregistered, you are ahead of any possible scooper! You can embargo your preregistration for up to 4 years.
61
FAQ: Isn’t it easy to cheat?
Making a "preregistration" after conducting the study. Making multiple preregistrations and only citing the one that "worked." While these are fairly easy to do, preregistration makes such fraud harder and clearly intentional rather than accidental. Preregistration helps keep you honest with yourself.
62
Thank you! Find this presentation at https://osf.io/9kyqt/
Start your preregistration now! Find me at @EvoMellor. Our mission is to provide expertise, tools, and training to help researchers create and promote open science within their teams and institutions. Promoting these practices within the research funding and publishing communities accelerates scientific progress.
63
Literature cited
Chambers, C. D., Feredoes, E., Muthukumaraswamy, S. D., & Etchells, P. (2014). Instead of "playing the game" it is time to change the rules: Registered Reports at AIMS Neuroscience and beyond. AIMS Neuroscience, 1(1), 4–17.
Franco, A., Malhotra, N., & Simonovits, G. (2014). Publication bias in the social sciences: Unlocking the file drawer. Science, 345(6203), 1502–1505.
Franco, A., Malhotra, N., & Simonovits, G. (2015). Underreporting in Political Science Survey Experiments: Comparing Questionnaires to Published Results. Political Analysis, 23(2), 306–312.
Gelman, A., & Loken, E. (2013). The garden of forking paths: Why multiple comparisons can be a problem, even when there is no "fishing expedition" or "p-hacking" and the research hypothesis was posited ahead of time. Department of Statistics, Columbia University.
Kaplan, R. M., & Irvin, V. L. (2015). Likelihood of Null Effects of Large NHLBI Clinical Trials Has Increased over Time. PLoS ONE, 10(8).
Kerr, N. L. (1998). HARKing: Hypothesizing After the Results are Known. Personality and Social Psychology Review, 2(3), 196–217.
Nosek, B. A., Spies, J. R., & Motyl, M. (2012). Scientific Utopia: II. Restructuring Incentives and Practices to Promote Truth Over Publishability. Perspectives on Psychological Science, 7(6), 615–631.
Ratner, K., Burrow, A. L., & Thoemmes, F. (2016). The effects of exposure to objective coherence on perceived meaning in life: a preregistered direct replication of Heintzelman, Trent & King (2013). Royal Society Open Science, 3(11).