Preregistration on the Open Science Framework

Slides:

Advertisements

Similar presentations

Introduction to Research

Advertisements

Aim: How to test a hypothesis and design an experiment Do Now: 1) What do you need to know in order to test your hypothesis? 2)Take out your homework.

Research Design. Research is based on Scientific Method Propose a hypothesis that is testable Objective observations are collected Results are analyzed.

Undergraduate Dissertation Preparation – Research Strategy.

1 Copyright © 2011 by Saunders, an imprint of Elsevier Inc. Chapter 8 Clarifying Quantitative Research Designs.

Copyright © Allyn & Bacon 2008 Intelligent Consumer Chapter 14 This multimedia product and its contents are protected under copyright law. The following.

Fall 2009 Dr. Bobby Franklin.  “... [the] systematic, controlled empirical and critical investigation of natural phenomena guided by theory and hypotheses.

Introduction to Research. Purpose of Research Evidence-based practice Validate clinical practice through scientific inquiry Scientific rational must exist.

Experimental Psychology PSY 433 Chapter 5 Research Reports.

Scientific Utopia: Improving Openness and Reproducibility Brian Nosek University of Virginia Center for Open Science.

Practical Steps for Increasing Openness and Reproducibility Courtney Soderberg Statistical and Methodological Consultant Center for Open Science.

Critiquing Quantitative Research.  A critical appraisal is careful evaluation of all aspects of a research study in order to assess the merits, limitations,

Webinar on increasing openness and reproducibility April Clyburne-Sherin Reproducible Research Evangelist

Practical Steps for Increasing Openness and Reproducibility Courtney Soderberg Statistical and Methodological Consultant Center for Open Science.

Brian Nosek University of Virginia -- Center for Open Science -- Improving Openness.

David Mellor, PhD Project Manager at Improving Openness and Reproducibility of Scientific Research.

BERKELEY INITIATIVE FOR TRANSPARENCY IN THE SOCIAL SCIENCES Garret Christensen, Research Fellow BITSS and Berkeley Institute for Data Science.

Sara Bowman Center for Open Science | Promoting, Supporting, and Incentivizing Openness in Scientific Research.

Brian Nosek University of Virginia -- Center for Open Science -- Improving Openness.

1 Basics of Inferential Statistics Mark A. Weaver, PhD Family Health International Office of AIDS Research, NIH ICSSC, FHI Lucknow, India, March 2010.

Sara Bowman Center for Open Science | Promoting, Supporting, and Incentivizing Openness in Scientific Research.

WHAT IS THE NATURE OF SCIENCE?

David Preregistration David

David Mellor Building infrastructure to connect, preserve, speed up, and improve scholarship David Mellor

Writing a sound proposal

Safer science: making psychological science more Transparent & replicable Simine Vazire UC Davis.

KARL POPPER ON THE PROBLEM OF A THEORY OF SCIENTIFIC METHOD

What is Open Science and How do I do it?

Research Problem, Questions and Hypotheses

Transparency increases credibility and relevance of research

Study pre-registration

Methods of Science Chapter 1 Section 3.

Improving Openness and Reproducibility of Scientific Research

Improving Openness and Reproducibility of Scientific Research

What is Open Science and How do I do it?

By Prof. Dr. Salahuddin Khan

Statistics in Clinical Trials: Key Concepts

Chapter 1 The Science of Biology.

How to improve our science

Shifting the research culture toward openness and reproducibility

Experimental Psychology

Three points 1. Scientists’ Conflict of Interest 2

David Mellor Building infrastructure to connect, preserve, speed up, and improve scholarship David Mellor

Preregistration Challenge

AF1: Thinking Scientifically

Anna E. van ‘t Veer1, Roger Giner-Sorolla2, Christiane J. A

Transparency increases the credibility and relevance of research

Scientific Method What is the Scientific Method?

WHAT IS THE NATURE OF SCIENCE?

The Reproducible Research Advantage

Reinventing Scholarly Communication by Separating Publication From Evaluation Brian Nosek University of Virginia -- Center for Open Science

Shifting incentives from getting it published to getting it right

Incentives for a more #openscience

Methods Hour: Preregistration

Study Pre-Registration

Disrupting Scholarly Communication

IS Psychology A Science?

Challenges for Journals: Encouraging Sound Science

IS Psychology A Science?

School of Psychology, Cardiff University

Nature of Science Dr. Charles Ophardt EDU 370.

Methods of Science Chapter 1 Section 3.

Psych 231: Research Methods in Psychology

Psych 231: Research Methods in Psychology

Debate issues Sabine Mendes Lima Moura Issues in Research Methodology

Social Research.

Scientific Laws & Theories

Open Science & Reproducibility

Psychological Experimentation

Presentation transcript:

Preregistration on the Open Science Framework David Mellor @EvoMellor david@cos.io Find this presentation at https://osf.io/9kyqt/

The combination of a strong bias toward statistically significant findings and flexibility in data analysis results in irreproducible research

The combination of a strong bias toward statistically significant findings and flexibility in data analysis results in irreproducible research https://simplystatistics.org/2017/07/26/announcing-the-tidypvals-package/

The Garden of Forking Paths The combination of a strong bias toward statistically significant findings and flexibility in data analysis results in irreproducible research The Garden of Forking Paths Control for time? Exclude outliers? Median or mean? “Does X affect Y?” Gelman and Loken, 2013

The combination of a strong bias toward statistically significant findings and flexibility in data analysis results in irreproducible research “Many analysts, one dataset: Making transparent how variations in analytical choices affect results” https://psyarxiv.com/qkwst/ Image via 538.com https://fivethirtyeight.com/features/science-isnt-broken/#part1

The combination of a strong bias toward statistically significant findings and flexibility in data analysis results in irreproducible research p-values Original Studies Replications 97% “significant” 37% “significant”

Is there a Reproducibility Crisis? Baker, 2016

Is there a Reproducibility Crisis? Nature survey of 1,576 researchers Baker, 2016

Preregistration increases credibility by specifying in advance how data will be analyzed, thus preventing biased reasoning from affecting data analysis. cos.io/prereg

What is a preregistration? A time-stamped, read-only version of your research plan created before the study. Study plan: Hypothesis Data collection procedures Manipulated and measured variables Analysis plan: Statistical model Inference criteria

What is a preregistration? When the research plan undergoes peer review before results are known, the preregistration becomes part of a Registered Report cos.io/rr

Registered Reports

Registered Reports Are the hypotheses well founded? Are the methods and proposed analyses feasible and sufficiently detailed? Is the study well powered? (≥90%) Have the authors included sufficient positive controls to confirm that the study will provide a fair test?

Registered Reports Did the authors follow the approved protocol? Did positive controls succeed? Are the conclusions justified by the data?

None of these things matter Chambers, 2017

What problems do preregistration fix? The file drawer

What problems do preregistration fix? The file drawer P-Hacking: Unreported flexibility in data analysis

What problems do preregistration fix? The file drawer P-Hacking: Unreported flexibility in data analysis HARKing: Hypothesizing After Results are Known Dataset Hypothesis Kerr, 1998

What problems do preregistration fix? The file drawer P-Hacking: Unreported flexibility in data analysis HARKing: Hypothesizing After Results are Known Dataset Hypothesis Kerr, 1998

What problems do preregistration fix? The file drawer P-Hacking: Unreported flexibility in data analysis HARKing: Hypothesizing After Results are Known Preregistration makes the distinction between confirmatory (hypothesis testing) and exploratory (hypothesis generating) research more clear. Dataset Hypothesis Kerr, 1998

Confirmatory versus exploratory analysis “In statistics, hypotheses suggested by a given dataset, when tested with the same dataset that suggested them, are likely to be accepted even when they are not true. This is because circular reasoning (double dipping) would be involved: something seems true in the limited data set, therefore we hypothesize that it is true in general, therefore we (wrongly) test it on the same limited data set, which seems to confirm that it is true. Generating hypotheses based on data already observed, in the absence of testing them on new data, is referred to as post hoc theorizing (from Latin post hoc, "after this"). The correct procedure is to test any hypothesis on a data set that was not used to generate the hypothesis.”

Confirmatory versus exploratory analysis Context of confirmation Traditional hypothesis testing Results held to the highest standards of rigor Goal is to minimize false positives P-values interpretable “In statistics, hypotheses suggested by a given dataset, when tested with the same dataset that suggested them, are likely to be accepted even when they are not true. This is because circular reasoning (double dipping) would be involved: something seems true in the limited data set, therefore we hypothesize that it is true in general, therefore we (wrongly) test it on the same limited data set, which seems to confirm that it is true. Generating hypotheses based on data already observed, in the absence of testing them on new data, is referred to as post hoc theorizing (from Latin post hoc, "after this"). The correct procedure is to test any hypothesis on a data set that was not used to generate the hypothesis.”

Confirmatory versus exploratory analysis Context of confirmation Traditional hypothesis testing Results held to the highest standards of rigor Goal is to minimize false positives P-values interpretable Context of discovery Pushes knowledge into new areas/ data-led discovery Finds unexpected relationships Goal is to minimize false negatives P-values meaningless “In statistics, hypotheses suggested by a given dataset, when tested with the same dataset that suggested them, are likely to be accepted even when they are not true. This is because circular reasoning (double dipping) would be involved: something seems true in the limited data set, therefore we hypothesize that it is true in general, therefore we (wrongly) test it on the same limited data set, which seems to confirm that it is true. Generating hypotheses based on data already observed, in the absence of testing them on new data, is referred to as post hoc theorizing (from Latin post hoc, "after this"). The correct procedure is to test any hypothesis on a data set that was not used to generate the hypothesis.”

Confirmatory versus exploratory analysis Context of confirmation Traditional hypothesis testing Results held to the highest standards of rigor Goal is to minimize false positives P-values interpretable Context of discovery Pushes knowledge into new areas/ data-led discovery Finds unexpected relationships Goal is to minimize false negatives P-values meaningless “In statistics, hypotheses suggested by a given dataset, when tested with the same dataset that suggested them, are likely to be accepted even when they are not true. This is because circular reasoning (double dipping) would be involved: something seems true in the limited data set, therefore we hypothesize that it is true in general, therefore we (wrongly) test it on the same limited data set, which seems to confirm that it is true. Generating hypotheses based on data already observed, in the absence of testing them on new data, is referred to as post hoc theorizing (from Latin post hoc, "after this"). The correct procedure is to test any hypothesis on a data set that was not used to generate the hypothesis.” Presenting exploratory results as confirmatory increases the publishability of results at the expense of credibility of results.

What problems do preregistration fix? The file drawer P-Hacking: Unreported flexibility in data analysis HARKing: Hypothesizing After Results are Known Registered Reports also address publication bias. They ALSO improve the study design, by getting peer review into the process sooner.

Collect New Data Example Workflow 1 Theory driven, a-priori expectations Collect New Data

Collect New Data Confirmation Phase Hypothesis testing Example Workflow 1 Theory driven, a-priori expectations Collect New Data Confirmation Phase Hypothesis testing

Hypothesis generating Example Workflow 1 Theory driven, a-priori expectations Collect New Data Confirmation Phase Hypothesis testing Discovery Phase Exploratory research Hypothesis generating

Example Workflow 2 Few a-priori expectations Collect Data

Hypothesis generating Example Workflow 2 Few a-priori expectations Collect Data Keep these data secret! Split Data Discovery Phase Exploratory research Hypothesis generating

Hypothesis generating Example Workflow 2 Few a-priori expectations Collect Data Keep these data secret! Split Data Discovery Phase Exploratory research Hypothesis generating Confirmation Phase Hypothesis testing

How do you preregister?

https://osf.io/prereg

Tips for writing up preregistered work Include a link to your preregistration (e.g. https://osf.io/f45xp) Report the results of ALL preregistered analyses ANY unregistered analyses must be transparent

FAQ: Does preregistration work? Underreporting in Political Science Survey Experiments: Comparing Questionnaires to Published Results. Franco, A., Malhotra, N., & Simonovits, G. (2015).

FAQ: Does preregistration work? Reported Tests (122) Median p-value = .02 Median effect size (d) = .29 % p < .05 = 63% Underreporting in Political Science Survey Experiments: Comparing Questionnaires to Published Results. Franco, A., Malhotra, N., & Simonovits, G. (2015).

FAQ: Does preregistration work? Reported Tests (122) Median p-value = .02 Median effect size (d) = .29 % p < .05 = 63% Unreported Tests (147) Median p-value = .35 Median effect size (d) = .13 % p < .05 = 23% Underreporting in Political Science Survey Experiments: Comparing Questionnaires to Published Results. Franco, A., Malhotra, N., & Simonovits, G. (2015).

FAQ: Can’t someone “scoop” my ideas?

FAQ: Can’t someone “scoop” my ideas? Date-stamped preregistrations make your claim verifiable. By the time you’ve preregistered, you are ahead of any possible scooper! Embargo your preregistration (up to 4 years).

FAQ: Isn’t it easy to cheat? Making a “preregistration” after conducting the study. Making multiple preregistrations and only citing the one that “worked.” While fairly easy to do, this makes fraud harder to do and more intentional. Preregistration helps keep you honest to yourself.

Thank you! Find this presentation at https://osf.io/9kyqt/ Start your preregistration now! https://osf.io/prereg Links to lots of resources at https://cos.io/prereg Find me online @EvoMellor or email: david@cos.io Our mission is to provide expertise, tools, and training to help researchers create and promote open science within their teams and institutions. Promoting these practices within the research funding and publishing communities accelerates scientific progress.

Literature cited Chambers, C. D., Feredoes, E., Muthukumaraswamy, S. D., & Etchells, P. (2014). Instead of “playing the game” it is time to change the rules: Registered Reports at AIMS Neuroscience and beyond. AIMS Neuroscience, 1(1), 4–17. https://doi.org/10.3934/Neuroscience2014.1.4 Franco, A., Malhotra, N., & Simonovits, G. (2014). Publication bias in the social sciences: Unlocking the file drawer. Science, 345(6203), 1502–1505. https://doi.org/10.1126/science.1255484 Franco, A., Malhotra, N., & Simonovits, G. (2015). Underreporting in Political Science Survey Experiments: Comparing Questionnaires to Published Results. Political Analysis, 23(2), 306–312. https://doi.org/10.1093/pan/mpv006 Gelman, A., & Loken, E. (2013). The garden of forking paths: Why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time. Department of Statistics, Columbia University. Retrieved from www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf Kaplan, R. M., & Irvin, V. L. (2015). Likelihood of Null Effects of Large NHLBI Clinical Trials Has Increased over Time. PLoS ONE, 10(8), e0132382. https://doi.org/10.1371/journal.pone.0132382 Kerr, N. L. (1998). HARKing: Hypothesizing After the Results are Known. Personality and Social Psychology Review, 2(3), 196–217. https://doi.org/10.1207/s15327957pspr0203_4 Nosek, B. A., Spies, J. R., & Motyl, M. (2012). Scientific Utopia: II. Restructuring Incentives and Practices to Promote Truth Over Publishability. Perspectives on Psychological Science, 7(6), 615–631. https://doi.org/10.1177/1745691612459058 Ratner, K., Burrow, A. L., & Thoemmes, F. (2016). The effects of exposure to objective coherence on perceived meaning in life: a preregistered direct replication of Heintzelman, Trent & King (2013). Royal Society Open Science, 3(11), 160431. https://doi.org/10.1098/rsos.160431