Research transparency and Reproducibility in Social Sciences


1 Research transparency and Reproducibility in Social Sciences
A Tale of Two Futures: Research Transparency and Reproducibility in the Social Sciences
Soazic Elise Wang Sonne, PhD Fellow, UNU-MERIT
September 14th, 2016
Research Transparency training workshop, 2016 British Society for Population Studies (BSPS) Conference, University of Winchester, UK

2 WHAT IS BITSS? (1)
The Berkeley Initiative for Transparency in the Social Sciences (BITSS) was established in 2012 by UC Berkeley's Center for Effective Global Action (CEGA). Its aim is to strengthen the quality of social science research and of the evidence used for policy-making by enhancing the practices of economists, psychologists, political scientists, and other social scientists (including demographers). Source: BITSS

3 WHAT IS BITSS? (2)
Norms + Consensus: build standards of openness, integrity, and transparency.
Tools + Resources: identify, fund, and develop tools and resources.
Education (BITSS Catalyst): deliver coursework for students, faculty, and researchers through our network.
Research: understand the problem, explore solutions, and monitor progress.
Recognition: create incentives and reward researchers who make the effort to disclose their data; reward exceptional achievements in the advancement of transparent social science (SSMART grants, Leamer-Rosenthal Prize, US$10,000).

4 ACADEMIC MISCONDUCT
Source: BITSS
Recently, a group of epidemiologists at the London School of Hygiene & Tropical Medicine replicated the study and reanalyzed the original data of the Kenya deworming trial (Miguel and Kremer 2004), uncovering a number of flaws in the research.

5 OVERVIEW OF UNRELIABLE RESEARCH (1)
Publication bias
P-hacking
Non-disclosure / selective reporting
Failure to replicate
Lack of transparency (universalism, communality, disinterestedness, organized skepticism)
Source: BITSS Catalyst Resource: Stephanie Wykstra

6 OVERVIEW OF UNRELIABLE RESEARCH :Publication Bias
Publication bias: the "file drawer problem."
Source: BITSS Catalyst Resources: Stephanie Wykstra

7 OVERVIEW OF UNRELIABLE RESEARCH (2)
Statistically significant results are more likely to be published, while null results are buried in the file drawer. Franco et al. (Science, 2014) found that, of a group of NSF-funded studies run through TESS (Time-sharing Experiments for the Social Sciences), 60% of those with positive, statistically significant results had been published, versus only 20% of those finding null results. (BITSS Catalyst Resource: Stephanie Wykstra)

8 OVERVIEW OF UNRELIABLE RESEARCH: P-Hacking
P-hacking is also called "data fishing," "data mining," or exploiting "researcher degrees of freedom." "If you torture your data long enough, they will confess to anything." – Ronald Coase. Researchers test hypotheses, i.e. look for relationships among variables (e.g. schooling and test scores). In particular, a result that is statistically significant at p < 0.05 is often considered noteworthy and thus more publishable.
Source: BITSS Catalyst Resources: Stephanie Wykstra

9 OVERVIEW OF UNRELIABLE RESEARCH: P-Hacking
In economics: Brodeur et al. (AEJ 2016). Data: 50,000 tests published in the AER, JPE, and QJE (2005–2011).
Source: BITSS Catalyst Resources: Stephanie Wykstra
*Figure 1 shows a skewed distribution of p-values (which are used to determine the statistical significance of results) across publications. There is a non-random excess of reported p-values just below 0.05 (the threshold commonly used in the social sciences), suggesting that researchers tweak data to confirm hypotheses and increase the likelihood of publication (or that journal editors discriminate against "barely not significant" estimates). The figure alone does not tell us whether the skew comes from data mining or from honest researchers facing editors who discriminate against marginally insignificant estimates. In fact the curve should bend in the opposite direction: there should be more outcomes with p-values above 0.05, or, for a null effect, a uniform distribution (a flat line). Using the 50,000 tests published between 2005 and 2011 in the AER, JPE, and QJE, Brodeur et al. identify a residual in the distribution of tests that cannot be explained by selection. The distribution of p-values exhibits a camel shape, with abundant p-values above 0.25, a valley between 0.25 and 0.10, and a bump slightly under 0.05. The missing tests are those that would have been accepted but were close to being rejected (p-values between 0.25 and 0.10). They show that this pattern corresponds to a shift in the distribution of p-values: between 10% and 20% of marginally rejected tests are misallocated. Their interpretation is that researchers may be tempted to inflate the value of their tests by choosing the specification that yields the highest statistics.

10 OVERVIEW OF UNRELIABLE RESEARCH: Selective Reporting
Selective reporting: cherry-picking which results to report. Franco, Malhotra, and Simonovits (2015) find that, of the studies run through TESS, roughly 60% of papers report fewer outcome variables than are listed in the questionnaire. If many relevant outcomes are not mentioned in the paper (and are not listed elsewhere), how can we be confident that the reported results were true effects rather than just noise?
Source: BITSS Catalyst Resources: Stephanie Wykstra

11 OVERVIEW OF UNRELIABLE RESEARCH: Replication
"Replication" is often used to mean different things. Here are three activities the term can refer to:
(1) Verification and re-analysis: checking that the original data and code can produce the published results, as well as going further to check the robustness of the results.
(2) Reproduction: testing whether the results hold up when the study is conducted again in a very similar way.
(3) Extension: investigating whether the results hold up when the study is conducted in another place, under different conditions, etc., to test external validity.
Source: Michael Clemens (2014); BITSS Catalyst Resources: Stephanie Wykstra

12 Well-known replications:
OVERVIEW OF UNRELIABLE RESEARCH: Replication failure
Well-known replications:
Broockman, Kalla, and Aronow (2015): attempt to reproduce LaCour and Green's (2014) Science paper "When contact changes minds: An experiment on transmission of support for gay equality."
Reinhart-Rogoff: spreadsheet errors found by a graduate student (Herndon et al. 2013).
Deworming debate (2015): Miguel and Kremer (2004) versus Aiken, Davey et al. (2015); a big debate within development economics and epidemiology.
Reproducibility Project: Psychology (2015): roughly 40% of studies successfully reproduced, though the result itself has been hotly debated. What does it mean for a replication to "fail"?
Begley et al. (2012): attempt to reproduce "landmark" pre-clinical cancer lab studies at Amgen; only 6 out of 53 studies were reproduced.

13 OVERVIEW OF UNRELIABLE RESEARCH: Lack of Transparency
Sharing data, code, and surveys (along with clear documentation) can allow others to check one's work. Yet relatively few researchers share their data: Alsheikh-Ali et al. (2011) reviewed the first 10 research papers of 2009 published in each of the top 50 journals by impact factor. Of the 500 papers, 351 were subject to a data availability policy of some kind; 59% did not adhere to the policy, most commonly by not publicly depositing the data (73%). Overall, only 47 papers (9%) deposited the full primary raw data online.
And relatively few journals have data-sharing policies: in 2013, only 18 out of 120 political science journals inspected had data-sharing policies (Gherghina and Katsanidou 2013), and another review found that only 29 out of 141 economics journals reviewed had such policies (Vlaeminck 2013).
Yet having a mandatory policy makes it much more likely that researchers will share data: the rate of data-archiving compliance for surveyed journals with the strictest policies, which required data archiving along with data accessibility statements in manuscripts, was nearly 1,000 times higher than for journals with no policy (Vines 2014).

14 SOLUTIONS TO INCREASE THE RELIABILITY OF RESEARCH
What are the solutions? "The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific field, the less likely the research findings are to be true." (Ioannidis, PLoS Medicine, 2005)

15 SOLUTIONS TO INCREASE THE RELIABILITY OF RESEARCH
Research transparency!
Study registration and pre-analysis plans
"Results-neutral publishing"
Replications
Publishing / sharing all study results
Data sharing: sharing data, code, surveys, readme files

16 We recommend registering prior to beginning the intervention.
SOLUTIONS TO INCREASE THE RELIABILITY OF RESEARCH (REGISTRATION)
Study registration: creating a public record of a study and basic information about it (e.g., intervention, outcomes, location, dates). All IPA studies are required, and all J-PAL studies strongly encouraged, to pre-register on the AEA registry. We recommend registering prior to beginning the intervention.
The AEA Registry: before going into the background and motivation behind the registry, what is it? It is a searchable database of all completed, ongoing, and planned RCTs. All new and ongoing J-PAL and IPA projects will register in the system. The registry was born at the January 2012 American Economic Association Executive Committee meeting, and over the past year the AEA has worked with J-PAL to complete the design and implementation of the registry. So the AEA makes the decisions, and J-PAL implements them.

17 SOLUTIONS TO INCREASE THE RELIABILITY OF RESEARCH (REGISTRATION)
Other registries

18 Combats publication bias by providing a public record of the study.
SOLUTIONS TO INCREASE THE RELIABILITY OF RESEARCH (REGISTRATION)
Why registration? It combats publication bias by providing a public record of the study. Pre-registration is a long-standing requirement in the medical community (e.g., for FDA approval and for publication in medical journals) and is increasingly a focus in the social sciences as well. ClinicalTrials.gov has 185K+ studies registered.

19 Structure of a Pre-Analysis plan:
SOLUTIONS TO INCREASE THE RELIABILITY OF RESEARCH (PAPs)
Pre-analysis plans are more detailed write-ups of the study hypotheses, outcomes, and planned analysis (an extensive listing of the econometric specifications to be estimated). The goal is to combat data mining by tying the hands of the researcher.
Structure of a pre-analysis plan: take a look at one of the first pre-analysis plans in economics, by Casey, Glennerster, and Miguel ("Reshaping Institutions: Evidence on Aid Impacts Using a Preanalysis Plan").

20 Share all results (including null results!)
SOLUTIONS TO INCREASE THE RELIABILITY OF RESEARCH (Results sharing)
Share all results (including null results!). Publication bias leads to a skew towards positive (exciting) results. Problem: how do we disseminate null results if journals tend to accept positive results at greater rates? One solution: report all results in a public registry. For example, the FDA in the US requires that all results be reported on ClinicalTrials.gov within one year of trial completion. For social science studies, the results could be reported in the place where the study is registered (e.g. the AEA registry). AllTrials is a group that campaigns for all clinical trials to report their results.

21 Share all results (including null results!)
SOLUTIONS TO INCREASE THE RELIABILITY OF RESEARCH (Results sharing)
"Results-neutral publishing": submitting a protocol with details on the methodology of the study and the planned analysis before carrying out the study. The article is then accepted based on the design, before any data are collected. "Because the study is accepted in advance, the incentives for authors change from producing the most beautiful story to producing the most accurate one." – Chris Chambers, editor of Cortex (OSF citation)

22 Making data, code, and other materials publicly available.
SOLUTIONS TO INCREASE THE RELIABILITY OF RESEARCH (Data sharing)
Why make data, code, and other materials publicly available?
It facilitates replication of published results
It allows re-use of data for further studies and meta-analysis
It promotes better-quality data, code, and metadata
The data-sharing movement: open-science groups advocating for open data include the Berkeley Initiative for Transparency in the Social Sciences, the Center for Open Science, the meta-research institute at Stanford (Ioannidis), and YODA at Yale (a clinical-trials data-sharing portal).

23 Minimally: “publication” dataset – code/data underlying publication.
SOLUTIONS TO INCREASE THE RELIABILITY OF RESEARCH (Data sharing)
What to share?
1. Datasets. Recommended: the cleaned study dataset (Personally Identifiable Information removed!). Minimally: the "publication" dataset – the code and data underlying the publication. Ideal: start-to-finish reproducibility (more on this in a minute!).
2. Readme files explaining the relation between data and code, as well as any further data documentation.
3. Surveys.
4. Study-level metadata.

24 How to build a reproducible workflow?

25 When you start a project
Reproducible Workflow: Start of the Project – Data Collection – Data Cleaning/Analysis – Sharing Data
When you start a project:
Key point: have an eye towards sharing materials at all stages of the project – is what you are doing understandable and usable for outsiders? If it is understandable to outsiders, then it is easier for you to understand too!
Set up a logical folder structure before you start creating files in it. (What about inheriting an existing structure?)
Remember data security – ensuring PII is kept separate and secure is even more critical when sharing.
Source: BITSS Catalyst Resource: Erica Chuang

26 Reproducible Workflow
Reproducible Workflow
The ideal: sharing "start to finish" data and code to permit replication from the initial raw dataset to the final tables. This includes all code (variable construction and cleaning as well).
Key point: sharing is not all or nothing – sharing some data and code, e.g. that underlying the published results, is better than sharing nothing!
Preparing data and code early on is crucial, since there are serious limitations on what can be shared and understood later on if good practices are not followed.
Source: BITSS Catalyst Resource: Erica Chuang

27 Example of Folder Structure
Reproducible Workflow (Start of the Project)
Example of a folder structure (shown as a figure on the original slide)
Source: BITSS Catalyst Resource: Erica Chuang
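The figure itself is not reproduced in this transcript. As a rough sketch only – the folder names below are hypothetical, not taken from the original slide – a layout in this spirit could be created directly from Stata:

    * Hypothetical project skeleton (illustrative names, not the slide's actual figure)
    mkdir "MyProject"
    mkdir "MyProject/Data"
    mkdir "MyProject/Data/Raw"          // untouched raw data; PII kept separate and encrypted
    mkdir "MyProject/Data/Clean"        // cleaned, de-identified datasets
    mkdir "MyProject/DoFiles"           // all do-files, run from a master do-file
    mkdir "MyProject/Output"
    mkdir "MyProject/Output/Tables"     // exported tables
    mkdir "MyProject/Output/Figures"    // exported figures
    mkdir "MyProject/Documentation"     // project log, living readme, questionnaires

The point is simply to fix a structure like this before any files are created, and to document it in the project log.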

28 Reproducible Workflow (Start of the project)
Project Log or Living Readme file
A living readme should contain:
Project name (plus publication name, if at that point in the project lifecycle)
Author name
Stata version, user-written commands and their versions
All of the most important files (surveys, code, data, etc.), where they are, and what information they contain
A project log can contain (in addition to the above): basic metadata about the study, changes made to the metadata and why, etc.
For your own sanity, keep this up to date!
Source: BITSS Catalyst Resource: Erica Chuang
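A minimal sketch of such a living readme – every name, path, and version below is a placeholder, not taken from the original deck – might look like:

    PROJECT:  MyProject (working paper title to be added)
    AUTHOR:   Jane Doe
    SOFTWARE: Stata 14; user-written commands: estout, outreg2
    KEY FILES:
      Documentation/questionnaire_baseline.pdf   baseline survey instrument
      DoFiles/master.do                          runs cleaning and analysis end to end
      Data/Clean/analysis.dta                    de-identified analysis dataset
    PROJECT LOG:
      2016-09-01  renamed outcome variables to match questionnaire numbering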

29 Reproducible Workflow (Start)
Decide on naming and labeling conventions for your project in advance:
Folder names
File names
Variable names and labels
Document these conventions somewhere (Project Log, Living Readme, etc.). Build the habit early!
Source: BITSS Catalyst Resource: Erica Chuang

30 Reproducible Workflow (Start)
Keep data collection instruments (e.g. questionnaires) clearly marked and annotated: where and when were they used?
Keep key project metadata up to date in a central location such as the Project Log – e.g. survey location, sample size, intervention details, any changes made and why.
Source: BITSS Catalyst Resource: Erica Chuang

31 Reproducible Workflow (Data Cleaning/ Analysis)
All variable names and labels should be created in a consistent, logical manner, e.g. name: boys10 with label "[q65] Has boys under 10", or name: q65_boys10 with label "Has boys under 10" – pick one convention and stick to it.
It is critical to add labels when you create variables during cleaning/analysis.
Remember to create and assign clear value labels as well, e.g. for the variable male: decide whether it is coded 0 = "female", 1 = "male" or 1 = "male", 2 = "female", and label it clearly.
Source: BITSS Catalyst Resource: Erica Chuang
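A small, hedged illustration in Stata (the variable and label names follow the slide's hypothetical example, not any actual project code):

    * Assumes the survey dataset is already in memory
    * Rename a variable to match the chosen convention and label it
    rename boys10 q65_boys10
    label variable q65_boys10 "Has boys under 10"

    * Create and assign a clear value label for a binary variable
    label define male_lbl 0 "female" 1 "male"
    label values male male_lbl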

32 Reproducible Workflow (Data cleaning/ Analysis)
Create a master do-file: a file that runs ALL the code in your project.
You might want to have master do-files for specific stages of the project as well, e.g. master_clean, master_analysis, etc.
In addition, a master do-file can be useful for:
Setting any globals that might be used across do-files
Installing user-written commands
Source: BITSS Catalyst Resource: Erica Chuang
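A minimal sketch of what such a master do-file could look like – all folder and file names are placeholders, not from the deck:

    * master.do - runs the full project from raw data to final tables
    version 14
    clear all
    set more off

    * Globals used across all do-files
    global project "C:/Users/username/MyProject"
    global data    "${project}/Data"
    global code    "${project}/DoFiles"

    * Install user-written commands the project relies on
    capture ssc install estout

    * Run each stage in order
    do "${code}/01_clean.do"
    do "${code}/02_construct.do"
    do "${code}/03_analysis.do"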

33 Reproducible Workflow (Data Analysis)
Use relative references when referring to any file path in your files. You can:
use the cd or fastcd commands to set the working directory, e.g.:
    cd "C:/Users/echuang/My Project Folder/Datasets"
    use "data.dta"
or use globals to store the file path and then call it throughout, e.g.:
    global data "C:/Users/echuang/My Project Folder/Datasets"
    use "${data}/data.dta"
Source: BITSS Catalyst Resource: Erica Chuang
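One practical payoff (a sketch assuming the global is defined once in a master do-file, as in the hypothetical example above): someone reproducing the analysis on another machine only needs to edit a single line.

    * In master.do - the only line a collaborator has to change (path is hypothetical)
    global data "D:/Replication/My Project Folder/Datasets"

    * All other do-files keep working unchanged
    use "${data}/data.dta", clear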

34 Reproducible Workflow (Data Analysis)
Source: BITSS Catalyst Resource: Erica Chuang

35 Reproducible Workflow (Data Analysis)
Source: BITSS Catalyst Resource: Erica Chuang

36 Data Cleaning/Analysis
Review – consider future usability at every stage!
Start of the Project: logical folder structure; PII encryption; Project Log / Readme
Data Collection: questionnaires clearly marked; keep project metadata updated
Data Cleaning/Analysis: naming and labeling variables logically; master do-file; relative references; comments! version control
Sharing Data: readme file; codebooks; user-written commands; final check for PII
Source: BITSS Catalyst Resource: Erica Chuang

37 Reproducible Workflow
Helpful Resources
Best Practices for Data and Code Management / Project TIER (Haverford College) – workflow: what are the essential elements of creating reproducible research?
GitHub User Guide – why use GitHub, and a walkthrough of how to use it
Scott Long's book, The Workflow of Data Analysis Using Stata

38 Overview of current issues and initiatives in research transparency (1)
Tools + Resources
New BITSS.org website that connects you to our network, research, and resources (Haverford TIER Protocol)
Library of slideshows and lectures
Manual of Best Practices in Research Transparency; Workflow of Data Analysis in Stata and R

39 Overview of current issues and initiatives in research transparency (2)
Education
BITSS Summer Institute, with participants annually from across the social sciences (fully paid, in Berkeley!)
Semester-long course on transparency available online
Over 150 participants in international workshops
Source: BITSS Catalyst Resources

40 Overview of current issues and initiatives in research transparency (3)
Research
Social Science Meta-Analysis and Research Transparency (SSMART) grant program, $450,000 in funding; 10 funded projects in 2015, 7-8 pending in 2016
Growing internal capacity, publications by BITSS research staff (e.g., PLoS One, SPARC, OpenCon 2016 – call for applications pending, closes 11 July 2016)
Source: BITSS Catalyst Resources.
Our activities focused around recognition are not just about rewarding exceptional achievements in the practice and advancement of transparent social science, but also about drawing attention to them so that we incentivize others outside our network to take on these changing norms and practices. Of course, we are really excited about the Leamer-Rosenthal Prizes, and have both Ed Leamer and Bob Rosenthal joining us for today's award ceremony for the 10 recipients. We have two categories of prizes: the Emerging Researchers Prize, which awards early-career researchers – junior faculty, postdoctoral researchers, or graduate students – who adopt transparent research practices or pioneer new methods to increase the rigor of research; and the Leaders in Education Prize, which awards the work of professors who incorporate instruction in transparent social science research practices into their curricula. We had over 50 nominations in this inaugural round, and are excited to present the top 10 nominees today. This prize is really about acknowledging and rewarding the great work being done across the disciplines to change norms and practices around transparency. And we should know, because we are working in a growing ecosystem of open social science.

41 Overview of current issues and initiatives in research transparency (4)
Recognition
2016: ten recipients of the inaugural Leamer-Rosenthal Prizes for Open Social Science
Cash awards of $10-15K to individuals who are providing leadership, through their research and/or practice, on research transparency and reproducibility
Call for Nominations just announced
Source: BITSS Catalyst Resources

42 Overview of current issues and initiatives in research transparency (5)
Growing Ecosystem Source: BITSS Catalyst Resources

43 QUESTIONS / SUGGESTIONS – THANK YOU FOR YOUR KIND ATTENTION

44 DISCUSSION
Following the presentation of the different features of transparent research:
What, in your view, are the main gaps and challenges preventing researchers from being transparent in their research?
What are the most and least promising ways to improve the reliability of research (particularly in Europe and the UK)?

