Published by Eugene Neal. Modified over 8 years ago.
1
Software Sustainability Institute www.software.ac.uk There’s No Such Thing As Irreproducible Research (Software Credit Edition). http://dx.doi.org/10.6084/m9.figshare.3159295 6-7th April 2016, ATI Symposium on Reproducibility, Oxford. Neil Chue Hong (@npch), Software Sustainability Institute. ORCID: 0000-0002-8876-7606 | N.ChueHong@software.ac.uk Slides licensed under CC-BY where indicated.
2
The UK research community relies on software. Do you use research software? What would happen to your research without software? Survey of researchers from 15 Russell Group universities conducted by SSI between August and October 2014; 406 respondents covering a representative range of funders, disciplines and seniority. 56% develop their own software. 71% have no formal software training.
3
The cost of research that relies on software. £840m: investment in the 2013-2014 financial year, an amount that has risen by 3% on average over the last four years. 30%: share of total research investment spent on research which relies on software over the last four financial years. Analysis of data from 49,650 grant titles and abstracts published on Gateway to Research covering 2010-2014.
4
The UK research community relies on software. Survey of researchers from 15 Russell Group universities conducted by SSI between August and October 2014; 406 respondents covering a representative range of funders, disciplines and seniority. 56% of UK researchers develop their own research software. 71% of UK researchers have had no formal software development training. 140,000 UK researchers are relying on their own coding skills.
5
Raise standards for preclinical cancer research: 47 out of 53 “landmark” publications could not be replicated. Begley, Ellis. Nature, 483, 2012. doi:10.1038/483531a
6
Repeatability of published microarray gene expression analyses: 56% of analyses could not be repeated, of which 30% were because of software issues. 50% did not state the software version, 39% did not provide raw data. Only 11% could be reproduced satisfactorily. Ioannidis et al. Nature Genetics, 41, 2009. doi:10.1038/ng.295
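One way to avoid the "did not state software version" problem is to write the versions out alongside the results. The following is a minimal sketch, not something from the talk: the function name `describe_environment` and the package names passed to it are hypothetical, and it assumes a Python analysis using only the standard library to inspect the environment.

```python
import json
import platform
from importlib import metadata


def describe_environment(packages):
    """Return a dict recording the interpreter, OS, and package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages explicitly rather than silently skipping them.
            versions[name] = None
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Hypothetical package list; replace with whatever the analysis imports.
    report = describe_environment(["numpy", "pandas"])
    print(json.dumps(report, indent=2, sort_keys=True))
```

Saving this report next to each set of results gives a reader the version information that half of the surveyed analyses left out.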
7
Errors due to bioinformatics pipeline: the results presented in the Report “Ancient Ethiopian genome reveals extensive Eurasian admixture throughout the African continent” were affected by a bioinformatics error. Llorente et al. Science, 350, 6262. doi:10.1126/science.aad2879
8
Nullius in verba: “Take nobody’s word for it”
9
There’s no such thing as irreproducible research. There’s reproducible* research and there’s ignorance. It’s not research if it’s not transparent. * Caveat: “Reproducible” is a horribly overloaded term.
10
Sharing software is a key part of reproducibility: improves transparency; improves understanding; elimination of errors; encourages collaboration; easier on-ramping; improves trust. “Deep intellectual contributions now encoded only in software” - Stodden. “Scholarship is the full software environment, code and data, that produced the result” - Claerbout.
11
Repeatability in Computer Science: of 601 papers in ACM Computer Science journals and proceedings, only 85 provided a link to software. For 176 the software could not be obtained. Collberg, Proebsting, Warren, University of Arizona TR 14-04, 2015. http://reproducibility.cs.arizona.edu/v2/RepeatabilityTR.pdf
12
The Research Cycle: Create, Test, Interpret, Publish, Revise. Research outputs: Paper, Data, Software. Research is a continuous cycle. When we publish we are contributing to the body of knowledge.
13
Research/Reuse/Reward Cycle (diagram labels: Index, Identify, Cite, Reward; Create, Test, Interpret, Publish, Revise; Research, Reuse). Reuse is also a cycle. We build our research on the work of others. Reward mechanisms should encourage reuse.
15
“You must publish your software because it’s important for research”
16
Mainstream attention has focused on problems. Nature, doi:10.1038/467775a
17
Victoria Stodden, AMP 2011, http://www.stodden.net/AMP2011/
Special Issue: Reproducible Research, Computing in Science and Engineering, July/August 2012, 14(4).
Howison and Herbsleb (2013) “Incentives and Integration In Scientific Software Production”, CSCW 2013.
18
Research Culture Needs Changing. “This particular project was something I wrote a couple years ago to help me out with a workflow… I’d put it up on Github, so that others could potentially use it or use the code. So I went to see what people were saying about this project. It seemed like I’d done something fundamentally wrong, so stupid that it flabbergasts someone... So of course I start sobbing. Then I see these people’s follower count, and I sob harder. I can’t help but think of potential future employers that are no longer potential.” http://www.software.ac.uk/blog/2013-01-25-haters-gonna-hate-why-you-shouldnt-be-ashamed-releasing-your-code
19
http://arxiv.org/abs/1504.00062
20
Research Culture Needs Changing. Things I get credit for: publishing papers; getting grants; societal impact (maybe). Things I don’t get credit for: releasing my software; making my software easy to use; supporting software for others to use; investing effort in learning new tools; being helpful. IDEAS versus IMPLEMENTATION? Different forms of credit? Support?
21
Research Culture Needs Changing. Our research culture presents barriers but few incentives to sharing software. There is a fear of being “found out” for poor software, but no encouragement or resources to improve software engineering skills. There is no reward for publishing software in the current system of metrics. Researchers fear being “scooped” or losing the ability to publish. Many organisations do not understand how to support researchers publishing software.
22
Learning to be a carpenter. Software Carpentry (admin@software-carpentry.org): teach basic lab skills for scientific computing so that researchers can do more in less time and with less pain. Data Carpentry (admin@datacarpentry.org): teach basic concepts, skills and tools for working more effectively with data; workshops are designed for people with little to no prior computational experience. Open source learning that can be tailored to disciplines. “Train the trainers”: building a capable base of instructors.
23
Vandewalle (2012) DOI: 10.1109/MCSE.2012.63
24
Getting credit for software
Roles: Project Credit http://credit.casrai.org/ and Transitive Credit http://doi.org/10.5334/jors.be
Mechanisms: software papers http://bit.ly/softwarejournals and software citation, e.g. Software Citation Working Group https://www.force11.org/group/software-citation-working-group
Tools: researcher identifiers, e.g. ORCID http://orcid.org/ and alt-metrics, e.g. ImpactStory http://impactstory.org/
25
Research Software Engineers: join the RSE community at http://www.rse.ac.uk/
26
Reproducibility of data-intensive research relies on sharing of software. We have many tools and techniques to support this, but we need to change the way we think as a research community to make it common practice.
27
Open Source lets others benefit. Slide courtesy of Nancy Wilkins-Diehr. BEAST software licensed under LGPL.
28
Further reading!
Stodden V, Miguez S (2014) Best Practices for Computational Science: Software Infrastructure and Environments for Reproducible and Extensible Research. Journal of Open Research Software 2(1): e21, pp. 1-6. doi:10.5334/jors.ay
Osborne JM, Bernabeu MO, Bruna M, Calderhead B, Cooper J, Dalchau N, et al. (2014) Ten Simple Rules for Effective Computational Research. PLoS Comput Biol 10(3): e1003506. doi:10.1371/journal.pcbi.1003506
Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten Simple Rules for Reproducible Computational Research. PLoS Comput Biol 9(10): e1003285. doi:10.1371/journal.pcbi.1003285
Prlić A, Procter JB (2012) Ten Simple Rules for the Open Development of Scientific Software. PLoS Comput Biol 8(12): e1002802. doi:10.1371/journal.pcbi.1002802
Noble WS (2009) A Quick Guide to Organizing Computational Biology Projects. PLoS Comput Biol 5(7): e1000424. doi:10.1371/journal.pcbi.1000424
Jones C (2010) Software Engineering Best Practices: Lessons from Successful Projects in the Top Companies. McGraw-Hill. ISBN: 978-0071621618
29
The Software Sustainability Institute: a national facility for cultivating better, more sustainable, research software to enable world-class research. Software reaches boundaries in its development cycle that prevent improvement, growth and adoption. Providing the expertise and services needed to negotiate to the next stage. Developing the policy and tools to support the community developing and using research software. Supported by EPSRC Grant EP/H043160/1 + EPSRC/ESRC/BBSRC grant EP/N006410/1.
30
Training: delivering essential software skills to researchers via CDTs, institutions & doctoral schools.
Software: helping the community to develop software that meets the needs of reliable, reproducible, and reusable research.
Policy: collecting evidence on the community’s software use & sharing with stakeholders.
Community: bringing together the right people to understand and address topical issues.
Outreach: exploiting our platform to enable engagement, delivery & uptake.
31
Activities: Website & blog; Campaigns; Advice; Guides; Courses; Workshops; Fellowship; Research; Consultancy.
Areas: Software; Policy; Training; Community; Outreach.
Figures: 50+ projects; 130+ evaluations; 4 surgeries; 35+ UK SWC workshops; 1000+ learners; 80+ guides; 50,000 readers; 61 domain ambassadors; 20+ workshops organised; 740 researchers; 50,000 grants analysed; 150+ contributed articles; 20,000 unique visitors per month; 3,000 Twitter followers; 300+ RSEs engaged; 2100 signatures; 13 issues highlighted.
32
Find out more about the SSI
Community Engagement (Lead: Shoaib Sufi): Fellowship Programme; Events and Workshops
Consultancy (Lead: Steve Crouch): Open Call for Projects / Collaborations; Software Evaluation
Policy and Publicity (Lead: Simon Hettrick): Case Studies / Policy Campaigns; Software and Research Blog
Training (Lead: Aleksandra Pawlik): Software Carpentry (300+ students/year); Guides and Top Tips
Journal of Open Research Software (Editor: Neil Chue Hong)
Collaboration between universities of Edinburgh, Manchester, Oxford and Southampton. Supported by EPSRC Grant EP/H043160/1 + EPSRC/ESRC/BBSRC grant EP/N006410/1.
34
Community and contributions. Open source is more than just a license: it’s about making it easier for people to collaborate and reducing replication of effort. Encouraging and expanding at the right pace. Efficient tools for research, not “generic” software. Increased robustness, reuse. Being open from the start rather than waiting until it’s “ready”. Code is never perfect, always evolving.
35
Cultivate community. Basics: website, mailing list, code repository, issue resolution. Remove barriers to participation, increase efficiency.
1993: First public release; 2 devs
1995: Code open sourced; 3 devs
1996: r-testers list set up
1997: Lists split: r-announce, r-help, r-devel; public CVS; 11 devs
2000: CRAN split and mirror
2001: BioConductor
2003: Namespaces
2005: I18n, L10n
2007: R-Forge
Today: BioConductor (33 core devs), R-Forge (532 projects, 1562 devs), CRAN (1400+ packages)
http://cran.r-project.org/doc/html/interface98-paper/paper_2.html