Software Sustainability Institute
There’s No Such Thing As Irreproducible Research
27th January 2016, Digging Into Data, Glasgow
Neil Chue Hong, Software Sustainability Institute
ORCID: | Slides licensed under CC-BY where indicated: Supported by Project funding from
Or… “A personal journey as a data explorer” /m9.figshare
“Virtual data warehouse”
Public Health Data
- Name
- Address
- Visits
- Symptoms
- Treatments
Mapping courtesy of Robin T Wilson
This taught me three things…
1. The power of data is when it’s brought together
2. Software can help solve difficult data integration problems
3. No-one can spell diarhea diereah dyereeah diarrheah dioreah diarrhoea!
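The second and third lessons can be illustrated together. The slides do not say how the misspelt symptom names were reconciled, so the following is only a minimal sketch of one possible approach: fuzzy matching against a controlled vocabulary using Python's standard `difflib` module. The vocabulary, the function name `normalise_symptom` and the cutoff value are all assumptions made for illustration.

```python
import difflib

# Hypothetical controlled vocabulary; a real system would load its own term list.
CANONICAL_SYMPTOMS = ["diarrhoea", "fever", "vomiting"]


def normalise_symptom(raw, vocabulary=CANONICAL_SYMPTOMS, cutoff=0.6):
    """Return the closest canonical term, or None if nothing is similar enough.

    The cutoff of 0.6 is difflib's default similarity threshold, not a tuned value.
    """
    word = raw.strip().lower()
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    # Misspellings taken from the slide above.
    for spelling in ["diarhea", "diereah", "dyereeah", "diarrheah", "dioreah"]:
        print(spelling, "->", normalise_symptom(spelling))
```

Spellings that fall below the cutoff come back as `None`, so they can be queued for manual review rather than silently mapped to the wrong term.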
Positive selection in large genomic datasets: Selection at pleiotropic loci underlies disease co-occurrence in human populations. Navarro, Haley, Karosas et al. Submitted to Nature Genetics.
Hapbin: fast haplotype based scans. hapbin: An Efficient Program for Performing Haplotype-Based Scans for Positive Selection in Large Genomic Datasets. DOI: /molbev/msv
Open Source lets others benefit. Slide courtesy of Nancy Wilkins-Diehr. BEAST software licensed under LGPL.
Errors due to bioinformatics pipeline
Raise standards for preclinical cancer research: 47 out of 53 “landmark” publications could not be replicated. Begley, Ellis. Nature, 483, 2012. doi: /483531a
Victoria Stodden, AMP Special Issue Reproducible Research, Computing in Science and Engineering, July/August 2012, 14(4). Howison and Herbsleb (2013) "Incentives and Integration In Scientific Software Production", CSCW.
Errors due to bioinformatics pipeline: The results presented in the Report “Ancient Ethiopian genome reveals extensive Eurasian admixture throughout the African continent” were affected by a bioinformatics error. Llorente et al. Science, 350, 6262. doi: /science.aad
Nullius in verba: “Take nobody’s word for it”
There’s no such thing as irreproducible research. There’s reproducible research and there’s ignorance. It’s not research if it’s not transparent.
^ and Software
Vandewalle (2012) DOI: /MCSE
Without data it’s difficult to validate results. But without code, we waste the opportunity to advance science.
Acknowledgements
The SSI team:
- Aleksandra Pawlik
- Carole Goble
- Claire Wyatt
- Clem Hadfield
- Dave De Roure
- Devasena Prasad
- Giacomo Peru
- Graeme Smith
- Iain Emsley
- John Robinson
- Les Carr
- Mario Antonioletti
- Mark Parsons
- Mike Jackson
- Olivier Philippe
- Shoaib Sufi
- Simon Hettrick
- Stephen Crouch
The SSI Fellows and collaborators, especially:
- James Baker
- James Hetherington
- Martin Hammitzsch
- Robin Wilson (for contributing examples)
EPCC Industry Projects:
- Mark Sawyer
- Maureen Wilkinson
- Paul Graham
- Rob Baxter
- Terry Sloan
Mouse Atlas:
- James Sharpe
- Richard Baldock
Epigenetic analysis:
- James Prendergast
- Colin Maclean
Scientific software:
- Dan Katz
- Heather Piwowar
- James Howison
- Jeff Carver
- Jennifer Schopf
- Kaitlin Thaney
- Martin Fenner
- Victoria Stodden
Software/Data Carpentry:
- Greg Wilson
- Jonah Duckles
- Katy Huff
- Tracy Teal
Slides at: