Merging and sharing Metabolomics analysis tools with Galaxy: transparent, reproducible, open 'omics. Robert L Davidson, Merlion Metabolomics Workshop (#MMW2014).
Researcher bias: positive result bias (20 teams do studies, 1 publishes p<0.05); poorly explained analyses. DOI: /journal.pmed
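The "20 teams, 1 publishes" point can be made concrete with a little arithmetic. The following is a minimal sketch, assuming the 20 studies are independent and that there is no real effect to find (both are assumptions, not statements from the slide); the function name is hypothetical.

```python
def prob_at_least_one_false_positive(n_studies: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n independent null studies reaches p < alpha."""
    # Each null study misses significance with probability (1 - alpha).
    return 1.0 - (1.0 - alpha) ** n_studies


if __name__ == "__main__":
    p = prob_at_least_one_false_positive(20)
    print(f"{p:.2f}")  # prints 0.64
```

Under these assumptions, roughly two times in three at least one of the 20 teams would obtain a "significant" result by chance alone, which is why publishing only the positive study is misleading.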
85% of research resources are wasted! We must... favor... unbiased, transparent, collaborative research with greater standardization. Share data, protocols, materials, software, other tools. DOI: /journal.pmed
Data sharing: supported by government policy (e.g. UK and NIH); MetaboLights repository; NIH Metabolomics Data Repository; ISA-Tab for metadata.
What about methods? “The good news is that I was able to find some code. I am just hoping that it is a stable working version of the code... I have lost some data... The bad news is that the code is not commented and/or clean. So, I cannot really guarantee that you will enjoy playing with it.” 613 papers tested; 123 successful reproductions (about 20%).
Problem: there is a reproducibility crisis; published results are untrustworthy; research is a waste of government money (85%). What's the solution? Share data AND methods.
Galaxy: over 36,000 main Galaxy server users; over 1,000 papers citing Galaxy use; over 55 Galaxy servers deployed; open source.
Galaxy – Toolshed: many 'omics, stats and visualisation tools; metabolomics can plug into these tools! Download; run instantly.
Any tool in Galaxy (e.g. python myfunction input1): basic XML 'wrapper'; describes inputs and outputs; calls command; monitors for output; logs/returns to 'history'.
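To illustrate what such a wrapper might contain, here is a minimal sketch that builds one with Python's standard library. The element names (tool, command, inputs, param, outputs, data, help) follow the Galaxy tool XML format; the tool id, version, label, the "tabular" format choice, the output argument and the helper name build_wrapper are assumptions made for illustration and are not taken from the presentation.

```python
import xml.etree.ElementTree as ET


def build_wrapper(tool_id: str, name: str, command: str) -> ET.Element:
    """Build a minimal Galaxy-style tool wrapper as an XML element tree."""
    # Hypothetical id, name and version; a real tool would choose its own.
    tool = ET.Element("tool", id=tool_id, name=name, version="0.1.0")
    # The command Galaxy calls; $input1 and $output1 are replaced with
    # dataset file paths when the job runs.
    ET.SubElement(tool, "command").text = command
    # Describe the inputs the user selects in the web form.
    inputs = ET.SubElement(tool, "inputs")
    ET.SubElement(inputs, "param", name="input1", type="data",
                  format="tabular", label="Input data")
    # Describe the output that is returned to the user's history.
    outputs = ET.SubElement(tool, "outputs")
    ET.SubElement(outputs, "data", name="output1", format="tabular")
    ET.SubElement(tool, "help").text = "Hypothetical example tool."
    return tool


if __name__ == "__main__":
    wrapper = build_wrapper("myfunction", "My function",
                            "python myfunction $input1 $output1")
    print(ET.tostring(wrapper, encoding="unicode"))
```

Writing the printed XML to a file alongside the script is, in outline, all that is needed to describe a command-line tool to Galaxy; the script itself is unchanged.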
Galaxy: Tool List; Tool Parameters; History/results
Birmingham metabolomics workflow: SIM-Stitch (DOI: /j.jasms); XCMS (DOI: /ac051437y); MI-Pack (DOI: /j.chemolab); KNN Impute (DOI: /s); PQ-Normalisation (DOI: /ac051632c); G-Log transform (DOI: /); PCA (with statistical test of scores).
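Two of the steps listed above, PQ-Normalisation and the G-Log transform, are short enough to sketch. This is not the Birmingham implementation: it is a minimal pure-Python illustration, assuming the textbook form of probabilistic quotient normalisation (median reference profile, median quotient per sample, no prior total-area normalisation) and one common form of the generalised log, log(y + sqrt(y^2 + lambda)). The function names and the lambda value are hypothetical; in practice lambda is usually estimated from the data.

```python
import math
from statistics import median


def pqn_normalise(samples):
    """Probabilistic quotient normalisation of equal-length intensity lists."""
    n_features = len(samples[0])
    # Reference profile: median intensity of each feature across all samples.
    reference = [median(s[j] for s in samples) for j in range(n_features)]
    normalised = []
    for s in samples:
        # Quotient of each feature against the reference (skip zero references).
        quotients = [s[j] / reference[j]
                     for j in range(n_features) if reference[j] != 0]
        # The median quotient is taken as the sample's dilution factor.
        factor = median(quotients)
        normalised.append([v / factor for v in s])
    return normalised


def glog(value, lam):
    """Generalised log transform: log(y + sqrt(y^2 + lambda))."""
    return math.log(value + math.sqrt(value * value + lam))


if __name__ == "__main__":
    # Two hypothetical samples; the second is a 2x "dilution" of the first.
    data = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
    for row in pqn_normalise(data):
        print([round(glog(v, lam=1.0), 3) for v in row])
```

In this toy input the two samples differ only by a constant factor, so after normalisation they become identical, which is the behaviour the method is designed to produce.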
Birmingham metabolomics workflow: many tools; many languages; complex to learn; many parameters; complex to report.
Metabolomics workflow in Galaxy: user sees website (intuitive); centrally stored (secure); workflow is recorded; methods shareable.
View, share, edit, rerun workflow
Citable workflow: add as supplemental files or publish with distinct DOI via GigaDB or FigShare.
Where to get our workflow (coming soon!): Galaxy Toolshed; GitHub; submitted to GigaScience (gigasciencejournal.com); VM/code/test data to be available on GigaDB.org; test server to be available at GigaGalaxy.
Summary: share your data; share your software; share your workflow – in full. Galaxy is not a new 'software', it's a flexible sharing platform. Add your tools to ours, in Galaxy Toolshed. Help make metabolomics trustworthy, meaningful, reproducible.
Acknowledgements: University of Birmingham (Ralf Weber, Mark Viant); GigaScience (Pete Li); funding: NERC NE/K011294/1.
Me: Rob L. Davidson. This presentation: