Rewarding Reproducibility and Method Publishing the GigaScience Way Scott Edmunds

Presentation transcript:

The Issue: a growing reproducibility gap
The data-driven science era brings:
Huge opportunities
Huge challenges with data curation, review/QA, handling and sharing

GigaSolution: deconstructing the paper
Take the data publication approach further and reward:
Data availability
Metadata/curation
Interoperability
Availability of workflows
Transparent analyses
Components of the paper: Data, Metadata, Methods, Analyses

GigaSolution: deconstructing the paper
Utilizes big-data infrastructure and expertise from the world's largest genomics organisation, with 17 PB storage, 20.5K cores, 212 TFlops and >1000 bioinformaticians
Combines and integrates:
Open-access journal
Data publishing platform
Data analysis platform

How are we supporting data reproducibility?
Data sets and analyses linked to DOIs
Open-Paper: DOI: / X-1-18, >6500 accesses
Open-Review: 8 reviewers tested data in ftp server & named reports published
Open-Data: DOI: /, 78 GB CC0 data
Open-Code: code in SourceForge under GPLv3, >4000 downloads; enabled code to be picked apart by bloggers in wiki
Open-Pipelines, Open-Workflows: DOI: /

SOAPdenovo2 workflows implemented in galaxy.cbiit.cuhk.edu.hk
Implemented entire workflow in our Galaxy server, including:
3 pre-processing steps
4 SOAPdenovo modules
1 post-processing step
Evaluation and visualization tools
Also available to download by >25K Galaxy users in

“Deconstructed” Journal, “Regular” Journal, “Conscientious” Online Journal

Ultimate Goal: Executable papers
Data Papers, Executable (Methods) Papers, Analysis Papers

What is needed to make it happen?
Give us your data & pipelines!*
Contact us:
* APCs currently generously covered by BGI

Thanks
Our collaborators: Ruibang Luo (BGI/HKU), Shaoguang Liang (BGI-SZ), Tin-Lap Lee (CUHK), Huayen Gao (CUHK), Qiong Luo (HKUST), Senghong Wang (HKUST), Yan Zhou (HKUST)
Team: Peter Li, Chris Hunter, Jesse Si Zhe, Nicole Nogoy, Tam Sneddon, Alexandra Basford, Laurie Goodman
CBIIT: galaxy.cbiit.cuhk.edu.hk
Funding from:
Follow us: facebook.com/GigaScience, blogs.openaccesscentral.com/blogs/gigablog/
Happy New Year!