Published by Colin Welch. Modified over 9 years ago.
POSTER, Open Access
SiZhe Xiao, GigaScience 2013

GigaDB – revolutionizing data dissemination, organization and use

Xiao Si Zhe¹, Chris Hunter, Tam P. Sneddon, Scott C. Edmunds, Alexandra T. Basford, Peter Li, and Laurie Goodman.

Abstract

GigaScience, the online open-access open-data journal, has recently developed GigaDB, a new integrated database of ‘big-data’ studies from the life and biomedical sciences. The initial goals of GigaDB are to assign DOIs to datasets so that they can be tracked and cited, and to provide a user-friendly web interface giving easy access to selected GigaDB datasets and files.

We will be working with authors to make the raw data, computational tools and data processing pipelines described in GigaScience papers available and, where possible, executable on an informatics platform. We hope that by making both the data and the processes involved in their analysis freely accessible, this novel form of publication will help articles published in GigaScience to have a much higher impact in the scientific literature, and maximize their reuse within the community.

GigaDB currently accepts submissions in Excel format. Example submission and template files can be found on the website (http://gigadb.org/). To date, GigaDB comprises over 56 datasets and includes Genomic, Transcriptomic, Epigenomic and Metagenomic dataset types, but we accept many other dataset types, including proteomic and neuroimaging studies.

Future goals include integration with the BGI Cloud and with the Galaxy software tools, to enable users to upload files directly to Galaxy for further analysis. We are also working with ISA-Tab and other scientific standards groups to support and extend the usability and interoperability model.

Keywords: DOI, Galaxy, big-data, database, informatics platform, GigaScience

doi:10.6084/m9.figshare.786486

Cite this poster as: GigaDB – revolutionizing data dissemination, organization and use. Xiao Si Zhe, Chris Hunter, Tam P. Sneddon, Scott C. Edmunds, Alexandra T. Basford, Peter Li, and Laurie Goodman. http://dx.doi.org/10.6084/m9.figshare.786486

© 2013 Edmunds et al. This is an Open Access poster distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Correspondence: jesse@gigasciencejournal.com

1. BGI HK Research Institute, 16 Dai Fu Street, Tai Po Industrial Estate, Hong Kong SAR, China.
2. BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen, China.
3. School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China.
4. CUHK-BGI Innovation Institute of Trans-omics, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China.
5. HKU-BGI Bioinformatics Algorithms and Core Technology Research Laboratory & Department of Computer Science, University of Hong Kong, Pok Fu Lam, Hong Kong.
6. Oxford e-Research Centre, University of Oxford, Oxford, UK.
Acknowledgements

Thanks to: Laurie Goodman, Chris Hunter, Scott Edmunds, Tam Sneddon (GigaScience), Shaoguang Liang (BGI-SZ), Qiong Luo, Senghong Wang, Yan Zhou (HKUST), Rob Davidson and Mark Viant (Birmingham Uni), Marco Galardini (Unifi)
Financial support from: [sponsor logos]

Linking papers to data and analyses

[Diagram labels: Data sets; Analyses; Linked to DOI; Open-Paper; DOI:10.5524/100044; Open-Pipelines; Open-Workflows; DOI:10.5524/100038; Open-Data; 78GB CC0 data]

Background

Growing replication gap:
10/18 microarray papers cannot be reproduced
Ioannidis: “Most Published Research Findings Are False”
>15X increase in retracted papers in last decade

Lack of incentives to make data/methods available
Poor metadata quality and lack of interoperability

GigaSolution: deconstructing the paper

Combine and integrate (via citable DOIs):
Open-access journal: www.gigasciencejournal.com
Data Publishing Platform: gigadb.org
Data Analysis Platform: galaxy.cbiit.cuhk.edu.hk

Submit your next manuscript containing large-scale data and workflows to GigaScience and take full advantage of:
No space constraints, and unlimited data and workflow hosting in GigaDB and GigaGalaxy
Article processing charges for all submissions in 2013 covered by BGI
Open access, open data and highly visible work freely available for distribution
Inclusion in PubMed and Google Scholar

GigaDB Home page: www.gigadb.org

Aspera data transfer: faster download speeds

GigaDB Submission Workflow

Excel submission file: Submitter logs in to GigaDB website and uploads Excel submission.
Validation checks: Fail, submitter is provided with an error report. Pass, dataset is uploaded to GigaDB.
Curator Review: GigaDB DOI assigned. Curator contacts submitter with DOI citation and to arrange file transfer (and resolve any other questions/issues).
Files: Submitter provides files by ftp or Aspera.
DataCite XML file: XML is generated and registered with DataCite.
Curator makes dataset public (can be set as future date if required).
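The validation step of the submission workflow (fail: the submitter gets an error report; pass: the dataset is uploaded) can be sketched roughly as below. This is only an illustration: the field names in REQUIRED_FIELDS, the email check and the function names are assumptions made for the example, not the actual GigaDB template or validation rules.

```python
from dataclasses import dataclass, field

# Hypothetical required columns; the real GigaDB Excel template defines its own.
REQUIRED_FIELDS = ("title", "description", "dataset_type", "submitter_email")


@dataclass
class ValidationResult:
    passed: bool
    errors: list = field(default_factory=list)


def validate_submission(row: dict) -> ValidationResult:
    """Check one submission record and collect every problem found."""
    errors = []
    for name in REQUIRED_FIELDS:
        value = row.get(name)
        if value is None or not str(value).strip():
            errors.append(f"missing required field: {name}")
    # Very loose sanity check, assumed for illustration only.
    email = str(row.get("submitter_email") or "").strip()
    if email and "@" not in email:
        errors.append("submitter_email does not look like an email address")
    return ValidationResult(passed=not errors, errors=errors)


def process(row: dict) -> str:
    """Mirror the two workflow outcomes: error report or upload."""
    result = validate_submission(row)
    if result.passed:
        return "PASS: dataset is uploaded to GigaDB"
    return "FAIL: " + "; ".join(result.errors)
```

Collecting all errors before reporting, rather than stopping at the first one, lets a submitter fix the whole spreadsheet in a single round trip.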
Datasets public in GigaDB

Public GigaDB dataset: Genomic data from the crab-eating macaque/cynomolgus monkey (Macaca fascicularis) (2011)
DOI 10.5524/100003
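In the submission workflow, XML is generated for each dataset and registered with DataCite. A minimal sketch of what generating such a record might look like for the macaque dataset is shown here. It assumes the DataCite kernel-4 namespace and includes only a handful of core properties; the creator name and publisher string are placeholders invented for the example, and build_datacite_xml is a hypothetical helper, not GigaDB code.

```python
import xml.etree.ElementTree as ET

# Assumed schema version; the poster does not say which one GigaDB uses.
DATACITE_NS = "http://datacite.org/schema/kernel-4"


def build_datacite_xml(doi, title, creators, publisher, year):
    """Return a small DataCite-style XML record as a string."""
    ET.register_namespace("", DATACITE_NS)  # serialize with a default xmlns

    def tag(name):
        return f"{{{DATACITE_NS}}}{name}"

    resource = ET.Element(tag("resource"))
    identifier = ET.SubElement(resource, tag("identifier"), identifierType="DOI")
    identifier.text = doi

    creators_el = ET.SubElement(resource, tag("creators"))
    for name in creators:
        creator = ET.SubElement(creators_el, tag("creator"))
        ET.SubElement(creator, tag("creatorName")).text = name

    titles = ET.SubElement(resource, tag("titles"))
    ET.SubElement(titles, tag("title")).text = title
    ET.SubElement(resource, tag("publisher")).text = publisher
    ET.SubElement(resource, tag("publicationYear")).text = str(year)
    resource_type = ET.SubElement(
        resource, tag("resourceType"), resourceTypeGeneral="Dataset"
    )
    resource_type.text = "Dataset"
    return ET.tostring(resource, encoding="unicode")


record = build_datacite_xml(
    doi="10.5524/100003",
    title="Genomic data from the crab-eating macaque/cynomolgus monkey "
          "(Macaca fascicularis)",
    creators=["Placeholder Author"],   # placeholder, not the real creator list
    publisher="GigaScience Database",  # assumed publisher string
    year=2011,
)
```

The record is built with the standard library only, so the sketch stays self-contained; a production system would also validate the output against the schema before registering it.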