Published by Hugh Quinn. Modified over 9 years ago.
1
BioMed Central’s open data initiatives
Alliance for Permanent Access conference, 7th November 2012
Iain Hrynaszkiewicz, Publisher (Open Science), BioMed Central
iain.hrynaszkiewicz@biomedcentral.com
@iainh_z
2
About BioMed Central
- Launched in 2000, largest global publisher of peer-reviewed open access journals (>240)
- >136,000 peer-reviewed open access articles published
- Part of Springer Science+Business Media since 2008
- Publishes using Creative Commons (CC-BY) licenses
- Non-journal products include the ISRCTN database
- Interested in innovation and recognises the growing need for data sharing and publication
http://blogs.biomedcentral.com/bmcblog/tag/Open-Data/
3
BioMed Central and open data
- Increasing transparency in scientific research and scholarly communication is at the core of strategy
- Data are an increasingly integral part of scholarly communication, with many opportunities for increasing the pace of knowledge discovery
- Publishers, particularly open access publishers, are well-placed to share information across domain boundaries
http://www.biomedcentral.com/about/access
“By ‘open data’ BioMed Central means that these data are freely available on the public internet permitting any user to download, copy, analyse, re-process, pass them to software or use them for any other purpose without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. BioMed Central encourages the use of fully open formats wherever possible.”
4
BioMed Central open data initiatives
- Data journals and article types
- Open Data Award
- Data hosting, citation, deposition and linking
- Lab notebook-journal integration (LabArchives)
- Data licensing
- Guidance and best practice, e.g. human subjects – confidentiality and consent
- Data formats and standards – efficient reuse
- Facilitation of data/text mining research
5
Problem: Lack of credit/recognition for data sharing and publication
- In science credit is everything, but incentives for data publication are still emerging
- Datasets are not generally as discoverable and citable as journal articles – yet
- Requirements for data sharing are field/location-specific
- Need more empirical evidence of the benefits of data publication for individual scientists
6
Solution #1: Journals and article types enabling data publication
- Data notes: “[B]riefly describe a biomedical data set or database, with the data being readily accessible and attributed to a source” http://bit.ly/y3Jb3b
- Data notes: “[E]xceptional datasets deposited in our GigaScience repository that have been selected for further peer review” http://bit.ly/yPBsAA
- Research: e.g. The International Stroke Trial database http://www.trialsjournal.com/content/12/1/101
7
Solution #2: Open Data Award
“We... recognize researchers who have... demonstrated leadership in the sharing, standardization, publication, or re-use of biomedical research data.”
http://www.biomedcentral.com/researchawards/opendata
8
Solution #3: Enable and encourage/require data citation
“References... Only articles, datasets and abstracts that have been published or are in press, or are available through public e-print/preprint servers, may be cited …”
“Dataset with persistent identifier
Zheng, L-Y; Guo, X-S; He, B; Sun, L-J; Peng, Y; Dong, S-S; Liu, T-F; Jiang, S; Ramachandran, S; Liu, C-M; Jing, H-C (2011): Genome data from sweet and grain sorghum (Sorghum bicolor). GigaScience. http://dx.doi.org/10.5524/100012.”
http://blogs.biomedcentral.com/bmcblog/2012/01/19/citing-and-linking-data-to-publications-more-journals-more-examples-more-impact/
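The dataset reference format quoted on this slide (authors, year, title, publisher, DOI link) can be sketched in code. This is only an illustrative sketch: the `DatasetRecord` class, its field names, and the `format_dataset_citation` function are hypothetical and are not part of any BioMed Central or GigaScience tool; the punctuation simply mirrors the sorghum example above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetRecord:
    """Hypothetical container for the fields seen in the slide's example."""
    authors: List[str]
    year: int
    title: str
    publisher: str
    doi: str  # bare DOI, e.g. "10.5524/100012"


def format_dataset_citation(rec: DatasetRecord) -> str:
    """Build a reference string shaped like the slide's dataset example."""
    author_str = "; ".join(rec.authors)
    return (
        f"{author_str} ({rec.year}): {rec.title}. "
        f"{rec.publisher}. http://dx.doi.org/{rec.doi}."
    )


# The record below reuses the values shown on the slide.
example = DatasetRecord(
    authors=[
        "Zheng, L-Y", "Guo, X-S", "He, B", "Sun, L-J", "Peng, Y", "Dong, S-S",
        "Liu, T-F", "Jiang, S", "Ramachandran, S", "Liu, C-M", "Jing, H-C",
    ],
    year=2011,
    title="Genome data from sweet and grain sorghum (Sorghum bicolor)",
    publisher="GigaScience",
    doi="10.5524/100012",
)

if __name__ == "__main__":
    print(format_dataset_citation(example))
```

Keeping the DOI as a separate field, rather than burying it in free text, is what lets the same record be linked, counted, and cited independently of the article.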
9
Problem: Where can data be stored – permanently?
- Publishers not best placed to run repositories for long-term preservation of large datasets
- Mirrors of publisher content not able to accept arbitrary amounts of additional data
- Many data repositories exist but most are domain/location specific, and there are many different types of funding model, license agreement and persistent identifier in use
10
Solution #1: Journal with integrated database
11
Editor-in-Chief: Laurie Goodman, BGI (USA)
Editor: Scott Edmunds, BGI (China)
Assistant Editor: Alexandra Basford, BGI (China)
www.gigasciencejournal.com
www.biomedcentral.com
- The BGI is covering all APCs for the first year after launch
- GigaScience publishes ‘big-data’ studies from the entire spectrum of life sciences
- Novel publishing format: manuscript publication and data hosting
- Assignment of data DOIs allows separate data citation
Benefits
12
http://gigadb.org/
13
GigaDB is a new database integrated with the GigaScience journal to meet the needs of a new generation of biological and biomedical research as it enters the era of “big-data”… (see more)
14
http://gigadb.org/
15
Anatomy of a GigaScience Publication
[Figure; labels: Data, Idea, Study, Analysis, Answer, Metadata]
16
Solution #2: Comprehensive author information on available data repositories
http://datacite.org/repolist
http://www.biomedcentral.com/about/supportingdata
17
Solution #3: Research on repositories
http://publicationethics.org/files/u661/EthicalEditing_Autumn2012_final.pdf
We are looking for repositories with interests in clinical research data – can you help?
18
Problem: Data are not consistently linked to publications
- Data deposition policies are not established in all fields
- Even where they are, links/accession numbers tend to be inconsistently presented and rarely cited
- Researchers may, independently of journal requirements, deposit data in repositories
- A missed opportunity to enhance the literature
19
Solution #1: ‘Availability of supporting data’ article section
- A tool to put data deposition policies – encouraged or mandated – into practice
- Provides links in a consistent place within an article to supporting data, regardless of the location or format of the data
- Data must be permanently available (DOI or equivalent)
- ~50 journals including GigaScience, BMC series
http://www.biomedcentral.com/about/supportingdata
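As a rough illustration of the "DOI or equivalent" requirement, the sketch below checks whether a supporting-data link carries a DOI. It is an assumption-laden example, not a BioMed Central tool: the `extract_doi` function and the prefix list are hypothetical, the regular expression is a commonly used approximation of DOI syntax rather than an official validator, and other persistent identifiers (the "or equivalent" part) are not handled.

```python
import re
from typing import Optional

# Approximate shape of a DOI: "10.", a 4-9 digit registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")

# Resolver and scheme prefixes that may precede a DOI in a link (assumed list).
_PREFIXES = (
    "http://dx.doi.org/",
    "https://dx.doi.org/",
    "http://doi.org/",
    "https://doi.org/",
    "doi:",
)


def extract_doi(link: str) -> Optional[str]:
    """Return the bare DOI found in `link`, or None if it is not a DOI link."""
    candidate = link.strip()
    lowered = candidate.lower()
    for prefix in _PREFIXES:
        if lowered.startswith(prefix):
            candidate = candidate[len(prefix):]
            break
    return candidate if DOI_PATTERN.fullmatch(candidate) else None


if __name__ == "__main__":
    for link in ("http://dx.doi.org/10.5524/100012", "http://gigadb.org/"):
        print(link, "->", extract_doi(link))
```

A check like this only confirms that a link looks like a DOI; whether the identifier actually resolves to permanently available data still has to be verified separately.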
20
Availability of supporting data
BMC Res Notes 2012, 5:21 http://www.biomedcentral.com/1756-0500/5/21/
GigaScience 2012, 1:3 http://www.gigasciencejournal.com/content/1/1/3
21
Solution #3: Lab notebook integration
BMC authors entitled to LabArchives’ (http://www.labarchives.com/bmc) online lab notebook with 100Mb of free storage
Features include:
- Data publishing with DOI assignment
- Citable, linkable data supporting publications
- Reusable/integrate-able data with CC0 waiver
- Integrated manuscript submission to BMC journals
- Additional free storage (standard is 25Mb)
http://blogs.openaccesscentral.com/blogs/bmcblog/entry/labarchives_and_biomed_central_a
22
LabArchives partnership
23
24 Oct 2012
Open data partnership leads to release of data from Nobel Prize-winning laboratory for public use
http://www.biomedcentral.com/presscenter/pressreleases/20121024c
24
Problem: Licensing that restricts efficient data integration and (re)use
“The data should be released in standardized formats without intellectual property constraints.” Conway PH, VanLare JM: Improving Access to Health Care Data: The Open Government Strategy. JAMA 2010;304(9):1007-1008.
“[P]eople mis-use copyright licenses on uncopyrightable materials and data sets: the confusion of the legal right of attribution in copyright with the academic and professional norm of citation of one's efforts.” John Wilbanks, VP, Science, Creative Commons, August 11, 2010, http://bit.ly/djl5Fa
“...any restrictions on use should be strongly resisted and we endorse explicit encouragement of open sharing.” Schofield et al.: Post-publication sharing of data and tools. Nature 2009, 461:171.
http://pantonprinciples.org/
http://www.isitopendata.org/
25
Why Creative Commons CC0?
- interoperability: CC0 is human and machine-readable
- universality: CC0 is global and universal and widely recognized
- simplicity: no need for humans to make, and respond to, individual data requests – avoids “attribution stacking” with CC-BY licenses
Schaeffer P: Why does Dryad use CC0? http://blog.datadryad.org/2011/10/05/why-does-dryad-use-cc0/
http://creativecommons.org/publicdomain/zero/1.0/
26
Solution: Stakeholder engagement and community collaboration, leadership
27
Public consultation on implementing CC0 for data published in open access journals: closes 10th November 2012
http://blogs.biomedcentral.com/bmcblog/2012/09/10/put-the-open-in-open-data/
Hrynaszkiewicz I, Cockerill MJ: Open by default: a proposed copyright license and waiver agreement for open access research and data in peer-reviewed journals. BMC Research Notes 2012, 5:494 http://www.biomedcentral.com/1756-0500/5/494
28
Implementing CC0 in journals – how?
- Specify a date from which the new license would apply to data (CC-BY remains for other content)
- Only applies to data submitted to the journal
- Some relatively minor technical and operational implications
- Cultural change may be the biggest challenge
- Consultation is identifying common concerns, FAQs, and further definitions and use cases for open data in journal publications
Hrynaszkiewicz I, Cockerill MJ: Open by default: a proposed copyright license and waiver agreement for open access research and data in peer-reviewed journals. BMC Research Notes 2012, 5:494 http://www.biomedcentral.com/1756-0500/5/494
29
Problem: Lack of guidance, exemplars, incentives to make data reusable
- Sharing/publishing detailed human subjects data, in the absence of explicit consent, can potentially infringe privacy (ethically and legally)
- Data are more (re)usable if published in community-endorsed, standard formats
- Standards and appropriate guidance do not yet exist in all domains
- Few incentives to follow data standards
30
Solution #1: Work with journal editors to produce guidance where it is needed
BMJ 2010;340:c181
Co-published in: Trials 2010, 11:9
31
Solution #2: Publish exemplars
33
Solution #3: Incentivize, promote and share best practice and standards
http://www.biomedcentral.com/bmcresnotes/series/datasharing
http://biosharing.org/standards_view
34
Problem: Adding value to data of use to researchers, readers and publishers
- Text/data mining applications are often research project or research specific and not always attractive to commercial publishing platforms and their customers
- Value to the non-expert can be limited
- Makes business model/case challenging for publishers
35
http://www.biomedcentral.com/about/datamining/
36
www.casesdatabase.com
37
www.casesdatabase.com – coming soon
40
The future...
Image adapted from Gillam et al: The Healthcare Singularity and the Age of Semantic Medicine. In The Fourth Paradigm (2009)
41
Questions?
Iain Hrynaszkiewicz, Publisher (Open Science), BioMed Central
iain.hrynaszkiewicz@biomedcentral.com
http://www.mendeley.com/profiles/iain-hrynaszkiewicz/
http://uk.linkedin.com/in/iainhz
@iainh_z