Published by Eileen Bryant. Modified over 9 years ago.
1
Literature/data integration and
Ryan Scherle, Data Repository Architect, Dryad Digital Repository
HighWire Fall Publishers’ Meeting, November 20, 2013
You may reuse any of the original content in these slides as you wish, provided you attribute the source.
2
CC-BY-NC-SA nic221 http://www.flickr.com/photos/nic221/391536867/
3
Bumpus HC (1898) The Elimination of the Unfit as Illustrated by the Introduced Sparrow, Passer domesticus. Biological Lectures from the Marine Biological Laboratory: 209-226. CC-BY Adamo http://www.piqs.de/fotos/121272.html
5
Who cares if the data is lost?
Image credits: By Agrant141 (Own work) [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons. James Cook, portrait by Nathaniel Dance-Holland, c. 1775, National Maritime Museum, Greenwich.
6
Who cares if the data is lost?
[Figure not reproduced; n=3824]
Source: Publishing Research Consortium, http://publishingresearch.net
7
Data “available upon request”
Wicherts and colleagues requested data from 141 articles in American Psychological Association journals.
“6 months later, after … 400 emails, [sending] detailed descriptions of our study aims, approvals of our ethical committee, signed assurances not to share data with others, and even our full resumes…” only 27% of authors complied.
Wicherts JM, Borsboom D, Kats J, Molenaar D (2006) doi:10.1037/0003-066X.61.7.726
8
Fighting data entropy
[Figure: Information Content plotted against Time, starting at Time of publication. Labels on the figure: Specific details, General details, Accident, Retirement or career change, Death.] (Michener et al. 1997)
9
Funder policies
US funding agencies that require or strongly recommend data sharing:
o CDC
o DOD
o DOE
o EPA
o NASA
o NIH
o NIST
o NOAA
o NSF
o USDA
10
Joint data archiving policy
Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future. As a condition for publication, data supporting the results in the article should be deposited in an appropriate public archive. Authors may elect to embargo access to the data for a period up to a year after publication. Exceptions may be granted at the discretion of the editor, especially for sensitive information.
http://datadryad.org/pages/jdap
11
Impact factor and archiving policies
[Figure: n=70; group labels IF=3.6, IF=4.5, IF=6.0]
Piwowar HA, Chapman WW (2008) hdl:10101/npre.2008.1700.1
12
Data archiving landscape
There are so many data repositories that we need directories of them:
o http://re3data.org
o http://DataBib.org
These repositories vary along many dimensions:
o Datatype focus
o Community focus
o Allowed file sizes
o Curation policies
o Data access policies
o Funding model
13
Data archiving landscape
[Figure: repositories placed on two axes, Datatype Focus and Community Focus, with the labels General and Focused. Repositories shown: Figshare, Institutional Repository, Supplemental Materials, Genbank, Pangaea, Zenodo, Lab Database, Dryad.]
14
Dryad vs supplementary materials
Feature | Dryad | SOM
Discoverable: indexed and exposed to both web and bibliographic search engines | ✔ | ✗
Identifiable: DataCite DOIs within articles serve as permanent, resolvable identifiers | ✔ | ✗ *
Permanent: processes in place to promote preservation (incl. format migration) | ✔ | ✔ / ✗ **
Curated: quality control by both automated processes and human inspection | ✔ | ✗ *
Ease of deposit: streamlined deposit, allowance for large and complex datasets | ✔ | ✔ / ✗ **
Formatted for reuse: do not convert reusable formats to PDF | ✔ | ✔ / ✗ **
Updatable: new versions of data files can be added, metadata can be enhanced | ✔ | ✗
Support for embargoes: can delay release of data in accordance with journal policy | ✔ | ✗
Free reuse: no paywall, clear terms of reuse (all data released under CC Zero) | ✔ | ✔ / ✗ **
Support for large files: allow data files up to 10GB | ✔ | ✗
Economy of scale: cost efficiency from shared infrastructure | ✔ | ✔ / ✗ **
Alignment to organizational mission: focus on archiving and reuse of scientific data | ✔ | ✗
* A few publisher SOM sites are exceptions to the general rule
** Practices differ among publishers, see Smit (2011), doi:10.1045/january2011-smit
15
What makes Dryad unique
1. Tight focus on data associated with published literature
2. Data packages are curated
3. Open development process allows broad participation
4. Nonprofit organization managed by stakeholders
DataDryad.org
16
Dryad features Quick and easy submission process…
17
Dryad features …referencing authoritative sources…
18
Dryad features …and leveraging integration with journals…
19
Dryad features …to maximize the submitter’s valuable time.
20
DataDryad.org
21
Data citations
Best practice is to cite both the article and the data: they are both useful research products.
But limit data citations to one data package per article: this eliminates most concerns about the size/granularity of data files.
22
23
Materials and Methods References
28
Dryad uptake
o >4,000 data packages containing >12,000 files, associated with articles in 275 journals
o 200 submissions each month and growing
o Some data packages have been downloaded more than 10,000 times
o Fewer than 10% of authors chose to embargo their data when this option is allowed by the journal
29
Price schedule
Plan | Member | Non-member | Minimum Purchase
Voucher | $65 per data package | $70 per data package | 25 vouchers
Deferred Payment | $70 per data package | $75 per data package | 1 year contract
Subscription | annual fee based on $25 per published research article | annual fee based on $30 per published research article | 2 year contract
Pay on submission | N/A | $80 per data package, payable by the submitter | 1 data package
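To compare the plans in the price schedule above, the arithmetic can be sketched in a few lines of Python. This is only an illustrative estimate, not Dryad's billing logic. The function name `annual_costs` and its parameters are hypothetical. It also assumes that the voucher minimum is bought in full within one year and that the subscription fee is simply the per-article rate times the number of published research articles.

```python
# Prices (USD) copied from the price schedule above.
PRICES = {
    "voucher": {"member": 65, "non_member": 70},       # per data package
    "deferred": {"member": 70, "non_member": 75},      # per data package
    "subscription": {"member": 25, "non_member": 30},  # per published article
}
PAY_ON_SUBMISSION = 80  # per data package, paid by the submitter
VOUCHER_MINIMUM = 25    # minimum voucher purchase


def annual_costs(articles_published, data_packages, membership="member"):
    """Estimate one year's cost in USD under each plan.

    Assumptions (not stated in the schedule): at least VOUCHER_MINIMUM
    vouchers are bought in the year, and the subscription fee equals the
    per-article rate multiplied by the articles published.
    """
    if membership not in ("member", "non_member"):
        raise ValueError("membership must be 'member' or 'non_member'")
    vouchers_bought = max(data_packages, VOUCHER_MINIMUM)
    # The schedule lists pay-on-submission as N/A in the member column.
    pay_on_submission = (
        None if membership == "member" else data_packages * PAY_ON_SUBMISSION
    )
    return {
        "voucher": vouchers_bought * PRICES["voucher"][membership],
        "deferred": data_packages * PRICES["deferred"][membership],
        "subscription": articles_published * PRICES["subscription"][membership],
        "pay_on_submission": pay_on_submission,
    }


if __name__ == "__main__":
    # Hypothetical journal: 200 articles a year, 40 of them with data packages.
    for plan, cost in annual_costs(200, 40, "member").items():
        print(plan, "N/A" if cost is None else f"${cost}")
```

Under these assumptions, per-package plans favor journals where only a fraction of articles have associated data, while the per-article subscription becomes competitive as that fraction rises.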
30
Sponsoring open data
Publishers, societies, and other organizations are now sponsoring deposits in 44 journals:
Functional Ecology; Heredity; Journal of Heredity; Systematic Biology; The American Naturalist; Ecological Monographs; Proceedings A; Proceedings B; Journal of Ecology; Interface Focus; Plant Physiology; The Plant Cell; Open Biology; Ecology and Evolution; Evolutionary Applications; eLife; Evolution; Elementa; Palaeontology; MycoKeys; Comparative Cytogenetics; Subterranean Biology; Nature Conservation; NeoBiota; PhytoKeys; ZooKeys; Paleobiology; Biodiversity Data Journal; BioRisk; Molecular Ecology; Molecular Ecology Resources; GMS German Medical Science; GMS Medizinische Informatik, Biometrie und Epidemiologie; Special Papers in Palaeontology; Journal of Evolutionary Biology; Journal of the Royal Society Interface; Journal of Applied Ecology; Journal of Animal Ecology; Methods in Ecology and Evolution; The Journal of Paleontology; Journal of Hymenoptera Research; Philosophical Transactions A; Philosophical Transactions B
31
In development… Added value for journals, including a data display widget and a dashboard for editors
32
Integrated article & data submission
Key functionality
o Makes data deposition simple for authors (once files are prepared)
o Ensures permanent link to data within each article (and vice versa)
Options are customized to meet journal policies
o Data can be submitted prior to manuscript review or upon acceptance
o Journals may allow authors the option of embargoing data for 1 year after publication
33
To learn more
Repository home: http://datadryad.org
News: http://blog.datadryad.org
Twitter: @datadryad
Ryan Scherle, ryan@datadryad.org