Data enters Scholarly Communication; how publishers can help make things better Integration of Research Data and Publications Project ODE – workpackage.


1 Data enters Scholarly Communication; how publishers can help make things better. Integration of Research Data and Publications, Project ODE – workpackage 4. Eefke Smit, International Association of STM Publishers, Director, Standards and Technology. London, Annual APA Conference, 9 November 2011

2 A famous paper in Nature: DNA structure, 1953. 1 page, 2 authors, 1 figure, no data. Source: V. Kiermer, Nature Publishing Group, 2011

3 Nature in 2001: The human genome issue. 62 pages, 49 figures, 27 tables. Source: V. Kiermer, Nature Publishing Group, 2011

4 The human genome at 10 – 2010. Nature now in an iPad edition. Source: V. Kiermer, Nature Publishing Group, 2011

5 A thousand genomes – 2010. http://www.nature.com/nature/journal/v467/n7319/full/nature09534.html Raw data: 12,145 SRA run ids submitted to Short Read Archive. Source: V. Kiermer, Nature Publishing Group, 2011

6 Author information; live updates; collapsible sections; tool box to print, download reference, share (email, social media, bookmark); figure previewer; related content; new publishing models; DOI; article-level metrics. Source: V. Kiermer, Nature Publishing Group, 2011

7 From The Biochemical Journal, Portland Press: Ever wanted to inspect data referenced in articles? Utopia Documents allows you to interact directly with curated database entries. Play with molecular structures; edit sequence and alignment data; even plot curated tabular data yourself. http://www.biochemj.org/bj/semantic_faq.htm

8 Elsevier offers gene and protein viewers from within the article, linking to data stored elsewhere.

9 How big is the Data Problem? Depositions of datasets in archives continue to grow, surpassing journal articles in biomedical research. Growth of biomedical research publications (red; current total >19 million), alongside the accumulation of research data, including nucleic acid sequences (black; current total ~163 million), computer-annotated protein sequences (magenta; current total 9 million), manually annotated protein sequences (green; current total 500,000) and protein structures (blue; current total 60,000). Source: Biochemical Journal 2009, 424, 317-333, Teresa K. Attwood, Douglas B. Kell and others.

10 How big is the Data Problem? Too big for the Journal of Neuroscience and Cell. The graph depicts the average size of a Journal of Neuroscience article and its supplemental material in megabytes. The supplementary material would soon have outgrown the article volume, so the Journal no longer accepts supplementary files with manuscripts. The burden on the peer review process simply became too large. Editors suspect researchers of treating supplements as data dumping grounds (Emily Markus, Cell). Publishers cannot guarantee proper preservation and future accessibility of supplementary files. Source: Maunsell J, J. Neurosci. 2010;30:10599-10600, ©2010 by Society for Neuroscience

11 Researchers foresee higher volumes of data per research project. Source: PARSE.Insight survey, 2008

12 Data Publications Pyramid: there is data, data and data...
Layers, top to bottom: Publications with data; Processed Data and Data Representations; Data Collections and Structured Databases; Raw Data and Data Sets

13 The Data Publication Pyramid
Layers, top to bottom: Publications with data; Processed Data and Data Representations; Data Collections and Structured Databases; Raw Data and Data Sets
(1) Data contained and explained within the article
(2) Further data explanations in any kind of supplementary files to articles
(3) Data referenced from the article and held in data centers and repositories
(4) Data publications, describing available datasets
(5) Data in drawers and on disks at the institute

14 The Pyramid’s likely short-term reality
Layers, top to bottom: Pubs; Supps; Data Archives; Data on Disks and in Drawers
(1) Top of the pyramid is stable but small
(2) Risk that supplements to articles turn into data dumping places
(3) Too many disciplines lack a community-endorsed data archive
(4) Estimates are that at least 75% of research data is never made openly available

15 The Ideal Pyramid
Layers, top to bottom: Data in Publications; Article Supps; Data Archives; Data on Disks and in Drawers
(1) More integration of text and data, viewers and seamless links to interactive datasets
(2) Only if data cannot be integrated in article, and only relevant extra explanations
(3) Seamless links (bi-directional) between publications and data, interactive viewers within the articles
(4) More Data Journals that describe datasets, data management plans and data methods

16 How can publishers help to make things better:
Stricter editorial policies on the availability of underlying data.
Recommend reliable and trustworthy Data Archives to authors.
Enhance articles for better integration of underlying data.
Endorse guidelines for proper citation of data.
Launch and sponsor Data Journals.
Ensure persistent identifiers and bi-directional linking.
Partner with reliable Data Archives for further integration of Data and Publications, including interactivity for re-use.
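As an illustration of the "persistent identifiers and bi-directional linking" point on slide 16, here is a minimal sketch of how links between articles and datasets could be kept navigable from both sides. It is not from the presentation: the `LinkRegistry` class and every identifier in it are hypothetical, and a real publisher or data archive would rely on its own identifier and metadata infrastructure.

```python
from collections import defaultdict


class LinkRegistry:
    """Hypothetical registry that keeps article/dataset links in both directions."""

    def __init__(self):
        self._datasets_by_article = defaultdict(set)
        self._articles_by_dataset = defaultdict(set)

    def link(self, article_id, dataset_id):
        # Record the link once per direction so it can be followed from either side.
        self._datasets_by_article[article_id].add(dataset_id)
        self._articles_by_dataset[dataset_id].add(article_id)

    def datasets_for(self, article_id):
        # Datasets referenced by an article (article -> data).
        return sorted(self._datasets_by_article.get(article_id, ()))

    def articles_for(self, dataset_id):
        # Articles that reference a dataset (data -> literature).
        return sorted(self._articles_by_dataset.get(dataset_id, ()))


# All identifiers below are made up for illustration.
registry = LinkRegistry()
registry.link("doi:10.5555/article.1", "doi:10.5555/dataset.A")
registry.link("doi:10.5555/article.2", "doi:10.5555/dataset.A")
registry.link("doi:10.5555/article.1", "doi:10.5555/dataset.B")

print(registry.datasets_for("doi:10.5555/article.1"))
print(registry.articles_for("doi:10.5555/dataset.A"))
```

Storing each link in both directions means a reader of the article can reach the data, and someone reusing the data can find the literature that cites it.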

17 What the Future Article might look like: Articles will be less linear and more modular, offering layered presentation of different levels of detail and multiple entry points to greater depth for specialists, including to underlying data. Data, multimedia and other original material will become separately citable items and even publishable items in their own right. Underlying data will become part of articles, via interactive PDFs, via gene and protein viewers, via semantic links. Articles will be interactive; graphs and illustrations will offer click-throughs to deeper information, as will semantically tagged terms. Data Archives will ensure links from data to publications, so that all available literature is at hand for those interested in reusing the data.

18 Questions? Eefke Smit, International Association of STM Publishers, Director, Standards and Technology. smit@stm-assoc.org

