H. Pfeiffenberger, D. Carlson, S. Dallmeier-Tiessen, Bloomsbury Conf., UCL, London
On Publishing Data: “Earth System Science Data”, a Data Publishing Journal
Hans Pfeiffenberger, David Carlson, Sünje Dallmeier-Tiessen
Alfred-Wegener-Institute for Polar and Marine Research, Helmholtz Association - Germany
British Antarctic Survey - Great Britain
Bloomsbury Conference 2009, UCL, London
Agenda
Why publish data... and: What is the problem?
  Developments in the arena of science policy
  History, state of the art and missing elements
ESSD - “Earth System Science Data”, a journal
  A practical contribution to an emerging genre of scholarly communication
  Aims and scope; structure of articles, review criteria
Conclusion and Outlook:
  Specific: On ESSD
  General: Contribution of “classical” academic publishing to data publishing
ESF / EuroHORCs European Research Area Vision
Interestingly, there is no mention of a world-class publishing industry... Or is this industry a research infrastructure?!
We will show how publishing can help comply with the requirement for quality-assured research data.
Data is the foundation of scientific knowledge
Ur, Mesopotamia, 2000 BC: First known recording of a lunar eclipse
700 BC: Babylonians predict lunar eclipses; 585 BC: Thales predicts a solar eclipse
17th century: Galileo does experiments, Newton explains astronomers’ observations
Newton humbly declares: “If I have seen a little further it is by standing on the shoulders of Giants”
1665 AD: “Philosophical Transactions of the Royal Society of London” created, which virtualizes and preserves the giants’ shoulders
2005 AD: Tony Hey, director of the British eScience programme, declares “...key drivers behind the search for such new scientific tools is the imminent deluge of data...”
Are there problems with the “shoulders”?
Let me propose a different analogy: scientific knowledge has been built like a huge building:
Books and articles represent important building blocks or bricks (QA!)
Between the layers of bricks there is mortar: new evidence, data (No QA?!)
We do have systematic, though not 100% effective, quality assurance for the bricks, but effectively no (adequate) systematic quality assurance for the mortar.
Consider ozone data from satellites:
Fusco, L., J. Linford, W.J. Som de Cerff, C. Boone, C. Leroy and M. Petitdidier: Earth Observation Applications Approach to Data and Metadata Deployment on the European DataGrid Testbed
Well documented procedures
Well defined products
Well known instruments
ESA / other gov. agencies as stewards
=> Elaborate infrastructure
QA by process!
Consider ground-based ozone profiles from Antarctica
Ozone soundings (balloon-carried sonde profiles) in the years when the “ozone hole” first developed
Balloon data are needed for calibration of satellite data and thus for verification of models
König-Langlo, G. and Gernandt, H.: Compilation of ozonesonde profiles from the Antarctic Georg-Forster-Station from 1985 to 1992, Earth Syst. Sci. Data, 1, 1-5, 2009
Handling of ozone data as state of the art
These two “datasets” exemplify the two prevailing modes of handling data at present:
Either at the “Petascale”, where largely homogeneous mounds of data are handled in an industrial fashion and collated into one super-dataset, comparable to a book holding the work of a lifetime.
Or at the “Megascale”, where large numbers of heterogeneous datasets are handled as in a workshop (Manufaktur), by a craftsperson or an artisan. They are communicated on demand through mail or via an obscure FTP server, comparable to the letter from scholar to scholar.
There is as yet almost no in-between to handle the bulk of information at the Giga- to Terascale, which would need to be comparable to the system of academic journals for textual information.
Summary - Outlook - Part I
ESF: “... permanent access to... quality assured research data”
Aim: Reuse & Reproduce
Digital Long-term Preservation
Persistent (and Open) Access, Licensing
Quality Assessment
Data provided and described by researchers
Basic and advanced data infrastructure, provided by ???
Data publishing must provide
Required for data publishing
Who is who…
Advisory Board: Paul J. Crutzen, Sydney Levitus, Alexander Petrovich Lisitzin
Editors in Chief: David Carlson, Hans Pfeiffenberger
Publishing House: Copernicus Publishers – OA Publisher, EGU
Managing Editor: Sünje Dallmeier-Tiessen
The first paper
Repository Reference
Estimate of Error and Data Provenance
Review Guidelines
Originality: Are the data or methods new, i.e., never measured or employed before?
Significance: Is there any potential of the data being useful?
  Uniqueness
  Usefulness
  Completeness
Data Quality: The data must be readily available in a usable format.
  Accuracy, methods, instrumentation and processing as state of the art
Today’s Data Reuse, Citation and Quality Control
Reuse, Citation and Quality Assessment with ESSD
Summary - Outlook: Part II
Reward for data publication, by being citable (impact factor)
Quality-assured data and data documentation facilitate future reuse
First articles online, first experiences
Outlook
Special Issue with 18 papers from the CARINA project (oceanic carbon budget) in production
Development of more specialized manuscript templates and review guidelines for other types of research data
Summary - Outlook: General
Text has been with us for 5000 years
The printing press, 500 years
Digital data, as preserved items, 50 years (World Data Centres)
Online access to massive amounts of data, 5 years
=> Do not expect a perfect, final modus operandi for publication of data anytime soon
Thank you!