Hannes Thiemann Michael Lautenschlager Deutsches Klimarechenzentrum GmbH, Germany EGU 2010.

Presentation transcript:


World Data Centre on Climate

• Approved in 2003
• Hosts several projects and data centres
• WDCC operates as a long-term data archive (10+ years)
• WDCC is implemented within the CERA data and information system
• Data are stored in conjunction with metadata
• WDCC offers a publication service for primary data (DOI)
• Staff of approximately 5 persons and 500 TB of data
• Increase of 1 PB/year starting in 2011
• Calendar year 2009: 800 active users
• Data from
  ◦ 80 projects
  ◦ 1400 experiments
  ◦ datasets
  ◦ 8.7 billion records
• ~1 million downloads
• More than 255 TByte in total

World Data Centre on Climate

• Most active German projects
  ◦ COPS
  ◦ REMO-UBA / BFG
  ◦ CLM Consortial Runs
  ◦ MILLENNIUM_COSMOS
• Anticipated projects
  ◦ CMIP5
  ◦ IPCC AR5
• Global and regional
  ◦ STORM
  ◦ EUCLIPSE
  ◦ And many more
• Most active international projects
  ◦ CEOP
  ◦ ENSEMBLES
  ◦ DPHASE
  ◦ Metafor
  ◦ IS-ENES
  ◦ IPCC

• Traditional architecture

CERA General Architecture

[Diagram: CERA2 Data Model with the blocks Entry, Reference, Status, Distribution, Contact, Coverage, Parameter, Spatial Reference, Local Adm., Data Access and Data Org; CERA2 Data Storage; processing on the fly.]

• Present CERA as a component of the infrastructure at DKRZ.

[Diagram: Metadata and Proxy on top of the data model blocks Entry, Reference, Status, Distribution, Contact, Coverage, Parameter, Spatial Reference, Local Adm., Data Access and Data Org; HPSS (10 PByte/a); StorageTek silos. Total capacity: tapes, approx. 60 PB (LTO and Titan).]

• Present the DOI service.

Publication of Scientific Primary Data at WDCC

[Diagram with the following labels:]
• Precondition: long-term availability of data and metadata at WDC-Climate
• Publication process at WDC-Climate (Publication Agent)
  ◦ Quality control of data and metadata
  ◦ Creation of STD-DOI metadata
  ◦ Metadata and data access via Internet
• Publication process at TIB, Technische Informationsbibliothek Hannover (Registration Agency)
  ◦ TIBORDER
  ◦ Creation of DOI/URN
  ◦ DOI resolver
  ◦ Integration of DOI and URL link

• Additionally, WDCC offers a primary data publication service for final data entities that are of general scientific interest
  ◦ The service follows the STD-DOI concept (Scientific and Technical Data – Digital Object Identifier, URL: )
  ◦ Important aspects of the publication process are
    ▪ the identification of independent data entities that are suitable for publication at the level of scientific literature,
    ▪ the execution of an elaborate review process for metadata and climate data,
    ▪ the assignment of additional metadata for electronic publication (ISO 690-2) and of persistent identifiers (DOI / URN), and
    ▪ the integration of publication metadata and persistent identifiers into the TIB library catalogue (Technical Information Library, Hannover), so that primary data entities are searchable and citable together with scientific literature.
• The quality level is presently “approved by author”; future development should lead to “peer reviewed”.
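As an illustration of what the additional publication metadata and a persistent identifier make possible, the following is a minimal sketch and not the WDCC or TIB implementation. The class `DataPublication`, its field names, the citation layout and the DOI `10.1234/EXAMPLE_DATASET` are all assumptions made up for this example; only the use of the public `https://doi.org/` resolver reflects how DOIs are resolved in general.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPublication:
    """Hypothetical minimal record for a published primary data entity."""

    authors: tuple[str, ...]
    year: int
    title: str
    publisher: str
    doi: str  # persistent identifier; the value used below is a placeholder
    quality: str = "approved by author"  # quality level named on the slide

    def resolver_url(self) -> str:
        # A DOI is turned into a web address by prefixing the doi.org resolver.
        return f"https://doi.org/{self.doi}"

    def citation(self) -> str:
        # Illustrative citation layout, not a prescribed standard format.
        names = "; ".join(self.authors)
        return f"{names} ({self.year}): {self.title}. {self.publisher}. doi:{self.doi}"


# Placeholder values only; none of this describes a real dataset.
record = DataPublication(
    authors=("Doe, J.", "Roe, R."),
    year=2010,
    title="Example climate model output",
    publisher="World Data Center for Climate",
    doi="10.1234/EXAMPLE_DATASET",
)
print(record.citation())
print(record.resolver_url())
```

Because the record carries the same kind of fields as a literature reference, a data entity described this way can be listed in a library catalogue and cited next to journal articles.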

STD-DOI data publication workflow

• Managing ACLs (access control lists) is often required
  ◦ Data owners want to publish papers before others start using the data
  ◦ Commercial use shall be prohibited
• Statistics on data usage are necessary
  ◦ Data owners want to know how often and by whom their data are used
  ◦ In case of problems or new versions, users can be informed
  ◦ Usage statistics give important information on how data should be stored in future projects
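The access rules and usage statistics listed above can be sketched in a few lines. This is a hedged illustration under stated assumptions, not the CERA access control: the names `Dataset`, `may_access` and `download`, the embargo date field and the per-user counter are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Dataset:
    """Hypothetical dataset entry with simple access rules."""

    name: str
    owner: str
    embargo_until: date  # owners may publish first; others wait until this date
    allow_commercial: bool = False  # commercial use is prohibited by default
    downloads: Counter = field(default_factory=Counter)  # user -> download count


def may_access(ds: Dataset, user: str, commercial: bool, today: date) -> bool:
    """Return True if the user is allowed to read the dataset today."""
    if user == ds.owner:
        return True
    if commercial and not ds.allow_commercial:
        return False
    return today >= ds.embargo_until


def download(ds: Dataset, user: str, commercial: bool, today: date) -> bool:
    """Check access and, if granted, record the download for usage statistics."""
    if not may_access(ds, user, commercial, today):
        return False
    ds.downloads[user] += 1
    return True


ds = Dataset("EXAMPLE_RUN", owner="owner", embargo_until=date(2011, 1, 1))
download(ds, "alice", commercial=False, today=date(2011, 6, 1))
print(sum(ds.downloads.values()), sorted(ds.downloads))
```

Keeping the counter per user answers both questions data owners ask (how often and by whom), and the same list of users can be contacted when a problem or a new version appears.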

• New CERA structure

[Diagram: Appl. Server; TDS (or the like); LobServer; HPSS; CERA DB layer (what, where, who, when, how); midtier; archive: files; container: LOBs.]

WDCC as IPCC / CMIP5 Data Node

[Diagram: IPCC Data Federation, shown below UN, WMO / UNEP and IPCC, with three nodes:]
• UK: BADC, ~1 PByte HD
• DE: WDCC, 0.7 PByte HD + 1.4 PByte tape
• US: PCMDI, ~1 PByte HD
[Further labels: model output data evaluation; paper evaluation]

• CERA as a basis for WDCC
  ◦ CERA metadata, DKRZ storage (disk, tape)
• Challenge: integrate project data management into long-term archival
  ◦ More frequent changes in metadata and data
• Transition phase
  ◦ Metadata and data components

Contact hannes.thiemann(at)zmaw.de

• Outlook: new projects, using IPCC AR5 / CMIP as an example
• Secure, integrity-preserving, long-term archiving is essential here

1. Section: General approach to digital Long-Term Preservation (dLTP)
The first section will introduce the subject of the seminar. It is intended to illustrate the importance of dLTP in general and to give an overview of the heterogeneous requirements of different user communities such as (digital) libraries, archives, data centres, digital repositories and science communities. Selected national and international activities and projects will be presented.

2. Section: Technical aspects
The third section adds a more technical point of view. The importance of metadata, especially metadata serving the dLTP process, will be discussed. Which metadata standards, if any, exist and can be recommended? Is a common approach possible that serves the various communities, such as the life sciences, natural sciences, social sciences and humanities? One talk should treat data formats, their role in a dLTP context and possible evaluation criteria. Assuring long-term accessibility by using persistent identifiers will be addressed with respect to the results of the eScience seminar held on this subject in March.

Section: Organisational aspects
The second section deals with the institutional and management requirements of dLTP. Which criteria does an archive or repository have to fulfil to be considered trustworthy? How important are standards, especially those that come from Grid and eScience technologies? What roles can be ascribed to the institutions involved in the lifecycle and dLTP of data? Who can be considered responsible for data federation and curation and for future access to the data? Furthermore, the cost of dLTP will be treated in this section.