1
SeaDataNet A Pan-European Infrastructure for Ocean and Marine Data Management www.seadatanet.org Catherine Maillard First Training Session Ostende, February 12-17, 2007 Introduction to Oceanographic Data Management
2
2 Data management [Diagram linking OBSERVING SYSTEMS to ANALYSIS & MODELLING SYSTEMS. Labels shown: User's Web browser; Analysis program; Product generation; Quality control checks; Data discovery; Safeguarding; Data sets aggregation; Catalogues; Data Compilation; Data Formatting; CDI - Data indexing in local archiving system]
3
3 1. Data Compilation. The data never go directly to the data centres, so the data centre needs to: locate the data sets not yet archived; request and get a copy of the missing data sets from the source laboratory/scientist; check that the data sets are properly documented.
4
4 COMPILATION 1.1: Locate the data sets which are not yet archived. Search in the cruise report (CSR) catalogue, or in the observation system directory (EDIOS), or in EDMED or EDMERP; a data set should be identified in one of these. In addition, maintain regular direct contacts.
5
5 COMPILATION 1.2: Get a copy of the missing data sets from the source laboratory/scientist. Request a copy, in any format, of the missing data sets identified as not archived. Emphasize the importance of: long-term archiving to follow up the environmental changes; integration in long time series of data of the same type (the availability of global/regional/thematic databases depends on all contributions); facilitating the use of these databases. Get and safeguard the electronic file. Digitization is sometimes necessary (GODAR).
6
6 COMPILATION 1.3: The mandatory meta-data. Check that the data sets are properly documented with the mandatory fields described. A minimum of meta-data should be included in the data files, e.g. reference to the cruise or observation system and source laboratory, sensor type, parameter names and units, etc. Complete the missing information by asking questions to the originator.
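The documentation check above can be sketched in a few lines of Python. This is a minimal illustration only: the field names in MANDATORY_FIELDS and the example header are assumptions made for the sketch, not the official SeaDataNet list of mandatory fields.

```python
# Hypothetical mandatory fields; the slide gives these only as examples.
MANDATORY_FIELDS = ("cruise_or_system", "source_laboratory", "sensor_type", "parameters")


def missing_metadata(header):
    """Return the mandatory items that are absent or empty in a header dict.

    `header["parameters"]` is assumed to map parameter names to units.
    """
    missing = [field for field in MANDATORY_FIELDS if not header.get(field)]
    # Every parameter needs a unit, otherwise its values cannot be interpreted.
    for name, unit in (header.get("parameters") or {}).items():
        if not unit:
            missing.append(f"unit for {name}")
    return missing


# Invented example header with two gaps and one parameter lacking a unit.
header = {
    "cruise_or_system": "CRUISE-001",
    "source_laboratory": "",
    "parameters": {"TEMP": "degC", "PSAL": ""},
}
questions_for_originator = missing_metadata(header)
```

The returned list is meant to be turned into questions for the originator rather than filled in by the data manager.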
7
7 2 - Data Reformatting. In general the original formats of the data files cannot be used in data management, because of: incomplete or non-standardized meta-data; incompatibility with the input format of QC and other processing; the need for a unique archiving format for safeguarding the data sets of the same type. The data management format, the archiving format and the dissemination/exchange format(s) may be, but are not necessarily, the same.
8
8 2 - Different Data Formats used. Archiving format: can be one of the actual exchange formats or a local format designed according to rules that ensure sustainability. Exchange/Dissemination format(s): joint projects and interoperability require common exchange format(s). Data Management/processing
9
9 2.1: General rules for sustainability of an archiving format. The archiving format should: be independent of the computer (and libraries), so RDBS are not appropriate; ensure that any isolated data includes enough meta-data to be processed (e.g. location and date); be compatible with, and include at least, the mandatory fields (meta-data) requested for the agreed exchange format(s); include additional textual or standardized "history" or "comment" fields to prevent any loss of information; provide a similar structure and meta-data for different data types such as vertical profiles and time series. These rules are normally followed for exchange formats as well.
10
10 2.2 - SeaDataNet Data transport Formats. Obligatory formats: NetCDF (binary) for gridded data and 3D observation data such as ADCP; (modified) ODV spreadsheet for other data types (vertical profiles and time series). Optional format: ASCII Medatlas as standard exchange format for the Mediterranean and Black Sea community. BODC leads the task to modify the present ODV and NetCDF formats for SeaDataNet use (QC flags, parameter semantics, etc., and conformity with the international standards). Formatting exercises to assess the coherence and compatibility of exchange formats.
11
11 2.3 - Processing Formats. For data management (QC, cataloguing, selection, extraction, visualisation) the data can be: in the archiving format; in a relational database system (RDBS). The RDBS presently most used in the community are ORACLE and MySQL. Note: an interface is needed between the software input format and the local data management system.
12
12 3 - Quality Checks. What they do: detect missing mandatory information; detect errors made during the transfer or reformatting; detect remaining outliers; detect duplicates; attach a quality flag to each numerical value. What they don't do: the preliminary data calibration and validation made by the expert scientists; modify the data points. General rule: the tools for data QC are not unique (e.g. ODV and other local systems), but the procedures are compatible. Any QC of a data set should be reported to the originator to give feedback and ask questions. How they are performed: next presentation by Sissy.
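The principle of attaching a flag to each numerical value without modifying the data points can be sketched as follows. The flag codes and the validity range are illustrative assumptions chosen for this sketch, not the agreed QC flag scale, and a simple range test stands in for the real outlier checks.

```python
# Illustrative flag codes (assumption for this sketch only).
FLAG_GOOD, FLAG_BAD, FLAG_MISSING = 1, 4, 9


def flag_values(values, valid_min, valid_max):
    """Pair every value with a quality flag; the values are never altered."""
    flagged = []
    for value in values:
        if value is None:
            flag = FLAG_MISSING
        elif valid_min <= value <= valid_max:
            flag = FLAG_GOOD
        else:
            flag = FLAG_BAD  # outlier: flagged, but kept as delivered
        flagged.append((value, flag))
    return flagged


# Invented temperature profile with one outlier and one missing value.
temperatures = [13.2, 13.1, 99.9, None, 12.8]
flagged_temperatures = flag_values(temperatures, valid_min=-2.5, valid_max=40.0)
```

Keeping the original value next to its flag lets the originator review every decision when the QC results are reported back.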
13
13 4 - Safeguarding. The QCed data sets should be safeguarded in a perennial system for further use: 2 copies; follow-up of the backup when the system or the technology changes. It is recommended to use the common computer infrastructure of the institutes to make the backup regular and automatic. The original, non-standardized and non-QCed data sets should also be safeguarded, for possible further checks by the data manager or the source scientists, but they are not to be disseminated.
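A minimal sketch of the "2 copies" rule, assuming the backups are plain file copies verified by checksum. The function names and directory layout are hypothetical; an institute's own backup infrastructure would normally do this job.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def safeguard(source, backup_dirs):
    """Copy `source` into each backup directory and verify every copy."""
    source = Path(source)
    expected = sha256_of(source)
    copies = []
    for directory in backup_dirs:
        target_dir = Path(directory)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256_of(target) != expected:
            raise IOError(f"backup copy {target} does not match the original")
        copies.append(target)
    return copies
```

Calling `safeguard("cruise.txt", ["/backup_a", "/backup_b"])` (invented paths) would produce the two verified copies; re-running the checksum comparison after a system or technology change is one way to follow up the backup.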
14
14 5 - Data Dissemination and service. National data sets, according to the national rules. Aggregated data sets, with other data sources. Export the data in a unique exchange format, with the appropriate documentation on: the format and codes; the QC performed on the data; the source of the data and the conditions of use (license).
15
15 5 - Data aggregation. Data aggregation represents a service and a product: answering data requests related to a geographical area or other selection criteria, independently of the source. Steps: interrogate the local data centre; complete with other sources; eliminate the duplicates.
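The three aggregation steps can be sketched as below. This is a simplified illustration: the record fields (platform, time, lat, lon) and the exact-match duplicate key are assumptions, and real duplicates are rarely this clean.

```python
def aggregate(local_records, *other_sources):
    """Merge the local data centre with other sources, dropping exact duplicates.

    The local data centre is read first, so its copy of a station is the one kept.
    """
    seen = set()
    merged = []
    for source in (local_records, *other_sources):
        for record in source:
            key = (record["platform"], record["time"], record["lat"], record["lon"])
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged


def in_box(record, south, north, west, east):
    """Geographical selection by bounding box (no dateline handling)."""
    return south <= record["lat"] <= north and west <= record["lon"] <= east


# Invented records: the second source repeats one local station.
local = [{"platform": "SHIP-A", "time": "2006-03-01T12:00", "lat": 43.5, "lon": 7.9}]
other = [
    {"platform": "SHIP-A", "time": "2006-03-01T12:00", "lat": 43.5, "lon": 7.9},
    {"platform": "SHIP-B", "time": "2006-03-02T06:00", "lat": 55.0, "lon": 3.0},
]
merged = aggregate(local, other)
mediterranean = [r for r in merged if in_box(r, 30.0, 46.0, -6.0, 36.0)]
```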
16
16 Other data sources. The other data centres of the consortium. Regional and project databases: ICES (North-East Atlantic); Medatlas 2002; Mater 1996-1999 (although some of its data are included in Medatlas); MFS/MOON for real time (RT). The World Ocean Atlas: delayed mode data. The Coriolis/Argo server: real time data. The satellite data.
17
17 The consortium data. The Common Data Index (CDI) shows what is presently available in the data centres. It will be continuously updated during the project: http://www.sea-search.net/cdi/ (also from the SeaDataNet website). During the development phase (2006-2007) of the interoperable system by the Technical Task Team, each data centre is interrogated separately to get access to the data. Several data centres provide on line tools for data search and access, including geographical selection and web services.
18
18 Regional Databases. ICES: http://www.ices.dk/ocean/ ; ICES format. Medatlas 2002: www.ifremer.fr/medar + Cdrom + ftp site; developed in the frame of the EU Medar project (a regional DAR); data selection tools according to various criteria, including geographical search, available on the Cdrom; also available on line from several partner data centres; Medatlas format.
19
19 World Ocean Atlas 2005. http://www.nodc.noaa.gov/OC5/WOD05/pr_wod05.html Developed by US/NODC - WDC Washington - Ocean Climate Laboratory in the frame of the IOC/GODAR project, with the contribution of the other data centres. Data, mainly delayed mode data, are available through an on line selection tool or on DVD (on request). All the fields can be interrogated for data selection. The possibility to select countries by group (to get all but one's own country, or all but the SDN consortium, for example) is commonly used.
20
20 Data Types in WOA 2005. Type of observations: Ocean Station Data (OSD) [bottle, low resolution CTD/XCTD, plankton data]; High Resolution CTD/XCTD (CTD); Expendable (XBT) and Mechanical (MBT) Bathythermographs; Autonomous Pinniped Bathythermographs (APB); Profiling Floats (PFL); Drifting Buoys (DRB); Moored Buoys (MRB) [TAO, PIRATA, others]; Undulating Oceanographic Recorder (UOR) [towed CTD]; Glider data (GLD); Surface-Only (SUR) [bucket, thermosalinograph]. Parameters: pressure, temperature, salinity + 23 bio-geochemical parameters + biological taxa.
21
21 WOA 2005 export format. US-NODC format. Codes and standards different from SeaDataNet. Tools available to process the data: US/NODC tools in Fortran, C and Java to read the data; a SeaDataNet/Ifremer converter to transcribe from WOA to Medatlas (presently available on Unix only); ODV can visualise the data directly in WOA format.
22
22 Coriolis/Argo Server. http://www.coriolis.eu.org/cdc/ The Coriolis/Argo server is one of the two Argo Global Data Assembly Centres (GDAC), synchronized on a daily basis with the US GODAE Data Centre (Monterey). It serves daily real time data (+ gridded analyses) from national DACs including the Australian, Canadian, Chinese, French, Indian, Korean, Japanese, UK and US ones; from contributors in Chile, Costa Rica, Germany, Morocco, Mexico, Norway, the Netherlands, Russia and Spain; and from the GTS (sources difficult to establish). On line selection tools allow in-situ data to be visualized and downloaded.
23
23 Data Types in Coriolis/Argo. Vertical profiles mainly from: XBT, XCTD or XBT from research or opportunity vessels; Argo profiling floats; anchored buoys or moorings; drifting buoys. Trajectory data mainly from: drifting buoys; Argo floats; vessels equipped with a thermosalinograph (GOSUD server). Many data but few parameters: essentially P, T, S. Under development: integration in the SeaDataNet system.
24
24 Export Formats from Coriolis/Argo. Argo NetCDF: widely used in operational oceanography, designed for TS profiles. ASCII: (quasi) Medatlas. Important: for Medatlas format extraction, do the data selection data type by data type, to avoid having all types grouped in the same file.
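The advice to extract data type by data type amounts to grouping the selection before export. The sketch below assumes each selected record carries a data_type field; the field name, the example records and the output file naming are hypothetical and are not part of the Coriolis tools.

```python
from collections import defaultdict


def group_by_data_type(records):
    """Split a mixed selection into one list of records per data type."""
    groups = defaultdict(list)
    for record in records:
        groups[record["data_type"]].append(record)
    return dict(groups)


def output_name(data_type):
    """Hypothetical naming scheme: one export file per data type."""
    return f"extract_{data_type.lower()}.txt"


# Invented mixed selection.
selection = [
    {"data_type": "XBT", "id": 1},
    {"data_type": "PFL", "id": 2},
    {"data_type": "XBT", "id": 3},
]
groups = group_by_data_type(selection)
files = {output_name(data_type): len(items) for data_type, items in groups.items()}
```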
25
25 Duplicates problem for data dissemination and products preparation. Even if the data are checked for duplicates at the national level, remaining problems may exist: data insufficiently documented and attributed to two different sources; PTS files and the same station with other parameters; RT and DM profiles; data declassified by the Navies with poor meta-data; data sets from the GTS with decimated and poorly documented profiles.
26
26 What can be done? Selection country by country (difficult, however, for the RT). Visualising ship tracks and trajectories and superimposing the position maps of cruises made in the same region in the same period. In case of duplicate data sets, evaluate which is the best set of observations, the most complete and best documented, etc. This can lead to a lot of manual work in the QC.
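Part of this evaluation can be prepared automatically. The sketch below treats two stations as probable duplicates when they are close in position and time, and keeps the better documented one. The tolerances, the record fields and the "count of non-empty fields" measure are assumptions for illustration; they do not replace the manual evaluation.

```python
from datetime import datetime, timedelta


def is_probable_duplicate(a, b, max_deg=0.01, max_dt=timedelta(minutes=30)):
    """Two stations close in position and time are treated as probable duplicates."""
    return (
        abs(a["lat"] - b["lat"]) <= max_deg
        and abs(a["lon"] - b["lon"]) <= max_deg
        and abs(a["time"] - b["time"]) <= max_dt
    )


def completeness(record):
    """Count non-empty fields, a crude proxy for 'most complete and best documented'."""
    return sum(1 for value in record.values() if value not in (None, "", []))


def keep_best(records):
    """Keep one record per group of probable duplicates: the most complete one."""
    kept = []
    for candidate in records:
        for index, existing in enumerate(kept):
            if is_probable_duplicate(candidate, existing):
                if completeness(candidate) > completeness(existing):
                    kept[index] = candidate
                break
        else:
            kept.append(candidate)
    return kept


# Invented example: a poorly documented RT profile and its DM counterpart.
rt = {"lat": 43.5, "lon": 7.9, "time": datetime(2006, 3, 1, 12, 0),
      "source": "GTS", "cruise": None}
dm = {"lat": 43.501, "lon": 7.9, "time": datetime(2006, 3, 1, 12, 10),
      "source": "national data centre", "cruise": "CRUISE-001"}
far = {"lat": 36.0, "lon": 15.0, "time": datetime(2006, 3, 5, 8, 0),
       "source": "national data centre", "cruise": "CRUISE-002"}
best = keep_best([rt, dm, far])
```

Pairs that the function merges are candidates for a manual look at ship tracks and position maps before one copy is discarded.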
28
28 Template for TA web page All the images in the directory « Template_images »
29
29 Education and Outreach pages SDN-EDU.html
30
30 Concluding remarks. SeaDataNet is developing basic tools for implementing the data management activities in conformity with internationally agreed protocols. The NODC/DNA of the 40 TAP use either the common tools or the existing local systems, but they should be inter-comparable and compatible. The present infrastructure is not yet stabilized with regard to standards and available software, but the main functionalities are available to ensure the data circulation from the start of the project. Any new information, result or software is made immediately available on the website. It is important to develop a local page to connect, using the ENEA template.