Developing a National Plan for Glider Operations: Data Management
IOOS Glider Workshop, August 2012. Jim Potemra, UH
Goal
- Develop a national plan for glider operations; part of this plan should address the management of glider data
- Start discussion on developing a common plan for glider-to-archive data streams
- Motivation is to promote data interoperability as well as making data easier to access and use
National Plans
- Wave Plan: “…all data will flow through IOOS DAC operated by NDBC and CDIP using IOOS-DIF standards and metadata…”
- HFR Plan: “…data management principles…have been coordinated with IOOS DIF…”

The wave plan includes two pages on data management:
- Metadata: ISO to replace existing FGDC
- Data content: standardization of the data content (WMO format -> NWS Telecom gateway -> NOAA broadcast system NOAAPort, and -> GTS (WMO codes))
- Data archive: to be addressed in the implementation plan (NODC and NCDC involvement)

The HFR plan includes six pages on data management:
- Data description
- Data access and transport: OPeNDAP and OGC (plus a database suggestion)
- Metadata: netCDF CF (more what I’d call vocabulary)
- Archive: NDBC DAC
- Data QA/QC: real-time and delayed mode
Components of DMP (data management plan)
- Data formats and standards: include vocabulary, conventions, metadata
- Data services: how to get data to users
- Implementation: how will this get done and by whom; issues include incorporation of QA/QC
1. Data formats and standards
- Assumption 1: Essentially three types of gliders, roughly measuring similar variables
- Assumption 2: Raw formats may differ, but getting to an initial (near-real-time) netCDF file would be possible
- Proposal: netCDF data model with CF conventions (standardized vocabulary and units); a file sketch follows below
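To make the proposal concrete, here is a minimal sketch of what a near-real-time glider file following the netCDF/CF proposal could look like, written with the netCDF4-python library. The file name, dimension size, variable names, and the choice of featureType = "trajectory" are illustrative assumptions, not a format this plan has settled on.

```python
# Minimal sketch of a CF-convention glider file (illustrative names/values only)
import numpy as np
from netCDF4 import Dataset

n = 100  # made-up number of observations along the track

with Dataset("glider_nrt_example.nc", "w", format="NETCDF4") as nc:
    # Global attributes: CF conventions plus a discrete-sampling featureType
    nc.Conventions = "CF-1.6"
    nc.featureType = "trajectory"   # assumption; "trajectoryProfile" is another option
    nc.title = "Example near-real-time glider data (illustrative only)"

    nc.createDimension("obs", n)

    t = nc.createVariable("time", "f8", ("obs",))
    t.standard_name = "time"
    t.units = "seconds since 1970-01-01T00:00:00Z"

    lat = nc.createVariable("lat", "f8", ("obs",))
    lat.standard_name = "latitude"
    lat.units = "degrees_north"

    lon = nc.createVariable("lon", "f8", ("obs",))
    lon.standard_name = "longitude"
    lon.units = "degrees_east"

    z = nc.createVariable("depth", "f4", ("obs",))
    z.standard_name = "depth"
    z.units = "m"
    z.positive = "down"

    temp = nc.createVariable("temperature", "f4", ("obs",), fill_value=-999.0)
    temp.standard_name = "sea_water_temperature"
    temp.units = "degree_Celsius"
    temp.coordinates = "time lat lon depth"

    # Synthetic values so the file is complete and checkable
    t[:] = np.arange(n) * 10.0
    lat[:] = 21.0 + 0.001 * np.arange(n)
    lon[:] = -158.0 + 0.001 * np.arange(n)
    z[:] = np.linspace(0.0, 500.0, n)
    temp[:] = 25.0 - 0.02 * z[:]
```

The same structure would hold for salinity and the other measured variables; "standardized vocabulary and units" amounts to agreeing on the standard_name and units strings above.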
1. Data formats and standards (cont’d)
Advantages:
- Easy to implement (hopefully)
- In line with Argo, OceanSITES, IOOS
- Large user community and tool set

Disadvantages:
- Even within netCDF there are permutations
- The new featureType attribute might be slow to catch on
2. Data services
- Assumption 1: there will not be a single solution
- Assumption 2: the user community is well known
- Assumption 3: interoperability and open access are goals
- Proposal: distribute data through four main mechanisms:
  - Direct access: pilots and science PIs will likely access data via ssh to the server machine and/or an NFS-mounted disk
  - GTS: data to the GTS either directly off the modem or via a DAC/WMO center
  - ftp: local (and maybe remote) research use may want data transfer
  - OPeNDAP: remote users; automatic harvesting done via OPeNDAP (access sketch below)
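As an example of the OPeNDAP mechanism, the sketch below shows how a remote harvester might pull a subset without downloading whole files. The server URL and variable names are placeholders, not an actual glider data endpoint.

```python
# Sketch of remote, subsetting access over OPeNDAP (placeholder URL/variables)
from netCDF4 import Dataset

url = "http://example.org/opendap/gliders/mission_001.nc"  # hypothetical endpoint

# netCDF4-python opens OPeNDAP URLs when the underlying library has DAP support
ds = Dataset(url)
print(list(ds.variables))          # discover what the provider serves

# Slicing requests only the needed range from the server
temp = ds.variables["temperature"][:100]
time = ds.variables["time"][:100]
ds.close()
```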
2. Data services (cont’d)
Advantages:
- Easy to implement (hopefully)
- In line with IOOS and Argo
- Will cover almost all possible user requests

Disadvantages:
- Versioning will be difficult
- Several access logs
- Maintenance of different servers
3. Implementation
- Data archives
  - Distributed centers (data assembly centers) that provide equivalent services for seamless integration
  - Central assembly center where all data providers submit data (GDAC): NODC/NDBC/NCDC?
- Data files (QC issue…)
  - Single data stream with flags marking raw, real-time QC, and delayed-mode data (e.g., Argo); a flag-encoding sketch follows below
  - Two data streams (e.g., tide gauge)
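For the single-stream option, one possibility is to carry QC state alongside each measurement as a CF-style flag variable, in the spirit of Argo. The sketch below adds such a flag to the file from the earlier example; the flag table is illustrative, not a convention this plan has adopted.

```python
# Sketch: attach a QC flag variable to an existing measurement (illustrative flags)
import numpy as np
from netCDF4 import Dataset

with Dataset("glider_nrt_example.nc", "a") as nc:   # file from the earlier sketch
    qc = nc.createVariable("temperature_qc", "i1", ("obs",), fill_value=np.int8(9))
    qc.long_name = "quality flag for temperature"
    qc.standard_name = "sea_water_temperature status_flag"
    qc.flag_values = np.array([0, 1, 2, 3, 4, 9], dtype="i1")
    qc.flag_meanings = ("no_qc_performed good_data probably_good_data "
                        "probably_bad_data bad_data missing_value")

    qc[:] = 0   # raw / near-real-time state: no QC performed yet

    # Tie the flag to its data variable so tools can find it
    nc.variables["temperature"].ancillary_variables = "temperature_qc"
```

Delayed-mode QC would then update the flag values (and possibly adjusted variables) in place, rather than producing a second stream.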
QA/QC considerations
Different layers to this:
- Different data streams or over-write
- Done by provider, aggregator, or a separate team
- Either way, a documented plan would be helpful
- QA/QC extends to data and dimensions (e.g., how accurate are time/location; is this important?)
- Impacts the data file w.r.t. vocabulary and flags (so not just an issue of what tests are run); example tests are sketched below
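To illustrate what "what tests are run" might mean in practice, here are two very simple real-time checks (a gross range test on temperature and a plausibility check on position) that write results into flag values like those above. Thresholds and flag numbers are made-up examples, not proposed limits.

```python
# Sketch of two simple real-time QC tests producing flag values (made-up thresholds)
import numpy as np

GOOD, BAD = 1, 4   # illustrative flag values

def gross_range_flags(temp, valid_min=-2.5, valid_max=40.0):
    """Flag temperatures outside a broad, physically plausible range."""
    flags = np.full(temp.shape, GOOD, dtype=np.int8)
    flags[(temp < valid_min) | (temp > valid_max)] = BAD
    return flags

def position_flags(lat, lon):
    """Flag fixes that are not valid geographic coordinates."""
    flags = np.full(lat.shape, GOOD, dtype=np.int8)
    bad = (np.abs(lat) > 90) | (np.abs(lon) > 180) | ~np.isfinite(lat) | ~np.isfinite(lon)
    flags[bad] = BAD
    return flags

# Combine per-test results by keeping the worst (highest) flag for each point
temp = np.array([24.9, 25.1, 47.0, 24.8])
lat = np.array([21.3, 21.3, 21.3, 95.0])
lon = np.array([-158.1, -158.1, -158.1, -158.1])

combined = np.maximum(gross_range_flags(temp), position_flags(lat, lon))
print(combined)   # -> [1 1 4 4]
```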
3. Implementation (cont’d)
- Data governance
  - Data management team (real-time operators, delayed-mode QC)
  - Ad hoc, standards-based (articulate best practices and leave to providers)
- Misc
  - Provide a service to users?
  - Others?
Suggested Approach
User-driven: issue one is to identify users
- Scientific PI
- Researcher (non-PI)
- Operational modeling centers
- Re-analysis modeling
- Pilots
- General (non-scientific) users
Complete Picture
[Data-flow diagram: GLIDER -> Iridium modem -> Shore Station, where data processing, conversion to netCDF, and QA/QC are applied. From the shore station: ssh to the pilot console; ftp/cp to the Science PI; GTS to the Operational Center; ftp/ssh and ftp/cp to reanalysis modeling and the Archive Center; and a Data Service (OPeNDAP) serving the Researcher and non-science users over http.]
Data format and transport by user
- Pilot: raw format, direct access, all variables
- Science PI: netCDF, RT and DMQC levels
- Operational Center: BUFR/GRIB via GTS, variables T, S, u, v
- Reanalysis and Research: ftp/OPeNDAP, DMQC level
- General users: images/products via http/web
Suggested Approach (cont’d)
Data providers at the other end: issue one is to document existing practices
- Inventory of gliders?
- Three main types; data formats for these?
- Role of manufacturer?
Areas to consider:
- Starting point is the glider
  - What variables and formats need to be addressed?
  - All transmitting via Iridium?
- Multiple ending points
  - Pilots, scientists (PIs), scientists (research), modeling centers (real-time), model reanalysis studies (historical), other users (?)
  - Added aspect of regional and national viewers and/or aggregation centers
- Implementation
  - Federated (all groups carry on) or centralized (e.g., Argo) with “data assembly centers” (DACs)
  - Maintenance of two data streams: sea level (real-time and delayed mode), Argo (combination)
- Based on this goal, discuss development of a standard format(s) and possible standard transport mechanism(s)
  - Depending on time and interest, discussion on data format could extend to terminology
- Based on this goal, discuss an implementation plan
  - How to execute the data plan, e.g., distributed system, federated system, DACs, maintenance of real-time and delayed mode, etc.
Data Management Issues
- If goal is discovery: need a central catalog (service); a catalog-listing sketch follows below
- If goal is availability: need to provide a service (ftp, OPeNDAP, etc.)
- If goal is interoperability: need to settle on a common data model and/or service (netCDF with ftp)
- All sorts of other stuff: central vs. distributed archive
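As a sketch of what a central catalog could look like to a client, the snippet below lists dataset entries from a THREDDS catalog.xml. The catalog URL is a placeholder; a real deployment would publish its own.

```python
# Sketch: list datasets advertised in a (hypothetical) THREDDS catalog
import urllib.request
import xml.etree.ElementTree as ET

CATALOG_URL = "http://example.org/thredds/gliders/catalog.xml"   # placeholder
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

with urllib.request.urlopen(CATALOG_URL) as resp:
    tree = ET.parse(resp)

# Each <dataset> element with a urlPath is something a client can request
for ds in tree.iter("{%s}dataset" % THREDDS_NS):
    name = ds.get("name")
    url_path = ds.get("urlPath")
    if url_path:
        print(name, "->", url_path)
```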
IOOS model thus far
- All data available ASAP
- All data available via a “standard” service: OPeNDAP/THREDDS, SOS, ftp, ERDDAP/vis tools (ERDDAP access sketch below)
- The data service more or less dictates the format/model: netCDF
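For the ERDDAP piece, access is URL-driven; the sketch below pulls a tabular subset as CSV. The server name, dataset ID, and variable names are placeholders for whatever a regional association actually serves.

```python
# Sketch of an ERDDAP tabledap request (placeholder server/dataset/variables)
import pandas as pd

base = "http://example.org/erddap/tabledap/glider_mission_001"   # hypothetical
query = (".csv?time,latitude,longitude,depth,temperature"
         "&time>=2012-06-01T00:00:00Z&time<=2012-06-02T00:00:00Z")
# Note: some HTTP clients require the constraint operators to be percent-encoded.

# ERDDAP .csv responses put units in a second header row; skip it
df = pd.read_csv(base + query, skiprows=[1])
print(df.head())
```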
Data availability: IOOS RA
IOOS has 11 Regional Associations. The availability of glider data via these RAs falls into three broad categories:
- No obvious link to glider data or plots: AOOS (Alaska), CaRA (Caribbean), GLOS (Great Lakes), GCOOS (Gulf Coast), NERACOOS (Northeast Atlantic), SECOORA (Southeast Atlantic)
- Some data available via OPeNDAP, limited plots/maps: CenCOOS (Central California, single mission), PacIOOS
- Data, maps, and viewer: MARACOOS (Mid-Atlantic, Rutgers), NANOOS (Pacific Northwest, APL/UW), SCCOOS (Southern California, Scripps)
Data availability: NOAA/NODC
GTSPP:
- Data by name (e.g., pacific/2012/06): gtspp_ _te_111.nc files
- Files have featureType: profile

Deepwater Horizon (…r_float.html#glider):
- Single lat/lon/time per profile: Temp(time,depth,lat,lon) for a single time, lat, lon
Data availability: NOAA/NDBC
Data list and pre-made profile plots
Data availability: other
- UW/APL
- Scripps
- Rutgers
- C-MORE/HOT