ICOADS Archive Practices at NCAR JCOMM ETMC-III 9-12 February 2010 Steven Worley.


Topics
- Environment setting
- Data management tools and principles
- ICOADS NCAR Release 2.5 contributions
- Background Collections
- Future Challenges

Environment Setting
- ICOADS is part of a larger collection called the Research Data Archive (RDA)
- RDA, briefly
  - 600+ datasets (atmosphere, ocean, geosciences)
  - 4.3M files, 462 TB (primary data)
  - unique users annually, including ICOADS
  - Staff: 7 scientific programmers (M.S. degrees), me, and an administrative assistant

Data management principles
- Always archive 2 copies of observational data
  - 3rd copy at a partner center (disaster recovery)
- Free and open data access world-wide
  - Internet
  - Past: other media, CD-ROMs, tapes, etc.
- Share what we have to build archives
  - E.g. digitization of Maury data in China in exchange for global land surface data

Data Management Tools
- Old system: specialized software to manage each data input. Inefficient, difficult to scale.
- New system: common RDA tools that homogenize data management. Efficient, scalable.
[Diagram labels: Unidata Server, University Server, NWP Server; Specialized Software Packages 1-3 (old system); RDA Data Management Common Tool Set (new system); RDA Metadata Database, Online Disk, Tape Storage, GCMD Metadata Server, RDA Data Server]

Data Management tools – a few details
- Common scripting structure to do routine dataset updates (dsupdt)
  - Very tunable
  - Frequency, multiple server priority list, validation
  - Fully integrated with RDADB
  - User's view is automatically updated and therefore always current
- Common single archiving function (dsarch)
  - Location and copy control (MSS/HPSS storage, and online disk)
  - Fills all DB entries (e.g. file and dataset relationships)
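The "multiple server priority list, validation" idea can be illustrated with a minimal sketch. This is not the dsupdt implementation, which the slides do not show; the function name `fetch_with_fallback` and the `fetch` and `validate` callables are hypothetical placeholders supplied by the caller.

```python
from typing import Callable, Iterable, Optional, Tuple


def fetch_with_fallback(
    servers: Iterable[str],
    fetch: Callable[[str], bytes],
    validate: Callable[[bytes], bool],
) -> Optional[Tuple[str, bytes]]:
    """Try each server in priority order and return the first payload that validates.

    All names here are illustrative assumptions, not part of any RDA tool.
    """
    for server in servers:
        try:
            payload = fetch(server)
        except OSError:
            # Server unreachable: fall through to the next one in the list.
            continue
        if validate(payload):
            return server, payload
    # No server delivered a payload that passed validation.
    return None
```

Passing the fetch and validation steps in as callables keeps the priority logic independent of any particular transfer protocol or file format.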

Data management tools
- Harvest file level metadata (gatherxml)
  - Handle various formats (GRIB1, GRIB2, netCDF, BUFR, IMMA, ON29, etc.)
  - Save as and populate DB
- Benefits
  - Problem detection
  - Versioning, replacement, extension
  - Inventory information
  - Drive better data service for users
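As a rough illustration of how file level metadata supports inventory information and problem detection, the sketch below summarizes the observation dates found in one file. It does not reproduce gatherxml and does not parse any of the listed formats; `build_inventory` and its output layout are assumptions made for this example.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable


def build_inventory(record_dates: Iterable[date]) -> Dict[str, object]:
    """Summarize the dates of the records in one file (hypothetical layout)."""
    dates = list(record_dates)
    if not dates:
        # An empty file is itself a problem worth flagging.
        return {"count": 0, "start": None, "end": None, "per_month": {}}
    # Count records per calendar month; a missing or thin month stands out.
    per_month = Counter(d.strftime("%Y-%m") for d in dates)
    return {
        "count": len(dates),
        "start": min(dates),
        "end": max(dates),
        "per_month": dict(sorted(per_month.items())),
    }
```

A summary like this, stored per file, is enough to answer "what period does this file cover" without reopening the data.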

Data management tools
- Provide access to data in tape storage archive (dsrqst)
  - Relatively new, not universally available across RDA yet
  - Delayed mode, with DB control (many details)
- Why: RDA holds 462 TB
  - 40 TB online: most popular small scale products
  - Access to more products for greater community

ICOADS Release 2.5 NCAR
- Data preparation: format evaluations, translate native formats to IMMA format
  - Moored research buoy delayed mode archives
    - TAO, PIRATA (PMEL, JAMSTEC)
  - World Ocean Database 2005
    - Multiple ocean profile types (NODC)
- Receive/archive ICOADS data processing results
  - NOAA/ESRL does processing: source merging, duplicate elimination, preconditioning deletion and fixes, etc.

ICOADS Release 2.5 NCAR
- Create and maintain user data access interfaces
  - File access
    - IMMA and binary (observations, monthly summary statistics)
  - Sub-selection (time, space, parameter)
    - Example coming.
    - Output is ASCII tabular format
    - Runs automatically – nearly all requests completed in 10 minutes
  - Keep user metrics
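Sub-selection by time, space, and parameter with ASCII tabular output can be sketched as follows. This is an illustration only, not the NCAR interface: the `Obs` record, the function names, and the column layout are assumptions, and the simple longitude test does not handle boxes that cross the date line.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, Iterator, List, Sequence, Tuple


@dataclass
class Obs:
    """Hypothetical observation record (not the IMMA layout)."""
    day: date
    lat: float
    lon: float
    values: Dict[str, float]  # parameter name -> value


def subset(
    observations: Iterable[Obs],
    start: date,
    end: date,
    lat_range: Tuple[float, float],
    lon_range: Tuple[float, float],
    parameters: Sequence[str],
) -> Iterator[List[object]]:
    """Yield one row per observation inside the time window and lat/lon box."""
    for ob in observations:
        if not (start <= ob.day <= end):
            continue
        if not (lat_range[0] <= ob.lat <= lat_range[1]):
            continue
        if not (lon_range[0] <= ob.lon <= lon_range[1]):
            continue
        # Keep only the requested parameters; missing ones become None.
        yield [ob.day.isoformat(), ob.lat, ob.lon] + [ob.values.get(p) for p in parameters]


def to_ascii_table(rows: Iterable[List[object]], parameters: Sequence[str]) -> str:
    """Format rows as right-aligned, whitespace-separated columns."""
    header = ["date", "lat", "lon"] + list(parameters)
    lines = [" ".join(f"{name:>10}" for name in header)]
    for row in rows:
        cells = ["" if value is None else str(value) for value in row]
        lines.append(" ".join(f"{cell:>10}" for cell in cells))
    return "\n".join(lines)
```

Because `subset` is a generator, rows can be streamed to the output without holding the full selection in memory.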

ICOADS Release 2.5 NCAR
- Near-term preliminary extensions to R2.5
  - Beginning with data in 2008 and forward
  - Based on NCEP GTS compilation/merge
  - Runs on day 2 of each month – processes previous month.
  - Create IMMA observations and binary monthly summary statistics
  - Harvest file level metadata
  - Do all archiving of original and processed files
  - Automatically update user interfaces
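The "runs on day 2, processes previous month" schedule comes down to a small date calculation, sketched here. The function names are hypothetical and the actual scheduling mechanism is not described in the slides.

```python
from datetime import date
from typing import Tuple


def previous_month(run_date: date) -> Tuple[int, int]:
    """Return (year, month) of the calendar month before run_date."""
    if run_date.month == 1:
        # A January run processes December of the prior year.
        return run_date.year - 1, 12
    return run_date.year, run_date.month - 1


def should_run(today: date, run_day: int = 2) -> bool:
    """True on the configured day of the month (day 2 per the slide)."""
    return today.day == run_day
```

For example, a run on 2 January 2010 would process December 2009.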

Brief drive through NCAR

World-wide User Access

File Level Metadata – ICOADS IMMA Example

8 pages of information like this

A look at 2009

What is happening in 2009?

World-wide User Access

Similar service for the monthly summary statistics

Who uses the sub-setting interfaces? Countries

Background Collections
- Historical
  - Most complete set of ALL source data used to create ALL ICOADS Releases
    - Beginning in mid-1980s
  - Copies of ALL ICOADS Releases
  - We do not delete any files

Background Collections
- Ongoing / routine data receipts
- Format conversions are done at NCDC

Description          Source         Frequency
Marine Surface GTS   NCEP (BUFR)    Monthly
Marine Surface GTS   NCDC (IMMA)    Monthly
SEAS                 NCDC (IMMA)    Monthly
Keyed                NCDC (IMMA)    Monthly (nominally)
GCC                  NCDC (IMMA)    Quarterly (nominally)
VOSClim              NCDC (IMMA)    Monthly

Future Challenges
- Eliminate user interface dependency on Java applets; deploy JavaScript instead.
- Support “advanced” ICOADS initiative
  - Bias adjusted / corrected observations
  - Serve as a central DB / handle data ingest
  - Build a user interface
- Continue as a full U.S. partner.