Gridded Data Sub-setting Services through the RDA at NCAR
Doug Schuster, Steve Worley, Bob Dattore, Dave Stepaniak

Presentation transcript:


Gridded Data Sub-setting Services Through the RDA at NCAR
- Research Data Archive (RDA) Overview
- Problem Background
- Required Infrastructure
- Current Services
- Future Directions

RDA Overview
- Total archive volume over 1.3 PB
- [...] unique users annually
- Meteorological and Oceanographic Observations
- Operational and Reanalysis model outputs
- Remote Sensing Observations
- Topography/Bathymetry, Vegetation, Land Use

Problem Background
- Data Volume

Problem Background
Large computational/storage resources needed
- Store data
- Extract desired data from large grids/files
- Convert data to desirable format(s)
Scientific data centers have these resources; individual researchers generally don’t.

Problem Background
Goals
- Make data more accessible and easier to use for individual researchers
  - Reasonable access volumes
  - Desired data formats
  - User-defined parameters/grids
Researchers stay focused on research.

Required Infrastructure
- Powerful computing: NCAR HPC/DAV
- Large disk storage (500 TB)
- Rich and detailed metadata databases (RDADB)
- Generalized software tools
  - Control system (RDAMS)
  - Sub-setting
  - Format conversion
- Web interface
- Command line interface

Required Infrastructure
Rich Metadata Databases (key ingredient)
Metadata DB
- File attribute metadata: Name, Dataset, Location, Format
- File content metadata: T(C,D,T,L,L), RH(C,D,T,L,L), Vort(C,D,T,L,L), Vis(C,D,T,L,L), PcpR(C,D,T,L,L)
Role of the metadata
- Drive interfaces
- Support efficient backend processing
- Provide scalability
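
To illustrate why content metadata supports efficient backend processing, here is a minimal sketch of a metadata lookup that resolves a request to a list of files without opening any data file. The record schema, the dataset name "ds-demo", and the file names are hypothetical and chosen only for illustration; the sketch does not attempt to reproduce the actual RDADB layout or the slide's T(C,D,T,L,L) notation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class GridRecord:
    """One file-content metadata row (hypothetical schema, not RDADB's)."""
    file_name: str
    dataset: str
    parameter: str  # e.g. "T", "RH", "Vort"
    start: date     # first valid date covered by this parameter in the file
    end: date       # last valid date covered by this parameter in the file


def find_files(records, dataset, parameters, start, end):
    """Return sorted file names holding any requested parameter in [start, end]."""
    wanted = set(parameters)
    hits = {
        r.file_name
        for r in records
        if r.dataset == dataset
        and r.parameter in wanted
        and r.start <= end and r.end >= start  # the two date ranges overlap
    }
    return sorted(hits)


# Hypothetical catalog contents.
RECORDS = [
    GridRecord("a.grb", "ds-demo", "T", date(2000, 1, 1), date(2000, 1, 31)),
    GridRecord("a.grb", "ds-demo", "RH", date(2000, 1, 1), date(2000, 1, 31)),
    GridRecord("b.grb", "ds-demo", "T", date(2000, 2, 1), date(2000, 2, 29)),
    GridRecord("c.grb", "ds-demo", "Vort", date(2000, 1, 1), date(2000, 1, 31)),
]

if __name__ == "__main__":
    print(find_files(RECORDS, "ds-demo", ["T"], date(2000, 1, 15), date(2000, 2, 10)))
```

Because the lookup touches only metadata rows, the same query can also drive a web form (which parameters and dates exist) as well as the backend job that later reads the selected files.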

Current Services
Sub-setting available on 13 datasets
- ERA-I, CFSR, Operational Model, EaSM
- Also available on select observation sets
Sub-setting options
- Parameter selection
- Spatial region selection (limited availability)
Available output formats
- Native GRIB formats
- NetCDF format
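
The two sub-setting options listed above, parameter selection and spatial region selection, can be sketched on a toy regular latitude/longitude grid as follows. This is an illustrative sketch in plain Python under simple assumptions (nested lists, an inclusive bounding box that does not cross the longitude wrap-around), not the RDA's subsetting software; the function names and sample values are hypothetical.

```python
def subset_grid(lats, lons, values, south, north, west, east):
    """Crop a regular lat/lon grid to an inclusive bounding box.

    values[i][j] is the grid value at (lats[i], lons[j]).
    Assumption: the box does not cross the longitude wrap-around.
    """
    rows = [i for i, lat in enumerate(lats) if south <= lat <= north]
    cols = [j for j, lon in enumerate(lons) if west <= lon <= east]
    return (
        [lats[i] for i in rows],
        [lons[j] for j in cols],
        [[values[i][j] for j in cols] for i in rows],
    )


def subset_fields(fields, parameters, lats, lons, box):
    """Keep only the requested parameters, each cropped to box.

    box is (south, north, west, east); the result maps each parameter
    name to a (lats, lons, values) tuple.
    """
    return {name: subset_grid(lats, lons, fields[name], *box) for name in parameters}


# Hypothetical 3 x 4 grid holding two parameters.
LATS = [30.0, 40.0, 50.0]
LONS = [250.0, 260.0, 270.0, 280.0]
FIELDS = {
    "T": [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]],
    "RH": [[10, 20, 30, 40], [50, 60, 70, 80], [90, 100, 110, 120]],
}

if __name__ == "__main__":
    print(subset_fields(FIELDS, ["T"], LATS, LONS, (35.0, 55.0, 255.0, 275.0)))
```

In this toy case the request keeps one of two parameters and 4 of 12 grid points, which is the kind of volume reduction that makes downloads reasonable for an individual researcher.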

Current Services
- Sub-set requests are processed in delayed mode
- User is notified by email when the request is ready
- Data are downloaded via server-provided wget scripts
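
The delayed-mode workflow described on this slide (submit, process later, notify, download with a wget script) can be sketched as a small in-memory queue. Everything here is an assumption for illustration: the class and function names, the base URL "https://data.example.org/requests", and the notification callback are hypothetical and do not describe the RDAMS control system. The only external tool referenced is wget with its standard `-N` (timestamping) option.

```python
import shlex
from collections import deque


def build_wget_script(base_url, request_id, file_names):
    """Return a shell script that downloads each output file with wget."""
    lines = ["#!/bin/sh"]
    for name in file_names:
        url = f"{base_url}/{request_id}/{name}"
        # -N: skip files that are already up to date locally.
        lines.append(f"wget -N {shlex.quote(url)}")
    return "\n".join(lines) + "\n"


class RequestQueue:
    """Toy delayed-mode queue: submit now, process later, then notify."""

    def __init__(self, notify):
        self._pending = deque()
        self._notify = notify  # callable(user, request_id), e.g. an email sender
        self._next_id = 1
        self.status = {}       # request_id -> "queued" or "ready"

    def submit(self, user, file_names):
        """Record a request and return its id without doing any processing."""
        request_id = self._next_id
        self._next_id += 1
        self._pending.append((request_id, user, list(file_names)))
        self.status[request_id] = "queued"
        return request_id

    def process_next(self, base_url):
        """Process the oldest request; return its wget script, or None if idle."""
        if not self._pending:
            return None
        request_id, user, file_names = self._pending.popleft()
        script = build_wget_script(base_url, request_id, file_names)
        self.status[request_id] = "ready"
        self._notify(user, request_id)
        return script


if __name__ == "__main__":
    queue = RequestQueue(lambda user, rid: print(f"notify {user}: request {rid} ready"))
    queue.submit("user@example.org", ["t_subset.grb"])
    print(queue.process_next("https://data.example.org/requests"))
```

Decoupling submission from processing lets the backend schedule heavy extraction jobs when computing resources are free, while the user only needs to wait for the notification and run the script.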

Future Directions
- Spatial Interpolation
- Faster Request Processing (NWSC)
- Include More RDA Datasets
- Improved Access Portals
- Additional Output Formats
- Web Service Access

Summary
Data Analysis Research Challenges
- Large and Growing Data Volumes
- Numerous Formats
RDA: Supply “User Friendly” Data
- Parameter and Spatial Sub-Setting
- Format Conversion
- Improved and Additional Services