5-7 May 2003 SCD Exec_Retr 1 Research Data, May 5-7 2003 Archive Content New Archive Developments Archive Access and Provision.

Presentation transcript:

5-7 May 2003 SCD Exec_Retr 1 Research Data, May 5-7 2003
Archive Content
New Archive Developments
Archive Access and Provision

5-7 May 2003 SCD Exec_Retr 2 Archive Content Systematic Dataset Updates –Significant and important effort

5-7 May 2003 SCD Exec_Retr 3

5-7 May 2003 SCD Exec_Retr 4 Systematic Dataset Updates, Strategies
Absolutely must keep doing this
–“DSS bread and butter”
Trend will be toward more network transfers
–Some more frequently, per Forum suggestions
More behind-the-scenes work
–Tighter data integrity checks
–Identify data gaps and unit changes
Maintain media transfer capability for both input and output
–Tapes and CD-ROMs are necessary
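
One of the integrity checks named on this slide, identifying data gaps, can be illustrated with a minimal Python sketch. It assumes records carry a timestamp and arrive at a fixed expected interval; the function name `find_gaps`, the 6-hourly interval, and the sample times are hypothetical and not taken from the DSS tools. Detecting unit changes would need a separate check on the data values and is not covered here.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_step):
    """Return (earlier, later) pairs whose spacing exceeds expected_step."""
    ordered = sorted(timestamps)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > expected_step:
            gaps.append((earlier, later))
    return gaps


# Hypothetical 6-hourly record times; the 12:00 and 18:00 reports are missing.
times = [
    datetime(2003, 5, 5, 0),
    datetime(2003, 5, 5, 6),
    datetime(2003, 5, 6, 0),
    datetime(2003, 5, 6, 6),
]

gaps = find_gaps(times, timedelta(hours=6))
for start, end in gaps:
    print(f"gap between {start:%Y-%m-%d %H:%M} and {end:%Y-%m-%d %H:%M}")
```

Sorting first keeps the check independent of the order in which records were received, which matters when updates arrive by a mix of network and media transfers.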

5-7 May 2003 SCD Exec_Retr 5 Archive Content
Harvest, apply, and make more metadata available for the users
–Use NNR metadata to fix problems in the databases; applies to forthcoming Reanalyses also
–Provide users with relevant metadata
–Systematically apply metadata, e.g. station history libraries

5-7 May 2003 SCD Exec_Retr 6

5-7 May 2003 SCD Exec_Retr 7 New Archive Developments
Acquisition of new datasets
–ECMWF’s ERA40: LTO tape transfer, 15 TB, production finished (reruns?)
–NCEP’s Regional Reanalysis: network transfer, 12 TB, production started
–More ocean datasets for climate analysis and modeling: response to NSF Panel recommendation; will look for collaboration opportunities

5-7 May 2003 SCD Exec_Retr 8 New Archive Developments
Acquisition of new datasets
–New near real-time collections from the UNIDATA server
Currently backing up 2.7 GB/day, beginning Nov
–Includes global station observations, low-resolution model products, radar data, and profiler data
Collect a few more products
Build DSS datasets (metadata) and online access
–Do more to help other Divisions: “they are a barometer of the University community”
e.g. with little effort we can help MMM, per Wei Wang’s comments at the Forum
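
To put the 2.7 GB/day backup rate in archive terms, the short calculation below projects it over a year. It assumes the rate stays constant and uses decimal units (1 TB = 1000 GB); both are assumptions made for illustration, not statements from the slides.

```python
DAILY_GB = 2.7  # backup rate quoted on the slide

# Assumes a constant rate over 365 days and 1 TB = 1000 GB.
yearly_gb = DAILY_GB * 365
print(f"{yearly_gb:.1f} GB/year, about {yearly_gb / 1000:.2f} TB/year")
```

Under these assumptions the near real-time collections add roughly 1 TB per year.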

5-7 May 2003 SCD Exec_Retr 9 Archive Access and Provision
Filling one-off data requests for users TBC
Data Discovery
–Improved DSS guide documents to datasets, separated by research needs, e.g. precipitation
–Improved UCAR-wide search success through structured metadata catalogs, e.g. THREDDS catalogs
–Better linkages between DSS and USS primers
Integrated view for the users!
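
As a rough illustration of how a structured metadata catalog supports search, the Python sketch below walks a small THREDDS-style catalog and returns the datasets whose names match a keyword. The catalog content, dataset names, and `urlPath` values are hypothetical, and `search_catalog` is an illustrative helper rather than part of any DSS or THREDDS software. The namespace is the one used by THREDDS InvCatalog 1.0 documents.

```python
import xml.etree.ElementTree as ET

# Namespace used by THREDDS InvCatalog 1.0 documents.
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

# Hypothetical catalog; names and paths are invented for illustration.
SAMPLE_CATALOG = """
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         name="Hypothetical DSS catalog">
  <dataset name="Precipitation">
    <dataset name="Monthly station precipitation"
             urlPath="hypothetical/precip/monthly.nc"/>
    <dataset name="Daily gridded precipitation"
             urlPath="hypothetical/precip/daily.nc"/>
  </dataset>
  <dataset name="Upper air">
    <dataset name="Radiosonde soundings"
             urlPath="hypothetical/upperair/raob.nc"/>
  </dataset>
</catalog>
"""


def search_catalog(catalog_xml, keyword):
    """Return (name, urlPath) for accessible datasets whose name contains keyword."""
    root = ET.fromstring(catalog_xml.strip())
    hits = []
    for node in root.iter(f"{{{THREDDS_NS}}}dataset"):
        name = node.get("name", "")
        url_path = node.get("urlPath")
        # Container datasets have no urlPath and are skipped.
        if url_path and keyword.lower() in name.lower():
            hits.append((name, url_path))
    return hits


hits = search_catalog(SAMPLE_CATALOG, "precipitation")
for name, path in hits:
    print(f"{name}: {path}")
```

Because every dataset carries the same attributes, one query can run unchanged across catalogs from different groups, which is the sense in which structured catalogs could improve UCAR-wide search.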

5-7 May 2003 SCD Exec_Retr 10 Archive Access and Provision
Access from the MSS
–New datasets will NOT be in COS blocked format
–Provide more and better tools with the datasets
Simplify access programs, provide them in other languages
Provide COS unblocking scripts for many computer platforms
–Have this, but it is not well advertised
Provide helpful format conversion scripts
–e.g. NCL or command-line commands to convert GRIB to netCDF.
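
A conversion script of the kind mentioned above would typically start by confirming what format it has been handed. The Python sketch below is one way to do that from the first bytes of a file; it does not perform the GRIB to netCDF conversion itself, which is left to NCL or command-line tools. The function name `sniff_format` is hypothetical. The check relies on GRIB messages beginning with the ASCII characters "GRIB" with the edition number in the eighth octet, and on classic netCDF files beginning with "CDF" followed by a version byte. It assumes the file starts directly with the message, with no leading transmission header.

```python
def sniff_format(header: bytes) -> str:
    """Classify a file from its first bytes; 'unknown' if nothing matches."""
    if len(header) >= 8 and header[:4] == b"GRIB":
        # Octet 8 of the indicator section holds the GRIB edition number.
        return f"GRIB{header[7]}"
    if len(header) >= 4 and header[:3] == b"CDF" and header[3] in (1, 2):
        # Version byte 1 is the classic format, 2 the 64-bit offset variant.
        return "netCDF classic"
    return "unknown"


# Hand-built headers standing in for the start of real files.
print(sniff_format(b"GRIB\x00\x00\x54\x01"))
print(sniff_format(b"CDF\x01\x00\x00\x00\x00"))
print(sniff_format(b"plain text"))
```

In practice the header would come from `open(path, "rb").read(8)`, and only files reported as GRIB would be passed on to the converter.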

5-7 May 2003 SCD Exec_Retr 11 New Archive Developments
Stay involved with research projects that require data, e.g. reanalyses and CLIVAR
–Why?
Focus on acquiring new data
Focus on improving extant observational archives
Leading recipient of new research data output
–Benefit to our users

5-7 May 2003 SCD Exec_Retr 12 Archive Access and Provision Access from the DSS server –Summary of status Bottom line: It is working well!

5-7 May 2003 SCD Exec_Retr 13 Access from the DSS server
Statistics Period: Jan - Dec 2002 (web only)
Server Activity:
–Total volume transferred by the server: 840 GB
User Activity:
–Number of unique users downloading data files: 9136
–Number of repeat users downloading data files: 946
Top Ten for 2002 Total = 275 different datasets
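
Statistics of this kind can be derived from a download log. The Python sketch below shows one possible tally, assuming each log record holds a user, a dataset, a date, and a byte count, and assuming a "repeat user" is one who downloaded on more than one distinct day. The slides do not define the term, so that definition, the record layout, and the sample records are all hypothetical.

```python
from collections import defaultdict


def summarize_downloads(records):
    """Tally users, repeat users, datasets, and volume from download records."""
    days_by_user = defaultdict(set)
    datasets = set()
    total_bytes = 0
    for user, dataset, day, nbytes in records:
        days_by_user[user].add(day)
        datasets.add(dataset)
        total_bytes += nbytes
    # Assumed definition: a repeat user downloaded on more than one day.
    repeat_users = sum(1 for days in days_by_user.values() if len(days) > 1)
    return {
        "unique_users": len(days_by_user),
        "repeat_users": repeat_users,
        "datasets": len(datasets),
        "total_bytes": total_bytes,
    }


# Hypothetical log records: (user, dataset, day, bytes).
log = [
    ("alice", "dataset-A", "2002-01-03", 500),
    ("alice", "dataset-A", "2002-02-10", 700),
    ("bob", "dataset-B", "2002-01-03", 300),
    ("carol", "dataset-A", "2002-03-01", 200),
    ("carol", "dataset-A", "2002-03-01", 100),
]

summary = summarize_downloads(log)
print(summary)
```

Applied to the slide's own figures, 946 repeat users out of 9136 unique users is about 10.4 percent, and 840 GB spread over 9136 users averages roughly 92 MB per user in decimal units.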

5-7 May 2003 SCD Exec_Retr 14 Access from the DSS server
Much more data freely online for Web and FTP download
More online data request forms
–large-scale data extractions, delayed mode processing
More real-time processing of data requests
–small-scale data extractions
Note: these will be complementary access functions with the CDP
Currently developing a server upgrade plan (with DSG) to make this possible.

5-7 May 2003 SCD Exec_Retr 15 Access from the CDP
Make Reanalysis products available
–Push to full scale for the first time
–40+ years gridded atmospheric data
–About 2 TB for two products
Service concerns that require significant effort
–Authorization and authentication of users
–Integrated discovery and access interfaces
Coordinated service with DSS server
–Ensure prompt response to user requests
Build on this experience – other datasets will follow

5-7 May 2003 SCD Exec_Retr 16 Access from the CDP
Should the DSS server and CDP be staged from the same machine?
Pros
–Co-located CPU and disk
Could mean fast multiple services
–Remove one server from the long list of servers that need administration, maintenance, and upgrade.

5-7 May 2003 SCD Exec_Retr 17 Access from the CDP
Cons
–The DSS server must be available 24x7 and provide a stable level of service.
–In contrast, the CDP is a developing facility and may not always be stable.
Access testing, software testing, and initial debugging
–Shared disk cannot be used to reduce data storage requirements.
SANs may offer some options in the future.
To protect the current successful DSS service
–Keep the DSS server and CDP separate for now
–Reconsider merging in the future