Robert Dattore and Steven Worley

Presentation transcript:

The Research Data Archive at NCAR: A Metadata System that Enables Discovery Across a Diverse Archive
Robert Dattore and Steven Worley
National Center for Atmospheric Research, Boulder, CO, USA
01/25/2011, AMS 2011

Outline
- Introduction
- RDA - Then
- RDA - Now
- Data Discovery

Introduction
- Purpose: support climate & weather research at NCAR; services are extended worldwide as resources permit
- Observations, derived products; focus on historical atmosphere/ocean data
- Metrics
  - Established in 1960s
  - 600+ datasets, 4M files, 600 TB
  - 7000 users annually

Introduction
- Changing data landscape
  - Then: small datasets, single country/experiment, specialized formats
  - Now: global coverage, high spatial/temporal resolutions, standard formats
- Result and challenge
  - Lots of diversity
  - How can we provide uniform discovery?

Then

Then
- Increasing data diversity, evolving technology; difficult to develop good systematic discovery
  - README files, directory names
  - Primarily via personal communications
- Major limiting factor: insufficient metadata
  - No metadata standard, dictionaries
  - Collection not uniform across all datasets
  - Rigidly-structured flat ASCII files
  - Archiving separate from metadata collection
- Bottom line: Unscalable System! We needed to make a change

Now

Now
- Developed local standard for discovery based on DIF [1] & THREDDS [2]; applied across all datasets
  - Adopted GCMD [3] controlled vocabularies
  - Local enhancements, e.g. data formats
- Harvest two types of file metadata
  - File attribute: name, size, compression, ...
  - File content: variables, levels, date range, ...
- Storage using XML

[1] Directory Interchange Format, NASA/GCMD; [2] Thematic Realtime Environmental Distributed Data Services; [3] Global Change Master Directory
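The two kinds of harvested file metadata and their XML storage could look roughly like the sketch below. It is only an illustration: the element and attribute names (`file`, `variable`, `dateRange`), the suffix-to-compression table, and the sample file are all assumptions, not the RDA's actual schema or tools.

```python
import gzip
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical mapping from file suffix to a compression label.
COMPRESSION_BY_SUFFIX = {".gz": "gzip", ".bz2": "bzip2", ".Z": "compress"}


def harvest_file_attributes(path):
    """Collect attribute-level metadata (name, size, compression) for one file."""
    suffix = os.path.splitext(path)[1]
    return {
        "name": os.path.basename(path),
        "size_bytes": str(os.path.getsize(path)),
        "compression": COMPRESSION_BY_SUFFIX.get(suffix, "none"),
    }


def to_xml(attributes, content):
    """Serialize attribute and content metadata as one XML record (made-up element names)."""
    root = ET.Element("file", attributes)
    for variable in content["variables"]:
        ET.SubElement(root, "variable").text = variable
    ET.SubElement(root, "dateRange", start=content["start"], end=content["end"])
    return ET.tostring(root, encoding="unicode")


def demo():
    """Harvest attributes from a throwaway gzip file and pair them with sample content metadata."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.grb.gz")
        with gzip.open(path, "wb") as handle:
            handle.write(b"placeholder bytes, not real GRIB data")
        attributes = harvest_file_attributes(path)
    # Content metadata would come from a format-specific reader; it is hard-coded here.
    content = {"variables": ["temperature", "pressure"], "start": "1979-01-01", "end": "1979-01-31"}
    return to_xml(attributes, content)
```

In a real harvester the content metadata would be read from the data file itself by a format-aware decoder, which is outside the scope of this sketch.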

Metadata Collection

Metadata Collection
- Tools that automatically capture file metadata
  - Integrated with archiving activities
- Web-based GUI: guided entry of dataset discovery metadata
  - Required fields, constrained entries
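"Required fields, constrained entries" amounts to validating each dataset record before it is accepted. A minimal sketch of that idea follows; the field names and the allowed data formats are hypothetical, since the slides do not list the actual required fields or vocabularies.

```python
# Hypothetical required fields for a dataset discovery record.
REQUIRED_FIELDS = ("title", "summary", "data_format")

# Hypothetical controlled vocabulary; the real lists are not given in the slides.
ALLOWED_VALUES = {"data_format": {"GRIB", "netCDF", "ASCII"}}


def validate_entry(entry):
    """Return a list of problems; an empty list means the entry passes."""
    problems = []
    # Required fields must be present and non-blank.
    for field in REQUIRED_FIELDS:
        if not entry.get(field, "").strip():
            problems.append(f"missing required field: {field}")
    # Constrained fields must use a value from the controlled list.
    for field, allowed in ALLOWED_VALUES.items():
        value = entry.get(field)
        if value and value not in allowed:
            problems.append(f"{field} must be one of {sorted(allowed)}, got {value!r}")
    return problems
```

A web form built on such a check would typically offer the constrained fields as drop-down menus, so that invalid values cannot be typed in the first place.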

Relational Databases

Relational Databases
- Fast access
- Dataset discovery metadata
  - Single database (~0.3M rows)
- File attribute metadata
  - Single database (~45M rows)
  - Maintains dataset/data file relationships
- File content metadata
  - Four databases structured to handle diversity of data (~920M rows)
  - Maintains detailed parameter relationships
- All together, support accurate data discovery
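The three metadata layers and the relationships between them can be illustrated with a toy schema. This sketch makes several assumptions: it collapses the separate databases described above into three tables in one in-memory SQLite database, and the table names, column names, dataset ID, and sample rows are all invented for the example.

```python
import sqlite3


def build_demo_database():
    """Create a tiny in-memory database with one table per metadata layer."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dataset (id TEXT PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE data_file (
            id INTEGER PRIMARY KEY,
            dataset_id TEXT NOT NULL REFERENCES dataset(id),
            name TEXT NOT NULL,
            size_bytes INTEGER NOT NULL
        );
        CREATE TABLE file_content (
            file_id INTEGER NOT NULL REFERENCES data_file(id),
            variable TEXT NOT NULL,
            start_date TEXT NOT NULL,
            end_date TEXT NOT NULL
        );
        """
    )
    conn.execute("INSERT INTO dataset VALUES ('dsX', 'Example reanalysis')")
    conn.executemany(
        "INSERT INTO data_file VALUES (?, ?, ?, ?)",
        [(1, "dsX", "jan1979.grb", 1000), (2, "dsX", "feb1979.grb", 900)],
    )
    conn.executemany(
        "INSERT INTO file_content VALUES (?, ?, ?, ?)",
        [
            (1, "temperature", "1979-01-01", "1979-01-31"),
            (1, "pressure", "1979-01-01", "1979-01-31"),
            (2, "temperature", "1979-02-01", "1979-02-28"),
        ],
    )
    return conn


def find_files(conn, variable, start, end):
    """List files holding `variable` whose date range overlaps [start, end]."""
    rows = conn.execute(
        """
        SELECT f.name
        FROM data_file AS f
        JOIN file_content AS c ON c.file_id = f.id
        WHERE c.variable = ? AND c.start_date <= ? AND c.end_date >= ?
        ORDER BY f.name
        """,
        (variable, end, start),
    )
    return [name for (name,) in rows]
```

Because the content rows point back to files, and files point back to datasets, a query on a variable and date range can return just the matching files instead of a whole dataset. ISO-formatted date strings are used so that plain string comparison orders them correctly.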

Data Discovery

Data Discovery
- Dataset discovery
  - Google-like dataset search
  - “Look For Data” interface: user-defined dataset catalogs
  - Auto-generated dataset pages: always up-to-date
  - Collections: all reanalyses, upper air obs, surface obs

Data Discovery
- Data file discovery
  - “Create Your Own List” for data file lists
  - Show specific files from terabyte-sized collections
- Other
  - “Station Viewer”
  - Google maps; see stations, metadata

Metadata Sharing
- OAI-PMH
  - UCAR Community Data Portal (THREDDS)
  - Global Change Master Directory (DIF)
  - also Dublin Core, native
  - easy to add others as necessary
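In OAI-PMH, a harvester requests records in a particular metadata format through the `metadataPrefix` argument of a `ListRecords` request. The sketch below only builds such a request URL. The base URL is a placeholder, because the slides do not give the archive's endpoint. `oai_dc` is the Dublin Core prefix that the protocol itself defines; prefixes for DIF, THREDDS, or the native format would be repository-specific and are not shown here.

```python
from urllib.parse import urlencode

# Placeholder endpoint; the slides do not give the archive's OAI-PMH base URL.
BASE_URL = "https://example.org/oai"


def list_records_url(metadata_prefix, from_date=None, until_date=None, base_url=BASE_URL):
    """Build an OAI-PMH ListRecords request URL for one metadata format."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # Optional selective-harvesting bounds (UTC dates, e.g. "2011-01-01").
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    return f"{base_url}?{urlencode(params)}"
```

Calling `list_records_url("oai_dc", from_date="2011-01-01")` yields a URL that asks for Dublin Core records changed since that date; supporting another format then only requires the repository to register another prefix.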

Thank You!
- Web: http://dss.ucar.edu
- Email: dssweb@ucar.edu
- Questions/comments?