OneStop: Metadata Plans and Progress
Nancy Ritchey (1), OneStop Metadata Team Lead
Anna Milan (1), Philip Jones (2), Don Collins (1), Yuanjie Li (3), Jacqueline Mize (4), Ge Peng (5)
(1) NOAA’s National Centers for Environmental Information; (2) Team ERT/STG; (3) Science and Technology; (4) Riverside Technology, Inc.; (5) NOAA’s Cooperative Institute for Climate and Satellites, North Carolina
January 4 and 7, 2016
NOAA Satellite and Information Service | National Centers for Environmental Information
OneStop: Data Discovery and Access
OneStop supports NOAA's efforts by leveraging existing catalog and access technologies to develop an improved data access framework.
The framework will be based on improved discovery, access, and visualization services for the data.
To support the framework, an important project goal is to improve the documentation, accessibility, and machine-readability of the data.
The project is initially limited to NCEI-stewarded data and metadata.
OneStop: Metadata Focus
OneStop will focus on the following types of metadata for data in NCEI and CLASS:
Collection-level metadata
Granule-level metadata
Ranking, quality, and relevance metadata
Standardized vocabularies
Collection-level development tools such as ATRAC and S2N
Granule-level development tools such as netCDF templates
Evaluation of collection-level metadata via rubric
OneStop: Metadata Enables
Metadata is the key to discovery, search, and data usability.
Capitalize on NOAA’s Data Documentation Procedural Directive (PD) and existing NCEI best practices for ISO metadata to develop a OneStop Metadata Guide.
Use existing tools to develop and evaluate metadata.
Develop information that allows relevancy ranking of data based on data maturity and data stewardship maturity.
The metadata guide will evolve as OneStop evolves.
Collection-level development tools such as ATRAC and S2N
Granule-level development tools such as netCDF templates
Evaluation of collection-level metadata via rubric
Metadata Management Services
Metadata Repository components will be used to manage metadata.
A NoSQL graph database is being considered for collection-level metadata.
A NoSQL document store is being investigated for granule-level metadata.
Metadata editing tools will be created by augmenting and/or combining existing capabilities and/or COTS products.
ATRAC, Send2NCEI, CEDIT
Oxygen XML editor, Altova XMLSpy
Metadata Management Services
Web Accessible Folders (WAFs) support harvesting by external harvesters and indexing into catalog services.
OneStop will develop modules to harvest metadata records from any WAF and load them into the collection-level repository.
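As a rough illustration of the harvesting step, the sketch below extracts metadata record links from a WAF directory listing. It assumes the WAF is exposed as an ordinary web server index page with one link per XML record. The function name `list_waf_records`, the sample listing, and the `example.org` URL are all hypothetical and are not part of any OneStop module.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkCollector(HTMLParser):
    """Collects href values from <a> tags in a directory listing page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def list_waf_records(listing_html, base_url):
    """Return absolute URLs of the .xml records linked from a WAF listing."""
    collector = _LinkCollector()
    collector.feed(listing_html)
    return [
        urljoin(base_url, href)
        for href in collector.hrefs
        if href.lower().endswith(".xml")
    ]


# Hypothetical listing, shaped like a typical web server directory index.
SAMPLE_LISTING = """
<html><body>
<a href="../">Parent Directory</a>
<a href="gov.noaa.example.C00001.xml">gov.noaa.example.C00001.xml</a>
<a href="gov.noaa.example.C00002.xml">gov.noaa.example.C00002.xml</a>
<a href="README.txt">README.txt</a>
</body></html>
"""

record_urls = list_waf_records(SAMPLE_LISTING, "https://example.org/waf/iso/")
```

A real harvester would also need to fetch the listing over HTTP, download each record, and track which records changed since the last run; those steps are omitted here.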
Metadata Management Services
Metadata evaluation tools (EMMA) will be used to assess the completeness and standardization of metadata records.
OneStop personnel will evaluate each record's rubric score and keyword standardization in order to prepare or improve the metadata as needed.
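The idea behind a completeness score and a keyword check can be sketched as follows. This is not the EMMA rubric: the field list, the controlled vocabulary, and the function names are assumptions chosen only to show the shape of such an evaluation.

```python
# Hypothetical required fields; a real rubric defines its own elements and weights.
REQUIRED_FIELDS = ["title", "abstract", "keywords", "bounding_box", "time_range", "contact"]

# Hypothetical controlled vocabulary, stored in upper case.
CONTROLLED_KEYWORDS = {"OCEANS", "ATMOSPHERE", "SEA SURFACE TEMPERATURE"}


def completeness_score(record):
    """Fraction of required fields that are present and non-empty."""
    present = [field for field in REQUIRED_FIELDS if record.get(field)]
    return len(present) / len(REQUIRED_FIELDS)


def nonstandard_keywords(record):
    """Keywords that do not match the controlled vocabulary (case-insensitive)."""
    return [
        keyword
        for keyword in record.get("keywords", [])
        if keyword.upper() not in CONTROLLED_KEYWORDS
    ]


# Hypothetical collection-level record reduced to a flat dictionary.
record = {
    "title": "Example sea surface temperature collection",
    "abstract": "Daily gridded sea surface temperature fields.",
    "keywords": ["Oceans", "sst"],
    "contact": "",  # empty values count as missing
}

score = completeness_score(record)
to_fix = nonstandard_keywords(record)
```

Records with a low score or a non-empty `to_fix` list would be the ones queued for improvement.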
Relevancy Ranking
Leverage existing Maturity Matrices for assessing data to improve relevancy ranking and to assist users in finding the best data fit for their purpose.
Data Maturity Matrix - a systematic means of assessing the ‘usability’ of data based on the metadata, methodology, product validation, documentation, and accessibility
Data Stewardship Maturity Matrix - a unified framework for measuring stewardship practices applied to individual data products
Additionally, use personal recommendations and rankings.
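One way maturity information could feed relevancy ranking is as a weighted addition to a text-match score. The sketch below assumes each candidate already carries a text-match score and two maturity values normalized to the range 0 to 1. The `Candidate` type, the weights, and the example values are hypothetical; how OneStop will actually combine these signals is not specified here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    title: str
    text_score: float   # query match strength, normalized to 0..1
    maturity: float     # Data Maturity Matrix result, normalized to 0..1
    stewardship: float  # Data Stewardship Maturity Matrix result, normalized to 0..1


def rank(candidates, w_text=0.6, w_maturity=0.2, w_stewardship=0.2):
    """Sort candidates by a weighted sum of text match and maturity scores."""

    def combined(c):
        return (
            w_text * c.text_score
            + w_maturity * c.maturity
            + w_stewardship * c.stewardship
        )

    return sorted(candidates, key=combined, reverse=True)


# Hypothetical search results for a single query.
results = [
    Candidate("A", text_score=0.8, maturity=0.2, stewardship=0.2),
    Candidate("B", text_score=0.7, maturity=0.9, stewardship=0.8),
    Candidate("C", text_score=0.5, maturity=1.0, stewardship=1.0),
]

ranked_titles = [c.title for c in rank(results)]
```

With these weights, a well-documented and well-stewarded data set can outrank one that matches the query text slightly better but has low maturity. Setting both maturity weights to zero falls back to plain text ranking.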
Data Maturity Matrix
Data Stewardship Maturity Matrix
OneStop Metadata Status
Metadata best practices draft - completed Dec 2015
Identification of initial OneStop data sets - expected completion Jan 2016
Assessment of existing tools for future editing tools - in progress
Assessment of standard vocabularies - begin Jan 2016
Metadata assessment and improvement for key data sets - begin February 2016
DSMM and DMM assessment - begin June 2016
Feedback Form
http://goo.gl/forms/6c54aH7S1e
Three thumbs up/thumbs down questions, add a user story, become a beta tester, and a free text comment field (email address optional)
Questions?
www.ncei.noaa.gov
www.climate.gov
NCEI Climate Facebook: http://www.facebook.com/NOAANCEIclimate
NCEI Ocean & Geophysics Facebook: http://www.facebook.com/NOAANCEIoceangeo
NCEI Climate Twitter (@NOAANCEIclimate): http://www.twitter.com/NOAANCEIclimate
NCEI Ocean & Geophysics Twitter (@NOAANCEIocngeo): http://www.twitter.com/NOAANCEIocngeo
Backup Slides For more information...
A major focus of NOAA OneStop is moving hard-to-access data from deep storage to easily accessible spinning disk storage. Having these data available on spinning disk will enable machine- and human-readable access via interoperable services such as HTTP, FTP, WCS, OpenSearch, and WMS served by THREDDS, OPeNDAP Hyrax, and other servers. OneStop intends to procure, install and operate this disk storage with the help of OSGS, building on their experience in deploying a similar solution for NOMADS and Ocean NOMADS. This approach represents a step in the direction of NESDIS providing enterprise “storage as a service” capabilities for its many large-volume data programs. In this session we will provide an overview and solicit feedback from the community on the system’s concept of operations.