Published by Joy Grant. Modified over 9 years ago.
NOAA/NESDIS/National Oceanographic Data Center

Following the Flow of Two Underway Data Streams Within the U.S. National Oceanographic Data Center

Steven B. Rutz (steven.rutz@noaa.gov)
NOAA/NESDIS/NODC, 1315 East West Hwy SSMC3 Fourth Floor, Silver Spring, MD 20910

The stewardship of the nation's oceanographic data archive is an essential responsibility of the U.S. National Oceanographic Data Center (NODC). NODC’s focus continues to be on the long-term preservation, integrity, and accessibility of irreplaceable observational data through multiple technological and scientific generations. NODC has implemented processes to ensure that its data archive stewardship responsibilities are met, that online data discovery and retrieval services are expanded, and that adequate supporting metadata are available to guide use of the provided data.

The NODC Archive Management System (AMS), launched in 2004, enables datasets to be accessioned, archived, and disseminated in a web-enabled, browser-based environment (http://www.nodc.noaa.gov/Archive/Search/). For the first time, NODC’s more than 20,000 unique accessioned data collections, ranging from individual observations to large datasets of major programs, are managed in a unified system.

Two such data collections managed within the AMS are from (1) the Global Data Assembly Center (GDAC) of the International Oceanographic Data and Information Exchange’s Global Ocean Surface Underway Data Project (GOSUD), and (2) the NOAA Office of Marine and Aviation Operations (NMAO), which maintains the Scientific Computer System (SCS) aboard NOAA vessels. The following pages describe NODC’s AMS and how these two data collections (or streams) are managed within it.

This poster is based on contributions by Donald W. Collins, Eric J. Ogata, Francis J. Mitchell, Joseph Shirley, and Thaila Thailambal (NODC). GOSUD contributions were aided by Thierry Carval (IFREMER) and John Relph (NODC).
ARCHIVE FILE MANAGEMENT

Some of the features of the AMS’s file management are:
Generates a uniform directory tree structure for each unique, original dataset submission;
Creates MD5 checksums for file validation;
Performs virus checks;
Implements dataset versioning; and
Provides for automated backups.

Each dataset archived at NODC is assigned a unique Accession Number. For each new accession, a file directory structure is automatically created. Original data and metadata files are placed in the data directory, while NODC-created information is placed in the about directory. When the Accession Tracking Database (ATDB) record is complete, each file in the storage area is automatically checked for viruses, MD5 checksums are calculated, and then the Accession is published online for the public. If an accession needs to be updated, its files are checked out, updated, and then re-published as a new version with the same Accession Number.

ARCHIVE MANAGEMENT SYSTEM (AMS)

The NODC Archive Management System (AMS) enables datasets to be accessioned, archived, and disseminated in a web-enabled, browser-based environment. NODC’s 20,000 unique accessioned datasets, ranging from individual observations to large collections by major programs, are managed in a unified system. Three major components of the AMS covered in this poster are the Archive File Management, Accession Tracking Database, and Archive Search and Retrieval.

Some of the advantages of the AMS for data producers and data consumers are:
Long-term data management at no charge to the data producer;
Data accessible to a worldwide audience long after the data producer is gone;
Fulfillment of contractual obligations of federally funded research; and
Low-cost access to global data from a reliable source.
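The directory and checksum steps described under ARCHIVE FILE MANAGEMENT can be sketched roughly as follows. This is a minimal illustration, not NODC's implementation: the function names, the archive root argument, and the manifest layout are assumptions made for this example, and virus checking, versioning, and backups are left out.

```python
import hashlib
from pathlib import Path


def create_accession_tree(archive_root: Path, accession: str) -> Path:
    """Create <archive_root>/<accession>/data and .../about (hypothetical layout)."""
    base = archive_root / accession
    for sub in ("data", "about"):
        (base / sub).mkdir(parents=True, exist_ok=True)
    return base


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(accession_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to the accession directory) to its MD5."""
    manifest = {}
    for path in sorted(accession_dir.rglob("*")):
        if path.is_file():
            key = path.relative_to(accession_dir).as_posix()
            manifest[key] = md5_of_file(path)
    return manifest
```

Recomputing the manifest later and comparing it with the stored one is one simple way to detect a file that has changed or been damaged since it was archived.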
NODC ARCHIVE ACCESSION TRACKING DATABASE (ATDB)

Some of the features of the AMS’s ATDB are:
Generates a unique Accession Number for tracking each dataset submitted to NODC;
Captures basic metadata for data discovery by the public;
Exports metadata into XML files that follow the FGDC Content Standard for Digital Geospatial Metadata (CSDGM);
Uses a controlled vocabulary for dataset descriptions; and
Provides a mechanism for NODC data managers to oversee the management of each dataset.

When a record is created in the ATDB for a dataset submitted to NODC, an Accession Number is automatically assigned. The Accession Number, a unique dataset identifier, is the primary key in the ATDB. The ATDB record also contains a limited amount of descriptive metadata about each accession, such as observation date ranges, submitting person and institution, data types, and geographical bounding coordinates. Upon completion of the ATDB record, the data are ready to be published online for the public.

ARCHIVE SEARCH AND FILE RETRIEVAL

Some of the features of the AMS’s archive search and file retrieval interface (the Ocean Archive System) are:
Consumer-driven search, discovery, and retrieval of datasets archived at NODC via a web browser;
Searches on nearly two dozen parameters, including data submitting and collecting institutions;
Downloads of an entire accession at once or of individual data files; and
Checksums for verifying the integrity of downloaded data files.

All archived data may be searched by the public at http://www.nodc.noaa.gov/Archive/Search/. Searches are performed on the metadata in the ATDB. Data identified as relevant to a search can be downloaded as a whole accession or as individual files.

DATA PULLED

The GDAC serves the GOSUD data via FTP as netCDF files. Each netCDF file has an associated MD5 checksum file, which is generated by the GDAC. The MD5 checksum is regenerated whenever its associated netCDF file is updated.
The MD5 file is served from the same FTP directory as its associated netCDF data file. The GOSUD data are assigned NODC Accession Number 0001715. Once a day, NODC pulls the netCDF and MD5 files from the GDAC into the Accession 0001715 directory, which is managed within the AMS. After the download, NODC generates an MD5 checksum for each netCDF file and compares it to the MD5 checksum generated by the GDAC. If a file is corrupted, the transfer is tried again; if the file is still corrupted, an NODC data manager is notified. Uncorrupted files are left in place within the directory structure of the AMS, where the data are served via OPeNDAP (see figure below).

GOSUD GDAC INTRODUCTION

IFREMER serves as the GDAC for the GOSUD project. The GDAC serves the data via FTP, OPeNDAP, and a Geographical Information System portal. The GDAC receives and quality-controls real-time and historical underway data from several vessels and sources. For more information on GOSUD and for data from the GDAC, go to http://www.gosud.org. NODC mirrors the GDAC and serves as a long-term archive for the GOSUD data. Below is a description of how the GOSUD data stream flows into NODC and is managed within the AMS.

GOSUD GDAC METADATA

The GOSUD GDAC data are accessioned within the AMS under Accession 0001715. The metadata used to track these data within NODC, and to let the public find them, are stored in the ATDB (see figure below).

FINDING GOSUD DATA

The GOSUD data are available through:
NODC OPeNDAP server (http://data.nodc.noaa.gov/cgi-bin/nph-dods/iode/gosud);
HTTP (http://data.nodc.noaa.gov/iode/gosud);
FTP (ftp://data.nodc.noaa.gov/iode/gosud); and
Ocean Archive System, the public archive search and file retrieval interface (http://www.nodc.noaa.gov/Archive/Search).

Through the Ocean Archive System, the data can be found by searching for GOSUD in the title or as the project, or by searching for 0001715 as the Accession Number (see figure below).
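The verify-and-retry logic described under DATA PULLED can be sketched as below. This is only an illustration under stated assumptions: the `fetch` callable stands in for the FTP transfer, the `.md5` file suffix and the md5sum-style file contents (digest as the first token) are guesses for the example, and `notify` is a hypothetical hook for alerting a data manager. None of these names come from NODC or the GDAC.

```python
import hashlib
from typing import Callable, Optional


def parse_md5_file(text: str) -> str:
    """Assumption: the checksum file's first whitespace-separated token is the hex digest."""
    return text.split()[0].lower()


def pull_and_verify(
    fetch: Callable[[str], bytes],
    name: str,
    notify: Callable[[str], None] = lambda name: None,
    max_attempts: int = 2,
) -> Optional[bytes]:
    """Fetch a data file and its checksum file, retrying once on a mismatch.

    Returns the verified bytes, or None after calling notify(name) if every
    attempt produced a checksum mismatch.
    """
    for _ in range(max_attempts):
        data = fetch(name)
        # Hypothetical naming convention: checksum file is "<name>.md5".
        expected = parse_md5_file(fetch(name + ".md5").decode("ascii"))
        if hashlib.md5(data).hexdigest() == expected:
            return data
    notify(name)
    return None
```

Passing the transfer in as a callable keeps the comparison logic independent of the transport, so the same check could sit behind FTP, HTTP, or a local copy.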
DATA SUBMITTED

SCS data are submitted to NODC by shipboard technicians on CD-ROM or uploaded to NODC’s FTP server. Each SCS submission, which consists of data from several cruises, is assigned an NODC Accession Number, and the files are then moved to the archive file management directory structure for that Accession (see figure below). After all data and documentation files (e.g., the file format description) are complete and all the required metadata are entered into the ATDB, the Accession is published online for the public.

NMAO SCS INTRODUCTION

The NOAA Office of Marine and Aviation Operations (NMAO) developed the Scientific Computer System (SCS) to uniformly log data from a variety of instrument packages aboard NOAA vessels. The SCS is also used on U.S. Coast Guard and other vessels. NODC serves as a long-term archive for a subset of the NMAO SCS data. Below is a description of how the NMAO SCS data stream flows into NODC and is managed within the AMS.

NMAO SCS METADATA

The SCS metadata entered into the ATDB are in the header files and are extracted from the data files (see figure below).

FINDING SCS DATA

The NMAO SCS data archived at NODC are available through the Ocean Archive System (http://www.nodc.noaa.gov/Archive/Search), the public interface to the AMS. Through the Ocean Archive System, the data can be found by searching for NSSDAC (NOAA Shipboard Sensor Data Acquisition Database) as the project or for one of the NOAA vessels as a platform (see figure below).
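The searches described in the FINDING GOSUD DATA and FINDING SCS DATA sections run against ATDB metadata. A rough sketch of that kind of metadata filtering is shown below. The record class, its fields, and the sample records are hypothetical simplifications for illustration; only the GOSUD Accession Number 0001715 and the project names come from the poster, and the real ATDB schema and search interface are not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AccessionRecord:
    """Hypothetical, simplified stand-in for an ATDB record."""
    accession: str
    title: str
    project: str
    platforms: tuple[str, ...] = ()


def search(
    records: list[AccessionRecord],
    *,
    accession: Optional[str] = None,
    project: Optional[str] = None,
    platform: Optional[str] = None,
    title_contains: Optional[str] = None,
) -> list[AccessionRecord]:
    """Return records matching every criterion that was supplied (case-insensitive)."""
    def matches(rec: AccessionRecord) -> bool:
        if accession is not None and rec.accession != accession:
            return False
        if project is not None and rec.project.lower() != project.lower():
            return False
        if platform is not None and platform.lower() not in {p.lower() for p in rec.platforms}:
            return False
        if title_contains is not None and title_contains.lower() not in rec.title.lower():
            return False
        return True

    return [rec for rec in records if matches(rec)]


# Sample records: titles, the second accession identifier, and the vessel
# name are placeholders invented for this example.
EXAMPLE_RECORDS = [
    AccessionRecord("0001715", "GOSUD underway data (placeholder title)", "GOSUD"),
    AccessionRecord("HYPOTHETICAL-1", "SCS cruise data (placeholder title)",
                    "NSSDAC", ("EXAMPLE VESSEL",)),
]
```

For instance, `search(EXAMPLE_RECORDS, project="GOSUD")` and `search(EXAMPLE_RECORDS, accession="0001715")` both select the first sample record, mirroring the two ways the poster says the GOSUD data can be found.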