Content Framework for Operational Environmental Remote Sensing Data Sets: NPOESS Concepts. Alan M. Goldberg.

Presentation transcript:

Content Framework for Operational Environmental Remote Sensing Data Sets: NPOESS Concepts Alan M. Goldberg NOTICE This technical data was produced for the U.S. Government under Contract No. 50-SPNA , and is subject to the Rights in Technical Data - General clause at FAR (JUN 1987) © 2004 The MITRE Corporation Approved for public release; distribution unlimited

Motivation
NPOESS will collect an unprecedented quantity and variety of environmental data from a constellation of satellites carrying multiple remote sensing and in situ sensors. NPOESS recognized the need for standardization and good metadata in connection with environmental data sets. Global data sets, such as those produced by satellites or used in modeling and simulation, are at the leading edge of information stewardship challenges.
Data sets will be delivered to operational users by pre-subscription through the Interface Data Processing Segment (IDPS) at the four U.S. processing Centrals, and through direct-readout terminals of the Field Terminal Segment (FTS) worldwide. Near-term and long-term researchers will retrieve data through NOAA's data archives.
NPOESS committed early to using existing standards in a rational manner to support its mission goals.
- Standards-based data products ease product usability.
- Comprehensive metadata helps maximize the continuing value of large, complex data sets. Standards-based metadata eases product understanding.
- Working from a clean sheet of paper, NPOESS can serve as a "workshop" for best practices in data stewardship.
This paper presents the results of planning for the NPOESS data product delivery system. It presents a view of the optimal data product design, much of which is being implemented in NPOESS.

End-to-End Process
[Figure: end-to-end data flow. Labels: Space Segment; C3S/DRR; Central Element with IDPS, NPOESS-unique processing, external interface to other data, and Central application processing, display, and dissemination; FTS with NPOESS-unique collection (antenna, RF electronics, demodulator), FT Processor Element, and User Terminal.]
The National Polar-orbiting Operational Environmental Satellite System (NPOESS) is managed by the NOAA Integrated Program Office on behalf of the U.S. Dept. of Commerce, Dept. of Defense, and NASA. Northrop Grumman Space Technology is the prime contractor, with Raytheon as the subcontractor for ground processing and operations.

Architectural Considerations
Sizing
- Overall data rates have increased by two orders of magnitude.
- More products increase the impact on existing user systems and applications.
- Decision: "loosely" couple producers and users through a simple data interface, leaving flexibility on both sides.
- Decision: deliver relatively short-duration data granules, typically containing 30 seconds of data.
Users and use patterns
- The system is designed to serve operational users, current science users, and future archival researchers.
- Operational users need effectively all of the data as soon as possible.
- Research users need current environmental information, but usually have the time and resources to improve product quality with post-processing.
- Archival researchers look for highly selective data sets.
- Users are served via the Centrals or via field terminals.
Sensor complexity
- Multiple versions of the data are delivered:
  - the raw bitstreams originating from the sensors (RDRs)
  - calibrated fluxes measured by the sensor (SDRs)
  - environmental variables estimated at the source (EDRs)
- The sensors themselves produce a wide variety of data types.
- Sensors use various techniques to maximize performance within the available bandwidth.
Anticipating change
- Changes to detailed formats, contents, product lists, and interfaces must be accommodated by the NPOESS data product framework throughout the mission lifetime.

Data Products Are More Than Scenes
IDPS produces mission data sets which recreate or estimate signals at four points in the sensing chain. Ancillary data, brought in from other systems and used in EDR processing, is captured. Auxiliary data, produced within the NPOESS system to support processing, is kept with the mission data or incorporated in documentation. Metadata is provided for all of these.

Typical Data Organization
[Figure: typical organization of RDRs, SDRs, EDRs, and ancillary data. Panel labels: time series of packet types; binary headers; multispectral imagery; vector flux; slit spectra; sounding FT spectra; calibration table; abstract external data; data volume; geolocation; thematic layers; quality; imagery; column data.]

First Decision – File Format
The NPOESS program identified several key characteristics for the file format within which data products would be delivered. Hierarchical Data Format version 5 (HDF5) was found to be the best solution within technical and programmatic constraints:
- A single format with proven ability to handle environmental data products (EDRs); more abstract data structures in RDRs, TDRs, and SDRs; and other products delivered to users and archives
- Capability to incorporate full metadata
- Supported by the user community and other institutional support, with an adequate practical lifetime; interoperable with DoD standards
- Ability to handle large data sets (such as full orbits) and small data sets (individual granules) with acceptable efficiency
- Ability to handle multiple arrays and array types within the same granule, such as observational data arrays and geolocation arrays
- High efficiency for reading and writing; built-in compression function; capability to "chunk" large data arrays to access prestructured subsets
- Sufficiently self-documenting to permit variable formats
- Available with development tools to expedite file definition and applications
- Acceptable licensing terms
- Supported on all likely user platforms and operating systems
- Support for all likely atomic data types
- Simple data objects and groups which permit application-specific structures to be created
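As a rough illustration of why chunking helps with "prestructured subsets", the sketch below works out which chunks of a chunked array a reader would have to touch to satisfy a subset request. It is plain Python arithmetic, not the HDF5 library; the function name `chunks_for_subset`, the array size, and the chunk shape are all hypothetical.

```python
from itertools import product


def chunks_for_subset(start, stop, chunk_shape):
    """Return grid coordinates of every chunk touched by the half-open
    selection [start, stop) in each dimension."""
    ranges = []
    for lo, hi, size in zip(start, stop, chunk_shape):
        if hi <= lo:
            return []  # an empty selection touches no chunks
        first = lo // size        # chunk holding the first selected element
        last = (hi - 1) // size   # chunk holding the last selected element
        ranges.append(range(first, last + 1))
    return list(product(*ranges))


# Hypothetical 2-D array stored in 64 x 400 chunks; select rows 100..199
# and columns 0..799.
touched = chunks_for_subset(start=(100, 0), stop=(200, 800), chunk_shape=(64, 400))
print(touched)
```

Only the chunks listed in `touched` need to be read and decompressed; the rest of the array stays on disk.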

File Structure Implemented in HDF5
[Figure, panel A: anatomy of a dataset. A data array plus a dataset header containing the dataspace (rank and dimensions), the datatype (e.g., int16), the storage layout (chunked; compressed), and attributes (e.g., current = 12e-9, temp = 56, time = 32.4). Panel B: datasets and granule attributes combine into granules; granules and file attributes combine into a file.]
HDF5 provides a simple, logical file structure based on a Dataset, comprising a data array and a header which describes the dataspace. Datasets can be structured hierarchically. NPOESS datasets and additional attributes, which incorporate the metadata, combine into granules, and granules combine into product files. Granules are concatenated in a file in such a way that they can be addressed either collectively or individually. Individual datasets will be created for elements such as mission data, quality assessment, geospatial location, time, illumination, and viewing geometry. Users may access subsets of the full data using HDF5 utilities.
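The dataset, granule, and file hierarchy described above can be sketched as a small in-memory model. This is a plain-Python illustration of the containment relationships only, not the HDF5 API and not the NPOESS product layout; the class names, the `flux` dataset, the 5 x 4 x 2 shape, and the granule identifiers are assumptions made for the example.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class Dataset:
    """A data array plus the header information that describes it."""
    name: str
    shape: tuple   # dataspace dimensions
    dtype: str     # datatype label, e.g. "int16"
    data: list     # flat list holding prod(shape) values
    attributes: dict = field(default_factory=dict)

    def __post_init__(self):
        if len(self.data) != prod(self.shape):
            raise ValueError("data length does not match the dataspace")

    @property
    def rank(self):
        return len(self.shape)


@dataclass
class Granule:
    granule_id: str
    datasets: dict = field(default_factory=dict)
    attributes: dict = field(default_factory=dict)


@dataclass
class ProductFile:
    attributes: dict = field(default_factory=dict)
    granules: list = field(default_factory=list)

    def granule(self, granule_id):
        """Address one granule individually."""
        for g in self.granules:
            if g.granule_id == granule_id:
                return g
        raise KeyError(granule_id)

    def all_datasets(self, name):
        """Address the same dataset collectively across all granules."""
        return [g.datasets[name] for g in self.granules if name in g.datasets]


def make_granule(granule_id, start_time):
    # Hypothetical dataset; attribute values mirror the slide's illustration.
    flux = Dataset("flux", (5, 4, 2), "int16", [0] * 40,
                   {"current": 12e-9, "temp": 56, "time": 32.4})
    return Granule(granule_id, {"flux": flux}, {"start_time": start_time})


product_file = ProductFile({"product_type": "example"},
                           [make_granule("G1", 0.0), make_granule("G2", 30.0)])
print(product_file.granule("G2").attributes)
print(len(product_file.all_datasets("flux")))
```

In a real product the same roles would be played by HDF5 groups, datasets, and attributes; the model is only meant to show how individual and collective addressing can coexist.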

Second Decision - Metadata
Based on extensive prior experience in the earth sciences and other sciences, the highest-level NPOESS operational requirements specified that comprehensive metadata would be delivered with the data. In our context, metadata incorporates the following "data about data":
- Identification
- Content summary
- Content meaning
- Content structure and format
- Acquisition and processing history (provenance)
- Distribution and availability
The National Spatial Data Infrastructure (NSDI) provides basic guidance for metadata. The Federal Geographic Data Committee (FGDC) sets content standards. The program establishes compatible extensions to the standard and also defines the representation. Over one hundred metadata items have been defined for NPOESS. Basic identification metadata is duplicated in a User Block at the front of the file, where it can be read without HDF software.
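To illustrate how a user block can be read without HDF software, here is a minimal sketch using only the Python standard library. It relies on two properties of the HDF5 file format: the format signature is the eight bytes `\x89HDF\r\n\x1a\n`, and the signature may sit at byte offset 0, 512, 1024, 2048, and so on, with any bytes before it forming the user block. The `key=value` layout of the block, the field names, and the file name are hypothetical; the slides do not specify the NPOESS user block format. The demonstration file is a stand-in that carries only the signature, not a valid HDF5 file.

```python
import os
import tempfile

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def read_user_block(path):
    """Return the bytes that precede the HDF5 signature (the user block)."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset < size:
            f.seek(offset)
            if f.read(8) == HDF5_SIGNATURE:
                f.seek(0)
                return f.read(offset)
            # Candidate offsets are 0, 512, 1024, 2048, ...
            offset = 512 if offset == 0 else offset * 2
    raise ValueError("no HDF5 signature found")


def parse_identification(block):
    """Parse a hypothetical ASCII 'key=value' per-line user block."""
    fields = {}
    for line in block.decode("ascii").splitlines():
        line = line.strip()
        if line and "=" in line:
            key, value = line.split("=", 1)
            fields[key] = value
    return fields


# Build a stand-in file: 512-byte ASCII user block, then the signature.
text = "Mission=NPOESS\nProduct=EXAMPLE-SDR\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example_product.h5")
    with open(path, "wb") as f:
        f.write(text.ljust(512).encode("ascii"))
        f.write(HDF5_SIGNATURE + b"\x00" * 56)
    block = read_user_block(path)

fields = parse_identification(block)
print(fields)
```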

FGDC Metadata Base & RS Standards
Base Standard
  1 Identification
  2 Data Quality
  3 Spatial Data Organization
  4 Spatial Reference
  5 Entity & Attribute Information
  6 Distribution Information
  7 Metadata Reference Information
RSE Standard
  8 Platform & Mission Information
  9 Instrument Information

[Figure: where metadata is kept. Labels: concept; analysis; repositories; complete granule metadata; quasi-static NSDI metadata; dynamic NSDI metadata; dynamic detail metadata; eHandbook (external reference); aggregate metadata; common metadata; unique metadata; granule; file; identification (UB).]
NPOESS will create all applicable metadata identified in the FGDC Base Standard and Remote Sensing Extensions, plus mission-unique items. Comprehensive metadata at the granule level will be captured in different ways, consistent with usage patterns. Metadata which changes infrequently will be collected in an external online reference. Metadata which changes with each granule is saved with the granule. Metadata which describes a file, or all the granules in the file, is extracted and stored with the file.
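One way to picture how the complete granule metadata is reassembled from these three storage locations is a layered lookup. The sketch below uses Python's `collections.ChainMap`, which searches its mappings in order; the keys and values are invented for illustration and are not NPOESS metadata items.

```python
from collections import ChainMap

# Quasi-static metadata kept in the external reference (hypothetical items).
ehandbook = {"platform": "NPOESS", "instrument_type": "imaging radiometer"}

# Metadata common to every granule in one file (hypothetical items).
file_common = {"product_type": "SDR", "processing_site": "Central-A"}

# Metadata that changes with each granule (hypothetical items).
granule_specific = {"granule_id": "G0001", "start_time": "2004-01-01T00:00:00Z"}

# Lookups search the granule first, then the file, then the handbook.
complete = ChainMap(granule_specific, file_common, ehandbook)

print(complete["granule_id"])   # found at the granule level
print(complete["platform"])     # found in the external reference
print(len(dict(complete)))      # flattened view of all six items
```

The ordering is a design choice of this sketch: the most specific level wins if a key appears at more than one level.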

Third Decision – File Organization
Primary characteristics of a design
- Systematic data organization simplifies data maintenance, enhancement, documentation, retrieval, visualization, and exploitation.
- The NPOESS evaluation looked at lessons learned from EOS and at best practices elsewhere.
- The design must work with abstract, generalized scientific, and specifically geospatial data sets.
Result: the specific derived requirements closely match the "Climate & Forecast Conventions", developed primarily for use with netCDF. The conventions provide guidelines for consistent, complete, and clear data entity and attribute definition. NPOESS is attempting to implement this data design.
The design needs clear identification of truly independent and dependent variable contents:
- Index attributes. For most unresampled data, the natural independent variables are the index attributes defining the discrete points in space and time at which the data samples were collected. An index might be a time-series sample number, a detector number, or an energy or spectral bin.
- Associated independent variables. There is usually an associated independent variable for each index attribute, such as time, direction, position, or energy level. These relate to the indices by calibration, which is known to be very accurate. Some associated independent variables may be multidimensional; e.g., polar geolocation or solar elevation is a deterministic function of the spatial indices.
- Primary dependent variable. There is always a primary dependent variable, which is a function of the independent variable(s). It may be an abstract binary object that is a function of time or place. Usually it is a well-defined variable that is a function of temporal, spatial, or spectral indices.
- Associated dependent variables. There are often one or more associated dependent variable arrays, such as quality estimates or telemetry values, associated by design with the primary independent variable.
- Supplementary arrays. Optional supplementary arrays, such as calibrations, are functions of one or more of the index attributes used in the primary and associated data arrays.
- Concatenation. One index attribute, usually time or time-like, can often be defined as the index which establishes continuity from one granule to the next.
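The roles of index attribute, associated independent variable, and primary dependent variable, and the use of a time-like index for concatenation, can be sketched with plain Python lists. The granule layout (a `time` list indexed by scan number and a `flux` table indexed by scan and detector) and all values are hypothetical.

```python
def concatenate(granules):
    """Join granules along the scan index, using time to check continuity."""
    times, rows = [], []
    for g in granules:
        if len(g["time"]) != len(g["flux"]):
            raise ValueError("time and flux must share the scan index")
        if times and g["time"][0] <= times[-1]:
            raise ValueError("granules are not in time order")
        times.extend(g["time"])   # associated independent variable
        rows.extend(g["flux"])    # primary dependent variable
    return {"time": times, "flux": rows}


# Two hypothetical granules, each with 2 scans (index) x 3 detectors (index).
granule_a = {"time": [0.0, 1.5], "flux": [[10, 11, 12], [13, 14, 15]]}
granule_b = {"time": [3.0, 4.5], "flux": [[16, 17, 18], [19, 20, 21]]}

joined = concatenate([granule_a, granule_b])

# After concatenation the scan index runs continuously from 0 to 3.
for scan, (t, row) in enumerate(zip(joined["time"], joined["flux"])):
    print(scan, t, row)
```

Because every array in a granule shares the scan index, associated arrays such as quality flags could be joined in the same loop.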

Clear Index & Array Definitions
[Figure: index and array definitions. Labels: primary index; index attribute; associated independent variable(s); n-dimensional dependent variable (entity) array; primary array, e.g., flux, brightness, counts, NDVI; 2-D independent variable arrays, e.g., lat/lon, sun alt/az, land mask.]

Putting Together the Framework

With careful design, and based on lessons learned from previous programs, a comprehensive data product design can be achieved. The design eases development and maintenance, by providing a common approach to data and metadata. Granules form the basic unit of production and cataloguing. They are essentially self-contained. Each granule contains the primary data arrays, associated data arrays, and descriptive attributes – including metadata – needed to understand the information content. To facilitate efficient delivery, multiple granules of the same type are combined into product files. Common attributes of all granules in a file may be extracted to the file level as common metadata. Summary and identification attributes are created and added to the file. Basic identification metadata is extracted from the HDF format and saved as an ASCII ‘user block’ at the physical start of the file. Finally, metadata which changes only rarely is maintained separately as an electronic handbook, and is incorporated in the granules by reference.
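The step in which common attributes of all granules are extracted to the file level can be sketched as follows. The function name `factor_common` and the attribute names are hypothetical; the logic simply keeps at file level every key whose value is identical in all granules and leaves the remainder with each granule.

```python
def factor_common(granule_attrs):
    """Split per-granule attribute dicts into (common, per-granule remainder)."""
    if not granule_attrs:
        return {}, []
    first, rest = granule_attrs[0], granule_attrs[1:]
    common = {
        key: value
        for key, value in first.items()
        if all(key in g and g[key] == value for g in rest)
    }
    unique = [
        {key: value for key, value in g.items() if key not in common}
        for g in granule_attrs
    ]
    return common, unique


# Hypothetical attributes for three granules of the same product type.
attrs = [
    {"sensor": "imager", "version": "1.0", "granule_id": "G1", "start": 0},
    {"sensor": "imager", "version": "1.0", "granule_id": "G2", "start": 30},
    {"sensor": "imager", "version": "1.0", "granule_id": "G3", "start": 60},
]

common, unique = factor_common(attrs)
print(common)   # stored once with the file
print(unique)   # stored with each granule
```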

[Figure: layout of a data product file. A User Block precedes the HDF file block. The root holds the file attributes (file metadata, common metadata, other file attributes) and the data product. The data product contains repeated granules; each granule holds its datasets (primary array, index variables, associated arrays) and its granule attributes (granule metadata, other granule attributes). The eHandbook sits outside the file.]