1 ORNL DAAC: Data and Services
Robert Cook and Suresh SanthanaVannan
Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN
Presentation at the DataONE EVA Working Group Meeting, Albuquerque, NM, November 17-19, 2009
2 ORNL DAAC: What’s that?
Oak Ridge National Laboratory Distributed Active Archive Center
- Archives data products produced by projects within NASA’s Terrestrial Ecology Program
- Mission: assemble, distribute, and provide data services for a comprehensive archive of terrestrial biogeochemistry and ecological dynamics observations and models, to facilitate research, education, and decision-making in support of NASA’s Earth science
ORNL DAAC’s Web Site
3 ORNL DAAC: Data Collections (Number of Data Sets = 826)
1. Field Campaigns (647): 6-9 year intensive study of a region
   - Amazon (LBA)
   - Northern Canada (BOREAS)
   - Southern Africa (SAFARI 2000, S2K)
2. Validation of Remote Sensing Products (21): in-situ observations and remote sensing (LAI/fPAR, NPP)
3. Regional and Global Studies (147): Climate, Soils, Vegetation, Hydroclimatology
4. Model Code (9)
   - Benchmark Models: IBIS, BIOME-BGC, LSM
   - Manuscript Models: PnET, Century, Biome-BGC
4 ORNL DAAC Data Holdings (2009)
[Figure axis label: Number of Data Sets]
Mean = 512 MB; Median = 488 KB; Total = ~500 GB
589 bytes: SAFARI 2000 ANNUAL SOIL RESPIRATION DATA (RAICH AND SCHLESINGER 1992)
154,703,764,100 bytes: SAFARI 2000 MODIS AIRBORNE SIMULATOR DATA, SOUTHERN AFRICA, DRY SEASON 2000
5 Data Characteristics
Number of files (granules) per data set
Median = 2 granules per data set; Mean = 263 granules per data set
9,595: AMS (AUTOMATED MET STATION) DATA (FIFE)
13,800: NOAA REGIONAL SURFACE DATA (FIFE)
33,472: BOREAS RSS-14 GOES-7 LEVEL-1A VISIBLE, INFRARED, AND WATER VAPOR IMAGES
92,619: LBA-ECO CD-07 GOES-8 L1 RADIANCE DATA FOR AMAZONIA:
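The large gap between mean and median on the two slides above is what a heavily skewed distribution produces: a handful of very large data sets pull the mean up while the median stays small. The following is a minimal sketch of that effect, not a reconstruction of the actual holdings. Only the first and last sizes are taken from the slide; the three values in between are made-up placeholders.

```python
from statistics import mean, median

# Sizes in bytes. Only 589 and 154,703,764,100 are quoted on the slide;
# the three middle values are hypothetical placeholders for illustration.
sizes_bytes = [589, 120_000, 488_000, 2_500_000, 154_703_764_100]


def summarize(sizes):
    """Return (mean, median) for a list of sizes in bytes."""
    return mean(sizes), median(sizes)


avg, mid = summarize(sizes_bytes)
print(f"mean   = {avg / 1e9:.1f} GB")
print(f"median = {mid / 1e3:.0f} KB")
```

With a distribution like this, the median describes a typical data set, while the mean mostly reflects the largest one.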
6 Tools and Services: For the highly diverse ORNL DAAC community
Global land surface modelers
- Discovery: FTP browse, OPeNDAP catalog
- Formats: ASCII Grid, netCDF, HDF, binary
- Tools: FTP download, OPeNDAP servers (THREDDS Data Server catalog; THREDDS = Thematic Real-time Environmental Distributed Data Services)
Spatial data / Remote Sensing users
- Discovery: FTP, Metadata catalog, Catalog of Spatial Data
- Formats: ASCII Grid, GeoTIFF, shapefiles
- Tools: MODIS Subsetting Tools, Spatial data download, WebGIS
Field investigators
- Discovery: Metadata catalog, Google Search
- Formats: ASCII tables
- Tools: Databases, spreadsheets, visualization (WebGIS, MODIS Time Series, SPEC/ISIS), on-line services (MODIS Subsets)
Spectrum of users is wide
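For modelers using the OPeNDAP servers mentioned above, a subset request is typically expressed as a DAP2 constraint expression appended to the data set URL, with one `[start:stride:stop]` hyperslab per dimension. The sketch below only builds such a URL and makes no network call. The host, path, and variable name `lai` are hypothetical placeholders, not actual ORNL DAAC endpoints.

```python
def opendap_subset_url(base_url, variable, slices, response="ascii"):
    """Build a DAP2 subset URL.

    base_url : data set URL on an OPeNDAP server (hypothetical here)
    variable : name of the variable to subset
    slices   : one (start, stride, stop) tuple per dimension, inclusive stop
    response : DAP2 response suffix such as "ascii", "dods", "dds", "das"
    """
    hyperslab = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in slices
    )
    return f"{base_url}.{response}?{variable}{hyperslab}"


# Hypothetical data set and variable, for illustration only.
url = opendap_subset_url(
    "https://example.org/thredds/dodsC/hypothetical/lai.nc",
    "lai",
    [(0, 1, 0), (100, 1, 199), (200, 1, 299)],  # time, row, column
)
print(url)
```

Some clients percent-encode the square brackets before sending the request; the unencoded form is shown here for readability.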
7 User Working Group
- Serves a Board of Directors function; not a FACA (Federal Advisory Committee Act) committee
- Serves a peer review function for an on-going project (began in 1992)
- Represents the scientific interests of the research community; members are data providers and data users
- Scientists funded by NASA and other agencies (NSF LTER)
- ORNL DAAC’s NASA Program Scientist (Diane Wickland)
- Assists in defining the DAAC's science goals and setting priorities
- Driving force for DAAC evolution over the past decade
- Provides guidance on DAAC activities, including data set acquisition, development of tools and services, and incorporation of new technologies
8 Additional Slides
9 Observations
- Data centers should be a partnership of data managers, data providers, and data users
  - Seek formal guidance from a “User Working Group”
- Data centers should facilitate new science
  - Metric is citations to data sets
- Changing research demands and advances in information technologies are creating new roles for data centers
10 Data management coordination through User Working Group
Close coordination and communication among data managers, those making the measurements, modelers, and other data users is critical
[Diagram labels: Collectors; Users; Data Management and Analysis System]
11 Metadata Needed to Understand Data
The details of the data ….
[Diagram labels: Measurement; date; sample ID; parameter name; location]
For those on the Investigator’s team, the amount of metadata required to understand the data is small
12 Metadata Needed to Understand Data: 20-year perspective
[Diagram, from Raymond McCord, ORNL]
Measurement fields: QA flag, media, generator, method, date, sample ID, parameter name, location, records, units
Linked definitions: Sample def., Method def., Parameter def., Record system, QA def., Units def., GIS
Other labels in the diagram: type, date, location, generator, lab, field, words, units, method, org. type, name, custodian, address, etc., coord., elev., depth
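The contrast between the two metadata slides can be sketched as a record type: the investigator's team needs only the handful of fields from the first slide, while a reader 20 years later also needs the fields the team carries in its head. This is a minimal illustration, not a schema used by the ORNL DAAC. The field names are adapted from the diagram labels, and the class, function, and sample values are hypothetical.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Measurement:
    # Fields sufficient for the investigator's own team.
    value: float
    date: str
    sample_id: str
    parameter_name: str
    location: str
    # Fields a future user also needs (labels taken from the diagram).
    units: Optional[str] = None
    method: Optional[str] = None
    qa_flag: Optional[str] = None
    media: Optional[str] = None
    generator: Optional[str] = None


def missing_long_term_fields(m):
    """List the fields that are still undocumented on a record."""
    return [name for name, v in asdict(m).items() if v is None]


# Hypothetical record as a team member might first write it down.
record = Measurement(12.5, "1994-06-01", "S-001", "soil_respiration", "site A")
print(missing_long_term_fields(record))
```

A check like this could run before a data set is submitted for archiving, so that gaps are filled while the people who made the measurements are still available.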
13 Provide tutorial on “Best Practices for Preparing Ecological Data to Share”
Cook et al., Bull. ESA 82: 138-141
Best Practices include:
1. Assign Descriptive File Names
2. Use Consistent and Stable File Formats
3. Define the Parameters
4. Use Consistent Data Organization
5. Perform Basic Quality Assurance
6. Assign Descriptive Data Set Titles
7. Provide Documentation
Update on-line: Best Practices for Preparing Ecological and Ground-Based Data Sets to Share and Archive
Robert B. Cook, Richard J. Olson, Paul Kanciruk, and Leslie A. Hook, Environmental Sciences Division, Oak Ridge National Laboratory
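Two of the practices above, "Define the Parameters" and "Perform Basic Quality Assurance", lend themselves to a simple automated check. The sketch below assumes a CSV table, a hand-written dictionary of parameter definitions, and `-9999` as the missing-value code. All of these, including the column names and ranges, are illustrative assumptions rather than rules taken from the paper.

```python
import csv
import io

# Hypothetical parameter definitions: units and a plausible value range.
PARAMETER_DEFS = {
    "site_id": {"units": None, "range": None},
    "air_temp": {"units": "degC", "range": (-60.0, 60.0)},
}
MISSING = "-9999"  # assumed missing-value code


def basic_qa(csv_text, defs, missing=MISSING):
    """Return a list of problems: undefined columns and out-of-range values."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for col in reader.fieldnames or []:
        if col not in defs:
            problems.append(f"undefined parameter: {col}")
    # Data rows start on line 2, after the header.
    for lineno, row in enumerate(reader, start=2):
        for col, text in row.items():
            rng = defs.get(col, {}).get("range")
            if rng is None or text == missing:
                continue
            try:
                value = float(text)
            except (TypeError, ValueError):
                problems.append(f"line {lineno}: {col}={text!r} is not numeric")
                continue
            if not rng[0] <= value <= rng[1]:
                problems.append(f"line {lineno}: {col}={value} outside {rng}")
    return problems


sample = "site_id,air_temp,wind\nA1,21.5,3\nA1,-9999,4\nA2,95.0,2\n"
problems = basic_qa(sample, PARAMETER_DEFS)
for p in problems:
    print(p)
```

Here the undefined `wind` column and the 95.0 degC reading are flagged, while the missing-value code is skipped rather than treated as a temperature.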