1 1 CDIAC Data Support for SPRUCE and NGEE
Les A. Hook and Ranjeet Devarakonda
Environmental Sciences Division, Oak Ridge National Laboratory
CDIAC User Working Group Meeting, September 27-28, 2010
ORNL research was sponsored by the U.S. Department of Energy and performed at Oak Ridge National Laboratory (ORNL). ORNL is managed by UT-Battelle, LLC, for the U.S. Department of Energy under contract DE-AC05-00OR22725.

2 2 Overview
Data Support for SPRUCE
- Data Management Planning
  - Goals outlined in the Science Plan
  - Requirements identified in the Data Policy
  - Actions and resources needed to meet requirements are in the Data Management Plan
- Implementation
  - SPRUCE web site
  - Resources and products accessible on the web site
Data Support for NGEE
- Data Management Planning
  - Expect planning to be similar to SPRUCE
- NGEE Web Site
Shared Development Effort for Acquisition and Processing of Sensor Data

3 3 Science Plan for the Climate Change Response Scientific Focus Area
3.11 Data and informatics
Goals for Response SFA data management are to ensure the fidelity and accessibility of the SFA data, minimize the amount of time research personnel need to spend on data management activities while achieving high quality data and metadata, and ensure that the data and metadata can be located and used by project personnel (initially) and the broader scientific community. The suite of activities that collectively comprise this component of the SFA will naturally evolve over the life of the SFA, and they will be done in collaboration with data management components of other Climate SFAs.
Initial data management work will focus on defining the data collection and distribution requirements, identifying key leverage points across SFAs and other projects, ensuring that site characterization data is maintained, and resolving any critical informatics knowledge gaps identified in the requirements definition. As the experiments begin to collect high resolution data, the data management activities will shift to ensuring that the experimental data are properly archived and distributed according to the SFA’s data access policy.
Data from the Response SFA will be a combination of observational data recorded by researchers and data collected by automated equipment. Further details can be found in Annex C. The data management component will leverage the expertise and tools in the Environmental Data Science and Systems (EDSS) group, particularly the Carbon Dioxide Information Analysis Center (CDIAC) and the Atmospheric Radiation Measurement (ARM) program archive, to ensure that both observational and automated data are robustly archived in relational data models with necessary timestamp, spatial, temporal, and provenance metadata.

Goals for SPRUCE Data Management
- Ensure the fidelity and accessibility of SPRUCE data to the participants to facilitate all the pertinent science questions;
- Minimize the amount of time research personnel need to spend on data management activities while achieving high quality data and metadata; and
- Ensure that the data and metadata can be located and used by project personnel (initially) and by the broader scientific community and public when appropriately quality-checked data are available.
Approach to Data Management Planning
- Provide a structured framework to capture the project-defined requirements.
- Provide data management guidance and best practices.
- The ORNL SPRUCE research group (the Task Leaders in particular) and Forest Service staff are responsible for reaching a consensus about what needs to be controlled, providing processing details, and establishing who is responsible for implementation. Accountability is key.
Planning Considerations
- The plan supports field sampling, measurements, monitoring, and analyses.
- Data management information collected pre-experiment will inform the final experimental data management processes.
- SPRUCE tasks are subject to change or modification, and experimental technology will evolve.
- The data management plan will have to be flexible and updated as needed, with version control.

4 4 Data Management Requirements are identified in the Data Policy
SPRUCE Data Policy: Archiving, Sharing, and Fair-Use (Version 1.2, 2010/05/10)
The open sharing of all SPRUCE experiment data among researchers, the broader scientific community, and the public is critical to advancing the mission of DOE’s Program of Terrestrial Ecosystem Science. SPRUCE is implementing an experimental platform for the long-term testing of the mechanisms controlling the vulnerability of organisms, ecosystems, and ecosystem functions to increases in temperature and exposure to elevated CO2 treatments within the northern peatland high-carbon ecosystem.
All data collected at the SPRUCE facility, all results of any analysis or synthesis of information, and all model algorithms and codes developed in support of SPRUCE will be submitted to the SPRUCE Data Archive in a timely manner such that data will be available for use by SPRUCE researchers and, following publication, the public.
This policy is applicable to all SPRUCE participants including the SPRUCE Research Group at the Oak Ridge National Laboratory (ORNL), the U.S. Forest Service, cooperating independent researchers, and to the users of SPRUCE data products (see the Data Fair-Use Statement). SPRUCE data policies are consistent with the sponsoring U.S. DOE Program for Terrestrial Ecosystem Science Data Policy and with the Memorandum of Understanding between the U.S. Forest Service and UT-Battelle.

5 5 Data Policy, continued
Data Archiving and Discovery
- Archive at Carbon Dioxide Information Analysis Center (CDIAC)
- Two levels of data accessibility:
  - The first is for sharing recently collected, derived, and processed data products among SPRUCE participants.
  - The second is for access to mature data products by the broader scientific community and public.
- Public access will be concurrent with open literature or web site publication of SPRUCE results.
- Discovery is facilitated through the compilation of descriptive companion metadata records and their inclusion in searchable metadata databases and clearinghouses.
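The two accessibility levels can be read as a simple access rule. The following is a minimal sketch of that reading, not the Archive's actual implementation: the `DataProduct` class, its `published` flag, the `can_access` function, and the example data set title are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    title: str
    # Hypothetical flag: True once the associated results have appeared in
    # the open literature or on the web site (the policy's trigger for
    # public access).
    published: bool


def can_access(product: DataProduct, is_spruce_participant: bool) -> bool:
    """Apply the two accessibility levels described in the Data Policy."""
    if is_spruce_participant:
        # Level 1: recently collected and processed products are shared
        # among SPRUCE participants.
        return True
    # Level 2: the broader community and public get access with publication.
    return product.published


# Made-up example product, for illustration only.
recent = DataProduct("Example unpublished data product", published=False)
print(can_access(recent, is_spruce_participant=True))
print(can_access(recent, is_spruce_participant=False))
```

Keeping the rule in one function means a later refinement, such as granting permission to individual non-participants, would only touch `can_access`.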

6 6 Data Policy, continued
Data Sharing
Timeliness of Data Availability
- Researchers will actively process, quality assure, and document environmental measurements, etc.
- Task Leaders will define a schedule for submitting data to the Archive for their given measurements.
- Suggested guidelines for submitting data to the Archive for sharing among SPRUCE participants:
  - Environmental measurements (automated instruments) -- 30 days after the completion of a month of measurements
  - Annual surveys and seasonal measurement efforts -- 120 days from the completion of the survey
  - Laboratory analyses of vegetation nutrient concentrations -- 60 days from completion of analyses
- Suggested guidelines for submitting data to the Archive for public access:
  - Environmental measurements (automated instruments) -- annual updates
  - Annual surveys and seasonal measurement efforts -- with publication of papers
  - Laboratory analyses of vegetation nutrient concentrations -- with publication of papers
Quality Assurance of Data
- Task Leaders will define the quality assurance checks to be performed prior to data sharing among SPRUCE participants (Quality Level 1) and prior to public access (Quality Level 2).
- Suggested guidelines for defining data Quality Levels: Level 1 and Level 2
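The suggested lags for sharing data among SPRUCE participants translate directly into due dates. The sketch below shows the arithmetic under stated assumptions: the category keys and the function name are hypothetical labels, and only the day counts (30, 120, 60) come from the guidelines. Task Leaders define the actual schedules.

```python
from datetime import date, timedelta

# Suggested lag, in days, between completing a data collection effort and
# submitting the data to the Archive for sharing among SPRUCE participants.
# The keys are hypothetical labels; the day counts come from the guidelines.
PROJECT_SHARING_LAG_DAYS = {
    "environmental_automated": 30,     # after a completed month of measurements
    "annual_or_seasonal_survey": 120,  # after completion of the survey
    "lab_nutrient_analysis": 60,       # after completion of analyses
}


def project_sharing_due_date(category: str, completed_on: date) -> date:
    """Return the suggested submission date for project-level sharing."""
    return completed_on + timedelta(days=PROJECT_SHARING_LAG_DAYS[category])


# A month of automated measurements completed on 30 September 2010.
print(project_sharing_due_date("environmental_automated", date(2010, 9, 30)))
# prints 2010-10-30
```

Public-access timing is not modeled here because, apart from the annual updates for automated measurements, it is tied to publication of papers rather than to a fixed number of days.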

7 7 Data Policy, continued
Data Fair-Use Statement
The SPRUCE data provided on the public archive are freely available and were furnished by the SPRUCE Research Group at ORNL, U.S. Forest Service, and cooperating independent researchers who encourage their use.
- Please inform SPRUCE scientist(s) of your use of the archived data and of any publications.
- Check the Archive frequently to ensure that you are using the latest version of the data.
- Please acknowledge (1) data products as a citation as provided in the data archive documentation, (2) web site information downloads as a bibliographic web citation, or (3) general SPRUCE information as an acknowledgment or personal communication if no other citation form is applicable.
- When publishing original analyses and results using these data, please acknowledge the agency or organization that supported the collection of the original data.
- Please include these terms as publication keywords as applicable: SPRUCE Experiment, ORNL, U.S. DOE Office of Science, Marcell Experimental Forest, Northern Research Station, U.S. Forest Service.
- Please provide an electronic reprint of your independent work to the SPRUCE Project so that all publications can be tracked by CDIAC.
Disclaimer of Liability

8 8 Actions and resources needed to meet requirements are in the Data Management Plan
Organization
- Data Policy
- Data Flow
- Project Name Information
- Identifying Measurement and Sampling Sites
Data and Metadata Reporting
- Reporting Sampling and Measurement Dates and Times
- Identifying Descriptive Field Variables, Biological Measurements, Chemical and Physical Variables
- Reporting Units for Chemical, Physical, and Descriptive Variables
- Reporting Values below Detection Limits
- Reporting Missing Data
- Reporting Uncertainty Estimates
- Reporting Conventions for Meteorological Data, and Temperature and Pressure Conditions
- Assigning Project-Specific Data Quality Flags
Data Processing
- Data Entry, Transfer, and Transformation
- Managing Hardcopy Format Project Records
- Managing Electronic Format Project Records
- Names and Reporting Formats for Data Files
- Scripted Programs for Processing and Analysis
- Quality Level of Data
Data Documentation and Archiving
- Planning to Archive Data for Public Release
- Creating Archive Documentation
- Providing Metadata to Searchable Indexes and Clearinghouses
- Assigning Descriptive Data Set Titles
Data Systems Management
- Day-to-Day Operation of Data Management Systems
- Data Management System and Software Configuration Control Guidelines

9 9 SPRUCE Data Flow (diagram; compiled by Les Hook, 2010/05/10)
Sources
- Task: Environmental Measurements (automated instruments)
- Existing/Historical Data: MEF, NADP, remote sensing, ground penetrating radar assessments; additional links to existing data?
- Task R2: Plant growth phenology and NPP (periodic observations)
- Task R3: Community composition (periodic observations)
- Task R4: Plant Physiology (periodic observations)
- Task R5: Biogeochemical cycling responses (periodic observations)
- Task R6: Modeling of terrestrial ecosystem responses to temperature and CO2 (inputs and outputs?)
- Supplemental Information: photos, videos; additional?
Processing/QA Frequency
- 30-60 days after collection
- 120 days after survey, 60 days after sample analyses
- Selected data uploaded; periodic updates with new data and products
- Timing?
Destination: SPRUCE Data Archive (CDIAC)
- Project Data Sharing
- Public Data Sharing with publication or per schedule
Access: SPRUCE Web Site (Project and Public Access to Data and Resources)
- Project Data Access: 100% open for Project Team; permission needed by others
- Project Resources: common reference sources; Metadata Content Editor
- Public Data Archive: 100% open to public; data and metadata search; relational database (e.g., FACE)?

12 12 Shared Development Effort for Acquisition and Processing of Sensor Data
SPRUCE
- Sensors and data loggers
- Acquisition and evaluation software
- Independent processing steps
Next for SPRUCE and NGEE
- Number of sensors 25X
- Need advanced automated processing, displays, and alarms
- Web accessible
- Other needs?

13 13 Shared Development Effort for Acquisition and Processing of Sensor Data
Next Steps:
- Purchasing Campbell Scientific (CS) software with more capabilities.
- Meeting with CS Technical Representative for planning guidance.
- Making connections with ORNL CS power users.
- Learning from SPRUCE and NGEE prototypes.
- Starting to look beyond acquisition and processing to analysis.

14 14 Additional Data Flow Diagrams
- Overview of Task Inputs and Resources
- S1 Bog Vegetation Survey Task

15 15 Overview of Task Inputs and Resources (diagram; compiled by Les Hook, 2010/05/10)
Task-Specific Inputs
- Task boxes: Task EM, Existing/Historical Data, Task R2, Task R3, Task R4, Task R5, Task R6, Supplemental Information
- Data Policy
- Data Flow
- Task Information: Task Description, ID Measurements, Field Sampling & Measurement Description, Laboratory Analysis Description, Data Processing, Archive Schedule, QA Level Defined
- Task Metadata and Task Data
Resources: SPRUCE Web Site (Project Access to Data and Resources)
- Project Resources, common references: SPRUCE Task Description template, SPRUCE Variable Name template, SPRUCE Project Names template, Site Information template, Data Collection Guides
- Project Data Archive: 100% open for Project Team; permission needed by others
Destination: SPRUCE Data Archive (CDIAC), Project Data Sharing

16 16 S1 Bog Vegetation Survey Task >>> Data Management Planning (diagram; compiled by Les Hook, 2010/04/30, updated 2010/09/20)
Task-Specific Inputs
- Forest Service Survey Plot Coordinates
- Project Master List of Site Information
- Task Metadata: Task Description, Field Sampling & Measurement Description, Laboratory Analysis Description, QA Level Defined, Archive Schedule
- Data and Metadata Compilation
Organization
- Data Policy
- Data Flow
- Project Name Information
- Identifying Measurement and Sampling Sites (See DCG – Site Information)
Data and Metadata Reporting
- Reporting Sampling and Measurement Dates and Times
- Identifying Descriptive Field Variables, Biological Measurements, Chemical and Physical Variables
- Reporting Units for Chemical, Physical, and Descriptive Variables
- Reporting Values below Detection Limits
- Reporting Missing Data
- Reporting Uncertainty Estimates
- Reporting Conventions for Meteorological Data, and Temperature and Pressure Conditions
- Assigning Project-Specific Data Quality Flags
Data Processing
- Data Entry, Transfer, and Transformation
- Managing Hardcopy Format Project Records
- Managing Electronic Format Project Records
- Names and Reporting Formats for Data Files
- Scripted Programs for Processing and Analysis
- Quality Level of Data
Data Documentation and Archiving
- Planning to Archive Data for Public Release
- Creating Archive Documentation
- Providing Metadata to Searchable Indexes and Clearinghouses
- Assigning Descriptive Data Set Titles
Data Systems Management
- Day-to-Day Operation of Data Management Systems
- Data Management System and Software Configuration Control Guidelines
Other references shown: See DCG – Hardcopy Forms; See DCG – Task Plan
Destination: SPRUCE Data Archive (CDIAC), Project Data Sharing, via the SPRUCE Web Site (Project and Public Access to Data and Resources)

