U.S. JGOFS Data Management a retrospective Cyndy Chandler U.S. JGOFS Data Management Office 25 January 2005 NACP Data Management Planning Workshop New.

Slides:



Advertisements
Similar presentations
New Services for Data Creators and Providers Louise Corti, Head ESDS Qualidata/ Outreach & Training Alasdair Crockett, ESDS Data Services Manager.
Advertisements

The Live Access Server (Access to observational data) Jonathan Callahan (University of Washington) Steve Hankin (NOAA/PMEL – PI) Roland Schweitzer, Kevin.
V Alyssa Rosemartin 1, Lee Marsh 1, Ellen Denny 1, Bruce Wilson USA National Phenology Network, Tucson, AZ; 2 - Oak Ridge National Laboratory, Oak.
Peter Griffith and Megan McGroddy 4 th NACP All Investigators Meeting February 3, 2013 Expectations and Opportunities for NACP Investigators to Share and.
Visualizing Fitness for Purpose Bob Groman and Dicky Allison Biological and Chemical Oceanography Data Management Office Woods Hole Oceanographic Institution.
Caro-COOPS Data Management: Metadata. Cast-Net addresses the need for improved connectivity among coastal observing systems by creating a regional framework.
Chapter 3 Database Management
U.S. JGOFS Data Management Lessons Learned Authors: Cyndy Chandler and David Glover (Woods Hole Oceanographic Institution) The U.S. JGOFS Data Management.
OGC Technical Committee Meeting National Resources and Environment Working Group 1 The JGOFS/GLOBEC Data Management System for Serving Physical and Biological.
Building a Global Drylands Information System: A Collaborative Approach IALC Conference and Workshop Assessing Capabilities of Soil and Water Resources.
Biological and Chemical Oceanography Data Management Office 1 of 12 An Introduction to the Biological and Chemical Oceanography Data Management Office.
Data-PASS Shared Catalog Micah Altman & Jonathan Crabtree 1 Micah Altman Harvard University Archival Director, Henry A. Murray Research Archive Associate.
Introduction Downloading and sifting through large volumes of data stored in differing formats can be a time-consuming and sometimes frustrating process.
Research Data Management Philip Tarrant Global Institute of Sustainability.
Coordinated Energy and water-cycle Observations Peroject A Well Organized Data Archive System Data Integrating/Archiving Center at University of Tokyo.
Data Management Practices: BCO-DMO’s Successes and Challenges Bob Groman BCO-DMO Woods Hole Oceanographic Institution NERACOOS/NeCODP Data Management Workshop.
EARTH SCIENCE MARKUP LANGUAGE “Define Once Use Anywhere” INFORMATION TECHNOLOGY AND SYSTEMS CENTER UNIVERSITY OF ALABAMA IN HUNTSVILLE.
DIVISION OF TECHNOLOGY, INDUSTRY AND ECONOMICS / UNEP/ SETAC Life Cycle Initiative: Focus of the Life Cycle Inventory program by Guido.
Data Merge Examples, Toolsets for Airborne Data (TAD): Customized Data Merging Function ASDC Introduction The Atmospheric Science Data Center (ASDC) at.
EMI INFSO-RI SA2 - Quality Assurance Alberto Aimar (CERN) SA2 Leader EMI First EC Review 22 June 2011, Brussels.
Indo-US Workshop, June23-25, 2003 Building Digital Libraries for Communities using Kepler Framework M. Zubair Old Dominion University.
ATMOSPHERIC SCIENCE DATA CENTER ‘Best’ Practices for Aggregating Subset Results from Archived Datasets Walter E. Baskin 1, Jennifer Perez 2 (1) Science.
U.S. GLOBEC Pan-Regional Synthesis Workshop 1 Presentation to the U.S. GLOBEC Pan-Regional Workshop 29 November 2006 Bob Groman Data Access and Associated.
CC&E Best Data Management Practices, April 19, 2015 Please take the Workshop Survey 1.
Enhancing Linkages Between Projects and Datasets: Examples from LBA-ECO for NACP Lisa Wilcox, Amy L. Morrell,
The Global Video Grid: DigitalWell Update & Plan For SRB Integration Myke Smith, Manager Streaming Media Technologies University of Washington / ResearchChannel.
EARTH SCIENCE MARKUP LANGUAGE Why do you need it? How can it help you? INFORMATION TECHNOLOGY AND SYSTEMS CENTER UNIVERSITY OF ALABAMA IN HUNTSVILLE.
Data Management Practices for Early Career Scientists: Closing Robert Cook Environmental Sciences Division Oak Ridge National Laboratory Oak Ridge, TN.
The Digital Library for Earth System Science: Contributing resources and collections Meeting with GLOBE 5/29/03 Holly Devaul.
VERTIGO data OCB database status update Cyndy Chandler Ocean Carbon and Biogeochemistry Data Management Office Cyndy Chandler Ocean Carbon and Biogeochemistry.
National Center for Supercomputing Applications Barbara S. Minsker, Ph.D. Associate Professor National Center for Supercomputing Applications and Department.
Reconstituting the Ocean: a tale from U.S. JGOFS Cyndy Chandler (MCG, WHOI) U.S. JGOFS Data Management Office and Ocean Carbon and Biogeochemistry Coordination.
Biological and Chemical Oceanography Data Management Office slide 1 of 19 CAMEO Data Management Bob Groman Biological and Chemical Oceanography Data Management.
NQuery: A Network-enabled Data-based Query Tool for Multi-disciplinary Earth-science Datasets John R. Osborne.
Laura Russell Programmer VertNet Buenos Aires (Argentina) 28 September 2011 Training course on biodiversity data publishing and.
Jake F. Weltzin United States Geological Survey USA National Phenology Network Integrating phenology data across spatial and temporal scales.
NOAA/NESDIS/National Oceanographic Data Center Following the Flow of Two Underway Data Streams Within the U. S. National Oceanographic Data Center Steven.
Cyberinfrastructure to promote Model - Data Integration Robert Cook, Yaxing Wei, and Suresh S. Vannan Oak Ridge National Laboratory Presented at the Model-Data.
November 16, 2009 Page 1 of 28 Data and Data Management: Introduction to the BCO-DMO Presented to Professor Keiichi Uchida November 16, 2009 Robert C.
U.S. GLOBEC Georges Bank 2007 Phase 4B SI Meeting April 23, 2007 GoMODP, Data Interoperability and the MapServer Interface to U.S. GLOBEC Data Presented.
NOAAServer: Unified access to distributed NOAA data Ernest Daddio, NOAA/ESDIM Steve Hankin, NOAA/PMEL Donald Denbo, NOAA/PMEL/JISAO Nancy Soreide, NOAA/PMEL.
Breakout Session Assignments and Goals. Summary of Objectives and Charge to Breakout Groups Desired outcome: a comprehensive vision for NACP Data Management.
A Resource Discovery Service for the Library of Texas Requirements, Architecture, and Interoperability Testing William E. Moen, Ph.D. Principal Investigator.
Institutional Repositories: the DSpace Experience Ann J. Wolpert Director of Libraries Massachusetts Institute of Technology.
The Giovanni-NEO Oceanographic Education Cookbook Instructional James G. Acker David Herring Gregory Leptoukh Suhung Shen Steven Kempler NASA Goddard Earth.
SCD Research Data Archives; Availability Through the CDP About 500 distinct datasets, 12 TB Diverse in type, size, and format Serving 900 different investigators.
1 Final Report on WTF-CEOP JAXA Prototype system Satoko Horiyama MIURA JAXA/Earth Observation Research Center.
Fire Emissions Network Sept. 4, 2002 A white paper for the development of a NSF Digital Government Program proposal Stefan Falke Washington University.
LAS and THREDDS: Partners for Education Roland Schweitzer Steve Hankin Jonathan Callahan Joe Mclean Kevin O’Brien Ansley Manke Yonghua Wei.
The Proliferation of Metadata Standards and the Evolution of NASA’s Global Change Master Directory (GCMD) Standard for Uses in Earth Science Data Discovery.
Global Change Master Directory (GCMD) Mission “To assist the scientific community in the discovery of Earth science data, related services, and ancillary.
Research Data Management Nova Southeastern University – Halmos College of Natural Sciences and Oceanography – Ocean Campus November 2015 Data Management.
Distributed Data Servers and Web Interface in the Climate Data Portal Willa H. Zhu Joint Institute for the Study of Ocean and Atmosphere University of.
Data Management Practices for Early Career Scientists: Closing Robert Cook Environmental Sciences Division Oak Ridge National Laboratory Oak Ridge, TN.
NATIONAL INCIDENT MANAGEMENT SYSTEM Department of Homeland Security Executive Office of Public Safety.
1 2.5 DISTRIBUTED DATA INTEGRATION WTF-CEOP (WGISS Test Facility for CEOP) May 2007 Yonsook Enloe (NASA/SGT) Chris Lynnes (NASA)
NVS New Zealand National Vegetation Survey. What is NVS? NVS (National Vegetation Survey) – New Zealand’s largest archive facility for plot-based vegetation.
Biological and Chemical Oceanography Data Management Office slide 1 of 10 U.S. GEOTRACES Data Management Cyndy Chandler BCO-DMO ~ WHOI 23 September 2008.
Biological and Chemical Oceanography Data Management Office slide 1 of 22 Introduction to Data Management for Ocean Science Research Cyndy Chandler Biological.
Biological and Chemical Oceanography Data Management Office slide 1 of 10 The Biological and Chemical Oceanography Data Management Office (BCO-DMO) Cyndy.
Data Coordinating Center University of Washington Department of Biostatistics Elizabeth Brown, ScD Siiri Bennett, MD.
Training Course on Data Management for Information Professionals and In-Depth Digitization Practicum September 2011, Oostende, Belgium Concepts.
Introduction to BODC and GEOTRACES data office Edward Mawji British Oceanographic Data Centre
R2R ↔ NODC Steve Rutz NODC Observing Systems Team Leader May 12, 2011 Presented by L. Pikula, IODE OceanTeacher Course Data Management for Information.
Developing our Metadata: Technical Considerations & Approach Ray Plante NIST 4/14/16 NMI Registry Workshop BIPM, Paris 1 …don’t worry ;-) or How we concentrate.
Data Browsing/Mining/Metadata
Data and Data Management: Introduction to the BCO-DMO
Data Management: Documentation & Metadata
Prepared by: Jennifer Saleem Arrigo, Program Manager
Live Access Server (LAS)
Presentation transcript:

U.S. JGOFS Data Management a retrospective Cyndy Chandler U.S. JGOFS Data Management Office 25 January 2005 NACP Data Management Planning Workshop New Orleans, LA

26 January 2005 U.S. JGOFS DMO 2 of 45 blizzicane of ‘05

26 January 2005 U.S. JGOFS DMO 3 of 45 Topics in today’s presentation: Introduction Introduction What is U.S. JGOFS?What is U.S. JGOFS? What is the DMO?What is the DMO? Lessons Learned Lessons Learned U.S. JGOFS data serverU.S. JGOFS data server JGOFS d-DBMS and user interface JGOFS d-DBMS and user interface Live Access Server Live Access Server

26 January 2005 U.S. JGOFS DMO 4 of 45 What is U.S. JGOFS? part of multinational JGOFS part of multinational JGOFS U.S. Global Change Research Program (US GCRP), Scientific Committee on Oceanic Research (SCOR), and International Geosphere-Biosphere Programme (IGBP) U.S. Global Change Research Program (US GCRP), Scientific Committee on Oceanic Research (SCOR), and International Geosphere-Biosphere Programme (IGBP) long term (U.S ) long term (U.S ) multidisciplinary (bio, chem, PO, geology) multidisciplinary (bio, chem, PO, geology) process studies (U.S ), time-series, global surveys, synthesis and modeling, data management process studies (U.S ), time-series, global surveys, synthesis and modeling, data management investigate ocean carbon flux investigate ocean carbon flux U.S. Joint Global Ocean Flux Study

26 January 2005 U.S. JGOFS DMO 5 of 45 U.S. JGOFS Data Management Office (DMO)   formed in 1988 specifically to meet needs of U.S. JGOFS   assist PIs to submit their data to DMO   ongoing quality control of data   develop and maintain simple, reliable interface to program data   provide timely, easy access to project results   collaborate with other program DMO   publish U.S. JGOFS data reports   plan final archive of U.S. JGOFS information

26 January 2005 U.S. JGOFS DMO 6 of 45 Basic Principles of Data Management from 1988 JGOFS Working Group on Data Management scientists will generate data in a format useful for their needs scientists will generate data in a format useful for their needs oceanographic data sets are best organized in terms of metadata (temporal and geographical) oceanographic data sets are best organized in terms of metadata (temporal and geographical) data managers should avoid use of coded data values data managers should avoid use of coded data values users should be able to obtain all the data they require from one source and in a consistent format users should be able to obtain all the data they require from one source and in a consistent format data interchange formats should be designed for the convenience of scientific users data interchange formats should be designed for the convenience of scientific users

26 January 2005 U.S. JGOFS DMO 7 of 45 Additional Guidelines metadata is critical and therefore mandatory metadata is critical and therefore mandatory data managers should maintain awareness of emerging standards and strive for compliance data managers should maintain awareness of emerging standards and strive for compliance whatever interface is used to provide access to the data, it’s still important to provide subset and download capability whatever interface is used to provide access to the data, it’s still important to provide subset and download capability data management systems must be dynamic - balancing tension between existing and new technologies data management systems must be dynamic - balancing tension between existing and new technologies

26 January 2005 U.S. JGOFS DMO 8 of 45 Basic Components of a Data Management System data acquisition data acquisition from variety of sourcesfrom variety of sources quality assurance quality assurance data publication data publication synthesis and modeling synthesis and modeling archive archive first four components are active throughout the program

26 January 2005 U.S. JGOFS DMO 9 of 45 lesson defined... 1 : a passage from sacred writings read in a service of worship 2 a : a piece of instruction b : a reading or exercise to be studied by a pupil c : a division of a course of instruction 3 a : something learned by study or experience b : an instructive example 4 : an edifying example or experience 5 : a reprimand Lessons Learned... from the American Heritage Dictionary of the English Language

26 January 2005 U.S. JGOFS DMO 10 of 45 The first lesson learned... is a meta lesson... Lessons Learned... All the lessons learned take on enhanced meaning when applied to science programs of increasing size and complexity.

26 January 2005 U.S. JGOFS DMO 11 of 45 Lessons Learned...   technology is good; people are more important diverse range of expertise and personalities   designers, programmers, data managers someone with authority to set overall vision qualified staff to make effective use of technology guidance from advisory committee which includes active investigators

26 January 2005 U.S. JGOFS DMO 12 of 45 U.S. JGOFS DMO personnel   David Glover (director)   Cyndy Chandler (manager)   Jeff Dusenberry (data specialist)  Previous staff members Christine Hammond (manager) George Heimerdinger (data specialist) David Schneider (data specialist)

26 January 2005 U.S. JGOFS DMO 13 of 45 Lessons Learned...   develop a data policy, publicize and follow it guidance from steering committee consent from participating investigators reiterate at conferences and workshops encourage compliance agree on method of enforcement (used as last resort)

26 January 2005 U.S. JGOFS DMO 14 of 45 Lessons Learned...   establish protocols at program start with mechanism for adaptation when necessary sampling methodologies naming conventions   parameter dictionary, controlled vocabulary   XML schema, thesauri, ontologies units of measurement

26 January 2005 U.S. JGOFS DMO 15 of 45 Lessons Learned...   facilitate contribution of data to collection compile an inventory of expected results publish and maintain the inventory remind investigators of opportunities to contribute data and results to the growing inventory review procedure at conferences and workshops accept all formats of data work with investigators to complete metadata records

26 January 2005 U.S. JGOFS DMO 16 of 45 JGOFS distributed database management system U.S. JGOFS DMO accepted any format data from the field study investigators and used methods to locate and translate the data objects

26 January 2005 U.S. JGOFS DMO 17 of 45 Lessons Learned...  metadata is of critical importance   accurate, complete, available with data   monitor emerging standards   define minimal metadata requirements   standards-compliant solutions where possible   complete metadata record enables reuse of data metadata assembly is time consuming, but is the key to enabling secondary reuse of data

26 January 2005 U.S. JGOFS DMO 18 of 45 metadata

26 January 2005 U.S. JGOFS DMO 19 of 45 Lessons Learned...   quality assurance is an ongoing process   intense QA process during initial acquisition and ingestion into data system   problems discovered as data are utilized by others ◊ ◊ insufficient or inaccurate metadata   process of data product synthesis becomes a valuable diagnostic tool for improving data quality

26 January 2005 U.S. JGOFS DMO 20 of 45 Lessons Learned...   begin synthesis early do not wait until all the data has been collected synthesized products greatly enhance the data collection

26 January 2005 U.S. JGOFS DMO 21 of 45 Lessons Learned...   provide timely, easy access to project results   develop and maintain simple, reliable interface to data collection   single interface to entire data collection   balance tension between new innovative technologies and existing stable implementations   if an interface is broken, it doesn’t matter how great the original concept was

26 January 2005 U.S. JGOFS DMO 22 of 45 U.S. JGOFS Data Server JGOFS distributed database management system (d-DBMS) used for field data JGOFS distributed database management system (d-DBMS) used for field data Live Access Server (LAS) used for gridded, synthesis and model results Live Access Server (LAS) used for gridded, synthesis and model results

26 January 2005 U.S. JGOFS DMO 23 of 45 U.S. JGOFS data server web interface to data catalog

26 January 2005 U.S. JGOFS DMO 24 of 45 JGOFS Distributed Database Management System (d-DBMS) distributed, object-oriented system distributed, object-oriented system originally developed by Glenn Flierl, James Bishop, Satish Paranjpe, David Glover originally developed by Glenn Flierl, James Bishop, Satish Paranjpe, David Glover supports multidisciplinary, multi- institutional data acquisition project supports multidisciplinary, multi- institutional data acquisition project multiple data storage formats and locations multiple data storage formats and locations data interpreted by ‘methods’ data interpreted by ‘methods’

26 January 2005 U.S. JGOFS DMO 25 of 45 Data Listing select dataset select variable (columns) select range (rows) view metadata

26 January 2005 U.S. JGOFS DMO 26 of 45 subset, plot, download data

26 January 2005 U.S. JGOFS DMO 27 of 45 Live Access Server Interface provides access to synthesis and model results

26 January 2005 U.S. JGOFS DMO 28 of 45 Live Access Server ~ LAS LAS Development Team (original) LAS Development Team (original) Steve Hankin Steve Hankin Jon Callahan Jon Callahan Joe Sirott Joe Sirott located at UW JISAO/NOAA-PMEL located at UW JISAO/NOAA-PMEL University of Washington’s Joint Institute for the Study of the Atmosphere and Ocean and NOAA Pacific Marine Environmental Laboratory University of Washington’s Joint Institute for the Study of the Atmosphere and Ocean and NOAA Pacific Marine Environmental Laboratory

26 January 2005 U.S. JGOFS DMO 29 of 45 Live Access Server Interface configurable Web server configurable Web server data and metadata interface data and metadata interface provides access to geo-referenced scientific data provides access to geo-referenced scientific data presents distributed data sets as a unified virtual data base (DODS/OPeNDAP) presents distributed data sets as a unified virtual data base (DODS/OPeNDAP) uses Ferret as the default visualization application uses Ferret as the default visualization application visualize data with on-the-fly graphics visualize data with on-the-fly graphics request custom subsets of variables in a variety of file formats request custom subsets of variables in a variety of file formats access background reference material (metadata) access background reference material (metadata) compare variables from distributed sources compare variables from distributed sources

26 January 2005 U.S. JGOFS DMO 30 of 45 LAS enables a data server to … unify access to multiple types of data in a single interface unify access to multiple types of data in a single interface create thematic data collections from distributed data sources create thematic data collections from distributed data sources offer derived products on the fly offer derived products on the fly offer variety of visualization styles offer variety of visualization styles customized for the datacustomized for the data

26 January 2005 U.S. JGOFS DMO 31 of 45 U.S. JGOFS LAS MySQL database of netCDF and JGOFS format data objects MySQL database of netCDF and JGOFS format data objects interface to project data and metadata interface to project data and metadata data sub-selection (selections, projections) data sub-selection (selections, projections) multi-variable support multi-variable support gridded vs. in-situ data differencing gridded vs. in-situ data differencing multiple views multiple views (property-property, depth horizon, cruise tracks, overplots)(property-property, depth horizon, cruise tracks, overplots) multiple products (ps, gif, text, NetCDF) multiple products (ps, gif, text, NetCDF)

26 January 2005 U.S. JGOFS DMO 32 of 45 LAS v6 select dataset and variables

26 January 2005 U.S. JGOFS DMO 33 of 45 select dataset select variable   set constraints select view (XYZT) select output type selections: lat/lon time depth range LAS constraints LAS v6 constraints

26 January 2005 U.S. JGOFS DMO 34 of 45 LAS output select dataset select variable set constraints   output

26 January 2005 U.S. JGOFS DMO 35 of 45 Property Depth LAS v6 Property- Depth Arabian Sea Nitrate from merged product generated by DMO from in situ Niskin bottle data

26 January 2005 U.S. JGOFS DMO 36 of 45 Comparison Overlay Plot

26 January 2005 U.S. JGOFS DMO 37 of 45 overlay plot chlorophyll (May) shaded SeaWiFS data contributed by: Yoder and Kennelly surface nitrate (April) contours Nutrient Climatologies contributed by: Ray Najjar Nutrient Climatologies contributed by: Ray Najjar

26 January 2005 U.S. JGOFS DMO 38 of 45 U.S. JGOFS Data System Summary supports a variety of data formats supports a variety of data formats OPeNDAP used to access data collection OPeNDAP used to access data collection coupled metadata and data coupled metadata and data supports data subselection supports data subselection offers variety of products for download offers variety of products for download

26 January 2005 U.S. JGOFS DMO 39 of 45 Lessons Learned...   encourage data managers to collaborate other data managers within program   JGOFS DMTT data managers from other programs   GLOBEC, LTER program investigators and participants   attend conferences and workshops   offer data system tutorials

26 January 2005 U.S. JGOFS DMO 40 of 45 Lessons Learned...   publish data reports archive data in one place easy access to project results most complete and accurate form of database

26 January 2005 U.S. JGOFS DMO 41 of 45 data reports published on CD-ROM

26 January 2005 U.S. JGOFS DMO 42 of 45 Lessons Learned...   plan early for final archive of program results digital records don’t forget the boxes of stuff !

26 January 2005 U.S. JGOFS DMO 43 of 45   develop a data policy, publicize and follow it   establish protocols at program start with mechanism for adaptation when necessary   facilitate contribution of data to collection  metadata is of critical importance   accurate, complete, available with data   quality assurance is an ongoing process   begin synthesis early Lessons Learned...

26 January 2005 U.S. JGOFS DMO 44 of 45   provide timely, easy access to project results   develop and maintain simple, reliable interface to data collection   encourage data managers to collaborate   publish data reports   plan early for final archive of program results Lessons Learned...

26 January 2005 U.S. JGOFS DMO 45 of 45 Challenges   functioning amid the chaos   maintaining a healthy data management system amid the chaos of rapidly changing information technology   distinguishing between enabling and disruptive technologies   data and results – what to preserve?   raw and processed data, synthesized products, model code, inputs, results   increasing volume and diversity   long-term preservation of data

26 January 2005 U.S. JGOFS DMO 46 of 30 U.S. JGOFS web site