NOAA Data Management Perspective & Plans
  NSF RDMI Workshop 2017-09-15
  NOAA briefing at NSF RDMI Workshop 2016-11-17
  Jeff de La Beaujardière, PhD
  NOAA Data Management Architect
  National Oceanic and Atmospheric Administration
  jeff.deLaBeaujardiere@noaa.gov
NOAA has "Big Data" (Volume, Variety, Velocity, ...)
  Data sources: satellites, weather radars, ocean bathymetry, buoy networks, tide gauges, ships, aircraft, autonomous vehicles, human observers, numerical models, plus extramurally funded data.
  These data are unique, valuable, irreplaceable, and collected at public expense.
  2016-09-15
Vision for NOAA Data Management
  All NOAA environmental data shall be discoverable, accessible, usable, and preserved for all types of users and applications.
  Vision: All NOAA data will be discoverable, accessible, documented, and preserved for all types of users and applications.
NOAA Data Policies
  https://nosc.noaa.gov/EDMC/
  NOAA Administrative Order 212-15: Management of Environmental Data (2010)
  NOAA Environmental Data Management Framework (2012-2013)
  Data Management Planning Directive (2011; rev. 2015)
  Data Documentation Directive (2011; rev. 2016)
  Data Access Directive (2015)
  Archive Appraisal Procedure (2008)
  Data Citation Directive (2015)
  Data Sharing Directive for NOAA Grantees (2012; rev. 2016)
Implementation Activities
  Unified Access Framework: standardized formatting & access for gridded data and in-situ observations
  Big Earth Data Initiative: small NOAA internal grants to improve discovery, access, and usability of datasets
  Enterprise Metadata Metrics & Assessment: tools to create metadata & evaluate completeness
  Other projects throughout NOAA
NOAA Data Catalog
Dataset Identifier Project
  DOI (Digital Object Identifier) links to a landing page for the data & metadata in the NCEI Archive (National Centers for Environmental Info.).
  DOI benefits:
    Permanent, citable ID.
    International standard (ISO 26324).
    Recognition by publishers.
    Credit from your boss during annual review...? Not yet!
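A minimal sketch of how a dataset DOI turns into a citable link, assuming the public resolver at https://doi.org/. The function name `doi_to_url` and the DOI `10.1234/example-dataset` are hypothetical placeholders for illustration, not identifiers taken from the NOAA Dataset Identifier Project.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # public DOI resolver


def doi_to_url(doi: str) -> str:
    """Return the resolver URL for a DOI; a leading 'doi:' prefix is accepted."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Every DOI starts with the "10." directory indicator and has a "/" suffix separator.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return DOI_RESOLVER + quote(doi, safe="/")


EXAMPLE_DOI = "10.1234/example-dataset"  # hypothetical DOI, for illustration only
print(doi_to_url(EXAMPLE_DOI))
```

Following the resulting URL would take a reader to the dataset landing page, which is the link the slide describes between the DOI and the archive.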
Challenges managing NOAA Internal Data
  The good news: NOAA has dedicated, intelligent personnel working assiduously to ensure data are of good quality, publicly accessible, and archived.
  Less good:
    Much effort required
    No existing enterprise-wide approach
    Lack of resources, tools, training
    Data management (DM) often a side-job in addition to regular duties
Conceptual Overview of Grant Data Sharing Directive
  Data & Publication Sharing Directive for NOAA Grants, Cooperative Agreements, and Contracts (v.3, 2016)
  https://nosc.noaa.gov/EDMC/PD.DSP.php
  Diagram elements:
    Federal Funding Opportunity: Data Mgmt Guidance
    Proposal: Data Mgmt Plan
    Researchers: $
    Data collected by Grantee: Data Access
    Research Articles: cite data w/DOI; cite funding w/FundRef
    NOAA Institutional Repository (documents): deposit accepted manuscript; link to published version; expose after embargo
Grantee Data Sharing Challenges
  https://nosc.noaa.gov/EDMC/PD.DSP.php
  Diagram elements: Researchers; Data; Data Access
  Challenges:
    Compliance monitoring: 2 yrs after grant end
    Data hosting: NOAA archive may not be able to accept all data (size, type, or stewardship issues); need approved repositories (permanent or short/medium-term?)
    Limited reusability of multi-source data: data scattered across multiple sites; lack of data standards & interoperability
Data Management is not the goal
  We don't want to just "manage" data; we want to use and reuse data, and extract maximum value from it.
Users need answers, not huge datasets (... or 100s of tiny datasets)
  Data to Decisions: distill huge & complex data to ~1 bit: plant crop? evacuate? build wind farm? go skiing?
  Support non-expert data users
Challenges
  Data Volume
  Data Complexity
Traditional Data Services Approach
  User Tools: Data.gov and Other Portals; Decision Support Tools; Scientific Software; Numerical Models; Value-Adding Reseller
  Data services layer (shared standards): Data Search & Discovery Services; Data Access Services; Data Documentation (Metadata); Compatible Formats and Vocabularies
  Data Sources: Satellite; Radar; Buoy; Ship; Sonar; Surveys; Gliders; Models
Traditional Data Services Approach
  Diagram elements:
    Data Sources: Satellite; Radar; Buoy; Ship; Sonar; Surveys; ROV/UAV; Models
    data access (repeated eight times); Data Discovery
    User Hardware (four instances); User Facilities; copy of data
  Not scalable as data volumes increase
  Security risk of every on-premises service
  Maintenance burden of on-prem infrastructure
Notional Cloud Deployment Scenario
  On-premises Computing (inside the NOAA security boundary): Master copy of NOAA Data; Operational Processing; Forecast Models; Operational customers
  One-way push to the Commercial Cloud: Information Products; Decision-support functions; Public users
  Derived from NOAA EDM Framework (2013), figure 8
Wish #1: Fully Leverage the Cloud
  Diagram elements: Operational Customers (e.g., NWS); Archive; Cloud
  Challenges:
    Egress costs vs free data
    Uncertain/unbounded costs
    Re-architecting for performance vs fork-lifting existing apps
    IT security policy mismatch
NOAA Big Data Project (R&D)
  www.noaa.gov/big-data-project
  selected datasets
  Briefing to OSTP PARR meeting
Wish #2: More Tools for Decision-making
  Diagram elements: complicated, multi-source data (Earth Observations; Model Outputs; Ancillary Data); Decision Support Functions; Policy & Business Decisions; non-scientist users
  Composable functions to create workflows for:
    Derived information products
    Multi-source data integration
    Location-specific analysis
    Statistics & Trends
    Novel analyses & discoveries
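One way to picture "composable functions" is a chain of small steps that narrows multi-source records to a location, derives a statistic, and reduces it to a yes/no answer. The sketch below is illustrative only: the station records, the step names (`near_latitude`, `mean_temperature`, `exceeds`), and the 30.0 threshold are all hypothetical and do not represent a NOAA tool or dataset.

```python
from functools import reduce
from statistics import mean


def compose(*steps):
    """Chain steps left to right into a single workflow function."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)


# Hypothetical records: (station, latitude, temperature in degrees C).
OBSERVATIONS = [("A", 38.9, 31.0), ("B", 47.6, 22.5), ("C", 39.3, 33.0)]


def near_latitude(lat, tolerance):
    """Location-specific analysis: keep records within `tolerance` degrees of `lat`."""
    def step(records):
        return [r for r in records if abs(r[1] - lat) <= tolerance]
    return step


def mean_temperature(records):
    """Statistics step: average temperature of the remaining records."""
    return mean(r[2] for r in records)


def exceeds(threshold):
    """Decision step: reduce a number to a single yes/no answer."""
    return lambda value: value > threshold


# Hypothetical workflow: is the mean temperature near 39 degrees N above 30 C?
heat_advisory = compose(near_latitude(39.0, 1.0), mean_temperature, exceeds(30.0))
print(heat_advisory(OBSERVATIONS))  # prints True (stations A and C average 32.0)
```

Because each step is an ordinary function, a different decision can be assembled by swapping one step, for example a different location filter or threshold, without touching the others.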
Questions?
  Jeff de La Beaujardière, PhD
  jeff.deLaBeaujardiere@noaa.gov