The NOAA Big Data Project
ESIP Cloud Computing Panel, 2016-07-21
Jeff de La Beaujardière, PhD, NOAA Data Management Architect
jeff.deLaBeaujardiere@noaa.gov
NOAA has "Big Data" (Volume, Variety, Velocity, ...)
10 satellites; 150+ weather radars; 3 buoy networks; 200+ tide gauges; human observers; animal telemetry; 17 ships; 10 aircraft; numerical models; extramurally funded data
NOAA data are unique, valuable, irreplaceable, and collected at public expense.
Open Data and Public/Private Collaboration at NOAA, 2016-06-15
Accelerating User Demand for NOAA Data
Open Data and Public/Private Collaboration at NOAA, 2016-06-15, Briefing to OSTP PARR meeting
Traditional Data Services Approach (pre-Cloud)
User Tools: Data.gov and Other Portals, Decision Support Tools, Scientific Software, Numerical Models, Value-Adding Reseller
Data services layer (shared standards): Data Search & Discovery Services, Data Access Services, Data Documentation, Compatible Formats and Vocabularies
Data Sources: Satellite, Radar, Buoy, Ship, Sonar, Surveys, ROV/UAV, Models
Traditional Data Services Approach (pre-Cloud)
User Facilities: User Hardware (four instances), copy of data
Data Discovery; data access (eight connections)
Data Sources: Satellite, Radar, Buoy, Ship, Sonar, Surveys, ROV/UAV, Models
Conceptual Overview of NOAA Big Data Project
New customers & lines of business: Customer 1, Customer 2, Customer 3
Application & product providers: Custom Product/App #1, App #2, App #3
Cloud IaaS provider(s) [Infrastructure as a Service]: integration functions, analysis functions, working copy of data
(agency security boundary)
Agency Service Tier (agency-provided services): Access Services, Catalog, Metadata, Formatting
Master copy of data: Earth Observations, Model Outputs

Technical: NOAA retains master copy of its environmental data, including Earth observations and model outputs. NOAA performs basic functions for data discovery, access and usability, providing services inside its security boundary. Users currently use NOAA-hosted services to find and retrieve our data. If we had enough bandwidth, we could send all or most of our data out to a commercial Cloud provider, who would then have a working copy of our data. Computing functions for data integration and analysis could be provided, and could operate on multiple datasets co-located with the computing. Application and product developers could provide products and decision support apps tailored for individual customers. We would enable maximum diversity in the user space, for decisions based on specialized knowledge and algorithms. The more we can standardize the lower layers, the easier it would be to support this scenario.

Financial: No funds would be exchanged to or from NOAA. Data provided into the Cloud would remain available free of charge in the format provided. The network service provider would provide bandwidth at no cost to NOAA, and recoup costs from the Cloud provider. Application developers would pay Cloud provider for computing and for storage and transmission of derived products. Customers would pay developers for derived products and decision-support apps.
5 BDP CRADAs announced April 2015
CRADA = Cooperative Research and Development Agreement
URL: data-alliance.noaa.gov
1st BDP Dataset: NEXRAD L2
NEXRAD = Next-generation Radar
Level 2 = reflectivity data from 150+ stations
1991-present; 270+ TB compressed
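To give a feel for why bandwidth matters when moving an archive of this size to a Cloud provider, here is a rough back-of-the-envelope sketch. It uses the 270 TB compressed figure from the slide; the link speeds, the decimal definition of a terabyte (10^12 bytes), and the helper name `transfer_days` are assumptions made only for illustration, and real transfers would be slower because of protocol overhead and shared links.

```python
def transfer_days(size_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate days needed to move size_tb terabytes over a link.

    Assumptions (not from the slides): 1 TB = 10**12 bytes, the link is
    dedicated, and `efficiency` is the usable fraction of nominal speed.
    """
    bits = size_tb * 1e12 * 8                      # archive size in bits
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 86_400                        # seconds per day


if __name__ == "__main__":
    archive_tb = 270  # "270+ TB compressed" per the slide
    for gbps in (1, 10, 100):  # hypothetical link speeds
        print(f"{gbps:>3} Gbit/s: {transfer_days(archive_tb, gbps):.2f} days")
```

Under these idealized assumptions, a dedicated 1 Gbit/s link would need about 25 days for the historical archive alone, before counting any newly arriving data.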
1st BDP Dataset: NEXRAD L2
Publicly available on both AWS and OCC
https://aws.amazon.com/noaa-big-data/nexrad/
http://occ-data.org/NOAANEXRAD/
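As an illustration of how a user might locate files in the AWS copy, the sketch below builds the object-key prefix for one radar station on one day. Nothing in it comes from the slides: the bucket name `noaa-nexrad-level2` and the `YYYY/MM/DD/STATION/` key layout are assumptions based on how the public archive is commonly described, and `nexrad_prefix` is a hypothetical helper. Check the AWS page above for the current layout before relying on it. The code makes no network calls.

```python
from datetime import date

# Assumed bucket name for the public NEXRAD Level 2 archive (not from the slides).
BUCKET = "noaa-nexrad-level2"


def nexrad_prefix(day: date, station: str) -> str:
    """Return the assumed key prefix 'YYYY/MM/DD/STATION/' for one station-day."""
    station = station.upper()
    # Station identifiers are assumed to be four alphanumeric characters (e.g. KTLX).
    if len(station) != 4 or not station.isalnum():
        raise ValueError(f"unexpected station identifier: {station!r}")
    return f"{day:%Y/%m/%d}/{station}/"


if __name__ == "__main__":
    prefix = nexrad_prefix(date(2016, 7, 21), "ktlx")
    print(f"s3://{BUCKET}/{prefix}")
```

A prefix like this could then be passed to any S3 listing tool to enumerate the volume scans for that station and day.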
Future Datasets for BDP
Geostationary Satellite; Multi-Radar/Multi-Sensor; Numerical Models; Fisheries bycatch data
Thank you!
Jeff de La Beaujardière
jeff.deLaBeaujardiere@noaa.gov
http://orcid.org/0000-0002-1001-9210