Experiences of an Earth Science Data User: Confessions of a Data Hoarder
Rob Carver, The Weather Company

“Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.”
– Andrew S. Tanenbaum

Open Data and The Weather Company
❖ Our business model is to take open data and use it to tell interesting stories that engage our users.
❖ Over the years, we’ve archived over 100 TB of data.
❖ Formats: GRIB1, GRIB2, NIDS, shapefiles, netCDF, HDF5
❖ Sources: NWS/NCEP, NCDC, FEMA, Census Bureau, NASA DAACs

Locating Data
1. Google and literature searches
2. ???
3. Data!

100+ TB of Weather Models
❖ Most data arrives through Unidata’s LDM and FTP pull scripts. ECMWF pushes data to our FTP site. (All GRIB1/GRIB2.)
❖ Data is ingested into the forecast system, and GrADS handles the model visualization
❖ Archived to local disk arrays and Amazon S3
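An FTP pull script of the kind mentioned above might look like the following minimal sketch. The host, remote directory, and file suffixes are hypothetical placeholders rather than the actual configuration behind these slides; only Python's standard `ftplib` is assumed.

```python
from ftplib import FTP
from pathlib import Path


def files_to_fetch(remote_names, local_dir,
                   suffixes=(".grib2", ".grb2", ".grib", ".grb")):
    """Return remote GRIB file names that are not yet present in local_dir."""
    local_dir = Path(local_dir)
    have = {p.name for p in local_dir.iterdir()} if local_dir.is_dir() else set()
    return sorted(
        name for name in remote_names
        if name.lower().endswith(suffixes) and name not in have
    )


def pull(host, remote_dir, local_dir):
    """Download new GRIB files from an anonymous FTP server (hypothetical host)."""
    local_dir = Path(local_dir)
    local_dir.mkdir(parents=True, exist_ok=True)
    with FTP(host) as ftp:
        ftp.login()  # anonymous login
        ftp.cwd(remote_dir)
        for name in files_to_fetch(ftp.nlst(), local_dir):
            # Write to a temporary name first so a partial download is never
            # mistaken for a complete file on the next run.
            tmp = local_dir / (name + ".part")
            with open(tmp, "wb") as fh:
                ftp.retrbinary("RETR " + name, fh.write)
            tmp.rename(local_dir / name)
```

Run on a schedule, a call such as `pull("ftp.example.org", "/pub/models", "/data/incoming")` would fetch only files that have not been downloaded yet; all three arguments here are made up for illustration.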

Level-III NIDS Archive
❖ NCDC maintains an archive of the WSR-88D radar network’s products from 1995 to present (>10 TB)
❖ Datasets are ordered from a tape-based archive
❖ It took two years to acquire using a set of PHP scripts
❖ It was easier to acquire the entire archive than to figure out what subset to acquire
❖ We already had a NIDS parser for visualization

FEMA Flood Maps
❖ Data acquisition method: one DVD per state
❖ Format: ESRI shapefiles (one shapefile per feature class per state)
❖ Data display: split state shapefiles by county, then pre-render tiles for moderate to coarse zoom levels on a map mashup
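To give a sense of how much pre-rendering the tile step implies, the sketch below enumerates the map tiles that cover a bounding box at a given zoom level. It assumes the common Web Mercator XYZ ("slippy map") tiling scheme, which the slides do not specify, and the county bounding box in the usage note is a made-up example.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Convert lon/lat in degrees to XYZ tile indices (Web Mercator, assumed scheme)."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern or southern edge stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(west, south, east, north, zoom):
    """Yield (zoom, x, y) for every tile intersecting the bounding box."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # tile y grows southward
    x_max, y_max = lonlat_to_tile(east, south, zoom)
    for x in range(x_min, x_max + 1):
        for y in range(y_min, y_max + 1):
            yield zoom, x, y
```

For a hypothetical county box such as `(-84.6, 33.5, -84.2, 33.9)`, calling `sum(1 for _ in tiles_for_bbox(-84.6, 33.5, -84.2, 33.9, zoom))` for each zoom level shows how quickly the tile count grows, which is one reason to pre-render only the moderate to coarse levels.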

Suggestions
❖ Data in a difficult or proprietary format just wastes disk space
❖ Please use data formats that are well supported by open-source software packages (e.g., GDAL/OGR)
❖ Examples: netCDF, TIFF, ESRI shapefiles, HDF5, GeoJSON
❖ Instead of complex CSV or fixed-width text files, use self-describing formats (JSON, XML, SQLite)
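As an illustration of the last suggestion, the sketch below converts fixed-width text records into JSON, where every value carries its field name. The column layout, field names, and sample record are hypothetical, chosen only for this example.

```python
import json

# Hypothetical fixed-width layout: (field name, start column, end column, type).
LAYOUT = [
    ("station", 0, 5, str),
    ("date", 5, 13, str),
    ("temp_c", 13, 19, float),
    ("precip_mm", 19, 25, float),
]


def parse_fixed_width(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict keyed by field name."""
    record = {}
    for name, start, end, cast in layout:
        record[name] = cast(line[start:end].strip())
    return record


def to_json(lines):
    """Convert fixed-width records to a JSON array of self-describing objects."""
    records = [parse_fixed_width(line) for line in lines if line.strip()]
    return json.dumps(records, indent=2)
```

A reader of the fixed-width file needs the layout table to interpret it; a reader of the JSON output does not, which is the point of a self-describing format.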

Suggestions (cont.)
❖ Data and navigation files should use the same naming conventions and sequences
❖ Don’t use overly large archive files
❖ Data pools and FTP servers attached to large disk arrays are awesome data providers (as long as limits are in place)
❖ For really large, static datasets (>10 GB), BitTorrent would be really useful

Questions/Comments/Answers?