Experiences of an Earth Science Data User: Confessions of a Data Hoarder. Rob Carver, The Weather Company.


1 Experiences of an Earth Science Data User: Confessions of a Data Hoarder. Rob Carver, The Weather Company

2 “Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.” (Andrew S. Tanenbaum)

3 Open Data and The Weather Company ❖ Our business model is to take open data and use it to tell interesting stories that engage our users. ❖ Over the years, we’ve archived over 100 TB of data ❖ Formats: GRIB1, GRIB2, NIDS, shapefiles, netCDF, HDF5 ❖ Sources: NWS/NCEP, NCDC, FEMA, Census Bureau, NASA DAACs

4 Locating Data 1. Google and literature searches 2. ??? 3. Data!

5 100+ TB of Weather Models ❖ Most data arrives through Unidata’s LDM and FTP pull scripts. ECMWF pushes data to our FTP site. (All GRIB1/GRIB2) ❖ Data is ingested into the forecast system, and GrADS handles the model visualization ❖ Archived to local disk arrays and Amazon S3

6 Level-III NIDS Archive ❖ NCDC maintains an archive of the WSR-88D radar network’s products from 1995 to present (>10 TB) ❖ Datasets must be ordered from a tape-based archive ❖ It took two years to acquire the archive using a set of PHP scripts ❖ It was easier to acquire the entire archive than to figure out which subset to acquire ❖ We already had a NIDS parser for visualization
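
For a sense of scale, the figures on this slide imply a very low average transfer rate. The following is a rough back-of-the-envelope sketch, assuming the archive is exactly 10 TB (decimal, 10^12 bytes per TB) and that acquisition ran continuously for two 365-day years. Both numbers simplify the slide's ">10" and "two years", so treat the result as an order-of-magnitude estimate only.

```python
# Rough effective throughput of the NIDS archive acquisition.
# Assumptions (not stated on the slide): 10 TB means 10 * 10**12 bytes,
# and "two years" means 2 * 365 days of continuous transfer.

ARCHIVE_BYTES = 10 * 10**12
SECONDS = 2 * 365 * 24 * 60 * 60


def effective_mbit_per_s(total_bytes: int, seconds: int) -> float:
    """Average rate in megabits per second (1 Mbit = 10**6 bits)."""
    return total_bytes * 8 / seconds / 1e6


rate = effective_mbit_per_s(ARCHIVE_BYTES, SECONDS)
print(f"{rate:.2f} Mbit/s")  # about 1.27 Mbit/s under these assumptions
```

Under these assumptions the average works out to roughly 1.27 Mbit/s, which is the kind of rate the station-wagon quote on slide 2 is poking at.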

7 FEMA Flood Maps ❖ Data Acquisition Method: DVD for each state ❖ Format: ESRI Shapefiles (1 shapefile of a feature class per state) ❖ Data Display: Split state shapefiles by county and then pre-render tiles for moderate to coarse zoom levels on a map mashup.
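
The slide does not say which tiling scheme or zoom levels were used. As a minimal sketch, assuming the common Web Mercator XYZ ("slippy map") scheme, the tiles to pre-render for one county could be enumerated from its bounding box as below. The bounding box and the zoom range 6 to 10 are hypothetical values chosen for illustration.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple[int, int]:
    """Return the (x, y) index of the XYZ tile containing a lon/lat point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the eastern or southern edge stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(west, south, east, north, zoom):
    """List every (zoom, x, y) tile that overlaps a lon/lat bounding box."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left corner
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right corner
    return [(zoom, x, y)
            for x in range(x_min, x_max + 1)
            for y in range(y_min, y_max + 1)]


# Hypothetical county bounding box (west, south, east, north), in degrees.
county_bbox = (-84.6, 33.6, -84.2, 34.0)
for z in range(6, 11):
    print(z, len(tiles_for_bbox(*county_bbox, z)))
```

Because the tile count roughly quadruples with each zoom level, limiting pre-rendering to moderate and coarse zooms keeps the tile set small.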

8 Suggestions ❖ Data in a difficult or proprietary format just wastes disk space ❖ Please use data formats that are well supported by open-source software packages (e.g., GDAL/OGR) ❖ netCDF, TIFF, ESRI shapefiles, HDF5, GeoJSON ❖ Instead of complex CSV or fixed-width text files, use self-describing formats (JSON, XML, SQLite)
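
To illustrate the last suggestion, here is a minimal sketch that converts fixed-width text records into JSON and SQLite using only the Python standard library. The column layout, the table name `obs`, and the sample values are all made up for the example; they do not come from any dataset named in the slides.

```python
import json
import sqlite3

# Hypothetical fixed-width layout: (column name, start, end) character offsets.
LAYOUT = [("station", 0, 4), ("date", 5, 13), ("temp_c", 14, 19)]

# Made-up sample records in that layout.
RAW_LINES = [
    "KATL 20130601  27.5",
    "KORD 20130601  21.0",
]


def parse_line(line: str) -> dict:
    """Slice one fixed-width record into a dict keyed by column name."""
    record = {name: line[start:end].strip() for name, start, end in LAYOUT}
    record["temp_c"] = float(record["temp_c"])
    return record


records = [parse_line(line) for line in RAW_LINES]

# JSON: field names travel with every record.
print(json.dumps(records, indent=2))

# SQLite: the schema (column names and types) is stored in the database itself.
conn = sqlite3.connect(":memory:")  # use a file path for a real archive
conn.execute("CREATE TABLE obs (station TEXT, date TEXT, temp_c REAL)")
conn.executemany("INSERT INTO obs VALUES (:station, :date, :temp_c)", records)
rows = conn.execute(
    "SELECT station, temp_c FROM obs ORDER BY station"
).fetchall()
print(rows)
```

With the fixed-width file, a reader needs the external `LAYOUT` table to make sense of the bytes; with the JSON or SQLite output, the column names are recoverable from the data itself.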

9 Suggestions (cont.) ❖ Data and navigation files should use the same naming conventions and sequences ❖ Don’t use overly large archive files ❖ Data pools and FTP servers attached to large disk arrays are awesome data providers (as long as limits are in place) ❖ For really large, static datasets (>10 GB), BitTorrent would be really useful

10 Questions/Comments/Answers? ❖ rob.carver@weather.com

