Types of Spatial Models
Purely statistical: driven by statistical models of correlations and structures inferred from data
Purely mechanistic: driven by equations whose parameters and functional forms are taken from prior process studies or first principles
– Regression & machine-learning models that link data, environmental process variables, and statistical models of process/observation error. Examples: GLM, GAM, GLMM, RandomForest, MaxEnt, Geostatistics (Regression-Kriging)
– Coupled Dynamic Physical-Biological Models. Examples: ROMS-NPZD, Atlantis, HAB forecast models, spawning habitat models
– Descriptive spatial and spatio-temporal statistics. Examples: Geostatistics (Variogram analysis), Point-pattern analysis, Moran’s I, Spectral/Wavelet analysis, etc.
Overview of spatial statistics tools for pelagic habitat characterization
Spatial statistics and spatial modeling form a vast and rapidly developing field; it would be impossible to cover everything here.
What specific types of problems do we have to address for pelagic habitat characterization?
– Generating gap-free gridded maps from scattered survey data
– Filling gaps in satellite data
– Characterizing scales of environmental and biological correlation and coherence
– Predictive modeling of species distribution and abundance and/or ecosystem properties
Spatial / Spatio-temporal
Not covered here…
– Predicting/simulating coupled physical-biological processes
– Assimilating data into hindcast ocean models (e.g., 4DVAR)
Some handy techniques for marine/ecological spatial modeling
Descriptive analysis of pattern
– Variography (auto-correlation, cross-correlation)
– Point-pattern techniques (e.g., detecting clusters)
– Wavelet analysis (scale-dependent coupling)
– Empirical Orthogonal Function (EOF) analysis
Interpolation
– Geostatistical models (Optimal Interpolation): Ordinary Kriging; Indicator Kriging; Universal Kriging, Kriging with external drift
Modeling distribution and abundance
– Spatial Generalized Linear Models (GLMs)
– Spatial Generalized Additive Models (GAMs)
– Spatial Generalized Linear Mixed Models (GLMMs)
– Geostatistics: Regression-Kriging, Universal Kriging, Kriging with external drift
– ‘Machine learning’ techniques: Regression Trees (e.g., TreeNet, RandomForest), MaxEnt (for presence-only data), Neural nets
– Hierarchical Bayesian Spatial Models (Markov Random Fields, CAR models)
Challenges to ocean habitat characterization
– Ocean is dynamic
– Multiscale
– Complex interactions
– Coupling to human systems
– Threshold behavior; need to identify and validate indicators
Example: geostatistical interpolation
[Figure panels: Predicted Mean; Prediction Error]
Method: Ordinary Kriging with external drift
Source: Poti et al. (Ch. 3 in NOAA NOS Tech Memo 141), A Biogeographic Assessment of Seabirds, Deep-Sea Corals and Ocean Habitats of the New York Bight: Science to Support Offshore Spatial Planning
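The two mapped quantities in this example (a predicted mean and a prediction error) both come out of the same kriging system. The following is a minimal sketch of plain ordinary kriging at a single target location, without the external-drift term named on the slide. The exponential variogram, its sill and range, the station coordinates, and the values are all made-up assumptions for illustration; they are not taken from Poti et al. The sketch returns the kriging variance, whose square root is the kriging standard error.

```python
import math


def exp_variogram(h, sill=1.0, range_=2.0):
    """Exponential variogram with zero nugget (assumed parameters, not fitted)."""
    return sill * (1.0 - math.exp(-h / range_))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ordinary_krige(coords, values, target, variogram=exp_variogram):
    """Return (predicted mean, kriging variance) at one target location."""
    n = len(coords)
    # Left-hand side: variogram between data points, bordered by the
    # unbiasedness constraint (weights sum to 1) and a Lagrange multiplier.
    a = [[variogram(math.dist(coords[i], coords[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Right-hand side: variogram between each data point and the target.
    b = [variogram(math.dist(c, target)) for c in coords] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]
    mean = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return mean, variance


# Hypothetical stations on the corners of a unit square, with made-up values.
coords = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
values = [1.0, 2.0, 3.0, 4.0]
mean, variance = ordinary_krige(coords, values, (0.5, 0.5))
print(f"predicted mean = {mean:.3f}, kriging variance = {variance:.3f}")
```

Looping ordinary_krige over every cell of a grid would give the two map layers. In practice the variogram would be fitted to the data, and one of the packages listed in the software overview would normally be used instead of a hand-written solver.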
Example: Quantifying hierarchical habitat structures with variography
Textbook spatial variogram
– Results from a single, spatially varying process
– Fits a theoretical model well
– Informative for understanding spatial scaling and sampling precision
Actual spatial variogram for sea scallop density on Georges Bank from HabCam data (50 m resolution)
– Evidence of spatial patchiness
– Evidence of hierarchical habitat structures
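For readers unfamiliar with how a variogram is estimated, here is a minimal sketch of the classical (Matheron) empirical semivariogram, gamma(h) = sum((z_i - z_j)^2) / (2 N(h)), computed over distance bins. The transect coordinates and density values are made up for illustration and have nothing to do with the HabCam data; the function name and its parameters are likewise hypothetical.

```python
import math
from collections import defaultdict


def empirical_semivariogram(coords, values, bin_width, max_lag):
    """Classical (Matheron) estimator of the semivariogram.

    Returns a list of (mean_lag_distance, semivariance, pair_count), one entry
    per non-empty distance bin, ordered by increasing lag.
    """
    sq_diff = defaultdict(float)   # sum of squared differences per bin
    dist_sum = defaultdict(float)  # sum of pair distances per bin
    pairs = defaultdict(int)       # number of point pairs per bin
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            h = math.dist(coords[i], coords[j])
            if h > max_lag:
                continue
            b = int(h // bin_width)
            sq_diff[b] += (values[i] - values[j]) ** 2
            dist_sum[b] += h
            pairs[b] += 1
    return [(dist_sum[b] / pairs[b], sq_diff[b] / (2 * pairs[b]), pairs[b])
            for b in sorted(pairs)]


# Hypothetical transect: four stations 1 km apart with made-up densities.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
variogram = empirical_semivariogram(coords, values, bin_width=1.0, max_lag=3.0)
for lag, gamma, n in variogram:
    print(f"lag={lag:.1f}  gamma={gamma:.2f}  pairs={n}")
```

A textbook variogram rises smoothly to a single sill. Hierarchical or nested structure would instead show up as more than one plateau or break in slope when gamma is plotted against lag.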
Parallel evidence for hierarchical habitat structure from scallop fishing behavior (VMS data).
[Figure panels: HabCam; VMS Georges Bank; VMS Mid Atlantic]
Example: Delineating regions with distinct phytoplankton dynamics
Methodology:
– Merge SeaWiFS and MODIS Chl a datasets for 1998–2010
– Transform, scale, and center data for each pixel / day of the year
– Perform Empirical Orthogonal Function (EOF) analysis
– Run cluster analysis on EOF scores to delineate regions
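The last three steps of this methodology can be sketched on a toy data set. This is an illustration under stated assumptions, not the analysis behind the slide: the six "pixels" and their monthly (not daily) chlorophyll values are synthetic, the transform is assumed to be a logarithm, only the leading EOF is extracted (by power iteration rather than a full decomposition), and the cluster analysis is a simple two-group k-means on the EOF scores. All function names are hypothetical.

```python
import math
import random


def standardize(series):
    """Center and scale one pixel's time series to zero mean, unit variance."""
    mean = sum(series) / len(series)
    sd = math.sqrt(sum((v - mean) ** 2 for v in series) / len(series))
    return [(v - mean) / sd for v in series]


def leading_eof(x, iters=200, seed=0):
    """Leading EOF (unit vector over time) and per-pixel scores.

    x has one row per pixel and one column per time step. Power iteration on
    X^T X is used here only to keep the sketch dependency-free.
    """
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in x[0]]
    for _ in range(iters):
        scores = [sum(a * b for a, b in zip(row, v)) for row in x]
        v = [sum(s * row[t] for s, row in zip(scores, x)) for t in range(len(v))]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
    scores = [sum(a * b for a, b in zip(row, v)) for row in x]
    return v, scores


def two_means(scores, iters=50):
    """Two-group k-means on one-dimensional scores; returns a 0/1 label per pixel."""
    centers = [min(scores), max(scores)]
    labels = [0] * len(scores)
    for _ in range(iters):
        labels = [0 if abs(s - centers[0]) <= abs(s - centers[1]) else 1
                  for s in scores]
        for k in (0, 1):
            members = [s for s, lab in zip(scores, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels


# Synthetic chlorophyll: pixels 0-2 peak in month 3, pixels 3-5 peak in month 9.
rng = random.Random(1)
chl = []
for pixel in range(6):
    peak = 3 if pixel < 3 else 9
    chl.append([math.exp(math.cos(2 * math.pi * (m - peak) / 12) + rng.gauss(0, 0.1))
                for m in range(12)])

# Transform (log, an assumption), then scale and center each pixel's series.
x = [standardize([math.log(c) for c in row]) for row in chl]
eof, scores = leading_eof(x)
labels = two_means(scores)
print(labels)
```

Pixels that share a seasonal cycle get similar scores on the leading EOF and so fall into the same cluster; the label numbers themselves are arbitrary because the sign of an EOF is arbitrary. A real analysis would retain several EOFs and use a general clustering routine.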
Example: Juvenile salmon habitat in the NCC
H0: Alongshore transport links PDO to regional ocean conditions
Results
– Cold phase: more water from north, more cold-water copepods, and more habitat
– Bi et al. (2011) GRL
– Habitat based on presence: Bi et al. MEPS (2007), Bi et al. FO (2008)
– Habitat with spatial structure: GLMMs (Bi et al. FO 2011), GAM (Yu et al. in revision)
Software overview (not a comprehensive list)
R packages
– Gstat
– RandomFields
– others…see CRAN spatial task view
Matlab toolboxes
– mGstat
– EasyKrig
– BMElib
– Wavelets
– others…Google search recommended
ArcGIS extensions/toolboxes
– Geostatistical Analyst, Spatial Analyst, Spatial Statistics extensions
Marine Geospatial Ecology Tools (MGET)
– Hybrid of R, ArcGIS, Python
Free standalone programs
– SGeMS
– Gstat
– GSLIB
– Many others…see AI-geostats list on next page
A few links for more information
– AI Geostats: Forum that has compiled lists of free and commercial geostatistical software with detailed capabilities; excellent resource
– CRAN Task View: Analysis of Spatial Data
– ESRI Geostatistical Analyst
– Marine Geospatial Ecology Tools (MGET)