Who will you trust? Field technicians? Software programmers? Statisticians? Instructors? GIS technicians? Other researchers? Yourself?

Regression (Correlation) Modeling. Creates a model in N-dimensional “hyper-space”, defined by: covariates; response variables; mathematics used to create the model; statistics used to optimize parameters; options for model evaluation; predictor variables.

Multiple Linear Regression  

Linear Regression: 2 Predictors Mathworks.com
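A two-predictor linear regression fits a plane, y = b0 + b1*x1 + b2*x2, through the sample points. The sketch below is one minimal way to do that with ordinary least squares in plain Python. The function name `fit_two_predictor_regression` and the sample numbers are made up for illustration and do not come from the slides.

```python
def fit_two_predictor_regression(x1, x2, y):
    """Ordinary least squares fit of y = b0 + b1*x1 + b2*x2 (normal equations)."""
    rows = [(1.0, a, b) for a, b in zip(x1, x2)]
    # Augmented 3x4 matrix [X^T X | X^T y]
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("predictors are collinear; the fit is not unique")
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]


# Illustrative data generated from y = 2 + 0.5*x1 - 1.5*x2 (no noise)
x1 = [0.0, 1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0]
y = [2 + 0.5 * a - 1.5 * b for a, b in zip(x1, x2)]
b0, b1, b2 = fit_two_predictor_regression(x1, x2, y)
print(b0, b1, b2)
```

With noisy field data the recovered coefficients would only approximate the underlying relationship, which is where the model evaluation options come in.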

Non-Linear Regression

Regression Methods. Continuous regression: linear regression, Generalized Linear Models (GLMs), Generalized Additive Models (GAMs). Categorical regression (trees): regression trees, classification and regression trees (CART). Machine learning: Maximum Entropy (Maxent), NPMR, HEMI, BRTs, etc.

Brown Shrimp Size Add graph from work

Terminology. Plant uses: measured value and response variable; explanatory variable. I prefer: response variable (I’ll use “measured value” to identify measured values in field data); covariate, an explanatory variable used to build the model; predictor, an explanatory variable used to predict. The covariate and the predictor will be different in cases like predicting the effects of climate change in the future.

Douglas Fir Habitat Model. [Figure: habitat quality (axis labeled 1) versus precipitation in mm (axis labeled 1000).]

Predictor Model Prediction

Model Selection and Parameter Estimation Field Data Covariate Predictor Model Prediction

Model Selection and Parameter Estimation Field or Sample Data Covariate Predictor Model Model Validation Prediction

Douglas-Fir sample data Create the Model Model “Parameters” Precip Extract Prediction To Points Text File Attributes To Raster

Data. Response variable: from the field data (sample data). Covariates: from the field or remotely sensed. Predictors: typically remotely sensed; sample as covariates for training; can be different for predicting to new scenarios.

Response Variable What is the: Spatial uncertainty? Temporal uncertainty? Measurement uncertainty? Will it answer your question?

Covariate Variables What is the: Spatial uncertainty? Temporal uncertainty? Measurement uncertainty? How well does the collection time of the covariates match the field data? Do they co-vary with the phenomena? Do the covariates “correlate”?

Types of uncertainty: accuracy (bias); precision (repeatability); reliability (consistency of a set of measurements); resolution (fineness of detail); logical consistency (adherence to structural rules, attributes, and relationships); completeness. Plant uses “accuracy” a little differently.
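Accuracy (bias) and precision (repeatability) can be estimated separately when the same quantity is measured several times against a known reference. The sketch below assumes that setup. The function name `accuracy_and_precision` and the readings are hypothetical.

```python
import statistics


def accuracy_and_precision(measurements, reference):
    """Return (bias, spread) for repeated measurements of one known quantity.

    bias   = mean measurement minus the reference value (accuracy)
    spread = sample standard deviation of the measurements (precision)
    """
    bias = statistics.mean(measurements) - reference
    spread = statistics.stdev(measurements)
    return bias, spread


# Illustrative readings of a 10.0-unit reference: tightly grouped but offset
readings = [10.2, 10.1, 10.3, 10.2, 10.2]
bias, spread = accuracy_and_precision(readings, 10.0)
print(f"bias={bias:.3f}, spread={spread:.3f}")
```

A small spread with a large bias describes an instrument that is precise but not accurate, which is the kind of systematic offset that can sometimes be corrected.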

Types of Errors. Gross errors: transcription, sinks in DEMs. Random errors: estimated using probability theory. Systematic errors: “drift” in instruments, dropped lines in Landsat. All of these types of uncertainty can be compensated for in some cases.

Gross Errors. Lat/Lon: reversed; 0, names, dates, etc. Dates: extended in databases. Measurements: inconsistent units, inconsistent protocols. What can you expect from a field team?
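Some gross coordinate errors can be screened for automatically. The sketch below flags zero coordinates and out-of-range values, and guesses at a lat/lon swap when the "latitude" only makes sense as a longitude. The function name `screen_coordinate` and the flag wording are hypothetical. A swap where both values fall within +/-90 cannot be caught this way, and a (0, 0) flag only means the record deserves a look.

```python
def screen_coordinate(lat, lon):
    """Return a list of gross-error flags for one latitude/longitude pair."""
    flags = []
    if lat == 0 and lon == 0:
        # Often a placeholder for a missing position rather than a real one
        flags.append("zero coordinate")
    if not -90 <= lat <= 90:
        # A "latitude" beyond +/-90 that is a valid longitude suggests a swap
        if -180 <= lat <= 180 and -90 <= lon <= 90:
            flags.append("latitude out of range; lat/lon possibly reversed")
        else:
            flags.append("latitude out of range")
    if not -180 <= lon <= 180:
        flags.append("longitude out of range")
    return flags


print(screen_coordinate(38.5, -121.7))   # plausible pair, no flags
print(screen_coordinate(-121.7, 38.5))   # the same pair entered in reverse
```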

Occurrences of Polar Bears From The Global Biodiversity Information Facility (www.gbif.org, 2011)

Systematic Errors Landsat Scan line Error

Response Variable Qualification Tools Maps (various resolutions) Examine the data values: How many digits? Repeating patterns, gross errors? “Documentation” Measurements: Occurrences? Binary: Histogram Categorical: Histogram Continuous: Histogram

What’s the Impact on Models?

Significant Digits How many digits to represent 1 meter? Geographic: Lat/Lon? UTM: Eastings/Northings?

Significant Digits. Geographic: 1 digit = 1 degree; 1 degree ~ 110 km; 0.00001 degree ~ 1.1 meters. UTM: 1 digit = 1 meter.
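The arithmetic on this slide can be sketched as a small calculation. It takes the slide's figure of roughly 110 km per degree as given. That figure is an approximation, and a degree of longitude covers less ground away from the equator, so treat the results as rough. The function names are hypothetical.

```python
import math

METERS_PER_DEGREE = 110_000  # the slide's approximation: 1 degree ~ 110 km


def ground_distance_m(decimal_places):
    """Approximate ground distance represented by the last decimal place."""
    return METERS_PER_DEGREE * 10.0 ** -decimal_places


def decimal_places_needed(resolution_m):
    """Fewest decimal places whose last digit is no coarser than resolution_m."""
    return max(0, math.ceil(math.log10(METERS_PER_DEGREE / resolution_m)))


for places in range(7):
    print(places, round(ground_distance_m(places), 3), "m")
print(decimal_places_needed(1.0))
```

Five decimal places give about 1.1 meters, so resolving a full meter or finer takes a sixth.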

Covariate Qualification Maps Documentation Examine the data: How many digits? Integer or floating point? Repeating patterns? Histograms

CONUS Annual Precip.

Covariate Uncertainty

Min Temp of Coldest Month After applying a filter to the raster

Histograms hist(Temp,breaks=400)
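The call on this slide is R. As a rough plain-Python analogue, the sketch below counts values into equal-width bins, which may not match how `hist` picks its breaks. Using many more bins than there are distinct values exposes repeating patterns, such as a layer that was rounded to whole numbers. The function name `histogram_counts` and the sample values are made up for illustration.

```python
def histogram_counts(values, bins):
    """Count values into `bins` equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; nothing to bin")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value lands on the upper edge, so clamp it into the last bin
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


# Illustrative "temperature" layer that was rounded to whole degrees
temps = list(range(10)) * 3
counts = histogram_counts(temps, bins=40)
print(counts)
print("empty bins:", counts.count(0), "of", len(counts))
```

Regularly spaced empty bins like these are a hint that the data carry fewer significant digits than the file format suggests.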

Covariate Correlation. Correlation plots. Pearson product-moment correlation coefficient. Spearman’s rho (non-parametric correlation coefficient).
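The two coefficients named here can be sketched in a few lines. Pearson measures linear association on the raw values. Spearman’s rho is the Pearson coefficient computed on ranks, so it responds to any monotonic relationship. The function names and sample values below are illustrative only.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Illustrative covariates with a monotonic but non-linear relationship
elevation = [1.0, 2.0, 3.0, 4.0, 5.0]
precip = [1.0, 4.0, 9.0, 16.0, 25.0]
print(pearson(elevation, precip), spearman(elevation, precip))
```

Here the rank-based coefficient reaches 1 while Pearson stays below it, because the relationship is monotonic but curved.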

Correlation plots

California Correlations

California Predictors

Response vs. Covariates. For occurrences: histogram covariates at occurrences vs. overall covariates. For binary data: histogram covariates for each value. For categorical data: histogram covariates for each value, or scatter plots. For continuous data: scatter plots.

Covariate Occurrence Histograms Precipitation with Douglas-Fir Occurrences

Douglas Fir Model in HEMI 2. Green: histogram of precipitation for all of California. Red: histogram of Douglas-Fir occurrences. Histograms are scaled to go from 0 to 1 (all values).
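The comparison on this slide can be approximated by binning the covariate twice over the same range, once for the whole study area and once at the occurrence points. The sketch below scales each histogram by its own tallest bin so both run from 0 to 1. That scaling rule is an assumption, as the slides do not say how HEMI 2 does it. The function name and all values are invented for illustration.

```python
def scaled_histogram(values, lo, hi, bins):
    """Equal-width histogram over [lo, hi], scaled so the tallest bin equals 1."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        if lo <= v <= hi:
            counts[min(int((v - lo) / width), bins - 1)] += 1
    peak = max(counts)
    return [c / peak if peak else 0.0 for c in counts]


# Invented precipitation values (mm): whole study area vs. occurrence points
background = [150, 300, 420, 560, 700, 880, 950, 1200, 1500, 1900, 2300, 2600]
occurrences = [900, 1100, 1250, 1400, 1600, 1750]

lo, hi, bins = 0, 3000, 6
print("all cells:  ", scaled_histogram(background, lo, hi, bins))
print("occurrences:", scaled_histogram(occurrences, lo, hi, bins))
```

Where the occurrence histogram is high and the background histogram is not, the species is using that part of the covariate range more than its availability alone would suggest.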

Doug-Fir Height vs. Precip.

Douglas Fir Height After gridding to coarse grid cells

Terrestrial Predictors. Elevation: slope, aspect, absolute aspect. Distance to: roads, streams (streamline). Climate: precip, temp. Soil type. RS: Landsat, MODIS, NDVI, etc.

Marine Predictors: temp, DO2, salinity, depth, rugosity (roughness), current (at depths), wind.

More Complicated: associated species, trophic levels, temporal, cyclical.

Predictor Layers. Means, mins, maxes. Range of values. Heterogeneity. Spatial layers: distance to… Topography: elevation, slope, aspect.

Field Data and Predictors. As close to field measurements as possible. Clean and aggregate data as needed. Document as you go. Estimate overall uncertainty. Answer the question: what spatial, temporal, and measurement scales are appropriate to model at, given the data?

Temporal Issues. Divide data into months, seasons, years, decades. Keep the divisions consistent between predictors and response. Extract predictors as close to sample locations and dates as possible. Use the “best” predictor layers.

Additional Slides

Dimensions of uncertainty: space, time, attribute, scale, relationships.

Basic Tools. Histograms: what is the distribution of occurrences of values (range and shape)? Scattergrams: what is the relationship between response and predictor variables, and between predictor variables? QQ plots: are the residuals normally distributed?
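A QQ plot pairs each sorted, standardized residual with the quantile a normal distribution would put at the same position. Points falling near the 1:1 line suggest roughly normal residuals. The sketch below computes those pairs without plotting them. It uses (i + 0.5) / n as the plotting position, which is one common convention among several. The function name `qq_points` and the residual values are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def qq_points(residuals):
    """Return (theoretical normal quantile, standardized residual) pairs."""
    n = len(residuals)
    center, spread = mean(residuals), stdev(residuals)
    observed = sorted((r - center) / spread for r in residuals)
    standard_normal = NormalDist()
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, observed))


# Invented model residuals
residuals = [-1.8, -0.9, -0.4, -0.1, 0.0, 0.2, 0.5, 0.8, 1.7]
for expected, observed in qq_points(residuals):
    print(f"{expected:6.2f}  {observed:6.2f}")
```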

Types of Data. “God does not play dice” (Einstein). “The end of certainty” (Prigogine, 1977 Nobel Prize). What remains is: quantifiable probability with uncertainty.

Uncertainty Factors. Inherent uncertainty in the world. Limitation of human cognition. Limitation of measurement. Uncertainty in processing and analysis.