Jensen et al. 2005. Winter distribution of blue crab Callinectes sapidus in Chesapeake Bay: application and cross-validation of a two-stage generalized additive model.

Presentation transcript:

Jensen et al. 2005. Winter distribution of blue crab Callinectes sapidus in Chesapeake Bay: application and cross-validation of a two-stage generalized additive model

We present a 2-stage generalized additive model (GAM) of the distribution of mature female blue crab Callinectes sapidus in Chesapeake …The distribution and abundance of blue crabs was modeled …Depth, salinity, temperature, and distance from the Bay mouth were found to be the most important environmental determinants of mature female blue crab distributions. The response curves for these variables displayed patterns that are consistent with laboratory and field studies of blue crab/habitat relationships. The generality of the habitat models was assessed using intra- and inter-annual cross-validation. Although the models generally performed well in cross-validation, some years showed unique habitat relationships that were not well predicted by models from other years. Such variability may be overlooked in habitat suitability models derived from data collected over short time periods.

Sample Data: non-commercial dredges; December to March; 1989-1990; sampling changed somewhat; 1255-1599 stratified random stations; 3 regions; >15 mm carapace.

Predictors: depth; salinity; water temperature; distance from Bay mouth; distance to submerged aquatic vegetation; slope. Bottom type not available.

Blue Crab Distribution Model

Predictors: bottom temp and salinity (99 to 123 sites per year; kriging); distance to Bay mouth (lowest-cost path from 250 m raster map); slope (30 m bathymetry).

Modeling: 75% training, 25% test. Stage I: presence of females. Logistic: binomial error distribution, logit link. D = depth, M = distance to mouth, V = distance to veg, S = salinity, B = bottom slope, T = temp. Stage II: female density, sites >= 1 crab. Abundance:
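The two-stage structure on this slide can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the station data are synthetic, a single predictor (depth) stands in for the full set, and plain linear terms replace the GAM smooths used in the paper. The function names `split_train_test`, `fit_logistic`, and `fit_line` are hypothetical helpers, not anything from the original analysis.

```python
import math
import random

def split_train_test(rows, train_frac=0.75, seed=0):
    """Shuffle the rows and split them into training and test subsets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = round(train_frac * len(rows))
    return rows[:cut], rows[cut:]

def fit_logistic(xs, ys, lr=0.1, steps=2000):
    """Stage I stand-in: logit(p) = b0 + b1*x, fit by gradient ascent
    on the binomial log-likelihood (linear term, not a smooth)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p
            g1 += (y - p) * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

def fit_line(xs, ys):
    """Stage II stand-in: ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

# Synthetic (hypothetical) stations: (depth in tens of metres, crabs per 1000 m^2).
rng = random.Random(42)
stations = []
for _ in range(200):
    depth = rng.uniform(0.1, 4.0)
    p_true = 1.0 / (1.0 + math.exp(-(-2.0 + 1.5 * depth)))
    present = rng.random() < p_true
    density = math.exp(1.0 + 0.3 * depth + rng.gauss(0.0, 0.3)) if present else 0.0
    stations.append((depth, density))

train, test = split_train_test(stations)  # 75% training, 25% test

# Stage I: presence/absence at every training station.
b0, b1 = fit_logistic([d for d, _ in train],
                      [1 if y > 0 else 0 for _, y in train])

# Stage II: log density, only at stations where crabs were present.
occupied = [(d, y) for d, y in train if y > 0]
a, b = fit_line([d for d, _ in occupied],
                [math.log(y) for _, y in occupied])

print(f"Stage I:  logit(p) = {b0:.2f} + {b1:.2f} * depth")
print(f"Stage II: log(density) = {a:.2f} + {b:.2f} * depth")
```

Splitting the problem this way lets the many zero catches inform only the presence model, while the density model sees only the occupied stations.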

Predictor Correlation

Callinectes sapidus. Model selection results for (a) Stage I (presence/absence) and (b) Stage II (abundance) GAMs. Significance test p-values are given for the explanatory variables distance from Bay mouth, salinity, depth, temperature, bottom slope, distance from SAV, and interaction terms. Terms that were not significant (ns, p > 0.05) were dropped from the model unless they were involved in a significant interaction (I). Degrees of freedom (df) were fixed for terms in bold. The adjusted R2 and percent of deviance explained are also given for each model

Distance From Mouth
Fig. 2. Callinectes sapidus. Conditional regression spline smooths of distance from Bay mouth (km) on the probability of mature female blue crab presence for (a) 1996, (b) 1999, and (c) 2000 and density (ind. 1000 m⁻²) given presence for (d–i) 1991 to 1996, (j) 1998, (k) 1999, and (l) 2001. Smooths are shown only for those years in which the variables were significant (p < 0.05) and not included in an interaction term. The y-axis is the normalized effect of the variable; rugplot on the

Blue Crab vs. Salinity
Fig. 3. Callinectes sapidus. Conditional regression spline smooths of salinity (ppt) on the probability of mature female blue crab presence for (a) 1990, (b) 1992, (c) 1997, and (d) 2002 and density (ind. 1000 m⁻²) given presence for (e) 1998 and (f) 2000. Smooths are shown only for those years in which the variables were significant (p < 0.05) and not included in an interaction term. The y-axis is the normalized effect of the variable; rugplot on the x-axis represents the number of observations; dashed lines are the ±2 standard error confidence belts.
Jensen et al. 2005, Winter distribution of blue crab Callinectes sapidus in Chesapeake Bay: application and cross-validation of a two-stage generalized additive model

Depth

Temp

Skip this as we have not talked about AUC yet.
Table 3. Callinectes sapidus. Evaluation of Stage I (presence/absence) model fits to the training data (a) using receiver operating characteristic (ROC) curves and cross-validation of Stage I models (b). Values in (a) represent the area under the ROC curve (AUC), the critical p-values p_optimum (p_opt) and p_fair, and their sensitivity (Sens.), specificity (Spec.), and percent correct predictions (% Corr.). Values in (b) represent the AUC where models developed with data from one year (columns) are applied to data from another (rows). Values on the diagonal represent intra-annual cross-validation where models developed using a training data subset are applied to the test data subset for the same year. AUC values >0.7 are shaded.
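For readers who want a concrete picture of the quantities named in this caption, here is a minimal sketch. It computes AUC as the fraction of presence/absence pairs in which the presence site gets the higher predicted probability (ties count one half), and sensitivity, specificity, and percent correct at a single cutoff. The labels, scores, and the 0.5 cutoff are made-up illustration values; how the paper chose its critical p-values is not covered here.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison:
    share of (presence, absence) pairs ranked correctly, ties = 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def threshold_metrics(labels, scores, cutoff):
    """Sensitivity, specificity, and percent correct when a site is
    predicted 'present' if its score is at or above the cutoff."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < cutoff)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cutoff)
    sensitivity = tp / (tp + fn)   # observed presences predicted present
    specificity = tn / (tn + fp)   # observed absences predicted absent
    pct_correct = 100.0 * (tp + tn) / len(labels)
    return sensitivity, specificity, pct_correct

# Hypothetical observed presence (1) / absence (0) and predicted probabilities.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

model_auc = auc(labels, scores)
sens, spec, pct = threshold_metrics(labels, scores, cutoff=0.5)
print(model_auc, sens, spec, pct)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the table flags values above 0.7.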

Table 4. Callinectes sapidus. Cross-validation where models developed with data from one year (columns) are applied to data from another (rows). Values in (a) represent the cross-validation r-squared. Values on the diagonal (in bold) represent intra-annual cross-validation where models developed using a training data subset are applied to the test data subset for the same year. The first row of (a) represents the model fit to the training data. Values in (b) represent the z-score, i.e. the number of standard deviations above (shaded) or below the grand mean Fisher (1915) transformed cross-validation correlation coefficient.
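The z-scores in part (b) can be illustrated with a short sketch. It applies Fisher's transformation, z = atanh(r), to each cross-validation correlation and then expresses each transformed value in standard deviations from the grand mean. The correlation values are invented, and the use of the sample standard deviation is an assumption; the caption does not say which form was used.

```python
import math
import statistics

def fisher_z(r):
    """Fisher (1915) transformation: 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)

def z_scores(correlations):
    """Standard deviations above or below the grand mean of the
    Fisher-transformed correlations (sample SD is an assumption)."""
    zs = [fisher_z(r) for r in correlations]
    grand_mean = statistics.mean(zs)
    sd = statistics.stdev(zs)
    return [(z - grand_mean) / sd for z in zs]

# Hypothetical cross-validation correlation coefficients.
rs = [0.2, 0.5, 0.8]
scores = z_scores(rs)
print([round(s, 2) for s in scores])
```

The transformation is used because raw correlations are bounded and skewed near ±1, while the transformed values are closer to normally distributed and so easier to compare across years.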

Fig. 6. Callinectes sapidus. Cross-validation plot showing observed vs. predicted log density (log # 1000 m⁻²) for the 1998 training data, the one-to-one line (solid) and the linear regression line (dashed) fit to all data, including zeros. Note: multiple observations are overlaid along the x-axis near the origin.
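A plot like this compares a fitted regression line with the one-to-one line. As a rough sketch, the code below regresses observed on predicted values and reports the intercept, slope, and r-squared; a slope of 1 and an intercept of 0 would mean the regression line coincides with the one-to-one line. The numbers are invented, and regressing observed on predicted (rather than the reverse) is an assumption about the figure's axes.

```python
def observed_vs_predicted(predicted, observed):
    """Least-squares line observed = a + b * predicted, plus r-squared."""
    n = len(predicted)
    mx = sum(predicted) / n
    my = sum(observed) / n
    sxx = sum((x - mx) ** 2 for x in predicted)
    syy = sum((y - my) ** 2 for y in observed)
    sxy = sum((x - mx) * (y - my) for x, y in zip(predicted, observed))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared

# Hypothetical predicted and observed log densities.
predicted = [0.0, 1.0, 2.0, 3.0, 4.0]
observed = [0.5, 1.0, 1.5, 2.0, 2.5]

intercept, slope, r2 = observed_vs_predicted(predicted, observed)
print(intercept, slope, r2)
```

In this toy case the r-squared is high even though the slope is well below 1, which shows why the figure draws both lines: a strong correlation alone does not mean predictions are unbiased.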

Additional Slides

“Results indicate little difference between the performance of GAM and BRT models.” The BRT was made with over 5,000 trees!
Performance measures: deviance explained, percent correctly classified (PCC), sensitivity (i.e., proportion of observed positives correctly predicted), specificity (proportion of observed negatives correctly predicted), and the area under the receiver operator characteristic curve (ROC, which gives a measure of the degree to which fitted values discriminate between observed values).
Best Possible Value: 100%; 1
Martinez-Rincon 2012, Comparative performance of generalized additive models and boosted regression trees for statistical modeling of incidental catch of wahoo (Acanthocybium solandri) in the Mexican tuna purse-seine fishery

Response Curves (partial) GAMs BRTs