Nonlinear Logistic Regression of Susceptibility to Windthrow. Seminar 7, Likelihood Methods in Forest Ecology, October 9th – 20th, 2006.


Analysis of windthrow data
- Traditionally: summarize variation in the degree and type of damage, across species and tree sizes, from the storm as a whole...
- A likelihood alternative: use the spatial variation in storm intensity that occurs within a given storm to estimate the parameters of functions that describe susceptibility to windthrow as a function of storm severity...

Categorizing Wind Damage
- BINARY: simple binary response variable (windthrown vs. not...)
- CATEGORICAL: multiple categories (uprooted, snapped,...)
- ORDINAL: ordinal categories (degree of damage): none, light, medium, heavy, complete canopy loss {usually estimated visually}
- CONTINUOUS: just what the term implies, but rarely used because of the difficulties of quantifying damage in these terms...

Analysis of BINARY Data: Traditional Logistic Regression
Consider a sample space consisting of two outcomes (A, B), where the probability that event A occurs is p.
Definition: the logit is the log of the odds, i.e. log[p/(1-p)].
Benefits of logits:
- A logit is a continuous variable
- It ranges from negative when p < 0.5 to positive when p > 0.5
Standard logistic regression involves fitting a linear function to the logit: log[p/(1-p)] = a + b·X
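The logit and its inverse can be sketched in a few lines of Python. The function names `logit` and `inv_logit` are illustrative choices, not taken from the slides.

```python
import math

def logit(p):
    """Log of the odds p / (1 - p); defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Inverse of the logit: maps any real number back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))
```

The sign of the logit flips at p = 0.5, which is why a linear function fitted on the logit scale always maps back to a probability between 0 and 1.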

What if your terms are multiplicative?
Example: assume that the probability of windthrow is a joint (multiplicative) function of
(1) storm severity, and
(2) tree size.
In addition, assume that the effect of DBH is nonlinear....
A model that incorporates these can be written as:

A little more detail....
- P_isj is the probability of windthrow of the j-th individual of species s in plot i
- DBH_isj is the DBH of that individual
- a_s, b_s, and c_s are species-specific, estimated parameters, and
- S_i is the estimated storm severity in plot i
  - NOTE: storm severity is an arbitrary index, and was allowed to range from 0-1
But don’t you have to measure storm severity (not estimate it)?
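As a rough sketch of how such a model is evaluated, the snippet below assumes the form logit(P_isj) = a_s + c_s · S_i · DBH_isj^b_s. This form is consistent with the parameter definitions above (a multiplicative severity-by-size term with a nonlinear DBH effect), but it is an assumption here because the equation is not reproduced in the text. The function name and any parameter values passed to it are hypothetical.

```python
import math

def windthrow_probability(dbh, severity, a, b, c):
    """Probability of windthrow for one tree.

    Assumed model form (not confirmed by the transcript):
        logit(P) = a + c * severity * dbh ** b
    dbh      -- diameter at breast height (cm)
    severity -- storm severity index for the tree's plot, in [0, 1]
    a, b, c  -- species-specific parameters
    """
    z = a + c * severity * dbh ** b
    return 1.0 / (1.0 + math.exp(-z))
```

Under this form, a plot with severity 0 gives every tree of a species the same baseline probability, set by a_s alone, and DBH only matters where the storm actually hit.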

Likelihood Function
It couldn’t be any easier... (since the scientific model is already expressed as a probabilistic equation):
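Because the model predicts a probability for each tree directly, the log-likelihood is the sum of log(P) over windthrown trees plus the sum of log(1 - P) over surviving trees. A minimal Python sketch, where the function and argument names are illustrative:

```python
import math

def log_likelihood(observed, predicted):
    """Bernoulli log-likelihood of binary outcomes.

    observed  -- sequence of 0/1 values (1 = windthrown)
    predicted -- sequence of model probabilities, each strictly in (0, 1)
    """
    total = 0.0
    for y, p in zip(observed, predicted):
        # A windthrown tree contributes log(p); a standing tree log(1 - p).
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total
```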

An alternative, compound likelihood function and scientific model
What if we assume that storm severity is not fixed within a plot (S_i), but rather varies for different trees within the plot? If we are willing to assume a constant variance across all plots, this only requires estimation of a single additional parameter (the variance: σ²)

Example: Windthrow in the Adirondacks
Highly variable damage due to:
- variation within storm
- topography
- susceptibility of species within a stand
Reference: Canham, C. D., Papaik, M. J., and Latty, E. F. Interspecific variation in susceptibility to windthrow as a function of tree size and storm severity for northern temperate tree species. Canadian Journal of Forest Research 31:1-10.

The dataset
- Study area: 15 x 6 km area perpendicular to the storm path
- 43 circular plots (19.95 m radius) censused in 1996 (20 of the 43 were in old-growth forests)
- The plots were chosen to span a wide range of apparent damage
- All trees > 10 cm DBH censused
- Tallied as windthrown if uprooted or if stem was < 45° from the ground

Critical data requirements
- Variation in storm severity across plots
- Variation in DBH and species mixture within plots

The analysis...
- 7 species comprised 97% of stems – only stems of those 7 species were included in the dataset for analysis
- # parameters = 64 (43 plots + 3 parameters for each of 7 species)
- Parameters estimated using simulated annealing
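To illustrate the estimation step, here is a minimal, generic simulated annealing sketch in Python. It is not the routine used in the original analysis: the proposal scheme (Gaussian perturbation of one parameter at a time), the geometric cooling schedule, and all default values are assumptions chosen for brevity.

```python
import math
import random

def anneal(log_lik, start, step=0.1, temp=1.0, cooling=0.995, iters=5000, seed=0):
    """Maximize log_lik(params) by simulated annealing.

    Returns (best_params, best_log_likelihood).
    """
    rng = random.Random(seed)
    current = list(start)
    current_ll = log_lik(current)
    best, best_ll = list(current), current_ll
    for _ in range(iters):
        # Propose a change to one randomly chosen parameter.
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] += rng.gauss(0.0, step)
        cand_ll = log_lik(candidate)
        delta = cand_ll - current_ll
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_ll = candidate, cand_ll
            if current_ll > best_ll:
                best, best_ll = list(current), current_ll
        temp *= cooling
    return best, best_ll

def toy(params):
    """Toy stand-in for a log-likelihood, with its maximum at (2, -1)."""
    return -((params[0] - 2.0) ** 2) - (params[1] + 1.0) ** 2

best, best_ll = anneal(toy, [0.0, 0.0])
```

In the windthrow setting, `params` would hold all 64 values (43 plot severities plus 3 parameters for each of 7 species), and the severities would also need to be kept within 0-1, which this sketch does not enforce.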

Model evaluation
Numbers above bars represent the number of observations in the class. The solid line is a 1:1 relationship.

Estimating Storm Severity

Results: Big trees...

Little trees...

New twists
- Effects of partial harvesting on risk of windthrow to residual trees
- Effects of proximity to edges of clearings on risk of windthrow
Research with Dave Coates in cedar-hemlock forests of interior B.C.

Effects of harvest intensity and proximity to edge…
- Equation (1): basic model – probability of windthrow is a species-specific function of tree size and storm severity
- Equation (2): introduces the effect of prior harvest removal to equation (1) by adding basal area removal, and assumes the effect is independent and additive
- Equation (3): assumes the effects of prior harvest interact with tree size
- Models 1a – 3a: test models where separate c coefficients are estimated for “edge” vs. “non-edge” trees (edge = any tree within 10 m of a forest edge)

Other issues…
- Is the risk of windthrow independent of the fate of neighboring trees? (not likely)
  - Should we examine spatially-explicit models that factor in the “nucleating” process of spread of windthrow gaps?…

Analysis for CATEGORICAL Response Variables
- Extension of the binary case??:
  - Estimate a complete set of species-specific parameters for each of n-1 categories (assuming that the set of categories is complete and mutually exclusive...)
- # of parameters required = P + (n-1)*(3*S)
  - Where P = # plots, S = # species, and n = # of response categories
  - {Is this feasible?...}
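The parameter count grows quickly with the number of categories, which is what the feasibility question is about. A small Python sketch of the formula P + (n-1)*(3*S); the function name is illustrative, and the three-category case below is a hypothetical example rather than an analysis from the slides.

```python
def categorical_parameter_count(plots, species, categories, per_species=3):
    """P + (n - 1) * (3 * S): one severity per plot, plus a full set of
    species-specific parameters for each of the n - 1 free categories."""
    return plots + (categories - 1) * (per_species * species)

# Binary case with the Adirondack dimensions (43 plots, 7 species).
binary_count = categorical_parameter_count(43, 7, 2)
# Hypothetical three-category response with the same plots and species.
three_category_count = categorical_parameter_count(43, 7, 3)
```

With n = 2 the formula reduces to the binary count of 64; each extra category adds another 21 species-specific parameters for this dataset.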

Analysis for ORDINAL Response Variables
- The categories in this case are ranked (i.e. none, light, heavy damage)
- Analysis shifts to cumulative probabilities...

Simple Ordinal Logistic Regression
If P(y ≤ Y_k | X) is the probability that an observation y will be less than or equal to ordinal level Y_k (k = 1..n-1 levels), given a vector X of explanatory variables, then simple ordinal logistic regression fits a model of the form:
log[P(y ≤ Y_k | X) / (1 - P(y ≤ Y_k | X))] = a_k + b·X
Remember: the probability that an event will fall into a single class k (rather than the cumulative probability) is simply
P(y = Y_k) = P(y ≤ Y_k) - P(y ≤ Y_{k-1})
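Converting cumulative probabilities into per-class probabilities is a matter of differencing. The Python sketch below assumes a cumulative-logit model with one intercept per level and a shared linear term; the names `cutpoints` and `linear_term` are illustrative, and the sign convention (intercept plus linear term) is an assumption.

```python
import math

def class_probabilities(cutpoints, linear_term):
    """Per-class probabilities from a cumulative-logit model.

    cutpoints   -- increasing intercepts a_k for levels k = 1..n-1
    linear_term -- the shared (parallel-slopes) part of the predictor
    Returns a list of n probabilities, lowest level first.
    """
    cumulative = [1.0 / (1.0 + math.exp(-(a_k + linear_term))) for a_k in cutpoints]
    cumulative.append(1.0)  # the highest level has cumulative probability 1
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(class k) = P(<= k) - P(<= k-1)
        previous = c
    return probs
```

The intercepts must be increasing in k; otherwise some differences would be negative and the result would not be a valid probability distribution.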

In our case...
where a_ks, c_s and b_s are species-specific parameters (s = 1..m species), and S_i are the estimated storm severities for the i = 1..n plots.
# of parameters: M + (n-1+2)*Q, where M = # of plots, n = # of ordinal response levels, and Q = # of species

The Likelihood Function Stays the Same
Again, since the scientific model is already expressed as a probabilistic equation: the probability that an event will fall into a single class k (rather than the cumulative probability) is simply

Hurricanes in Puerto Rico
- Storm damage assessment in the permanent plot at the Luquillo LTER site
  - Hurricane Hugo
  - Hurricane Georges – 1998
- Combined the data into a single analysis: 136 plots, 13 species (including 1 lumped category for “other” species), and 3 damage levels:
  - No or light damage
  - Partial damage
  - Complete canopy loss
- Total # of parameters = 188 (15,647 trees)
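The total of 188 parameters follows from the ordinal parameter-count formula M + (n-1+2)*Q with M = 136 plots, n = 3 damage levels, and Q = 13 species. A quick check in Python, with an illustrative function name:

```python
def ordinal_parameter_count(plots, levels, species):
    """M + (n - 1 + 2) * Q: one severity per plot, plus for each species
    n - 1 level-specific intercepts and 2 shared parameters."""
    return plots + (levels - 1 + 2) * species

luquillo_count = ordinal_parameter_count(136, 3, 13)  # 136 + 4 * 13
```

Most of the parameters (136 of 188) are plot-level storm severities, so the species-specific part of the model is comparatively small.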

Parameter Estimation
Solving simultaneously for 188 parameters in a dataset containing > 15,000 trees takes time...

Model Evaluation

Comparison of the two storms...
Statistics on variation in storm severity from Hurricanes Hugo and Georges

Support for the Storm Severity Parameter Estimates
- Support limits for the 136 estimates of storm severity were not particularly “tight”
- Remember that the storm severity parameter values range from 0 - 1

Support for the Species-specific Parameters
Range of the 1.92 Unit Support Intervals, as a % of the parameter estimate
Strength of support for the species-specific parameters was better, but still not great...
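A 1.92-unit support interval is the range of parameter values whose log-likelihood lies within 1.92 units of the maximum; 1.92 is half of 3.84, the 95% critical value of a chi-square distribution with 1 degree of freedom. The Python sketch below finds such limits by scanning a grid for a single parameter with all others held fixed. The grid search, the function name, and the toy log-likelihood are assumptions for illustration, not the procedure used in the original analysis.

```python
def support_interval(log_lik, estimate, lower, upper, drop=1.92, steps=1000):
    """Approximate support limits for one parameter.

    log_lik  -- function of a single parameter value (others held fixed)
    estimate -- the maximum likelihood estimate of that parameter
    lower, upper -- search bounds (e.g. 0 and 1 for storm severity)
    """
    target = log_lik(estimate) - drop
    grid = [lower + (upper - lower) * i / steps for i in range(steps + 1)]
    # Keep the estimate itself so the result is never empty.
    inside = [estimate] + [x for x in grid if log_lik(x) >= target]
    return min(inside), max(inside)

def toy_ll(x):
    """Toy quadratic log-likelihood peaked at 0.5 (hypothetical values)."""
    return -0.5 * ((x - 0.5) / 0.1) ** 2

low, high = support_interval(toy_ll, 0.5, 0.0, 1.0)
```

Expressing `high - low` as a percentage of the estimate gives the kind of relative width the slide describes.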

Critical assumptions
- Probability of damage to a tree in Georges was independent of damage in Hugo (actually true…)
- The “parallel slopes” model is reasonable
- Others?