Seminar 3 - Inverse modeling

Seminar 3 - Inverse modeling C. D. Canham Case Study 4: Data requirements, limitations, and challenges: Inverse modeling of seed and seedling dispersal

Approaches to Estimation of Seed and Seedling Dispersal Functions
- Direct sampling around isolated trees (David Greene)
- Develop mechanistic models with directly measurable parameters (Ran Nathan)
- Inverse modeling using likelihood methods and neighborhood models (Eric Ribbens, Jim Clark, and a rapidly growing community of practitioners…)

The questions…
- What are the shapes of the dispersal functions?
- How does fecundity vary as a function of tree size?
- What other factors determine the spatial distribution of seeds and seedlings around parent trees?
  - Wind direction (anisotropy)
  - Secondary dispersal
  - Density- and distance-dependent seed predation and pathogens
  - Substrate conditions
  - Light levels

The basic approach: field methods
- Map the distribution of potential parent trees within a stand
- Sample the density of seeds or seedlings at mapped locations within the stand
- Measure any additional features at the location of the seed traps or seedling quadrats

The Probability Model
- Observations consist of counts
- Assume the counts are either Poisson or Negative Binomial distributed
Poisson PDF:
$P(x \mid \lambda) = \dfrac{\lambda^{x} e^{-\lambda}}{x!}$
where x = observed density (integer), and λ = predicted density (continuous)
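The Poisson log-likelihood that follows from this probability model can be sketched as below. This is a minimal illustration, not code from the seminar; the function name and the example counts are hypothetical, and every predicted density is assumed to be strictly positive.

```python
import math

def poisson_log_likelihood(observed, predicted):
    """Sum of Poisson log-probabilities over all traps or quadrats.

    observed  : integer counts (x)
    predicted : positive expected densities (lambda), one per observation
    """
    total = 0.0
    for x, lam in zip(observed, predicted):
        # ln P(x | lam) = x ln(lam) - lam - ln(x!), and ln(x!) = lgamma(x + 1)
        total += x * math.log(lam) - lam - math.lgamma(x + 1)
    return total

# Hypothetical counts and model predictions for four seed traps
counts = [0, 3, 1, 7]
means = [0.5, 2.0, 1.5, 6.0]
print(poisson_log_likelihood(counts, means))
```

Working on the log scale avoids the underflow that multiplying many small probabilities would cause.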

Negative Binomial PDF
$P(x \mid m, k) = \dfrac{\Gamma(k + x)}{\Gamma(k)\, x!} \left( \dfrac{k}{k + m} \right)^{k} \left( \dfrac{m}{k + m} \right)^{x}$
- The “shape” of the PDF is controlled by both the expected mean (m) and a “shape” parameter (k)
- As k varies, the degree of over-dispersion varies: the variance is m + m²/k, so small k gives a variance much larger than the mean, and the distribution approaches the Poisson (variance = mean) as k becomes large
- Γ is the notation for the gamma function
- This formulation would probably never be evaluated directly: taking the log to get log-likelihoods converts it to an equation with ln(gamma) terms, and these are much easier to compute
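The remark about ln(gamma) terms can be made concrete with a short sketch. It assumes the common ecological parameterization of the negative binomial by mean m and shape k (variance m + m²/k); the function names are hypothetical.

```python
import math

def neg_binomial_log_prob(x, m, k):
    """ln P(x | m, k) for the mean/shape parameterization (m > 0, k > 0)."""
    return (math.lgamma(k + x) - math.lgamma(k) - math.lgamma(x + 1)
            + k * math.log(k / (k + m))
            + x * math.log(m / (k + m)))

def neg_binomial_log_likelihood(observed, predicted, k):
    """Sum of log-probabilities, one predicted mean per observed count."""
    return sum(neg_binomial_log_prob(x, m, k)
               for x, m in zip(observed, predicted))

# Hypothetical counts and predicted means
print(neg_binomial_log_likelihood([0, 3, 1, 7], [0.5, 2.0, 1.5, 6.0], k=1.5))
```

Using `math.lgamma` directly avoids computing the gamma function itself, which overflows for even moderately large counts.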

The basic “scientific” model
Seed rain at a given location is the sum of the input of N parent trees, with the input from any given tree a function of the:
- Size (typically DBH), and
- Distance to the parent

How does total seed production vary with tree size?
Common assumption: seed production is a function of DBH² (following Ribbens et al. 1994):
$\text{Seeds} = \mathrm{STR} \left( \dfrac{\mathrm{DBH}}{30} \right)^{a}$
where a = 2, and STR = total standardized seed production of a 30 cm DBH tree.
Is this a reasonable assumption? Is it supported by either independent data or theory?

How does seed rain vary with distance from a parent tree?
Two basic classes of functions are commonly used*:
- Monotonically declining (negative exponential)
- Lognormal
As written, both of these functions are bounded between 0 and 1, so they have the nice feature of acting like a fractional scalar…
*See Greene et al. (2004), J. Ecol. for a discussion…
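The equations for the two function classes did not survive in this transcript. The sketch below uses forms that are common in the inverse-modeling literature and that are bounded between 0 and 1, as the slide describes. The exact forms and the parameter names (D, beta, X0, Xb) are assumptions, not taken from the slide.

```python
import math

def exponential_kernel(dist, D, beta):
    """Monotonically declining form: exp(-D * dist^beta); equals 1 at dist = 0."""
    return math.exp(-D * dist ** beta)

def lognormal_kernel(dist, X0, Xb):
    """Lognormal form: peaks at 1 when dist = X0 (the mode); Xb sets the spread."""
    if dist <= 0:
        return 0.0  # limit of the lognormal form as dist approaches 0
    return math.exp(-0.5 * (math.log(dist / X0) / Xb) ** 2)

# Hypothetical parameter values, distances in metres
for d in (0.0, 5.0, 10.0, 20.0, 40.0):
    print(d, exponential_kernel(d, D=0.1, beta=1.0), lognormal_kernel(d, X0=10.0, Xb=0.5))
```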

One more trick…
Normalizing the dispersal function [g(dist)] so that STR is in meaningful units, where h is the “arcwise” (i.e. 360°) integration of the dispersal function.
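A sketch of how the normalizer h might be computed numerically. It assumes h is the integral of g over all directions and distances, that is, the integral from 0 to infinity of 2πr·g(r) dr, truncated at a large distance. The function name, the truncation distance, and the step size are illustrative choices.

```python
import math

def normalizer(kernel, max_dist=500.0, step=0.01):
    """Approximate h = integral of 2*pi*r*kernel(r) dr from 0 to max_dist
    using the midpoint rule. max_dist must be large enough that the kernel
    is negligible beyond it."""
    n = int(round(max_dist / step))
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * step  # midpoint of the i-th distance ring
        total += 2.0 * math.pi * r * kernel(r) * step
    return total

# Hypothetical exponential kernel with D = 0.1 and exponent 1
h = normalizer(lambda r: math.exp(-0.1 * r))
print(h)
```

Dividing g(dist) by h makes the kernel integrate to 1 over the plane, so STR can be read as the total number of seeds produced by the standard tree.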

So, the basic scientific models come in two forms: a lognormal form and an exponential form.
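Putting the pieces together, the exponential form of the model can be sketched as follows. The kernel exp(-D·dist^beta), the closed-form normalizer, and all names and numbers are assumptions made for illustration; the slide's own equations are not preserved in this transcript.

```python
import math

def exponential_normalizer(D, beta):
    """Closed form of the integral of 2*pi*r*exp(-D*r^beta) dr from 0 to infinity."""
    return 2.0 * math.pi * math.gamma(2.0 / beta) / (beta * D ** (2.0 / beta))

def predicted_seed_rain(trap_xy, parents, STR, a, D, beta):
    """Expected seed density at one trap: sum over parents of
    STR * (DBH/30)^a * g(dist) / h.

    parents : iterable of (x, y, dbh) with coordinates in metres, DBH in cm
    """
    h = exponential_normalizer(D, beta)
    total = 0.0
    for x, y, dbh in parents:
        dist = math.hypot(trap_xy[0] - x, trap_xy[1] - y)
        total += STR * (dbh / 30.0) ** a * math.exp(-D * dist ** beta) / h
    return total

# Hypothetical stand: three mapped parents and one trap at the origin
parents = [(5.0, 0.0, 30.0), (0.0, 12.0, 45.0), (-20.0, -8.0, 18.0)]
print(predicted_seed_rain((0.0, 0.0), parents, STR=1000.0, a=2.0, D=0.1, beta=1.0))
```

The predicted value for each trap would then be passed to the Poisson or negative binomial log-likelihood as the expected mean, and the parameters (STR, a, D, beta) varied to maximize that likelihood.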

The Scientific Models

Anisotropy: does direction matter?
For the lognormal dispersal function, incorporate the effect of direction from the source tree on the modal dispersal distance (Staelens et al. 2003), with:
- Angle = angle from tree to trap
- Delta = direction the wind is blowing from
- Xp = magnitude of displacement from mean dispersal distance
Staelens, J., L. Nachtergale, S. Luyssaert, and N. Lust. 2003. A model of wind-influenced leaf litterfall in a mixed hardwood forest. Canadian Journal of Forest Research.

Shape of the wind direction effect
When would this matter? (just to increase goodness of fit and improve parameter estimation?)

Potential Dataset Limitations
- Censored data: not all parents are accounted for
- Insufficient variation in predicted values: parents are too uniformly distributed
- Two different populations treated as one: not all potential parents actually produce seeds
- Lack of independence: spatial autocorrelation among nearby samples

Beware of simplifying assumptions in your model...

Parameter Estimation: Varying a

What is the minimum size of a reproductive adult?
- Most studies have arbitrarily assumed that all adults over a low minimum size (10 to 15 cm DBH) contribute seeds.
- One approach: estimate the minimum rather than assuming it.
- How could we determine the effective minimum reproductive size?

Parent size and seedling production in a Puerto Rican rainforest
Source: Uriarte et al. (2005) J. Ecology

Scaling reproductive output to tree size: Maximum likelihood parameter estimates

Species                  a      min. size (cm)
Casearia arborea         0.14   13.7
Dacryodes excelsa        0.51   NA
Guarea guidonia          2.06   48.13
Inga laurina             2.38   16.39
Manilkara bidentata      0.01   44.04
Prestoea acuminata       0.15   13.89
Schefflera morototoni    3.22   9.61
Sloanea berteriana       1.70   11.06
Tabebuia heterophylla    0.01   20.93

Source: Uriarte, M., C. D. Canham, J. Thompson, J. K. Zimmerman, and N. Brokaw. 2005. Seedling recruitment in a hurricane-driven tropical forest: light limitation, density-dependence and the spatial distribution of parent trees. Journal of Ecology 93:291-304.

Should there be an “intercept” in the model?
Allowing for long-distance dispersal via a “bath” term, where b is an average input of seeds even when there are no parents in the neighborhood…
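The equation itself is missing from this transcript. One way to write a model with a bath term, assuming the normalized-kernel form of the basic model and using subscripts i for parents and j for sampling locations (both notational assumptions), is:

```latex
R_j \;=\; b \;+\; \mathrm{STR} \sum_{i=1}^{N}
      \left( \frac{\mathrm{DBH}_i}{30} \right)^{a}
      \frac{1}{h}\, g\!\left( \mathrm{dist}_{ij} \right)
```

Here R_j is the predicted density at location j, and b contributes the same expected input everywhere, so R_j = b when no parents are within the neighborhood.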

For seedlings: does light influence germination?
Parameters of the light multiplier M(GLI), where 0 < M(GLI) < 1:
- Lopt
- Llo = slope to Lopt
- Lhi = slope away from Lopt
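The slide names the parameters but the functional form is not preserved here. The sketch below assumes a piecewise-linear multiplier that equals 1 at Lopt and declines with slope Llo below it and slope Lhi above it, clamped to the range 0 to 1. Both the form and the parameter values are assumptions for illustration.

```python
def light_multiplier(gli, L_opt, L_lo, L_hi):
    """Assumed piecewise-linear light effect: 1 at L_opt, declining on both sides."""
    if gli < L_opt:
        value = 1.0 - L_lo * (L_opt - gli)   # rising toward L_opt from low light
    else:
        value = 1.0 - L_hi * (gli - L_opt)   # falling away from L_opt at high light
    return min(1.0, max(0.0, value))          # keep the multiplier within [0, 1]

# Hypothetical values: optimum at 20% light, steeper response below the optimum
for gli in (0, 10, 20, 50, 100):
    print(gli, light_multiplier(gli, L_opt=20.0, L_lo=0.02, L_hi=0.01))
```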

Light Availability

Is there evidence of density dependence in seedling establishment?
Add yet another multiplier, with parameters C and δ.
[Figure: DD effect (0-1) plotted against conspecific seedling density]
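A sketch of what such a multiplier could look like. The form exp(-C · density^δ) is an assumption chosen because it equals 1 at zero density and declines toward 0, matching the 0 to 1 range on the slide. The parameter values are hypothetical.

```python
import math

def density_dependence(conspecific_density, C, delta):
    """Assumed DD multiplier: 1 when density is 0, declining toward 0 as density rises."""
    return math.exp(-C * conspecific_density ** delta)

# Hypothetical parameters; density in seedlings per quadrat
for density in (0, 1, 5, 20, 100):
    print(density, density_dependence(density, C=0.05, delta=1.2))
```

In a full model this value would multiply the predicted seedling density from the dispersal model, alongside any light multiplier.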

Negative Conspecific Density Dependence

Spatial autocorrelation

Dealing with spatial autocorrelation among observations…
- Remember: the formula for calculating log-likelihood assumes that observations are independent…
- We have been conditioned to assume that two observations taken at locations close together are unlikely to be independent (a legacy of Stuart Hurlbert)
- Moran’s I and other indices of spatial autocorrelation
- Use blackboard to define independence in statistical terms… In general, P(A&B) = P(A)*P(B|A); A and B are independent only when P(B|A) = P(B), so that P(A&B) = P(A)*P(B)
- How do you determine whether this is true?

A critical distinction…
- Remember: the issue is whether the residuals (the error terms in the probability model) are independent, NOT whether the raw observations are…
- If your scientific model “explains” why two nearby observations have similar values, then the fact that they are similar is NOT evidence of lack of independence*…
*despite assertions to the contrary in some papers on the subject

So, examine your residuals for spatial autocorrelation
A “best-case” species…
[Figure: Moran’s I plotted against distance class (m)]
Examples from a study of seedling recruitment in a New Zealand rainforest (data from Elaine Wright)

Another species… A worse case…
[Figure: Moran’s I plotted against distance class (m)]
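The correlograms above plot Moran's I of the residuals for successive distance classes. A minimal sketch of that calculation for a single distance class follows, using binary weights (1 if a pair of locations falls in the class, 0 otherwise). The function name and the toy data are hypothetical.

```python
import math

def morans_i(coords, residuals, d_min, d_max):
    """Moran's I for pairs of locations whose separation d satisfies
    d_min < d <= d_max. Returns NaN if no pairs fall in the class."""
    n = len(residuals)
    mean = sum(residuals) / n
    z = [r - mean for r in residuals]          # deviations from the mean residual
    denom = sum(v * v for v in z)
    cross = 0.0
    weight_sum = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = math.hypot(coords[i][0] - coords[j][0],
                           coords[i][1] - coords[j][1])
            if d_min < d <= d_max:             # binary weight for this distance class
                cross += z[i] * z[j]
                weight_sum += 1
    if weight_sum == 0 or denom == 0:
        return float("nan")
    return (n / weight_sum) * (cross / denom)

# Toy transect: residuals (observed minus predicted) at 1 m spacing
coords = [(float(i), 0.0) for i in range(6)]
residuals = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]     # smooth trend: similar neighbours
print(morans_i(coords, residuals, 0.0, 1.5))
```

Values near 0 across distance classes suggest the residuals are close to spatially independent; positive values at short distances indicate clustering that the scientific model has not explained.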

Causes and consequences of fine-scale spatial autocorrelation…
The causes are probably legion:
- Many trees don’t produce seed in any given mast year
- Many factors can cluster input of seeds or survival of seedlings
The consequences are important but not fatal:
- Generally very little bias in the parameter estimates themselves
- But estimates of the variance of the parameters will be biased (low)
Do the thought experiment, or test this with real data: what would happen if you duplicated some observations in the dataset and then redid the analysis?
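The thought experiment can be run on the simplest possible case: estimating a single Poisson mean. This is an illustrative stand-in for the full dispersal model, with hypothetical counts. For a Poisson sample the maximum likelihood estimate is the sample mean, and its asymptotic standard error is sqrt(estimate / n).

```python
import math

def poisson_mean_fit(counts):
    """Return the MLE of a Poisson mean and its asymptotic standard error."""
    n = len(counts)
    lam_hat = sum(counts) / n          # maximum likelihood estimate
    se = math.sqrt(lam_hat / n)        # from the Fisher information n / lambda
    return lam_hat, se

counts = [2, 0, 5, 3, 1, 4, 2, 6]      # hypothetical seedling counts
original = poisson_mean_fit(counts)
duplicated = poisson_mean_fit(counts + counts)  # every observation entered twice

print("original:  ", original)
print("duplicated:", duplicated)
```

Duplicating the data leaves the estimate unchanged but shrinks the standard error by a factor of sqrt(2), even though no new information was added. Positively autocorrelated residuals act like partial duplicates in the same way: little bias in the estimates, but overstated precision.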