Published by Patience McKenzie. Modified over 9 years ago.
1
Lecture 4 Model Formulation and Choice of Functional Forms: Translating Your Ideas into Models
2
Topics. Alternate models as multiple working hypotheses. Null models. Choice of functional forms.
3
The triangle of statistical inference. [Diagram labels: Data, Scientific Model* (hypothesis), Probability Model, Inference.] *All hypotheses can be expressed as models!
4
The Scientific Method “Science is a process for learning about nature in which competing ideas are measured against observations” Feynman 1965
5
Scientific Process. Devise alternative hypotheses. Devise experiment(s) with alternative possible outcomes. Carry out experiments. Recycle procedure. -- Platt 1964 (Strong inference). But this is time-consuming and not very useful for many questions…
6
The method of multiple working hypotheses. “It differs from the simple working hypothesis in that it distributes the effort and divides the affection.” “Bring up into review every rational explanation of the phenomenon in hand and to develop every tenable hypothesis relative to its nature.” “Some of the hypotheses have already been proposed and used while others are the investigator’s own creations.” “An adequate explanation often involves the coordination of several causes.” “When faithfully followed for a sufficient time it develops the habit of parallel or complex thought.” “The power of viewing phenomena analytically and synthetically at the same time appears to be gained.” --- T. C. Chamberlin, 1890. Science 15: 92.
7
What is the best model to use? This is the critical question in making valid inferences from data. Careful a priori consideration of alternative models will often require a major change in emphasis among scientists. Model specification is more difficult than the application of likelihood techniques.
8
Formulation of Candidate Models. Conceptually difficult. Subjective. Original and innovative. Models represent a scientific hypothesis. Translating your qualitative ideas into a quantitative, algebraic model that can be tested against alternative models…
9
Where do models come from? Scientific literature. Results of manipulative experiments. Personal experience. Scientific debate. Natural resource management questions. Monitoring programs. Judicial hearings.
10
Are models truth? Truth has infinite dimensions. Sample data are finite. Models should provide a good approximation to the data. Larger data sets will support more complex approximations to reality.
11
Model selection is implicit in science. “..empiricism, like theory, is based on a series of simplifying assumptions…By choosing what to measure and what to ignore, an empiricist is making as many assumptions as does any theoretician.” --David Tilman
12
Develop a set of a priori candidate models. Include a global model that includes all potentially relevant effects. Test the global model (R-square, goodness-of-fit tests). Develop alternative simpler models.
13
Assessing alternative models How well does the model approximate “truth” relative to its competitors? (high accuracy or low bias). How repeatable is the prediction of a model relative to its competitors? (high precision or low variance).
14
Why do model selection at all? Principle of parsimony. [Figure: bias² and variance plotted against number of parameters, from few to many.]
15
Principle of parsimony applied to model selection. We typically penalize added complexity. A more complex model has to exceed a certain threshold of improvement over a simpler model. Added complexity usually makes a model more unstable. Complex models spread the data too thinly over the parameters. Model selection is not about whether something is true or not but about whether we have enough information to characterize it properly.
16
Reality: Actual data. Example from pages 33-34 of Burnham and Anderson.
17
A set of candidate models
18
UNDERFITTING!! Too simple: High bias (low accuracy)
19
OVERFITTING!! Too complicated: High variance (low precision)
20
REASONABLE FIT The compromise: a parsimonious model
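The three cases on the preceding slides (too simple, too complicated, a parsimonious compromise) can be sketched numerically. The data below are invented for illustration and are not the Burnham and Anderson example; the three candidate models (a constant, a straight line, and a polynomial forced through every point) are likewise assumptions chosen to keep the sketch short.

```python
# Hypothetical noisy observations of a linear truth y = 2 + 0.5 x (invented data).
xs = [0, 1, 2, 3, 4, 5, 6, 7]
noise = [0.3, -0.4, 0.5, -0.2, 0.4, -0.5, 0.2, -0.3]
ys = [2 + 0.5 * x + e for x, e in zip(xs, noise)]

# Held-out points between the observations, taken from the noise-free truth.
test_xs = [x + 0.5 for x in range(7)]
test_ys = [2 + 0.5 * x for x in test_xs]


def fit_mean(xs, ys):
    """Too simple: one parameter, ignores x entirely."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Compromise: two parameters, ordinary least squares."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sxy / sxx
    return lambda x: ybar + slope * (x - xbar)


def fit_interpolant(xs, ys):
    """Too complicated: as many parameters as data points (Lagrange polynomial)."""
    def predict(x):
        total = 0.0
        for j, (xj, yj) in enumerate(zip(xs, ys)):
            weight = 1.0
            for k, xk in enumerate(xs):
                if k != j:
                    weight *= (x - xk) / (xj - xk)
            total += yj * weight
        return total
    return predict


def mse(model, xs, ys):
    """Mean squared error of a fitted model on a set of points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


for name, fit in [("mean", fit_mean), ("line", fit_line), ("interpolant", fit_interpolant)]:
    model = fit(xs, ys)
    print(name, "train MSE:", mse(model, xs, ys), "held-out MSE:", mse(model, test_xs, test_ys))
```

Under these assumptions the interpolating polynomial reproduces the training data exactly yet predicts the held-out points worse than the straight line, while the constant model misses the trend on both sets.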
21
Null Models. Parametric methods advocate testing hypotheses against a null expectation (H0). Often the null is probably false simply on a priori grounds (e.g., the parameter θ has no effect). In likelihood terms this usually means the null model is the one that sets the value of parameter θ equal to 0 or 1, depending on how θ enters the model.
22
States of mind of a null hypothesis tester. [2 × 2 table crossing practical importance of observed difference (not important, important) with statistical significance of observed difference (not significant, significant).]
23
Model Selection Methods. Adjusted R-square. Likelihood Ratio Tests. Akaike’s Information Criterion. We will talk about these topics later…
24
Choice of Functional Forms. Model formulation requires the specification of a functional form that formalizes the relationship between the predictive variables and the process we are trying to understand. The functional form should clarify the verbal description of the mechanisms driving the process under study. Choosing a functional form is a skill that needs to be developed over time.
25
Choice of Functional Forms: Mechanism vs. phenomenology. Mechanistic: based on some biological or ecological model. Phenomenological: functions that fit the data well or are simple/convenient to use.
26
Choice of functional forms: What matters? Does it represent what happens in your model? Does the shape of the function resemble actual data? Is the range of data desired delivered by this function? Does the function allow for ready variation of the aspects of the question that the researcher wants to explore? What happens at either end (as x → 0 and x → ∞)? What happens in the middle? Critical points (maxima, minima).
27
Model Functions vs. Probability Density Functions. Properties of pdf’s. [Figure: Prob(x) plotted against x.]
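The slide's list of pdf properties did not survive, so the sketch below relies only on two standard facts: a pdf is non-negative and integrates to 1 over its support, whereas a model function carries no such constraint. The exponential density, the rescaled curve, and the trapezoid helper are illustrative choices, not taken from the lecture.

```python
import math


def exponential_pdf(x, rate=1.0):
    """Exponential probability density for x >= 0."""
    return rate * math.exp(-rate * x)


def model_function(x):
    """Same shape, rescaled: a legitimate model function but not a pdf."""
    return 3.0 * math.exp(-x)


def trapezoid(f, lo, hi, n=100_000):
    """Composite trapezoid rule on [lo, hi] with n equal steps."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


# The upper limit 50 stands in for infinity; the neglected tail is negligible.
pdf_area = trapezoid(exponential_pdf, 0.0, 50.0)
model_area = trapezoid(model_function, 0.0, 50.0)
print("area under pdf:", pdf_area)
print("area under model function:", model_area)
```

The first area comes out very close to 1 and the second very close to 3, which is why the functions in the following slides should not be read as probability densities unless they are normalized.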
28
Some useful functions (not necessarily pdf’s!) Exponential. Weibull. Logistic. Lognormal. Power. Generalized Poisson. Logarithmic.
29
Exponential
30
Exponential: Decline in maximum potential growth as a function of crowding. [Figure: effect on growth (growth multiplier, 0 to 1) against NCI (Neighborhood Crowding Index) for Species A and Species B.]
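A minimal sketch of this kind of curve, assuming the common negative-exponential form multiplier = exp(-c · NCI). The slide's own equation is not recoverable, so the form, the function name, and the decay rates for the two species are assumptions made for illustration.

```python
import math


def growth_multiplier(nci, c):
    """Fraction of maximum potential growth retained at crowding level nci.

    Assumed form: exp(-c * nci), with c >= 0 the sensitivity to crowding.
    """
    if nci < 0 or c < 0:
        raise ValueError("nci and c must be non-negative")
    return math.exp(-c * nci)


# Hypothetical sensitivities: species A tolerates crowding better than species B.
C_SPECIES_A = 0.1
C_SPECIES_B = 0.5

for nci in (0.0, 2.0, 5.0, 10.0):
    print(nci, growth_multiplier(nci, C_SPECIES_A), growth_multiplier(nci, C_SPECIES_B))
```

With this form the multiplier equals 1 when there is no crowding and declines toward 0 as NCI grows; setting c = 0 gives a multiplier of 1 everywhere, which is one way a "no effect" model can be written.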
31
Michaelis-Menten function. [Figure: two fitted curves, one with a = 1.43, s = 0.76 and one with a = 1.63, s = 0.31.]
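The slide gives parameter values but not the equation. One widely used parameterization, assumed here, is y = a·x / (a/s + x), in which a is the asymptote and s is the slope at x = 0. Whether the lecture used exactly this form is not confirmed by the transcript.

```python
def michaelis_menten(x, a, s):
    """Saturating curve, assumed form: y = a * x / (a / s + x).

    a: asymptote approached as x grows large.
    s: slope of the curve at x = 0.
    """
    if x < 0 or a <= 0 or s <= 0:
        raise ValueError("x must be non-negative and a, s positive")
    return a * x / (a / s + x)


# Parameter pairs quoted on the slide.
curves = {"curve 1": (1.43, 0.76), "curve 2": (1.63, 0.31)}

for label, (a, s) in curves.items():
    values = [round(michaelis_menten(x, a, s), 3) for x in (0.0, 1.0, 5.0, 50.0)]
    print(label, values)
```

Under this reading the second curve has the higher ceiling but rises more slowly near the origin, so the two curves cross.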
32
Weibull function. The exponential is a special case of the Weibull function (β=0).
33
Weibull Example: Dispersal functions
34
Logistic
35
Logistic: Probability of mortality as a function of storm severity. Canham et al. 2001.
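As a rough illustration of the logistic shape, the sketch below uses the plain two-parameter form p = 1 / (1 + exp(-(a + b·severity))). This is not the model of Canham et al. 2001, which the transcript does not reproduce; the parameter names and values are hypothetical.

```python
import math


def mortality_probability(severity, a, b):
    """Logistic curve: probability rises from 0 toward 1 as severity increases (b > 0).

    Assumed form: 1 / (1 + exp(-(a + b * severity))).
    Intended for moderate values of a + b * severity; extreme negative values
    would overflow math.exp.
    """
    z = a + b * severity
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical parameters: the 50% mortality point sits at severity = -a / b = 0.5.
A, B = -4.0, 8.0

for severity in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(severity, round(mortality_probability(severity, A, B), 3))
```

The output is bounded between 0 and 1 for any input, which is what makes the logistic a natural choice when the response is a probability.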
36
Lognormal
37
Lognormal: Leaf litterfall as a function of distance to the parent tree. Data from GMF, CT.
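A hedged sketch of a lognormal-shaped model function, assuming the form y = a · exp(-0.5 · (ln(x / x0) / xb)²), where a is the peak height, x0 the location of the peak, and xb the breadth. The equation on the slide is not recoverable, and the parameter values below are invented rather than fitted to the GMF data.

```python
import math


def lognormal_curve(x, a, x0, xb):
    """Hump-shaped curve, assumed form: a * exp(-0.5 * (ln(x / x0) / xb) ** 2).

    a:  maximum value, reached at x = x0.
    x0: position of the peak (same units as x).
    xb: breadth of the hump on the log scale.
    """
    if x <= 0 or x0 <= 0 or xb <= 0:
        raise ValueError("x, x0 and xb must be positive")
    return a * math.exp(-0.5 * (math.log(x / x0) / xb) ** 2)


# Hypothetical parameters: peak of 30 units at a distance of 10 m.
PEAK, MODE, BREADTH = 30.0, 10.0, 0.8

for distance in (1.0, 5.0, 10.0, 20.0, 40.0):
    print(distance, round(lognormal_curve(distance, PEAK, MODE, BREADTH), 2))
```

The curve is symmetric on a log scale, so halving and doubling the distance from the peak give the same value, and it is undefined at x = 0, which matters when distances or diameters can be zero.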
38
Lognormal: Growth as a function of DBH. [Figure: max. potential growth (cm/yr) against DBH (cm) for species CASARB, DACEXC, MANBID, INGLAU, SLOBER, CECSCH, TABHET, GUAGUI, ALCLAT, SCHMOR, BUCTET.] Data from LFDP, Puerto Rico.
39
Power function: small mammal distribution as a function of canopy tree neighborhood. Schnurr et al. 2004.
40
Parameter trade-offs: More than one way to get there… Trade-off? [Figure: curves plotted against NCI (Neighborhood Crowding Index).]
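One simple way a trade-off can arise is when two parameters enter the model only through their product. The model below is a hypothetical example built for this sketch (it is not the model behind the slide's figure): a crowding sensitivity c and a scaling factor lam on NCI multiply each other, so different pairs give identical predictions.

```python
import math


def predicted_growth(nci, max_growth, c, lam):
    """Hypothetical model: max_growth * exp(-c * lam * nci).

    c and lam appear only as the product c * lam, so the data cannot
    distinguish (c, lam) from (c / 2, 2 * lam).
    """
    return max_growth * math.exp(-c * lam * nci)


nci_values = [0.0, 1.0, 2.5, 5.0, 10.0]

# Two different parameter sets with the same product c * lam = 0.2.
fit_1 = [predicted_growth(n, 2.0, c=0.2, lam=1.0) for n in nci_values]
fit_2 = [predicted_growth(n, 2.0, c=0.1, lam=2.0) for n in nci_values]

largest_gap = max(abs(p - q) for p, q in zip(fit_1, fit_2))
print("largest difference between the two fits:", largest_gap)
```

Because both parameter sets produce the same curve, a likelihood surface for this model would have a ridge rather than a single peak; fixing one of the two parameters or reparameterizing in terms of the product removes the ambiguity.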
41
Things to keep in mind. Scaling issues: pay attention to units, scales, and conversions. Multiplicative functions and parameter trade-offs. Computational issues: large exponent values, division by zero, logs of negative numbers.
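The three computational issues can be guarded against in code. The helpers below are one possible approach, written for this sketch; the names, the clamping thresholds, and the decision to clamp rather than raise an error are all assumptions, and clamping can hide a badly scaled model rather than fix it.

```python
import math

# math.exp raises OverflowError a little above 709 for double precision.
MAX_EXPONENT = 700.0


def safe_exp(z):
    """exp(z) with the exponent capped so large values do not raise OverflowError."""
    return math.exp(min(z, MAX_EXPONENT))


def safe_divide(numerator, denominator, eps=1e-12):
    """Division that replaces a (near-)zero denominator with +/- eps."""
    if abs(denominator) < eps:
        denominator = eps if denominator >= 0 else -eps
    return numerator / denominator


def safe_log(x, floor=1e-300):
    """log(x) with x floored at a tiny positive number, so zero or negative input does not raise."""
    return math.log(max(x, floor))


print(safe_exp(1000.0))       # finite, instead of OverflowError
print(safe_divide(1.0, 0.0))  # large but finite, instead of ZeroDivisionError
print(safe_log(-5.0))         # very negative but finite, instead of ValueError
```

During parameter searches an optimizer may propose values that trigger any of these cases, so it helps to decide in advance whether such proposals should be clamped, rejected, or assigned a very poor fit.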
42
Some useful references. Catalog of curves for curve fitting. British Columbia Ministry of Forests. Abramowitz, M. and I. Stegun. 1965. Handbook of Mathematical Functions. McGill, B. 2003. “Strong and weak tests of macroecological theory”. Oikos. Vanclay, J. 1995. “Growth models for tropical forests: a synthesis of models and methods”. Forest Science.