Wildlife Population Analysis

1 Wildlife Population Analysis
Maximum Likelihood Estimators, Information Theoretic Methods, and Significance Testing

2 But first…
The importance of a priori thinking in research. Theory: a principle that is widely accepted and can be used to make predictions about natural phenomena. Hypothesis: a conjecture put forth as a possible explanation of observed phenomena; the basis of argument or experimentation used to reach the truth. One model corresponds to one hypothesis (1 model ↔ 1 hypothesis), and Hypothesis = {Theory} + H.

3 Also extremely important…
Good study design. A few elements are: Randomization – vital to getting a representative result; applies to selection of samples and assignment of treatments; biased sampling → biased results (cause → effect cannot be inferred); simple random, stratified, cluster, adaptive… Replication (not just increased sample size). Use of controls = rigor + $$$. AIC and MLE won't rescue a poor design.

4 Scope of inference Draw conclusions, using deductive or inductive logic, about a population based on a limited number of samples. Q: How do we determine the population of inference? A: It is the group of individuals (units) with some probability of being sampled.

5 Good research …statistical methods get in the way of [doing the research we need to do]… In field studies, all too often we use cost and the “complexity of natural systems” as excuses for lack of a priori thinking and poor scientific process…

6 Wildlife Population Analysis
Maximum Likelihood Estimators, Information Theoretic Methods, and Significance Testing

7 Motivation Maximum likelihood estimates
Present estimation based on the likelihood; introduce the concept of model-based estimation; provide an example of maximum likelihood estimation. Information-theoretic methods (e.g., AIC): introduce the idea of model uncertainty; establish the concepts of model selection.

8 Part I: Maximum Likelihood Estimators

9 Maximum Likelihood Estimates
Given a model () MLE is (are) the value(s) that are most likely to estimate the parameter(s) of interest. That is, they maximize the probability of the model given the data. The likelihood of a model is the product of the probabilities of the observations.

10 Maximum Likelihood Estimates
For linear models (e.g., ANOVA and regression), these are usually determined by solving the linear equations that minimize the sum of the squared residuals – a closed form. For nonlinear models and some distributions, we determine MLEs by setting the first derivative equal to zero and then confirming it is a maximum by checking that the second derivative is negative – also closed form. Alternatively, we can search for the values that maximize the probabilities of all of the observations – numerical estimation.

11 Binomial Sampling characterized by two mutually exclusive events
heads or tails, on or off, male or female, or in our case survived or died. Often referred to as Bernoulli trials.

12 Models Trials have an associated parameter p
usually referred to as the probability of success. The probability of failure is 1 − p, often referred to as q, so p + q = 1. The parameter p also represents a model – an approximation of the truth. Binomial models may be a good approximation; for complex models (e.g., capture-mark-recapture), the probabilities and the models themselves may not approximate reality well.

13 Binomial Sampling p is a continuous parameter between 0 and 1 (0 < p < 1), y is the number of successful outcomes, and n is the number of trials. The estimator p̂ = y/n is unbiased. Expectation: E(y) = np, so E(p̂) = p. Additionally: var(p̂) = p(1 − p)/n.
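A minimal Python sketch (not part of the original slides) illustrating that the estimator p̂ = y/n lands close to the true p when n is large; the trial count and p value are arbitrary choices for the demonstration:

```python
import random

def estimate_p(n, p, seed=42):
    """Simulate n Bernoulli trials with success probability p
    and return the MLE p_hat = y / n."""
    rng = random.Random(seed)
    y = sum(1 for _ in range(n) if rng.random() < p)
    return y / n

# With many trials, p_hat should fall close to the true p
# (standard error sqrt(p * (1 - p) / n) is about 0.0014 here).
p_hat = estimate_p(n=100_000, p=0.3)
```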

14 About Binomial random variables:
n trials must be identical – i.e., the population is well defined (e.g., 20 coin flips, 50 Kirtland's warbler nests, 75 radio-marked black bears in the Pisgah Bear Sanctuary). Each trial results in one of two mutually exclusive outcomes (e.g., heads or tails, survived or died, successful or failed, etc.). The probability of success on each trial remains constant (homogeneous). Trials are independent events (the outcome of one does not depend on the outcome of another). y, the number of successes, is the random variable after n trials.

15 Binomial Probability Function
…the probability of observing y successes in n trials with underlying probability p is: P(y | n, p) = [n! / (y!(n − y)!)] p^y (1 − p)^(n−y). Example: 10 flips of a fair coin (p = 0.5), 7 of which turn up heads, is written P(7 | 10, 0.5).

16 Binomial Probability Function (2)
Evaluated numerically: P(7 | 10, 0.5) = 120 × (0.5)^7 × (0.5)^3 = 120/1024 ≈ 0.117.
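The numerical evaluation of the binomial probability can be sketched in Python (a minimal illustration, not part of the original slides):

```python
from math import comb

def binomial_pmf(y, n, p):
    """P(y successes | n trials, success probability p)."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

# 7 heads in 10 flips of a fair coin: 120 / 1024
prob = binomial_pmf(7, 10, 0.5)
```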

17 Binomial Probability Function (3)
[Figure: the binomial probability function plotted across values, showing its maximum.]

18 Likelihood Function of Binomial Probability
Reality: we have data (n and y) but don't know the model (p). This leads us to the likelihood function L(p | n, y), read "the likelihood of p given n and y." It is not a probability function, but it is a positive function on 0 < p < 1.

19 Binomial Probability Function and its likelihood
Back to our example. We found P(7 | 10, 0.5). To form the likelihood we can ignore the constant above (the binomial coefficient), use the observed values of n and y, and insert various values of p into: L(p | n, y) ∝ p^y (1 − p)^(n−y).

20 Likelihood Function of Binomial Probability(2)
Alternatively, the likelihood of the data given the model can be thought of as the product of the probabilities of the individual observations. With f = 1 for a success and f = 0 for a failure, the probability of observation i is p^(f_i) (1 − p)^(1−f_i), and therefore L(p) = ∏ p^(f_i) (1 − p)^(1−f_i).
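The product-of-observations view can be sketched in Python (a minimal illustration, not part of the original slides; the 0/1 outcome list is hypothetical):

```python
from math import prod

def bernoulli_likelihood(p, outcomes):
    """Likelihood of p given individual 0/1 outcomes: the product of
    p for each success (f = 1) and (1 - p) for each failure (f = 0)."""
    return prod(p if f == 1 else (1 - p) for f in outcomes)

# 7 successes and 3 failures: equals p^7 (1-p)^3, i.e. the binomial
# probability without the binomial coefficient.
L = bernoulli_likelihood(0.5, [1] * 7 + [0] * 3)
```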

21 Binomial Probability Function and its likelihood
[Figure: the likelihood plotted against p, with the maximum at the MLE.]

22 Log likelihood Although the likelihood function is useful, the log-likelihood has some desirable properties: its terms are additive, and the binomial coefficient does not involve p: ln L(p | n, y) = y ln(p) + (n − y) ln(1 − p) + constant.

23 Log likelihood Using the alternative (per-observation) form: ln L(p) = Σ [f_i ln(p) + (1 − f_i) ln(1 − p)].
Once again, the estimate of p that maximizes the value of ln(L) is the MLE.
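A minimal Python sketch (not part of the original slides) of the numerical-estimation approach: search a grid of p values for the one that maximizes ln(L), which should recover the closed-form MLE y/n. The data values are the coin-flip example's:

```python
from math import log

def log_likelihood(p, y, n):
    """Binomial log-likelihood, ignoring the additive constant
    contributed by the binomial coefficient."""
    return y * log(p) + (n - y) * log(1 - p)

# Grid search over p in (0, 1); the maximizer approximates y / n.
y, n = 7, 10
grid = [i / 1000 for i in range(1, 1000)]
p_mle = max(grid, key=lambda p: log_likelihood(p, y, n))
```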

24 Properties of MLEs Asymptotically normally distributed
Asymptotically minimum variance. Asymptotically unbiased as n → ∞. One-to-one transformations of MLEs are also MLEs; for example, mean lifespan, a one-to-one transformation of the survival estimate, is also an MLE.

25 Information Theoretic Methods

26 Resources Information Theoretic Methods
Johnson, D. H. 1999. The insignificance of statistical significance testing. Journal of Wildlife Management 63:763–772.
Anderson, D. R., K. P. Burnham, and W. L. Thompson. 2000. Null hypothesis testing: problems, prevalence, and an alternative. Journal of Wildlife Management 64:912–923.
Anderson, D. R., and K. P. Burnham. 2002. Avoiding pitfalls when using information-theoretic methods. Journal of Wildlife Management 66:912–918.
Information-theoretic methods, model selection:
Burnham, K. P., and D. R. Anderson. 2002. Model selection and multimodel inference: a practical information-theoretic approach. 2nd ed. Springer-Verlag, New York, NY.

27 Models “All models are wrong; some are useful.” – George Box
Models are approximations of reality. A statistical model is a mathematical expression that helps us predict a response (dependent) variable as a function of explanatory (independent) variables, based on a set of assumptions that allow the model not to fit exactly.

28 Parsimony Defined - Economy in the use of means to an end.
“…[using] the smallest number of parameters possible for adequate representation of the data” (Box and Jenkins 1970:17). In the context of our analyses, we strive to be economical in the use of parameters to explain the variation in the data.

29 Precision versus bias
[Figure: four target diagrams – biased & imprecise; unbiased & imprecise; unbiased & precise; biased & precise.]

30 Trade-off between precision and bias.
As K, the number of parameters, increases, bias decreases and variance increases. The best approximating model balances this trade-off.

31 Information Criterion
Kullback–Leibler (Kullback and Leibler 1951) “distance,” or “information,” seeks to describe the difference between models and forms the theoretical basis for data-based model selection.

32 AIC—Akaike's Information Criterion
Akaike's Information Criterion or AIC (Akaike 1973) estimates expected Kullback–Leibler information using Fisher's maximized log-likelihood function. The maximized log-likelihood is biased upward; the bias ≈ K (the number of estimable parameters). AIC = −2 ln(L) + 2K.
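The AIC formula can be sketched in Python (a minimal illustration, not part of the original slides; the log-likelihoods and parameter counts are hypothetical):

```python
def aic(log_lik, k):
    """Akaike's Information Criterion: AIC = -2 ln(L) + 2K."""
    return -2.0 * log_lik + 2 * k

# Two hypothetical models fit to the same data: the complex model's
# extra parameters buy too little likelihood, so its AIC is larger.
aic_simple = aic(log_lik=-120.3, k=2)
aic_complex = aic(log_lik=-119.9, k=5)
```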

33 AIC—Akaike's Information Criterion
The model with the smallest AIC value is the best approximating model. If none of the models is good, AIC still selects the best approximating model among those in the candidate set. It is therefore extremely important to ensure that the set of candidate models is well substantiated: plausible biological hypotheses, rooted in theory. AIC is only valid when comparing models fit to the same data.

34 Adjustments to AIC – AICc
Small-sample-size adjustment: AICc = AIC + 2K(K + 1) / (n − K − 1), where K is the number of parameters and n is the sample size.

35 AICc As sample size increases, the penalty for each additional parameter decreases, allowing for more model complexity with more data.

36 Overdispersion Sampling variance exceeds the theoretical (model-based) variance. Causes include lack of independence among individuals: animals that mate for life (the pair behaves as a unit), young of some species that continue to live with the parents, species traveling in flocks or schools. Another cause is heterogeneity – individuals having unique characteristics (e.g., survival or capture probability). Overdispersion can be detected by examining model "fit."

37 Goodness of fit Similar to examining expected frequencies of genotypes and phenotypes in general biology and genetics labs. If poor fit is detected Apply a variance inflation factor (c) or an estimate Perfect fit: c = 1.

38 Goodness of fit – c-hat Deviance = −2 ln(L_j) + 2 ln(L_sat)
Goodness of fit: χ² or G-test, or bootstrapping.

39 Quasi-likelihood (QAIC)
An adjustment to AIC that incorporates ĉ is the quasi-likelihood AIC (Lebreton et al. 1992), calculated as QAIC = −2 ln(L)/ĉ + 2K, and for small samples QAICc = QAIC + 2K(K + 1)/(n − K − 1). This influences model selection and AIC weights.
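A minimal Python sketch (not part of the original slides; model log-likelihoods and K values are hypothetical) showing how inflating ĉ shifts the ranking toward simpler models:

```python
def qaic(log_lik, k, c_hat):
    """Quasi-likelihood AIC: QAIC = -2 ln(L) / c_hat + 2K."""
    return -2.0 * log_lik / c_hat + 2 * k

# Hypothetical general (K = 8) vs. simple (K = 3) model: with no
# overdispersion (c_hat = 1) the general model wins, but a larger
# c_hat shrinks the likelihood term and flips the ranking.
general_fit = qaic(-94.0, k=8, c_hat=1.0)
simple_fit = qaic(-100.0, k=3, c_hat=1.0)
general_od = qaic(-94.0, k=8, c_hat=2.0)
simple_od = qaic(-100.0, k=3, c_hat=2.0)
```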

40 QAIC & c-hat As ĉ increases, uncertainty increases,
and model selection increasingly favors simpler models as ĉ gets larger. [Figure: QAIC versus ĉ for models with K = 7, K = 5, and K = 2.]

41 AIC differences AIC values are relative
Use ΔAICi – the difference between AICi and min(AIC). A larger ΔAICi reflects a greater distance between models and a lower likelihood that a model is the "best model." Burnham and Anderson (1998) recommend: ΔAICi < 2 – equivocal best models; 2 < ΔAICi < 4 – considerable support in the data; 4 < ΔAICi < 7 – less well supported; ΔAICi > 10 – no support; should not be considered.

42 AIC differences (example)

43 Strength of Evidence for Alternative Models
The likelihood of model i, given the data and the R models, is L(g_i | data) ∝ exp(−ΔAIC_i / 2). Normalized so they sum to 1, these can be interpreted as probabilities (Akaike model weights): w_i = exp(−ΔAIC_i / 2) / Σ_{r=1}^{R} exp(−ΔAIC_r / 2).
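The normalization into model weights can be sketched in Python (a minimal illustration, not part of the original slides; the ΔAIC values are hypothetical):

```python
from math import exp

def akaike_weights(deltas):
    """Akaike weights: normalize exp(-delta_i / 2) so they sum to 1."""
    rel = [exp(-d / 2.0) for d in deltas]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical delta-AIC values for three models; the model with
# delta = 0 (the AIC-best model) receives the largest weight.
weights = akaike_weights([0.0, 1.5, 6.6])
```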

44 Relative “strength of evidence”
The evidence for model i relative to model j is the ratio of their Akaike weights, w_i / w_j.

45 Strength of Evidence (example)

46 Parameter likelihoods
Sum of the weights for models including the parameter: w₊ = Σ_{i=1}^{R} w_i I_i, where the w_i are the model weights, I_i = 1 if the parameter appears in model i (0 otherwise), and R is the suite of models under consideration. Only applicable when parameters are equally represented in the model set.

47 Incorporating uncertainty
Multi-model inference: the w_i reflect the relative strength of evidence for the model(s), which implies uncertainty in the model selection process – several models may have ΔAICi < 2.0. This does not imply that the “true” model is in the model set. We need to incorporate this uncertainty when estimating parameter(s) and their precision.

48 Unconditional parameter estimates
Parameters (weighted average): θ̄ = Σ_{i=1}^{R} w_i θ̂_i, where θ̂_i is the parameter estimate from model i, the w_i are the model weights, I_i = 1 if the parameter appears in the model, and R is the suite of models under consideration.
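Model averaging can be sketched in Python (a minimal illustration, not part of the original slides; the survival estimates and weights are hypothetical):

```python
def model_averaged(estimates, weights):
    """Model-averaged parameter estimate: the weighted average of the
    per-model estimates, renormalizing in case the weights cover only
    the models that contain the parameter."""
    total = sum(weights)
    return sum(w * est for w, est in zip(weights, estimates)) / total

# Hypothetical survival estimates and Akaike weights from three models:
theta_bar = model_averaged([0.80, 0.78, 0.75], [0.60, 0.30, 0.10])
```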

49 Parameter estimates

50 Unconditional estimates of precision
Estimates of precision based on a single model are conditional on the selected model and tend to overestimate precision. The unconditional variance of a parameter θ (a weighted average) is: var(θ̄) = [Σ w_i sqrt(var(θ̂_i | g_i) + (θ̂_i − θ̄)²)]².
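The unconditional variance can be sketched in Python (a minimal illustration, not part of the original slides; estimates, conditional variances, and weights are hypothetical):

```python
from math import sqrt

def unconditional_var(estimates, variances, weights):
    """Unconditional variance of a model-averaged estimate: each
    model's conditional variance is augmented by a model-selection
    term (theta_i - theta_bar)^2 before the weighted sum is squared."""
    theta_bar = sum(w * t for w, t in zip(weights, estimates))
    return sum(w * sqrt(v + (t - theta_bar) ** 2)
               for w, t, v in zip(weights, estimates, variances)) ** 2

# Hypothetical estimates, conditional variances, and Akaike weights;
# the result exceeds the weighted average of the conditional variances.
uv = unconditional_var([0.80, 0.78, 0.75],
                       [0.0004, 0.0005, 0.0006],
                       [0.60, 0.30, 0.10])
```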

51 Unconditional estimates of precision (example)

52 Hypothesis testing Use AIC, AICc, QAIC, or QAICc in model selection procedures Likelihood Ratio Tests (LRTs) for planned comparisons among nested models

53 Hypothesis testing The LRT statistic is distributed approximately as χ²
with K_g − K_s degrees of freedom (where K is the number of estimated parameters in the general (g) and simple (s) models). An LRT with P < α indicates the additional parameters are warranted.
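A minimal Python sketch (not part of the original slides; the log-likelihoods are hypothetical) of a likelihood ratio test between nested models, compared against the well-known χ² critical value 3.84 for df = 1, α = 0.05:

```python
def lrt_statistic(log_lik_general, log_lik_simple):
    """Likelihood ratio test statistic, -2 (ln L_s - ln L_g);
    approximately chi-square with K_g - K_s degrees of freedom."""
    return -2.0 * (log_lik_simple - log_lik_general)

# Hypothetical nested models differing by one parameter (df = 1):
stat = lrt_statistic(log_lik_general=-119.9, log_lik_simple=-122.5)
# Chi-square critical value for df = 1, alpha = 0.05 is 3.84.
extra_params_warranted = stat > 3.84
```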

54 End of Lecture Material

55 Discussion of last week’s lab
Which articles clearly articulated the working hypotheses versus null hypotheses for the research? Read Johnson (1999) if you don't remember what a null hypothesis is.
Which articles included a justification based in theory?
Which articles clearly identified models, either conceptual or mathematical, that could be used as either the basis for predictions or comparisons to data?
Evaluate the scientific rigor of each article by indicating whether the research was based on the four categories below:
Experimental manipulation using appropriate controls, replication & randomization;
Impact studies (before and after), preferably with replication & randomization;
Observation based on a priori hypotheses;
Observation and a posteriori description.

56 Questions after Lab 1 Did the majority of articles:
Articulate working hypotheses? Relate hypotheses to theory? Identify models for prediction? Describe research that used controls or at least impacts? a priori models or a posteriori stories? Are these good criteria for evaluating scientific rigor? General comments on the rigor of specific journals?

