Wildlife Population Analysis


Wildlife Population Analysis Maximum Likelihood Estimators, Information Theoretic Methods, and Significance Testing

But first… 1 model ↔ 1 hypothesis, and Hypothesis = {T} + H. Importance of a priori thinking in research. Theory – a principle that is widely accepted and can be used to make predictions about natural phenomena. Hypothesis – a conjecture put forth as a possible explanation of observed phenomena; the basis of argument or experimentation to reach truth.

Also extremely important… Good study design – a few elements are:
Randomization – vital to getting a representative result; applies to selection of samples or assignment of treatments. Biased sampling → biased results; randomization is what supports inferring C → E. Simple random, stratified, cluster, adaptive…
Replication (not just ↑ sample size)
Use of controls = rigor + $$$
AIC and MLE won’t help a poorly designed study.

Scope of inference Draw conclusions, using deductive or inductive logic, about a population based on a limited number of samples. Q: How do we determine the population of inference? A: It is the group of individuals (units) with some probability of being sampled.

Good research “…statistical methods get in the way of [doing the research we need to do]…” In field studies, all too often we use cost and the “complexity of natural systems” as excuses for a lack of a priori thinking and poor scientific process…

Motivation
Maximum likelihood estimates:
Present estimation based on the likelihood
Introduce the concept of model-based estimation
Provide an example of maximum likelihood estimation
Information theoretic methods (e.g., AIC):
Introduce the idea of model uncertainty
Establish concepts of model selection

Part I: Maximum Likelihood Estimators

Maximum Likelihood Estimates Given a model with parameter(s) θ, the MLE is (are) the value(s) of the parameter(s) of interest under which the observed data are most probable. That is, they maximize the likelihood of the model given the data. The likelihood of a model is the product of the probabilities of the observations.

Maximum Likelihood Estimates For linear models (e.g., ANOVA and regression) these are usually determined using the linear equations that minimize the sum of the squared residuals – closed form. For nonlinear models and some distributions we determine MLEs by setting the first derivative equal to zero and solving, then confirming it is a maximum by checking that the second derivative is negative – closed form. Or we can search for values that maximize the probabilities of all of the observations – numerical estimation.

Binomial Sampling Characterized by two mutually exclusive events: heads or tails, on or off, male or female, or in our case survived or died. Often referred to as Bernoulli trials.

Models Trials have an associated parameter p, usually referred to as the probability of success. The probability of failure is 1 – p, often referred to as q; p + q = 1. The parameter p also represents a model – an approximation of the truth. Binomial models may be a good approximation; in more complex models (e.g., capture-mark-recapture) the probabilities and the models themselves may not approximate reality well.

Binomial Sampling p is a continuous parameter between 0 and 1 (0 < p < 1); y is the number of successful outcomes; n is the number of trials. The estimator p-hat = y/n is unbiased. Expectation: E(p-hat) = p. Additionally: var(p-hat) = p(1 – p)/n.

About binomial random variables:
The n trials must be identical – i.e., the population is well defined (e.g., 20 coin flips, 50 Kirtland's warbler nests, 75 radio-marked black bears in the Pisgah Bear Sanctuary).
Each trial results in one of two mutually exclusive outcomes (e.g., heads or tails, survived or died, successful or failed, etc.).
The probability of success on each trial remains constant (homogeneous).
Trials are independent events (the outcome of one does not depend on the outcome of another).
y, the number of successes, is the random variable after n trials.

Binomial Probability Function …the probability of observing y successes given n trials with the underlying probability p is f(y | n, p) = [n! / (y!(n – y)!)] p^y (1 – p)^(n – y). Example: 10 flips of a fair coin (p = 0.5), 7 of which turn up heads, is written f(7 | 10, 0.5).
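The probability function above can be evaluated with a short Python sketch (illustrative only; the function name binom_pmf is ours, not from the slides):

```python
from math import comb

def binom_pmf(y, n, p):
    # Probability of y successes in n Bernoulli trials:
    # f(y | n, p) = C(n, y) * p^y * (1 - p)^(n - y)
    return comb(n, y) * p**y * (1 - p)**(n - y)

# Example from the slide: 7 heads in 10 flips of a fair coin
print(binom_pmf(7, 10, 0.5))  # 0.1171875
```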

Binomial Probability Function (2) Evaluated numerically: f(7 | 10, 0.5) = 120 × 0.5^7 × 0.5^3 ≈ 0.117.

Binomial Probability Function (3) [Figure: the probability function plotted, with its maximum indicated]

Likelihood Function of Binomial Probability Reality: we have data (n and y) but don’t know the model (p). This leads us to the likelihood function L(p | n, y) = [n! / (y!(n – y)!)] p^y (1 – p)^(n – y), read “the likelihood of p given n and y.” It is not a probability function; it is a positive function of p (0 < p < 1).

Binomial Probability Function and its likelihood Back to our example, f(7 | 10, 0.5). To find the MLE we can ignore the constant (the binomial coefficient), use the observed values of n and y, and insert various values of p in L(p | 10, 7) ∝ p^7 (1 – p)^3.

Likelihood Function of Binomial Probability (2) Alternatively, the likelihood of the data given the model can be thought of as the product of the probabilities of the individual observations: L(p) = ∏ p^(f_i) (1 – p)^(1 – f_i), where f_i = 1 for a success and f_i = 0 for a failure.

Binomial Probability Function and its likelihood [Figure: the likelihood plotted against p, with the maximum at p-hat = 0.7]

Log likelihood Although the likelihood function is useful, the log-likelihood has some desirable properties: the terms are additive, and the binomial coefficient does not include p: ln L(p | n, y) = ln[n! / (y!(n – y)!)] + y ln(p) + (n – y) ln(1 – p).

Log likelihood Using the alternative form: ln L(p) = Σ [f_i ln(p) + (1 – f_i) ln(1 – p)]. Once again, the estimate of p that maximizes the value of ln(L) is the MLE.
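A numerical sketch of maximizing the log-likelihood (illustrative; a simple grid search rather than the closed-form solution):

```python
from math import comb, log

def log_lik(p, n, y):
    # ln L(p | n, y) = ln C(n, y) + y ln(p) + (n - y) ln(1 - p)
    return log(comb(n, y)) + y * log(p) + (n - y) * log(1 - p)

n, y = 10, 7
grid = [i / 1000 for i in range(1, 1000)]           # candidate p values
p_hat = max(grid, key=lambda p: log_lik(p, n, y))   # numerical MLE
print(p_hat)  # 0.7 -- matches the closed-form MLE y/n
```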

Properties of MLEs Asymptotically normally distributed; asymptotically minimum variance; asymptotically unbiased as n → ∞. One-to-one transformations of MLEs are also MLEs. For example, mean lifespan computed as a transformation of the survival MLE (e.g., –1/ln(S-hat)) is also an MLE.

Information Theoretic Methods

Resources Information theoretic methods:
Johnson, D. H. 1999. The insignificance of statistical significance testing. Journal of Wildlife Management 63:763–772.
Anderson, D. R., K. P. Burnham, and W. L. Thompson. 2000. Null hypothesis testing: problems, prevalence, and an alternative. Journal of Wildlife Management 64:912–923.
Anderson, D. R., and K. P. Burnham. 2002. Avoiding pitfalls when using information-theoretic methods. Journal of Wildlife Management 66:912–918.
Model selection:
Burnham, K. P., and D. R. Anderson. 2002. Model selection and multimodel inference: a practical information-theoretic approach. 2nd ed. Springer-Verlag, New York, NY.

Models “All models are wrong; some are useful.” – George Box. Models are approximations of reality. A statistical model is a mathematical expression that helps us predict a response (dependent) variable as a function of explanatory (independent) variables, based on a set of assumptions that allow the model not to fit exactly.

Parsimony Defined – economy in the use of means to an end. “…[using] the smallest number of parameters possible for adequate representation of the data” (Box and Jenkins 1970:17). In the context of our analyses, we strive to be economical in the use of parameters to explain the variation in data.

Precision versus bias [Figure: four targets illustrating the combinations biased & imprecise, unbiased & imprecise, unbiased & precise, and biased & precise]

Trade-off between precision and bias As K, the number of parameters, increases, bias decreases and variance increases. [Figure: bias and variance plotted against K, indicating the best approximating model]

Information Criterion Kullback-Leibler “distance,” or “information” (Kullback and Leibler 1951), seeks to describe the difference between models and forms the theoretical basis for data-based model selection.

AIC—Akaike's Information Criterion Akaike's Information Criterion or AIC (Akaike 1973) estimates expected Kullback-Leibler information from Fisher's maximized log-likelihood function. The maximum log-likelihood is biased upward; the bias ≈ K (the number of estimable parameters). AIC = –2ln(L) + 2K.
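The definition above is a one-liner in code; a minimal sketch (assumes you already have the maximized log-likelihood; the function name is ours):

```python
def aic(max_log_lik, k):
    # AIC = -2 ln(L) + 2K, where K is the number of estimable parameters
    return -2 * max_log_lik + 2 * k

print(aic(-10.0, 3))  # 26.0
```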

AIC—Akaike's Information Criterion The model with the smallest value of AIC is the best approximating model. If none of the models is good, AIC still selects the best approximating model among those in the candidate set, so it is extremely important to assure that the set of candidate models is well-substantiated: plausible biological hypotheses rooted in theory. AIC is only valid when comparing models fit to the same data.

Adjustments to AIC – AICc Small sample size adjustment: AICc = AIC + 2K(K + 1)/(n – K – 1), where K is the number of parameters and n is the sample size.

AICc As sample size increases, the penalty for each additional parameter decreases, allowing for more model complexity with more data.
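The small-sample correction can be sketched the same way (illustrative; aicc is our name for the function):

```python
def aicc(max_log_lik, k, n):
    # AICc = AIC + 2K(K + 1) / (n - K - 1); the extra penalty
    # shrinks toward zero as the sample size n grows
    return -2 * max_log_lik + 2 * k + 2 * k * (k + 1) / (n - k - 1)

# Same model (K = 3, ln L = -10): the correction falls as n increases
print(aicc(-10.0, 3, 20))   # 27.5
print(aicc(-10.0, 3, 200))  # ~26.12
```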

Overdispersion Sampling variance exceeds the theoretical (model-based) variance. Causes include lack of independence among individuals: animals that mate for life (the pair behaves as a unit); young of some species that continue to live with the parents; species traveling in flocks or schools. Also heterogeneity – individuals having unique characteristics (e.g., survival or capture probability). Can be detected by examining model “fit.”

Goodness of fit Similar to examining expected frequencies of genotypes and phenotypes in general biology and genetics labs. If poor fit is detected, apply a variance inflation factor (c) or an estimate (c-hat). Perfect fit: c = 1.

Goodness of fit – c-hat Deviance = –2ln(L_j) + 2ln(L_sat). Estimate c-hat via a goodness-of-fit χ² or G-test, or by bootstrapping.

Quasi-likelihood (QAIC) An adjustment to AIC that incorporates c-hat is the quasi-likelihood AIC (Lebreton et al. 1992), calculated as QAIC = –2ln(L)/c-hat + 2K, and for small samples QAICc = QAIC + 2K(K + 1)/(n – K – 1). Influences model selection and AIC weights.
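The quasi-likelihood adjustments can be sketched as follows (illustrative; note that c-hat = 1 recovers ordinary AIC):

```python
def qaic(max_log_lik, k, c_hat):
    # QAIC = -2 ln(L) / c-hat + 2K
    return -2 * max_log_lik / c_hat + 2 * k

def qaicc(max_log_lik, k, c_hat, n):
    # Small-sample version, analogous to AICc
    return qaic(max_log_lik, k, c_hat) + 2 * k * (k + 1) / (n - k - 1)

# As c-hat grows, the likelihood term shrinks relative to the 2K
# penalty, so simpler models are increasingly favored.
print(qaic(-10.0, 3, 1.0))  # 26.0 (= AIC)
print(qaic(-10.0, 3, 2.0))  # 16.0
```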

QAIC & c-hat As c-hat increases, uncertainty increases, and model selection increasingly favors simpler models. [Figure: QAIC versus c-hat for models with K = 7, K = 5, and K = 2]

AIC differences AIC values are relative, so use ΔAIC_i – the difference between AIC_i and min(AIC). A larger ΔAIC_i reflects a greater distance between models and a lower likelihood that a model is the “best model.” Burnham and Anderson (1998) recommend:
ΔAIC_i < 2 – equivocal best models
2 < ΔAIC_i < 4 – considerable support in the data
4 < ΔAIC_i < 7 – less well-supported
ΔAIC_i > 10 – no support; should not be considered

AIC differences (example)

Strength of Evidence for Alternative Models The likelihood of model i, given the data and the R models, is proportional to exp(–ΔAIC_i / 2). Normalized so they sum to 1, these are interpreted as probabilities (model weights): w_i = exp(–ΔAIC_i / 2) / Σ_r exp(–ΔAIC_r / 2).
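The weights can be computed directly from a set of AIC values (a sketch; the three AIC values are made up for illustration):

```python
from math import exp

def akaike_weights(aics):
    # Delta-AIC for each model, then normalized model weights
    best = min(aics)
    deltas = [a - best for a in aics]
    rel_lik = [exp(-d / 2) for d in deltas]   # likelihood of each model
    total = sum(rel_lik)
    return deltas, [r / total for r in rel_lik]

deltas, weights = akaike_weights([100.0, 102.0, 110.0])
print(deltas)   # [0.0, 2.0, 10.0]
print([round(w, 3) for w in weights])
```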

Relative “strength of evidence” Ratio of the AIC weights, w_i / w_j – the evidence for model i relative to model j.

Strength of Evidence (example)

Parameter likelihoods Sum of the weights for models including the parameter: Σ_i w_i I_i, where the w_i are the model weights, I_i = 1 if the parameter appears in model i (0 otherwise), and R is the suite of models under consideration. Only applicable when parameters are equally represented in the model set.

Incorporating uncertainty Multi-model inference: the w_i reflect the relative strength of evidence for the model(s), which implies uncertainty in the model selection process – several models may have ΔAIC_i < 2.0. This does not imply that the “true” model is in the model set. We need to incorporate this uncertainty when estimating parameter(s) and precision.

Unconditional parameter estimates Parameters (weighted average): theta-hat-bar = Σ_i w_i I_i theta-hat_i, where: theta-hat_i is the parameter estimate from model i; the w_i are the model weights; I_i = 1 if the parameter appears in model i; R is the suite of models under consideration.

Parameter estimates

Unconditional estimates of precision Estimates of precision based on a single model are conditional on the selected model and tend to overestimate precision. The unconditional variance of a parameter theta-hat-bar (weighted average) is: var(theta-hat-bar) = [ Σ_i w_i sqrt( var(theta-hat_i | model_i) + (theta-hat_i – theta-hat-bar)² ) ]².
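The weighted average and its unconditional variance can be sketched as below (illustrative; the estimates, variances, and weights in the example are invented):

```python
from math import sqrt

def model_average(estimates, variances, weights):
    # Model-averaged estimate: weights times per-model estimates
    theta_bar = sum(w * t for w, t in zip(weights, estimates))
    # Unconditional SE folds in within-model variance plus the
    # between-model spread (theta_i - theta_bar)^2, then is squared
    se = sum(w * sqrt(v + (t - theta_bar) ** 2)
             for w, t, v in zip(weights, estimates, variances))
    return theta_bar, se ** 2

theta_bar, var_u = model_average([0.65, 0.70], [0.002, 0.003], [0.6, 0.4])
print(round(theta_bar, 3))  # 0.67
```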

Unconditional estimates of precision (example)

Hypothesis testing Use AIC, AICc, QAIC, or QAICc in model selection procedures; use Likelihood Ratio Tests (LRTs) for planned comparisons among nested models.

Hypothesis testing LRT = –2ln(L_s / L_g), distributed approximately as χ² with K_g – K_s degrees of freedom (where K is the number of estimated parameters in the simple (s) and general (g) models). An LRT with P < α indicates the additional parameters are warranted.
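For nested models differing by a single parameter, the LRT and its p-value need only the standard library, since the upper tail of a 1-df χ² is erfc(sqrt(x/2)) (a sketch; the log-likelihoods are invented, and for general degrees of freedom you would use a χ² routine such as scipy.stats.chi2.sf):

```python
from math import erfc, sqrt

def lrt_1df(lnl_simple, lnl_general):
    # LRT = -2 ln(L_s / L_g) = -2 (ln L_s - ln L_g), compared to a
    # chi-square with 1 df; its upper tail is erfc(sqrt(x / 2))
    stat = -2 * (lnl_simple - lnl_general)
    return stat, erfc(sqrt(stat / 2))

stat, p = lrt_1df(-105.0, -102.0)
print(stat)          # 6.0
print(round(p, 3))   # ~0.014
```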

End of Lecture Material

Discussion of last week’s lab
Which articles clearly articulated the working hypotheses versus null hypotheses for the research? (Read Johnson (1999) if you don't remember what a null hypothesis is.)
Which articles included a justification based in theory?
Which articles clearly identified models, either conceptual or mathematical, that could be used as either the basis for predictions or comparisons to data?
Evaluate the scientific rigor of each article by indicating whether the research was based on the four categories below:
Experimental manipulation using appropriate controls, replication & randomization
Impact studies (before and after), preferably with replication & randomization
Observation based on a priori hypotheses
Observation and a posteriori description

Questions after Lab 1 Did the majority of articles: articulate working hypotheses? relate hypotheses to theory? identify models for prediction? describe research that used controls or at least impacts? use a priori models or a posteriori stories? Are these good criteria for evaluating scientific rigor? General comments on the rigor of specific journals?