Lecture 4 Model Selection and Multimodel Inference.

Topics
• Model selection: how do you choose between (and compare) alternate models?
• Multimodel inference: how do you combine information from more than one model?

Comparing alternative models
• We don’t ask “Is the model right or wrong?” We ask “Do the data support one model more than a competing model?”
• Strength of evidence (support) for a model is relative:
  - Relative to other models: as models improve, support may change.
  - Relative to the data at hand: as the data improve, support may change.

Bias and Uncertainty in Model Selection
• Model selection bias: chance inclusion of meaningless variables in a model will produce a biased underestimate of the variance, and a corresponding exaggeration of the precision of the model (the problem with “fishing expeditions”).
• Model selection uncertainty: because we use the same data (with uncertainty) both to estimate parameters and to select the best model, the model selection process itself is necessarily uncertain.
See the discussion in Burnham and Anderson.

Comparing alternative models: methods
• Likelihood ratio tests
  - Limited to comparisons between two models
• Akaike’s Information Criterion (AIC)
  - Can be used to simultaneously assess many models
• Modifications to take into account sample size:
  - AIC for small sample size
  - Schwarz criterion (particularly for very large samples)
Remember: you can only directly compare alternate models applied to exactly the same dataset.

Recall the Likelihood Principle…
“Within the framework of a statistical model, a set of data supports one statistical hypothesis better than the other if the likelihood of the first hypothesis, on the data, exceeds the likelihood of the second hypothesis.” (Edwards 1972)

But remember parsimony…
• A more complex model (more parameters) is expected to have a higher likelihood, so we need some way to penalize models with higher numbers of parameters.

Likelihood ratios
• The likelihood ratio L[A(x)] / L[B(x)] is a measure of the strength of evidence favoring model (hypothesis) A over model (hypothesis) B.
• Issues:
  - What constitutes a “big” difference?
  - How do you penalize a model if it uses more parameters?

Likelihood ratio tests (LRT)
• The LRT statistic, $R = -2 \ln\left( L[A(x)] / L[B(x)] \right)$ with A the simpler of two nested models, approximately follows a chi-square distribution with degrees of freedom equal to the difference in the number of parameters between models A and B.
Remember: if the two models have the same number of parameters, just use likelihood to compare them.
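A minimal sketch of a likelihood ratio test, assuming two nested models whose maximized log-likelihoods are already known. The statistic is twice the difference in log-likelihood, compared against a chi-square distribution. The function names and the example log-likelihood values are hypothetical, and the chi-square tail probability is computed from a standard recurrence so that only the standard library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 (odd case) and df = 2 (even case).
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(half))
    else:
        k, q = 2, math.exp(-half)
    # Recurrence: Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

def likelihood_ratio_test(loglik_simple, loglik_complex, k_simple, k_complex):
    """Return (statistic, degrees of freedom, p-value) for two nested models."""
    if k_complex <= k_simple:
        raise ValueError("the complex model must have more parameters")
    stat = 2.0 * (loglik_complex - loglik_simple)
    df = k_complex - k_simple
    return stat, df, chi2_sf(stat, df)

# Hypothetical log-likelihoods for a 2-parameter and a 3-parameter model.
stat, df, p_value = likelihood_ratio_test(-112.4, -109.1, 2, 3)
print(f"LRT statistic = {stat:.2f}, df = {df}, p = {p_value:.4f}")
```

A small p-value suggests that the extra parameter improves the fit by more than chance alone would explain.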

Limitations of Likelihood Ratio Tests
• Can only compare a pair of models at a time (gets clumsy when you have a larger set of models).
• Requires that you use a traditional frequentist “p-value” as your basis for judging between models.

A more general framework for model comparison: Information theory
• “Reality” = “Truth” = Unknowable (or at least too much trouble to find…)
• Models are approximations of reality, and we’d like to know how “close” they are to reality.
• The “distance” between a model and reality is defined by the “Kullback-Leibler Information” (K-L distance).
• Unfortunately, K-L distance can only be directly computed in hypothetical cases where reality is known.
See Chapter 2 of Burnham and Anderson for discussion and details.
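To make the idea of K-L distance concrete, here is a small sketch for the hypothetical case where reality is known. It assumes discrete distributions and uses the standard definition I(f, g) = Σ f_i ln(f_i / g_i), with f the truth and g the model. The distributions below are invented for illustration.

```python
import math

def kl_information(truth, model):
    """Kullback-Leibler information I(f, g) for two discrete distributions."""
    if len(truth) != len(model):
        raise ValueError("distributions must have the same number of outcomes")
    total = 0.0
    for f, g in zip(truth, model):
        if f > 0:
            if g <= 0:
                return math.inf  # the model rules out an outcome that does occur
            total += f * math.log(f / g)
    return total

# Hypothetical "reality" and two candidate models of it.
truth = [0.5, 0.3, 0.2]
model_a = [0.45, 0.35, 0.20]
model_b = [0.2, 0.3, 0.5]

print("K-L distance, model A:", kl_information(truth, model_a))
print("K-L distance, model B:", kl_information(truth, model_b))
```

The model with the smaller value loses less information when used to approximate the truth; here that is model A.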

Interpretation of Kullback-Leibler Information
• Information entropy = information content of a random outcome.
• Minimizing KL is the same as maximizing entropy.
• We want a model that does not respond to randomness but does respond to information.
• We maximize entropy subject to the constraints of the model used to capture information in the data.
• By maximizing entropy, subject to a constraint, we leave only the information supported by the data. The model does not respond to noise.

Akaike’s contribution (1973)
• Akaike (1973) proposed “an information criterion” (AIC), now often called the Akaike Information Criterion, that relates likelihood to K-L distance and includes an explicit term for model complexity:
  $\mathrm{AIC} = -2 \ln L(\hat{\theta} \mid y) + 2K$
  This is an estimate of the expected, relative distance between the fitted model and the unknown true mechanism that generated the observed data.
  K = number of estimated parameters

Akaike’s Information Criterion
• AIC has a built-in penalty for models with larger numbers of parameters.
• Provides an implicit tradeoff between bias and variance.
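As a rough illustration, AIC can be computed from a maximized log-likelihood and a parameter count using the standard definition AIC = -2 ln L + 2K. The model names and log-likelihood values below are hypothetical.

```python
def aic(loglik, k):
    """Akaike's Information Criterion from a maximized log-likelihood and K parameters."""
    return -2.0 * loglik + 2.0 * k

# Hypothetical candidate models: (maximized log-likelihood, number of parameters).
candidates = {"linear": (-109.1, 3), "quadratic": (-108.9, 4)}

scores = {name: aic(loglik, k) for name, (loglik, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "-> lowest AIC:", best)
```

The quadratic model has the higher likelihood, but the improvement is too small to offset the penalty for its extra parameter.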

AIC
• We select the model with the smallest value of AIC (i.e. closest to “truth”).
• AIC will identify the best model in the set, even if all the models are poor!
• It is the researcher’s (your) responsibility to ensure that the set of candidate models includes well-founded, realistic models.

AIC for small samples
• Unless the sample size (n) is large with respect to the number of estimated parameters (K), use of AICc is recommended.
• Generally, you should use AICc when the ratio n/K is small (less than ~40), based on K from the global (most complicated) model.
• Use AIC or AICc consistently in an analysis rather than mixing the two criteria.
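A short sketch of the small-sample correction, using the standard form AICc = AIC + 2K(K + 1) / (n - K - 1). The function name and the example values (n = 50, K = 3) are assumptions chosen for illustration.

```python
def aicc(loglik, k, n):
    """Small-sample AIC: AIC plus the correction term 2K(K + 1) / (n - K - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > K + 1")
    return -2.0 * loglik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

# Hypothetical fit: 50 observations, 3 estimated parameters, so n/K is about 17.
n, k, loglik = 50, 3, -109.1
print("n/K =", round(n / k, 1), " AICc =", round(aicc(loglik, k, n), 3))
```

Because the correction term shrinks toward zero as n grows, AICc and AIC agree for large samples.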

Penalty for adding parameters: small sample sizes
How much does the likelihood have to improve to justify adding one more parameter, as a function of sample size and model complexity? (Using AICc.)
Note: the penalty converges on 1 as sample size >> 100.
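The question on this slide can be explored numerically. The sketch below assumes the standard AICc penalty, expressed in log-likelihood units as K + K(K + 1) / (n - K - 1), and reports how much the log-likelihood must improve to justify going from K to K + 1 parameters. The function names are hypothetical.

```python
def aicc_penalty(k, n):
    """AICc penalty in log-likelihood units: (2K + 2K(K + 1) / (n - K - 1)) / 2."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > K + 1")
    return k + k * (k + 1) / (n - k - 1)

def cost_of_one_more_parameter(k, n):
    """Log-likelihood gain needed before a model with K + 1 parameters beats one with K."""
    return aicc_penalty(k + 1, n) - aicc_penalty(k, n)

for n in (20, 50, 100, 1000):
    row = [round(cost_of_one_more_parameter(k, n), 3) for k in (1, 2, 5)]
    print(f"n = {n:5d}: cost for K = 1, 2, 5 ->", row)
```

The cost is largest for small samples and complex base models, and it approaches 1 unit of log-likelihood as n becomes large.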

But what about cases with very large sample sizes?
• Does it make intuitive sense that a model that is only 1 unit of likelihood better, for a very large sample, should be selected?
• The Schwarz criterion (also known as the Bayesian Information Criterion, BIC):
  $\mathrm{BIC} = -2 \ln L(\hat{\theta} \mid y) + K \ln(n)$
So the penalty is simply a function of sample size, and increases as a function of the natural log of n (divided by 2).

Penalty for adding parameters: large sample sizes
So the penalty is simply a function of sample size (regardless of base model complexity), and increases as a function of the natural log of n (divided by 2).
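A brief sketch of the Schwarz criterion, assuming the standard form BIC = -2 ln L + K ln(n). In log-likelihood units the cost of each extra parameter is ln(n) / 2, compared with 1 for AIC. The function names and example values are hypothetical.

```python
import math

def bic(loglik, k, n):
    """Schwarz criterion (BIC) from a maximized log-likelihood, K parameters, n observations."""
    return -2.0 * loglik + k * math.log(n)

def bic_cost_per_parameter(n):
    """Log-likelihood gain needed to justify one more parameter under BIC."""
    return math.log(n) / 2.0

for n in (10, 100, 1000, 100000):
    print(f"n = {n:6d}: cost per parameter = {bic_cost_per_parameter(n):.2f}")
```

The per-parameter cost does not depend on how many parameters the base model already has, only on the sample size.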

Some Rough Rules of Thumb
• Differences in AIC (Δ_i) can be used to interpret the strength of evidence for one model vs. another.
• A model with a Δ value within 1-2 of the best model has substantial support in the data, and should be considered along with the best model.
• A model with a Δ value 4-7 units from the best model has considerably less support.
• A model with a Δ value > 10 has virtually no support and can be omitted from further consideration.
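The Δ values in these rules of thumb are differences from the lowest AIC in the candidate set. A minimal sketch, with hypothetical model names and AIC values:

```python
def aic_differences(aic_values):
    """Return delta_i = AIC_i - AIC_min for each model in a dict of AIC values."""
    best = min(aic_values.values())
    return {name: value - best for name, value in aic_values.items()}

# Hypothetical AIC values for three candidate models fitted to the same dataset.
aic_values = {"A": 224.2, "B": 225.8, "C": 236.9}
deltas = aic_differences(aic_values)

for name, delta in sorted(deltas.items(), key=lambda item: item[1]):
    print(f"model {name}: delta = {delta:.1f}")
```

Under the rules of thumb, model B (Δ between 1 and 2) would be considered along with A, while model C (Δ > 10) could be dropped.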

Comparing models with different PDFs
• LRTs and AIC can be used as one basis for selecting the “best” PDF for a given dataset and model.
• But more generally, an examination of the distribution of the residuals should guide the choice of the appropriate PDF.
• There will be cases where different PDFs are appropriate for different models applied to the same dataset.
  - Example: neighborhood competition models where residuals shift from lognormally to normally distributed as the models are improved by additional terms

Strength of evidence for alternate models: Akaike weights
Akaike weights (w_i) are the weight of evidence in favor of model i being the actual best model for the situation at hand, given that one of the N models must be the best model for that set of N models. Akaike weights for all models combined should add up to 1.
  $w_i = \exp(-\Delta_i / 2) / \sum_{r=1}^{N} \exp(-\Delta_r / 2)$
where $\Delta_i = \mathrm{AIC}_i - \mathrm{AIC}_{\min}$.
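A sketch of how Akaike weights can be computed, assuming the standard definition: each weight is exp(-Δ_i / 2) divided by the sum of that quantity over all N models, with Δ_i the AIC difference from the best model. The AIC values are hypothetical.

```python
import math

def akaike_weights(aic_values):
    """Return Akaike weights for a dict of AIC values; the weights sum to 1."""
    best = min(aic_values.values())
    relative = {name: math.exp(-(value - best) / 2.0)
                for name, value in aic_values.items()}
    total = sum(relative.values())
    return {name: r / total for name, r in relative.items()}

# Hypothetical AIC values for three candidate models fitted to the same dataset.
weights = akaike_weights({"A": 224.2, "B": 225.8, "C": 236.9})
for name, w in weights.items():
    print(f"model {name}: w = {w:.3f}")

# Evidence ratio: how many times more support model A has than model B.
print("evidence ratio A/B =", round(weights["A"] / weights["B"], 2))
```

Subtracting the minimum AIC before exponentiating keeps the computation numerically stable when AIC values are large.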

Uses of Akaike weights
• “Probability” that the candidate model is the best model.
• Relative strength of evidence (evidence ratios).
• Variable selection: which independent variable has the greatest influence?
• Model averaging.

An example...
The Data:
  x_i = measurements of DBH on 50 trees
  y_i = measurements of crown radius on those trees
The Scientific Models:
  y_i = β x_i + ε  [1 parameter (β)]
  y_i = α + β x_i + ε  [2 parameters (α, β)]
  y_i = α + β x_i + γ x_i^2 + ε  [3 parameters (α, β, γ)]
The Probability Model:
  ε is normally distributed, with mean = 0 and variance estimated from the observed variance of the residuals...
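The sketch below works through a reduced version of this example. It assumes the first two scientific models are a line through the origin and a line with an intercept, fits both by least squares (which gives the maximum likelihood estimates under normal errors), and compares them with AIC. The data are synthetic and invented for illustration, the quadratic model is omitted for brevity, and K follows the parameter counts given on the slide. Counting the variance as an extra parameter would add the same constant to both AIC values.

```python
import math

def gaussian_loglik(residuals):
    """Maximized normal log-likelihood, with variance = mean squared residual."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)

def residuals_through_origin(x, y):
    """Residuals of the least-squares fit y = beta * x."""
    beta = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    return [b - beta * a for a, b in zip(x, y)]

def residuals_with_intercept(x, y):
    """Residuals of the least-squares fit y = alpha + beta * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    beta = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
            / sum((a - mean_x) ** 2 for a in x))
    alpha = mean_y - beta * mean_x
    return [b - (alpha + beta * a) for a, b in zip(x, y)]

# Synthetic stand-in data: 50 trees, DBH from 10 to 59, deterministic "noise".
dbh = [10.0 + i for i in range(50)]
crown = [1.0 + 0.08 * d + 0.2 * math.sin(1.7 * i) for i, d in enumerate(dbh)]

loglik_1 = gaussian_loglik(residuals_through_origin(dbh, crown))
loglik_2 = gaussian_loglik(residuals_with_intercept(dbh, crown))
aic_1 = -2.0 * loglik_1 + 2.0 * 1
aic_2 = -2.0 * loglik_2 + 2.0 * 2
print(f"through origin: lnL = {loglik_1:.2f}, AIC = {aic_1:.2f}")
print(f"with intercept: lnL = {loglik_2:.2f}, AIC = {aic_2:.2f}")
```

Because the synthetic data were generated with a nonzero intercept, the two-parameter model earns its extra parameter here; with real data the outcome depends on the observations.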

Back to the example…
Akaike weights can be interpreted as the estimated probability that model i is the best model for the data at hand, given the set of models considered. Weights > 0.90 indicate that robust inferences can be made using just that model.

Akaike weights and the relative importance of variables
• For nested models, estimates of the relative importance of predictor variables can be made by summing the Akaike weights of variables across all the models where the variables occur.
• Variables can be ranked using these sums.
• The larger this sum of weights, the more important the variable is.
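The summing step can be sketched as follows. The candidate set, the variable names, and the weights are all hypothetical; the only requirement is that each model lists the predictor variables it contains alongside its Akaike weight.

```python
def variable_importance(models):
    """Sum Akaike weights over every model in which each variable appears.

    `models` is a list of (set of variable names, Akaike weight) pairs.
    """
    importance = {}
    for variables, weight in models:
        for variable in variables:
            importance[variable] = importance.get(variable, 0.0) + weight
    return importance

# Hypothetical candidate set with weights that sum to 1.
models = [
    ({"dbh"}, 0.5),
    ({"dbh", "light"}, 0.3),
    ({"light"}, 0.2),
]

importance = variable_importance(models)
ranking = sorted(importance, key=importance.get, reverse=True)
print(importance, "-> ranking:", ranking)
```

In this invented example "dbh" ranks first because it appears in the models that carry most of the weight.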

Example: detecting density dependence Source: Brook, B.W. and C.J.A. Bradshaw Strength of evidence for density dependence in abundance time series of 1198 species. Ecology 87:

Ambivalence about selecting a best model to use for inference…
The inability to identify a single best model is not a defect of the AIC method. It is an indication that the data are not adequate to reach strong inference.
What is to be done? MULTIMODEL INFERENCE AND MODEL AVERAGING

Multimodel Inference
• If one model is clearly the best (w_i > 0.90), then inference can be made based on this best model.
• Weak strength of evidence in favor of one model suggests that a different dataset may support one of the alternate models.
• Designation of a single best model is often unsatisfactory because the “best” model is highly variable.
• We can compute a weighted estimate of the parameter and the predicted value using Akaike weights.

Akaike Weights and Multimodel Inference
• Estimate parameter values for the models with at least some measurable support.
• Estimate a weighted average of parameters across those models.
• Only applicable to linear models.
• For non-linear models, we can generate weighted averages of the predicted response value for given values of the predictor variables.

Akaike Weights and Multimodel Inference
Estimate of parameter A = (0.73 × 1.04) + (0.27 × 1.31) = 1.11
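The arithmetic on this slide is a weighted average of the per-model estimates, using the Akaike weights as the weights. A minimal sketch that reproduces it (the function name is hypothetical):

```python
def model_average(weights, estimates):
    """Akaike-weighted average of one parameter's estimates across models."""
    if len(weights) != len(estimates):
        raise ValueError("need one estimate per weight")
    if abs(sum(weights) - 1.0) > 1e-6:
        raise ValueError("Akaike weights should sum to 1")
    return sum(w * e for w, e in zip(weights, estimates))

# Values from the slide: two models with weights 0.73 and 0.27.
averaged = model_average([0.73, 0.27], [1.04, 1.31])
print("model-averaged estimate of parameter A:", round(averaged, 2))
```

The same function could be applied to predicted response values instead of parameter estimates, which is the approach the preceding slide suggests for non-linear models.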

Multimodel Inference: An example
• Neighborhood models of tree growth:
  - Can we use MMI to improve parameter estimates for individual terms in the model? (not easily, given non-linearities in this model)
  - Can we use MMI to improve predictions of growth using a weighted suite of alternate models? (yes, but is it worth the effort?)
See: Papaik, M. J., and C. D. Canham Multi-model analysis of tree competition along environmental gradients in southern New England forests. Ecological Applications 16:

Summary: Steps in Model Selection
• Develop candidate models based on biological knowledge.
• Take observations (data) relevant to predictions of the model.
• Use data to obtain MLE of parameters of the alternate models.
• Evaluate strength of evidence for alternate models using AIC and Akaike weights.
• …Multimodel Inference?
Do you agree with Burnham and Anderson that MMI is generally preferable to “best-model inference”?