
Day 7 Model Evaluation

Elements of model evaluation:
- Goodness of fit
- Prediction error
- Bias
- Outliers and patterns in residuals

Assessing Goodness of Fit for Continuous Data
- Visual methods: don't underestimate the power of your eyes, but eyes can deceive, too...
- Quantification: a variety of traditional measures, all with some limitations...
A good review: C. D. Schunn and D. Wallach, Evaluating Goodness-of-Fit in Comparison of Models to Data.

Traditional inferential tests masquerading as GOF measures
The χ² "goodness of fit" statistic:
- For categorical data only. It can be used as a test statistic, answering: what is the probability of observing results at least this discrepant if the model is true?
- The test can only be used to reject a model. If the model is not rejected, the statistic contains no information on how good the fit is.
- Thus, this is really a badness-of-fit statistic.
- Other limitations as a measure of goodness of fit: it rewards sloppy research if you are actually trying to "test" a real model as a null hypothesis, because small sample size and noisy data will limit the power to reject it.
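A minimal sketch of that use of the statistic (the counts and category probabilities below are hypothetical, and SciPy is not part of the original lecture); note that the test can only tell you whether to reject the model:

```python
import numpy as np
from scipy.stats import chisquare

observed = np.array([18, 25, 32, 25])                 # hypothetical category counts
model_probs = np.array([0.20, 0.25, 0.30, 0.25])      # probabilities predicted by the model
expected = model_probs * observed.sum()

stat, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {stat:.2f}, p = {p:.3f}")
# A non-significant p-value does NOT measure how good the fit is:
# with a small, noisy sample there is simply little power to reject.
```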

Visual evaluation for continuous data
- Graphing observed vs. predicted...

Examples: goodness of fit of neighborhood models of canopy tree growth for two species at Date Creek, BC (plots of predicted vs. observed growth).
Source: Canham, C. D., P. T. LePage, and K. D. Coates. A neighborhood analysis of canopy tree competition: effects of shading versus crowding. Canadian Journal of Forest Research.

Goodness of Fit vs. Bias (observed vs. predicted scatterplots, with the 1:1 line shown for reference)

R² as a measure of goodness of fit
- R² = proportion of variance* explained by the model (relative to that explained by the simple mean of the data):

  R^2 = 1 - \frac{\sum_i (obs_i - exp_i)^2}{\sum_i (obs_i - \overline{obs})^2}

  where exp_i is the expected value of observation i given the model, and \overline{obs} is the overall mean of the observations.
- Note: R² is NOT bounded between 0 and 1.
* This interpretation of R² is technically only valid for data where SSE is an appropriate estimate of variance (e.g. normal data).
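A minimal sketch of that calculation (NumPy and the function name are mine, not part of the lecture):

```python
import numpy as np

def r_squared(obs, pred):
    """R^2 = 1 - SSE/SST, with SST taken around the overall mean of the observations.
    Can be negative when the model predicts worse than the mean does."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    sse = np.sum((obs - pred) ** 2)
    sst = np.sum((obs - obs.mean()) ** 2)
    return 1.0 - sse / sst
```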

R² – when is the mean the mean?
- Clark et al. (1998) Ecological Monographs 68:220: for i = 1..N observations in j = 1..S sites, use the SITE means, rather than the overall mean, to calculate R².
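A sketch of that site-level variant under my reading of the slide (the array names are hypothetical): the baseline sum of squares is taken around each site's own mean.

```python
import numpy as np

def r_squared_by_site(obs, pred, site):
    """R^2 with SST computed around each site's mean instead of the overall mean."""
    obs, pred, site = np.asarray(obs, float), np.asarray(pred, float), np.asarray(site)
    sse = np.sum((obs - pred) ** 2)
    site_means = {s: obs[site == s].mean() for s in np.unique(site)}
    sst = np.sum((obs - np.array([site_means[s] for s in site])) ** 2)
    return 1.0 - sse / sst
```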

r² as a measure of goodness of fit
- r² = the squared correlation (r) between observed (x) and predicted (y).
- NOTE: r² is bounded between 0 and 1 (and r between -1 and 1).

R² vs. r²
Is this a good fit (r² = 0.81) or a really lousy fit (R² = -0.39)? (It's undoubtedly biased...)
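A small simulation (with made-up numbers, not the data behind the slide) showing how this happens: predictions that track the observations closely but are biased give a high r² and a negative R².

```python
import numpy as np

rng = np.random.default_rng(1)
obs = rng.normal(10.0, 2.0, size=100)
pred = 0.5 * obs + 12.0 + rng.normal(0.0, 0.3, size=100)   # tracks obs well, but biased

r2 = np.corrcoef(obs, pred)[0, 1] ** 2                      # squared correlation: high
R2 = 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)  # negative
print(f"r^2 = {r2:.2f}, R^2 = {R2:.2f}")
```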

A note about notation... Check the documentation when a package reports "R²" or "r²"; don't assume they will be used as I have used them. Sample Excel output using the "trendline" option for a chart: the "R²" value of 0.89 reported by Excel is actually r² (while R² is actually 0.21). (If you specify no intercept, Excel reports true R²...)

R² vs. r² for goodness of fit
- When there is no bias, the two measures will be almost identical (but I prefer R², in principle).
- When there is bias, R² will be low to negative, but r² will indicate how good the fit could be after taking the bias into account...

Sensitivity of R² and r² to data range
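A hedged illustration of the point in this slide's title (the model, coefficients, and ranges are invented for the example): with the same error variance, the true model earns a much higher R² when the predictor spans a wide range than when it spans a narrow one.

```python
import numpy as np

rng = np.random.default_rng(2)

def r2_of_true_model(x_lo, x_hi, n=200, sigma=1.0):
    """Simulate y = 2 + 0.5*x + noise and return R^2 of the true (noise-free) model."""
    x = rng.uniform(x_lo, x_hi, size=n)
    y = 2.0 + 0.5 * x + rng.normal(0.0, sigma, size=n)
    pred = 2.0 + 0.5 * x
    return 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)

print(f"narrow predictor range (0-5):  R^2 = {r2_of_true_model(0, 5):.2f}")
print(f"wide predictor range   (0-50): R^2 = {r2_of_true_model(0, 50):.2f}")
```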

The Tyranny of R² (and r²)
- Limitations of R² (and r²) as a measure of goodness of fit:
  - Not an absolute measure (as frequently assumed), particularly when the variance of the appropriate PDF is NOT independent of the mean (expected) value, e.g. lognormal, gamma, or Poisson distributions.

Gamma Distributed Data... For a fixed shape parameter k, the gamma distribution has mean kθ and variance kθ², so the variance increases as the square of the mean!

So, how good is good?
- Our assessment is ALWAYS subjective, because of:
  - the complexity of the process being studied
  - sources of noise in the data
- From a likelihood perspective, should you ever expect R² = 1?

Other Goodness of Fit Issues...
- In complex models, a good fit may be due to the overwhelming effect of one variable...
- The best-fitting model may not be the most "general", i.e. the fit can be improved by adding terms that account for unique variability in a specific dataset but that limit applicability to other datasets. (The curse of ad hoc multiple regression models...)

How good is good: deviance
- Comparison of your model to a "full" model, given the probability model. For i = 1..n observations, a vector X of observed data (x_i), and a vector θ of j = 1..m parameters (θ_j), define a "full" model with n parameters θ_i = x_i (θ_full). Then

  D = 2\left[\ln L(\theta_{full} \mid X) - \ln L(\theta \mid X)\right]

  (Nelder and Wedderburn 1972).

Deviance for normally distributed data
- The log-likelihood of the full model is a function of both sample size (n) and variance (σ²): with all residuals equal to zero, \ln L_{full} = -\frac{n}{2}\ln(2\pi\sigma^2).
- Therefore, deviance is NOT an absolute measure of goodness of fit...
- But it does establish a standard of comparison (the full model), given your sample size and your estimate of the underlying variance...
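A sketch of that comparison for normal errors, assuming σ is estimated separately and treated as fixed (SciPy and the function name are mine, not part of the lecture):

```python
import numpy as np
from scipy.stats import norm

def deviance_normal(obs, pred, sigma):
    """Deviance = 2 * (logL of the full model - logL of the fitted model), normal errors.
    With sigma treated as fixed, this reduces to sum((obs - pred)**2) / sigma**2."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    loglik_model = norm.logpdf(obs, loc=pred, scale=sigma).sum()
    loglik_full = norm.logpdf(obs, loc=obs, scale=sigma).sum()   # full model: residuals all zero
    return 2.0 * (loglik_full - loglik_model)
```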

Forms of Bias
- Proportional bias: slope of observed vs. predicted ≠ 1
- Systematic bias: intercept ≠ 0
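One common way to quantify both forms (a sketch, not a prescription from the lecture) is to regress observed values on predicted values and compare the slope to 1 and the intercept to 0:

```python
import numpy as np
from scipy.stats import linregress

def bias_check(obs, pred):
    """Regress observed on predicted: slope far from 1 suggests proportional bias,
    intercept far from 0 suggests systematic bias."""
    fit = linregress(np.asarray(pred, float), np.asarray(obs, float))
    return fit.slope, fit.intercept
```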

“Learn from your mistakes” (examine your residuals...)
- Residual = observed - predicted
- Basic questions to ask of your residuals:
  - Do they fit the PDF?
  - Are they correlated with factors that aren't in the model (but maybe should be)?
  - Do some subsets of your data fit better than others?

Using Residuals to Calculate Prediction Error
- RMSE (root mean squared error), essentially the standard deviation of the residuals:

  RMSE = \sqrt{\frac{1}{n}\sum_i (obs_i - exp_i)^2}
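A short sketch pulling the last two slides together (names are placeholders): compute the residuals, summarize prediction error as RMSE, and then look for structure in the residuals.

```python
import numpy as np

def rmse(obs, pred):
    """Root mean squared error: essentially the standard deviation of the residuals."""
    resid = np.asarray(obs, float) - np.asarray(pred, float)
    return np.sqrt(np.mean(resid ** 2))

# Typical follow-up checks on resid = obs - pred:
#   - compare the residual distribution to the assumed PDF (e.g. with a Q-Q plot)
#   - plot residuals against variables NOT in the model (e.g. lake depth, below)
#   - compare RMSE across subsets of the data (e.g. shallow vs. deep lakes)
```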

Predicting lake chemistry from spatially explicit watershed data
- At steady state, lake concentration is given by a mass balance in which concentration, lake volume, and flushing rate are observed, and input and in-lake decay are estimated.

Predicting iron concentrations in Adirondack lakes
Results from a spatially explicit, mass-balance model of the effects of watershed composition on lake chemistry. Source: Maranger et al. (2006).

Should we incorporate lake depth?
- Shallow lakes are more unpredictable than deeper lakes.
- The model consistently underestimates Fe concentrations in deeper lakes.

Adding lake depth improves the model...
- R² went from 56% to 65%.
- It is just as important that it made sense to add depth...

But shallow lakes are still a problem...

Summary – Model Evaluation
- There are no silver bullets...
- The issues are even muddier for categorical data...
- An increase in goodness of fit does not necessarily result in an increase in knowledge...
  - Increasing goodness of fit reduces uncertainty in the predictions of the models, but this costs money (more and better data). How much are you willing to spend?
  - The "signal to noise" issue: if you can see the signal through the noise, how far are you willing to go to reduce the noise?