How Good is a Model? How much information does AIC give us?

How much information does AIC give us?
– Model 1: 3124
– Model 2: 2932
– Model 3: 2968
– Model 4: 3204
– Model 5: 5436
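One way to read the five values above, assuming they are AIC scores for models fitted to the same data, is to look at differences from the best model rather than at the raw numbers. The sketch below computes those differences and the corresponding Akaike weights. The weights are a standard supplement and are not taken from the slides.

```python
import math

# AIC values as listed on the slide (lower is better).
aic = {"Model 1": 3124, "Model 2": 2932, "Model 3": 2968,
       "Model 4": 3204, "Model 5": 5436}

best = min(aic, key=aic.get)

# Difference from the best model; only these differences carry information.
delta = {name: value - aic[best] for name, value in aic.items()}

# Akaike weights: relative likelihood of each model, normalized to sum to 1.
# Very large differences underflow to 0.0, which is the intended result here.
rel = {name: math.exp(-0.5 * d) for name, d in delta.items()}
total = sum(rel.values())
weights = {name: r / total for name, r in rel.items()}

for name in aic:
    print(f"{name}: AIC={aic[name]}  delta={delta[name]}  weight={weights[name]:.3g}")
```

AIC ranks the candidates relative to each other. It does not say whether the best-ranked model is good in any absolute sense, which is what the rest of the deck addresses.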

What do we need?
– What is the purpose of our model?
– Who will use it or its outputs?
– How will we explain the results and how they should be interpreted and used?

How Good is the Model?
– Does it make sense to you and to experts in the topic?
– Do the predictions make sense?
– Does it hold up to validation?
– Is it overly sensitive?
– Is the uncertainty acceptable?

[Diagram: modeling workflow. Legend: Inputs, Outputs, Temp Data, Processes. Field Data (response, coordinates) and Covariates (direct or remotely sensed) are qualified and prepped into Sample Data (response, covariates), with an optional random split into Training Data and Test Data. Build Model produces The Model, which is validated against the test data and used with Predictors (remotely sensed, may be the same data as the covariates) to predict. Outputs: Statistics, Predicted Values, Predictive Map, Uncertainty Maps, summarized at the end. Sources of randomness, repeated over and over: Jackknife, Cross-Validation, Noise Injection, Monte-Carlo (randomize parameters), Sensitivity Testing.]

How Good is a Model?
Can compute:
– Number of parameters
– Likelihood
– Also: AIC, BIC
– Response curves with sample data
– Confidence intervals
– Residual histograms with: min, max, mean, standard deviation
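AIC and BIC both follow from the likelihood and the number of parameters: AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. The sketch below applies these formulas under the assumption of independent normal errors. The residual values and the parameter count are made up for illustration.

```python
import math

def gaussian_log_likelihood(residuals):
    """Maximized log-likelihood of residuals under i.i.d. normal errors."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n  # ML estimate of the error variance
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

def aic(log_lik, k):
    """Akaike information criterion for k estimated parameters."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion for k parameters and n samples."""
    return k * math.log(n) - 2 * log_lik

# Hypothetical residuals from a fitted straight line (made-up numbers).
residuals = [0.3, -0.5, 0.1, 0.4, -0.2, -0.1]
k = 3  # intercept, slope, and error variance
ll = gaussian_log_likelihood(residuals)
print(f"logL={ll:.3f}  AIC={aic(ll, k):.3f}  BIC={bic(ll, k, len(residuals)):.3f}")
```

Both criteria penalize extra parameters. BIC's penalty grows with the sample size, so it tends to favor smaller models than AIC on large data sets.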

Does the Model fit the Data?
– Plots of the model vs. the data
– Histograms of residuals
– Goodness of Fit Tests
– RMSE/RMSD
These methods do not test the model outside the domain of the data.

Residual Statistics
Residual:
– Mean: is it close to 0?
– Min: how much lower than the model might a sample be?
– Max: how much higher than the model might a sample be?
– Standard deviation: what is the “spread of the errors”?
Do these describe the full range of sample values?
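These four residual statistics can be sketched with the standard library alone. The residual is taken here as sample minus model prediction, so a negative minimum shows how far a sample fell below the model. The function name and the data are hypothetical.

```python
import statistics

def residual_summary(observed, predicted):
    """Mean, min, max, and sample standard deviation of the residuals."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    return {
        "mean": statistics.fmean(residuals),
        "min": min(residuals),
        "max": max(residuals),
        "stdev": statistics.stdev(residuals),
    }

# Made-up samples and model predictions at the same locations.
observed = [2.1, 3.9, 6.2, 7.8, 10.1]
predicted = [2.0, 4.0, 6.0, 8.0, 10.0]
summary = residual_summary(observed, predicted)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```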

Root Mean Squared Error
Also known as Root Mean Squared Deviation (RMSD)

$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}}$$

– $\hat{y}_i$ = prediction at $x_i$
– $y_i$ = data sample at $x_i$
– $n$ = number of samples
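As a minimal sketch, the RMSE definition above translates directly into a few lines of Python. The function name and the example numbers are illustrative only.

```python
import math

def rmse(predicted, observed):
    """Root mean squared error between predictions and data samples."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Errors of 1, 0, and -2 give squared errors of 1, 0, and 4.
print(rmse([2.0, 4.0, 6.0], [1.0, 4.0, 8.0]))
```

Because the errors are squared before averaging, a single large miss raises the RMSE more than several small ones.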

General Approach
– Create the “default” model
– Test the model by:
   – Splitting into test and training data sets
   – Train (fit) the model on the training data
   – Inject error into response and covariates
   – Validate the model against the test data
   – Inject error into coefficients
   – Create maps
   – Collect statistics: AIC, residuals, etc.
– Repeat until statistics stabilize
– Summarize statistics
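The loop described above can be sketched as follows, assuming a straight-line model, synthetic data, and Gaussian noise injected into the training data. All names, the noise level, and the fixed number of repetitions are assumptions. Injecting error into coefficients and creating maps are left out, and a fixed run count stands in for "repeat until statistics stabilize".

```python
import math
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def rmse(a, b, xs, ys):
    """RMSE of the line a + b*x against the samples (xs, ys)."""
    return math.sqrt(sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

def one_run(data, rng, test_fraction=0.3, noise_sd=0.5):
    # Split into test and training sets at random.
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    # Inject error into the covariate and the response of the training data.
    xs = [x + rng.gauss(0, noise_sd) for x, _ in train]
    ys = [y + rng.gauss(0, noise_sd) for _, y in train]
    a, b = fit_line(xs, ys)
    # Validate against the untouched test data.
    return rmse(a, b, [x for x, _ in test], [y for _, y in test])

def monte_carlo(data, runs=200, seed=0):
    """Repeat the split/fit/validate cycle and summarize the test RMSE."""
    rng = random.Random(seed)
    scores = [one_run(data, rng) for _ in range(runs)]
    return statistics.fmean(scores), statistics.stdev(scores)

# Synthetic data for illustration: y = 1 + 2x plus noise.
data_rng = random.Random(42)
data = [(x, 1 + 2 * x + data_rng.gauss(0, 1)) for x in range(30)]
mean_rmse, sd_rmse = monte_carlo(data)
print(f"test RMSE over 200 runs: mean={mean_rmse:.3f}, sd={sd_rmse:.3f}")
```

The spread of the test RMSE across runs gives a rough sense of how sensitive the model is to the particular split and to noise in the inputs.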
