Week 2, Video 4: Metrics for Regressors


Metrics for Regressors
- Linear correlation
- MAE/RMSE
- Information criteria

Linear correlation (Pearson’s correlation)
- r(A,B) = cov(A,B) / (sd(A) · sd(B))
- When A’s value changes, does B change in the same direction?
- Assumes a linear relationship
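As a concrete sketch, that definition can be written out in a few lines of Python (the `pearson_r` helper name and data are illustrative, not from the lecture):

```python
import math

def pearson_r(a, b):
    """Pearson's correlation: cov(A, B) / (sd(A) * sd(B))."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Covariance and standard deviations share the same (x - mean) terms,
    # so the 1/n factors cancel and can be omitted throughout
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (sd_a * sd_b)

# Sanity checks: a perfectly linear positive relationship gives r = 1,
# a perfectly linear negative relationship gives r = -1
r_pos = pearson_r([1, 2, 3], [2, 4, 6])
r_neg = pearson_r([1, 2, 3], [3, 2, 1])
```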

What is a “good correlation”?
- 1.0 – perfect
- 0.0 – none
- -1.0 – perfectly negatively correlated
- In between – depends on the field

What is a “good correlation”?
- 1.0 – perfect
- 0.0 – none
- -1.0 – perfectly negatively correlated
- In between – depends on the field
- In physics – a correlation of 0.8 is weak!
- In education – a correlation of 0.3 is good

Why are small correlations OK in education?
- Lots and lots of factors contribute to just about any dependent measure

Examples of correlation values
[Figure: examples of correlation values, from Denis Boigelot, available on Wikipedia]

Same correlation, different functions
[Figure: same correlation, different functions, from John Behrens, Pearson]

r²
- The correlation, squared
- Also a measure of what percentage of variance in the dependent measure is explained by a model (if you are predicting A with B, C, D, E)
- r² is often used as the measure of model goodness rather than r (depends on the community)
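The “variance explained” reading of r² can be checked numerically: for a simple least-squares line with an intercept, the squared correlation equals 1 − SSE/SST. A small Python sketch with made-up data:

```python
# Hypothetical data for illustration
x = [1, 2, 3, 4]
y = [2, 3, 5, 6]
n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Pearson's r, squared, from the sums of squares and cross-products
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)
r2 = sxy ** 2 / (sxx * syy)

# Variance explained by the least-squares line: 1 - SSE/SST
slope = sxy / sxx
intercept = my - slope * mx
sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
r2_from_variance = 1 - sse / syy
# The two quantities agree (0.98 for this data)
```

Note that this equality is specific to least-squares regression with an intercept; for other model families, “r²” reported as 1 − SSE/SST and the squared correlation can differ.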

RMSE/MAE

Mean Absolute Error (MAE)
- The average of the absolute value of (actual value minus predicted value)

Root Mean Squared Error (RMSE)
- The square root of the average of (actual value minus predicted value)²

MAE vs. RMSE
- MAE tells you the average amount by which the predictions deviate from the actual values
- Very interpretable
- RMSE can be interpreted the same way (mostly) but penalizes large deviations more than small deviations
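That penalty difference is easy to demonstrate: two sets of predictions with the same MAE can have different RMSEs. A short Python sketch (data invented for illustration):

```python
import math

def mae(actual, pred):
    # Average of the absolute errors
    return sum(abs(a - p) for a, p in zip(actual, pred)) / len(actual)

def rmse(actual, pred):
    # Square root of the average squared error
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual))

actual = [0, 0, 0, 0]
steady = [0.2, 0.2, 0.2, 0.2]   # four equal small errors
spiky  = [0.4, 0.0, 0.4, 0.0]   # two larger errors, same total error

# Both prediction sets have MAE = 0.2, but the spiky one has the
# higher RMSE (sqrt(0.08) ~= 0.283 vs. 0.2), because squaring
# weights the large deviations more heavily
```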

Example

Actual  Pred
1       0.5
0.5     0.75
0.2     0.4
0.1     0.8
0       0.4

Example (MAE)

Actual  Pred   AE
1       0.5    abs(1-0.5)
0.5     0.75   abs(0.75-0.5)
0.2     0.4    abs(0.4-0.2)
0.1     0.8    abs(0.8-0.1)
0       0.4    abs(0.4-0)

Example (MAE)

Actual  Pred   AE
1       0.5    abs(1-0.5) = 0.5
0.5     0.75   abs(0.75-0.5) = 0.25
0.2     0.4    abs(0.4-0.2) = 0.2
0.1     0.8    abs(0.8-0.1) = 0.7
0       0.4    abs(0.4-0) = 0.4

Example (MAE)

Actual  Pred   AE
1       0.5    abs(1-0.5) = 0.5
0.5     0.75   abs(0.75-0.5) = 0.25
0.2     0.4    abs(0.4-0.2) = 0.2
0.1     0.8    abs(0.8-0.1) = 0.7
0       0.4    abs(0.4-0) = 0.4

MAE = avg(0.5, 0.25, 0.2, 0.7, 0.4) = 0.41

Example (RMSE)

Actual  Pred   SE
1       0.5    (1-0.5)²
0.5     0.75   (0.75-0.5)²
0.2     0.4    (0.4-0.2)²
0.1     0.8    (0.8-0.1)²
0       0.4    (0.4-0)²

Example (RMSE)

Actual  Pred   SE
1       0.5    0.25
0.5     0.75   0.0625
0.2     0.4    0.04
0.1     0.8    0.49
0       0.4    0.16

Example (RMSE)

Actual  Pred   SE
1       0.5    0.25
0.5     0.75   0.0625
0.2     0.4    0.04
0.1     0.8    0.49
0       0.4    0.16

MSE = avg(0.25, 0.0625, 0.04, 0.49, 0.16)

Example (RMSE)

Actual  Pred   SE
1       0.5    0.25
0.5     0.75   0.0625
0.2     0.4    0.04
0.1     0.8    0.49
0       0.4    0.16

MSE = 0.2005

Example (RMSE)

Actual  Pred   SE
1       0.5    0.25
0.5     0.75   0.0625
0.2     0.4    0.04
0.1     0.8    0.49
0       0.4    0.16

RMSE = sqrt(0.2005) = 0.448
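The worked example above can be reproduced in a few lines of Python:

```python
import math

actual = [1, 0.5, 0.2, 0.1, 0]
pred = [0.5, 0.75, 0.4, 0.8, 0.4]

# MAE: average of the absolute errors
mae = sum(abs(a - p) for a, p in zip(actual, pred)) / len(actual)

# RMSE: square root of the average squared error
mse = sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual)
rmse = math.sqrt(mse)

print(round(mae, 2))    # 0.41
print(round(mse, 4))    # 0.2005
print(round(rmse, 3))   # 0.448
```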

Note
- Low RMSE/MAE is good
- High correlation is good

What does it mean?
- Low RMSE/MAE, high correlation = good model
- High RMSE/MAE, low correlation = bad model

What does it mean?
- High RMSE/MAE, high correlation = the model goes in the right direction, but is systematically biased
- E.g. a model that says that adults are taller than children, but that adults are 8 feet tall and children are 6 feet tall

What does it mean?
- Low RMSE/MAE, low correlation = the model’s values are in the right range, but the model doesn’t capture relative change
- Particularly common if there’s not much variation in the data

Information Criteria

BIC
- Bayesian Information Criterion (Raftery, 1995)
- Makes a trade-off between goodness of fit and flexibility of fit (number of parameters)
- Formula for linear regression: BIC′ = n log(1 − r²) + p log(n), where n is the number of students and p is the number of variables
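As a sketch, that formula translates directly to Python (the `bic_prime` helper name is illustrative):

```python
import math

def bic_prime(n, p, r2):
    """BIC' = n log(1 - r^2) + p log(n) (Raftery, 1995).

    n: number of data points (students), p: number of variables,
    r2: the regression model's r-squared."""
    return n * math.log(1 - r2) + p * math.log(n)

# A model explaining no variance with no parameters scores exactly 0
print(bic_prime(100, 0, 0.0))  # -> 0.0
# With parameters held fixed, a better fit (higher r^2) lowers BIC'
print(bic_prime(100, 2, 0.5) < bic_prime(100, 2, 0.1))  # -> True
```

Lower (more negative) BIC′ is better: the first term rewards fit, and the p log(n) term charges for each extra variable.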

BIC′
- Values over 0: worse than expected given the number of variables
- Values under 0: better than expected given the number of variables
- Can be used to understand the significance of the difference between models (Raftery, 1995)

BIC
- Said to be statistically equivalent to k-fold cross-validation for optimal k
- The derivation is… somewhat complex
- BIC is easier to compute than cross-validation, but different formulas must be used for different modeling frameworks
- No BIC formula is available for many modeling frameworks

AIC
- An alternative to BIC
- Originally stands for An Information Criterion (Akaike, 1971), but commonly read as Akaike’s Information Criterion (Akaike, 1974)
- Makes a slightly different trade-off between goodness of fit and flexibility of fit (number of parameters)

AIC
- Said to be statistically equivalent to leave-one-out cross-validation (LOOCV)

AIC or BIC: Which one should you use?
- <shrug>

All the metrics: Which one should you use?
- “The idea of looking for a single best measure to choose between classifiers is wrongheaded.” – Powers (2012)

Next Lecture
- Cross-validation and over-fitting