Model diagnostics
Tim Paine, modified from Zarah Pattison’s slides

When conducting any statistical analysis it is important to evaluate how well the model fits the data and whether the data meet the assumptions of the model. For linear models, the residuals are assumed to be independently drawn from a normal distribution with a mean of 0 and constant variance. Residuals: the vertical distance of the data points from the fitted regression line.
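As a minimal sketch (with simulated data invented here, not the data from these slides), the residuals of a linear model can be extracted in R with resid():

# Toy data (hypothetical), reused in the sketches that follow
set.seed(1)
dat <- data.frame(x = rnorm(100))
dat$y <- 2 + 3 * dat$x + rnorm(100)

fit <- lm(y ~ x, data = dat)   # simple linear regression
head(resid(fit))               # residuals: observed y minus fitted y
mean(resid(fit))               # ~0 by construction of least squares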

Constancy of variance
A residual plot is a graph that shows the residuals on the vertical axis and the fitted values of the response on the horizontal axis. If the points in a residual plot are randomly dispersed around the horizontal axis, a linear regression model is appropriate for the data.
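Continuing the toy fit above, a residual plot takes two lines of base R (a sketch, not code from the original slides):

plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)   # points should scatter randomly around this line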

QQ plots
A test for normality: the ranked samples from our distribution are plotted against a similar number of ranked quantiles taken from a normal distribution. If the residuals are normally distributed, the points fall close to a straight line. (In the example figure, plots 1, 6 and 9 look good.) I find them of little use.
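For the toy fit above, a QQ plot of the residuals is (again, only a sketch):

qqnorm(resid(fit))   # sample quantiles vs theoretical normal quantiles
qqline(resid(fit))   # reference line: points close to it suggest normality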

Constancy of variance
Constant variance means that when you plot the residuals against the predicted values, the spread of the residuals is the same across the whole range of predicted values. (In the example figure, the red lines are all the same length.) Synonym: homoscedasticity. Antonym: heteroscedasticity.
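Base R's built-in lm diagnostics include a scale-location plot, which makes non-constant variance easier to spot (a sketch using the toy fit above):

plot(fit, which = 3)   # sqrt(|standardised residuals|) vs fitted values
# A roughly flat trend suggests homoscedasticity;
# a funnel or rising trend suggests heteroscedasticity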

Ideal examples: residuals versus fitted value plots

Residuals appear to exhibit homogeneity, normality, and independence. However, the variation in residuals associated with the predictor variable Month suggests a problem with heterogeneity.
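A sketch of that kind of check, assuming (purely for illustration) that each observation in the toy data also has a month:

dat$month <- factor(sample(month.abb, nrow(dat), replace = TRUE), levels = month.abb)
boxplot(resid(fit) ~ dat$month, xlab = "Month", ylab = "Residuals")
# Similar spread across months suggests homogeneity;
# clearly unequal spread suggests heterogeneity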

# Mixed-effects model with a log-transformed response
library(lme4)
Sop <- lmer(log(subnatcov + 1) ~ loi + P + ss + channel.slope + domnatcov +
              iapcov * avmoisture + iapcov * cov + (1 | river) + (1 | trans),
            data = finalscale, REML = FALSE)

Try a square root transformation

# Refit with a square-root-transformed response
Sop <- lmer(sqrt(subnatcov + 1) ~ loi + P + ss + channel.slope + domnatcov +
              iapcov * avmoisture + iapcov * cov + (1 | river) + (1 | trans),
            data = finalscale, REML = FALSE)
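After refitting, the same diagnostics can be re-run on the mixed model. Sop and finalscale are the objects defined above; the calls themselves are standard lme4/base R, shown only as a sketch:

plot(Sop)                               # residuals vs fitted values (merMod plot method)
qqnorm(resid(Sop)); qqline(resid(Sop))  # check normality of the residuals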