Applied Quantitative Analysis and Practices LECTURE#31 By Dr. Osman Sadiq Paracha.



Previous Lecture Summary
Outliers and residuals.
Example of model analysis for multiple regression.

Outliers and Residuals
The normal, or unstandardized, residuals are measured in the same units as the outcome variable and so are difficult to interpret across different models. Because of this, we cannot define a universal cut-off point for what constitutes a large residual. Instead we use standardized residuals, which are the residuals divided by an estimate of their standard deviation.

Outliers and Residuals
Standardized residuals behave like z-scores: in a normally distributed sample, about 95% of them should lie within ±1.96, 99% within ±2.58 and 99.9% within ±3.29. Some general rules for standardized residuals are derived from these facts:
(1) Standardized residuals with an absolute value greater than 3.29 (we can use 3 as an approximation) are cause for concern, because in an average sample a value this high is unlikely to occur by chance.
(2) If more than 1% of our sample cases have standardized residuals with an absolute value greater than 2.58 (we usually just say 2.5), there is evidence that the level of error within our model is unacceptable (the model is a fairly poor fit of the sample data).

Outliers and Residuals
(3) If more than 5% of cases have standardized residuals with an absolute value greater than 1.96 (we can use 2 for convenience), then there is also evidence that the model is a poor representation of the actual data.
A related measure is the Studentized residual, which is the unstandardized residual divided by an estimate of its standard deviation that varies point by point. These residuals have the same properties as the standardized residuals but usually provide a more precise estimate of the error variance of a specific case.
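The three rules of thumb and the two kinds of scaled residual can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names, the synthetic data and the planted outlier are all hypothetical, the standardized residual is taken as e / s, and the Studentized residual as e / (s * sqrt(1 - h)), where h is the leverage of the case.

```python
import numpy as np

def residual_diagnostics(X, y):
    """Return raw, standardized and Studentized residuals of an OLS fit.

    X holds the predictors without an intercept column (shape (n,) or (n, k)).
    """
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])           # add the intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    resid = y - A @ beta                           # unstandardized residuals
    p = A.shape[1]                                 # number of estimated coefficients
    s = np.sqrt(resid @ resid / (n - p))           # residual standard deviation
    Q, _ = np.linalg.qr(A)                         # leverage = diagonal of the hat matrix
    h = np.sum(Q**2, axis=1)
    standardized = resid / s                       # one common scale for every case
    studentized = resid / (s * np.sqrt(1.0 - h))   # scale varies point by point
    return resid, standardized, studentized

def rule_check(z):
    """Summarise the three rules of thumb for standardized residuals."""
    a = np.abs(z)
    return {
        "any_above_3.29": bool(np.any(a > 3.29)),     # rule 1: any such case is a concern
        "share_above_2.58": float(np.mean(a > 2.58)), # rule 2: should not exceed 0.01
        "share_above_1.96": float(np.mean(a > 1.96)), # rule 3: should not exceed 0.05
    }

# Hypothetical data: 100 cases, two predictors, one planted outlier (case 0).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(scale=0.5, size=100)
y[0] += 10.0

resid, standardized, studentized = residual_diagnostics(X, y)
rules = rule_check(standardized)
print(rules)
```

The shares returned for rules 2 and 3 are proportions, so they are compared with 0.01 and 0.05 respectively.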

Influential Cases
There are several residual statistics that can be used to assess the influence of a particular case. One is the adjusted predicted value: the predicted value for a case when that case is excluded from the analysis. The computer calculates a new model without the case and then uses this new model to predict the value of the outcome variable for the case that was excluded. If a case does not exert a large influence over the model, then we would expect the adjusted predicted value to be very similar to the predicted value when the case is included.

Influential Cases
The difference between the adjusted predicted value and the original predicted value is known as DFFit. We can also look at the residual based on the adjusted predicted value, that is, the difference between the adjusted predicted value and the original observed value. This is the deleted residual. The deleted residual can be divided by its standard error to give a standardized value known as the Studentized deleted residual. The deleted residuals are very useful for assessing the influence of a case on the ability of the model to predict that case.
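The adjusted predicted value, DFFit and the deleted residual can be computed literally as described, by refitting the model once per case. The sketch below assumes ordinary least squares and uses hypothetical names and data. The sign conventions (DFFit = adjusted minus original prediction, deleted residual = observed minus adjusted prediction) are assumptions, and statistical packages may use the opposite signs.

```python
import numpy as np

def fit_ols(A, y):
    """Least-squares coefficients for design matrix A (intercept included)."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def deletion_diagnostics(X, y):
    """Refit the model without each case in turn.

    Returns the original predictions, adjusted predictions, DFFit and
    deleted residuals, one value per case.
    """
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])
    predicted = A @ fit_ols(A, y)            # predictions with every case included
    adjusted = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i             # drop case i
        beta_i = fit_ols(A[keep], y[keep])   # model estimated without case i
        adjusted[i] = A[i] @ beta_i          # predict the excluded case
    dffit = adjusted - predicted             # assumed sign convention
    deleted_resid = y - adjusted             # assumed sign convention
    return predicted, adjusted, dffit, deleted_resid

# Hypothetical data: an exact line y = 2 + 3x, with the last case shifted up by 20.
x = np.arange(10.0)
y = 2.0 + 3.0 * x
y[9] += 20.0

predicted, adjusted, dffit, deleted_resid = deletion_diagnostics(x, y)
most_influential = int(np.argmax(np.abs(dffit)))
print(most_influential, deleted_resid[most_influential])
```

Refitting n times is slow for large samples. For OLS the deleted residual also equals e / (1 - h), where e is the ordinary residual and h the leverage, so the loop can be avoided.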

Influential Cases
One statistic that does consider the effect of a single case on the model as a whole is Cook’s distance. Cook’s distance is a measure of the overall influence of a case on the model, and Cook and Weisberg (1982) have suggested that values greater than 1 may be cause for concern.
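For an OLS model, Cook's distance has a closed form that needs only the residuals and the leverages: D = e² / (p · s²) · h / (1 − h)², where p is the number of estimated coefficients and s² the residual variance. A minimal sketch, with a hypothetical function name and made-up data, and with the cut-off of 1 taken from the rule quoted above:

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for every case of an OLS fit (closed-form version)."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])           # add the intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    p = A.shape[1]                                 # number of estimated coefficients
    s2 = resid @ resid / (n - p)                   # residual variance
    Q, _ = np.linalg.qr(A)
    h = np.sum(Q**2, axis=1)                       # leverage of each case
    return resid**2 / (p * s2) * h / (1.0 - h) ** 2

# Hypothetical data: an exact line y = 2 + 3x, with the last case shifted up by 20.
x = np.arange(10.0)
y = 2.0 + 3.0 * x
y[9] += 20.0

cooks = cooks_distance(x, y)
flagged = np.flatnonzero(cooks > 1.0)              # cases above the suggested cut-off
print(flagged)
```

The same values result from refitting the model without each case and summing the squared changes in all fitted values, divided by p · s². The closed form simply avoids the refits.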

Mediation
Mediation refers to a situation in which the relationship between a predictor variable and an outcome variable can be explained by their relationship to a third variable (the mediator).

The Statistical Model

Baron & Kenny (1986)
Mediation is tested through three regression models:
1. Predicting the outcome from the predictor variable.
2. Predicting the mediator from the predictor variable.
3. Predicting the outcome from both the predictor variable and the mediator.

Baron & Kenny (1986)
Four conditions of mediation:
1. The predictor must significantly predict the outcome variable (model 1).
2. The predictor must significantly predict the mediator (model 2).
3. The mediator must significantly predict the outcome variable (model 3).
4. The predictor variable must predict the outcome variable less strongly in model 3 than in model 1.
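The three models and four conditions can be sketched with plain NumPy. Everything in this example is an assumption made for illustration: the data are simulated with a built-in mediated effect, the helper `ols` is hypothetical, and "significant" is approximated by |t| > 1.96, which is reasonable only for large samples.

```python
import numpy as np

def ols(predictors, y):
    """Fit y on the given predictors; return coefficients and t statistics.

    The intercept is added automatically and comes first in both arrays.
    """
    n = len(y)
    A = np.column_stack([np.ones(n)] + list(predictors))
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    s2 = resid @ resid / (n - A.shape[1])               # residual variance
    se = np.sqrt(s2 * np.diag(np.linalg.inv(A.T @ A)))  # coefficient standard errors
    return beta, beta / se

# Simulated (hypothetical) data in which x acts on y mostly through m.
rng = np.random.default_rng(42)
n = 500
x = rng.normal(size=n)                                  # predictor
m = 0.8 * x + rng.normal(size=n)                        # mediator
y = 0.7 * m + 0.1 * x + rng.normal(size=n)              # outcome

b1, t1 = ols([x], y)        # model 1: outcome from predictor
b2, t2 = ols([x], m)        # model 2: mediator from predictor
b3, t3 = ols([x, m], y)     # model 3: outcome from predictor and mediator

T_CRIT = 1.96               # large-sample approximation to the 5% critical value
conditions = {
    "1_predictor_predicts_outcome": bool(abs(t1[1]) > T_CRIT),
    "2_predictor_predicts_mediator": bool(abs(t2[1]) > T_CRIT),
    "3_mediator_predicts_outcome": bool(abs(t3[2]) > T_CRIT),
    "4_predictor_weaker_in_model_3": bool(abs(b3[1]) < abs(b1[1])),
}
print(conditions)
```

Condition 4 compares the size of the predictor's coefficient in models 1 and 3 rather than its significance, which sidesteps the change-in-significance habit, though it still leaves open how large a reduction is needed.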

Limitations of Baron & Kenny’s (1986) Approach
How much of a reduction in the relationship between the predictor and outcome is necessary to infer mediation? People tend to look for a change in significance, which can lead to the ‘all or nothing’ thinking that p-values encourage.

Lecture Summary
Mediation through multiple regression.
Example in SPSS.