Basics of Regression Analysis. Determination of three performance measures: estimation of the effect of each factor, explanation of the variability, and forecasting.


Basics of Regression Analysis

Determination of three performance measures:
– Estimation of the effect of each factor
– Explanation of the variability
– Forecasting error

Two Predictor Variables
Population regression model: Y = β0 + β1 X1 + β2 X2 + e, with e following N(0, σ)
Unknown parameters: β0, β1, β2; σ

From Data to Estimates of Coefficients
Principle: least squares
Mathematics: normal equation system
Computing algorithm: solve the system for the estimates of the coefficients

Least Squares Method: minimize the sum of squared residuals, in simple regression and in multiple regression.

Matrix Computation for b
Normal equation system: (XᵀX) b = XᵀY (see Text Appendix D.3)
Solution for b: b = (XᵀX)⁻¹ XᵀY
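The matrix computation above can be sketched in NumPy. The data below are simulated with hypothetical coefficient values (intercept 2, slopes 1.5 and −0.5), chosen only for illustration; in practice one solves the normal equation system directly rather than forming the explicit inverse.

```python
import numpy as np

# Hypothetical simulated data for the two-predictor model:
# Y = 2 + 1.5*X1 - 0.5*X2 + e, with e ~ N(0, 0.1)
rng = np.random.default_rng(0)
n = 100
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 2 + 1.5 * X1 - 0.5 * X2 + rng.normal(scale=0.1, size=n)

# Design matrix with an intercept column of ones
X = np.column_stack([np.ones(n), X1, X2])

# Normal equation system (X'X) b = X'Y; solving it directly is
# numerically preferable to computing (X'X)^-1 explicitly.
b = np.linalg.solve(X.T @ X, X.T @ Y)
print(b)  # approximately [2, 1.5, -0.5]
```

With little noise, the least-squares estimates land close to the coefficients used to generate the data.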

Standardized Regression Coefficients
Definition (the "beta" coefficient): b*0 = 0; b*k = bk · (sXk / sY) for k = 1, 2
Used to show the relative weights of the predictors.

Estimation of σ
se: standard deviation of the disturbance e in the forecasting equation
SS of residuals: SSE = Σi=1..n (Yi − Ŷi)²
Mean SS: MSE = se² = SSE / (n − 3)

Standard Error of Coefficients
The variance matrix of b ((K+1) × (K+1)) is Var(b) = σ² (XᵀX)⁻¹; the standard errors of the coefficients are the square roots of its diagonal elements, with σ² estimated by MSE.

The Variability Explained
First, determine the base variability available for the regression to explain.
Unconditional mean model: Y = μY + e, with e following N(0, σY)
LS fit of the model: Ŷ = Ȳ
SS of residuals: SST = Σ (Yi − Ȳ)², DF = n − 1; mean SS: SST / (n − 1)

The Variability Explained – cont.
Second, by subtraction, find the variability still left unexplained.
In SS: SSR = SST − SSE
In variance: compare SST / (n − 1) with MSE = SSE / (n − 3)

Creating the ANOVA Table

Regression model   Unexplained variability (SS)   DF      Unexplained variability (variance, MSE)
Unconditional      SST                            n − 1   SST / (n − 1)
Conditional        SSE                            n − 3   SSE / (n − 3)

Variability explained: SSR = SST − SSE, DF = 2
Proportion explained: R² = SSR / SST
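The ANOVA quantities above can be computed in a few lines. This sketch reuses the same hypothetical simulated two-predictor model (the coefficient and noise values are illustrative assumptions, not from the slides):

```python
import numpy as np

# Hypothetical simulated data for a two-predictor model
rng = np.random.default_rng(0)
n = 100
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 2 + 1.5 * X1 - 0.5 * X2 + rng.normal(scale=0.5, size=n)
X = np.column_stack([np.ones(n), X1, X2])
b = np.linalg.solve(X.T @ X, X.T @ Y)
Y_hat = X @ b

# ANOVA decomposition
SST = np.sum((Y - Y.mean()) ** 2)   # unconditional SS, DF = n - 1
SSE = np.sum((Y - Y_hat) ** 2)      # conditional SS,   DF = n - 3
SSR = SST - SSE                     # explained SS,     DF = 2
MSE = SSE / (n - 3)                 # estimate of sigma^2
R2 = SSR / SST                      # proportion explained
```

The identity SST = SSE + SSR holds exactly for the least-squares fit, which makes it a useful sanity check on the computation.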

Tests of Significance
– F-test of significance
– t-test of significance
  – Two-sided alternative
  – One-sided alternative

F-Test of Significance (of the variability explained by the regression)
H0: β1 = β2 = 0
Ha: at least one coefficient is not 0
P-value of F-stat = P{F(2, n−3) > F-stat}

t-Test of Significance (of a variable, X1) – two-sided
H0: β1 = 0
Ha: β1 ≠ 0
P-value of t-stat = 2 · P{t(n−3) > |t-stat|}

One-Sided Test of Significance (of a variable, X1)
H0: β1 = 0
Ha: β1 > 0 (using prior knowledge)
P-value of t-stat = P{t(n−3) > t-stat}
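The test statistics behind the t- and F-tests above can be sketched as follows. The data are again hypothetical simulated values; the code stays NumPy-only and computes only the statistics, since the p-values would then come from the t(n−3) and F(2, n−3) distributions (e.g. via scipy.stats):

```python
import numpy as np

# Hypothetical simulated data for a two-predictor model
rng = np.random.default_rng(0)
n = 100
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 2 + 1.5 * X1 - 0.5 * X2 + rng.normal(scale=0.5, size=n)
X = np.column_stack([np.ones(n), X1, X2])

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
resid = Y - X @ b
MSE = resid @ resid / (n - 3)

# Var(b) = sigma^2 (X'X)^-1, with sigma^2 estimated by MSE;
# standard errors are the square roots of the diagonal.
se_b = np.sqrt(MSE * np.diag(XtX_inv))
t_stats = b / se_b  # compare |t| with t(n-3) for the two-sided test

# F statistic for H0: beta1 = beta2 = 0
SST = np.sum((Y - Y.mean()) ** 2)
SSE = resid @ resid
F_stat = ((SST - SSE) / 2) / MSE  # compare with F(2, n-3)
```

Note the consistency check: in simple regression the square of the slope's t-statistic equals the F-statistic; in multiple regression the F-test pools the evidence across all slopes.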

Forecasting
– Point forecasting
– Sources of forecasting error
– Interval forecasting

Forecasting at xm
Data of X used for the regression vs. the value of X used for prediction.

Sources of Forecasting Error
Data: Y|xm = β0 + β1 x1m + β2 x2m + em
Forecast: Ŷm = b0 + b1 x1m + b2 x2m
Forecast error: Y|xm − Ŷm = (β0 − b0) + (β1 − b1) x1m + (β2 − b2) x2m + em (coefficient estimation error plus the disturbance)

Computing Standard Errors
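The slide's formulas did not survive extraction, so here is a sketch of the two standard textbook standard errors at a forecast point xm: one for the mean response (estimation error only) and one for a new observation (which adds the disturbance em). The data and the forecast point are hypothetical:

```python
import numpy as np

# Hypothetical simulated data for a two-predictor model
rng = np.random.default_rng(0)
n = 100
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 2 + 1.5 * X1 - 0.5 * X2 + rng.normal(scale=0.5, size=n)
X = np.column_stack([np.ones(n), X1, X2])
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
resid = Y - X @ b
s_e = np.sqrt(resid @ resid / (n - 3))

# Forecast at x_m = (1, x1m, x2m); the point itself is hypothetical
x_m = np.array([1.0, 0.5, -1.0])
y_hat_m = x_m @ b

# Standard error of the mean response and of a new observation
h_m = x_m @ XtX_inv @ x_m
se_mean = s_e * np.sqrt(h_m)          # coefficient estimation error only
se_forecast = s_e * np.sqrt(1 + h_m)  # adds the variance of e_m
```

An interval forecast is then Ŷm ± t(n−3) critical value × se_forecast.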

Forecasting Performance Analysis
R²pred = 1 − PRESS / SST
PRESS = SS of {yi − ŷi(i)}, where ŷi(i) is the fit with observation i deleted (the deleted residual)
Sample splitting:
– Analysis sample (n1)
– Validation sample (n2)
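PRESS looks expensive (n refits), but the standard hat-matrix identity yi − ŷi(i) = ei / (1 − hii) lets it be computed from a single fit. A sketch on the same hypothetical simulated data:

```python
import numpy as np

# Hypothetical simulated data for a two-predictor model
rng = np.random.default_rng(0)
n = 100
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 2 + 1.5 * X1 - 0.5 * X2 + rng.normal(scale=0.5, size=n)
X = np.column_stack([np.ones(n), X1, X2])

H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix
resid = Y - H @ Y                      # ordinary residuals e_i
h = np.diag(H)                         # leverages h_ii

# Deleted-residual identity: y_i - yhat_(i) = e_i / (1 - h_ii),
# so PRESS needs no refitting.
press = np.sum((resid / (1 - h)) ** 2)

SST = np.sum((Y - Y.mean()) ** 2)
R2_pred = 1 - press / SST
```

Since each hii lies in (0, 1), PRESS is always at least SSE, so R²pred is always at most R²; a large gap between the two flags a model that fits well but forecasts poorly.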

Generalization to K Independent Variables
Use n − K − 1 in place of n − 3 as the DF for t.
Use K for the numerator DF and n − K − 1 for the denominator DF for F.

Diagnostics
– Assumptions for the disturbance
– Multicollinearity
– Outliers and influential observations

Problematic Data Conditions
Regression coefficients are sensitive to:
– Highly collinear independent variables
– Contamination by outliers and influential observations

Detecting Outliers and Influential Data
Outliers:
– Leverage (X-space): distance from the mean
– Tresid (Y-space): forecasting error
Influential data:
– Idea: with / without comparison
– Cook's D
– DFBETAS
– DFFITS

Modeling Techniques
– Transformation of variables: log, others
– Using dummy variables: symbolic representation; dummy variables for qualitative variables
– Using scores for ordinal variables
– Selection of independent variables: forecasting; computer-intensive methods; analysis of the correlation structure of the independent variables

Dummy Variables
Dk = "if (X = k, 1, 0)"
Can be used for nominal and also ordinal variables.
Number of Dk = c − 1, where c is the number of categories.
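The c − 1 encoding can be sketched with plain NumPy (the variable name and category labels below are hypothetical). One category is omitted as the baseline, which keeps the design matrix full rank when an intercept is included:

```python
import numpy as np

# Hypothetical categorical variable with c = 3 categories
region = np.array(["east", "west", "north", "east", "west", "north"])
categories = ["east", "west", "north"]

# c - 1 = 2 dummy columns; "east" is the omitted baseline category,
# so a row of all zeros means "east".
dummies = np.column_stack(
    [(region == cat).astype(float) for cat in categories[1:]]
)
print(dummies.shape)  # (6, 2)
```

The coefficients on the dummies are then interpreted as shifts relative to the baseline category.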

Using Scores for Ordinal Variables
Scoring systems:
– 1, 2, 3, …, c
– −2, −1, 0, 1, 2 (centered; c odd)

Implications of Variable Selection

Selection of Variables - 1
– Backward elimination: start with all X's and drop variables by t-test until the final regression remains.
– Stepwise (forward) inclusion: best simple regression, then best two-variable regression, and so on, adding at each step the variable with the maximum increase in R².
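The stepwise (forward) inclusion idea can be sketched as a greedy loop. This is a bare illustration on hypothetical simulated data (the function names, the fixed step count, and the absence of a stopping test are all simplifying assumptions; real stepwise procedures stop via a significance or criterion test):

```python
import numpy as np

def r_squared(X_cols, Y):
    """R^2 of the LS fit of Y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(Y))] + X_cols)
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ b
    return 1 - (resid @ resid) / np.sum((Y - Y.mean()) ** 2)

def forward_selection(X_all, Y, k_max):
    """Greedy stepwise inclusion: at each step add the variable
    giving the maximum increase in R^2 (no stopping rule here)."""
    chosen, remaining = [], list(range(X_all.shape[1]))
    for _ in range(k_max):
        best = max(
            remaining,
            key=lambda j: r_squared([X_all[:, i] for i in chosen + [j]], Y),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Hypothetical data: only columns 0 and 2 actually matter
rng = np.random.default_rng(0)
X_all = rng.normal(size=(200, 5))
Y = 3 * X_all[:, 0] - 2 * X_all[:, 2] + rng.normal(size=200)
print(forward_selection(X_all, Y, 2))
```

With roughly uncorrelated predictors the greedy search recovers the two truly relevant columns; with highly collinear predictors it can miss, which is one reason the slides also discuss all-possible-regressions.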

Selection of Variables - 2
All possible regressions with K independent variables: K simple regressions, K(K − 1)/2 two-variable regressions, …, 1 K-variable regression; the final regression is chosen among them.

Selection Criteria
– R²
– Adjusted R²
– R²pred
– se
– Cp

Cp
Select a combination of variables with Cp close to p, where p is the number of coefficients.

What to Look for in a Good Regression?
– Remember the three functions of regression: estimation of the effect of each X, explaining the variability of Y, and forecasting.
– Population regressions are assumptions and need testing.
– Data might be contaminated.

Extensions For Other Variable Types of Y

Types of Variable
– Quantitative: continuous; discrete (counting)
– Qualitative: ordinal; nominal

Generalized Linear Models (GLM)
Regression model: Y = β0 + β1 X1 + β2 X2 + e, with e following N(0, σ)
GLM formulation:
1. Model for Y: Y is N(μ, σ)
2. Model for the predictors (link function): μ = β0 + β1 X1 + β2 X2

Forecasting Counting Data
Model for Y: Poisson distribution (λ)
Link function: log(λ) = β0 + β1 X1 + β2 X2
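A Poisson GLM with log link can be fitted by Newton's method (equivalently, iteratively reweighted least squares); libraries such as statsmodels do this internally. Below is a NumPy-only sketch on hypothetical simulated data with one predictor and assumed coefficients (0.5, 0.8):

```python
import numpy as np

# Hypothetical simulated count data: lambda = exp(0.5 + 0.8*x)
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([0.5, 0.8])
y = rng.poisson(np.exp(X @ beta_true))

# Newton's method for the Poisson log-likelihood:
# score = X'(y - lambda), Hessian = -X' diag(lambda) X
beta = np.zeros(2)
for _ in range(25):
    lam = np.exp(X @ beta)       # mean via the log link
    grad = X.T @ (y - lam)       # score vector
    W = X * lam[:, None]         # so X.T @ W = X' diag(lambda) X
    beta = beta + np.linalg.solve(X.T @ W, grad)
print(beta)  # close to beta_true
```

Because the Poisson log-likelihood with the canonical log link is concave, this iteration is the same Fisher-scoring update GLM software uses, and it typically converges in a handful of steps.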