Simple and multiple regression analysis in matrix form


Simple and multiple regression analysis in matrix form. Outline: least squares; beta estimation; simple linear regression; multiple regression with two predictors; multiple regression with three predictors; sums of squares; R²; tests on the β parameters; covariance matrix of the β; standard errors of the β.

Simple and multiple regression analysis in matrix form. Outline (continued): tests on individual predictors; variance of individual predictors; correlation between predictors; standardized matrices; correlation matrices; sums of squares in Z; R² in Z; R² between independent variables; standard errors of β in Z.

Least squares. Starting from the general model y = Xβ + e, the method of least squares estimates the β parameters by minimizing the sum of squares due to error. In fact, if e = y − Xβ, the quantity to minimize is e′e = (y − Xβ)′(y − Xβ).

Least squares. Setting the derivative of e′e with respect to β to zero gives the normal equations X′Xβ = X′y, from which you can estimate: β̂ = (X′X)⁻¹X′y.
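To make the estimator concrete, here is a minimal NumPy sketch (not from the original slides) on simulated data; the sample size and true coefficients are arbitrary assumptions:

```python
import numpy as np

# Made-up data: n observations, an intercept column plus one predictor.
rng = np.random.default_rng(0)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = 1.0 + 2.0 * X[:, 1] + rng.normal(size=n)   # true beta = (1, 2)

# Least squares estimate: beta_hat = (X'X)^(-1) X'y
beta_hat = np.linalg.inv(X.T @ X) @ (X.T @ y)
print(beta_hat)   # approximately [1, 2]

# np.linalg.solve is the numerically safer equivalent of the explicit inverse.
assert np.allclose(beta_hat, np.linalg.solve(X.T @ X, X.T @ y))
```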

Simple linear regression

yᵢ = β₀ + β₁xᵢ + eᵢ, where β₀ is the intercept and β₁ the slope.

Multiple regression. Similar to simple regression: a single dependent variable (Y); two or more independent variables (X); multiple correlation (rather than simple); estimation by least squares.

Multiple regression. Simple linear regression (variables: 1 dependent, 1 independent): y = β₀ + β₁x + e. Multiple linear regression (variables: 1 dependent, 2 independent): y = β₀ + β₁x₁ + β₂x₂ + e, where β₀ is the intercept, β₁ and β₂ the slopes on the independent variables, and e the error.

Multiple regression in matrix form: y = Xβ + e, where y is the n×1 vector of observations, X the n×(k+1) matrix with a leading column of 1s and the k predictors, β the (k+1)×1 vector of parameters, and e the n×1 vector of errors.

Multiple regression in matrix form: the solution requires the inverse of X′X, i.e. β̂ = (X′X)⁻¹X′y.

Multiple regression with three predictors. In matrix notation the model is briefly expressed as y = Xβ + e; written out, yᵢ = β₀ + β₁xᵢ₁ + β₂xᵢ₂ + β₃xᵢ₃ + eᵢ.

Matrix form

General scheme

Sums of squares. The least squares method guarantees the following equality: y′y = ŷ′ŷ + e′e.

Sums of squares. Since, in general, the residuals are orthogonal to the fitted values, it is possible to show that the sum of the squared distances of y from its average decomposes into the sum of squares due to regression and the sum of squares due to error: Σ(yᵢ − ȳ)² = Σ(ŷᵢ − ȳ)² + Σ(yᵢ − ŷᵢ)², i.e. SStot = SSreg + SSres.

The equivalence of the following expressions should be noted:

Sums of squares

Sums of squares, in summary: SStot = SSreg + SSres, with degrees of freedom (n − 1) = k + (n − k − 1).
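A quick NumPy check of the decomposition on simulated two-predictor data (a sketch with made-up values, not the slides' own example):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 20, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

y_hat = X @ np.linalg.solve(X.T @ X, X.T @ y)
ss_tot = np.sum((y - y.mean()) ** 2)      # total sum of squares
ss_reg = np.sum((y_hat - y.mean()) ** 2)  # due to regression
ss_res = np.sum((y - y_hat) ** 2)         # due to error

# SStot = SSreg + SSres holds exactly for OLS with an intercept.
assert np.isclose(ss_tot, ss_reg + ss_res)
```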

R². The coefficient of determination: R² = SSreg / SStot = 1 − SSres / SStot.

Adjusted R²yy′. Because the coefficient of determination depends both on the number of observations (n) and on the number of independent variables (k), it is convenient to correct it by the degrees of freedom: adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1). In our example:
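Both coefficients in a short NumPy sketch (again on simulated data, since the slides' numerical example is not reproduced in the transcript):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 20, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

y_hat = X @ np.linalg.solve(X.T @ X, X.T @ y)
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)

r2 = 1 - ss_res / ss_tot
r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)  # degrees-of-freedom correction
print(r2, r2_adj)
```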

Test on β parameters. Once a regression model has been constructed, it is important to confirm the goodness of fit (R²) of the model and the statistical significance of the estimated parameters. Statistical significance can be checked by an F-test of the overall fit, followed by t-tests of the individual parameters.

Test on β parameters. You can test the hypothesis that the parameters βᵢ, taken together, differ from 0: H₀: β₁ = β₂ = … = βk = 0.

Test on β parameters. F = (SSreg / k) / (SSres / (n − k − 1)), where k = number of columns of the matrix X excluding X₀ and n = number of observations in y.

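A sketch of the overall F-test using NumPy and SciPy on simulated data; scipy.stats.f.sf returns the upper-tail probability:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 20, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

y_hat = X @ np.linalg.solve(X.T @ X, X.T @ y)
ss_reg = np.sum((y_hat - y.mean()) ** 2)
ss_res = np.sum((y - y_hat) ** 2)

# H0: beta_1 = ... = beta_k = 0
F = (ss_reg / k) / (ss_res / (n - k - 1))
p = stats.f.sf(F, k, n - k - 1)
print(F, p)
```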

Covariance matrix of the β. An estimate of the covariance matrix of the beta values results from V̂(β̂) = MSres (X′X)⁻¹, where we denote MSres = SSres / (n − k − 1).

Covariance matrix of the β, where the diagonal elements are estimates of the variances of the individual βᵢ.

Standard error of the β. The standard error of the parameters can be calculated with the following formula: se(bᵢ) = √(MSres · cᵢᵢ), where cᵢᵢ is the diagonal element of the matrix (X′X)⁻¹ corresponding to the parameter βᵢ.
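A sketch of the covariance matrix and standard errors on simulated data (made-up values, hypothetical names):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 20, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)          # the c_ii sit on its diagonal
beta_hat = XtX_inv @ X.T @ y
ms_res = np.sum((y - X @ beta_hat) ** 2) / (n - k - 1)

cov_beta = ms_res * XtX_inv               # estimated covariance matrix of beta
se_beta = np.sqrt(np.diag(cov_beta))      # se(b_i) = sqrt(MSres * c_ii)
print(se_beta)
```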

Standard error of the β. Note: when the value of cᵢᵢ is large, the value of se(bᵢ) grows, indicating that the variable Xᵢ has a high multiple correlation coefficient with the other X variables.

Standard error of the β. The standard error of the βᵢ can also be calculated in the following way: se(bᵢ) = √[ MSres / ((n − 1) Sᵢ² (1 − Rᵢ²)) ], where Rᵢ² is the squared multiple correlation of Xᵢ with the other predictors. An increase in Rᵢ² decreases the denominator of the ratio and, consequently, increases the value of the standard error of the parameter βᵢ.

Tests on individual predictors. With the standard error associated with each βᵢ you can run a t-test to verify H₀: βᵢ = 0, using t = bᵢ / se(bᵢ) with (n − k − 1) degrees of freedom.

With the standard error associated with each βᵢ it is also possible to estimate a confidence interval for each parameter: bᵢ ± t(α/2, n − k − 1) · se(bᵢ).
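The t-tests and confidence intervals in one sketch, assuming the same simulated setup and a 95% level:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 20, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
ms_res = np.sum((y - X @ beta_hat) ** 2) / (n - k - 1)
se_beta = np.sqrt(ms_res * np.diag(XtX_inv))

t = beta_hat / se_beta                        # H0: beta_i = 0
p = 2 * stats.t.sf(np.abs(t), n - k - 1)      # two-sided p-values
crit = stats.t.ppf(0.975, n - k - 1)          # 95% confidence intervals
ci = np.column_stack([beta_hat - crit * se_beta, beta_hat + crit * se_beta])
print(p, ci, sep="\n")
```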

Tests on individual predictors. In order to conduct a statistical test on the regression coefficients it is necessary to:
1. Calculate the SSreg for the model containing all the independent variables.
2. Calculate the SSreg for the model excluding the variable whose significance you want to test (SS₋ᵢ).
3. Perform an F-test with numerator equal to the difference (SSreg − SS₋ᵢ), weighted by the difference between the degrees of freedom of the two models, and with denominator SSres / (n − k − 1).

Tests on individual predictors. To test, for example, only the weight of the first predictor against the total model, it is necessary to compute new estimates β̂₋₁ from the matrix X₋₁ obtained by removing the column of the first predictor from X. From this the calculation of SS₋₁ follows immediately.
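A sketch of this drop-one-column F-test on simulated data; the helper function name is hypothetical:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 20, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

def ss_regression(X, y):
    # Regression sum of squares for a given design matrix.
    y_hat = X @ np.linalg.solve(X.T @ X, X.T @ y)
    return np.sum((y_hat - y.mean()) ** 2)

ss_full = ss_regression(X, y)
ss_reduced = ss_regression(np.delete(X, 1, axis=1), y)  # drop the 1st predictor
ss_res = np.sum((y - y.mean()) ** 2) - ss_full

F = (ss_full - ss_reduced) / (ss_res / (n - k - 1))     # 1 numerator df
p = stats.f.sf(F, 1, n - k - 1)
print(F, p)   # F here equals the square of the corresponding t statistic
```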

Tests on individual predictors

The same procedure is followed to test any subset of predictors. Similarly we have:

Tests on individual predictors. It is interesting to note that this test on a single predictor is equivalent to the t-test of b₁ = 0. When the numerator has only one degree of freedom there is in fact the equivalence F(1, ν) = t²(ν).

Summary table. On this occasion, none of the estimated parameters reached statistical significance for the hypothesis βᵢ ≠ 0.

Variance of the individual predictors Xᵢ. Using the matrix X′X we can calculate the variance of each variable Xᵢ: since X′X contains n, the column sums Σxᵢ, and the sums of squares Σxᵢ², we have Sᵢ² = [Σxᵢ² − (Σxᵢ)²/n] / (n − 1).

Variance of the individual predictors Xᵢ
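A small check, on simulated data, that the variance of each predictor can indeed be read off X′X this way:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])

XtX = X.T @ X        # entry [0, 0] is n; row/column 0 holds the column sums
for i in (1, 2):
    sum_sq = XtX[i, i]                            # sum of x_i^2
    col_sum = XtX[0, i]                           # sum of x_i
    var_i = (sum_sq - col_sum ** 2 / n) / (n - 1)
    assert np.isclose(var_i, np.var(X[:, i], ddof=1))
```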

Covariance between the predictors and the dependent variable. It is possible to calculate the covariance between the independent variables and the dependent variable according to Sᵢy = [Σxᵢy − (Σxᵢ)(Σy)/n] / (n − 1), whose components can be read from X′y.

Covariance between the predictors and the dependent variable. The correlation between the independent variables and the dependent variable is then given by rᵢy = Sᵢy / (SᵢSy). As we will see later, the use of standardized matrices simplifies this calculation considerably.

Test on multiple predictors. You can perform a statistical test on a group of predictors in order to verify their joint significance, using the formula specified above. To test, for example, the weight of the first and second predictors together against the total model, it is necessary to compute new estimates from the matrix X from which the columns belonging to those predictors have been removed. From this the calculation of SS₋(1,2) follows immediately.

Test on multiple predictors

Correlation between predictors: the standard condition of independence between the variables Xᵢ.

Correlation between predictors: the condition of dependence between the variables Xᵢ (completely standardized solution).

Correlation between predictors. We denote by Rᵢ the multiple correlation of the variable Xᵢ with the remaining variables Xⱼ. The element cᵢᵢ is the diagonal element of the matrix (X′X)⁻¹, while Sᵢ² is the variance of the variable Xᵢ; then Rᵢ² = 1 − 1 / [(n − 1) Sᵢ² cᵢᵢ].

Correlation between predictors. In case you do not have the X′X matrix but you do have MSres and the standard error of the parameter βᵢ, the multiple correlation of one X with the others can be calculated as Rᵢ² = 1 − MSres / [(n − 1) Sᵢ² se(bᵢ)²].
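A sketch computing Rᵢ² from the diagonal of (X′X)⁻¹, with the predictors made deliberately correlated so the effect is visible; the cross-check regression is only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)       # deliberately correlated predictors
X = np.column_stack([np.ones(n), x1, x2])

c = np.diag(np.linalg.inv(X.T @ X))      # c_ii
s2 = np.var(X[:, 1:], axis=0, ddof=1)    # S_i^2 of each predictor

r2_i = 1 - 1 / ((n - 1) * s2 * c[1:])    # R_i^2 of each X with the others
print(r2_i)

# Check against a direct regression of x1 on the other predictor.
Z = np.column_stack([np.ones(n), x2])
fit = Z @ np.linalg.solve(Z.T @ Z, Z.T @ x1)
r2_direct = 1 - np.sum((x1 - fit) ** 2) / np.sum((x1 - x1.mean()) ** 2)
assert np.isclose(r2_i[0], r2_direct)
```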

Standardized matrices. The X matrix and the y vector can be converted into standardized scores by dividing the deviation of each element from its mean by the appropriate standard deviation: zᵢⱼ = (xᵢⱼ − x̄ⱼ) / Sⱼ.

Standardized matrices. In our example we have:
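The transcript omits the example's numbers; as a substitute, a minimal standardization sketch in NumPy with made-up location and scale values:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
X = rng.normal(loc=5.0, scale=3.0, size=(n, 3))   # raw predictor matrix

# z = (x - mean) / standard deviation, column by column.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

assert np.allclose(Z.mean(axis=0), 0)             # standardized: mean 0...
assert np.allclose(Z.std(axis=0, ddof=1), 1)      # ...and unit variance
```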

With standardized variables it is not necessary to include the column of 1s in the matrix Z, since the parameter β₀ equals 0.

Standardized matrices. The standardized coefficients β* can be obtained from the non-standardized ones using the formula βᵢ* = bᵢ (Sᵢ / Sy). The equation of the regression line becomes ẑy = β₁*z₁ + β₂*z₂ + … + βk*zk.

In our example we have:

Standardized matrices. Using standardized matrices allows us to set the parameter β₀ = 0. In fact, if the variables are standardized, the intercept for Y is 0, since all the means are equal to 0. Moreover, the correlation between any two standardized variables is rᵢⱼ = Σ zᵢzⱼ / (n − 1), with i, j between 1 and k.

Correlation matrices. If we multiply the matrix Z′Z by the scalar 1/(n − 1) we obtain the correlation matrix R between the independent variables: R = Z′Z / (n − 1).
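A one-line verification on simulated data that Z′Z/(n − 1) matches NumPy's own correlation routine:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
X = rng.normal(size=(n, 3))                # predictors only, no column of 1s

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = Z.T @ Z / (n - 1)                      # correlation matrix of the X's

assert np.allclose(R, np.corrcoef(X, rowvar=False))
```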

Correlation matrices. In our example we have:

Correlation of Y with the individual predictors. Similarly, if the variable Y is also standardized, multiplying the product Z′zy by the scalar 1/(n − 1) yields the vector ryi of correlations of the variable Y with its predictors Xᵢ.

Correlation of Y with individual predictors

Correlation of Y with the individual predictors. The solution of the system of normal equations leads to the following equation: β* = R⁻¹ry. The estimated values can then be obtained using the equation ẑy = Zβ*.

Sums of squares in Z. With standardized variables we have SStot = zy′zy = n − 1. Starting from the general formulas it is possible to obtain the following simplified formulas: SSreg = (n − 1) β*′ry and SSres = (n − 1)(1 − β*′ry).

Calculation of R²y.123. Having decomposed the variance into the component due to the regression and the component due to the residuals, it is immediate to calculate R²y.123 = SSreg / SStot = β*′ry.
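The whole standardized solution in one sketch (simulated data): β* = R⁻¹ry and R² = β*′ry:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
X = rng.normal(size=(n, 3))
y = X @ np.array([2.0, -0.5, 1.0]) + rng.normal(size=n)

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)

R = Z.T @ Z / (n - 1)                   # correlations among the predictors
r_y = Z.T @ zy / (n - 1)                # correlations of the predictors with y

beta_star = np.linalg.solve(R, r_y)     # standardized coefficients
r2 = beta_star @ r_y                    # R^2_y.123
print(beta_star, r2)
```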

Multiple correlation among the Xᵢ. If, in general, the squared multiple correlation of a variable Xᵢ with the other independent variables is Rᵢ² = 1 − 1/[(n − 1)Sᵢ²cᵢᵢ], in the presence of standardized variables it becomes Rᵢ² = 1 − 1/aᵢᵢ, where the element aᵢᵢ belongs to the diagonal of the matrix R⁻¹.

Multiple correlation among the Xᵢ. For example, the squared multiple correlation between the first variable X₁ and the other two can be calculated as R₁² = 1 − 1/a₁₁. If you want to calculate the other two coefficients, you proceed in the same way with a₂₂ and a₃₃.

Standard error of β in Z. The standard error of the standardized parameters is obtainable from the general formula se(βᵢ*) = √[(1 − R²y.12…k) aᵢᵢ / (n − k − 1)].

Standard error of β in Z. You now have all the elements needed to test whether the individual predictors differ from 0, obtaining the same results as with the non-standardized variables.
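A final sketch, under the same simulated setup, computing the standardized standard errors and t statistics; the formula se(βᵢ*) = √[(1 − R²)aᵢᵢ/(n − k − 1)] is as given above, and the t values match those from the raw-scale test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 20, 3
X = rng.normal(size=(n, k))
y = X @ np.array([2.0, -0.5, 1.0]) + rng.normal(size=n)

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
R = Z.T @ Z / (n - 1)
r_y = Z.T @ zy / (n - 1)

beta_star = np.linalg.solve(R, r_y)
r2 = beta_star @ r_y
a = np.diag(np.linalg.inv(R))                  # a_ii from the diagonal of R^(-1)

se_star = np.sqrt((1 - r2) * a / (n - k - 1))  # se of the standardized betas
t = beta_star / se_star                        # same t as the unstandardized test
print(t, 2 * stats.t.sf(np.abs(t), n - k - 1))
```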