1
Multiple Linear Regression - Matrix Formulation. Let x = (x1, x2, …, xn)′ be an n × 1 column vector and let g(x) be a scalar function of x. Then, by definition, ∂g/∂x = (∂g/∂x1, ∂g/∂x2, …, ∂g/∂xn)′.
2
For example, let a = (a1, a2, …, an)′ be an n × 1 column vector of constants. It is easy to verify that ∂(a′x)/∂x = a and that, for a symmetric n × n matrix A, ∂(x′Ax)/∂x = 2Ax.
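As a quick check, the second identity can be written out directly for the n = 2 case with symmetric A:
\[
x'Ax = a_{11}x_1^2 + 2a_{12}x_1x_2 + a_{22}x_2^2 ,
\qquad
\frac{\partial\,(x'Ax)}{\partial x}
= \begin{pmatrix} 2a_{11}x_1 + 2a_{12}x_2 \\ 2a_{12}x_1 + 2a_{22}x_2 \end{pmatrix}
= 2Ax .
\]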
3
Theory of Multiple Regression. Suppose we have response variables Yi, i = 1, 2, …, n and k explanatory variables/predictors X1, X2, …, Xk. The model is Yi = b0 + b1 Xi1 + b2 Xi2 + … + bk Xik + ei, i = 1, 2, …, n. There are k + 2 parameters: b0, b1, b2, …, bk and σ².
4
In matrix form the model is Y = Xb + e, where Y is the n × 1 vector of responses, b = (b0, b1, …, bk)′, e is the n × 1 vector of errors, and X is the n × (k + 1) matrix whose first column is all 1s and whose remaining columns hold the predictor values. X is called the design matrix.
6
OLS (ordinary least-squares) estimation: the estimate b̂ minimises the sum of squared errors (Y − Xb)′(Y − Xb), which gives b̂ = (X′X)⁻¹X′Y.
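A sketch of how the two vector-derivative identities above give this estimate: write the error sum of squares as a function of b and set its derivative to zero.
\[
S(b) = (Y - Xb)'(Y - Xb) = Y'Y - 2b'X'Y + b'X'Xb ,
\]
\[
\frac{\partial S}{\partial b} = -2X'Y + 2X'Xb = 0
\;\Longrightarrow\;
X'X\hat b = X'Y
\;\Longrightarrow\;
\hat b = (X'X)^{-1}X'Y .
\]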
8
Fitted values are given by Ŷ = Xb̂ = X(X′X)⁻¹X′Y = HY, where H = X(X′X)⁻¹X′. H is called the “hat matrix” (… it puts the hats on the Y’s).
9
The error sum of squares, SS_RES, is SS_RES = (Y − Ŷ)′(Y − Ŷ) = Y′Y − b̂′X′Y. The estimate of σ² is based on this: σ̂² = SS_RES/(n − k − 1).
10
Example: Find a model of the form y = b0 + b1 x1 + b2 x2 + e for the data below.

y      x1     x2
3.5    3.1    30
3.2    3.4    25
3.0    3.0    20
2.9    3.2    30
4.0    3.9    40
2.5    2.8    25
2.3    2.2    30
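For reference, the data can be entered and the model fitted directly with R's lm() function (a sketch; the variable names y, x1, x2 and fit are chosen here, not taken from the slides):

# Enter the data from the table above
y  <- c(3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3)
x1 <- c(3.1, 3.4, 3.0, 3.2, 3.9, 2.8, 2.2)
x2 <- c(30, 25, 20, 30, 40, 25, 30)

# Fit y = b0 + b1*x1 + b2*x2 + e and inspect coefficients and standard errors
fit <- lm(y ~ x1 + x2)
summary(fit)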
11
For this example X, the design matrix, is the 7 × 3 matrix whose first column is all 1s and whose second and third columns contain the x1 and x2 values.
12
The model in matrix form is given by Y = Xb + e. We have already seen that b̂ = (X′X)⁻¹X′Y. Now calculate this for our example.
13
R can be used to calculate X′X and the answer is:

X′X =
  [   7.0    21.6    200 ]
  [  21.6    68.3    626 ]
  [ 200     626    5950  ]
14
To input the matrix in R use
X = matrix(c(1,1,1,1,1,1,1, 3.1,3.4,3.0,3.2,3.9,2.8,2.2, 30,25,20,30,40,25,30), 7, 3)
where 7 is the number of rows and 3 is the number of columns.
17
Notice the command for matrix multiplication: in R, X′X is obtained with t(X) %*% X.
18
The inverse of X′X can also be obtained using R, with solve(t(X) %*% X).
19
We also need to calculate X′Y = (21.4, 67.67, 623.5)′. Now b̂ = (X′X)⁻¹X′Y = (−0.2138, 0.8984, 0.01745)′.
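Putting the matrix steps together in R (a sketch; X is the design matrix entered earlier, and Y and the other variable names are chosen here):

# Response vector as a 7 x 1 column
Y <- matrix(c(3.5, 3.2, 3.0, 2.9, 4.0, 2.5, 2.3), 7, 1)

XtX    <- t(X) %*% X       # X'X
XtXinv <- solve(XtX)       # (X'X)^-1
XtY    <- t(X) %*% Y       # X'Y
bhat   <- XtXinv %*% XtY   # roughly (-0.2138, 0.8984, 0.01745)'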
20
Notice that this is the same result as obtained previously using lm() in R.
21
So y = -0.2138 + 0.8984x1 + 0.01745x2 + e
22
The “hat matrix” is given by H = X(X′X)⁻¹X′.
23
The fitted Y values are obtained by Ŷ = HY (equivalently Ŷ = Xb̂).
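In R the hat matrix and fitted values can be checked as follows (a sketch reusing X, Y, bhat and fit from the earlier snippets):

H    <- X %*% solve(t(X) %*% X) %*% t(X)   # hat matrix
Yhat <- H %*% Y                            # same as X %*% bhat

# Cross-check against the fitted values from lm()
cbind(Yhat, fitted(fit))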
24
Recall once more that we are looking at the model y = b0 + b1 x1 + b2 x2 + e.
25
Compare with
26
Error Terms and Inference. A useful result is that SS_RES/σ² has a χ² distribution with n − k − 1 degrees of freedom, where n is the number of data points and k is the number of explanatory variables.
27
In addition we can show that (b̂i − bi)/s.e.(b̂i) has a t distribution with n − k − 1 degrees of freedom, where s.e.(b̂i) = σ̂ √c(i+1)(i+1) and c(i+1)(i+1) is the (i+1)th diagonal element of (X′X)⁻¹.
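These quantities can be computed in R as follows (a sketch; variable names are chosen here, continuing from the earlier snippets):

n <- nrow(X); k <- ncol(X) - 1     # n = 7 points, k = 2 predictors

res    <- Y - X %*% bhat           # residuals
ssres  <- sum(res^2)               # residual sum of squares
sigma2 <- ssres / (n - k - 1)      # estimate of sigma^2, about 0.0842

cii <- diag(solve(t(X) %*% X))     # c11, c22, c33
se  <- sqrt(sigma2 * cii)          # standard errors of the estimated b0, b1, b2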
28
For our example: n = 7 and k = 2, so there are n − k − 1 = 4 degrees of freedom, and σ̂² = 0.08422, giving σ̂ = 0.2902.
29
(X′X)⁻¹ was calculated as:

  [  6.683    -1.529    -0.0637 ]
  [ -1.529     0.760    -0.0286 ]
  [ -0.0637   -0.0286    0.0053 ]
30
This means that c11 = 6.683, c22 = 0.7600, c33 = 0.0053. Note that c11 is associated with b0, c22 with b1 and c33 with b2. We will calculate the standard error for b̂1: this is √0.7600 × 0.2902 = 0.2530.
31
The value of b̂1 is 0.8984. Now carry out a hypothesis test: H0: b1 = 0 against H1: b1 ≠ 0. The standard error of b̂1 is 0.2530.
32
The test statistic is t = (b̂1 − 0)/s.e.(b̂1), which calculates as (0.8984 − 0)/0.2530 = 3.55.
33
t tables using 4 degrees of freedom give a cut-off point of 2.776 for 2.5% in the upper tail.
34
Since 3.55 > 2.776 we reject H0: there is evidence at the 5% level that b1 is not zero. The process can be repeated for the other b values, and confidence intervals can be calculated in the usual way. A 95% CI for σ², based on the χ² distribution with 4 degrees of freedom, is ((4 × 0.08422)/11.14, (4 × 0.08422)/0.4844), i.e. (0.030, 0.695).
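The test and the interval for σ² can be reproduced in R with the standard quantile functions (a sketch continuing from the snippet above):

tstat <- bhat[2] / se[2]                         # (0.8984 - 0)/0.2530 = 3.55
qt(0.975, df = 4)                                # critical value 2.776
2 * pt(abs(tstat), df = 4, lower.tail = FALSE)   # two-sided p-value

# 95% CI for sigma^2 from the chi-squared distribution with 4 df
c(4 * sigma2 / qchisq(0.975, df = 4),
  4 * sigma2 / qchisq(0.025, df = 4))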
35
The sum of squares of the residuals can also be calculated; for this example SS_RES ≈ 0.337 (= 4 × 0.08422).
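For example, in R (reusing fit and res from above):

sum(res^2)              # from the matrix calculation
sum(residuals(fit)^2)   # from the lm() fit; both give about 0.337
deviance(fit)           # the same quantity via a shortcut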