1
Multivariate Linear Regression
BMTRY 726 7/10/2018
2
Linear Regression Analysis
We are interested in predicting values of one or more responses from a set of predictors. Regression analysis is an extension of what we discussed with ANOVA and MANOVA, and allows for inclusion of continuous predictors in place of (or in addition to) treatment indicators in MANOVA.
3
Why Use Such Models
There are many reasons to consider regression approaches. Models are simple and interpretable. Linear models can outperform non-linear methods when there are a limited number of training observations or a low signal-to-noise ratio. Such models can also be made more flexible (i.e. non-linear) by applying transformations to the data, for example the use of polynomials.
4
f(x) for Linear Regression
Given our features, x, the regression function takes the following form. Recall we want to identify the estimate of f(x) that minimizes the prediction error for the output. We can define a loss function L(Y, f(x)); the most common choice of loss function for regression is L2, squared error loss.
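In the usual notation, the regression function and squared error loss referred to here are:
f(x) = b0 + b1 x1 + b2 x2 + … + bp xp
L(Y, f(x)) = (Y - f(x))^2 (squared error loss)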
5
Notation & Data
Consider a set of j = 1, 2, …, p variables (or features) collected in a study and an outcome y, where i = 1, 2, …, n indexes the samples.
6
Univariate Regression Analysis
Univariate regression models a single response y as a mean dependent on a set of independent predictors z_i plus random error e_i. Model assumptions:
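A sketch of the standard form, in the notation above, with r predictors:
y_i = b0 + b1 z_i1 + b2 z_i2 + … + br z_ir + e_i, i = 1, …, n
with E(e_i) = 0, Var(e_i) = s^2, and Cov(e_i, e_k) = 0 for i ≠ k.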
7
Least Squares Estimation
Based on this loss function, we can develop an estimate of f(x) by finding the value that minimizes the loss.
8
Least Squares Estimation
This approach is referred to as the method of least squares for estimating our model parameters.
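In matrix form, with Z the n × (r + 1) design matrix, least squares minimizes (y - Zb)'(y - Zb), giving the familiar solution
b̂ = (Z'Z)^{-1} Z'y, with fitted values ŷ = Zb̂ = Z(Z'Z)^{-1} Z'y.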
9
Geometry of Least Squares
10
Least Squares Estimation
Based on this loss function, we can develop an estimate of f(x) by finding the value that minimizes the loss.
11
Least Squares Estimation
We estimate the variance using the residuals
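With r predictors plus an intercept, the usual unbiased estimate is
s^2 = (y - Zb̂)'(y - Zb̂) / (n - r - 1) = SSE / (n - r - 1).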
12
Least Squares Estimation
13
LRT for individual bi’s
First we may test whether any predictors affect the response. The LRT is based on the difference in sums of squares between the full and null models…
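For reference, the familiar F form of this comparison (full model with r predictors vs. the intercept-only null model) is
F = [(SS_null - SS_full) / r] / [SS_full / (n - r - 1)], compared with F_{r, n-r-1} under H0: b1 = … = br = 0.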
14
LRT for individual bi’s
Difference in SS between the full and null models…
16
Model Building
If we have a large number of predictors, we want to identify the “best” subset. There are many methods of selecting the “best”:
-Examine all possible subsets of predictors
-Forward stepwise selection
-Backward stepwise selection
-Shrinkage approaches
17
Model Building
Though we can consider predictors that are significant, this may not yield the “best” subset (some models may yield similar results). The “best” choice is made by examining some criterion:
-R^2
-adjusted R^2
-Mallows’ Cp
-AIC
Since R^2 increases as predictors are added, Mallows’ Cp and AIC are better choices for selecting the “best” predictor subset (their usual forms are sketched below).
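With p the number of fitted parameters and s^2 estimated from the full model, the usual forms are
Cp = SSE_p / s^2 - (n - 2p)
AIC = n log(SSE_p / n) + 2p (up to an additive constant)
adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - p).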
18
Model Checking
Always good to check if the model is “correct” before using it to make decisions… Information about fit is contained in the residuals. If the model fits well, the estimated error terms should mimic N(0, s^2). So how can we check?
19
Model Checking Studentized residuals plot
Plot residuals versus predicted values (see the sketch below):
-Ideally points should be scattered (i.e. no pattern)
-If a pattern exists, it can point to a problem with the model
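A minimal Python sketch of this plot, together with the QQ plot of studentized residuals, using statsmodels; the data here are simulated purely for illustration:

import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
Z = rng.normal(size=(50, 2))                        # illustrative predictors
y = 1 + Z @ np.array([2.0, -1.0]) + rng.normal(size=50)

fit = sm.OLS(y, sm.add_constant(Z)).fit()
stud = fit.get_influence().resid_studentized_internal

fig, ax = plt.subplots(1, 2, figsize=(9, 4))
ax[0].scatter(fit.fittedvalues, stud)               # residuals vs fitted: want no pattern
ax[0].axhline(0, color="grey")
ax[0].set(xlabel="Fitted values", ylabel="Studentized residuals")
stats.probplot(stud, dist="norm", plot=ax[1])       # QQ plot: should track the line if ~ N(0, 1)
plt.show()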
20
Model Checking Plot residuals versus predictors
QQ plot of studentized residuals plot
21
Model Checking
While residual analysis is useful, it may miss outliers, i.e. observations that are very influential on predictions.
Leverage:
-How far is the jth observation from the others?
-How much pull does j exert on the fit?
Observations that affect inferences are influential.
22
Collinearity
If Z is not of full rank, some linear combination Za of the columns of Z equals 0. In such a case the columns are collinear, and the inverse of Z’Z does not exist. It is rare that Za is exactly 0, but if a combination exists that is nearly zero, (Z’Z)^{-1} is numerically unstable. This results in very large estimated variances of the model parameters, making it difficult to identify significant regression coefficients.
23
Collinearity We can check for severity of multicollinearity using the variance inflation factor (VIF)
24
Extending Univariate Regression
Consider a regression problem now where we have m outcomes for each of our n individuals and want to find the association with r predictors. If we assume that each response follows its own regression model, we get the multivariate regression model.
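A sketch of its stacked matrix form, in the notation of these slides:
Y = ZB + e
where Y is n × m (one column per outcome), Z is n × (r + 1) (predictors plus intercept), B is (r + 1) × m (one coefficient column per outcome), and the rows of e have mean 0 and common covariance matrix S.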
25
In Classical Regression Terms
When we write the model this way, we can see it is equivalent to classical linear regression.
26
In Classical Regression Terms
If we consider the ith response, we can estimate b. What about variance?
27
Sum of Squares
It follows from the univariate case that the SSE is… We want to minimize the SSE, so our solution for the ith outcome is… Our sum of squares decomposition for the model is…
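For reference, the standard least squares results in matrix form are
B̂ = (Z'Z)^{-1} Z'Y (each column is the univariate solution for that outcome),
Ŷ = ZB̂, ê = Y - Ŷ, and
Y'Y = Ŷ'Ŷ + ê'ê (total = model + error sums of squares and cross-products).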
28
Example Let’s develop our regression equations for the following data
[Data table: z1 = 1, 2, 3, 4, …; y1 = 8, 9, …; y2 = -1, …]
29
Example Let’s develop our regression equations for the following data
[Data table: z1 = 1, 2, 3, 4, …; y1 = 8, 9, …; y2 = -1, …]
30
Example Let’s develop our regression equations for the following data
[Data table: z1 = 1, 2, 3, 4, …; y1 = 8, 9, …; y2 = -1, …]
31
Example Use all this information to find our sum of squares…
32
Properties The same properties we had in univariate regression hold here
33
Properties The same properties we had in univariate regression hold here
34
Estimate of S
35
So far we haven’t made any assumptions about the distribution of e… what if normality holds?
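Under multivariate normality of the errors, the MLEs take the standard form:
B̂ = (Z'Z)^{-1} Z'Y (the same as the least squares solution), and
Ŝ = (1/n) ê'ê = (1/n)(Y - ZB̂)'(Y - ZB̂),
with the unbiased version dividing by n - r - 1 instead of n.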
36
LRT What do we do with this information? Naturally we can develop LRT for our regression parameters
37
Other Hypotheses?
Consider specific hypotheses about the association between our predictors and our p outcomes in Y:
-Hypotheses about levels of a categorical predictor
-Hypotheses about the magnitude of a predictor’s effect on multiple outcomes
MV regression offers the opportunity to evaluate whether or not predictors have a similar impact on correlated outcomes.
38
Inference
We can make certain inferences about the elements of our parameter matrix using the generalized LRT procedure (Wilks, 1932):
-Make comparisons across groups (rows of b)
-Make comparisons across traits
39
Compute the corrected sums of squares and cross-products matrix for the model:
As we’ve noted previously, this reduces to… When M = I, the diagonal elements of H are the model sums of squares.
40
Compute a matrix of residual sums of squares and cross-products:
When M = I:
-Diagonal elements of E are sums of the squared residuals
-Off-diagonal elements are sums of cross-products of residuals
41
Likelihood Ratio Test
Reject the null hypothesis if Wilks’ criterion is too small. Large-sample chi-square approximation: reject H0 if…
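In terms of the hypothesis and error matrices H and E defined above, one common form (following Johnson & Wichern; m outcomes, r predictors in the full model, q in the reduced model) is
Λ* = |E| / |E + H|,
reject H0 at level a if -[n - r - 1 - (m - r + q + 1)/2] ln Λ* > χ²_{m(r-q)}(a).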
42
A more accurate approximation is given by Rao (C.R. Rao (1951), Bull Int Stat Inst 33(2)). When H0 is true, Rao’s statistic follows an approximate F distribution, where…
43
Exact distribution of Wilks’ criterion:
Assumptions: (1) independence, (2) homogeneity of covariance matrices, (3) multivariate normality. When H0 is true: … when either … (Table 6.3 in J & W).
44
Example: DNA Methylation
An investigator wants to know if exposure to environmental contaminants impacts DNA methylation in humans. Pilot study on 11 subjects.
Outcomes:
-% methylated DNA
-% hydroxymethylated DNA
Serum levels of exposure:
-Multiple perfluorinated compounds (we will consider PFNA)
46
Model of DNA Methylation
What does the model look like?
47
Model of DNA Methylation
Consider the LRT to evaluate the coefficients for PFNA
48
Model of DNA Methylation
Alternative approach to evaluate the coefficients for PFNA?
49
Model of DNA Methylation
Alternative approach to evaluate the coefficients for PFNA?
50
Model of DNA Methylation
What if the PI wants to evaluate if the impact of PFNA on each outcome is the same? What is the hypothesis?
51
Model of DNA Methylation
Let’s test this hypothesis…
52
Model of Methylation on PFNA
Let’s test this hypothesis…
53
Model of Methylation on PFNA
What did we fail to consider in examining whether the impact of PFNA on each outcome is the same?
54
Model of Methylation on PFNA
How do our results/test change to address these issues?
55
Predictions from Multivariate Regression
Often we are interested in using regression models to make predictions for new data. We can do this with MV regression models…
56
Predictions from Multivariate Regression
We can use this information to construct 100(1-a)% confidence regions
57
Predictions from Multivariate Regression
We can also construct 100(1-a)% prediction regions
58
Predictions from Multivariate Regression
We can also construct 100(1-a)% confidence Intervals and prediction intervals
59
Concept of Linear Regression
Up until now, we have been focusing on fixed covariates. Suppose instead that the response and all covariates are random with some joint distribution. What if we want to predict Y using Z?
60
Concept of Linear Regression
We select b0 and b to minimize the MSE, E(Y - b0 - b’Z)^2. The MSE is minimized over b0 and b when…
[Figure: Y, the linear predictor b0 + b’Z, and the error Y - b0 - b’Z plotted against Z]
61
We select b0 and b to minimize the MSE. The MSE is minimized over b0 and b when… where…
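In the standard notation, with μ and Σ the joint mean and covariance of (Y, Z), the minimizers are
b = Σ_ZZ^{-1} σ_ZY and b0 = μ_Y - b’μ_Z,
so the best linear predictor is b0 + b’Z = μ_Y + σ_ZY’ Σ_ZZ^{-1}(Z - μ_Z),
with minimized MSE σ_YY - σ_ZY’ Σ_ZZ^{-1} σ_ZY.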
62
So how do we use this? Useful if we want to use Z to interpret Y…
63
We’ve made no distributional assumptions so far. If a general form of f(Z) is used to approximate Y such that E[Y - f(Z)]^2 is minimized, what will f(Z) be? Special case: Y and Z are jointly normal:
64
Example Find the MLE of the regression function for a single response
65
Example Find the best linear predictor, its mean square error, and the multiple correlation coefficient (assume n = 10)
66
Prediction of Several Variables
What if we are considering more than a single response? Consider responses Y1, Y2, …, Ym (assumed MVN). It is easy to see that the regression equation takes the form…
67
Prediction of Several Variables
The maximum likelihood estimators look very similar to the single response case…
68
Example Find the MLE of the regression function for two responses
69
Example Find the MLE of the regression function for two responses
70
Partial Correlation
We may also be interested in determining the association between the Y’s after removing the effect of Z. We can define a partial correlation between the Y’s removing the effect of Z as follows. The corresponding sample partial correlation coefficient is:
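In the usual notation, with Σ_YY·Z = Σ_YY - Σ_YZ Σ_ZZ^{-1} Σ_ZY the covariance of the Y’s after removing the effect of Z,
ρ_{ik·Z} = σ_{ik·Z} / sqrt(σ_{ii·Z} σ_{kk·Z}),
and the sample version uses the corresponding elements of the estimated Σ_YY·Z.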
71
Testing Correlations May be interested in determining whether all correlations are 0
72
Testing Correlations
We then consider the -2 log likelihood (using a large-sample approximation). Bartlett correction:
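The standard corrected statistic, for p variables with sample correlation matrix R: reject H0 (all correlations are 0) if
-[n - 1 - (2p + 5)/6] ln|R| > χ²_{p(p-1)/2}(a).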
73
Testing Correlations: Example
Say we have an estimated correlation matrix and want to test if all correlations are 0
74
Inference for Individual Correlations
Now what if we are interested in testing whether individual or partial correlations are 0? Using the sample covariance matrix we can compute a t-test.
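For a single sample correlation r, the standard statistic is
t = r sqrt(n - 2) / sqrt(1 - r^2), compared with t_{n-2} under H0: ρ = 0;
for a partial correlation given k other variables, n - 2 is replaced by n - 2 - k.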
75
Inference for Individual Correlations
We can also find an approximate (1-a)100% CI for correlation:
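The usual construction uses Fisher’s z-transformation:
z = (1/2) ln[(1 + r) / (1 - r)], which is approximately N((1/2) ln[(1 + ρ) / (1 - ρ)], 1/(n - 3)),
so an approximate (1 - a)100% CI for ρ is tanh(z ± z_{a/2} / sqrt(n - 3)).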
76
Example From our earlier correlation matrix r13 = 0.24: