
1 Regression Analyses

2 Multiple Regression
Multiple IVs, a single continuous DV
A generalization of simple linear regression: Y′ = b0 + b1X1 + b2X2 + b3X3 + … + bkXk, where k is the number of predictors
Find the solution for which Σ(Y − Y′)² is minimized
Do not confuse the size of the bs with importance for prediction
Standardizing gives betas, which can help determine relative importance
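Not part of the original slides: a minimal NumPy sketch (simulated data) of the least-squares solution and of converting raw b weights into standardized betas.

# Sketch only - ordinary least squares and standardized betas with NumPy.
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))                                   # three IVs
y = 2.0 + 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=n)  # continuous DV

# Add an intercept column and solve for the b that minimizes sum (y - y')^2
X1 = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(X1, y, rcond=None)

# Standardized betas: raw weights rescaled by sd(X) / sd(y)
betas = b[1:] * X.std(axis=0, ddof=1) / y.std(ddof=1)
print("raw b:", b)      # b0, b1, b2, b3 - scale-dependent
print("betas:", betas)  # scale-free, easier to compare across predictors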

3 Why use Multiple Regression?
Prediction – allows prediction of change in the DV resulting from changes in the multiple IVs
Explanation – enables explanation of the variate by assessing the relative contribution of each IV to the regression equation
More efficient than multiple simple regression equations – allows consideration of overlapping variance among the IVs

4 When do you use Multiple Regression?
When theoretical or conceptual justification exists for predicting or explaining the DV with the set of IVs
When the DV is metric/continuous – if not, use logistic regression or discriminant analysis

5 Multiple Regression
[Venn diagram: variance in Y overlapping with variance in X1 and variance in X2; shared areas a, b, c; e = residual variance]

6 Assumptions
DV is continuous and interval or ratio in scale
Multivariate normality is assumed for random IVs
Normal distributions and homogeneity of variance at each level of X are assumed for fixed IVs
No error of measurement
Correctly specified model
Errors not correlated with the predictors
Expected mean of the residuals is 0
Homoscedasticity (error variance equal at all levels of X)
Errors are independent / no autocorrelation (the error for one score is not correlated with the error for another score)
Residuals are normally distributed
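Not from the slides: a minimal sketch (simulated data) of how several of these assumptions can be checked from the residuals - mean near zero, roughly constant spread across predicted values, and approximate normality.

# Sketch only - basic residual checks for a fitted regression.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 200
X = rng.normal(size=(n, 2))
y = 1.0 + 0.5 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=n)

X1 = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(X1, y, rcond=None)
y_hat = X1 @ b
resid = y - y_hat

print("mean residual (should be ~0):", resid.mean())
# Rough homoscedasticity check: residual spread in the low vs. high halves of predicted values
lo, hi = resid[y_hat < np.median(y_hat)], resid[y_hat >= np.median(y_hat)]
print("sd of residuals, low / high predicted:", lo.std(ddof=1), hi.std(ddof=1))
# Normality of residuals (Shapiro-Wilk test)
print("Shapiro-Wilk p value:", stats.shapiro(resid).pvalue)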

7 Multiple regression constructs a weighted linear combination of variables. The weights are derived to (a) minimize the sum of the squared errors of prediction, Σ(Y − Y′)², and (b) maximize the squared correlation (R²) between the observed outcome and the predicted outcome based on the linear combination.

8 [Scatterplot: X on the horizontal axis and Y on the vertical axis, with the regression line; the vertical distances y − y′ are the errors of prediction]

9 Multiple R
R is like r, except that it involves multiple predictors, and R cannot be negative
R is the correlation between Y and Y′, where Y′ = b0 + b1X1 + b2X2 + b3X3 + … + bkXk
R² tells us the proportion of variance accounted for (the coefficient of determination)
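Not from the slides: a small sketch (simulated data) showing that R is the correlation between Y and Y′ and that R² equals 1 − SSE/SST, the proportion of variance accounted for.

# Sketch only - multiple R as the correlation between Y and the predicted scores.
import numpy as np

rng = np.random.default_rng(6)
n = 200
X = rng.normal(size=(n, 3))
y = 0.6 * X[:, 0] + 0.4 * X[:, 1] + rng.normal(size=n)

X1 = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(X1, y, rcond=None)
y_pred = X1 @ b

R = np.corrcoef(y, y_pred)[0, 1]                                         # multiple R (never negative)
r2_from_ss = 1 - ((y - y_pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print(f"R = {R:.3f}, R^2 = {R**2:.3f}, 1 - SSE/SST = {r2_from_ss:.3f}")  # the last two agree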

10 An example…
Y = number of job interviews
X1 = GRE score
X2 = years to complete the Ph.D.
X3 = number of publications
N = 500


17 Predicting Interviews
[Venn diagram: variance in Interviews overlapping with variance in GRE, variance in Time to Graduate, and variance in Pubs; shared areas a–f plus residual variance]

18 Regression with SPSS
From the Analyze menu, choose Regression, then Linear. The resulting syntax:
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT interviews
  /METHOD=ENTER years to complete gre pubs
  /SCATTERPLOT=(*ZPRED,*ZRESID).
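Not part of the original slides: a rough Python analogue of the SPSS run above, shown only as a sketch; the file name and the column names (interviews, years, gre, pubs) are assumptions, not taken from the slides.

# Sketch only - a statsmodels analogue of the SPSS REGRESSION command above.
# The file name and column names are assumed, not from the slides.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("interviews.csv")     # hypothetical data file
model = smf.ols("interviews ~ years + gre + pubs", data=df).fit()
print(model.summary())                 # R^2, overall F test, b weights with t tests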

19 The error that is minimized in the derivation of the regression weights: the standard deviation of the errors of prediction (the standard error of estimate). The variance that is maximized in the derivation of the regression weights: the variance in Y shared with the predicted scores, i.e., R².

20 The error that is minimized in the derivation of the regression weights: the variance of errors of prediction.

21 The weight b is the raw-score regression weight; the weight β is the weight when the variables are standardized.

22 Significance of Beta weights. Output from SPSS

23 Multicollinearity
Adding many predictors increases the likelihood of multicollinearity problems
Using multiple indicators of the same construct without combining them in some fashion will definitely create multicollinearity problems
Wreaks havoc with the analysis – e.g., a significant overall R² but no individual variable in the equation significant
Can mask or hide variables that have large and meaningful effects on the DV

24 Multicollinearity
Multicollinearity reflects redundancy in the predictor variables
When severe, the standard errors for the regression coefficients are inflated and the individual influence of predictors is harder to detect with confidence
When severe, the estimates of the regression coefficients are highly interrelated (unstable)
This shows up in the sampling variance of a weight, var(bj) = s²error / [(n − 1) s²Xj (1 − R²j)], which is inflated by the factor 1/(1 − R²j), where R²j is the R² from regressing predictor j on the other predictors

25 The tolerance for a predictor is the proportion of variance that it does not share with the other predictors. The variance inflation factor (VIF) is the inverse of the tolerance.
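Not from the slides: a short sketch (simulated data) that computes tolerance and VIF directly from these definitions, by regressing each predictor on the remaining predictors.

# Sketch only - tolerance and VIF for each predictor.
import numpy as np

rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=n)   # deliberately redundant with x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

def r_squared(y, Z):
    Z1 = np.column_stack([np.ones(len(y)), Z])
    b, *_ = np.linalg.lstsq(Z1, y, rcond=None)
    resid = y - Z1 @ b
    return 1 - resid.var() / y.var()

for j in range(X.shape[1]):
    others = np.delete(X, j, axis=1)
    tol = 1 - r_squared(X[:, j], others)        # variance NOT shared with the other predictors
    print(f"X{j + 1}: tolerance = {tol:.3f}, VIF = {1 / tol:.3f}")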

26 Multicollinearity: Remedies
(1) Combine variables using factor analysis
(2) Use block entry
(3) Respecify the model (omit variables)
(4) Don't worry about it, as long as the program will run (i.e., you don't have singularity – a perfect correlation)

27 Incremental R²
The change in R² that occurs when IVs are added to the equation
Indicates the proportion of variance in the DV that is added to prediction by adding Z to the equation – what Z adds to prediction after controlling for X
The total variance in Y can be broken up in different ways, depending on the order of entry (which IVs are controlled first)
With multiple IVs, the change in R² is strongly determined by the intercorrelations and the order of entry into the equation
The later the point of entry, the less R² is available to predict
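Not from the slides: a sketch (simulated data) of incremental R² for a two-step hierarchical entry, together with the usual F test for the change in R².

# Sketch only - R^2 change when W is added after X, and its F test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 200
x = rng.normal(size=n)
w = 0.5 * x + rng.normal(size=n)              # W is correlated with X
y = 1.0 * x + 0.8 * w + rng.normal(size=n)

def r2(y, *preds):
    Z = np.column_stack([np.ones(len(y))] + list(preds))
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ b
    return 1 - resid.var() / y.var()

r2_step1 = r2(y, x)                           # Step 1: enter X
r2_step2 = r2(y, x, w)                        # Step 2: add W
delta = r2_step2 - r2_step1                   # incremental R^2 for W

# F test for the R^2 change: 1 predictor added, 2 predictors in the full model
df1, df2 = 1, n - 2 - 1
F = (delta / df1) / ((1 - r2_step2) / df2)
p = stats.f.sf(F, df1, df2)
print(f"R2 step 1 = {r2_step1:.3f}, step 2 = {r2_step2:.3f}, change = {delta:.3f}, F = {F:.2f}, p = {p:.4f}")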

28 Other Issues in Regression
Suppressors (an IV correlated with another IV but not with the DV; signs can switch)
Empirical cross-validation
Estimated cross-validation
Dichotomization, trichotomization, median splits
– Dichotomizing one variable reduces the maximum attainable r to .798 (see the simulation sketch below)
– The cost of dichotomizing is a loss of 1/5 to 2/3 of the real variance
– Dichotomizing more than one variable can increase Type I error and yet reduce power as well
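Not from the slides: a quick simulation of the .798 figure – dichotomizing a normal variable at its median attenuates its correlation with another variable by roughly sqrt(2/π) ≈ .798.

# Sketch only - how a median split shrinks a correlation.
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
x = rng.normal(size=n)
y = 0.8 * x + np.sqrt(1 - 0.8 ** 2) * rng.normal(size=n)   # true r is about .80

x_split = (x > np.median(x)).astype(float)                 # median split (dichotomize X)

r_full = np.corrcoef(x, y)[0, 1]
r_dich = np.corrcoef(x_split, y)[0, 1]
print(f"continuous r = {r_full:.3f}, after median split r = {r_dich:.3f}")
print(f"ratio = {r_dich / r_full:.3f}   (theory: sqrt(2/pi) ~ 0.798)")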

29 Significance of Overall R²
Tests areas a + b + c + d + e + f against area g (error)
Get this from a simultaneous regression or from the last step of block or hierarchical entry
Other approaches may or may not give you an appropriate test of the overall R², depending on whether all variables are kept or some are omitted
[Venn diagram: Y overlapping with X, W, and Z; explained areas a–f, error area g]

30 Significance of Incremental R²
Step 1: Enter X
The change in R² tests areas a + b + c against area d + e + f + g
At this step, the t test for the b weight of X is the square root of the F test (if you enter only one variable); it tests whether area a + b + c is significant compared to area d + e + f + g
[Venn diagram: Y with X entered; areas a–g]

31 Significance of Incremental R²
Step 2: Enter W
The change in R² tests areas d + e against area f + g
At this step, the t test for the b weight of X tests area a against area f + g, and the t test for the b weight of W tests area d + e against area f + g
[Venn diagram: Y with X and W entered; areas a–g]

32 Significance of Incremental R²
Step 3: Enter Z
The change in R² tests area f against area g
At this step, the t test for the b weight of X tests area a against area g, the t test for the b weight of W tests area e against area g, and the t test for the b weight of Z tests area f against area g
These are the significance tests for the IV effects from a simultaneous regression analysis; no IV gets “credit” for areas b, c, or d in a simultaneous analysis
[Venn diagram: Y with X, W, and Z entered; areas a–g]

33 Hierarchical Regression: Significance of Incremental R²
Enter variables in hierarchical fashion to determine the R² for each effect; test each effect against the error variance after all variables have been entered
Assume we entered X, then W, then Z in hierarchical fashion:
Test for X: areas a + b + c against g
Test for W: areas d + e against g
Test for Z: area f against g
[Venn diagram: Y with X, W, and Z; areas a–g]

34 Significance test for b or β
In the final equation, when we look at the t tests for our b weights we are looking at the following tests:
Test for X: only area a against g
Test for W: only area e against g
Test for Z: only area f against g
That is why incremental (effect) R² tests are more powerful
[Venn diagram: Y with X, W, and Z; areas a–g]

35 Methods of building regression equations
Simultaneous: all variables entered at once
Backward elimination (stepwise): starts with the full equation and eliminates IVs on the basis of significance tests
Forward selection (stepwise): starts with no variables and adds IVs on the basis of the increment in R²
Hierarchical: researcher determines the order and enters each IV
Block entry: researcher determines the order and enters multiple IVs in single blocks

36 Simultaneous
[Venn diagram: Y overlapping with X, Z, and W; areas a–i]
All variables entered at once
Significance tests and R² based on unique variance
No variable “gets credit” for area g
Variables with intercorrelations have less unique variance
In the diagram, X and Z together predict more than W, yet W might be significant while X and Z are not
Betas are partialled, so the beta for W is larger than for X or Z

37 Backward Elimination
[Venn diagram: Y overlapping with X, Z, and W; areas a–i]
Starts with the full equation and eliminates IVs
Removes the least significant variable (probably X), then tests the remaining variables to see if they are significant
Keeps all remaining significant variables
Capitalizes on chance
Low cross-validation

38 Forward Selection
[Venn diagram: Y overlapping with X, Z, and W; areas a–i]
Starts with no variables and adds IVs
Adds the variable with the most unique R², then the next most significant variable (probably W first, because it gets credit for area i)
Quits when additional variables are not significant
Capitalizes on chance
Low cross-validation

39 Hierarchical (Forced Entry)
[Venn diagram: Y overlapping with X, Z, and W; areas a–i]
Researcher determines the order of entry for the IVs
Order based on theory, timing, or the need for statistical control
Less capitalization on chance; generally higher cross-validation
Final model based on IVs of theoretical importance
Order of entry determines which IV gets credit for area g

40 Order of Entry
Determining the order of entry is crucial
Stepwise entry capitalizes on chance and reduces the cross-validation and stability of your prediction equation
– Only useful to maximize prediction in a given sample
– Can lose important variables
To decide on order, use: logic, theory, order of manipulations/treatments, timing of measures
The usefulness of the regression model is reduced as k (the number of IVs) approaches N (the sample size)
– Best to have a ratio of at least 15 cases per IV

41 Interpreting b or β
B or b is the raw regression weight; β is standardized (scale invariant)
At a given step, the size of b or β is influenced by the order of entry into the regression equation – it should be interpreted at the entry step
Once all variables are in the equation, the bs and βs will always be the same regardless of the order of entry
b or β for main effects is difficult to interpret when an interaction is in the equation

42 Regression: Categorical IVs
We can code groups and use the codes to analyze the data (e.g., 1 and 2 to represent females and males)
The overall R² and significance tests for the full equation will not change regardless of how we code (as long as the codes are linearly independent)
The interpretation of the intercept (a) and slopes (b or beta weights) WILL change depending on the coding
We can use coding to capture the effects of categorical variables

43 The total number of codes needed is always the number of groups − 1
Dummy coding – one group is assigned all 0s; the b weights indicate the mean difference between each group coded 1 and the group coded 0
Effect coding – one group is assigned all −1s; the b weights indicate the mean difference between each group coded 1 and the grand mean
All forms of coding give you the same overall R² and the same significance test for the total R²; the difference is in the interpretation of the b weights

44 Dummy Coding
# dummy codes = # groups − 1
The group that receives all zeros is the reference group
Beta = comparison of the reference group to the group represented by 1
The intercept in the regression equation is the mean of the reference group

45 Effect Coding
# contrast codes = # groups − 1
The group that receives all zeros in dummy coding now gets all −1s
Beta = comparison of the group represented by 1 to the grand mean
The intercept in the regression equation is the grand mean
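Not from the slides: a sketch using simulated three-group data (hypothetical groups A, B, and C, with C as the reference group) showing what dummy and effect coding recover for the intercept and the b weights.

# Sketch only - dummy vs. effect coding for a three-group IV.
import numpy as np

rng = np.random.default_rng(4)
means = {"A": 10.0, "B": 12.0, "C": 15.0}        # C will be the reference group
groups = np.repeat(list(means), 50)
y = np.concatenate([rng.normal(m, 2.0, 50) for m in means.values()])

def fit(D, y):
    D1 = np.column_stack([np.ones(len(y)), D])
    b, *_ = np.linalg.lstsq(D1, y, rcond=None)
    return b

# Dummy coding: two codes, group C = (0, 0) is the reference group
dummy = np.column_stack([(groups == "A").astype(float),
                         (groups == "B").astype(float)])
print("dummy: ", fit(dummy, y))   # intercept ~ mean(C); b1 ~ mean(A) - mean(C); b2 ~ mean(B) - mean(C)

# Effect coding: the reference group gets -1 on every code
effect = np.where(groups[:, None] == "C", -1.0, dummy)
print("effect:", fit(effect, y))  # intercept ~ grand mean; b1 ~ mean(A) - grand mean; b2 ~ mean(B) - grand mean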

46 Regression with Categorical IVs vs. ANOVA
Provides the same results as t tests or ANOVA
Provides additional information:
– The regression equation (line of best fit), useful for future prediction
– Effect size (R²)
– Adjusted R²

47 Regression with Categorical Variables – Syntax
Step 1. Create k − 1 dummy variables
Step 2. Run the regression analysis with the dummy variables as predictors
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT fiw
  /METHOD=ENTER msdum1 msdum2 msdum3 msdum4 msdum5.

48 Regression with Categorical Variables - Output

49 Adjusted R²
The model may be “overfitted” and R² may be inflated
The model may not cross-validate → shrinkage
More shrinkage with small samples (fewer than 10–15 observations per IV)
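The slide does not show the formula; a standard adjustment (sketched below, not from the slides) is adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), which penalizes R² as the number of predictors k grows relative to the sample size n.

# Sketch only - the usual adjusted R^2 formula with illustrative numbers.
def adjusted_r2(r2: float, n: int, k: int) -> float:
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(adjusted_r2(0.30, n=50, k=3))    # ~0.25  - modest shrinkage
print(adjusted_r2(0.30, n=20, k=10))   # ~-0.48 - severe shrinkage with few cases per IV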

50 Example: Hierarchical Regression
Number of children, hours of family work, and sex as predictors of family interfering with work (fiw)
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA CHA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT fiw
  /METHOD=ENTER numkids
  /METHOD=ENTER hrsfamil
  /METHOD=ENTER sex.

51 Hierarchical Regression Output


53 Simultaneous Regression Output

