Published by Egbert Poole. Modified over 8 years ago.
1
Chapter 12 REGRESSION DIAGNOSTICS AND CANONICAL CORRELATION
2
REGRESSION DIAGNOSTICS
3
Testing Regression Assumptions Prior to Analysis: normal distribution; outliers; linear relationships; multicollinearity.
4
Multicollinearity: interrelatedness of independent variables. Indications: high correlations between variables (around .85 or higher); substantial R squared but statistically nonsignificant coefficients; unstable regression coefficients; unexpected size of coefficients; unexpected signs (+/-).
5
Measures of Collinearity: tolerance; variance inflation factor (VIF); eigenvalues; condition index; variance proportions.
6
Tolerance: a measure of collinearity. It is the proportion of variance in a variable that is not accounted for by the other independent variables. Each independent variable is regressed on the other independent variables; a high multiple correlation indicates that the variable is highly related to the other independent variables.
7
Tolerance: equals 1 - R squared, where R squared comes from regressing the variable on the other independent variables. A tolerance of 0 (1 - 1) indicates perfect collinearity: the independent variable is a perfect linear combination of the other variables. Small tolerances (< 0.1) indicate a problem with multicollinearity.
8
Variance Inflation Factor (VIF): the reciprocal of tolerance (VIF = 1 / tolerance). High tolerance is associated with low VIF; a tolerance below 0.1 corresponds to a VIF above 10.
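The tolerance and VIF definitions can be sketched in a few lines of NumPy. This is an illustrative sketch, not the SPSS routine: the function name `tolerance_and_vif` and the simulated predictors `x1`, `x2`, `x3` are assumptions made for the example.

```python
import numpy as np

def tolerance_and_vif(X):
    """Return (tolerance, vif), one entry per column of X (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    tol = np.empty(p)
    for j in range(p):
        y = X[:, j]                                   # predictor treated as the outcome
        others = np.delete(X, j, axis=1)              # the remaining predictors
        A = np.column_stack([np.ones(n), others])     # add an intercept column
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = resid @ resid
        ss_tot = ((y - y.mean()) ** 2).sum()
        tol[j] = ss_res / ss_tot                      # equals 1 - R squared
    return tol, 1.0 / tol

# Simulated data (an assumption): x3 is almost a linear combination of x1 and x2.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + rng.normal(scale=0.1, size=200)
tol, vif = tolerance_and_vif(np.column_stack([x1, x2, x3]))
print("tolerance:", tol)
print("VIF:", vif)
```

Because all three simulated predictors are tied together by one near-exact linear relation, each tolerance should fall well below the 0.1 guideline.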
9
Eigenvalues: computed from the cross-product matrix of the independent variables. Finding some eigenvalues that are much larger than others indicates an ill-conditioned data matrix. An ill-conditioned matrix leads to large changes in the solution with only small changes in the independent and/or dependent variables.
10
Condition Index: the square root of the ratio of the largest eigenvalue to each successive eigenvalue. A value > 15 indicates a possible problem; > 30 indicates a serious problem.
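A minimal sketch of the eigenvalue and condition-index calculation follows. It assumes one common convention, which may differ from a given package's output: a constant column is added and every column is scaled to unit length before the cross-product matrix is formed. The function name and simulated data are hypothetical.

```python
import numpy as np

def condition_indices(X):
    """Eigenvalues of the scaled cross-product matrix and their condition indices."""
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])        # constant + predictors
    A = A / np.linalg.norm(A, axis=0)                # scale each column to unit length
    eigenvalues = np.linalg.eigvalsh(A.T @ A)[::-1]  # sorted largest to smallest
    return eigenvalues, np.sqrt(eigenvalues[0] / eigenvalues)

# Simulated data (an assumption): x3 is almost x1 + x2.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + rng.normal(scale=0.05, size=200)
eigenvalues, cond_index = condition_indices(np.column_stack([x1, x2, x3]))
print("eigenvalues:", eigenvalues)
print("condition indices:", cond_index)
```

The first condition index is always 1; the last one, tied to the smallest eigenvalue, is the one to compare against the 15 and 30 guidelines.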
11
Variance Proportions: the proportions of the variance of each regression coefficient accounted for by each principal component associated with each of the eigenvalues. Collinearity is a problem when a component associated with a high condition index contributes substantially to the variance of two or more variables.
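The variance proportions can be sketched with a singular value decomposition of the scaled data matrix. This follows the usual variance-decomposition approach under the assumption that a constant column is included and columns are scaled to unit length; the helper name and the simulated data are hypothetical.

```python
import numpy as np

def variance_proportions(X):
    """Condition indices and variance-decomposition proportions.

    Rows of the returned table are components; columns are coefficients
    (constant first). Each column sums to 1.
    """
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])
    A = A / np.linalg.norm(A, axis=0)                # unit-length columns
    _, d, Vt = np.linalg.svd(A, full_matrices=False)
    phi = (Vt.T ** 2) / d ** 2                       # phi[j, k]: coefficient j, component k
    props = phi / phi.sum(axis=1, keepdims=True)     # share of each coefficient's variance
    return d[0] / d, props.T

# Simulated data (an assumption): x3 is almost x1 + x2.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + x2 + rng.normal(scale=0.05, size=200)
cond_index, props = variance_proportions(np.column_stack([x1, x2, x3]))
print(np.round(cond_index, 1))
print(np.round(props, 3))
```

In the printed table, the last row belongs to the component with the highest condition index. When that row carries large proportions for two or more predictors, those predictors are the ones involved in the near dependency.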
12
RESIDUAL: the difference between the actual and the predicted score (Y - Y').
13
Residual Analysis: normal distribution; homoscedasticity.
14
Residual Analysis, Normal Distribution: a normal distribution of residuals indicates linear relationships and a normal distribution of the dependent variable for each value of the independent variable. Assessment: histogram of standardized residuals; normal probability plot.
15
Residual Analysis, Homoscedasticity: plot residuals against predicted values and against independent variables. The spread of the residuals should stay roughly constant across the plot.
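The residual definitions can be illustrated with a small simulated regression. This is a sketch under stated assumptions: the data are simulated with constant error variance, residuals are standardized by the residual standard error, and the correlation between predicted values and absolute residuals is only a crude numeric stand-in for inspecting the plot, not a formal test.

```python
import numpy as np

# Simulated data (an assumption): one predictor, constant error variance.
rng = np.random.default_rng(1)
n = 300
x = rng.uniform(1, 10, size=n)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=n)

A = np.column_stack([np.ones(n), x])                 # intercept + predictor
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
predicted = A @ coef                                 # Y'
residual = y - predicted                             # Y - Y'

dof = n - A.shape[1]                                 # residual degrees of freedom
std_resid = residual / np.sqrt(residual @ residual / dof)

# Rough check: does the size of the residuals grow or shrink with Y'?
spread_trend = np.corrcoef(predicted, np.abs(residual))[0, 1]
print("share of |standardized residual| > 2:", np.mean(np.abs(std_resid) > 2))
print("corr(predicted, |residual|):", round(spread_trend, 3))
```

A value of `spread_trend` near zero is consistent with homoscedasticity; a clearly positive or negative value suggests a fan-shaped plot and is a cue to look at the scatterplot itself.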
16
Computer Exercise: What is the multiple correlation of three sets of predictors with overall state of health? First set: age and years of education. Second set: confidence and life satisfaction. Third set: smoking history and satisfaction with current weight.
17
SPSS - Multiple Regression/Residuals. Statistics: confidence intervals; R squared change; descriptives; part and partial correlations; collinearity diagnostics. Residuals: Durbin-Watson; casewise diagnostics.
18
SPSS - Residual Analysis (cont.). Options: exclude cases pairwise. Plots: histogram; normal probability plot; produce all partial plots.
19
Example from the Literature
20
CANONICAL CORRELATION
21
Canonical correlation measures the relationship between a set of independent variables and a set of dependent variables. It uses the method of least squares and forms two composites: the independent variables, "on the left", and the dependent variables, "on the right".
22
Canonical Correlation, Type of Data Required: data at all levels may be entered; categorical variables must be coded; continuous variables should meet assumptions.
23
Assumptions: the sample must be representative of the population; variables must have a normal distribution; homoscedasticity; linear relationships.
24
CANONICAL CORRELATION. Canonical correlation coefficients: the maximum number equals the number of variables in the smaller set.
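A minimal NumPy sketch can show why the number of canonical correlations equals the size of the smaller set. It uses a QR-plus-SVD route, which is one standard numerical approach and not necessarily what any particular package does; the function name and the simulated sets `X` (three variables) and `Y` (two variables) are assumptions.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X and the columns of Y."""
    Xc = X - X.mean(axis=0)                          # center each variable
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)                         # orthonormal basis for each set
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx' Qy are the canonical correlations, largest first.
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)

# Simulated data (an assumption): 3 predictors, 2 outcomes.
rng = np.random.default_rng(2)
n = 250
X = rng.normal(size=(n, 3))
Y = np.column_stack([
    X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n),
    X[:, 2] + rng.normal(size=n),
])
rho = canonical_correlations(X, Y)
print(rho)   # two values, because the smaller set has two variables
```

With a single outcome variable the one canonical correlation reduces to the multiple correlation R from ordinary regression, which ties this technique back to the first half of the chapter.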
25
CANONICAL CORRELATION. Canonical variate: a weighted composite of the variables in a set; a "new" variable.
26
CANONICAL CORRELATION. Coefficients: raw; standardized; structure.
27
CANONICAL CORRELATION. Raw coefficients: like b-weights in regression; can be used to calculate predicted scores based on actual scores.
28
CANONICAL CORRELATION. Canonical weights (standardized coefficients): in standard score form; similar to standardized regression coefficients (betas); indicate the relative importance of the associated variable; unstable.
29
CANONICAL CORRELATION. Structure coefficients: the correlations between the canonical variates and the original variables. Loadings of .30 or higher are treated as meaningful. They are interpreted like loadings in factor analysis; the square of a loading is the proportion of variance accounted for.
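Structure coefficients can be sketched by correlating each original variable with the canonical variates of its own set. The helper names `cca` and `structure_coefficients` and the simulated data are assumptions for illustration; the variates are computed with a QR-plus-SVD route, so their scale (and sign) is arbitrary, which does not affect the size of the correlations.

```python
import numpy as np

def cca(X, Y):
    """Return canonical correlations and the canonical variates of each set."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    return s, Qx @ U, Qy @ Vt.T                      # correlations, X variates, Y variates

def structure_coefficients(data, variates):
    """Correlation of every original variable (rows) with every variate (columns)."""
    k = data.shape[1]
    return np.corrcoef(data, variates, rowvar=False)[:k, k:]

# Simulated data (an assumption): 3 predictors, 2 outcomes.
rng = np.random.default_rng(2)
n = 250
X = rng.normal(size=(n, 3))
Y = np.column_stack([
    X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n),
    X[:, 2] + rng.normal(size=n),
])
rho, x_variates, y_variates = cca(X, Y)
x_loadings = structure_coefficients(X, x_variates)
print(np.round(x_loadings, 2))                       # loadings
print(np.round(x_loadings ** 2, 2))                  # proportion of variance per variable
print(np.abs(x_loadings) >= 0.30)                    # the .30 rule of thumb from the slide
```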
30
WILKS' LAMBDA: varies from 0 to 1; represents error variance; equal to 1 - R squared; the smaller the value, the greater the variance explained. Tested for significance with Bartlett's test, a chi-square statistic.
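A short sketch of the calculation, using only the standard library. It assumes the usual multi-root form, in which lambda is the product of (1 - R squared) over the canonical correlations (reducing to 1 - R squared when there is only one), and Bartlett's chi-square approximation with p x q degrees of freedom. The canonical correlations and sample size below are hypothetical values, not results from the chapter.

```python
import math

def wilks_lambda(canonical_corrs):
    """Product of (1 - r^2) over the canonical correlations."""
    lam = 1.0
    for r in canonical_corrs:
        lam *= 1.0 - r ** 2
    return lam

def bartlett_chi_square(canonical_corrs, n, p, q):
    """Bartlett's approximation: returns (chi-square statistic, degrees of freedom)."""
    lam = wilks_lambda(canonical_corrs)
    chi2 = -(n - 1 - (p + q + 1) / 2.0) * math.log(lam)
    return chi2, p * q

# Hypothetical inputs: two canonical correlations, n = 100 cases,
# p = 3 variables in one set and q = 2 in the other.
corrs = [0.6, 0.3]
lam = wilks_lambda(corrs)
chi2, df = bartlett_chi_square(corrs, n=100, p=3, q=2)
print(lam, chi2, df)
```

The statistic is then compared with a chi-square distribution on `df` degrees of freedom, for example with `scipy.stats.chi2.sf(chi2, df)` if SciPy is available.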
31
CANONICAL CORRELATION. Redundancy: the higher the redundancy, or correlation among a group of variables, the better the ability to predict from one group to another.
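The slide does not give a formula for redundancy. One common way to quantify it, assumed here, is the Stewart-Love redundancy index: for each canonical variate, the mean squared structure loading of a set's variables multiplied by the squared canonical correlation. The helper names and simulated data in this sketch are hypothetical.

```python
import numpy as np

def cca(X, Y):
    """Return canonical correlations and the canonical variates of each set."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    U, s, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
    return s, Qx @ U, Qy @ Vt.T

def redundancy(data, own_variates, canonical_corrs):
    """Redundancy of `data` given the other set, one value per canonical variate."""
    k = data.shape[1]
    loadings = np.corrcoef(data, own_variates, rowvar=False)[:k, k:]
    variance_extracted = (loadings ** 2).mean(axis=0)   # mean squared loading per variate
    return variance_extracted * canonical_corrs ** 2

# Simulated data (an assumption): 3 predictors, 2 outcomes.
rng = np.random.default_rng(2)
n = 250
X = rng.normal(size=(n, 3))
Y = np.column_stack([
    X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n),
    X[:, 2] + rng.normal(size=n),
])
rho, x_variates, y_variates = cca(X, Y)
y_redundancy = redundancy(Y, y_variates, rho)
print(np.round(y_redundancy, 3), round(y_redundancy.sum(), 3))
```

Summed over all variates of the outcome set, this index equals the average R squared obtained by regressing each outcome variable on the predictor set, which is why it is read as how well one group predicts the other.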
32
Example from the Literature
33
CANONICAL CORRELATION Exercise: What is the canonical correlation between the following two sets of variables? The predictor set includes: age, education, smoking history, depressed state of mind, exercise, and current quality of life. The outcome set includes: positive psychological attitudes and overall state of health.