Canonical Correlation Analysis (CCA)


CCA
This is it! The mother of all linear statistical analyses.
When? When we want to find a structural relation between a set of independent variables and a set of dependent variables.

CCA When? (part 2)
To what extent can one set of two or more variables be predicted or "explained" by another set of two or more variables?
What contribution does a single variable make to the explanatory power of the set of variables to which it belongs?
What contribution does a single variable make to predicting or "explaining" the composite of the variables in the set to which it does not belong?
What different dynamics are involved in the ability of one variable set to "explain", in different ways, different portions of the other variable set?
What relative power do different canonical functions have to predict or explain relationships?
How stable are canonical results across samples or sample subgroups?
How closely do obtained canonical results conform to expected canonical results?

CCA Assumptions
Linearity: if not, use nonlinear canonical correlation analysis.
Absence of multicollinearity: if not, use Partial Least Squares (PLS) regression to reduce the space.
Homoscedasticity: if not, transform the data.
Normality: if not, use re-sampling.
A lot of data: max(p, q) × 20 × number of pairs.
Absence of outliers.

CCA Toy example
[Tables: raw data for the IVs and the DVs, each given as a matrix X.]

CCA Z-score transformation
[Tables: standardized scores for the IVs and the DVs, each given as a matrix Z.]
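The toy data did not survive extraction, so the sketch below standardizes synthetic data instead. The function name `zscore`, the seed, and the data shapes are assumptions made for illustration; the sample standard deviation (ddof=1) is also an assumed choice.

```python
import numpy as np

def zscore(data):
    """Standardize each column to mean 0 and sample standard deviation 1."""
    data = np.asarray(data, dtype=float)
    return (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

# Hypothetical data: 50 cases, 2 IVs and 2 DVs (not the slide's toy example).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
Y = rng.normal(size=(50, 2))

Zx, Zy = zscore(X), zscore(Y)
```

Working on Z scores means the covariance matrices of the two sets are their correlation matrices, which is what the later steps operate on.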

CCA Canonical Correlation Matrix
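The matrix shown on this slide was not recovered. A common textbook form, given here as an assumption rather than as the author's exact notation, builds R from the four blocks of the full correlation matrix, where x indexes the IVs and y the DVs:

```latex
R = R_{yy}^{-1} \, R_{yx} \, R_{xx}^{-1} \, R_{xy}
```

Here R_{xx} and R_{yy} are the within-set correlation matrices, and R_{xy} = R_{yx}^{\top} holds the correlations between the two sets.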

CCA Relations with other subspace methods

CCA Eigenvalues and eigenvectors decomposition
R is decomposed into its eigenvalues and eigenvectors, as in PCA.

CCA Eigenvalues and eigenvectors decomposition
The square roots of the eigenvalues are the canonical correlations.
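A minimal sketch of this step, assuming R is formed as Ryy^-1 Ryx Rxx^-1 Rxy (a common textbook form, not confirmed by the slides). The function name and the synthetic data are hypothetical.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Return the eigenvalues of R and their square roots (canonical correlations)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    p, q = X.shape[1], Y.shape[1]
    # Full correlation matrix of [X, Y], split into its four blocks.
    R_all = np.corrcoef(np.hstack([X, Y]), rowvar=False)
    Rxx, Rxy = R_all[:p, :p], R_all[:p, p:]
    Ryx, Ryy = R_all[p:, :p], R_all[p:, p:]
    # R = Ryy^-1 Ryx Rxx^-1 Rxy; solve() avoids forming explicit inverses.
    R = np.linalg.solve(Ryy, Ryx) @ np.linalg.solve(Rxx, Rxy)
    # R is not symmetric, so use the general eigenvalue routine and keep real parts.
    eigenvalues = np.linalg.eigvals(R).real
    eigenvalues = np.sort(np.clip(eigenvalues, 0.0, 1.0))[::-1][: min(p, q)]
    return eigenvalues, np.sqrt(eigenvalues)

# Hypothetical data in which the DVs depend linearly on the IVs plus noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
Y = X @ np.array([[0.8, 0.1], [0.2, 0.5]]) + rng.normal(size=(200, 2))

eigenvalues, r_c = canonical_correlations(X, Y)
```

The first canonical correlation is the largest correlation attainable between a linear combination of the IVs and one of the DVs, so it is never smaller than any single IV-DV correlation.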

CCA Significance test for the canonical correlation
A significant result indicates that the IV and DV sets share variance.
Procedure:
1. Test all the canonical correlates together (m = 1, …, min(p, q)).
2. If the test is significant, remove the first canonical correlate and test the remaining ones (m = 2, …, min(p, q)).
3. Repeat.
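The test statistic itself was not recovered from the slide. The sketch below assumes Bartlett's chi-square approximation based on Wilks' lambda, which is a standard choice for this sequential procedure; the eigenvalues and sample size are hypothetical, not the slide's results.

```python
import math

def bartlett_sequence(eigenvalues, n, p, q):
    """Sequential Bartlett tests: step m tests canonical correlates m..min(p, q).

    Returns a list of (m, wilks_lambda, chi_square, degrees_of_freedom).
    """
    k = min(p, q)
    results = []
    for m in range(1, k + 1):
        # Wilks' lambda for the correlates that remain at step m.
        wilks = math.prod(1.0 - lam for lam in eigenvalues[m - 1:k])
        chi_square = -(n - 1 - (p + q + 1) / 2.0) * math.log(wilks)
        df = (p - m + 1) * (q - m + 1)
        results.append((m, wilks, chi_square, df))
    return results

# Hypothetical eigenvalues (squared canonical correlations) and sample size.
results = bartlett_sequence([0.5, 0.2], n=50, p=2, q=2)
for m, wilks, chi_square, df in results:
    print(f"m={m}: lambda={wilks:.3f}, chi2={chi_square:.2f}, df={df}")
```

Each chi-square value would then be compared against the chi-square distribution with the listed degrees of freedom, for example with `scipy.stats.chi2.sf(chi_square, df)` if SciPy is available.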

CCA Significance test for the canonical correlation
Since all canonical variates are significant, we keep them all.

CCA Canonical coefficients
Analogous to regression coefficients. They are computed from the eigenvectors and the correlation matrix of the dependent variables.
[Equations for B_y and B_x not recovered.]
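Since the slide's equations for B_y and B_x were lost, the sketch below uses one standard formulation as an assumption: B_y is obtained from the eigenvectors of the symmetric matrix Ryy^(-1/2) Ryx Rxx^-1 Rxy Ryy^(-1/2), and B_x follows from B_y and the canonical correlations. The function name and the data are hypothetical.

```python
import numpy as np

def canonical_coefficients(Rxx, Rxy, Ryy):
    """Return canonical correlations and coefficient matrices (Bx, By)."""
    Ryx = Rxy.T
    k = min(Rxy.shape)
    # Ryy^(-1/2) from the eigendecomposition of the symmetric matrix Ryy.
    w, V = np.linalg.eigh(Ryy)
    Ryy_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    # Symmetric matrix with the same eigenvalues as Ryy^-1 Ryx Rxx^-1 Rxy.
    S = Ryy_inv_sqrt @ Ryx @ np.linalg.solve(Rxx, Rxy) @ Ryy_inv_sqrt
    lam, E = np.linalg.eigh((S + S.T) / 2.0)
    order = np.argsort(lam)[::-1][:k]          # largest eigenvalues first
    r_c = np.sqrt(np.clip(lam[order], 0.0, 1.0))
    By = Ryy_inv_sqrt @ E[:, order]
    # Each column of Rxx^-1 Rxy By is rescaled by its canonical correlation.
    Bx = np.linalg.solve(Rxx, Rxy) @ By / r_c
    return r_c, Bx, By

# Hypothetical data: 2 IVs and 2 DVs related linearly plus noise.
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
Y = X @ np.array([[0.8, 0.1], [0.2, 0.5]]) + rng.normal(size=(200, 2))
R_all = np.corrcoef(np.hstack([X, Y]), rowvar=False)
Rxx, Rxy, Ryy = R_all[:2, :2], R_all[:2, 2:], R_all[2:, 2:]

r_c, Bx, By = canonical_coefficients(Rxx, Rxy, Ryy)
```

With this scaling, the canonical variates Zx Bx and Zy By have unit variance, variates within a set are uncorrelated, and the correlation between the m-th pair equals the m-th canonical correlation.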

CCA Canonical variates
Analogous to predicted scores in regression: the standardized scores weighted by the canonical coefficients.

CCA Loading matrices
Matrices of correlations between the variables and the canonical variates.
[Matrices A_x and A_y not recovered.]
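As a rough illustration, loadings can be computed directly as correlations between each standardized variable and each canonical variate score. The coefficient matrix `B` below is hypothetical; in practice it would be B_x or B_y from the previous step.

```python
import numpy as np

def loadings(Z, B):
    """Correlate each column of Z with each column of the variate scores Z @ B."""
    Z = np.asarray(Z, dtype=float)
    B = np.asarray(B, dtype=float)
    scores = Z @ B                      # canonical variate scores
    n_vars = Z.shape[1]
    C = np.corrcoef(np.hstack([Z, scores]), rowvar=False)
    return C[:n_vars, n_vars:]          # variables in rows, variates in columns

# Hypothetical standardized data with 3 variables and 2 variates.
rng = np.random.default_rng(3)
raw = rng.normal(size=(100, 3))
Z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
B = np.array([[0.7, 0.2], [0.5, -0.6], [0.1, 0.9]])  # hypothetical coefficients

A = loadings(Z, B)
```

When the coefficients are scaled so that each variate has unit variance, this reduces to the matrix product of the set's correlation matrix and its coefficient matrix.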

CCA Loadings and canonical correlations for both canonical variate pairs
Only coefficients with an absolute value above 0.3 are interpreted.
[Table with columns Loading and Canonical correlation not recovered.]

CCA Proportion of variance extracted
How much variance does each of the canonical variates extract from the variables on its own side of the equation?
[Results for the first and second canonical variates of each set not recovered.]
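The slide's numbers were lost, but the quantity is usually computed as the mean of the squared loadings in each column of the loading matrix. The sketch below assumes that definition and uses a hypothetical loading matrix.

```python
import numpy as np

def proportion_of_variance(A):
    """Mean squared loading per canonical variate (one value per column of A)."""
    A = np.asarray(A, dtype=float)
    return (A ** 2).sum(axis=0) / A.shape[0]

# Hypothetical loading matrix: 2 variables (rows) by 2 variates (columns).
A_x = np.array([[0.8, 0.6],
                [0.6, -0.8]])

pv_x = proportion_of_variance(A_x)
print(pv_x)
```

In this made-up example each variate extracts half of the variance of its own set, so the two together account for all of it.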

CCA Redundancy
How much variance do the canonical variates from the IVs extract from the DVs, and vice versa?
[Equation: redundancy rd for y and x, computed from the eigenvalues.]
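A minimal sketch of redundancy, assuming the usual definition: the proportion of variance a variate extracts from a set of variables, multiplied by the corresponding eigenvalue (squared canonical correlation). The loading matrix and canonical correlations below are hypothetical, not the slide's values.

```python
import numpy as np

def redundancy(A, r_c):
    """Redundancy per canonical variate: mean squared loading times r_c squared."""
    A = np.asarray(A, dtype=float)
    r_c = np.asarray(r_c, dtype=float)
    pv = (A ** 2).sum(axis=0) / A.shape[0]   # proportion of variance extracted
    return pv * r_c ** 2                      # r_c ** 2 are the eigenvalues

# Hypothetical loadings of the DVs and hypothetical canonical correlations.
A_y = np.array([[0.8, 0.6],
                [0.6, -0.8]])
r_c = np.array([0.9, 0.5])

rd = redundancy(A_y, r_c)
print(rd, rd.sum())
```

Summing the entries gives the total share of variance in one set that is extracted by the canonical variates of the other set.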

CCA Redundancy
How much variance do the canonical variates from the IVs extract from the DVs, and vice versa?
Summary:
The first canonical variate from the IVs extracts 40% of the variance in the y variables.
The second canonical variate from the IVs extracts 30% of the variance in the y variables.
Together they extract 70% of the variance in the DVs.
The first canonical variate from the DVs extracts 49% of the variance in the x variables.
The second canonical variate from the DVs extracts 24% of the variance in the x variables.
Together they extract 73% of the variance in the IVs.

CCA Rotation
A rotation does not influence the proportion of variance or the redundancy.
[Loading matrices before and after rotation not recovered.]