Canonical Correlation Analysis and Related Techniques Simon Mason International Research Institute for Climate Prediction The Earth Institute of Columbia.


Canonical Correlation Analysis and Related Techniques
Simon Mason
International Research Institute for Climate Prediction, The Earth Institute of Columbia University
Linking Science to Society

Principal components
A principal component is defined as a weighted sum of a set of original variables in which the variance of the weighted sum is maximized, subject to the constraint that the matrix of weights is orthogonal. The weights could instead be defined to meet other criteria. For example, rotation is used to redefine the weights so that simple structure is achieved. There are a number of related techniques that involve defining weighted sums for two sets of variables …
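The definition above can be sketched numerically. This is a minimal illustration, assuming NumPy and synthetic data; the function name `principal_components` and the random test matrix are hypothetical and not taken from the slides. It obtains the weights from an eigen-decomposition of the covariance matrix, which is one common way to compute them.

```python
import numpy as np

def principal_components(X):
    """Return orthogonal weights U, scores Z = Xc @ U, and the score variances."""
    Xc = X - X.mean(axis=0)                  # centre each original variable
    C = Xc.T @ Xc / (Xc.shape[0] - 1)        # covariance matrix of the variables
    variances, U = np.linalg.eigh(C)         # eigen-decomposition (C is symmetric)
    order = np.argsort(variances)[::-1]      # sort so the largest variance comes first
    U = U[:, order]
    Z = Xc @ U                               # weighted sums of the original variables
    return U, Z, variances[order]

# Synthetic correlated data (an assumption made purely for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))
U, Z, variances = principal_components(X)

# The weights are orthogonal, and each score's variance equals its eigenvalue.
print(np.allclose(U.T @ U, np.eye(4)))
print(np.allclose(Z.var(axis=0, ddof=1), variances))
```

No other unit-length set of weights gives a weighted sum with larger variance than the first column of `U`, which is the maximization property stated in the slide.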

Maximum Covariance Analysis
Consider two sets of variables, X and Y, both with the same number of corresponding cases, n. With principal components analysis, the aim is to define a set of weights, U, that generate new variables, Z, with maximum variance: $Z = XU$. Consider alternative sets of weights for X and Y ($U_X$ and $U_Y$) that maximize the covariances, but with similar orthogonality constraints: $Z_X = X U_X$ and $Z_Y = Y U_Y$.

Maximum Covariance Analysis
The covariance to be maximized between $Z_X$ and $Z_Y$ is: $C = Z_X^T Z_Y$. In terms of X and Y: $C = U_X^T X^T Y U_Y$. But $X^T Y$ is the covariance matrix of X with Y, $C_{XY}$ (for centred data, up to a factor of $1/(n-1)$), so $C = U_X^T C_{XY} U_Y$.

Maximum Covariance Analysis
Rearranging, using the orthogonality of the weights: $U_X C U_Y^T = U_X U_X^T C_{XY} U_Y U_Y^T = C_{XY}$.

Maximum Covariance Analysis
The covariance matrix is then expressed in terms of a diagonal matrix, C, and two orthogonal matrices: $C_{XY} = U_X C U_Y^T$. This is a singular value decomposition of the covariance matrix. For this reason maximum covariance analysis (MCA) is sometimes called SVD analysis or canonical covariance analysis.
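The singular value decomposition described above can be sketched as follows. This is an illustrative example only, assuming NumPy and synthetic data; the function name `maximum_covariance_analysis` and the variable names are hypothetical, and the cross-covariance is scaled by 1/(n − 1), which is an assumption about the normalization.

```python
import numpy as np

def maximum_covariance_analysis(X, Y):
    """SVD of the cross-covariance matrix of X with Y."""
    Xc = X - X.mean(axis=0)                        # centre both data sets
    Yc = Y - Y.mean(axis=0)
    n_cases = X.shape[0]
    C_xy = Xc.T @ Yc / (n_cases - 1)               # cross-covariance matrix
    U_x, c, U_y_T = np.linalg.svd(C_xy, full_matrices=False)
    U_y = U_y_T.T
    Z_x = Xc @ U_x                                 # weighted sums for X
    Z_y = Yc @ U_y                                 # weighted sums for Y
    return U_x, U_y, c, Z_x, Z_y

# Two synthetic data sets sharing two underlying signals (illustrative assumption).
rng = np.random.default_rng(1)
n = 300
shared = rng.normal(size=(n, 2))
X = shared @ rng.normal(size=(2, 5)) + rng.normal(size=(n, 5))
Y = shared @ rng.normal(size=(2, 3)) + rng.normal(size=(n, 3))
U_x, U_y, c, Z_x, Z_y = maximum_covariance_analysis(X, Y)

# The covariance matrix of the paired scores is diagonal and holds the singular values.
print(np.allclose(Z_x.T @ Z_y / (n - 1), np.diag(c)))
```

The singular values in `c` are returned in decreasing order, so the first pair of scores has the largest covariance.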

Canonical Correlation Analysis
Canonical correlation analysis is very similar to maximum covariance analysis except that the correlations rather than the covariances are maximized. Instead of $C = Z_X^T Z_Y$, the Zs are standardized to Ws to obtain the correlation matrix, R: $R = W_X^T W_Y$.

Canonical Correlation Analysis
The standardized scores, W, are obtained by dividing the scores, Z, by their standard deviations, S (which are diagonal matrices): $W_X = Z_X S_X^{-1}$ and $W_Y = Z_Y S_Y^{-1}$. Substituting the Zs: $W_X = X U_X S_X^{-1}$ and $W_Y = Y U_Y S_Y^{-1}$, indicating that the matrices of weights are no longer orthogonal (cf. maximum covariance analysis).
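A small numerical sketch may help here. It assumes NumPy and synthetic data, and it uses the textbook whitening formulation of CCA (an SVD of the cross-covariance after each data set has been whitened), which is not necessarily how the slides compute it. The names `inv_sqrt`, `canonical_correlation_analysis`, `A` and `B` are hypothetical.

```python
import numpy as np

def inv_sqrt(C):
    """Inverse symmetric square root of a positive-definite matrix."""
    vals, vecs = np.linalg.eigh(C)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def canonical_correlation_analysis(X, Y):
    """Return weights A, B, canonical correlations r, and standardized scores."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n_cases = X.shape[0]
    C_xx = Xc.T @ Xc / (n_cases - 1)
    C_yy = Yc.T @ Yc / (n_cases - 1)
    C_xy = Xc.T @ Yc / (n_cases - 1)
    K_x, K_y = inv_sqrt(C_xx), inv_sqrt(C_yy)      # whitening transforms
    U, r, V_T = np.linalg.svd(K_x @ C_xy @ K_y, full_matrices=False)
    A = K_x @ U                                    # weights for X (not orthogonal)
    B = K_y @ V_T.T                                # weights for Y (not orthogonal)
    return A, B, r, Xc @ A, Yc @ B

# Synthetic data with two shared signals (illustrative assumption).
rng = np.random.default_rng(2)
n = 300
shared = rng.normal(size=(n, 2))
X = shared @ rng.normal(size=(2, 5)) + rng.normal(size=(n, 5))
Y = shared @ rng.normal(size=(2, 3)) + rng.normal(size=(n, 3))
A, B, r, W_x, W_y = canonical_correlation_analysis(X, Y)

# The scores have unit variance, and their cross-correlation matrix is diagonal.
print(np.allclose(W_x.T @ W_x / (n - 1), np.eye(3)))
print(np.allclose(W_x.T @ W_y / (n - 1), np.diag(r)))
```

In this sketch `A.T @ A` is generally not the identity matrix, which illustrates the point that the weights are no longer orthogonal once the scores are standardized.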

Canonical Correlation Analysis
Note that CCA is not the same as MCA using standardized original data, since it is the variance of the weighted sums that is standardized in CCA. Buell patterns and other loading interpretation problems apply to MCA and CCA just as they do in PCA. It is unusual to apply rotation, however, because the criterion of maximizing covariance or correlation would be lost.
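The difference between CCA and MCA on standardized data can be checked numerically. The sketch below assumes NumPy and synthetic data; all function names are hypothetical, and CCA is again computed with the standard whitening formulation rather than any method confirmed by the slides. It compares the first canonical correlation with the correlation of the first pair of MCA scores computed from standardized variables.

```python
import numpy as np

def centre(A):
    return A - A.mean(axis=0)

def standardize(A):
    """Give every original variable zero mean and unit variance."""
    return centre(A) / A.std(axis=0, ddof=1)

def inv_sqrt(C):
    vals, vecs = np.linalg.eigh(C)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def first_canonical_correlation(X, Y):
    Xc, Yc = centre(X), centre(Y)
    m = X.shape[0] - 1
    K = inv_sqrt(Xc.T @ Xc / m) @ (Xc.T @ Yc / m) @ inv_sqrt(Yc.T @ Yc / m)
    return np.linalg.svd(K, compute_uv=False)[0]

def first_mca_score_correlation(X, Y):
    Xc, Yc = centre(X), centre(Y)
    U_x, _, U_y_T = np.linalg.svd(Xc.T @ Yc / (X.shape[0] - 1), full_matrices=False)
    z_x = Xc @ U_x[:, 0]                 # first MCA score for X
    z_y = Yc @ U_y_T[0]                  # first MCA score for Y
    return np.corrcoef(z_x, z_y)[0, 1]

# Synthetic data: one shared signal plus correlated noise (illustrative assumption).
rng = np.random.default_rng(3)
n = 500
shared = rng.normal(size=(n, 1))
X = shared @ rng.normal(size=(1, 4)) + rng.normal(size=(n, 4)) @ (np.eye(4) + 0.5 * rng.normal(size=(4, 4)))
Y = shared @ rng.normal(size=(1, 3)) + rng.normal(size=(n, 3)) @ (np.eye(3) + 0.5 * rng.normal(size=(3, 3)))

r_cca = first_canonical_correlation(X, Y)
r_cca_std = first_canonical_correlation(standardize(X), standardize(Y))
r_mca_std = first_mca_score_correlation(standardize(X), standardize(Y))
print(r_cca, r_cca_std, r_mca_std)
```

Standardizing the original variables leaves the canonical correlation unchanged, while the MCA scores from standardized data cannot be more strongly correlated than the first canonical pair, and in general are less so.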

Buell patterns

Redundancy Analysis
Redundancy analysis is another similar technique, except that the variance in Y explained by X is maximized. The technique differs from MCA and CCA in that only the $Z_X$s are standardized. Note that different results will be obtained depending upon which dataset is X and which is Y.
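The asymmetry between X and Y can be illustrated with a short sketch. It assumes NumPy and synthetic data, and it uses a common formulation of redundancy analysis (regress Y on X, then take the principal components of the fitted values); the slides do not confirm this particular computation, and the function name `redundancy_analysis` is hypothetical.

```python
import numpy as np

def redundancy_analysis(X, Y):
    """Regress Y on X, then find principal components of the fitted values."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n_cases = X.shape[0]
    B, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)    # least-squares regression coefficients
    Y_hat = Xc @ B                                 # the part of Y explained by X
    _, s, Q_T = np.linalg.svd(Y_hat, full_matrices=False)
    keep = s > 1e-10 * s[0]                        # drop numerically empty components
    s, Q = s[keep], Q_T[keep].T
    explained = s ** 2 / (n_cases - 1)             # variance in Y explained per component
    Z_x = Y_hat @ Q                                # scores are weighted sums of X
    W_x = Z_x / Z_x.std(axis=0, ddof=1)            # only the X scores are standardized
    redundancy = explained.sum() / Yc.var(axis=0, ddof=1).sum()
    return W_x, Q, explained, redundancy

# Synthetic data with two shared signals (illustrative assumption).
rng = np.random.default_rng(4)
n = 300
shared = rng.normal(size=(n, 2))
X = shared @ rng.normal(size=(2, 5)) + rng.normal(size=(n, 5))
Y = shared @ rng.normal(size=(2, 3)) + rng.normal(size=(n, 3))

W_xy, Q_xy, explained_xy, redundancy_xy = redundancy_analysis(X, Y)
W_yx, Q_yx, explained_yx, redundancy_yx = redundancy_analysis(Y, X)
print(redundancy_xy, redundancy_yx)
```

Swapping the roles of the two data sets gives a different proportion of explained variance, which matches the note that results depend on which dataset is treated as X.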