Interpreting Principal Components. Simon Mason, International Research Institute for Climate Prediction, The Earth Institute of Columbia University.


Interpreting Principal Components
Simon Mason, International Research Institute for Climate Prediction, The Earth Institute of Columbia University
Linking Science to Society

Retaining Principal Components
Principal components analysis is specifically designed as a data reduction technique. How many of the new variables should be retained to represent the total variability of the original variables adequately? A stopping rule is required to identify the point at which additional principal components are no longer required.

Retaining Principal Components
There is a range of criteria that could be used to formulate a stopping rule:
Internal criteria
1. Total variance explained;
2. Marginal variance explained;
3. Comparison with other deleted/retained eigenvalues;
External criteria
4. Usefulness;
5. Physical interpretability.

Retaining Principal Components
Total variance explained
Retaining enough components to explain a chosen proportion of the total variance ensures a minimum loss of information, but there is no a priori criterion for defining what proportion of the variance is signal.
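As a rough illustration of the total-variance rule, the sketch below counts how many leading components are needed to reach a chosen cumulative share of the variance. The function name and the default threshold of 0.9 are assumptions made for this example; the slides give no recommended value.

```python
import numpy as np

def n_components_for_variance(eigenvalues, threshold=0.9):
    """Smallest number of leading components whose cumulative share
    of the total variance reaches `threshold` (an assumed value)."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(lam) / lam.sum()
    # index of the first cumulative share that is >= threshold
    k = int(np.searchsorted(cumulative, threshold)) + 1
    # guard against rounding leaving the final share just below 1.0
    return min(k, lam.size)

# Eigenvalues 4, 2, 1, 0.5, 0.5 give cumulative shares 0.5, 0.75, 0.875, 0.9375, 1.0
print(n_components_for_variance([4.0, 2.0, 1.0, 0.5, 0.5], threshold=0.9))
```

The answer depends entirely on the threshold, which is the weakness the slide points out.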

Retaining Principal Components
Marginal variance explained
A component is retained only if its variance exceeds a threshold c. This ensures that each retained component explains a substantial proportion of the total variance, but how should c be chosen?

Retaining Principal Components
Marginal variance explained
1. Original variables
For the correlation matrix, the Guttman-Kaiser criterion sets c = 1. For the covariance matrix, Kaiser’s rule sets c to the average variance of the original variables, which equals the mean eigenvalue.
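The two thresholds can be compared in a few lines. This is a minimal sketch, assuming the data are arranged with cases in rows and variables in columns; the function name `kaiser_retained` is hypothetical.

```python
import numpy as np

def kaiser_retained(data, use_correlation=True):
    """Count components whose eigenvalue exceeds the threshold c.

    c = 1 for the correlation matrix (Guttman-Kaiser); for the
    covariance matrix c is the mean eigenvalue, i.e. the average
    variance of the original variables.
    """
    x = np.asarray(data, dtype=float)
    if use_correlation:
        matrix = np.corrcoef(x, rowvar=False)
    else:
        matrix = np.cov(x, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(matrix)[::-1]  # descending order
    c = 1.0 if use_correlation else float(eigenvalues.mean())
    return int(np.sum(eigenvalues > c)), c
```

Because the eigenvalues of a correlation matrix sum to the number of variables, c = 1 is simply the same mean-eigenvalue threshold applied to standardised data.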

Retaining Principal Components
Marginal variance explained
2. Significant
a. The “broken stick” rule
b. Rule N
Randomization procedures.
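Both rules compare the observed eigenvalues with what would be expected by chance. The sketch below is one possible reading, not necessarily the procedure the author had in mind: the broken-stick shares use the usual formula b_k = (1/p) Σ_{j=k..p} 1/j, and Rule N is approximated by a Monte Carlo comparison against uncorrelated Gaussian noise of the same size. The number of trials, the 95th percentile, and all function names are assumptions.

```python
import numpy as np

def broken_stick(p):
    """Expected variance shares if the total variance were split at random into p parts."""
    return np.array([np.sum(1.0 / np.arange(k, p + 1)) / p for k in range(1, p + 1)])

def n_retained_broken_stick(eigenvalues):
    """Number of leading components whose variance share beats the broken-stick share."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    exceeds = lam / lam.sum() > broken_stick(lam.size)
    # argmin of a boolean array is the position of the first False
    return lam.size if exceeds.all() else int(np.argmin(exceeds))

def rule_n(data, n_trials=200, percentile=95, seed=0):
    """Number of leading correlation-matrix eigenvalues that exceed the chosen
    percentile of the same-ranked eigenvalue from Gaussian noise."""
    x = np.asarray(data, dtype=float)
    n, p = x.shape
    observed = np.linalg.eigvalsh(np.corrcoef(x, rowvar=False))[::-1]
    rng = np.random.default_rng(seed)
    null = np.empty((n_trials, p))
    for t in range(n_trials):
        noise = rng.standard_normal((n, p))
        null[t] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    limits = np.percentile(null, percentile, axis=0)
    exceeds = observed > limits
    return p if exceeds.all() else int(np.argmin(exceeds))
```

Both helpers stop at the first component that fails the test, so a later component that happens to pass is not retained.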

Retaining Principal Components
Similar variance explained
Delete a component if components with similar variance are deleted.
1. χ² approximations
2. Scree test: delete eigenvalues below the elbow.

Retaining Principal Components
Similar variance explained
3. Log-eigenvalue test: a scree test using the logarithms of the eigenvalues. Based on the assumption that the eigenvalues should decline exponentially.

Retaining Principal Components
Usefulness
If principal components are to be used in other applications, retain the number that gives the best results. Use cross-validation. Perhaps retain subsets that do not necessarily include the first few components. Possibly subject to sampling errors, especially subset selection.
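One way to apply the usefulness criterion is to cross-validate a downstream model for each candidate number of components. The sketch below assumes the downstream application is a linear regression of a single predictand `y` on the leading component scores (principal components regression), with contiguous folds. These choices and the function name are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def cv_error_by_n_components(x, y, max_components, n_folds=5):
    """Cross-validated mean squared error of a regression of y on the
    first m principal component scores, for m = 1..max_components."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    folds = np.array_split(np.arange(n), n_folds)
    errors = np.zeros(max_components)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # centre using training cases only, so the test fold stays independent
        x_mean = x[train].mean(axis=0)
        y_mean = y[train].mean()
        xc = x[train] - x_mean
        # rows of vt are the loading vectors of the training-fold PCA
        _, _, vt = np.linalg.svd(xc, full_matrices=False)
        for m in range(1, max_components + 1):
            w = vt[:m].T
            beta, *_ = np.linalg.lstsq(xc @ w, y[train] - y_mean, rcond=None)
            pred = (x[test_idx] - x_mean) @ w @ beta + y_mean
            errors[m - 1] += np.sum((y[test_idx] - pred) ** 2)
    return errors / n
```

The number of components with the smallest cross-validated error would then be retained. The principal components are recomputed within each training fold so that the test cases do not influence the loadings.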

Retaining Principal Components
Physical interpretability
1. Time scores: do the time scores differ from white noise?
2. Spatial loadings: loadings identify “modes” of variability.
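A very simple check on the time scores is whether their lag-1 autocorrelation is larger than would be expected from white noise. The sketch below uses the approximate 95% bound of 1.96/√n. It is only one of many possible tests and looks at a single lag; the function names and the critical value are assumptions made for illustration.

```python
import numpy as np

def lag1_autocorrelation(scores):
    """Lag-1 autocorrelation of a series of principal component time scores."""
    z = np.asarray(scores, dtype=float)
    z = z - z.mean()
    return float(np.sum(z[:-1] * z[1:]) / np.sum(z * z))

def differs_from_white_noise(scores, z_crit=1.96):
    """True if the lag-1 autocorrelation lies outside the approximate
    95% bounds (+/- z_crit / sqrt(n)) expected for white noise."""
    n = len(scores)
    return bool(abs(lag1_autocorrelation(scores)) > z_crit / np.sqrt(n))
```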

Interpreting the Principal Components
Principal components are notoriously difficult to interpret physically. The weights are defined to maximize the variance, not to maximize the interpretability! With spatial data (including climate data) the interpretation becomes even more difficult because there are geometric controls on the correlations between the data points.

Buell patterns
Imagine a rectangular domain in which all the points are strongly correlated with their neighbours.

Buell patterns
The points in the middle of the domain will have the strongest average correlations with all other points, simply because their average distance to all other grid points is a minimum. The strong correlations between neighbouring grid points will be represented by PC 1, with the central grid points dominating.

Buell patterns
The points in the corners of the domain will have the weakest average correlations with all other points, simply because their average distance to all other grid points is a maximum. The weak correlations between distant grid points will be represented by PC 2. The direction of the dipole reflects the domain shape.

Buell patterns?
Are these real, or are they a function of the domain shape?

Buell patterns
Because of domain shape dependency:
1. the first PC frequently indicates positive loadings with strongest values in the centre of the domain;
2. the second PC frequently indicates negative loadings on one side and positive loadings on the other side in the direction of the longest dimension of the domain.
Similar kinds of problems arise when using:
1. gridded data with converging longitudes, or simply with longitude spacing different from latitude spacing;
2. station data.
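The domain-shape effect can be reproduced with purely synthetic correlations. The sketch below builds a correlation matrix for a rectangular grid in which correlation depends only on the distance between points, and then extracts the first two eigenvectors. The exponential decay, the grid size (5 by 9) and the length scale are all assumptions chosen for illustration; there is no real signal in the matrix at all.

```python
import numpy as np

def buell_demo(ny=5, nx=9, length_scale=3.0):
    """Leading two eigenvectors of a distance-only correlation matrix
    on an ny-by-nx grid, reshaped back onto the grid."""
    yy, xx = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
    coords = np.column_stack([yy.ravel(), xx.ravel()]).astype(float)
    # pairwise Euclidean distances between all grid points
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    corr = np.exp(-dist / length_scale)  # assumed isotropic decay
    _, eigenvectors = np.linalg.eigh(corr)  # eigenvalues in ascending order
    pc1 = eigenvectors[:, -1].reshape(ny, nx)
    pc2 = eigenvectors[:, -2].reshape(ny, nx)
    return pc1, pc2

pc1, pc2 = buell_demo()
print(np.round(pc1, 2))  # one sign everywhere, largest magnitude near the centre
print(np.round(pc2, 2))  # opposite signs at the two ends of the long axis
```

Since the input contains nothing but geometry, any similar-looking monopole and dipole in real data should be interpreted with caution.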

Rotation
The principal component weights are defined to maximize the variance, not to maximize the interpretability! The weights could be redefined to meet alternative criteria. Rotation is sometimes performed to maximize the weights of as many variables as possible, and to minimize the weights of the others.
An objective of rotation is to attain simple structure:
1. weights are either close to zero or close to one in absolute value;
2. variables have high weights on only one component.


Rotation
Commonly used rotation procedures include:
Varimax: maximises the variance of the squared loadings.
Quartimin: oblique rotation.
Procrustes: maximises the similarity between one set of loadings and a target set. Can be orthogonal or oblique.

Rotation
Rotation does NOT solve Buell pattern problems, nor the problems with station data and unevenly gridded data; it only reduces them. What if a mode does not have simple structure, for example a general warming trend? These problems are only of concern for interpretation. Rotation may be redundant if the principal components are used as input into some other procedure.