Principal Component Analysis
What is it? Why use it?
 – Filter on your data
 – Gain insight into important processes
The PCA machinery
 – How to do it
 – Examples (Matlab demo: script on website)
Things to keep in mind
Caveats

Principal Component Analysis (PCA)
Principal Component Analysis is one way to find order in a data set.
PCA represents your data in a very compact form by identifying the most frequently recurring (most energetic) spatial structures in the data and projecting the data onto those structures.
PCA can -- sometimes -- be a way to identify true physical modes of the system.
PCA is also known as Factor Analysis, Empirical Orthogonal Function (EOF) Analysis, and a host of other names, depending on the discipline you were raised in.

Principal Component Analysis (PCA)
PCA of a data set Z results in a series of patterns (empirical orthogonal functions, EOFs) with accompanying time series (principal components, PCs) that are an alternative, exact representation of the data set.
 – Let's say we have M data points (locations) observed at N times.
The EOFs are orthogonal to each other, and so too are the PCs.
Formally, PCA is the eigenanalysis of the covariance matrix C of the data:
 – The EOFs are the eigenvectors E of C.

PCA (cont.)
The EOFs (aka empirical modes) contain all of the variance and structure (covariance) in the data.
Since the EOFs are orthogonal, the eigenvalues tell you how much of the total variance in the data set is explained by each mode (pattern): the fraction explained by mode k is its eigenvalue divided by the sum of all the eigenvalues.

PCA Application #1: a filter on the data to reduce small-scale (unwanted) variance or instrumental noise
Since PCA selects for patterns that explain the variance and covariance in the data, the EOFs with the largest explained variance also tend to have large spatial scales.
 – EOFs with small variance tend to have localized patterns. The common assumption is that these are unwanted noise.
Hence, reconstituting the data using only the modes with the largest variance, truncating the sum at, say, 90% of the total variance, is a way to filter out small-scale features that are assumed to be uninteresting noise (see the truncated reconstruction written out below).
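In equation form, writing e_k for the k-th EOF (spatial pattern), p_k for its PC time series, and lambda_k for its eigenvalue (all defined in the PCA Machinery slides below), the filtered data retain only the first K modes:

\[
Z_{\mathrm{filtered}} = \sum_{k=1}^{K} \mathbf{e}_k \, \mathbf{p}_k^{\mathsf{T}},
\qquad
\frac{\lambda_1 + \cdots + \lambda_K}{\lambda_1 + \cdots + \lambda_M} \approx 0.9
\]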

PCA Application #2: a way to identify special (physically meaningful) structures in a big data set
When there is a lot of structure in the data set, it takes only a few EOFs to express most of the variance and covariance in the whole data set.
 – For example, it might take only two patterns to explain 90% of the variability in the whole data set.
 – The PCs of these leading EOFs can then be analyzed to ascertain the temporal properties of these special patterns.

PCA Machinery
Say we have a data set stored in the matrix Z. The data are gathered at M locations and at N time increments, so Z is an M x N matrix (rows are locations, columns are times).
After removing the time mean at each point (each row), the covariance matrix of Z is C = Z Z^T / N, an M x M matrix whose (i,j) element is the covariance between locations i and j. (Normalizing by N-1 instead of N changes only the overall scale, not the eigenvectors.)
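A minimal Matlab sketch of this step (the variable names are illustrative, not taken from the course script), assuming Z is an M x N data matrix:

% Form the covariance matrix from an M x N data matrix Z
% (rows = locations, columns = times).
[M, N] = size(Z);
Zanom  = Z - mean(Z, 2);          % remove the time mean at each location
C      = Zanom * Zanom' / N;      % M x M covariance matrix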

PCA Machinery
The eigenvectors of C are the EOFs: C E = E Lambda, where the columns of E are the eigenvectors and Lambda is the diagonal matrix of eigenvalues.
The eigenvalues express the amount of variance in each orthogonal eigenmode.
 – The sum of the eigenvalues is the total variance in the data.
The data can be expressed in terms of the EOFs E and PCs P (more later): Z = E P
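Continuing the illustrative sketch above, the eigenanalysis in Matlab, with modes sorted from largest to smallest eigenvalue:

% Eigenanalysis of the covariance matrix (continuing the sketch above).
[E, L]     = eig(C);                 % columns of E are EOFs; diag(L) holds the eigenvalues
[lam, idx] = sort(diag(L), 'descend');
E          = E(:, idx);              % reorder the EOFs to match the sorted eigenvalues
varFrac    = lam / sum(lam);         % fraction of total variance explained by each mode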

PCA Machinery
The EOFs (E) and PCs (P) are of the form: E is an M x M matrix whose columns are the EOFs (spatial patterns), and P is an M x N matrix whose rows are the PC time series.
The PCs P are found by projecting the data onto the eigenvectors: P = E^T Z (this works because E is orthogonal, so E^T E = I and Z = E P).
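A sketch of the projection step, continuing with the illustrative variable names used above:

% Project the (demeaned) data onto the EOFs to get the PCs, then verify that
% E and P reproduce the data exactly.
P        = E' * Zanom;                    % M x N matrix of principal components
reconErr = norm(Zanom - E * P, 'fro');    % ~0 (machine precision): Z = E P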

Examples of PCA: Matlab demo
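The demo script from the website is not reproduced here; the following is a minimal stand-in that builds a synthetic space-time data set with one known pattern plus noise and runs the steps from the Machinery slides end to end (all names and numbers are illustrative):

% Minimal stand-in demo (not the course script): synthetic data with one
% known spatial pattern oscillating in time, buried in white noise.
M = 50;  N = 200;                          % locations, times
x = linspace(0, pi, M)';
truePattern = sin(x);                      % the "physical" spatial structure
trueSeries  = cos(2*pi*(1:N)/20);          % its time series
Z = truePattern * trueSeries + 0.5*randn(M, N);

Zanom = Z - mean(Z, 2);                    % remove the time mean at each location
C     = Zanom * Zanom' / N;                % covariance matrix
[E, L]     = eig(C);
[lam, idx] = sort(diag(L), 'descend');
E = E(:, idx);
P = E' * Zanom;                            % principal components

fprintf('Variance explained by EOF 1: %.1f%%\n', 100*lam(1)/sum(lam));
plot(x, E(:,1));  title('Leading EOF (compare with sin(x))');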

When using PCA as a filter
You need to figure out how many modes to retain.
In general, keep enough to explain most (e.g., 90%) of the total variance in the data set (see the sketch below).
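A sketch of that truncation, continuing with the illustrative variable names used above:

% Keep the smallest number of modes that explains at least 90% of the total
% variance, then rebuild a filtered version of the data from those modes only.
K         = find(cumsum(lam)/sum(lam) >= 0.90, 1);
Zfiltered = E(:, 1:K) * P(1:K, :);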

When using PCA to find a special (physically meaningful) pattern
You need to make sure the eigenmodes are not noise and that their eigenvalues are distinct (not overlapping).
[Figure: eigenvalue spectrum with sampling uncertainty; annotations: "expected slope due to noise" and "only distinct eigenvalue/vector".]
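One common way to check distinctness is the North et al. (1982) rule of thumb; the slide does not prescribe a specific test, so treat this sketch purely as an illustration:

% One possible distinctness check -- the North et al. (1982) rule of thumb --
% offered as an illustration, not as the method prescribed by this lecture.
% The sampling error on each eigenvalue is roughly lam*sqrt(2/Nstar), where
% Nstar is the number of effectively independent samples in time.
Nstar = N;                              % replace with an effective sample size if known
dlam  = lam * sqrt(2/Nstar);            % approximate error bar on each eigenvalue
notDistinct = (lam(1:end-1) - lam(2:end)) < dlam(1:end-1);
% notDistinct(k) == true  =>  modes k and k+1 overlap; interpret them with caution.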

Caveats
The method tends to favor places where the variance is large.
 – Example: circulation vs. geopotential.
 – Hence, EOF analysis of geopotential (dynamic height) would tend to favor the midlatitudes compared with an EOF analysis of winds (currents).

Caveats (cont.)
The technique works best when the data are linearly related across space (because the eigenvalue decomposition is a linear decomposition of the covariance matrix).
When there are nonlinear relationships in space (which is almost always the case), you have to be very careful in assigning physical meaning to the eigenvectors.
In a paleo context, analogous troubles arise if the proxy index is not linearly related to the climate variable you are reconstructing.

Caveats (cont.)
Since all of the variance and covariance is contained within the eigenvectors, the EOFs tend to have large spatial structures.
 – Since, in the atmosphere and ocean, large spatial structures also tend to be lower-frequency phenomena, the EOFs will tend to emphasize large-scale, lower-frequency phenomena.

Caveats (cont.)
WARNING: When the eigenvalues are not well separated, the eigenanalysis will often scramble information between the modes, and one should be very cautious about interpreting these modes physically. In fact, in general, don't try to interpret them physically.
An example of such a problem can be seen using the supplied Matlab program.

Caveats (cont.)
When are the EOFs true physical modes?
Define the true physical modes as the eigensolutions to the linear dynamical system dx/dt = M x, where x is the state vector and M is a matrix that contains the physics and thermodynamics.
The eigenvectors of M are the true physical modes of the system.

Caveats (cont.)
If M is not Hermitian, then the eigenvectors of M are not orthogonal. Hence, there cannot be a one-to-one relationship between the true modes and the EOF modes of the output from this system (see the small numerical illustration below).
What would make M non-Hermitian? Anything that makes M not symmetric. For example:
 – sheared mean flow
 – coupling between the atmosphere and ocean (because they have different Rossby numbers)
Hence, the EOFs are almost never true modes of the dynamical system.
 – They can be close, however, and so there are times when it is useful and appropriate to think of the two as being nearly synonymous.
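A tiny numerical illustration of that point (the matrices are made up for illustration only): the eigenvectors of a symmetric matrix are orthogonal, while those of a non-symmetric matrix generally are not.

% Eigenvectors of a symmetric matrix are orthogonal; those of a
% non-symmetric matrix generally are not.
Msym = [2 1; 1 3];                  % symmetric ("Hermitian") example
Mnon = [2 1; 0 3];                  % non-symmetric example
[Vs, ~] = eig(Msym);
[Vn, ~] = eig(Mnon);
disp(Vs' * Vs)                      % ~identity matrix: orthogonal eigenvectors
disp(Vn' * Vn)                      % nonzero off-diagonal terms: not orthogonal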