Principal Component Analysis
Bamshad Mobasher, DePaul University

Principal Component Analysis
- PCA is a widely used data compression and dimensionality reduction technique
- PCA takes a data matrix, A, of n objects by p variables, which may be correlated, and summarizes it by uncorrelated axes (principal components or principal axes) that are linear combinations of the original p variables
- The first k components display most of the variance among objects
- The remaining components can be discarded, resulting in a lower-dimensional representation of the data that still captures most of the relevant information
- PCA is computed by determining the eigenvectors and eigenvalues of the covariance matrix
- Recall: the covariance of two random variables is their tendency to vary together
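As a rough illustration of the pipeline just described, here is a minimal NumPy sketch (the data, sizes, and variable names are invented for this example, not taken from the slides):

```python
import numpy as np

# Toy data: n = 100 objects (rows) by p = 5 variables (columns); values are made up.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))  # mix columns so they are correlated

A_centered = A - A.mean(axis=0)                  # center each variable
C = (A_centered.T @ A_centered) / (len(A) - 1)   # p x p covariance matrix

eigvals, eigvecs = np.linalg.eigh(C)             # eigendecomposition of a symmetric matrix
order = np.argsort(eigvals)[::-1]                # reorder by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

k = 2                                            # keep only the top k principal components
scores = A_centered @ eigvecs[:, :k]             # n x k lower-dimensional representation
```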

Geometric Interpretation of PCA
- The goal is to rotate the axes of the p-dimensional space to new positions (principal axes) that have the following properties:
  - ordered such that principal axis 1 has the highest variance, axis 2 has the next highest variance, ..., and axis p has the lowest variance
  - the covariance among each pair of principal axes is zero (the principal axes are uncorrelated)
[Figure: a cloud of points on two correlated variables with the rotated axes PC 1 and PC 2 overlaid]
Note: each principal axis is a linear combination of the original two variables
Credit: Loretta Battaglia, Southern Illinois University

PCA: Coordinate Transformation
From p original variables x_1, x_2, ..., x_p, produce p new variables y_1, y_2, ..., y_p:
  y_1 = a_11 x_1 + a_12 x_2 + ... + a_1p x_p
  y_2 = a_21 x_1 + a_22 x_2 + ... + a_2p x_p
  ...
  y_p = a_p1 x_1 + a_p2 x_2 + ... + a_pp x_p
such that:
- the y_i's are uncorrelated (orthogonal)
- y_1 explains as much as possible of the original variance in the data set
- y_2 explains as much as possible of the remaining variance, etc.
The y_i's are the Principal Components.
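A small sketch of these properties on a made-up two-variable dataset (NumPy assumed; nothing here comes from the slides' own example): the transformed variables come out uncorrelated, with variances in decreasing order.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(500, 2)) @ np.array([[2.0, 0.6], [0.6, 1.0]])  # correlated x_1, x_2
x = x - x.mean(axis=0)

C = np.cov(x, rowvar=False)              # 2 x 2 covariance matrix of the original variables
vals, vecs = np.linalg.eigh(C)
vals, vecs = vals[::-1], vecs[:, ::-1]   # largest-variance axis first

y = x @ vecs                             # y_i = a_i1*x_1 + a_i2*x_2 for each observation
print(np.cov(y, rowvar=False).round(3))  # ~diagonal: the y_i's are uncorrelated
print(vals.round(3))                     # variances of y_1, y_2 in decreasing order
```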

Principal Components
[Figure: a data cloud with the 1st Principal Component, y_1, along the direction of greatest spread and the 2nd Principal Component, y_2, orthogonal to it]

Principal Components: Scores
[Figure: an observation with original coordinates (x_i1, x_i2) and its scores (y_i,1, y_i,2) on the two principal axes]

Principal Components: Eigenvalues
[Figure: the same data cloud with λ_1 and λ_2 marking the spread along PC 1 and PC 2]
The eigenvalues represent the variances along the direction of each principal component.

Principal Components: Eigenvectors
- z_1 = [a_11, a_12, ..., a_1p]: 1st eigenvector of the covariance (or correlation) matrix, and the coefficients of the first principal component
- z_2 = [a_21, a_22, ..., a_2p]: 2nd eigenvector of the covariance (or correlation) matrix, and the coefficients of the second principal component
- ...
- z_p = [a_p1, a_p2, ..., a_pp]: pth eigenvector of the covariance (or correlation) matrix, and the coefficients of the pth principal component

Dimensionality Reduction
- We can take only the top k principal components y_1, y_2, ..., y_k, effectively transforming the data into a lower-dimensional space.

Covariance Matrix
Notes:
- For a variable x, cov(x, x) = var(x)
- For independent variables x and y, cov(x, y) = 0
- The covariance matrix is a matrix C with elements C_ij = cov(i, j)
- The covariance matrix is square and symmetric
- For independent variables, the covariance matrix will be a diagonal matrix, with the variances along the diagonal and zero covariances in the off-diagonal elements
- To calculate the covariance matrix from a dataset, first center the data by subtracting the mean of each variable to get the centered matrix A, then compute (1/n) A^T A (or (1/(n-1)) A^T A for the sample covariance, as used in the example below)
- Element-wise: cov(i, j) = (1/(n-1)) Σ_m (A_mi - mean_i)(A_mj - mean_j), where the sum runs over the n objects m, A_mi is the value of variable i in object m, and mean_i is the mean of variable i

Recall: PCA is computed by determining the eigenvectors and eigenvalues of the covariance matrix
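A minimal sketch of this computation (the numbers below are made up for illustration, not the slides' example), cross-checked against NumPy's built-in covariance routine:

```python
import numpy as np

X = np.array([[2.5, 2.4],
              [0.5, 0.7],
              [2.2, 2.9],
              [1.9, 2.2],
              [3.1, 3.0]])        # small made-up dataset: n = 5 objects, p = 2 variables

A = X - X.mean(axis=0)            # center the data (subtract each variable's mean)
C = (A.T @ A) / (len(X) - 1)      # sample covariance matrix, 1/(n-1) scaling

print(C)
print(np.allclose(C, np.cov(X, rowvar=False)))  # True: agrees with NumPy's np.cov
```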

Covariance Matrix - Example
[The slide shows a small numeric example: the original data matrix X, the centered data matrix A obtained by subtracting each variable's mean, and the resulting covariance matrix Cov(X) = (1/(n-1)) A^T A; the numeric values are not reproduced here]

Summary: Eigenvalues and Eigenvectors
- Finding the principal axes involves finding the eigenvalues and eigenvectors of the covariance matrix C computed from the centered data matrix A
  - eigenvalues are values λ such that C z = λ z (the z are the eigenvectors)
  - this can be rewritten as (C - λI) z = 0
  - eigenvalues can be found by solving the characteristic equation det(C - λI) = 0
- The eigenvalues λ_1, λ_2, ..., λ_p are the variances of the coordinates on each principal component axis
  - the sum of all p eigenvalues equals the trace of C (the sum of the variances of the original variables)
- The eigenvectors of the covariance matrix are the axes of maximum variance
  - a good approximation of the full matrix can be computed using only a subset of the eigenvectors and eigenvalues
  - the eigenvalues are truncated below some threshold; the data is then reprojected onto the remaining r eigenvectors to get a rank-r approximation
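A short sketch checking the two key facts above, eigenpairs satisfying C z = λ z and the eigenvalues summing to the trace of C, on randomly generated data (NumPy assumed; the data is illustrative only):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(50, 3))
A = A - A.mean(axis=0)                    # centered data
C = (A.T @ A) / (len(A) - 1)              # covariance matrix

lam, Z = np.linalg.eigh(C)                # solves C z = lambda z for a symmetric C
lam, Z = lam[::-1], Z[:, ::-1]            # decreasing order of eigenvalues

print(np.allclose(C @ Z[:, 0], lam[0] * Z[:, 0]))  # True: first eigenpair satisfies C z = lambda z
print(np.isclose(lam.sum(), np.trace(C)))          # True: sum of eigenvalues equals trace of C
```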

Eigenvalues and Eigenvectors
[The slide shows the covariance matrix C from the example, its three eigenvalues λ_1, λ_2, λ_3, and the matrix of eigenvectors Z; the numeric values are not reproduced here]
Note: λ_1 + λ_2 + λ_3 = 74.4 = trace of C (the sum of the variances on the diagonal)

Reduced Dimension Space
- The coordinates of each object i on the kth principal axis, known as the scores on PC k, are computed as Y = X Z, where Y is the n x k matrix of PC scores, X is the n x p centered data matrix, and Z is the p x k matrix of eigenvectors
- The variance of the scores on each PC axis is equal to the corresponding eigenvalue for that axis
  - the eigenvalue represents the variance displayed ("explained" or "extracted") by the kth axis
  - the sum of the first k eigenvalues is the variance explained by the k-dimensional reduced matrix
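A minimal sketch of the score computation Y = X Z and the claim that the score variances equal the eigenvalues (illustrative random data, NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))
X = X - X.mean(axis=0)                     # centered n x p data matrix

lam, Z = np.linalg.eigh(np.cov(X, rowvar=False))
lam, Z = lam[::-1], Z[:, ::-1]             # eigenvalues/eigenvectors, largest first

k = 2
Y = X @ Z[:, :k]                           # n x k matrix of PC scores (Y = X Z)
print(Y.var(axis=0, ddof=1).round(4))      # sample variance of each score column...
print(lam[:k].round(4))                    # ...matches the corresponding eigenvalues
```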

Reduced Dimension Space
- Each eigenvalue represents the variance displayed ("explained") by its PC; the sum of the first k eigenvalues is the variance explained by the k-dimensional reduced matrix
[Figure: a scree plot showing the eigenvalues in decreasing order against the component number]
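The numbers behind a scree plot can be computed directly from the eigenvalues. A small sketch with hypothetical eigenvalues (chosen here to sum to 74.4 like the worked example's trace, but not the slides' actual values):

```python
import numpy as np

lam = np.array([40.0, 20.0, 10.0, 3.0, 1.4])  # hypothetical eigenvalues, largest first (sum = 74.4)
explained = lam / lam.sum()                    # proportion of total variance per component
cumulative = np.cumsum(explained)              # variance explained by the first k components

print(explained.round(3))    # [0.538 0.269 0.134 0.04  0.019]
print(cumulative.round(3))   # [0.538 0.806 0.941 0.981 1.   ] -> k = 2 already explains ~81%
```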

Reduced Dimension Space
- So, to generate the data in the new space:
  FinalData = RowFeatureVector x RowZeroMeanData
  - RowFeatureVector: matrix with the eigenvectors in the columns, transposed so that the eigenvectors are now in the rows, with the most significant eigenvector at the top
  - RowZeroMeanData: the mean-adjusted data, transposed, i.e. the data items are in each column, with each row holding a separate dimension
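A sketch of this transposed formulation, assuming NumPy and invented data; it produces the same scores as projecting the rows of the centered data onto the eigenvectors:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 3))
A = X - X.mean(axis=0)                            # mean-adjusted data, n x p

lam, Z = np.linalg.eigh(np.cov(A, rowvar=False))
Z = Z[:, np.argsort(lam)[::-1]]                   # most significant eigenvector first

RowFeatureVector = Z.T                            # eigenvectors as rows
RowZeroMeanData = A.T                             # one variable per row, one object per column

FinalData = RowFeatureVector @ RowZeroMeanData    # p x n (or k x n if rows of Z.T are truncated)
print(np.allclose(FinalData.T, A @ Z))            # True: same scores as projecting rows of A onto Z
```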

Example: Revisited
[The slide repeats the example's eigenvalues λ_1, λ_2, λ_3, the eigenvector matrix Z, and the centered data matrix A; the numeric values are not reproduced here]

Reduced Dimension Space
- Projecting the centered data onto all principal axes: U = Z^T A^T
- Taking only the top k = 1 principal component: U = Z_k^T A^T
[The resulting numeric matrices from the example are not reproduced here]
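A final sketch of the k = 1 case, again on invented data with NumPy: project onto the top eigenvector and (optionally) map back to the original space to see what the rank-1 approximation loses.

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.normal(size=(10, 3)) @ rng.normal(size=(3, 3))
A = A - A.mean(axis=0)                    # centered data, n x p

lam, Z = np.linalg.eigh(np.cov(A, rowvar=False))
Z = Z[:, np.argsort(lam)[::-1]]           # eigenvectors ordered by decreasing eigenvalue

k = 1
U = Z[:, :k].T @ A.T                      # U = Z_k^T A^T : k x n scores on the top component
A_approx = (Z[:, :k] @ U).T               # rank-k reconstruction back in the original space
print(np.linalg.norm(A - A_approx))       # reconstruction error comes from the discarded PCs
```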