Object Orie'd Data Analysis, Last Time
- Finished Algebra Review
- Multivariate Probability Review
- PCA as an Optimization Problem (eigen-decomposition gives rotation, easy solution)
- Connected Mathematics & Graphics
- Started Redistribution of Energy

PCA Redistribution of Energy
Convenient summary of the amount of structure: Total Sum of Squares, $\sum_{i=1}^{n} \|X_i\|^2$
Physical Interpretation: Total Energy in the Data
Insight comes from decomposition
Statistical Terminology: ANalysis Of VAriance (ANOVA)

PCA Redist'n of Energy (Cont.)
ANOVA mean decomposition:
Total Variation $= \sum_{i=1}^{n} \|X_i\|^2 =$ Mean Variation $+$ Mean Residual Variation $= n\|\bar{X}\|^2 + \sum_{i=1}^{n} \|X_i - \bar{X}\|^2$
Mathematics: Pythagorean Theorem
Intuition quantified via Sums of Squares
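As a quick numerical check of the ANOVA mean decomposition above (a minimal sketch, not from the slides; the rows-as-cases orientation and all values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=2.0, size=(20, 40))         # n = 20 cases as rows, d = 40 features

total_ss = np.sum(X ** 2)                      # Total Variation
mean_vec = X.mean(axis=0)                      # sample mean vector
mean_ss = X.shape[0] * np.sum(mean_vec ** 2)   # Mean Variation
resid_ss = np.sum((X - mean_vec) ** 2)         # Mean Residual Variation

# Pythagorean identity: total = mean + residual (up to float rounding)
assert np.isclose(total_ss, mean_ss + resid_ss)
print(mean_ss / total_ss, resid_ss / total_ss) # shares of the total SS
```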

Connect Math to Graphics (Cont.)
2-d Toy Example [figure: Feature Space and Object Space views]
Residuals from Mean = Data − Mean
Most of the Variation: 92% is Mean Variation SS
Remaining Variation: 8% is Residual Variation SS

PCA Redist'n of Energy (Cont.)
Have already studied this decomposition (recall the curve example):
- Variation due to Mean (% of total)
- Variation of Mean Residuals (% of total)

PCA Redist'n of Energy (Cont.)
Now decompose the SS about the mean:
$\sum_{i=1}^{n} \|X_i - \bar{X}\|^2 = (n-1)\,\mathrm{tr}(\hat{\Sigma}) = (n-1)\sum_{j=1}^{d} \lambda_j$
where $\hat{\Sigma}$ is the sample covariance matrix with eigenvalues $\lambda_1 \ge \cdots \ge \lambda_d$.
Energy is expressed in the trace of the covariance matrix.

PCA Redist'n of Energy (Cont.)
Eigenvalues provide the atoms of the SS decomposition. Useful plots are:
- "Power Spectrum": $\lambda_j$ vs. $j$
- "log Power Spectrum": $\log \lambda_j$ vs. $j$
- "Cumulative Power Spectrum": $\sum_{k=1}^{j} \lambda_k$ vs. $j$
Note: PCA gives the SS's for free (as eigenvalues), but watch factors of $(n-1)$.
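A minimal sketch of computing these spectra with numpy (not from the slides; the dimensions and zero-tolerance are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 40))              # n = 20 rows, d = 40 columns

Xc = X - X.mean(axis=0)                    # center about the mean
cov = Xc.T @ Xc / (X.shape[0] - 1)         # sample covariance; note the (n-1) factor
lam = np.linalg.eigvalsh(cov)[::-1]        # eigenvalues, sorted decreasing

power = lam                                # "Power Spectrum": lambda_j vs. j
log_power = np.log(lam[lam > 1e-10])       # "log Power Spectrum" (drop numerical zeros)
cum_power = np.cumsum(lam) / lam.sum()     # "Cumulative Power Spectrum", as fractions
print(cum_power[:5])
```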

PCA Redist'n of Energy (Cont.)
Note: have already considered some of these useful plots:
- Power Spectrum
- Cumulative Power Spectrum

Connect Math to Graphics (Cont.)
2-d Toy Example [figure: Feature Space and Object Space views]
Revisit the SS Decomposition for PC1:
PC1 has "most of the variation" = 93%
Reflected by a good approximation in Object Space

Connect Math to Graphics (Cont.)
2-d Toy Example [figure: Feature Space and Object Space views]
Revisit the SS Decomposition, now for PC2:
PC2 has "only a little variation" = 7%
Reflected by a poor approximation in Object Space

Different Views of PCA
Solves several optimization problems:
1. Direction to maximize the SS of 1-d projected data
2. Direction to minimize the SS of residuals (the same direction, by the Pythagorean Theorem)
3. "Best fit line" to the data in the "orthogonal sense" (vs. regression of Y on X = vertical sense & regression of X on Y = horizontal sense)
Use the one that makes sense… (see the numerical sketch below)
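A small numerical illustration of views 1 and 2 (a sketch, not from the slides; the particular 2-d toy distribution is an assumption):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.multivariate_normal([0.0, 0.0], [[3.0, 1.0], [1.0, 1.0]], size=200)
Xc = X - X.mean(axis=0)

_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Vt[0]                                 # PC1 direction (unit vector)

def proj_ss(u):
    """SS of the 1-d data projected onto unit vector u."""
    return np.sum((Xc @ u) ** 2)

def resid_ss(u):
    """SS of the orthogonal residuals about the line through u."""
    return np.sum(Xc ** 2) - proj_ss(u)     # Pythagorean Theorem

# PC1 simultaneously maximizes projected SS and minimizes residual SS:
for theta in np.linspace(0.0, np.pi, 7):
    u = np.array([np.cos(theta), np.sin(theta)])
    print(f"proj SS = {proj_ss(u):8.1f}   resid SS = {resid_ss(u):8.1f}")
print(f"PC1: proj SS = {proj_ss(pc1):8.1f}   resid SS = {resid_ss(pc1):8.1f}")
```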

Different Views of PCA
Next time: add some graphics about this
- Scatterplot of toy data sets + various fits, with residuals
- Will be useful in Stor 165 as well

Different Views of PCA
2-d Toy Example [figure: Feature Space and Object Space views]
1. Max SS of Projected Data
2. Min SS of Residuals
3. Best Fit Line

PCA Data Representation
Idea: expand the data matrix in terms of inner products & eigenvectors.
Recall notation: data matrix $X = [X_1 \cdots X_n]$ (columns as data vectors), eigenvectors $u_1, \ldots, u_d$ of the covariance matrix.
Eigenvalue expansion (centered data): $\tilde{X} = \sum_{j=1}^{d} u_j \left( u_j^T \tilde{X} \right)$

PCA Data Represent'n (Cont.)
Now using the eigenvalue expansion for the raw data:
$X_i = \bar{X} + \sum_{j=1}^{d} c_{ij} u_j$, where $c_{ij} = u_j^T (X_i - \bar{X})$
- the entries of the eigenvectors $u_j$ are loadings
- the coefficients $c_{ij}$ are scores

PCA Data Represent'n (Cont.)
Can focus on individual data vectors:
$X_i = \bar{X} + \sum_{j=1}^{d} c_{ij} u_j$
(part of the above full matrix representation)
Terminology: the coefficient vectors $(c_{1j}, \ldots, c_{nj})$ are called "PCs", and their entries are also called scores

PCA Data Represent'n (Cont.)
More terminology:
- Scores, $c_{ij} = u_j^T (X_i - \bar{X})$, are the coefficients in the eigenvalue representation
- Loadings are the entries of the eigenvectors $u_j$

PCA Data Represent'n (Cont.)
Reduced Rank Representation: reconstruct using only the first $k \le d$ terms (assuming decreasing eigenvalues):
$X_i \approx \bar{X} + \sum_{j=1}^{k} c_{ij} u_j$
Gives: rank $k$ approximation of the data
Key to PCA dimension reduction, and to PCA for data compression (~ .jpeg)
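A minimal sketch of the rank-$k$ reconstruction via the SVD (not from the slides; the dimensions and the choice of $k$ are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(40, 20))                  # d = 40 rows, n = 20 columns as data vectors

xbar = X.mean(axis=1, keepdims=True)           # mean vector (d x 1)
Xc = X - xbar                                  # centered data

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)   # loadings in U, scores in diag(s) @ Vt

k = 4                                          # number of components kept
X_k = xbar + U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]  # rank-k approximation of the data

# The residual shrinks as k grows; k = n - 1 recovers the centered data exactly
rel_err = np.linalg.norm(X - X_k) / np.linalg.norm(Xc)
print(f"relative residual at k = {k}: {rel_err:.3f}")
```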

PCA Data Represent'n (Cont.)
Choice of $k$ in the Reduced Rank Represent'n:
- Generally a very slippery problem
- SCREE plot (Kruskal 1964): find the knee in the power spectrum

PCA Data Represent'n (Cont.)
SCREE plot drawbacks:
- What is a knee?
- What if there are several?
- Knees depend on scaling (power? log?)
Personal suggestion:
- Find auxiliary cutoffs (inter-rater variation)
- Use the full range (à la scale space)

PCA Simulation
Idea: given a mean vector $\mu$, eigenvectors $u_1, \ldots, u_d$, and eigenvalues $\lambda_1, \ldots, \lambda_d$, simulate data from the corresponding Normal Distribution.
Approach: invert the PCA Data Represent'n:
$X = \mu + \sum_{j=1}^{d} \sqrt{\lambda_j}\, Z_j u_j$, where $Z_1, \ldots, Z_d$ are i.i.d. $N(0,1)$
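A sketch of this simulation idea in numpy (not from the slides; the particular mean, eigenvectors, and eigenvalues are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
d, n = 5, 10_000
mu = np.arange(d, dtype=float)                     # given mean vector
U, _ = np.linalg.qr(rng.normal(size=(d, d)))       # given orthonormal eigenvectors
lam = np.array([5.0, 2.0, 1.0, 0.5, 0.1])          # given decreasing eigenvalues

Z = rng.normal(size=(d, n))                        # i.i.d. N(0, 1) coefficients
X = mu[:, None] + U @ (np.sqrt(lam)[:, None] * Z)  # inverted PCA representation

# Sanity check: the sample covariance should approximate U diag(lam) U^T
print(np.abs(np.cov(X) - U @ np.diag(lam) @ U.T).max())
```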

Alternate PCA Computation
Issue: for HDLSS data (recall $d \gg n$), the $d \times d$ covariance matrix may be quite large, and thus slow to compute and to work with.
What about a shortcut?
Approach: Singular Value Decomposition of the (centered, scaled) data matrix $X$

Alternate PCA Computation
Singular Value Decomposition: $X = U S V^T$
where:
- $U$ and $V$ are unitary
- $S$ is the diagonal matrix of singular values
Assume decreasing singular values: $s_1 \ge s_2 \ge \cdots \ge 0$

Alternate PCA Computation
Singular Value Decomposition: $X = U S V^T$
Recall the relation to the eigen-analysis of $X X^T = U S^2 U^T$
Thus we have the same eigenvector matrix $U$, and the eigenvalues are the squares of the singular values.

Alternate PCA Computation
Singular Value Decomposition, computational advantage (for rank $r$):
Use the compact form; only need to find the $r$ eigenvectors, $r$ singular values, and the scores.
The other components are not useful, so this can be much faster for $d \gg n$.
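A sketch of the shortcut in numpy (not from the slides; the HDLSS dimensions are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
d, n = 10_000, 50                              # HDLSS: d >> n
X = rng.normal(size=(d, n))
Xc = X - X.mean(axis=1, keepdims=True)

# Compact SVD of the d x n matrix costs O(d n^2) work, versus
# forming and eigen-decomposing the full d x d covariance matrix.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

eigvals = s ** 2 / (n - 1)                     # eigenvalues = squared singular values / (n-1)
scores = np.diag(s) @ Vt                       # n x n matrix of scores
print(U.shape, s.shape, scores.shape)          # only the n needed components are computed
```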

Alternate PCA Computation
Another variation: Dual PCA
Motivation: recall that for the demography data it was useful to view the matrix both ways, rows as data & columns as data.

Alternate PCA Computation
Useful terminology (from optimization):
- Primal PCA problem: columns as data
- Dual PCA problem: rows as data

Alternate PCA Computation
Dual PCA computation: same as above, but replace $X$ with $X^T$.
So can almost replace the covariance matrix $\hat{\Sigma}$ with its dual version.
Then use the SVD, $X^T = V S U^T$, to get the dual eigenvectors, singular values, and scores.
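A minimal numerical sketch of the primal/dual relationship (not from the slides; the centering conventions shown are assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(40, 20))                  # d x n, columns as data (primal view)

# Primal PCA: subtract the primal mean (mean of the column vectors)
Xp = X - X.mean(axis=1, keepdims=True)
_, sp, _ = np.linalg.svd(Xp, full_matrices=False)

# Dual PCA: transpose (rows become the data), subtract the dual mean
Xd = X.T - X.T.mean(axis=1, keepdims=True)
_, sd, _ = np.linalg.svd(Xd, full_matrices=False)

# With no centering, the two SVDs would be exact transposes of each other,
# so loadings and scores would simply swap roles; the two different means
# break that symmetry (the "problem with the means" noted on the next slide).
print(sp[:5])
print(sd[:5])
```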

Alternate PCA Computation
Appears to be a cool symmetry:
- Primal ↔ Dual
- Loadings ↔ Scores
But there is a problem with the means…

Alternate PCA Computation
Next time: explore the Loadings & Scores issue more deeply, with an explicit look at notation…

Primal - Dual PCA
Note the different "mean vectors":
- Primal Mean = mean of the column vectors: $\bar{X}_P = \frac{1}{n}\sum_{j=1}^{n} X_j$ (a length-$d$ vector)
- Dual Mean = mean of the row vectors: $\bar{X}_D$ (a length-$n$ vector)

Primal - Dual PCA
Primal PCA, based on the SVD of the primal data: $X_P = X - \bar{X}_P \mathbf{1}_n^T$
Dual PCA, based on the SVD of the dual data: $X_D = X^T - \bar{X}_D \mathbf{1}_d^T$
Very similar, except for:
- different centerings
- different row–column interpretation

Primal - Dual PCA
Next time: get the factors of $(n-1)$ straight. Maybe best to dispense with them in the definitions of $X_P$ and $X_D$…

Primal - Dual PCA
Toy Example 1: random curves, all in Primal Space:
* Constant Shift
* Linear
* Quadratic
* Cubic
(chosen to be orthonormal)
Plus (small) i.i.d. Gaussian noise; d = 40, n = 20

Primal - Dual PCA Toy Example 1: Raw Data

Primal - Dual PCA
Toy Example 1: Raw Data
- Primal (Col.) curves similar to before
- Data matrix asymmetric (but same curves)
- Dual (Row) curves much rougher (showing the Gaussian randomness, i.e. how the data were generated)
- Color map useful? (same as mesh view)
- See richer structure than before. Is it useful?

Primal - Dual PCA Toy Example 1: Primal PCA Column Curves as Data

Primal - Dual PCA
Toy Example 1: Primal PCA
- Expected to recover the increasing polynomials, but that didn't happen
- Although one can see the polynomials (order???)
- Mean has a quadratic component (since only n = 20???)
- Scores (projections) very random
- Power Spectrum shows 4 components (not affected by subtracting the Primal Mean)

Primal - Dual PCA Toy Example 1: Dual PCA Row Curves as Data

Primal - Dual PCA
Toy Example 1: Dual PCA
- Curves all very wiggly (random noise)
- Mean much bigger: 54% of Total Variation!
- Scores have strong smooth structure (reflecting the ordered primal eigenvectors; recall primal eigenvectors ↔ dual scores)
- Power Spectrum shows 3 components (driven by subtracting the Dual Mean)
- The Primal – Dual mean difference is critical