Object Orie’d Data Analysis, Last Time: Finished NCI 60 Data; Linear Algebra Review; Multivariate Probability Review; PCA as an Optimization Problem (Eigen-decomp. gives rotation, easy sol’n); Connected Mathematics & Graphics

Presentation transcript:

Object Orie’d Data Analysis, Last Time: Finished NCI 60 Data; Linear Algebra Review; Multivariate Probability Review; PCA as an Optimization Problem (Eigen-decomp. gives rotation, easy sol’n); Connected Mathematics & Graphics.

Class Listserv: Tested on Thursday evening, 9/8/05. If you did not get the e-mail, please add yourself to the list, using the instructions at the bottom of the class web page.

PCA Redistribution of Energy. Convenient summary of amount of structure: Total Sum of Squares $\sum_{i=1}^{n} \|X_i\|^2$. Physical Interpretation: total energy in data. Insight comes from decomposition. Statistical terminology: ANalysis Of VAriance (ANOVA).

PCA Redist’n of Energy (Cont.) ANOVA mean decomposition: Total Variation = $\sum_{i=1}^{n} \|X_i\|^2 = n\|\bar{X}\|^2 + \sum_{i=1}^{n} \|X_i - \bar{X}\|^2$ = Mean Variation + Mean Residual Variation. Mathematics: Pythagorean Theorem. Intuition quantified via Sums of Squares.
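This Pythagorean split is easy to check numerically. A minimal numpy sketch (my illustration, not from the slides), assuming the d x n convention with columns as data vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 100
X = rng.normal(loc=2.0, size=(d, n))     # toy data matrix, nonzero mean

xbar = X.mean(axis=1, keepdims=True)     # mean vector (d x 1)

total_ss = np.sum(X ** 2)                # Total Variation
mean_ss = n * np.sum(xbar ** 2)          # Mean Variation
resid_ss = np.sum((X - xbar) ** 2)       # Mean Residual Variation

print(np.allclose(total_ss, mean_ss + resid_ss))   # True: Pythagorean Theorem
```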

Connect Math to Graphics (Cont.) 2-d Toy Example (Feature Space & Object Space views). Residuals from Mean = Data - Mean. Most of variation = 92% is Mean Variation SS; remaining variation = 8% is Resid. Var. SS.

PCA Redist’n of Energy (Cont.) Now decompose SS about the mean: $\sum_{i=1}^{n} \|X_i - \bar{X}\|^2 = \operatorname{tr}\big(\sum_{i=1}^{n}(X_i - \bar{X})(X_i - \bar{X})^t\big)$, where energy is expressed in the trace of the covar’ce matrix.

PCA Redist’n of Energy (Cont.) Eigenvalues provide atoms of SS decomposi’n. Useful plots are: “Power Spectrum”: $\hat{\lambda}_j$ vs. $j$; “log Power Spectrum”: $\log \hat{\lambda}_j$ vs. $j$; “Cumulative Power Spectrum”: $\sum_{k=1}^{j} \hat{\lambda}_k$ vs. $j$. Note PCA gives SS’s for free (as eigenvalues), but watch factors of $n$ (or $n-1$) from the covariance normalization.
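The three spectra are all one-liners once the eigenvalues are in hand; a hedged numpy/matplotlib sketch (my illustration, with an arbitrary toy covariance and the 1/n normalization assumed):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
d, n = 10, 200
X = rng.normal(size=(d, n)) * np.arange(d, 0, -1)[:, None]   # decaying scales

Xc = X - X.mean(axis=1, keepdims=True)
lam = np.linalg.eigvalsh(Xc @ Xc.T / n)[::-1]    # eigenvalues, decreasing
j = np.arange(1, d + 1)

fig, ax = plt.subplots(1, 3, figsize=(12, 3))
ax[0].plot(j, lam, "o-")
ax[0].set_title("Power Spectrum")
ax[1].plot(j, np.log(lam), "o-")
ax[1].set_title("log Power Spectrum")
ax[2].plot(j, np.cumsum(lam), "o-")
ax[2].set_title("Cumulative Power Spectrum")
plt.show()
```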

PCA Redist’n of Energy (Cont.) Note, have already considered some of these useful plots: Power Spectrum, Cumulative Power Spectrum.

Connect Math to Graphics (Cont.) 2-d Toy Example (Feature Space & Object Space views). Revisit SS Decomposition for PC1: PC1 has “most of var’n” = 93%, reflected by good approximation in Object Space.

Connect Math to Graphics (Cont.) 2-d Toy Example (Feature Space & Object Space views). Revisit SS Decomposition for PC1: PC2 has “only a little var’n” = 7%, reflected by poor approximation in Object Space.

Different Views of PCA. Solves several optimization problems: 1. Direction to maximize SS of 1-d proj’d data; 2. Direction to minimize SS of residuals (same, by Pythagorean Theorem); 3. “Best fit line” to data in “orthogonal sense” (vs. regression of Y on X = vertical sense & regression of X on Y = horizontal sense). Use the one that makes sense…
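The equivalence of views 1 and 2 can be seen numerically by scanning candidate unit directions; a small sketch (my own, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.multivariate_normal([0, 0], [[3, 1], [1, 1]], size=200).T  # 2 x n
Xc = X - X.mean(axis=1, keepdims=True)

thetas = np.linspace(0, np.pi, 1000)
proj_ss, resid_ss = [], []
for t in thetas:
    u = np.array([np.cos(t), np.sin(t)])        # candidate unit direction
    scores = u @ Xc                             # 1-d projections
    proj_ss.append(np.sum(scores ** 2))
    resid_ss.append(np.sum((Xc - np.outer(u, scores)) ** 2))

# the same direction wins both problems:
print(thetas[np.argmax(proj_ss)], thetas[np.argmin(resid_ss)])
```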

Different Views of PCA. 2-d Toy Example (Feature Space & Object Space views): 1. Max SS of Projected Data; 2. Min SS of Residuals; 3. Best Fit Line.

PCA Data Representation. Idea: expand data matrix in terms of inner prod’ts & eigenvectors. Recall notation: $X = [X_1 \cdots X_n]$ ($d \times n$), eigenvectors $u_1, \dots, u_d$. Eigenvalue expansion (centered data): $X - \bar{X}\mathbf{1}_n^t = \sum_{j=1}^{d} u_j \big(u_j^t (X - \bar{X}\mathbf{1}_n^t)\big)$.

PCA Data Represent’n (Cont.) Now using: $S = U^t (X - \bar{X}\mathbf{1}_n^t)$. Eigenvalue expansion (raw data): $X = \bar{X}\mathbf{1}_n^t + U S$, where: entries of $U$ are loadings, entries of $S$ are scores.
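In code, the loadings/scores factorization is a few lines; a sketch assuming the d x n, columns-as-data convention (variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(3)
d, n = 4, 50
X = rng.normal(size=(d, n)) + 5.0

xbar = X.mean(axis=1, keepdims=True)
Xc = X - xbar

lam, U = np.linalg.eigh(Xc @ Xc.T / n)   # U: loadings (eigenvector matrix)
order = np.argsort(lam)[::-1]            # sort by decreasing eigenvalue
lam, U = lam[order], U[:, order]

S = U.T @ Xc                             # S: scores
print(np.allclose(X, xbar + U @ S))      # True: eigenvalue expansion, raw data
```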

PCA Data Represent’n (Cont.) Can focus on individual data vectors: $X_i = \bar{X} + \sum_{j=1}^{d} s_{ji} u_j$ (part of above full matrix rep’n).

PCA Data Represent’n (Cont.) Reduced Rank Representation: reconstruct using only $k < d$ terms (assuming decreasing eigenvalues): $X_i \approx \bar{X} + \sum_{j=1}^{k} s_{ji} u_j$. Gives: rank $k$ approximation of data. Key to PCA data reduction, and PCA for data compression (~ .zip).
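A rank-k reconstruction sketch via the SVD of the centered matrix (equivalent to keeping k eigenvector terms; k and the toy sizes are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(4)
d, n, k = 40, 20, 3
X = rng.normal(size=(d, n))

xbar = X.mean(axis=1, keepdims=True)
U, sv, Vt = np.linalg.svd(X - xbar, full_matrices=False)

X_k = xbar + U[:, :k] @ np.diag(sv[:k]) @ Vt[:k, :]   # keep k terms only
err = np.sum((X - X_k) ** 2)                          # discarded energy
print(err, np.sum(sv[k:] ** 2))                       # the two agree
```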

PCA Data Represent’n (Cont.) Choice of $k$ in Reduced Rank Represent’n: generally very slippery problem. SCREE plot (Kruskal 1964): find knee in power spectrum.

PCA Data Represent’n (Cont.) SCREE plot drawbacks: What is a knee? What if there are several? Knees depend on scaling (power? log?). Personal suggestion: find auxiliary cutoffs (inter-rater variation); use the full range (a la scale space).

PCA Simulation. Idea: given mean vector $\mu$, eigenvectors $u_1, \dots, u_d$, and eigenvalues $\hat{\lambda}_1, \dots, \hat{\lambda}_d$, simulate data from the corresponding Normal Distribution. Approach: invert PCA data represent’n: $X = \mu + \sum_{j=1}^{d} \hat{\lambda}_j^{1/2} Z_j u_j$, where $Z_1, \dots, Z_d$ are i.i.d. $N(0,1)$.
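A simulation sketch under these assumptions (the mean, eigenvectors, and eigenvalues below are arbitrary illustrations of mine):

```python
import numpy as np

rng = np.random.default_rng(5)
d, n = 3, 1000
mu = np.array([1.0, -2.0, 0.5])                 # given mean vector
U = np.linalg.qr(rng.normal(size=(d, d)))[0]    # given orthonormal eigenvectors
lam = np.array([4.0, 1.0, 0.25])                # given eigenvalues

Z = rng.standard_normal((d, n))                 # i.i.d. N(0,1) coefficients
X = mu[:, None] + U @ (np.sqrt(lam)[:, None] * Z)

# sample covariance should approximate U diag(lam) U^t
print(np.round(np.cov(X) - U @ np.diag(lam) @ U.T, 1))
```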

Alternate PCA Computation. Issue: for HDLSS data (recall $d \gg n$) the $d \times d$ covariance matrix may be quite large, thus slow to work with, and to compute. What about a shortcut? Approach: Singular Value Decomposition (of data matrix $X$).

Alternate PCA Computation. Singular Value Decomposition: $X = U S V^t$, where: $U$ and $V$ are unitary, and $S$ is a diag’l matrix of singular val’s. Assume: decreasing singular values $s_1 \ge s_2 \ge \cdots \ge 0$.

Alternate PCA Computation. Singular Value Decomposition: recall relation to eigen-analysis of $X X^t = U S V^t V S^t U^t = U S S^t U^t$. Thus have same eigenvector matrix $U$, and eigenval’s are squares of singular val’s: $\hat{\lambda}_j = s_j^2$.

Alternate PCA Computation. Singular Value Decomposition, computational advantage: use compact form; only need to find $u_1, \dots, u_n$ (e-vec’s), $s_1, \dots, s_n$ (e-val’s), and scores. Other components not useful, so can be much faster for $d \gg n$.
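A sketch of the shortcut: numpy's economy-size SVD (full_matrices=False) computes exactly this compact form, never forming the d x d matrix:

```python
import numpy as np

rng = np.random.default_rng(6)
d, n = 10_000, 20                        # HDLSS: d >> n
X = rng.normal(size=(d, n))
Xc = X - X.mean(axis=1, keepdims=True)

U, sv, Vt = np.linalg.svd(Xc, full_matrices=False)  # compact form: U is d x n
lam = sv ** 2                            # eigenvalues (up to the 1/n factor)
scores = sv[:, None] * Vt                # n x n matrix of scores
print(U.shape, scores.shape)             # (10000, 20) (20, 20)
```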

Alternate PCA Computation. Another variation: Dual PCA. Motivation: recall for demography data, it was useful to view the matrix both with rows as data & columns as data.

Alternate PCA Computation. Useful terminology (from optimization): Primal PCA problem: columns as data. Dual PCA problem: rows as data.

Alternate PCA Computation. Dual PCA computation: same as above, but replace $X$ with $X^t$, so can almost replace the covariance of the columns with the covariance of the rows. Then use SVD, $X^t = V S U^t$, to get the dual eigenvectors and scores.
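A dual PCA sketch (my illustration): the same computation applied to the transpose, with the dual (row) mean removed:

```python
import numpy as np

rng = np.random.default_rng(7)
d, n = 40, 20
X = rng.normal(size=(d, n))

Xd = X.T                                     # dual view: rows as data
dual_mean = Xd.mean(axis=1, keepdims=True)   # mean of row vectors
V, sv_dual, Ut = np.linalg.svd(Xd - dual_mean, full_matrices=False)

# "almost": the centering differs from the primal case, so these singular
# values do not match the primal ones exactly
print(sv_dual[:3])
```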

Alternate PCA Computation. Appears to be cool symmetry: Primal ↔ Dual; Loadings ↔ Scores. But, there is a problem with the means…

Primal - Dual PCA. Note different “mean vectors”: Primal Mean = mean of col. vec’s: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$; Dual Mean = mean of row vec’s (the corresponding average over the $d$ rows of $X$).

Primal - Dual PCA. Primal PCA, based on SVD of primal data: $X$ with the column mean subtracted. Dual PCA, based on SVD of dual data: $X^t$ with the row mean subtracted. Very similar, except: different centerings; different row-column interpretation.

Primal - Dual PCA. Toy Example 1: random curves, all in Primal Space: * Constant Shift * Linear * Quadratic * Cubic (chosen to be orthonormal), plus (small) i.i.d. Gaussian noise; d = 40, n = 20.

Primal - Dual PCA Toy Example 1: Raw Data

Primal - Dual PCA. Toy Example 1: Raw Data. Primal (col.) curves similar to before; data mat’x asymmetric (but same curves). Dual (row) curves much rougher (showing Gaussian randomness): how data were generated. Color map useful? (same as mesh view) See richer structure than before. Is it useful?

Primal - Dual PCA Toy Example 1: Primal PCA Column Curves as Data

Primal - Dual PCA. Toy Example 1: Primal PCA. Expected to recover increasing poly’s, but didn’t happen, although can see the poly’s (order???). Mean has quad’ic (since only n = 20???). Scores (proj’ns) very random. Power Spectrum shows 4 components (not affected by subtracting Primal Mean).

Primal - Dual PCA Toy Example 1: Dual PCA Row Curves as Data

Primal - Dual PCA. Toy Example 1: Dual PCA. Curves all very wiggly (random noise). Mean much bigger, 54% of Total Var! Scores have strong smooth structure (reflecting ordered primal e.v.’s) (recall primal e.v. ↔ dual scores). Power Spectrum shows 3 components (driven by subtraction of the Dual Mean). Primal - Dual mean difference is critical.

Primal - Dual PCA Toy Example 1: Dual PCA – Scatterplot

Primal - Dual PCA. Toy Example 1: Dual PCA - Scatterplot. Smooth curve structure, but not usual curves (since 1-d curves not quite poly’s), and only in 1st 3 components: recall only 3 non-noise components, since constant curve went into mean (dual). Remainder is pure noise. Suggests wrong rotation of axes???

Primal - Dual PCA. A 3rd type of analysis, called “SVD decomposition”. Main point: subtract neither mean. Viewed as a serious competitor. Advantage: gives best Mean Square Approximation of the data matrix. Vs. Primal PCA: best about col. mean. Vs. Dual PCA: best about row mean. Difference in means is critical!
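An Eckart-Young illustration of this point (my sketch, not from the slides): the truncated SVD of the raw matrix beats any other matrix of the same rank in mean square, including a mean-plus-components reconstruction of matching rank:

```python
import numpy as np

rng = np.random.default_rng(8)
X = rng.normal(size=(40, 20)) + 3.0          # raw data, mean not subtracted
k = 2

U, sv, Vt = np.linalg.svd(X, full_matrices=False)
X_k = U[:, :k] @ np.diag(sv[:k]) @ Vt[:k, :]   # best rank-k fit to X itself

# a competitor of the same rank: column mean (rank 1) plus k-1 PC terms
xbar = X.mean(axis=1, keepdims=True)
Uc, svc, Vtc = np.linalg.svd(X - xbar, full_matrices=False)
X_pca = xbar + Uc[:, :k-1] @ np.diag(svc[:k-1]) @ Vtc[:k-1, :]

print(np.sum((X - X_k) ** 2) <= np.sum((X - X_pca) ** 2))   # True
```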

Primal - Dual PCA Toy Example 1: SVD – Curves view

Primal - Dual PCA. Toy Example 1: SVD Curves View. Col. curves view similar to Primal PCA. Row curves quite different (from dual): former mean, now SV1; former PC1, now SV2; i.e. very similar shapes with shifted indices. Again mean centering is crucial: main difference between PCAs and SVD.

Primal - Dual PCA Toy Example 1: SVD – Mesh-Image View

Primal - Dual PCA. Toy Example 1: SVD Mesh-Image View. Think about decomposition into modes of variation: Constant x Gaussian; Linear x Gaussian; Cubic x Gaussian; Quadratic. Shows up best in image view? Why is ordering “wrong”???

Primal - Dual PCA. Toy Example 1: All Primal. Why is SVD mode ordering “wrong”??? Well, not expected… Key is the need for orthogonality: present in space of column curves, but only approximate in row Gaussians. The implicit orthogonalization of SVD (both rows and columns) gave a mixture of the poly’s.

Primal - Dual PCA. Toy Example 2: All Primal, GS Noise. Started with same column space. Generated i.i.d. Gaussian rows, then did Gram-Schmidt ortho-normalization (in row space); see the sketch below. Visual impression: amazingly similar to original data (used same seeds of random # generators).
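A sketch of the GS step (my illustration, assuming the four Gaussian coefficient vectors of the modes are what is orthonormalized): QR factorization performs Gram-Schmidt orthonormalization, up to signs:

```python
import numpy as np

rng = np.random.default_rng(9)
n, n_modes = 20, 4
G = rng.standard_normal((n, n_modes))   # 4 i.i.d. Gaussian vectors (columns)

Q, _ = np.linalg.qr(G)                  # QR = Gram-Schmidt, up to signs
print(np.allclose(Q.T @ Q, np.eye(n_modes)))   # True: orthonormal
```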

Primal - Dual PCA Toy Example 2: Raw Data

Primal - Dual PCA Compare with Earlier Toy Example 1

Primal - Dual PCA. Toy Example 2: Primal PCA, column curves as data. Shows the explanation (of the wrong components) was correct.

Primal - Dual PCA. Toy Example 2: Dual PCA, row curves as data. Still have big mean, but scores look much better.

Primal - Dual PCA Toy Example 2: Dual PCA – Scatterplot

Primal - Dual PCA. Toy Example 2: Dual PCA - Scatterplot. Now poly’s look beautifully symmetric, much like the chemo spectrum examples. But still only 3 (same reason: dual mean ~ primal constants). Last one is pure noise.

Primal - Dual PCA Toy Example 2: SVD – Matrix-Image

Primal - Dual PCA. Toy Example 2: SVD - Matrix-Image. Similar good effects. Again have all 4 components. So “better” to not subtract mean???