Object Orie’d Data Analysis, Last Time: Finished NCI 60 Data; Started detailed look at PCA; Reviewed linear algebra. Today: More linear algebra, Multivariate Probability Distribution, PCA as an optimization problem.

Detailed Look at PCA Three important (and interesting) viewpoints: 1. Mathematics 2. Numerics 3. Statistics. 1st: Review linear alg. and multivar. prob.

Review of Linear Algebra (Cont.) Singular Value Decomposition (SVD): For a matrix $X$ ($d \times n$), find a diagonal matrix $S$, with nonnegative entries $s_1, s_2, \ldots$ called singular values, and unitary (rotation) matrices $U$ ($d \times d$) and $V$ ($n \times n$) (recall $U^t U = I$, $V^t V = I$), so that $X = U S V^t$.
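
As a concrete illustration (not part of the original slides), here is a minimal numpy sketch that computes a full SVD of a made-up 2×3 matrix and checks the factorization and the rotation property; the toy matrix and variable names are assumptions of the example.

```python
import numpy as np

# Toy d x n matrix (rows = features, columns = data objects); illustrative only.
X = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 0.0]])

# Full SVD: X = U @ S @ Vt, with U, Vt rotations and S diagonal (as a matrix).
U, s, Vt = np.linalg.svd(X, full_matrices=True)   # s holds the singular values
S = np.zeros(X.shape)
S[:len(s), :len(s)] = np.diag(s)

# Check the factorization and the rotation property U^t U = I.
print(np.allclose(X, U @ S @ Vt))                 # True
print(np.allclose(U.T @ U, np.eye(U.shape[0])))   # True
```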

Review of Linear Algebra (Cont.) Intuition behind Singular Value Decomposition: For a “linear transf’n” $x \mapsto X x$ (via matrix multi’n): First rotate (by $V^t$), Second rescale coordinate axes (by $S$), Third rotate again (by $U$), i.e. have diagonalized the transformation.

Review of Linear Algebra (Cont.) SVD Compact Representation: Useful Labeling: Singular Values in Decreasing Order, $s_1 \ge s_2 \ge \cdots \ge 0$. Note: singular values = 0 can be omitted. Let $r$ = # of positive singular values. Then: $X = U_r S_r V_r^t$, where $U_r$, $S_r$, $V_r$ are truncations of $U$, $S$, $V$.
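
A small sketch of the compact representation under the same assumptions (numpy, a made-up rank-deficient matrix, and an illustrative tolerance for deciding which singular values count as positive):

```python
import numpy as np

X = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])          # rank 1, so one singular value is (numerically) 0

U, s, Vt = np.linalg.svd(X, full_matrices=False)
r = np.sum(s > 1e-12 * s[0])             # number of "positive" singular values (tolerance assumed)

# Compact representation: keep only the first r columns/rows.
Ur, sr, Vtr = U[:, :r], s[:r], Vt[:r, :]
print(r)                                          # 1
print(np.allclose(X, Ur @ np.diag(sr) @ Vtr))     # True
```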

Review of Linear Algebra (Cont.) SVD Full Representation: $X = U S V^t$, with $U$: $d \times d$, $S$: $d \times n$ (diagonal), $V^t$: $n \times n$.

Review of Linear Algebra (Cont.) SVD Reduced Representation: $X = U S V^t$, with $U$: $d \times d$, $S$: $d \times d$ (diagonal), $V^t$: $d \times n$. Assumes $d \le n$.

Review of Linear Algebra (Cont.) SVD Compact Representation: $X = U_r S_r V_r^t$, with $U_r$: $d \times r$, $S_r$: $r \times r$ (diagonal), $V_r^t$: $r \times n$.

Review of Linear Algebra (Cont.) Eigenvalue Decomposition: For a (symmetric) square matrix $X$ ($d \times d$), find a diagonal matrix $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$ and an orthonormal matrix $B$ (i.e. $B B^t = B^t B = I$), so that: $X B = B D$, i.e. $X = B D B^t$.
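
A minimal numpy sketch (again illustrative, with an assumed symmetric toy matrix) checking $X B = B D$ and $X = B D B^t$; note np.linalg.eigh returns eigenvalues in increasing order.

```python
import numpy as np

# Symmetric toy matrix, so the eigen-decomposition is real valued.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

evals, B = np.linalg.eigh(A)   # orthonormal eigenvectors in the columns of B
D = np.diag(evals)

print(np.allclose(A @ B, B @ D))          # X B = B D
print(np.allclose(A, B @ D @ B.T))        # X = B D B^t
print(np.allclose(B.T @ B, np.eye(2)))    # B orthonormal
```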

Review of Linear Algebra (Cont.) Eigenvalue Decomposition (cont.): Relation to Singular Value Decomposition (looks similar?): Eigenvalue decomposition “harder”, since it requires a square matrix $X$. Price is eigenvalue decomp’n is generally complex valued. Except for $X$ square and symmetric: then eigenvalue decomp. is real valued, and thus is the sing’r value decomp. with: $U = V = B$ and $S = D$ (for nonnegative eigenvalues).

Review of Linear Algebra (Cont.) Better View of Relationship: Singular Value Dec. ⇒ Eigenvalue Dec. Start with data matrix: $X$, with SVD: $X = U S V^t$. Create square, symmetric matrix: $X X^t$. Note that: $X X^t = U S V^t V S U^t = U S^2 U^t$. Gives Eigenanalysis, $B = U$, $D = S^2$.
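
A quick numerical check of this relationship, on an assumed random toy data matrix: the eigenvalues of $X X^t$ should equal the squared singular values of $X$.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 10))          # toy d x n data matrix, illustrative only

U, s, Vt = np.linalg.svd(X, full_matrices=False)
evals, B = np.linalg.eigh(X @ X.T)        # eigen-analysis of the square symmetric matrix

# Eigenvalues of X X^t are the squared singular values of X
# (eigh returns increasing order, svd decreasing, so sort before comparing).
print(np.allclose(np.sort(evals), np.sort(s**2)))   # True
# The eigenvectors agree with the left singular vectors, up to sign.
```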

Review of Linear Algebra (Cont.) Computation of Singular Value and Eigenvalue Decompositions: Details too complex to spend time here. A “primitive” of good software packages. Eigenvalues are unique. Columns of $B$ are called “eigenvectors”. Eigenvectors are “$\lambda$-stretched” by $X$: $X b_j = \lambda_j b_j$.

Review of Linear Algebra (Cont.) Eigenvalue Decomp. solves matrix problems: Inversion: $X^{-1} = B D^{-1} B^t$. Square Root: $X^{1/2} = B D^{1/2} B^t$. $X$ is positive (nonn’ve, i.e. semi) definite ⇔ all $\lambda_j > 0$ (resp. $\ge 0$).
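
A sketch of both matrix problems solved through the eigen-decomposition of an assumed positive definite toy matrix (numpy's inv is used only as a cross-check):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])                # symmetric, positive definite toy matrix

evals, B = np.linalg.eigh(A)

A_inv  = B @ np.diag(1.0 / evals)    @ B.T   # inversion:   B D^{-1} B^t
A_sqrt = B @ np.diag(np.sqrt(evals)) @ B.T   # square root: B D^{1/2} B^t

print(np.allclose(A_inv, np.linalg.inv(A)))   # True
print(np.allclose(A_sqrt @ A_sqrt, A))        # True
print(np.all(evals > 0))                      # positive definite <=> all eigenvalues > 0
```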

Recall Linear Algebra (Cont.) Moore-Penrose Generalized Inverse: For $X = U_r S_r V_r^t$ (compact SVD), define $X^{-} = V_r S_r^{-1} U_r^t$.

Recall Linear Algebra (Cont.) Easy to see this satisfies the definition of Generalized (Pseudo) Inverse: $X X^{-} X = X$, $X^{-} X X^{-} = X^{-}$, $X X^{-}$ symmetric, $X^{-} X$ symmetric.

Recall Linear Algebra (Cont.) Moore-Penrose Generalized Inverse: Idea: matrix inverse on non-null space of the corresponding linear transformation Reduces to ordinary inverse, in full rank case, i.e. for r = d, so could just always use this Tricky aspect: “>0 vs. = 0” & floating point arithmetic
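
A sketch of building the generalized inverse from the compact SVD, with an illustrative tolerance standing in for the “>0 vs. = 0” decision; numpy's built-in pinv is used only as a cross-check, and the toy matrix is assumed.

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [2.0, 4.0]])                # rank 1, not invertible

# Keep only singular values above a tolerance -- this is exactly the
# ">0 vs. = 0" floating point issue mentioned above.
U, s, Vt = np.linalg.svd(X)
tol = 1e-12 * s[0]                        # illustrative tolerance choice
r = np.sum(s > tol)
X_pinv = Vt[:r, :].T @ np.diag(1.0 / s[:r]) @ U[:, :r].T

print(np.allclose(X_pinv, np.linalg.pinv(X)))   # matches numpy's pseudoinverse
print(np.allclose(X @ X_pinv @ X, X))           # generalized inverse property
```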

Recall Linear Algebra (Cont.) Moore-Penrose Generalized Inverse: Folklore: most multivariate formulas involving matrix inversion “still work” when Generalized Inverse is used instead

Review of Multivariate Probability Given a “random vector”, $\underline{X} = (X_1, \ldots, X_d)^t$, a “center” of the distribution is the mean vector, $\mu = E\underline{X} = (E X_1, \ldots, E X_d)^t$.

Review of Multivariate Probability Given a “random vector”, $\underline{X}$, a “measure of spread” is the covariance matrix: $\Sigma = \mathrm{cov}(\underline{X}) = E\big[(\underline{X} - \mu)(\underline{X} - \mu)^t\big]$.

Review of Multivar. Prob. (Cont.) Covariance matrix: Nonneg’ve Definite (since all varia’s $\mathrm{var}(a^t \underline{X}) = a^t \Sigma a$ are $\ge 0$). Provides “elliptical summary of distribution”. Calculated via “outer product”: $\Sigma = E\big[(\underline{X} - \mu)(\underline{X} - \mu)^t\big]$.

Review of Multivar. Prob. (Cont.) Empirical versions: Given a random sample $\underline{X}_1, \ldots, \underline{X}_n$, estimate the theoretical mean $\mu$, with the sample mean: $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} \underline{X}_i$.

Review of Multivar. Prob. (Cont.) Empirical versions (cont.): And estimate the “theoretical cov.” $\Sigma$, with the “sample cov.”: $\hat{\Sigma} = \frac{1}{n-1} \sum_{i=1}^{n} (\underline{X}_i - \bar{X})(\underline{X}_i - \bar{X})^t$. Normalizations: $\frac{1}{n-1}$ gives unbiasedness, $\frac{1}{n}$ gives MLE in Gaussian case.
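
A short sketch of the empirical versions on a simulated d×n sample (the column-per-observation layout and the random data are assumptions of the example); note numpy's cov uses the 1/(n-1) normalization by default.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 20))          # d x n: columns are the sample X_1, ..., X_n
d, n = X.shape

xbar = X.mean(axis=1, keepdims=True)      # sample mean vector (d x 1)
Xc = X - xbar                             # mean residuals (centered data)

Sigma_unbiased = Xc @ Xc.T / (n - 1)      # 1/(n-1): unbiased
Sigma_mle      = Xc @ Xc.T / n            # 1/n:     MLE in the Gaussian case

# np.cov with rows as variables uses the 1/(n-1) normalization by default.
print(np.allclose(Sigma_unbiased, np.cov(X)))   # True
```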

Review of Multivar. Prob. (Cont.) Outer product representation: $\hat{\Sigma} = \frac{1}{n-1} \tilde{X} \tilde{X}^t$, where: $\tilde{X} = \big[\underline{X}_1 - \bar{X}, \ldots, \underline{X}_n - \bar{X}\big]$ is the $d \times n$ matrix of mean residuals.

PCA as an Optimization Problem Find “direction of greatest variability”:

PCA as Optimization (Cont.) Find “direction of greatest variability”: Given a “direction vector” $u$ (i.e. $\|u\| = 1$), projection of $\underline{X}_i$ in the direction $u$: $u^t \underline{X}_i$. Variability in the direction $u$: $\sum_{i=1}^{n} \big(u^t \underline{X}_i - u^t \bar{X}\big)^2$.

PCA as Optimization (Cont.) Variability in the direction $u$: $\sum_{i=1}^{n} \big(u^t \underline{X}_i - u^t \bar{X}\big)^2 = (n-1)\, u^t \hat{\Sigma} u$, i.e. (proportional to) a quadratic form in the covariance matrix $\hat{\Sigma}$. Simple solution comes from the eigenvalue representation of $\hat{\Sigma}$: $\hat{\Sigma} = B D B^t$, where $B$ is orthonormal, & $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$.

PCA as Optimization (Cont.) Variability in the direction $u$: $u^t \hat{\Sigma} u = u^t B D B^t u$. But $v = B^t u$ = “$B^t$ transform of $u$” = “$u$ rotated into the eigenvector coordinates”, and the diagonalized quadratic form becomes $v^t D v = \sum_{j=1}^{d} \lambda_j v_j^2$.

PCA as Optimization (Cont.) Now since $B$ is an orthonormal basis matrix, $\|v\| = \|B^t u\| = \|u\| = 1$, and so $\sum_{j} v_j^2 = 1$. So the rotation gives a distribution of the (unit) energy of $u$ over the eigen-directions. And $\sum_{j} \lambda_j v_j^2$ is max’d (over unit vectors $u$), by putting all energy in the “largest direction”, i.e. $u = b_1$, where “eigenvalues are ordered”, $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d$.
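
A numerical sanity check of this claim on simulated 2-d data (the toy covariance is an assumption): the variability in the leading eigenvector direction should dominate that of any random unit direction.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.multivariate_normal([0, 0], [[3.0, 1.0], [1.0, 1.0]], size=200).T   # 2 x n toy data

Xc = X - X.mean(axis=1, keepdims=True)
Sigma_hat = Xc @ Xc.T / (X.shape[1] - 1)

evals, B = np.linalg.eigh(Sigma_hat)
b1 = B[:, -1]                             # eigenvector with the largest eigenvalue

# Compare variability u^t Sigma u for the top eigenvector vs. random unit directions.
best = b1 @ Sigma_hat @ b1
for _ in range(1000):
    u = rng.standard_normal(2)
    u /= np.linalg.norm(u)
    assert u @ Sigma_hat @ u <= best + 1e-12
print(best, evals[-1])                    # both equal the largest eigenvalue
```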

PCA as Optimization (Cont.) Notes: Solution is unique when $\lambda_1 > \lambda_2$. Else have sol’ns in subsp. gen’d by the 1st eigenvectors (those with eigenvalue $\lambda_1$). Projecting onto the subspace orthogonal to $b_1$, gives $b_2$ as next direction. Continue through $b_3, \ldots, b_d$. Replace $\hat{\Sigma}$ by $\Sigma$ to get theoretical PCA. Estimated by the empirical version.
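
A sketch of obtaining all the PC directions at once from the eigen-decomposition of the sample covariance, ordered by decreasing eigenvalue (random toy data assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((4, 50))                  # toy d x n data matrix

Xc = X - X.mean(axis=1, keepdims=True)
Sigma_hat = Xc @ Xc.T / (X.shape[1] - 1)

# All PC directions at once: eigenvectors ordered by decreasing eigenvalue.
evals, B = np.linalg.eigh(Sigma_hat)
order = np.argsort(evals)[::-1]
lambdas, pcs = evals[order], B[:, order]

# Each later direction maximizes variability orthogonal to the earlier ones.
print(np.allclose(pcs.T @ pcs, np.eye(4)))        # orthonormal directions
print(lambdas)                                    # lambda_1 >= lambda_2 >= ...
```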

Iterated PCA Visualization

Connect Math to Graphics 2-d Toy Example Feature Space Object Space Data Points (Curves) are columns of data matrix, X

Connect Math to Graphics (Cont.) 2-d Toy Example Feature Space Object Space Sample Mean, X

Connect Math to Graphics (Cont.) 2-d Toy Example Feature Space Object Space Residuals from Mean = Data - Mean

Connect Math to Graphics (Cont.) 2-d Toy Example Feature Space Object Space Recentered Data = Mean Residuals, shifted to 0 = (rescaling of) X

Connect Math to Graphics (Cont.) 2-d Toy Example Feature Space Object Space PC1 Direction = η = Eigenvector (w/ biggest λ)

Connect Math to Graphics (Cont.) 2-d Toy Example Feature Space Object Space Centered Data PC1 Projection Residual

Connect Math to Graphics (Cont.) 2-d Toy Example Feature Space Object Space PC2 Direction = η = Eigenvector (w/ 2nd biggest λ)

Connect Math to Graphics (Cont.) 2-d Toy Example Feature Space Object Space Centered Data PC2 Projection Residual

Connect Math to Graphics (Cont.) Note for this 2-d Example: PC1 Residuals = PC2 Projections PC2 Residuals = PC1 Projections (i.e. colors common across these pics)

PCA Redistribution of Energy Convenient summary of amount of structure: Total Sum of Squares $\sum_{i=1}^{n} \|\underline{X}_i\|^2$. Physical Interpretation: Total Energy in Data. Insight comes from decomposition. Statistical Terminology: ANalysis Of VAriance (ANOVA).

PCA Redist’n of Energy (Cont.) ANOVA mean decomposition: Total Variation = $\sum_{i=1}^{n} \|\underline{X}_i\|^2$ = Mean Variation + Mean Residual Variation = $n \|\bar{X}\|^2 + \sum_{i=1}^{n} \|\underline{X}_i - \bar{X}\|^2$. Mathematics: Pythagorean Theorem. Intuition Quantified via Sums of Squares.

Connect Math to Graphics (Cont.) 2-d Toy Example Feature Space Object Space Residuals from Mean = Data – Mean Most of Variation = 92% is Mean Variation SS Remaining Variation = 8% is Resid. Var. SS

PCA Redist’n of Energy (Cont.) Now decompose SS about the mean: $\sum_{i=1}^{n} \|\underline{X}_i - \bar{X}\|^2 = (n-1)\,\mathrm{trace}(\hat{\Sigma}) = (n-1) \sum_{j=1}^{d} \lambda_j$, where: Energy is expressed in the trace of the covar’ce matrix.

PCA Redist’n of Energy (Cont.) Eigenvalues provide atoms of SS decomposi’n. Useful Plots are: “Power Spectrum”: $\lambda_j$ vs. $j$. “log Power Spectrum”: $\log \lambda_j$ vs. $j$. “Cumulative Power Spectrum”: $\sum_{k \le j} \lambda_k$ vs. $j$. Note PCA gives SS’s for free (as eigenvalues), but watch factors of $n-1$.
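
A sketch computing the three spectra from the sample eigenvalues on simulated data, and checking the factor of $n-1$ linking the eigenvalue sum to the SS about the mean (data and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((5, 100))
n = X.shape[1]

Xc = X - X.mean(axis=1, keepdims=True)
lambdas = np.sort(np.linalg.eigvalsh(Xc @ Xc.T / (n - 1)))[::-1]

power     = lambdas                       # "Power Spectrum":            lambda_j vs. j
log_power = np.log(lambdas)               # "log Power Spectrum":        log(lambda_j) vs. j
cum_power = np.cumsum(lambdas)            # "Cumulative Power Spectrum": sum_{k<=j} lambda_k vs. j

# Watch the factor of (n-1): eigenvalues sum to the SS about the mean divided by (n-1).
ss_about_mean = np.sum(Xc ** 2)
print(np.allclose(lambdas.sum() * (n - 1), ss_about_mean))   # True
```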

PCA Redist’n of Energy (Cont.) Note, have already considered some of these Useful Plots: Power Spectrum Cumulative Power Spectrum

Connect Math to Graphics (Cont.) 2-d Toy Example Feature Space Object Space Revisit SS Decomposition for PC1: PC1 has “most of var’n” = 93% Reflected by good approximation in Object Space

Connect Math to Graphics (Cont.) 2-d Toy Example Feature Space Object Space Revisit SS Decomposition for PC2: PC2 has “only a little var’n” = 7% Reflected by poor approximation in Object Space

Different Views of PCA Solves several optimization problems: 1. Direction to maximize SS of 1-d proj’d data 2. Direction to minimize SS of residuals (same, by Pythagorean Theorem) 3. “Best fit line” to data in “orthogonal sense” (vs. regression of Y on X = vertical sense & regression of X on Y = horizontal sense) Use one that makes sense…
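
A numerical illustration (not from the slides) of views 1 and 2 coinciding via the Pythagorean Theorem, on simulated 2-d data with an assumed toy covariance: the SS of the centered data splits exactly into projected SS plus residual SS, so maximizing one is minimizing the other.

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.multivariate_normal([0, 0], [[2.0, 0.8], [0.8, 0.5]], size=100).T

Xc = X - X.mean(axis=1, keepdims=True)
evals, B = np.linalg.eigh(Xc @ Xc.T)
b1 = B[:, -1]                                     # PC1 direction

proj = np.outer(b1, b1) @ Xc                      # 1-d projections of the centered data
resid = Xc - proj                                 # residuals from the best fit line

# Pythagorean Theorem: SS of data = SS of projections + SS of residuals,
# so maximizing projected SS is the same as minimizing residual SS.
print(np.allclose((Xc ** 2).sum(), (proj ** 2).sum() + (resid ** 2).sum()))   # True
```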

Different Views of PCA 2-d Toy Example Feature Space Object Space 1. Max SS of Projected Data 2. Min SS of Residuals 3. Best Fit Line