Object Orie’d Data Analysis, Last Time
– PCA Redistribution of Energy - ANOVA
– PCA Data Representation
– PCA Simulation
– Alternate PCA Computation
– Primal – Dual PCA vs. SVD (centering by means is key)

Primal - Dual PCA
Toy Example 1: Random Curves, all in Primal Space:
* Constant Shift
* Linear
* Quadratic
* Cubic
(chosen to be orthonormal)
Plus (small) i.i.d. Gaussian noise, d = 40, n = 20
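A minimal numpy sketch of this setup (the QR orthonormalization, the noise level 0.1, and the seed are my assumptions; the slide only specifies the four polynomial components, small i.i.d. Gaussian noise, and d = 40, n = 20):

```python
import numpy as np

rng = np.random.default_rng(0)            # assumed seed
d, n = 40, 20                             # dimension and sample size, per the slide
t = np.linspace(-1, 1, d)

# Constant, linear, quadratic, cubic columns; QR makes them orthonormal,
# matching "chosen to be orthonormal".
B, _ = np.linalg.qr(np.vander(t, 4, increasing=True))   # d x 4

coef = rng.standard_normal((4, n))        # random amount of each component per curve
X = B @ coef + 0.1 * rng.standard_normal((d, n))   # plus (small) i.i.d. Gaussian noise
# Columns of X are the n "primal" curves; rows are the d "dual" curves.
```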

Primal - Dual PCA Toy Example 1: Raw Data

Primal - Dual PCA
Toy Example 1: Raw Data
– Primal (column) curves similar to before
– Data matrix asymmetric (but same curves)
– Dual (row) curves much rougher (showing Gaussian randomness): how the data were generated
– Color map useful? (same as mesh view)
– See richer structure than before. Is it useful?

Primal - Dual PCA Toy Example 1: Primal PCA Column Curves as Data

Primal - Dual PCA
Toy Example 1: Primal PCA
– Expected to recover increasing polynomials, but that didn’t happen
– Although the polynomials are visible (order???)
– Mean has a quadratic component (since only n = 20???)
– Scores (projections) very random
– Power Spectrum shows 4 components (not affected by subtracting the Primal Mean)

Primal - Dual PCA Toy Example 1: Dual PCA Row Curves as Data

Primal - Dual PCA
Toy Example 1: Dual PCA
– Curves all very wiggly (random noise)
– Mean much bigger: 54% of Total Variation!
– Scores have strong smooth structure (reflecting the ordered primal eigenvectors; recall primal eigenvectors ↔ dual scores)
– Power Spectrum shows 3 components (driven by subtraction of the Dual Mean)
– Primal – Dual mean difference is critical
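As a check of the "primal eigenvectors ↔ dual scores" remark, here is a small sketch under one assumed convention (both analyses applied to the same column-centered matrix; the slides' dual PCA subtracts the dual mean instead):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 20))
Xc = X - X.mean(axis=1, keepdims=True)       # subtract the primal (column) mean

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
primal_scores = s[:, None] * Vt              # projections of the column curves on U
dual_scores = U * s                          # projections of the row curves on V

# dual_scores equals U up to the singular-value scaling, i.e. the primal
# eigenvector directions reappear as the dual score vectors.
assert np.allclose(Xc @ Vt.T, dual_scores)
```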

Primal - Dual PCA Toy Example 1: Dual PCA – Scatterplot

Primal - Dual PCA
Toy Example 1: Dual PCA - Scatterplot
– Smooth curve structure, but not the usual curves (since the 1-d curves are not quite polynomials)
– And only in the 1st 3 components
 – Recall only 3 non-noise components
 – Since the constant curve went into the (dual) mean
– Remainder is pure noise
– Suggests wrong rotation of axes???

Primal - Dual PCA
A 3rd Type of Analysis, called “SVD decomposition”:
– Main point: subtract neither mean
– Viewed as a serious competitor
– Advantage: gives the best Mean Square Approximation of the Data Matrix
– Vs. Primal PCA: best approximation about the column mean
– Vs. Dual PCA: best approximation about the row mean
– Difference in means is critical!
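A sketch of the three analyses side by side, assuming the usual conventions (primal PCA centers each column curve about the mean column curve, dual PCA centers each row curve about the mean row curve, SVD centers nothing):

```python
import numpy as np

def three_analyses(X):
    """Primal PCA, dual PCA, and plain SVD of a d x n data matrix X."""
    primal_mean = X.mean(axis=1, keepdims=True)   # mean of the column curves
    dual_mean = X.mean(axis=0, keepdims=True)     # mean of the row curves
    primal = np.linalg.svd(X - primal_mean, full_matrices=False)
    dual   = np.linalg.svd(X - dual_mean,  full_matrices=False)
    svd    = np.linalg.svd(X,              full_matrices=False)  # subtract neither mean
    return primal, dual, svd
```

Each call returns (U, s, Vt); truncating to the first k components gives the best rank-k mean square approximation of the correspondingly centered matrix, which is the sense of "best" on the slide.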

Primal - Dual PCA Toy Example 1: SVD – Curves view

Primal - Dual PCA
Toy Example 1: SVD Curves View
– Column curves view similar to Primal PCA
– Row curves quite different (from dual):
 – Former mean, now SV1
 – Former PC1, now SV2
 – i.e. very similar shapes with shifted indices
– Again mean centering is crucial: the main difference between the PCAs and SVD

Primal - Dual PCA Toy Example 1: SVD – Mesh-Image View

Primal - Dual PCA
Toy Example 1: SVD Mesh-Image View
Think about the decomposition into modes of variation:
– Constant x Gaussian
– Linear x Gaussian
– Cubic x Gaussian
– Quadratic x Gaussian
Shows up best in the image view?
Why is the ordering “wrong”???

Primal - Dual PCA
Toy Example 1: All Primal
Why is the SVD mode ordering “wrong”??? Well, not expected …
– Key is the need for orthogonality
– Present in the space of column curves
– But only approximate in the row Gaussians
– The implicit orthogonalization of SVD (both rows and columns) gave a mixture of the polynomials.

Primal - Dual PCA
Toy Example 2: All Primal, GS Noise
– Started with the same column space
– Generated i.i.d. Gaussians for the rows
– Then did Gram-Schmidt orthonormalization (in row space); a sketch follows below
– Visual impression: amazingly similar to the original data (used the same seeds of the random number generators)
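A hedged guess at this construction, reusing the Toy Example 1 setup (np.linalg.qr performs the equivalent of Gram-Schmidt orthonormalization, up to signs):

```python
import numpy as np

rng = np.random.default_rng(0)                  # same seed, as on the slide
d, n = 40, 20
t = np.linspace(-1, 1, d)
B, _ = np.linalg.qr(np.vander(t, 4, increasing=True))   # same column space as before

G = rng.standard_normal((n, 4))                 # i.i.d. Gaussians, as 4 vectors in R^n
Q, _ = np.linalg.qr(G)                          # Gram-Schmidt orthonormalization
X2 = B @ Q.T + 0.1 * rng.standard_normal((d, n))
# Now the row-space coefficients are exactly orthonormal, not just
# approximately so, which is why all 4 components come out cleanly.
```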

Primal - Dual PCA Toy Example 2: Raw Data

Primal - Dual PCA Compare with Earlier Toy Example 1

Primal - Dual PCA
Toy Example 2: Primal PCA, Column Curves as Data
Shows the explanation (of the wrong components) was correct

Primal - Dual PCA
Toy Example 2: Dual PCA, Row Curves as Data
– Still have a big mean
– But scores look much better

Primal - Dual PCA Toy Example 2: Dual PCA – Scatterplot

Primal - Dual PCA
Toy Example 2: Dual PCA - Scatterplot
– Now the polynomials look beautifully symmetric
– Much like the chemo spectrum examples
– But still only 3 components
 – Same reason: dual mean ~ primal constants
– Last one is pure noise

Primal - Dual PCA Toy Example 2: SVD – Matrix-Image

Primal - Dual PCA
Toy Example 2: SVD - Matrix-Image
– Similar good effects
– Again have all 4 components
– So “better” to not subtract the mean???

Primal - Dual PCA
Toy Example 3: Random Curves, all in Dual Space:
1 * Constant Shift
2 * Linear
4 * Quadratic
8 * Cubic
(chosen to be orthonormal)
Plus (small) i.i.d. Gaussian noise, d = 40, n = 20

Primal - Dual PCA Toy Example 3: Raw Data

Primal - Dual PCA
Toy Example 3: Raw Data
– Similar structure to Example 1
– But rows and columns trade places
– And now cubics are visually dominant (as expected)

Primal - Dual PCA
Toy Example 3: Primal PCA, Column Curves as Data
– Gaussian noise
– Only 3 components
– Polynomial scores (as expected)

Primal - Dual PCA
Toy Example 3: Dual PCA, Row Curves as Data
– Components as expected
– No Gram-Schmidt needed (since stronger signal)

Primal - Dual PCA Toy Example 3: SVD – Matrix-Image

Primal - Dual PCA Toy Example 4: Mystery #1

Primal - Dual PCA Toy Example 4: SVD – Curves View

Primal - Dual PCA Toy Example 4: SVD – Matrix-Image

Primal - Dual PCA
Toy Example 4: Mystery #1 Structure (Primal x Dual):
– Constant x Gaussian
– Gaussian x Linear
– Parabola x Gaussian
– Gaussian x Cubic
Nicely revealed by the full matrix decomposition and views; a sketch of the construction follows below.
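A sketch of how such a "mystery" matrix can be built as a sum of rank-one primal x dual pieces u v^T (the exact functions, scalings, and seed are my assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 40, 20
s = np.linspace(-1, 1, d)          # primal (column space) argument
t = np.linspace(-1, 1, n)          # dual (row space) argument

X4 = (np.outer(np.ones(d), rng.standard_normal(n))   # constant x Gaussian
      + np.outer(rng.standard_normal(d), t)          # Gaussian x linear
      + np.outer(s ** 2, rng.standard_normal(n))     # parabola x Gaussian
      + np.outer(rng.standard_normal(d), t ** 3))    # Gaussian x cubic
# Each rank-one piece is a mode of variation; the full-matrix SVD views
# can reveal all four.
```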

Primal - Dual PCA Toy Example 5: Mystery #2

Primal - Dual PCA Toy Example 5: SVD – Curves View

Primal - Dual PCA Toy Example 5: SVD – Matrix-Image

Primal - Dual PCA
Toy Example 5: Mystery #2 Structure (Primal x Dual):
– Constant x Linear
– Parabola x Cubic
– Gaussian x Gaussian
Visible via either curves or matrices …

Primal - Dual PCA
Is SVD (i.e. no mean centering) always “better”?
What does “better” mean???
A definition: provides the most useful insights into the data. Others???

Primal - Dual PCA
Toy Example where SVD is less informative:
– Simple, two dimensional
– Key: here, skipping the subtraction of the mean is bad
– I.e. the mean direction is different from the PC directions
– And the mean is less informative

Primal - Dual PCA Toy Example where SVD is less informative: Raw Data

Primal - Dual PCA PC1 mode of variation (centered at mean): Yields useful major mode of variation

Primal - Dual PCA PC2 mode of variation (centered at mean): Informative second mode of variation

Primal - Dual PCA SV1 mode of variation (centered at 0): Unintuitive major mode of variation

Primal - Dual PCA SV2 mode of variation (centered at 0): Unintuitive second mode of variation

Primal - Dual PCA
Summary of SVD:
– Does give a decomposition, i.e. the sum of the two pieces is the data
– But not good insights about data structure
– Since the center point of the analysis is far from the center point of the data
– So the mean strongly influences the impression of variation
– Maybe better to keep these separate??? (see the sketch below)
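A two dimensional sketch of this failure mode (the distribution parameters are assumed): when the point cloud sits far from the origin, SV1 mostly chases the mean while PC1 tracks the actual scatter:

```python
import numpy as np

rng = np.random.default_rng(2)
# Point cloud far from the origin: most variation along the x-axis,
# but the mean direction points elsewhere.
X = np.array([[3.0], [0.5]]) * rng.standard_normal((2, 100)) + np.array([[10.0], [5.0]])

mean = X.mean(axis=1, keepdims=True)
pc_dirs, _, _ = np.linalg.svd(X - mean, full_matrices=False)  # modes about the mean
sv_dirs, _, _ = np.linalg.svd(X, full_matrices=False)         # modes about 0

print(pc_dirs[:, 0])   # ~ (1, 0) up to sign: the useful major mode of variation
print(sv_dirs[:, 0])   # ~ the mean direction (10, 5), normalized: unintuitive SV1
```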

Primal - Dual PCA
Bottom line on Primal PCA vs. SVD vs. Dual PCA: these are not directly comparable
– Each has situations where it is “best” and where it is “worst”
– Generally should consider all, and choose on the basis of insights
– See the work of Lingsong Zhang on this …

Real Data: Primal - Dual PCA
Analysis by Lingsong Zhang:
Zhang, L. (2006), “SVD Movies and Plots for Singular Value Decomposition and its Visualization”, University of North Carolina at Chapel Hill, available at work/SVDmovie

Real Data: Primal - Dual PCA
Use slides from a talk: LingsongZhangFunctionalSVD.pdf
Main points:
– Different approaches all can be “best”
– Show different aspects of the data
– Generalized scree plot “outliers” are interesting

Real Data: Primal - Dual PCA
Visual point: rotations can show useful aspects
Movies:
– LingsongZhangSVDcurvemovie4int30m.avi
– LingsongZhangSVDMOVIEbyCOMPofSV1.avi
– LingsongZhangSVDMOVIEbyCOMPofSV2.avi
– LingsongZhangSVDMOVIEbyCOMPofSV3.avi

Vectors vs. Functions
Recall the overall structure:
– Object Space: curves (functions); Feature Space: vectors
– Connection 1: Digitization / Parallel Coordinates
– Connection 2: Basis Representation

Vectors vs. Functions
Connection 1, Digitization: given a function $f$, define the vector $\underline{f} = (f(x_1), \ldots, f(x_d))^T$, where $x_1, \ldots, x_d$ is a suitable grid, e.g. equally spaced.
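A one-line sketch of digitization (the function and grid size are arbitrary examples):

```python
import numpy as np

d = 40
x = np.linspace(0.0, 1.0, d)        # equally spaced grid x_1, ..., x_d
f_vec = np.sin(2 * np.pi * x)       # the vector (f(x_1), ..., f(x_d))
```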

Vectors vs. Functions
Connection 1, Parallel Coordinates: given a vector $x = (x_1, \ldots, x_d)^T$, define a function $f$ where $f(j) = x_j$, and linearly interpolate to “connect the dots”.
Proposed as a high dimensional visualization method by Inselberg (1985).

Vectors vs. Functions
Parallel Coordinates: given $x$, define $f(j) = x_j$ as above. Now can “rescale the argument” to $j/d$, to get a function on $[0, 1]$, evaluated at an equally spaced grid.
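A sketch of the parallel coordinates view (matplotlib, with assumed toy data): each vector in R^d becomes one "connect the dots" polyline over the rescaled grid j/d:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n, d = 5, 10
X = rng.standard_normal((n, d))        # n vectors in R^d
grid = np.arange(1, d + 1) / d         # rescaled argument j/d in (0, 1]

plt.plot(grid, X.T)                    # one polyline per vector
plt.xlabel("j / d")
plt.title("Parallel coordinates view")
plt.show()
```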

Vectors vs. Functions
Bridge between vectors & functions: Vectors ↔ Functions. The isometry follows from convergence of inner products by Riemann summation:
$\frac{1}{d} \langle \underline{f}, \underline{g} \rangle = \frac{1}{d} \sum_{j=1}^{d} f(x_j) \, g(x_j) \longrightarrow \int f(x) \, g(x) \, dx = \langle f, g \rangle$
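A numeric sketch of that convergence (the example functions are my choice): the scaled vector inner product approaches the function-space inner product, here $\int_0^1 x \sin(2\pi x)\,dx = -1/(2\pi) \approx -0.159$:

```python
import numpy as np

f = lambda x: np.sin(2 * np.pi * x)
g = lambda x: x

for d in (10, 100, 1000, 10000):
    x = (np.arange(d) + 0.5) / d           # midpoint grid on [0, 1]
    print(d, np.dot(f(x), g(x)) / d)       # (1/d) <f_vec, g_vec> -> -1/(2*pi)
```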

Vectors vs. Functions
Main lesson:
– OK to think about functions
– But actually work with vectors
For me, there is little difference. But there is a statistical theory, and a mathematical statistics literature on this; start with Ramsay & Silverman (2005).

Vectors vs. Functions
Recall the overall structure:
– Object Space: curves (functions); Feature Space: vectors
– Connection 1: Digitization / Parallel Coordinates
– Connection 2: Basis Representation

Vectors vs. Functions
Connection 2, Basis Representations: given an orthonormal basis $\{\phi_j\}$ (in function space), e.g.
– Fourier
– B-spline
– Wavelet
Represent functions as $f = \sum_{j} \langle f, \phi_j \rangle \, \phi_j$.

Vectors vs. Functions
Connection 2, Basis Representations: represent functions as $f = \sum_{j} \langle f, \phi_j \rangle \, \phi_j$. Bridge between discrete and continuous: the coefficient vector $(\langle f, \phi_1 \rangle, \langle f, \phi_2 \rangle, \ldots)$ represents $f$.

Vectors vs. Functions
Connection 2, Basis Representations: represent functions as $f = \sum_{j} \langle f, \phi_j \rangle \, \phi_j$, with finite dimensional approximation $f \approx \sum_{j=1}^{d} \langle f, \phi_j \rangle \, \phi_j$. Again there is mathematical statistical theory, based on this (same ref.).
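A sketch of the finite dimensional approximation, with an orthonormal cosine basis on [0, 1] (the basis choice, test function, and grid size are my assumptions; coefficients are computed by the Riemann sums above):

```python
import numpy as np

d, k = 1000, 8                                   # grid size, number of basis functions
x = (np.arange(d) + 0.5) / d
f = np.abs(x - 0.5)                              # an example function on [0, 1]

# Orthonormal cosine basis: phi_0 = 1, phi_j = sqrt(2) cos(pi j x)
basis = [np.ones(d)] + [np.sqrt(2) * np.cos(np.pi * j * x) for j in range(1, k)]
coef = [np.dot(f, phi) / d for phi in basis]     # <f, phi_j> via Riemann sums

f_approx = sum(c * phi for c, phi in zip(coef, basis))   # sum_j <f, phi_j> phi_j
print(np.max(np.abs(f - f_approx)))              # small approximation error
```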

Vectors vs. Functions
Repeat main lesson:
– OK to think about functions
– But actually work with vectors
For me, there is little difference (but that is only personal taste).