Linear Least Squares Approximation

2 Definition (point set case) Given a point set $x_1, x_2, \dots, x_n \in \mathbb{R}^d$, linear least squares fitting amounts to finding the linear subspace of $\mathbb{R}^d$ that minimizes the sum of squared distances from the points to their projections onto this subspace.

3 Definition (point set case) This problem is equivalent to searching for the linear subspace that maximizes the variance of the projected points; that subspace is obtained by eigendecomposition of the covariance (scatter) matrix. Eigenvectors corresponding to large eigenvalues are the directions in which the data has strong components, i.e. large variance. If all the eigenvalues are (roughly) equal, there is no preferred subspace.
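As a concrete illustration (not part of the original slides), the whole fitting procedure can be sketched in a few lines of NumPy; the function name pca_fit_subspace and the toy data are assumptions made for this example.

```python
import numpy as np

def pca_fit_subspace(X, k):
    """Fit the best k-dimensional least squares subspace to the rows of X (illustrative helper).

    X : (n, d) array of points, k : target dimension (k < d).
    Returns the center of mass m and a (d, k) matrix V whose columns span the subspace.
    """
    m = X.mean(axis=0)                      # center of mass
    Y = X - m                               # centered points
    S = Y.T @ Y                             # scatter (unnormalized covariance) matrix, d x d
    eigvals, eigvecs = np.linalg.eigh(S)    # eigenvalues ascending, orthonormal eigenvectors
    V = eigvecs[:, ::-1][:, :k]             # k eigenvectors with the largest eigenvalues
    return m, V

# Example: fit a best line (k = 1) to noisy 2D points near y = 2x.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2 * t]) + 0.1 * rng.normal(size=(200, 2))
m, V = pca_fit_subspace(X, k=1)
print(m, V[:, 0])   # direction close to (1, 2) / sqrt(5)
```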

4 PCA – the general idea. PCA finds an orthogonal basis that best represents a given data set: the sum of squared distances from the x' axis is minimized.
[Figure: a 2D point set shown in the original axes (x, y) and in the principal axes (x', y').]

5 PCA – the general idea. PCA finds an orthogonal basis that best represents a given data set; for a 3D point set it finds the best approximating plane (again, in the sense of the sum of squared distances).
[Figure: a 3D point set with axes x, y, z and its best approximating plane.]

6 Notation. Denote our data points by $x_1, x_2, \dots, x_n \in \mathbb{R}^d$.

7 The origin of the new axes. The origin is the zero-order approximation of our data set (a single point). We take it to be the center of mass: $m = \frac{1}{n}\sum_{i=1}^{n} x_i$. It can be shown that $m$ minimizes the sum of squared distances to the data points: $m = \arg\min_{x \in \mathbb{R}^d} \sum_{i=1}^{n} \|x_i - x\|^2$.

8 Scatter matrix. Denote $y_i = x_i - m$, $i = 1, 2, \dots, n$. The scatter matrix is $S = Y Y^T$, where $Y$ is the $d \times n$ matrix with the $y_k$ as columns ($k = 1, 2, \dots, n$).
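A minimal sketch of slides 7 and 8 in NumPy (the variable names X, m, Y, S mirror the slides; the random data is only for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))          # n = 100 points in R^d with d = 3, one point per row

m = X.mean(axis=0)                     # center of mass m = (1/n) * sum_i x_i
Y = (X - m).T                          # d x n matrix with y_i = x_i - m as columns
S = Y @ Y.T                            # scatter matrix S = Y Y^T, shape d x d

assert np.allclose(S, S.T)             # S is symmetric by construction
```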

9 Variance of projected points. In a way, $S$ measures the variance (scatter) of the data in different directions. Consider a line $L$ through the center of mass $m$, and project our points $x_i$ onto it. The variance of the projected points $x'_i$ is $\mathrm{var}(L) = \sum_{i=1}^{n} \|x'_i - m\|^2$ (we drop the constant factor $1/n$, which does not affect which direction maximizes it).
[Figure: the original point set and two lines L through m, one with small variance and one with large variance of the projected points.]

10 Variance of projected points. Given a direction $v$ with $\|v\| = 1$, the projection of $x_i$ onto the line $L = \{m + vt\}$ is $x'_i = m + \langle x_i - m, v\rangle\, v$, i.e. the signed offset of $x'_i$ from $m$ along $v$ is $\langle x_i - m, v\rangle = v^T y_i$.
[Figure: a point $x_i$ and its projection $x'_i$ onto the line L through m with direction v.]

11 Variance of projected points. So, $\mathrm{var}(L) = \sum_{i=1}^{n} \|x'_i - m\|^2 = \sum_{i=1}^{n} \langle y_i, v\rangle^2 = \sum_{i=1}^{n} v^T y_i\, y_i^T v = v^T \Big(\sum_{i=1}^{n} y_i y_i^T\Big) v = v^T S v = \langle S v, v\rangle$.

12 Directions of maximal variance. So we have $\mathrm{var}(L) = \langle S v, v\rangle = v^T S v$. Theorem: let $f : \{v \in \mathbb{R}^d \mid \|v\| = 1\} \to \mathbb{R}$, $f(v) = \langle S v, v\rangle$, where $S$ is a symmetric matrix. Then the extrema of $f$ are attained at the eigenvectors of $S$. Hence the eigenvectors of $S$ are the directions of maximal/minimal variance.
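A quick numerical sanity check of the theorem (illustrative only, not from the slides): among unit vectors $v$, $v^T S v$ is maximized by the eigenvector with the largest eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(2)
Y = rng.normal(size=(3, 500)) * np.array([[3.0], [1.0], [0.2]])  # anisotropic data, columns = points
Y = Y - Y.mean(axis=1, keepdims=True)                             # center it
S = Y @ Y.T                                                       # scatter matrix

eigvals, eigvecs = np.linalg.eigh(S)        # eigenvalues in ascending order
v_max = eigvecs[:, -1]                      # eigenvector of the largest eigenvalue

random_dirs = rng.normal(size=(1000, 3))
random_dirs /= np.linalg.norm(random_dirs, axis=1, keepdims=True)
random_vals = np.einsum('ij,jk,ik->i', random_dirs, S, random_dirs)  # v^T S v for each random unit v

print(v_max @ S @ v_max >= random_vals.max())   # True: no random direction beats the eigenvector
```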

13 Summary so far. We take the centered data points $y_1, y_2, \dots, y_n \in \mathbb{R}^d$ and construct the scatter matrix $S = Y Y^T$. $S$ measures the variance of the data points in different directions, and the eigenvectors of $S$ are the directions of maximal/minimal variance.

14 Scatter matrix – eigendecomposition. $S$ is symmetric, so $S$ has an eigendecomposition $S = V \Lambda V^T$, where $V = [\,v_1\; v_2\; \cdots\; v_d\,]$ holds the eigenvectors as columns and $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_d)$ the corresponding eigenvalues. The eigenvectors form an orthogonal basis.
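In NumPy this eigendecomposition is one call to np.linalg.eigh (a sketch with made-up data):

```python
import numpy as np

rng = np.random.default_rng(3)
Y = rng.normal(size=(4, 200))
Y = Y - Y.mean(axis=1, keepdims=True)
S = Y @ Y.T                                # symmetric scatter matrix, d = 4

eigvals, V = np.linalg.eigh(S)             # eigh is for symmetric matrices; eigenvalues ascending
Lam = np.diag(eigvals)

print(np.allclose(S, V @ Lam @ V.T))       # S = V Lambda V^T
print(np.allclose(V.T @ V, np.eye(4)))     # the eigenvectors form an orthonormal basis
```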

15 Principal components. Eigenvectors that correspond to large eigenvalues are the directions in which the data has strong components (i.e. large variance). If the eigenvalues are all roughly the same, there is no preferred direction.

16 Principal components. When there is no preferred direction, $S$ looks (approximately) like $\lambda I$, and any vector is an eigenvector. When there is a clear preferred direction, $S$ looks like $\mathrm{diag}(\lambda_1, \lambda_2)$ in its eigenbasis, with $\lambda_2$ close to zero and much smaller than $\lambda_1$.
[Figure: an isotropic point cloud vs. a strongly elongated point cloud.]

17 For approximation. The projected data set approximates the original data set, and the line segment along $v_1$ approximates its shape.
[Figure: a 2D point set with principal directions $v_1, v_2$; the points projected onto $v_1$; and the resulting 1D approximation.]

18 For approximation. In general dimension $d$, the eigenvalues are sorted in descending order, $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d$, and the eigenvectors are sorted accordingly. To get an approximation of dimension $d' < d$, we take the first $d'$ eigenvectors and look at the subspace they span ($d' = 1$ is a line, $d' = 2$ is a plane, ...).

19 For approximation. To get the approximating set, we project the original data points onto the chosen subspace. Writing each point in the eigenbasis, $x_i = m + \alpha_1 v_1 + \alpha_2 v_2 + \dots + \alpha_{d'} v_{d'} + \dots + \alpha_d v_d$, the projection is $x'_i = m + \alpha_1 v_1 + \alpha_2 v_2 + \dots + \alpha_{d'} v_{d'} + 0 \cdot v_{d'+1} + \dots + 0 \cdot v_d$, where $\alpha_j = \langle x_i - m, v_j\rangle$.
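A sketch of slides 18–19 in NumPy; the helper name pca_project and the toy data are assumptions made for this example:

```python
import numpy as np

def pca_project(X, d_prime):
    """Project the rows of X onto the affine subspace spanned by the first d' eigenvectors."""
    m = X.mean(axis=0)
    Y = (X - m).T                                  # d x n centered data
    S = Y @ Y.T
    eigvals, eigvecs = np.linalg.eigh(S)           # ascending eigenvalues
    V = eigvecs[:, ::-1][:, :d_prime]              # d x d', top d' eigenvectors (descending eigenvalues)
    alphas = V.T @ Y                               # d' x n coefficients alpha_j = <x_i - m, v_j>
    return (V @ alphas).T + m                      # x'_i = m + sum_j alpha_j v_j

rng = np.random.default_rng(4)
X = rng.normal(size=(50, 3)) @ np.diag([5.0, 1.0, 0.1])
X_approx = pca_project(X, d_prime=2)               # best-fitting plane approximation
print(np.linalg.norm(X - X_approx))                # small residual: most variance lies in 2 directions
```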

20 Optimality of approximation. The approximation is optimal in the least squares sense: it minimizes the sum of squared distances $\sum_{i=1}^{n} \|x_i - x'_i\|^2$ over all subspaces of dimension $d'$. Equivalently, the projected points have maximal variance.
[Figure: the original set, its projection on an arbitrary line, and its projection on the $v_1$ axis.]
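A small numerical illustration of this optimality claim (the data and helper line_error are made up for the example): projecting onto $v_1$ never gives a larger squared error than projecting onto an arbitrary line through $m$.

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])   # elongated 2D point set
m = X.mean(axis=0)
Y = (X - m).T
S = Y @ Y.T
v1 = np.linalg.eigh(S)[1][:, -1]                   # direction of maximal variance

def line_error(X, m, v):
    """Sum of squared distances from the points to the line through m with unit direction v."""
    Y = X - m
    proj = (Y @ v)[:, None] * v                    # projections onto the line (relative to m)
    return np.sum((Y - proj) ** 2)

v_random = rng.normal(size=2)
v_random /= np.linalg.norm(v_random)
print(line_error(X, m, v1) <= line_error(X, m, v_random))   # True
```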