Olga Sorkine’s slides Tel Aviv University
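2 Spectra and diagonalization If A is symmetric, its eigenvectors can be chosen orthonormal (and there’s always an eigenbasis): $A = U \Lambda U^T$, where $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_d)$ and $A u_i = \lambda_i u_i$ for each column $u_i$ of $U$.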
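As a quick numerical check of the diagonalization of a symmetric matrix, here is a minimal sketch using NumPy's `numpy.linalg.eigh`. The 3×3 matrix is an arbitrary illustrative choice, not taken from the slides.

```python
import numpy as np

# A small symmetric matrix (chosen arbitrarily for illustration).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# eigh is NumPy's solver for symmetric/Hermitian matrices; it returns
# eigenvalues in ascending order and orthonormal eigenvectors as columns of U.
eigenvalues, U = np.linalg.eigh(A)
Lambda = np.diag(eigenvalues)

# Columns of U are orthonormal: U^T U = I.
orthonormal = np.allclose(U.T @ U, np.eye(3))
# A is recovered from its spectrum: A = U Lambda U^T.
reconstructed = np.allclose(U @ Lambda @ U.T, A)
# Each column satisfies A u_i = lambda_i u_i.
eigen_ok = all(np.allclose(A @ U[:, i], eigenvalues[i] * U[:, i]) for i in range(3))
```

All three flags should come out `True` for any real symmetric input.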
3 PCA – the general idea PCA finds an orthogonal basis that best represents a given data set. The sum of squared distances from the x’ axis is minimized. [Figure: 2D point set with the original axes x, y and the PCA axes x’, y’]
4 PCA – the general idea PCA finds an orthogonal basis that best represents a given data set. PCA finds a best approximating plane (again, in terms of squared distances). [Figure: 3D point set in the standard basis x, y, z]
5 Managing high-dimensional data Database of face scans (3D geometry + texture). Each scan has 10,000 points, and each point has 6 numbers: x, y, z, R, G, B. Thus, each scan is a 10,000 × 6 = 60,000-dimensional vector!
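To make the dimension count concrete, the sketch below flattens one scan into a single vector. The array is filled with zeros as a placeholder; no real scan data or file format is assumed.

```python
import numpy as np

n_points, n_channels = 10_000, 6          # x, y, z, R, G, B per point
scan = np.zeros((n_points, n_channels))   # placeholder values, not real scan data
scan_vector = scan.reshape(-1)            # one long vector per scan
```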
6 Managing high-dimensional data How do we find interesting axes in this 60,000-dimensional space, e.g. axes that measure age, gender, etc.? There is hope: the faces are likely to be governed by a small set of parameters (far fewer than 60,000). [Figure labels: age axis, gender axis]
7 Notations Denote our data points by $x_1, x_2, \dots, x_n \in \mathbb{R}^d$.
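8 The origin of the new axes The origin is the zero-order approximation of our data set (a point). It will be the center of mass: $m = \frac{1}{n} \sum_{i=1}^{n} x_i$. It can be shown that: $m = \operatorname{argmin}_{x} \sum_{i=1}^{n} \|x_i - x\|^2$.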
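The claim that the center of mass is the best single-point approximation in the least-squares sense can be checked numerically. This is a sketch on synthetic random points (an assumption for illustration); the helper `sum_sq_dist` is a hypothetical name introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 50))   # 50 synthetic points in R^2, stored as columns

def sum_sq_dist(X, p):
    """Sum of squared distances from the columns of X to the point p."""
    return float(np.sum((X - p[:, None]) ** 2))

m = X.mean(axis=1)   # center of mass

# Moving the candidate origin away from m in any direction increases the cost.
cost_at_m = sum_sq_dist(X, m)
costs_nearby = [sum_sq_dist(X, m + 0.1 * rng.normal(size=2)) for _ in range(100)]
```

Every entry of `costs_nearby` should be at least `cost_at_m`, since the cost at a point p equals the cost at m plus n times the squared distance from p to m.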
9 Scatter matrix Denote $y_i = x_i - m$, $i = 1, 2, \dots, n$. The scatter matrix is $S = Y Y^T$, where $Y$ is the $d \times n$ matrix with $y_k$ as columns ($k = 1, 2, \dots, n$).
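A minimal sketch of building the scatter matrix from points stored as columns. The four 2D points are made up for illustration.

```python
import numpy as np

# Four points in R^2, one per column (d = 2, n = 4); values are illustrative.
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [1.0, 3.0, 2.0, 4.0]])
m = X.mean(axis=1, keepdims=True)   # center of mass, shape (d, 1)
Y = X - m                           # centered points y_i as columns, shape (d, n)
S = Y @ Y.T                         # scatter matrix, shape (d, d)
```

For this data `S` works out to `[[5, 4], [4, 5]]`. Dividing by n gives the biased sample covariance, which matches `np.cov(X, bias=True)`.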
10 Variance of projected points In a way, S measures the variance (= scatterness) of the data in different directions. Let’s look at a line L through the center of mass m, and project our points $x_i$ onto it. The variance of the projected points $x'_i$ is: $\mathrm{var}(L) = \frac{1}{n} \sum_{i=1}^{n} \|x'_i - m\|^2$. [Figure panels: original set; a line L with small variance; a line L with large variance]
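11 Variance of projected points Given a direction v, ||v|| = 1, the projection of $x_i$ onto $L = m + vt$ is: $x'_i = m + \langle v, x_i - m \rangle v = m + (v^T y_i)\, v$, so $\|x'_i - m\| = |v^T y_i|$. [Figure: point $x_i$ and its projection $x'_i$ on the line L through m with direction v]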
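12 Variance of projected points So,
$$\mathrm{var}(L) = \frac{1}{n}\sum_{i=1}^{n} \|x'_i - m\|^2 = \frac{1}{n}\sum_{i=1}^{n} (v^T y_i)^2 = \frac{1}{n}\|Y^T v\|^2 = \frac{1}{n} v^T Y Y^T v = \frac{1}{n} v^T S v = \frac{1}{n}\langle S v, v\rangle.$$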
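The relation between the variance of the projected points and the scatter matrix can be verified numerically. The sketch below assumes var(L) is normalized as the mean squared distance of the projected points from m, and compares it with $v^T S v / n$ on synthetic data (both the data and the random direction are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(3, 200)) * np.array([[3.0], [1.0], [0.2]])  # d = 3, n = 200
n = X.shape[1]
m = X.mean(axis=1, keepdims=True)
Y = X - m
S = Y @ Y.T

v = rng.normal(size=3)
v /= np.linalg.norm(v)             # unit direction of the line L = m + t v

# Direct computation: project each point onto L and average squared distances to m.
t = v @ Y                          # signed coordinates v^T y_i along L, shape (n,)
X_proj = m + np.outer(v, t)        # projected points x'_i as columns
var_direct = np.mean(np.sum((X_proj - m) ** 2, axis=0))

# Closed form from the scatter matrix.
var_from_S = (v @ S @ v) / n
```

The two values should agree up to floating-point rounding.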
13 Directions of maximal variance So, we have: $\mathrm{var}(L) = \frac{1}{n}\langle S v, v \rangle$. Theorem: Let $f : \{v \in \mathbb{R}^d \mid \|v\| = 1\} \to \mathbb{R}$, $f(v) = \langle S v, v \rangle$ (and S is a symmetric matrix). Then, the extrema of f are attained at the eigenvectors of S. So, eigenvectors of S are directions of maximal/minimal variance!
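The theorem can be probed empirically: on the unit sphere, the quadratic form $\langle Sv, v\rangle$ should never exceed the largest eigenvalue of S nor fall below the smallest. This is a sampling-based sanity check on synthetic data, not a proof.

```python
import numpy as np

rng = np.random.default_rng(2)
Y = rng.normal(size=(3, 100)) * np.array([[2.0], [1.0], [0.5]])
Y -= Y.mean(axis=1, keepdims=True)      # centered synthetic data
S = Y @ Y.T

def f(v):
    """Quadratic form <S v, v> for a vector v."""
    return float(v @ S @ v)

eigenvalues, V = np.linalg.eigh(S)      # ascending order
v_min, v_max = V[:, 0], V[:, -1]

# Sample many random unit vectors and evaluate f on each.
samples = rng.normal(size=(1000, 3))
samples /= np.linalg.norm(samples, axis=1, keepdims=True)
values = np.array([f(v) for v in samples])
```

Here `f(v_max)` equals the largest eigenvalue and `f(v_min)` the smallest, and all sampled `values` lie between the two.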
14 Summary so far We take the centered data points $y_1, y_2, \dots, y_n \in \mathbb{R}^d$. We construct the scatter matrix $S = Y Y^T$. S measures the variance of the data points. Eigenvectors of S are directions of maximal variance.
15 Scatter matrix - eigendecomposition S is symmetric, so S has an eigendecomposition: $S = V \Lambda V^T$, with $V = [\, v_1 \; v_2 \; \cdots \; v_d \,]$ holding the eigenvectors as columns and $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_d)$. The eigenvectors form an orthogonal basis.
16 Principal components Eigenvectors that correspond to big eigenvalues are the directions in which the data has strong components (= large variance). If the eigenvalues are more or less the same, there is no preferred direction.
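17 Principal components When there’s no preferred direction, S looks like this: $S = V \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix} V^T$. Any vector is an eigenvector. When there is a clear preferred direction, S looks like this: $S = V \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} V^T$, where $\lambda_2$ is close to zero, much smaller than $\lambda_1$.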
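The two cases can be reproduced with small 2×2 matrices. In this sketch the rotation angle, the eigenvalues 4.0 and 0.01, and the test vector `w` are all arbitrary choices for illustration.

```python
import numpy as np

theta = np.pi / 6
V = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # orthonormal eigenvectors (a rotation)

S_iso = V @ np.diag([4.0, 4.0]) @ V.T      # equal eigenvalues
S_aniso = V @ np.diag([4.0, 0.01]) @ V.T   # lambda_2 much smaller than lambda_1

# With equal eigenvalues S is a multiple of the identity,
# so every vector is an eigenvector.
w = np.array([0.3, -1.7])                  # arbitrary test vector
iso_is_eigen = np.allclose(S_iso @ w, 4.0 * w)

# With distinct eigenvalues the same vector is generally not an eigenvector:
# S w is no longer parallel to w (the 2D cross product is non-zero).
Sw = S_aniso @ w
cross = Sw[0] * w[1] - Sw[1] * w[0]

# A small ratio lambda_2 / lambda_1 signals a clear preferred direction.
ratio = 0.01 / 4.0
```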
18 How to use what we got To find an oriented bounding box, we simply compute the bounding box with respect to the axes defined by the eigenvectors. The origin is at the mean point m. [Figure: oriented bounding box with axes $v_1$, $v_2$, $v_3$]
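A possible sketch of the oriented bounding box computation: express the centered points in the eigenvector frame and take per-axis minima and maxima. The function name `oriented_bounding_box` and the synthetic rotated point cloud are assumptions made for this example.

```python
import numpy as np

def oriented_bounding_box(X):
    """Return (m, V, lo, hi): center of mass m, box axes V (columns), and
    per-axis extents [lo, hi] measured in the eigenvector coordinate frame."""
    m = X.mean(axis=1, keepdims=True)
    Y = X - m
    _, V = np.linalg.eigh(Y @ Y.T)     # orthonormal eigenvectors of the scatter matrix
    C = V.T @ Y                        # coordinates of the points in the eigenbasis
    return m, V, C.min(axis=1), C.max(axis=1)

# Illustrative data: an elongated 2D cloud rotated by 45 degrees.
rng = np.random.default_rng(3)
pts = rng.uniform(-1, 1, size=(2, 500)) * np.array([[5.0], [0.5]])
c, s = np.cos(np.pi / 4), np.sin(np.pi / 4)
X = np.array([[c, -s], [s, c]]) @ pts

m, V, lo, hi = oriented_bounding_box(X)
obb_area = float(np.prod(hi - lo))
aabb_area = float(np.prod(X.max(axis=1) - X.min(axis=1)))
```

For this elongated, rotated cloud the oriented box is much tighter than the axis-aligned one, so `obb_area` comes out well below `aabb_area`.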
19 For approximation [Figure panels, each with axes x, y: the data set with eigenvectors $v_1$, $v_2$; “This line segment approximates the original data set”; “The projected data set approximates the original data set”]
20 For approximation In general dimension d, the eigenvalues are sorted in descending order: $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d$. The eigenvectors are sorted accordingly. To get an approximation of dimension d’ < d, we take the first d’ eigenvectors and look at the subspace they span (d’ = 1 is a line, d’ = 2 is a plane, …).
21 For approximation To get an approximating set, we project the original data points onto the chosen subspace: $x_i = m + \alpha_1 v_1 + \alpha_2 v_2 + \dots + \alpha_{d'} v_{d'} + \dots + \alpha_d v_d$. Projection: $x'_i = m + \alpha_1 v_1 + \alpha_2 v_2 + \dots + \alpha_{d'} v_{d'} + 0 \cdot v_{d'+1} + \dots + 0 \cdot v_d$.
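The projection step can be sketched as follows: compute each point's coefficients in the eigenbasis, zero the ones beyond d’, and map back. The function name `pca_project` and the synthetic data (points lying exactly on the plane z = x + y) are assumptions for illustration.

```python
import numpy as np

def pca_project(X, d_prime):
    """Project the columns of X onto the affine subspace through the center of
    mass spanned by the d_prime eigenvectors with the largest eigenvalues."""
    m = X.mean(axis=1, keepdims=True)
    Y = X - m
    eigenvalues, V = np.linalg.eigh(Y @ Y.T)   # eigh returns ascending order
    order = np.argsort(eigenvalues)[::-1]      # re-sort descending
    V = V[:, order]
    alpha = V.T @ Y                            # coefficients of each point in the eigenbasis
    alpha[d_prime:, :] = 0.0                   # zero out the discarded directions
    return m + V @ alpha

# Illustrative data: points on the plane z = x + y in R^3.
rng = np.random.default_rng(4)
xy = rng.normal(size=(2, 100))
X = np.vstack([xy, xy.sum(axis=0, keepdims=True)])

X_plane = pca_project(X, 2)   # d' = 2: a plane
X_line = pca_project(X, 1)    # d' = 1: a line
```

Because the data already lies in a plane, `X_plane` reproduces `X` up to rounding, while `X_line` loses information.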
22 Optimality of approximation The approximation is optimal in the least-squares sense. It gives the minimum of: $\sum_{i=1}^{n} \|x_i - x'_i\|^2$. The projected points have maximal variance. [Figure panels: original set; projection on an arbitrary line; projection on the $v_1$ axis]
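The least-squares optimality for d’ = 1 can be illustrated by comparing the projection error on the $v_1$ axis with the error on many random lines through m. This is an empirical sketch on synthetic data; `line_error` is a hypothetical helper introduced here.

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(3, 300)) * np.array([[4.0], [2.0], [1.0]])
m = X.mean(axis=1, keepdims=True)
Y = X - m
eigenvalues, V = np.linalg.eigh(Y @ Y.T)   # ascending, so V[:, -1] is v_1

def line_error(v):
    """Sum of squared distances from the points to the line m + t v, ||v|| = 1."""
    residual = Y - np.outer(v, v @ Y)
    return float(np.sum(residual ** 2))

err_v1 = line_error(V[:, -1])

# Compare against random unit directions through m.
dirs = rng.normal(size=(500, 3))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
err_random = np.array([line_error(v) for v in dirs])
```

No random direction should beat `err_v1`, and `err_v1` itself equals the sum of the two discarded eigenvalues, since the error on a line with unit direction v is the trace of S minus $v^T S v$.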
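23 Technical remarks: $\lambda_i \ge 0$, $i = 1, \dots, d$ (such matrices are called positive semidefinite). So we can indeed sort by the magnitude of $\lambda_i$. Theorem: $\lambda_i \ge 0 \;\forall i \iff \langle S v, v \rangle \ge 0 \;\forall v$. Proof: Write $S = V \Lambda V^T$ and $u = V^T v$. Then $\langle S v, v \rangle = v^T V \Lambda V^T v = u^T \Lambda u = \sum_{i=1}^{d} \lambda_i u_i^2$. Since V is orthogonal, u ranges over all of $\mathbb{R}^d$ as v does. Therefore, $\lambda_i \ge 0 \;\forall i \iff \langle S v, v \rangle \ge 0 \;\forall v$.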
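For a scatter matrix built as $S = Y Y^T$, the quadratic form is a sum of squares, $\langle Sv, v\rangle = \|Y^T v\|^2$, which gives one way to see why the eigenvalues are non-negative. The sketch below checks both facts on synthetic data (an illustrative assumption).

```python
import numpy as np

rng = np.random.default_rng(6)
Y = rng.normal(size=(4, 10))
Y -= Y.mean(axis=1, keepdims=True)   # centered synthetic data, d = 4, n = 10
S = Y @ Y.T

# eigvalsh returns only the eigenvalues of a symmetric matrix, ascending.
eigenvalues = np.linalg.eigvalsh(S)

# <S v, v> = ||Y^T v||^2, a sum of squares and hence non-negative.
v = rng.normal(size=4)
quad_form = float(v @ S @ v)
sum_of_squares = float(np.sum((Y.T @ v) ** 2))
```

In floating point a true zero eigenvalue may appear as a tiny negative number, so a small tolerance is advisable when testing for non-negativity.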