Data Analysis
Lecture 8
Tijl De Bie
Dimensionality reduction
How to deal with high-dimensional data? How to visualize it? How to explore it?
Dimensionality reduction is one way…
Projections in vector spaces
The inner product w'x has several interpretations:
–||w||*||x||*cos(theta), where theta is the angle between w and x
–For unit-norm w: the projection of x onto w
–To express hyperplanes: w'x = b
–To express halfspaces: w'x > b
All these interpretations are relevant.
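A minimal numpy sketch of these interpretations (the vectors and the threshold b are illustrative, not from the slides):

```python
import numpy as np

w = np.array([3.0, 4.0])
x = np.array([2.0, 1.0])

ip = w @ x                                         # inner product w'x

# ||w||*||x||*cos(theta): recover cos(theta) from the inner product
cos_theta = ip / (np.linalg.norm(w) * np.linalg.norm(x))

# Projection of x onto the unit-norm direction w/||w||
w_unit = w / np.linalg.norm(w)
proj_length = w_unit @ x                           # signed length of the projection
proj_vector = proj_length * w_unit                 # the projected point itself

b = 5.0
on_hyperplane = np.isclose(w @ x, b)               # hyperplane test: w'x = b
in_halfspace = (w @ x) > b                         # halfspace test: w'x > b
```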
Projections in vector spaces
[Some drawings…]
Variance of a projection
w'x = x'w is the projection of x on w
Let X contain many points x_i as its rows
Projection of all points in X is:
–Xw = (x_1'w, x_2'w, …, x_n'w)'
Variance of the projection on w:
–sum_i (x_i'w/||w||)^2 = (w'X'Xw)/(w'w)
–Or, if ||w||=1, this is: sum_i (x_i'w)^2 = w'X'Xw
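A small sketch verifying the two equivalent expressions in numpy (random data, assumed centred):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))    # 100 points as rows, assumed centred
w = np.array([1.0, 2.0])

# Variance computed directly from the projected points x_i'w/||w||
proj = X @ (w / np.linalg.norm(w))
var_direct = np.sum(proj ** 2)

# The same quantity via the quadratic form (w'X'Xw)/(w'w)
var_quadratic = (w @ X.T @ X @ w) / (w @ w)

assert np.isclose(var_direct, var_quadratic)
```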
Principal Component Analysis
Direction / unit vector w with largest variance?
–max_w w'X'Xw subject to w'w = 1
Lagrangian:
–L(w) = w'X'Xw - lambda*(w'w - 1)
Gradient w.r.t. w equal to zero:
–2*X'Xw = 2*lambda*w
–(X'X)*w = lambda*w
Eigenvalue problem!
Principal Component Analysis
Find w as the dominant eigenvector of X'X!
Then we can project the data on this w
No other projection has a larger variance
This projection is the best 1-D representation of the data
Principal Component Analysis
Best 1-D representation: the projection on the dominant eigenvector
Second-best w: the second eigenvector, and so on…
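A minimal sketch of this procedure using numpy's symmetric eigensolver (random data for illustration; X is assumed centred, see the next slide):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
X = X - X.mean(axis=0)               # centre the data (see the next slide)

# Eigendecomposition of X'X; eigh returns eigenvalues in ascending order
eigvals, eigvecs = np.linalg.eigh(X.T @ X)

# Dominant eigenvector = last column; second eigenvector = second-to-last
w1 = eigvecs[:, -1]
w2 = eigvecs[:, -2]

# Best 1-D and 2-D representations of the data
proj_1d = X @ w1
proj_2d = X @ eigvecs[:, [-1, -2]]
```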
Technical but important…
I haven't mentioned:
–The data should be centred
–That is: the mean of each of the features should be 0
–If that is not the case: subtract from each feature its mean (centering)
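Centering in numpy, as a one-line sketch (the feature means here are arbitrary examples):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3)) + np.array([5.0, -2.0, 0.5])  # non-zero feature means

# Centering: subtract from each feature (column) its mean
X_centred = X - X.mean(axis=0)

assert np.allclose(X_centred.mean(axis=0), 0.0)
```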
Clustering
Another way to make sense of high-dimensional data
Find coherent groups in the data
Points that are:
–close to one another within a cluster, but
–distant from points in other clusters
Distances between points
Distance between points: ||x_i - x_j||
Can we assign points to K different clusters
–each of which is coherent
–distant from each other?
Define the clusters by means of cluster centres m_k with k = 1, 2, …, K
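All pairwise distances ||x_i - x_j|| can be computed in one shot via broadcasting; a small illustrative sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2))

# Pairwise Euclidean distances: D[i, j] = ||x_i - x_j||
diffs = X[:, None, :] - X[None, :, :]
D = np.linalg.norm(diffs, axis=-1)
```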
K-means cost function
Ideal clustering:
–||x_i - m_{k(i)}|| small for all x_i, where m_{k(i)} is its cluster centre
–sum_i ||x_i - m_{k(i)}||^2 small
Unfortunately: hard to minimise…
Simultaneous optimisation of:
–k(i) (which cluster centre for which point)
–m_k (where are the cluster centres)
Iterative strategy!
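The cost function itself is a one-liner in numpy; a hypothetical helper (name and signature are my own):

```python
import numpy as np

def kmeans_cost(X, centres, assignment):
    """sum_i ||x_i - m_{k(i)}||^2: squared distance of each point to its assigned centre."""
    return np.sum((X - centres[assignment]) ** 2)
```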
K-means clustering
Iteratively optimise centres and cluster assignments
K-means algorithm:
–Start with random choices of K centres m_k
–Set k(i) = argmin_k ||x_i - m_k||^2
–Set m_k = mean({x_i : k(i) = k})
Do this for many different random starts, and pick the best result (with lowest cost)
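A self-contained numpy sketch of this procedure (function names, the iteration cap, and the restart count are illustrative choices, not from the slides):

```python
import numpy as np

def kmeans(X, K, n_iter=100, rng=None):
    """One K-means run from a random initialisation; returns (cost, centres, assignment)."""
    if rng is None:
        rng = np.random.default_rng()
    # Start with K randomly chosen data points as the centres m_k
    centres = X[rng.choice(len(X), size=K, replace=False)].copy()
    assignment = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assignment step: k(i) = argmin_k ||x_i - m_k||^2
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=-1)
        assignment = dists.argmin(axis=1)
        # Update step: m_k = mean of the points currently assigned to cluster k
        for k in range(K):
            if np.any(assignment == k):
                centres[k] = X[assignment == k].mean(axis=0)
    cost = np.sum((X - centres[assignment]) ** 2)
    return cost, centres, assignment

def best_of_restarts(X, K, n_restarts=10, seed=0):
    """Many different random starts; keep the result with the lowest cost."""
    rng = np.random.default_rng(seed)
    runs = [kmeans(X, K, rng=rng) for _ in range(n_restarts)]
    return min(runs, key=lambda run: run[0])
```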