Slide 1: EE3J2 Data Mining
Lecture 9: Data Analysis
Martin Russell
Slide 2: Objectives
To review basic data analysis
To review the notions of mean, variance and covariance
To explain Principal Components Analysis (PCA)
Slide 3: Example from speech processing
Plot of high-frequency energy vs low-frequency energy for 25 ms speech segments, sampled every 10 ms
Slide 4: Basic statistics
[Figure: scatter plot annotated with the sample mean, the sample variance of 'x', the sample variance of 'y', and the minimum and maximum values of 'x' and 'y']
Slide 5: Basic statistics
Denote the samples by $X = x_1, x_2, \ldots, x_T$, where $x_t = (x_{t1}, x_{t2}, \ldots, x_{tN})$
The sample mean $\mu(X)$ is given by: $\mu(X) = \frac{1}{T} \sum_{t=1}^{T} x_t$
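As a minimal sketch of the sample mean, the snippet below averages each of the N components over the T samples. The function name `sample_mean` and the four 2-D data points are made up for illustration and are not taken from the lecture.

```python
def sample_mean(samples):
    """Return the component-wise mean of a list of equal-length vectors."""
    T = len(samples)        # number of samples
    N = len(samples[0])     # number of components per sample
    return [sum(x[n] for x in samples) / T for n in range(N)]

# Hypothetical 2-D data, invented for this example
X = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
mu = sample_mean(X)
print(mu)  # [2.5, 5.0]
```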
Slide 6: More basic statistics
The sample variance of X is given by:
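The formula on this slide did not survive extraction. A common definition, assumed here, is the per-component variance $\sigma_n^2 = \frac{1}{T}\sum_{t=1}^{T}(x_{tn}-\mu_n)^2$, where $\mu_n$ is the n-th component of the sample mean. The $1/T$ normalisation is an assumption; some texts divide by $T-1$ instead. A minimal sketch, using the same made-up data as a hypothetical example:

```python
def sample_variance(samples):
    """Per-component variance with 1/T normalisation (an assumption)."""
    T = len(samples)
    N = len(samples[0])
    # Component-wise sample mean
    mu = [sum(x[n] for x in samples) / T for n in range(N)]
    # Mean squared deviation from the mean, one value per component
    return [sum((x[n] - mu[n]) ** 2 for x in samples) / T for n in range(N)]

# Hypothetical 2-D data, invented for this example
X = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
var = sample_variance(X)
print(var)  # [1.25, 5.0]
```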
Slide 7: Covariance
As the x value increases, the y value also increases. This is (positive) covariance.
If y decreases as x increases, the result is negative covariance.
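Slide 8: Definition of covariance
The covariance between the m-th and n-th components of the sample data is defined by the average, over the samples, of the product of the two components' deviations from their sample means.
In practice it is useful to subtract the sample mean from each of the data points $x_t$. The sample mean is then 0 and the covariance reduces to the average of the products of the m-th and n-th components.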
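The covariance definition can be sketched as follows. The code first subtracts the sample mean, then computes $c_{mn} = \frac{1}{T}\sum_{t=1}^{T} x_{tm}\,x_{tn}$ on the zero-mean data. The $1/T$ normalisation, the symbol $c_{mn}$, the function names and the data are assumptions made for illustration.

```python
def centre(samples):
    """Subtract the sample mean from every data point."""
    T = len(samples)
    N = len(samples[0])
    mu = [sum(x[n] for x in samples) / T for n in range(N)]
    return [tuple(x[n] - mu[n] for n in range(N)) for x in samples]

def covariance_matrix(samples):
    """N x N sample covariance matrix with 1/T normalisation (an assumption)."""
    z = centre(samples)     # zero-mean data
    T = len(z)
    N = len(z[0])
    return [[sum(x[m] * x[n] for x in z) / T for n in range(N)]
            for m in range(N)]

# Hypothetical data: y rises with x in the first set and falls in the second
rising = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
falling = [(1.0, 8.0), (2.0, 6.0), (3.0, 4.0), (4.0, 2.0)]

C_rising = covariance_matrix(rising)
C_falling = covariance_matrix(falling)
print(C_rising[0][1])   # positive covariance
print(C_falling[0][1])  # negative covariance
```

The diagonal entries of the matrix are the per-component variances, and the matrix is symmetric because $c_{mn} = c_{nm}$.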
Slide 9: Data with mean subtracted
Implies positive covariance
Slide 10: Sample data rotated through $\pi/2$
Implies negative covariance
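Assuming the rotation on this slide is through a right angle ($\pi/2$), the sign change can be checked numerically: the rotation maps $(x, y)$ to $(-y, x)$, so the covariance between the two components is negated. The helper names and data below are hypothetical.

```python
import math

def covariance_xy(points):
    """Covariance between the two components of 2-D points (1/T normalisation)."""
    T = len(points)
    mx = sum(p[0] for p in points) / T
    my = sum(p[1] for p in points) / T
    return sum((p[0] - mx) * (p[1] - my) for p in points) / T

def rotate(points, angle):
    """Rotate 2-D points anticlockwise about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

# Hypothetical data with positive covariance
X = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

before = covariance_xy(X)                        # positive
after = covariance_xy(rotate(X, math.pi / 2))    # negative, same magnitude
print(before, after)
```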
Slide 11: Data with covariance removed
Slide 12: Principal Components Analysis
PCA is the technique which I used to diagonalise the sample covariance matrix.
The first step is to write the covariance matrix $C$ in the form $C = U D U^T$, where D is diagonal and U is a matrix corresponding to a rotation.
This can be done using SVD (see Lecture 8) or eigenvalue decomposition.
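For a 2 x 2 covariance matrix the factorisation into a rotation and a diagonal matrix can be written in closed form, which makes the idea easy to check by hand. The sketch below uses that closed form rather than the SVD or general eigenvalue routines mentioned on the slide. The function names and the example matrix are hypothetical.

```python
import math

def diagonalise_2x2(C):
    """Return (theta, U, D) with C = U D U^T for a symmetric 2x2 matrix C."""
    cxx, cxy, cyy = C[0][0], C[0][1], C[1][1]
    # Rotation angle that makes the off-diagonal term vanish
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    c, s = math.cos(theta), math.sin(theta)
    U = [[c, -s], [s, c]]   # columns are e1 = (c, s) and e2 = (-s, c)
    # Variances along e1 and e2
    d11 = cxx * c * c + 2.0 * cxy * s * c + cyy * s * s
    d22 = cxx * s * s - 2.0 * cxy * s * c + cyy * c * c
    D = [[d11, 0.0], [0.0, d22]]
    return theta, U, D

def reconstruct(U, D):
    """Multiply out U D U^T for a diagonal D."""
    return [[sum(U[i][k] * D[k][k] * U[j][k] for k in range(2))
             for j in range(2)] for i in range(2)]

# Hypothetical covariance matrix, invented for this example
C = [[2.0, 1.0], [1.0, 2.0]]
theta, U, D = diagonalise_2x2(C)
C_back = reconstruct(U, D)
print(theta, D[0][0], D[1][1])
```

For this matrix the angle is $\pi/4$ and the diagonal entries are approximately 3 and 1. With this choice of angle the first diagonal entry is always the larger one. In more than two dimensions a library eigenvalue or SVD routine would be the usual choice.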
Slide 13: PCA continued
U implements a rotation through an angle $\theta$.
$e_1$ is the first column of U; $d_{11}$ is the variance in the direction $e_1$.
$e_2$ is the second column of U; $d_{22}$ is the variance in the direction $e_2$.
[Figure: the data with the directions $e_1$ and $e_2$ marked]
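To illustrate the roles of $e_1$, $e_2$, $d_{11}$ and $d_{22}$, the sketch below expresses zero-mean 2-D data in the coordinates defined by the two columns of U and checks that the covariance between the new coordinates vanishes. It assumes $1/T$ normalisation and uses the closed-form 2-D rotation angle. The data and helper names are hypothetical.

```python
import math

def centre(samples):
    """Subtract the sample mean from every data point."""
    T = len(samples)
    N = len(samples[0])
    mu = [sum(x[n] for x in samples) / T for n in range(N)]
    return [tuple(x[n] - mu[n] for n in range(N)) for x in samples]

def covariance_matrix(samples):
    """Sample covariance matrix with 1/T normalisation (an assumption)."""
    z = centre(samples)
    T = len(z)
    N = len(z[0])
    return [[sum(x[m] * x[n] for x in z) / T for n in range(N)]
            for m in range(N)]

# Hypothetical 2-D data, invented for this example
X = [(2.0, 1.0), (3.0, 5.0), (5.0, 4.0), (6.0, 8.0), (9.0, 7.0)]
C = covariance_matrix(X)

# Rotation angle and the two columns of U
theta = 0.5 * math.atan2(2.0 * C[0][1], C[0][0] - C[1][1])
e1 = (math.cos(theta), math.sin(theta))
e2 = (-math.sin(theta), math.cos(theta))

# Coordinates of each zero-mean point along e1 and e2, i.e. U^T x_t
Y = [(x[0] * e1[0] + x[1] * e1[1], x[0] * e2[0] + x[1] * e2[1])
     for x in centre(X)]

C_rot = covariance_matrix(Y)
d11, d22 = C_rot[0][0], C_rot[1][1]   # variances along e1 and e2
print(C_rot[0][1])                    # close to zero: covariance removed
print(d11, d22)
```

The total variance is unchanged by the rotation, so `d11 + d22` equals the sum of the diagonal entries of `C` up to rounding.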
Slide 14: Summary
Basic data analysis
Means, variance and covariance
Principal Components Analysis