Chapter 5 Part II 5.3 Spread of Data 5.4 Fisher Discriminant.

1 Chapter 5 Part II 5.3 Spread of Data 5.4 Fisher Discriminant

2 Measuring the spread of data Covariance of two random variables x and y – the expectation of their product once each is centred: cov(x, y) = E[(x − μ_x)(y − μ_y)] x and y need to be standardized if they use different units of measurement
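As a concrete illustration (the sample values are made up), covariance and its standardized form can be computed directly with NumPy:

```python
import numpy as np

# Hypothetical paired measurements.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 5.0, 9.0])

# Covariance: expectation of the product of the centred variables.
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))   # cov_xy == 2.75

# Standardizing first (z-scores) makes the result unit-free; this is
# exactly the correlation coefficient of the next slide.
zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()
corr_xy = np.mean(zx * zy)
```

Standardizing matters whenever x and y use different units: covariance scales with each variable, while the correlation is always in [−1, 1].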

3 Correlation The covariance of standardized x and y measures their correlation: ρ(x, y) = cov(x, y) / (σ_x σ_y) This treats the coordinates independently – in a kernel-induced feature space we don’t have access to the coordinates

4 Spread in the Feature Space Consider the l × N matrix X whose rows are the training points Assuming zero mean, the covariance matrix is C = (1/l) X'X

5 Spread in the Feature Space Observe that X'X = Σ_i x_i x_i' Consider a unit vector v; then the value of the projection of a point x onto v is P_v(x) = x'v

6 Spread in the Feature Space Variance of the norms of the projections onto v: σ_v² = (1/l) Σ_i (x_i'v)² = (1/l) ‖Xv‖² = v'Cv Where C = (1/l) X'X
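A quick numerical check of this identity – a sketch with random centred data and an arbitrary direction v, not code from the book:

```python
import numpy as np

rng = np.random.default_rng(0)
l, N = 200, 3                      # l samples, N features
X = rng.normal(size=(l, N))
X = X - X.mean(axis=0)             # centre so the zero-mean assumption holds

C = (X.T @ X) / l                  # covariance matrix C = X'X / l

v = np.array([1.0, 2.0, -1.0])
v = v / np.linalg.norm(v)          # unit direction

# Variance of the projections x_i'v, two equivalent ways:
proj = X @ v
var_direct = np.mean(proj ** 2)    # (1/l) sum_i (x_i'v)^2
var_quadratic = v @ C @ v          # v'Cv
```

The two quantities agree to machine precision, which is the point of the slide: C alone suffices to get the variance along any direction.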

7 Spread in the Feature Space So the covariance matrix contains everything needed to calculate the variance of the data along any projection direction If the data is not centred, subtract the square of the mean projection: σ_v² = (1/l) ‖Xv‖² − ((1/l) j'Xv)², where j is the all-1s vector

8 Variance of Projections Variance of projections onto a fixed direction v in feature space using only inner products Write v as a linear combination of the training points, v = X'α Then: v'Cv = (1/l) α'XX'XX'α = (1/l) α'K²α, where K = XX' is the kernel matrix
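The primal and dual computations can be compared numerically. This sketch uses the linear kernel K = XX' and random dual coefficients α (both illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
l, N = 150, 4
X = rng.normal(size=(l, N))
X = X - X.mean(axis=0)            # centred data

K = X @ X.T                       # kernel matrix for the linear kernel

alpha = rng.normal(size=l)        # dual coefficients: v = X' alpha
v = X.T @ alpha

# Primal computation: v'Cv with C = X'X / l
C = (X.T @ X) / l
var_primal = v @ C @ v

# Dual computation: touches only the kernel matrix, never the coordinates
var_dual = (alpha @ K @ K @ alpha) / l
```

The dual form is what makes the calculation possible in a kernel-induced feature space, where the coordinates of v and the x_i are never available explicitly.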

9 Now that we can compute the variance of projections in feature space, we can implement a linear classifier: the Fisher discriminant

10 Fisher Discriminant Classification function: f(x) = sign(w'φ(x) + b) Where w is chosen to maximize the ratio of the squared difference of the projected class means to the sum of the projected class variances: J(w) = (μ_w^+ − μ_w^-)² / ((σ_w^+)² + (σ_w^-)²)
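The quotient J(w) can be evaluated directly for any candidate direction. A sketch on hypothetical two-class data, which also verifies the rescaling invariance noted on the next slide:

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical two-class data: positives shifted away from negatives.
X_pos = rng.normal(loc=1.0, size=(60, 2))
X_neg = rng.normal(loc=-1.0, size=(80, 2))

def fisher_quotient(w, X_pos, X_neg):
    """Squared difference of projected class means divided by the
    sum of the projected class variances."""
    p_pos = X_pos @ w
    p_neg = X_neg @ w
    between = (p_pos.mean() - p_neg.mean()) ** 2
    within = p_pos.var() + p_neg.var()
    return between / within

# J(w) is invariant to rescalings of w:
w = np.array([1.0, 0.5])
j1 = fisher_quotient(w, X_pos, X_neg)
j2 = fisher_quotient(10.0 * w, X_pos, X_neg)
```

Both numerator and denominator are quadratic in w, so any rescaling cancels; this is why a fixed denominator value can be imposed when optimizing.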

11 Regularized Fisher discriminant Choose w to solve max_w (μ_w^+ − μ_w^-)² / ((σ_w^+)² + (σ_w^-)² + λ‖w‖²) The quotient is invariant to rescalings of w – so use a fixed value C for the denominator Using a Lagrange multiplier ν, the solution is found by maximizing the numerator subject to this constraint

12 Regularized Fisher discriminant We then have Where – y is the vector of labels {-1, +1} – I^+ (I^-) is the identity matrix with only the positive (negative) columns containing 1s – j^+ (j^-) is the all-1s vector, restricted similarly to the positive (negative) entries

13 Regularized Fisher discriminant Furthermore, let – where D is a diagonal matrix – and where C^+, C^- are given by

14 Regularized Fisher discriminant Then – with appropriate redefinitions of ν, λ and C – taking derivatives with respect to w produces

15 Dual expression of w We can express w in feature space as a linear combination of the training samples, w = X'α, with – Substituting for w produces – Giving Since the quotient is invariant to rescalings of w, we can rescale α by ν to obtain

16 Regularized kernel Fisher discriminant Solution given by Classification function: f(x) = sign(Σ_i α_i k(x_i, x) − b) – Where k is the vector with entries k(x, x_i), i = 1, …, l – And b is chosen so that w'μ^+ − b = b − w'μ^-, i.e. b lies midway between the projections of the two class means
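Pulling the pieces together, here is a minimal, self-contained kernel Fisher discriminant sketch. It uses the standard dual formulation (separate the projected class means against the regularized within-class scatter) rather than reproducing the slides' exact matrix algebra; the Gaussian kernel, the regularization constant lam, and all data are illustrative assumptions:

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian kernel matrix between row sets A and B (assumed kernel)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_fisher_train(X, y, gamma=0.5, lam=1e-3):
    """Minimal kernel Fisher discriminant sketch: solve for dual
    coefficients alpha maximizing between-class separation over the
    lam-regularized within-class scatter."""
    K = rbf_kernel(X, X, gamma)
    pos, neg = (y == 1), (y == -1)
    m_pos = K[:, pos].mean(axis=1)        # dual image of the positive mean
    m_neg = K[:, neg].mean(axis=1)        # dual image of the negative mean
    # Within-class scatter in the dual: N = sum_c K_c (I - 1/l_c) K_c'
    N = np.zeros((len(y), len(y)))
    for mask in (pos, neg):
        lc = int(mask.sum())
        Kc = K[:, mask]
        N += Kc @ (np.eye(lc) - np.full((lc, lc), 1.0 / lc)) @ Kc.T
    alpha = np.linalg.solve(N + lam * np.eye(len(y)), m_pos - m_neg)
    # b midway between the projected class means, as on this slide.
    b = 0.5 * (alpha @ m_pos + alpha @ m_neg)
    return alpha, b

def kernel_fisher_predict(alpha, b, X_train, X_test, gamma=0.5):
    """Classification function f(x) = sign(sum_i alpha_i k(x_i, x) - b)."""
    return np.sign(rbf_kernel(X_test, X_train, gamma) @ alpha - b)
```

Note that training and prediction use only kernel evaluations, never explicit feature-space coordinates, which is the point of the dual derivation in slides 15–17.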

17 Regularized kernel Fisher discriminant Taking w = X'α, we have – where

