1
Generalizing Linear Discriminant Analysis
2
Linear Discriminant Analysis Objective: -Project a feature space (a dataset of n-dimensional samples) onto a smaller subspace -Maintain the class separation Reason: -Reduce computational costs -Minimize overfitting
3
Linear Discriminant Analysis Want to reduce dimensionality while preserving the ability to discriminate between classes Figures from [1]
4
Linear Discriminant Analysis Could just look at means and find dimension that separates means most: Equation from [1]
5
Linear Discriminant Analysis Could just look at means and find dimension that separates means most: Equations from [1]
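The equations referenced here did not survive the export. As a hedged reconstruction, the standard "separate the projected means" formulation (presumably what [1] shows on this slide) is:

```latex
% Project each sample x onto a line in direction w
y = w^{\top} x

% Class means before and after projection (N_i samples in class \omega_i)
m_i = \frac{1}{N_i} \sum_{x \in \omega_i} x,
\qquad
\tilde{m}_i = w^{\top} m_i

% Naive objective: choose w to push the projected means apart
\max_{w}\; J(w) = \lvert \tilde{m}_1 - \tilde{m}_2 \rvert
                = \lvert w^{\top}(m_1 - m_2) \rvert,
\qquad \text{subject to } \lVert w \rVert = 1
```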
6
Linear Discriminant Analysis Figure from [1]
7
Linear Discriminant Analysis Fisher’s solution.
8
Linear Discriminant Analysis Fisher’s solution… Scatter: Equation from [1]
9
Linear Discriminant Analysis Fisher’s solution… Scatter: Maximize: Equations from [1]
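The scatter and maximization equations are likewise missing from the export; the standard Fisher formulation they correspond to (cf. [1]) measures the scatter of the projected samples within each class and maximizes the ratio of projected-mean separation to total within-class scatter:

```latex
% Scatter of the projected samples of class \omega_i around its projected mean
\tilde{s}_i^{\,2} = \sum_{y \in \omega_i} (y - \tilde{m}_i)^2

% Fisher criterion: large mean separation, small within-class scatter
J(w) = \frac{(\tilde{m}_1 - \tilde{m}_2)^2}{\tilde{s}_1^{\,2} + \tilde{s}_2^{\,2}}
```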
10
Linear Discriminant Analysis Fisher’s solution… Figure from [1]
11
Linear Discriminant Analysis How to get optimum w*?
12
Linear Discriminant Analysis How to get optimum w*? ◦Must express J(w) as a function of w. Equation from [1]
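Expressing J(w) explicitly as a function of w uses the within-class and between-class scatter matrices in the original feature space. A hedged reconstruction of the standard steps (cf. [1]):

```latex
% Per-class and within-class scatter matrices
S_i = \sum_{x \in \omega_i} (x - m_i)(x - m_i)^{\top},
\qquad
S_W = S_1 + S_2

% Between-class scatter matrix
S_B = (m_1 - m_2)(m_1 - m_2)^{\top}

% The criterion becomes a generalized Rayleigh quotient
J(w) = \frac{w^{\top} S_B\, w}{w^{\top} S_W\, w}
```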
13
Linear Discriminant Analysis How to get optimum w*… Equation from [1]
14
Linear Discriminant Analysis How to get optimum w*… Equations modified from [1]
15
Linear Discriminant Analysis How to get optimum w*… Equation from [1]
16
Linear Discriminant Analysis How to get optimum w*… Equation from [1]
17
Linear Discriminant Analysis How to get optimum w*… Equations from [1]
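The closed-form result of this derivation (the standard two-class Fisher solution; the slide equations themselves are not recoverable) follows from setting the derivative of the Rayleigh quotient to zero:

```latex
% Stationarity of J(w) gives a generalized eigenvalue problem
S_B\, w = \lambda\, S_W\, w

% Because S_B w always points in the direction of (m_1 - m_2),
% the two-class solution can be written directly as
w^{*} = S_W^{-1}(m_1 - m_2)
```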
18
Linear Discriminant Analysis How to generalize for >2 classes: -Instead of a single projection, we calculate a matrix of projections.
19
Linear Discriminant Analysis How to generalize for >2 classes: -Instead of a single projection, we calculate a matrix of projections. -Within-class scatter becomes: -Between-class scatter becomes: Equations from [1]
20
Linear Discriminant Analysis How to generalize for >2 classes… Here, W is a projection matrix. Equation from [1]
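For C classes the same quantities generalize as follows (standard multiclass LDA, reconstructed here because the slide equations did not export):

```latex
% Within-class scatter: sum of per-class scatter matrices
S_W = \sum_{i=1}^{C} S_i,
\qquad
S_i = \sum_{x \in \omega_i} (x - m_i)(x - m_i)^{\top}

% Between-class scatter: spread of the class means around the overall mean m
S_B = \sum_{i=1}^{C} N_i\, (m_i - m)(m_i - m)^{\top}

% Objective over a projection matrix W; the optimal columns are the
% leading eigenvectors of S_W^{-1} S_B (at most C-1 of them are useful)
J(W) = \frac{\bigl|\, W^{\top} S_B\, W \,\bigr|}{\bigl|\, W^{\top} S_W\, W \,\bigr|}
```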
21
Linear Discriminant Analysis Limitations of LDA: -Parametric method -Produces at most (C-1) projections Benefits of LDA: -Linear decision boundaries, which ease human interpretation and implementation -Good classification results
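To make the two-class procedure above concrete, here is a minimal NumPy sketch; the function name and the toy data are illustrative assumptions, not taken from [1]:

```python
import numpy as np

def fisher_lda_direction(X1, X2):
    """Two-class Fisher LDA: return w* proportional to S_W^{-1} (m1 - m2)."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter = sum of the per-class scatter matrices
    S1 = (X1 - m1).T @ (X1 - m1)
    S2 = (X2 - m2).T @ (X2 - m2)
    Sw = S1 + S2
    w = np.linalg.solve(Sw, m1 - m2)   # solve S_W w = (m1 - m2)
    return w / np.linalg.norm(w)

# Toy example: two Gaussian clouds in 2-D
rng = np.random.default_rng(0)
X1 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(100, 2))
X2 = rng.normal(loc=[3.0, 2.0], scale=1.0, size=(100, 2))
w = fisher_lda_direction(X1, X2)
print("projection direction:", w)
print("projected class means:", (X1 @ w).mean(), (X2 @ w).mean())
```

Projecting both clouds onto w should leave their means well separated relative to the within-class spread, which is exactly what the Fisher criterion optimizes.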
22
Flexible Discriminant Analysis
23
Flexible Discriminant Analysis -Turns the LDA problem into a linear regression problem.
24
Flexible Discriminant Analysis -Turns the LDA problem into a linear regression problem. -“Differences between LDA and FDA and what criteria can be used to pick one for a given task?” (Tavish)
25
Flexible Discriminant Analysis -Turns the LDA problem into a linear regression problem. -“Differences between LDA and FDA and what criteria can be used to pick one for a given task?” (Tavish) ◦Linear regression can be generalized into more flexible, nonparametric forms of regression. ◦(Parametric methods are described by parameters such as the mean and variance.)
26
Flexible Discriminant Analysis -Turns the LDA problem into a linear regression problem. -“Differences between LDA and FDA and what criteria can be used to pick one for a given task?” (Tavish) ◦Linear regression can be generalized into more flexible, nonparametric forms of regression. ◦(Parametric methods are described by parameters such as the mean and variance.) ◦Expands the set of predictors via basis expansions
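As a rough illustration of "LDA as regression on expanded predictors", here is a simplified NumPy sketch; it shows only indicator-response regression on a quadratic basis expansion, omits the optimal-scoring step of full FDA, and all names and data are illustrative assumptions rather than material from [2]:

```python
import numpy as np

def fda_scores_sketch(X, y, n_classes, degree=2):
    """Regress class-indicator responses on nonlinearly expanded predictors.
    (Simplified; full FDA adds an optimal-scoring step on top of this fit.)"""
    Y = np.eye(n_classes)[y]                   # indicator matrix, one column per class
    # Basis expansion: intercept plus elementwise powers of each predictor
    H = np.hstack([np.ones((len(X), 1))] + [X ** d for d in range(1, degree + 1)])
    B, *_ = np.linalg.lstsq(H, Y, rcond=None)  # least-squares fit per indicator column
    return H @ B                               # fitted scores; LDA is then run on these

# Illustrative call with random data
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = rng.integers(0, 3, size=60)
scores = fda_scores_sketch(X, y, n_classes=3)
print(scores.shape)   # (60, 3)
```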
27
Flexible Discriminant Analysis Figure from [2]
28
Penalized Discriminant Analysis
29
Penalized Discriminant Analysis -Fit an LDA model, but ‘penalize’ the coefficients so that they are smoother. ◦Directly curbs the overfitting problem
30
Penalized Discriminant Analysis -Fit an LDA model, but ‘penalize’ the coefficients so that they are smoother. ◦Directly curbs the overfitting problem -Positively correlated predictors lead to noisy, negatively correlated coefficient estimates, and this noise results in unwanted sampling variance. ◦Example: images, where neighboring pixels are highly correlated
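One common way to write the penalized criterion (following the spirit of [2]; the exact notation on the original slide is not recoverable) replaces the within-class scatter with a roughness-penalized version:

```latex
% \Omega is a penalty matrix encoding spatial smoothness of the coefficients
% (e.g. over image pixels); \lambda controls how strongly smoothness is enforced
J(w) = \frac{w^{\top} S_B\, w}{\,w^{\top} \left( S_W + \lambda\, \Omega \right) w\,}
```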
31
Penalized Discriminant Analysis Images from [2]
32
Mixture Discriminant Analysis
33
Mixture Discriminant Analysis -Instead of enlarging the set of predictors (FDA), smoothing the coefficients for the predictors (PDA), or using a single Gaussian per class:
34
Mixture Discriminant Analysis -Instead of enlarging the set of predictors (FDA), smoothing the coefficients for the predictors (PDA), or using a single Gaussian per class: -Model each class as a mixture of two or more Gaussian components. -All components share the same covariance matrix.
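In symbols, the mixture model described above gives each class k a Gaussian-mixture density whose components share one covariance matrix (notation follows the usual presentation in [2]):

```latex
% R_k components for class k, mixing proportions \pi_{kr} summing to one,
% component means \mu_{kr}, and a covariance \Sigma shared by every component
P(X \mid \text{class } k) = \sum_{r=1}^{R_k} \pi_{kr}\,
    \phi(X;\, \mu_{kr},\, \Sigma),
\qquad \sum_{r=1}^{R_k} \pi_{kr} = 1
```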
35
Mixture Discriminant Analysis Image from [2]
36
Sources
1. Gutierrez-Osuna, Ricardo. “CSCE 666 Pattern Analysis – Lecture 10.” http://research.cs.tamu.edu/prism/lectures/pr/pr_l10.pdf
2. Hastie, Trevor, et al. The Elements of Statistical Learning: Data Mining, Inference, and Prediction.
3. Raschka, Sebastian. “Linear Discriminant Analysis bit by bit.” http://sebastianraschka.com/Articles/2014_python_lda.html
37
END.