Linear Discriminant Analysis

Linear Discriminant Analysis. Debapriyo Majumdar, Data Mining – Fall 2014, Indian Statistical Institute Kolkata, August 28, 2014.

The house-ownership data: can we separate the points with a line? Equivalently, can we project the points onto another line so that the projections of the points in the two classes are separated?

Linear Discriminant Analysis (LDA). Not the same as Latent Dirichlet Allocation (also abbreviated LDA). Goal: reduce dimensionality while preserving as much class-discriminatory information as possible. [Figures: a projection with non-ideal separation versus a projection with ideal separation; the figures are from Ricardo Gutierrez-Osuna's slides.]

Projection onto a line – basics. Let w = (1, 0) be a 1×2 unit vector (norm = 1) representing the x axis, and let a 2×2 matrix hold two data points, (0.5, 0.7) and (1.1, 0.8), as its rows. Projecting the points onto the x axis gives their distances from the origin along x: 0.5 and 1.1. Projecting onto the y axis, with w = (0, 1), gives the distances 0.7 and 0.8.
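A minimal numpy sketch of this computation, using the two data points from the slide (the variable names are my own):

```python
import numpy as np

w_x = np.array([1.0, 0.0])       # 1x2 unit vector (norm = 1) representing the x axis
X = np.array([[0.5, 0.7],        # 2x2 matrix: one data point per row
              [1.1, 0.8]])

# Projection onto the x axis: w^T x for each point = distance from the origin along x
print(X @ w_x)                   # [0.5 1.1]

# Projection onto the y axis
w_y = np.array([0.0, 1.0])
print(X @ w_y)                   # [0.7 0.8]
```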

Projection onto a line – basics. Now let w be the 1×2 unit vector (norm = 1) along the x = y line, i.e. w = (1/√2, 1/√2). Projecting the points onto the x = y line again gives their distances from the origin. In general, the distance from the origin of the projection of x onto the line along w is wᵀx, a scalar, where x is any point and w is some unit vector.
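The same sketch for the x = y line; wᵀx gives the distance of each projection from the origin:

```python
import numpy as np

w = np.array([1.0, 1.0]) / np.sqrt(2.0)   # unit vector along the x = y line
X = np.array([[0.5, 0.7],
              [1.1, 0.8]])

# w^T x is a scalar for each point: the distance of its projection from the origin
print(X @ w)                              # approximately [0.849 1.344]
```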

Projection vector for LDA. Define a measure of separation (discrimination). The mean vectors μ1 and μ2 of the two classes c1 and c2, with N1 and N2 points respectively, are μi = (1/Ni) Σ_{x ∈ ci} x. The mean vector projected onto a unit vector w is the scalar wᵀμi.
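A small sketch of these two definitions on hypothetical toy data (the two classes below are illustrative, not from the lecture):

```python
import numpy as np

# Hypothetical 2-D toy data: N1 = 3 points of class c1, N2 = 3 points of class c2
X1 = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0]])
X2 = np.array([[6.0, 5.0], [7.0, 8.0], [8.0, 7.0]])

mu1 = X1.mean(axis=0)      # mu_i = (1/N_i) * sum of the points in class c_i
mu2 = X2.mean(axis=0)

w = np.array([1.0, 0.0])   # some unit vector
print(w @ mu1, w @ mu2)    # projected means w^T mu_i: one scalar per class
```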

Towards maximizing separation. One approach: find a line such that the distance between the projected means is maximized, i.e. maximize the objective function J(w) = |wᵀμ1 − wᵀμ2|. Example: compare w being the unit vector along the x axis versus the y axis; one direction gives better separation of the projected means. [Figure: projections of μ1 and μ2 onto two candidate directions, one of which gives better separation of the means.]
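A sketch of this first objective on the toy classes above, comparing the unit vectors along the x and y axes (illustrative values only):

```python
import numpy as np

X1 = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0]])   # same toy classes as above
X2 = np.array([[6.0, 5.0], [7.0, 8.0], [8.0, 7.0]])
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)

def J_means(w):
    # distance between the projected means
    return abs(w @ mu1 - w @ mu2)

print(J_means(np.array([1.0, 0.0])))   # along the x axis: 5.0
print(J_means(np.array([0.0, 1.0])))   # along the y axis: 4.0
```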

How much are the points scattered? Scatter: within each class, the variance of the projected points. The within-class scatter of the projected samples of class ci is si² = Σ_{x ∈ ci} (wᵀx − wᵀμi)². [Figure: the projected points of the two classes spread around their projected means μ1 and μ2.]
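A sketch of the projected within-class scatter on the same toy classes:

```python
import numpy as np

X1 = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0]])   # same toy classes as above
X2 = np.array([[6.0, 5.0], [7.0, 8.0], [8.0, 7.0]])
w = np.array([1.0, 0.0])

def projected_scatter(X, w):
    # s_i^2 = sum over the class of (w^T x - w^T mu_i)^2
    z = X @ w
    return ((z - z.mean()) ** 2).sum()

print(projected_scatter(X1, w), projected_scatter(X2, w))
```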

Fisher’s discriminant. Maximize the difference between the projected means, normalized by the within-class scatter: J(w) = (wᵀμ1 − wᵀμ2)² / (s1² + s2²). [Figure: a direction that separates the projected means μ1 and μ2, and the points as well.]
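Putting the two pieces together, a sketch of Fisher's criterion J(w) on the toy classes (illustrative values only):

```python
import numpy as np

X1 = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0]])   # same toy classes as above
X2 = np.array([[6.0, 5.0], [7.0, 8.0], [8.0, 7.0]])

def fisher_J(w):
    z1, z2 = X1 @ w, X2 @ w
    s1_sq = ((z1 - z1.mean()) ** 2).sum()   # projected scatter of class c1
    s2_sq = ((z2 - z2.mean()) ** 2).sum()   # projected scatter of class c2
    return (z1.mean() - z2.mean()) ** 2 / (s1_sq + s2_sq)

print(fisher_J(np.array([1.0, 0.0])))   # x axis: 6.25
print(fisher_J(np.array([0.0, 1.0])))   # y axis: 3.0
```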

Formulation of the objective function. Measure the scatter in the feature space (x): for each class, Si = Σ_{x ∈ ci} (x − μi)(x − μi)ᵀ. The within-class scatter matrix is SW = S1 + S2. The scatter of the projections, in terms of Si, is si² = wᵀSi w. Hence s1² + s2² = wᵀSW w.
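A sketch verifying, on the toy classes, that the scatter of the projections equals wᵀSW w:

```python
import numpy as np

X1 = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0]])   # same toy classes as above
X2 = np.array([[6.0, 5.0], [7.0, 8.0], [8.0, 7.0]])
w = np.array([1.0, 0.0])

def scatter_matrix(X):
    # S_i = sum over the class of (x - mu_i)(x - mu_i)^T
    D = X - X.mean(axis=0)
    return D.T @ D

S_W = scatter_matrix(X1) + scatter_matrix(X2)          # within-class scatter matrix

z1, z2 = X1 @ w, X2 @ w
s_sum = ((z1 - z1.mean()) ** 2).sum() + ((z2 - z2.mean()) ** 2).sum()
print(s_sum, w @ S_W @ w)                              # the two numbers agree
```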

Formulation of the objective function. Similarly, write the difference of the projected means in terms of the μi's in the feature space: (wᵀμ1 − wᵀμ2)² = wᵀSB w, where SB = (μ1 − μ2)(μ1 − μ2)ᵀ is the between-class scatter matrix. Fisher's objective function in terms of SB and SW: J(w) = (wᵀSB w) / (wᵀSW w).
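A sketch of the matrix form of the objective on the toy classes; it returns the same values as the direct computation of J(w) above:

```python
import numpy as np

X1 = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0]])   # same toy classes as above
X2 = np.array([[6.0, 5.0], [7.0, 8.0], [8.0, 7.0]])
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)

d = (mu1 - mu2).reshape(-1, 1)
S_B = d @ d.T                                          # between-class scatter matrix
S_W = sum((X - X.mean(axis=0)).T @ (X - X.mean(axis=0)) for X in (X1, X2))

def J(w):
    # Fisher's objective: (w^T S_B w) / (w^T S_W w)
    return (w @ S_B @ w) / (w @ S_W @ w)

print(J(np.array([1.0, 0.0])), J(np.array([0.0, 1.0])))   # 6.25, 3.0
```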

Maximizing the objective function. Take the derivative of J(w) with respect to w and set it to zero. Dividing through by the common denominator wᵀSW w gives SB w = J(w) SW w, the generalized eigenvalue problem. For the two-class case this has the closed-form solution w ∝ SW⁻¹(μ1 − μ2).
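A sketch of the solution on the toy classes, using the two-class closed form (equivalent to the generalized eigenvalue problem for two classes):

```python
import numpy as np

X1 = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0]])   # same toy classes as above
X2 = np.array([[6.0, 5.0], [7.0, 8.0], [8.0, 7.0]])
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
S_W = sum((X - X.mean(axis=0)).T @ (X - X.mean(axis=0)) for X in (X1, X2))

# Two-class LDA: the optimal direction is proportional to S_W^{-1} (mu1 - mu2)
w_star = np.linalg.solve(S_W, mu1 - mu2)
w_star /= np.linalg.norm(w_star)                       # make it a unit vector
print(w_star)
```

In practice, scikit-learn's LinearDiscriminantAnalysis computes an equivalent projection via fit/transform; the sketch above only mirrors the derivation on the slides.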

Limitations of LDA. LDA is a parametric method: it assumes a Gaussian (normal) distribution of the data. What if the data is strongly non-Gaussian? [Figures: non-Gaussian class distributions, including cases where the class means coincide, μ1 = μ2.] LDA relies on the means for the discriminatory information. What if that information is mainly in the variance?