Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology
Information Loss of the Mahalanobis Distance in High Dimensions: Application to Feature Selection
Presenter: Chien-Hsing Chen
Authors: Dimitrios Ververidis, Constantine Kotropoulos
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2009
Outline
- Motivation
- Objective
- Method
- Experiments
- Conclusion
- Comment
Motivation
- Calculate the information loss for a dataset when estimating the distance between a point and a class center, P(x_i | U_c), via the total variance.
- Estimate the total variance for the k datasets generated by the k-fold cross-validation approach.
- Estimate the total variance for a dataset sampled randomly with resubstitution.
Objective
- Estimate the distance between a point and a class center, P(x_i | U_c), via the total variance of the Mahalanobis distance.
- Estimate the total variance both for the k datasets generated by the k-fold cross-validation approach and for resubstitution.
- Such an information-loss measure shows that the information loss of k-fold cross-validation is larger than that of resubstitution, and it can guide feature selection.
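The Mahalanobis distance between a point and a class center, used above to estimate P(x_i | U_c), can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation; the function name is hypothetical.

```python
import numpy as np

def mahalanobis_to_center(X, class_samples):
    """Mahalanobis distance from each row of X to the class center.

    The center and covariance matrix are estimated from class_samples.
    """
    mu = class_samples.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(class_samples, rowvar=False))
    diff = X - mu
    # d(x) = sqrt((x - mu)^T Sigma^{-1} (x - mu))
    return np.sqrt(np.einsum('ij,jk,ik->i', diff, cov_inv, diff))
```

In high dimensions the sample covariance estimate degrades, which is the source of the information loss the paper quantifies.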
Idea
[Diagram] The relationship between a dataset, k-fold cross-validation, and resubstitution: if the dataset incurs overall information loss a, then k-fold cross-validation incurs 2a and resubstitution incurs a/2. This relationship is proven in the paper.
Information loss with the Mahalanobis distance
[Figure comparing a single dataset with k-fold cross-validation or resubstitution]
Information loss with the Mahalanobis distance: a dataset vs. k-fold cross-validation
[Figure] Since S_1 + S_2 = S_3 + S_2 = 1, it follows that S_1 = S_3, and hence S_1 + S_3 = 2S_3.
Information loss with the Mahalanobis distance: a dataset vs. resubstitution
[Figure] Since S_1 + S_2 + S_4 = S_3 + S_2 = 1, it follows that S_1 + S_4 = S_3.
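The two estimation schemes compared above can be sketched numerically. This is a hedged illustration under a simple setup I am assuming (function names are mine, not the paper's): the resubstitution estimate scores every point with parameters fitted on the full dataset, while the k-fold estimate scores each fold with parameters fitted on the remaining folds.

```python
import numpy as np

def mahalanobis_sq(X, ref):
    """Squared Mahalanobis distance from rows of X to the center of ref."""
    mu = ref.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(ref, rowvar=False))
    diff = X - mu
    return np.einsum('ij,jk,ik->i', diff, cov_inv, diff)

def total_variance_resubstitution(X):
    # Center and covariance are estimated on the same data they score.
    return mahalanobis_sq(X, X).var()

def total_variance_kfold(X, k=5, seed=0):
    # Each fold is scored with the center/covariance of the remaining folds.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    folds = np.array_split(idx, k)
    d = np.concatenate(
        [mahalanobis_sq(X[f], X[np.setdiff1d(idx, f)]) for f in folds])
    return d.var()
```

The held-out distances in the k-fold scheme carry extra estimation noise, which is consistent with the paper's claim that cross-validation incurs the larger information loss.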
Experimental results
- The information loss of k-fold cross-validation is larger than that of resubstitution.
Feature selection with k-fold cross-validation (CCR: correct classification rate)
- Sequential forward selection picks education, age, address (average CCR over b = 1, …, B repetitions).
- The better the feature, the lower the information loss.
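The sequential-forward-selection loop described above can be sketched as follows. This is a generic SFS illustration, not the authors' code; `ccr_nearest_mean` is a hypothetical two-class scorer standing in for the cross-validated CCR.

```python
import numpy as np

def ccr_nearest_mean(Xs, y):
    """Correct classification rate of a nearest-class-mean rule (two classes)."""
    mu0, mu1 = Xs[y == 0].mean(axis=0), Xs[y == 1].mean(axis=0)
    pred = (np.linalg.norm(Xs - mu1, axis=1)
            < np.linalg.norm(Xs - mu0, axis=1)).astype(int)
    return (pred == y).mean()

def sequential_forward_selection(X, y, score, n_features):
    """Greedy SFS: repeatedly add the feature that maximizes `score`."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < n_features:
        best = max(remaining, key=lambda j: score(X[:, selected + [j]], y))
        selected.append(best)
        remaining.remove(best)
    return selected
```

At each step the candidate subsets grow by one feature, so the search is greedy and never revisits a choice; the floating variant (SFFS) mentioned in the experiments additionally allows backward removal steps.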
Experimental results
- Comparison of sequential forward selection (SFS), sequential floating forward selection (SFFS), and ReliefF.
Conclusion
- The correct classification rate is estimated by cross-validation.
- A feature-subset selection method is derived from the information-loss analysis.
My Comment
- Advantage: the idea is simple and formally proven.
- Disadvantage: the proofs are beyond my current knowledge.
- Application: dimension reduction with the help of reference class labels.