1
Feature Selection Analysis
An attempt at a generalized relationship between sample size and dimensionality
Project for 9.520
Nathan Eagle
2
Motivation
The expense of collecting and labeling additional sample data
How much training data is really necessary?
3
Empirical Evidence – “s-curve”
SVMFu classifier on Sayan’s feature selection technique
4
Empirical Evidence – linearity
An approximately linear relationship between required training samples and the number of dimensions (see the illustrative sketch below)
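Below is a minimal, hypothetical sketch of this kind of experiment. It uses scikit-learn’s LinearSVC as a stand-in for the SVMFu classifier and plain Gaussian data in place of Sayan’s feature selection setup; every parameter (class separation, target accuracy, grid step) is an assumption for illustration, not the original protocol.

```python
# Illustrative sketch (NOT the original SVMFu experiment): estimate the
# training-set size needed to reach a target accuracy as the number of
# irrelevant features grows. All parameter values are assumptions.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

def make_data(n, n_irrelevant, sep=3.0):
    """Two Gaussian classes separated along one relevant feature,
    padded with n_irrelevant pure-noise dimensions."""
    y = rng.integers(0, 2, size=n)
    relevant = rng.normal(loc=sep * y, scale=1.0)[:, None]
    noise = rng.normal(size=(n, n_irrelevant))
    return np.hstack([relevant, noise]), y

def samples_needed(n_irrelevant, target_acc=0.9, step=20, max_n=2000):
    """Smallest training-set size (on a coarse grid) whose mean test
    accuracy over a few runs reaches target_acc."""
    X_test, y_test = make_data(2000, n_irrelevant)
    for n in range(step, max_n + 1, step):
        accs = [
            LinearSVC(max_iter=5000)
            .fit(*make_data(n, n_irrelevant))
            .score(X_test, y_test)
            for _ in range(5)
        ]
        if np.mean(accs) >= target_acc:
            return n
    return max_n

# Required sample size should grow roughly linearly with the number of
# irrelevant dimensions (cf. the bound derived on the following slides).
for d in (10, 50, 100, 200):
    print(f"{d:4d} irrelevant features -> ~{samples_needed(d)} samples")
```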
5
Proof I – Hypothesis Testing
(1) (2) But what are the priors, p_feat? And what if there is more than one relevant feature? (A plausible reconstruction of (1) and (2) follows below.)
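Equations (1) and (2) did not survive the transcript. One plausible form the per-feature hypothesis test and its prior-dependent posterior could take; all notation below is assumed, not the original:

```latex
% Hedged reconstruction, not the original slide's equations.
% H_0: feature j is irrelevant (equal class-conditional means); H_1: relevant.
\begin{align}
  H_0 &: \mu_{j,+} = \mu_{j,-}, \qquad H_1 : \mu_{j,+} \neq \mu_{j,-} \tag{1}\\
  P(\text{relevant} \mid \text{data}) &=
    \frac{P(\text{data} \mid \text{relevant})\, p_{\text{feat}}}
         {P(\text{data} \mid \text{relevant})\, p_{\text{feat}}
          + P(\text{data} \mid \text{irrelevant})\,(1 - p_{\text{feat}})} \tag{2}
\end{align}
```

The slide’s own questions point at the weakness of this route: the posterior in (2) hinges on the prior p_feat, and a single test per feature does not obviously compose when several features are relevant.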
6
Proof II – Chebyshev and Weak Law of Large Numbers
(3)
From the W.L.L.N.: (4)
From Chebyshev’s inequality: (5)
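The equations themselves are missing from the transcript; these are the standard statements the slide names, for i.i.d. samples X_1, …, X_n with mean μ and variance σ². The setup in (3) is an assumption:

```latex
% Standard statements assumed for the missing equations (3)-(5).
\begin{align}
  \bar{X}_n &= \frac{1}{n}\sum_{i=1}^{n} X_i \tag{3}\\
  \lim_{n\to\infty} P\!\left(\lvert \bar{X}_n - \mu \rvert \ge \epsilon\right)
    &= 0 \quad \text{for all } \epsilon > 0
    \quad \text{(W.L.L.N.)} \tag{4}\\
  P\!\left(\lvert \bar{X}_n - \mu \rvert \ge \epsilon\right)
    &\le \frac{\sigma^2}{n\,\epsilon^2}
    \quad \text{(Chebyshev)} \tag{5}
\end{align}
```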
7
Proof II (cont)
From before: (6)
Inverting the probability: (7)
For all features: (8)
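A plausible reconstruction of the missing steps (6)–(8), assuming “inverting the probability” means taking the complement of Chebyshev’s bound and “for all features” means a union bound over the d irrelevant features:

```latex
% Hedged reconstruction of (6)-(8): complement of the Chebyshev bound,
% then a union bound across all d irrelevant features.
\begin{align}
  P\!\left(\lvert \bar{X}_n - \mu \rvert \ge \epsilon\right)
    &\le \frac{\sigma^2}{n\,\epsilon^2} \tag{6}\\
  P\!\left(\lvert \bar{X}_n - \mu \rvert < \epsilon\right)
    &\ge 1 - \frac{\sigma^2}{n\,\epsilon^2} \tag{7}\\
  P\!\left(\text{all } d \text{ feature means within } \epsilon\right)
    &\ge 1 - \frac{d\,\sigma^2}{n\,\epsilon^2} \tag{8}
\end{align}
```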
8
Proof II (cont)
From before, setting: (9)
[Plot: Sample Size vs. Dimensions – Training Sample Size against Irrelevant Dimensions/Features]
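Under the reconstruction above, setting the bound in (8) to a desired confidence 1 − δ and solving for n yields the linear relationship the plot depicts:

```latex
% Requiring confidence 1 - delta and solving for n: the required sample
% size grows linearly in the number of irrelevant dimensions d.
\begin{equation}
  1 - \frac{d\,\sigma^2}{n\,\epsilon^2} \;\ge\; 1 - \delta
  \quad\Longleftrightarrow\quad
  n \;\ge\; \frac{\sigma^2}{\epsilon^2\,\delta}\, d \tag{9}
\end{equation}
```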
9
Proof III – Sayan’s Generalization Error Algorithm
Generalization error for two classes drawn from Gaussian distributions:* (10)
where the separating hyperplane is defined as: (11)
(Fisher Linear Discriminant)
* As proved in Sayan Mukherjee’s PhD thesis.
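Equations (10) and (11) are missing from the transcript. The standard forms consistent with the slide’s description, assumed here rather than taken from the thesis: the error for two equal-covariance Gaussians (Φ is the standard normal CDF, Δ the Mahalanobis distance between the class means) and the Fisher Linear Discriminant hyperplane:

```latex
% Standard forms assumed for the missing equations (10)-(11).
\begin{align}
  P(\text{error}) &= \Phi\!\left(-\tfrac{\Delta}{2}\right),
  \qquad \Delta = \sqrt{(\mu_1 - \mu_2)^{\top}\, \Sigma^{-1}\, (\mu_1 - \mu_2)}
  \tag{10}\\
  f(x) &= w^{\top} x + b,
  \qquad w \propto \Sigma^{-1}(\mu_1 - \mu_2) \tag{11}
\end{align}
```

Adding irrelevant dimensions inflates the estimation error in Σ and the means without increasing Δ, which is what ties this error expression back to the sample-size question.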
10
Results
[Plots: theoretical vs. empirical sample-size curves, after 1 iteration and after 50 iterations]
11
Conclusions
Sample size appears to scale linearly with the number of irrelevant features, both empirically and theoretically, regardless of the classifier.
The “s-curve” does not seem to be a general property of all feature selection methods.