1
Measuring the Stability of Feature Selection
Sarah Nogueira and Gavin Brown, School of Computer Science, University of Manchester
2
LOTS OF FEATURES! → Learning Algorithm → Predictive model
Problems: too expensive, curse of dimensionality, lack of interpretability
3
BIG DATA → Feature Selection → FEATURE SET → Learning Algorithm → Predictive model
Stability is the analogous concept to variance for feature sets (recall MSE = bias² + var): the sensitivity of the selected features to small perturbations in the data.
4
Stability: a recent and growing area of research
An indicator of reproducible research. In biomarker identification, stability is said to be as important as predictive power; instability has been a major obstacle to clinical applications.
5
Literature to quantify stability!
Jaccard (2002), POG (2006), Hamming (2007), Kuncheva (2007), Krizek (2007), Dice (2008), nPOG (2009), Lustgarten (2009), CWrel (2010), Wald (2013)
Lots of definitions: some statistical, some heuristic. Conflicting opinions; different use cases.
6
Measuring Stability
DATA → Sample 1, Sample 2, …, Sample M → Feature Selection on each sample → feature sets s1, s2, …, sM
Stability = Φ(s1, …, sM) = 1/(M(M−1)) · Σ_{i≠j} sim(si, sj)
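As a concrete sketch of the definition above (not the authors' code), here is the average pairwise similarity over M feature sets, using Jaccard similarity as one possible choice of sim(si, sj):

```python
# Stability as the average pairwise similarity between the M feature sets.
# Jaccard similarity is used purely as an example of sim(s_i, s_j);
# any pairwise similarity measure fits the same template.
def jaccard(s1, s2):
    """Jaccard similarity: |intersection| / |union|."""
    return len(s1 & s2) / len(s1 | s2)

def stability(feature_sets, sim=jaccard):
    """Phi(s_1, ..., s_M) = 1/(M(M-1)) * sum over i != j of sim(s_i, s_j)."""
    M = len(feature_sets)
    total = sum(sim(feature_sets[i], feature_sets[j])
                for i in range(M) for j in range(M) if i != j)
    return total / (M * (M - 1))

# Hypothetical runs: two identical feature sets and one differing set.
sets = [{1, 2, 3}, {1, 2, 3}, {2, 3, 5}]
print(stability(sets))  # -> 0.666...
```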
7
What properties/behaviours should a measure have, such that its values are interpretable and comparable?
Jaccard (2002), POG (2006), Hamming (2007), Kuncheva (2007), Krizek (2007), Dice (2008), nPOG (2009), Lustgarten (2009), CWrel (2010), Wald (2013)
8
Imagine we had d features… then each selected feature set is a binary string of length d.
Select features 1, 2, 3 → 11100
Select features 3, 5 → 00101
9
Imagine we had d features… then each selected feature set is a binary string of length d.
Select EXACTLY 3 features, M times → an M × d binary matrix (rows: 11100, 10110, 00111, …, 11010).
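The binary-matrix view above can be sketched as follows (hypothetical data: M = 4 runs, d = 5 features, exactly 3 selected per run):

```python
import numpy as np

# Hypothetical selection runs: each row is one feature set encoded as a
# binary string of length d, stacked into an M x d matrix.
Z = np.array([[1, 1, 1, 0, 0],
              [1, 0, 1, 1, 0],
              [0, 0, 1, 1, 1],
              [1, 1, 0, 1, 0]])

print(Z.shape)        # (4, 5): an M x d binary matrix
print(Z.sum(axis=1))  # [3 3 3 3]: exactly 3 features selected per run
```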
10
Desirable property 1: Fully defined
Sometimes select 2 features, sometimes 3 (e.g. rows 11100, 00110, 00111, …, 10010). Not all measures work in this scenario! The measure should be defined for all possible collections of feature sets.
11
Desirable property 2: Bounds
Φ should be bounded by constants, so that its extremes correspond to fully stable and random selection.
12
Desirable property 3: Maximum
With Lustgarten's (2009) measure, two collections A and B of identical feature sets give Φ(A) = 0.6 and Φ(B) = 0.8: all feature sets being identical does not guarantee that Φ reaches its maximum.
13
Desirable property 3: Maximum
With Wald (2013) and CWrel (2010), Φ reaches its maximal value of 1 even though the feature sets are not all identical: Φ reaching its maximum does not guarantee that all feature sets are identical.
14
Desirable property 4: Correction for chance
Random selection (arbitrary binary strings, e.g. 000…00, 11110…000): require E[Φ] = 0 when the selection is random.
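To see why correction for chance matters, here is a small simulation (an assumed setup, not from the talk): an uncorrected measure such as raw Jaccard similarity stays well above 0 even when features are picked uniformly at random, whereas the property asks for an expected value of 0 in that case.

```python
import random

def jaccard(a, b):
    """Jaccard similarity: |intersection| / |union|."""
    return len(a & b) / len(a | b)

random.seed(0)
d, k, M = 100, 10, 50  # d features, k selected per run, M runs

# Purely random selection: k features drawn without replacement, M times.
sets = [frozenset(random.sample(range(d), k)) for _ in range(M)]

# Average raw Jaccard over all unordered pairs of the random feature sets.
pairs = [(i, j) for i in range(M) for j in range(i + 1, M)]
avg = sum(jaccard(sets[i], sets[j]) for i, j in pairs) / len(pairs)
print(avg > 0.0)  # True: positive on average, despite random selection
```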
15
Properties: Fully defined | Bounds | Maximum | Correction for chance
Checking Jaccard (2002), POG (2006), Hamming (2007), Kuncheva (2007), Krizek (2007), Dice (2008), nPOG (2009), Lustgarten (2009), CWrel (2010) and Wald (2013) against these four properties: each measure satisfies some of the properties, but none satisfies all four.
16
Properties: Fully defined | Bounds | Maximum | Correction for chance
Adding Pearson's correlation to the comparison: it satisfies all four properties.
17
Average pairwise Pearson's correlation
Fully stable: Φ = 1. Intermediate: Φ = 0.58. Random selection: Φ = 0.
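A minimal sketch of the Pearson-based measure, assuming the binary-matrix representation of the feature sets (this is an illustration, not the authors' MATLAB implementation):

```python
import numpy as np

def pearson_stability(Z):
    """Average pairwise Pearson correlation between the rows of the
    M x d binary selection matrix Z (one row per feature-selection run)."""
    M = Z.shape[0]
    C = np.corrcoef(Z)  # M x M matrix of pairwise row correlations
    return C[~np.eye(M, dtype=bool)].mean()  # mean of off-diagonal entries

# Fully stable: the same 3 features selected in every one of 4 runs.
stable = np.tile([1, 1, 1, 0, 0], (4, 1))
print(pearson_stability(stable))  # fully stable -> 1.0 (up to floating point)
```

Note that each row needs nonzero variance (i.e. not all features selected, not none) for the pairwise correlations to be defined.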
18
Experiments: use L1-regularized logistic regression with regularizing parameter λ. Can we increase stability without loss of accuracy? Optimal Pareto trade-off: minimal error and maximal stability. Selection of a regularizing parameter that improves stability without loss of predictive power.
19
Conclusions
Increasing stability brings more confidence in the features selected in the model. Pearson's correlation can do the job, having all the desirable properties. Implementation in MATLAB available online at:
20
Thank you