Presentation is loading. Please wait.

Presentation is loading. Please wait.

Lecture 3: Causality and Feature Selection

Similar presentations


Presentation on theme: "Lecture 3: Causality and Feature Selection"— Presentation transcript:

1 Lecture 3: Causality and Feature Selection
Isabelle Guyon

2 Variable/feature selection
Y Prediction performance is assessed with a given risk functional, not depicted. X Remove features Xi to improve (or least degrade) prediction of Y.

3 What can go wrong? Guyon-Aliferis-Elisseeff, 2007
Example of NL classifier: SVM Guyon-Aliferis-Elisseeff, 2007

4 What can go wrong?

5 What can go wrong? Y X2 Y X2 X1 X1 Guyon-Aliferis-Elisseeff, 2007
Example of NL classifier: SVM Y X1 X2 X2 Y X1 Guyon-Aliferis-Elisseeff, 2007

6 Causal feature selection
Y X Y Prediction performance is assessed with a given risk functional, not depicted. Uncover causal relationships between Xi and Y.

7 Causal feature relevance
Lung cancer

8 Causal feature relevance
Lung cancer

9 Causal feature relevance
Lung cancer

10 Markov Blanket Lung cancer
Strongly relevant features (Kohavi-John, 1997)  Markov Blanket (Tsamardinos-Aliferis, 2003)

11 Feature relevance Surely irrelevant feature Xi:
P(Xi, Y |S\i) = P(Xi |S\i)P(Y |S\i) for all S\i  X\i and all assignment of values to S\i Strongly relevant feature Xi: P(Xi, Y |X\i)  P(Xi |X\i)P(Y |X\i) for some assignment of values to X\i Weakly relevant feature Xi: P(Xi, Y |S\i)  P(Xi |S\i)P(Y |S\i) for some assignment of values to S\i  X\i

12 Markov Blanket Lung cancer
Strongly relevant features (Kohavi-John, 1997)  Markov Blanket (Tsamardinos-Aliferis, 2003)

13 Markov Blanket PARENTS Lung cancer
Strongly relevant features (Kohavi-John, 1997)  Markov Blanket (Tsamardinos-Aliferis, 2003)

14 Markov Blanket Lung cancer CHILDREN
Strongly relevant features (Kohavi-John, 1997)  Markov Blanket (Tsamardinos-Aliferis, 2003)

15 Markov Blanket SPOUSES Lung cancer
Strongly relevant features (Kohavi-John, 1997)  Markov Blanket (Tsamardinos-Aliferis, 2003)

16 Causal relevance Surely irrelevant feature Xi:
P(Xi, Y |S\i) = P(Xi |S\i)P(Y |S\i) for all S\i  X\i and all assignment of values to S\i Causally relevant feature Xi: P(Xi,Y|do(S\i))  P(Xi |do(S\i))P(Y|do(S\i)) for some assignment of values to S\i Weak/strong causal relevance: Weak=ancestors, indirect causes Strong=parents, direct causes.

17 Examples Lung cancer

18 Immediate causes (parents)
Genetic factor1 Smoking Lung cancer

19 Non-immediate causes (other ancestors)
Smoking Anxiety Lung cancer

20 Non causes (e.g. siblings)
Genetic factor1 Other cancers Lung cancer

21 X || Y | C CHAIN FORK C C X X Y

22 Hidden more direct cause
Smoking Tar in lungs Anxiety Lung cancer

23 Confounder Smoking Genetic factor2 Lung cancer

24 Immediate consequences (children)
Lung cancer Metastasis Coughing Biomarker1

25 X || Y but X || Y | C Lung cancer X X C C C X
Strongly relevant features (Kohavi-John, 1997)  Markov Blanket (Tsamardinos-Aliferis, 2003)

26 Non relevant spouse (artifact)
Lung cancer Bio-marker2 Biomarker1

27 Another case of confounder
Lung cancer Bio-marker2 Systematic noise Biomarker1

28 Truly relevant spouse Lung cancer Allergy Coughing

29 Sampling bias Lung cancer Metastasis Hormonal factor

30 Causal feature relevance
Genetic factor1 Smoking Genetic factor2 Tar in lungs Other cancers Anxiety Lung cancer Bio-marker2 Systematic noise Allergy Metastasis Coughing (b) Biomarker1 Hormonal factor

31 Conclusion Feature selection focuses on uncovering subsets of variables X1, X2, … predictive of the target Y. Multivariate feature selection is in principle more powerful than univariate feature selection, but not always in practice. Taking a closer look at the type of dependencies in terms of causal relationships may help refining the notion of variable relevance.

32 Acknowledgements and references
Feature Extraction, Foundations and Applications I. Guyon et al, Eds. Springer, 2006. 2) Causal feature selection I. Guyon, C. Aliferis, A. Elisseeff In “Computational Methods of Feature Selection”, Huan Liu and Hiroshi Motoda Eds., Chapman and Hall/CRC Press, 2007.


Download ppt "Lecture 3: Causality and Feature Selection"

Similar presentations


Ads by Google