Lecture 3: Causality and Feature Selection


Lecture 3: Causality and Feature Selection Isabelle Guyon isabelle@clopinet.com

Variable/feature selection: remove features Xi to improve (or least degrade) prediction of Y. Prediction performance is assessed with a given risk functional (not depicted).

What can go wrong? (Guyon-Aliferis-Elisseeff, 2007)

[Figures over X1, X2 and Y. Example of NL classifier: SVM.]

Causal feature selection: uncover causal relationships between Xi and Y. Prediction performance is assessed with a given risk functional (not depicted).

Causal feature relevance (target: Lung cancer)

Markov Blanket (target: Lung cancer). Strongly relevant features (Kohavi-John, 1997) ⇔ Markov Blanket (Tsamardinos-Aliferis, 2003).

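Feature relevance
Surely irrelevant feature Xi: $P(X_i, Y \mid S_{\setminus i}) = P(X_i \mid S_{\setminus i})\, P(Y \mid S_{\setminus i})$ for all $S_{\setminus i} \subseteq X_{\setminus i}$ and all assignments of values to $S_{\setminus i}$.
Strongly relevant feature Xi: $P(X_i, Y \mid X_{\setminus i}) \neq P(X_i \mid X_{\setminus i})\, P(Y \mid X_{\setminus i})$ for some assignment of values to $X_{\setminus i}$.
Weakly relevant feature Xi: $P(X_i, Y \mid S_{\setminus i}) \neq P(X_i \mid S_{\setminus i})\, P(Y \mid S_{\setminus i})$ for some assignment of values to some $S_{\setminus i} \subseteq X_{\setminus i}$.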
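The relevance definitions above can be checked by brute force on a small discrete distribution. The sketch below is only an illustration under stated assumptions: the helper names `cond_independent` and `classify`, the five binary variables, and the toy distribution (A, a copy B of A, C, a noise coin N, and Y = A XOR C) are all hypothetical and do not come from the lecture. A feature is reported as weakly relevant only when it is not already strongly relevant.

```python
from itertools import combinations, product

import numpy as np


def cond_independent(joint, i, y, cond):
    """Check P(Xi, Y | S) = P(Xi | S) P(Y | S) for every assignment to S.

    joint: array with one axis per binary variable, entries summing to 1.
    i, y: axes of the feature and the target; cond: axes forming S.
    """
    keep = (i, y) + tuple(cond)
    drop = tuple(a for a in range(joint.ndim) if a not in keep)
    p = joint.sum(axis=drop, keepdims=True) if drop else joint  # P(Xi, Y, S)
    p_s = p.sum(axis=(i, y), keepdims=True)                      # P(S)
    p_is = p.sum(axis=y, keepdims=True)                          # P(Xi, S)
    p_ys = p.sum(axis=i, keepdims=True)                          # P(Y, S)
    # Same condition without dividing by P(S), so zero-probability
    # assignments to S cause no trouble.
    return bool(np.allclose(p * p_s, p_is * p_ys))


def classify(joint, i, y):
    """Label feature i as strongly relevant, weakly relevant or surely irrelevant."""
    others = [a for a in range(joint.ndim) if a not in (i, y)]
    if not cond_independent(joint, i, y, others):
        return "strongly relevant"
    for k in range(len(others) + 1):
        for subset in combinations(others, k):
            if not cond_independent(joint, i, y, subset):
                return "weakly relevant"
    return "surely irrelevant"


# Toy distribution (an assumption): axes are A, B (copy of A), C, N, Y.
names = ["A", "B", "C", "N"]
joint = np.zeros((2,) * 5)
for a, c, n in product((0, 1), repeat=3):
    joint[a, a, c, n, a ^ c] += 1 / 8

labels = {name: classify(joint, axis, 4) for axis, name in enumerate(names)}
for name, label in labels.items():
    print(name, "->", label)
```

In this toy case A and B are redundant copies, so neither is strongly relevant, yet each becomes informative about Y once C is known. C is strongly relevant, and the noise coin N is surely irrelevant. The search over subsets is exponential in the number of features, so this is only practical for very small examples.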

The Markov blanket of the target (Lung cancer) is made up of its PARENTS, its CHILDREN, and its SPOUSES (the other parents of its children).
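A minimal sketch of reading parents, children and spouses off a known directed graph follows. The edge list is an assumption: the transcript only names the nodes, so the directions here are inferred from the slide titles, and Tar in lungs and Genetic factor2 are left out for brevity. The function name `markov_blanket` is hypothetical.

```python
# Hypothetical (cause, effect) edges. The slides only name the nodes, so these
# directions are an assumption inferred from the slide titles.
EDGES = [
    ("Genetic factor1", "Lung cancer"),
    ("Smoking", "Lung cancer"),
    ("Anxiety", "Smoking"),
    ("Genetic factor1", "Other cancers"),
    ("Lung cancer", "Metastasis"),
    ("Lung cancer", "Coughing"),
    ("Lung cancer", "Biomarker1"),
    ("Allergy", "Coughing"),
    ("Hormonal factor", "Metastasis"),
    ("Systematic noise", "Biomarker1"),
    ("Systematic noise", "Biomarker2"),
]


def markov_blanket(edges, target):
    """Return the parents, children and spouses of `target` in a DAG."""
    parents = {cause for cause, effect in edges if effect == target}
    children = {effect for cause, effect in edges if cause == target}
    # Spouses are the other parents of the target's children.
    spouses = {cause for cause, effect in edges if effect in children} - {target}
    return {"parents": parents, "children": children, "spouses": spouses}


blanket = markov_blanket(EDGES, "Lung cancer")
for role, nodes in blanket.items():
    print(role, sorted(nodes))
```

Under these assumed edges, Anxiety, Other cancers and Biomarker2 fall outside the blanket even though they are connected to Lung cancer through other nodes.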

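Causal relevance
Surely irrelevant feature Xi: $P(X_i, Y \mid S_{\setminus i}) = P(X_i \mid S_{\setminus i})\, P(Y \mid S_{\setminus i})$ for all $S_{\setminus i} \subseteq X_{\setminus i}$ and all assignments of values to $S_{\setminus i}$.
Causally relevant feature Xi: $P(X_i, Y \mid \mathrm{do}(S_{\setminus i})) \neq P(X_i \mid \mathrm{do}(S_{\setminus i}))\, P(Y \mid \mathrm{do}(S_{\setminus i}))$ for some assignment of values to $S_{\setminus i}$.
Weak/strong causal relevance: weak = ancestors (indirect causes); strong = parents (direct causes).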
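To illustrate why manipulation (the do operator) separates causes from mere predictors, here is a small simulation sketch. Everything in it is an assumption made for illustration: the three-variable chain Smoking → Lung cancer → Coughing, the probabilities, and the function names `sample`, `causal_effect` and `observed_difference`. Forcing a variable replaces its usual mechanism with a constant, which is one simple way to mimic do(.).

```python
import numpy as np


def sample(n, rng, do=None):
    """Sample Smoking -> Lung cancer -> Coughing; `do` forces variables to a value.

    All probabilities are made-up numbers for illustration only.
    """
    do = do or {}

    def maybe_force(name, values):
        # An intervention overrides the variable's natural mechanism.
        return np.full(n, bool(do[name])) if name in do else values

    smoking = maybe_force("smoking", rng.random(n) < 0.3)
    cancer = maybe_force("cancer", rng.random(n) < np.where(smoking, 0.20, 0.02))
    coughing = maybe_force("coughing", rng.random(n) < np.where(cancer, 0.8, 0.1))
    return smoking, cancer, coughing


def causal_effect(name, n=200_000, seed=0):
    """Estimate P(cancer | do(name=1)) - P(cancer | do(name=0))."""
    rng = np.random.default_rng(seed)
    rates = []
    for value in (0, 1):
        _, cancer, _ = sample(n, rng, do={name: value})
        rates.append(cancer.mean())
    return rates[1] - rates[0]


def observed_difference(n=200_000, seed=0):
    """Estimate P(cancer | coughing=1) - P(cancer | coughing=0) without intervening."""
    rng = np.random.default_rng(seed)
    _, cancer, coughing = sample(n, rng)
    return cancer[coughing].mean() - cancer[~coughing].mean()


print("observed cancer difference by coughing:", observed_difference())
print("effect of do(smoking):", causal_effect("smoking"))
print("effect of do(coughing):", causal_effect("coughing"))
```

In this assumed model coughing is strongly associated with lung cancer in observational data, but forcing coughing on or off leaves the cancer rate essentially unchanged, whereas forcing smoking changes it. That is the sense in which a child of the target can be predictive without being causally relevant.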

Examples (target: Lung cancer)

Immediate causes (parents). Nodes shown: Genetic factor1, Smoking, Lung cancer.

Non-immediate causes (other ancestors). Nodes shown: Smoking, Anxiety, Lung cancer.

Non causes (e.g. siblings). Nodes shown: Genetic factor1, Other cancers, Lung cancer.

$X \perp Y \mid C$. CHAIN: X → C → Y. FORK: X ← C → Y.
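The chain and fork patterns can be illustrated with a short simulation. This is a sketch under assumptions not found in the slides: linear Gaussian mechanisms with unit coefficients, and partial correlation used as a stand-in for a conditional independence test (a reasonable proxy only in the linear Gaussian case). The names `simulate` and `partial_corr` are hypothetical.

```python
import numpy as np


def simulate(kind, n=100_000, seed=0):
    """Draw (x, y, c) from a linear Gaussian chain or fork (assumed mechanisms)."""
    rng = np.random.default_rng(seed)
    if kind == "chain":            # X -> C -> Y
        x = rng.normal(size=n)
        c = x + rng.normal(size=n)
        y = c + rng.normal(size=n)
    elif kind == "fork":           # X <- C -> Y
        c = rng.normal(size=n)
        x = c + rng.normal(size=n)
        y = c + rng.normal(size=n)
    else:
        raise ValueError(f"unknown structure: {kind}")
    return x, y, c


def partial_corr(x, y, c):
    """Correlation between x and y after linearly regressing c out of both."""
    def residual(v):
        slope, intercept = np.polyfit(c, v, 1)
        return v - (slope * c + intercept)

    return np.corrcoef(residual(x), residual(y))[0, 1]


for kind in ("chain", "fork"):
    x, y, c = simulate(kind)
    marginal = np.corrcoef(x, y)[0, 1]
    print(kind, "corr(X, Y) =", round(marginal, 3),
          "| corr(X, Y given C) =", round(partial_corr(x, y, c), 3))
```

In both structures X and Y are clearly correlated on their own, and the correlation drops to roughly zero once C is accounted for.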

Hidden more direct cause. Nodes shown: Smoking, Tar in lungs, Anxiety, Lung cancer.

Confounder. Nodes shown: Smoking, Genetic factor2, Lung cancer.

Immediate consequences (children). Nodes shown: Lung cancer, Metastasis, Coughing, Biomarker1.

COLLIDER (X → C ← Y, with Y = Lung cancer): $X \perp Y$ but $X \not\perp Y \mid C$.
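The collider pattern, where two independent causes become dependent once their common effect is known, can be verified exactly on a tiny example. The setup is an assumption chosen for illustration: X and Y are two independent fair coins and C equals 1 when they match. The helper names `prob` and `independent` are hypothetical.

```python
from fractions import Fraction
from itertools import product

NAMES = ("x", "y", "c")

# Joint distribution P(x, y, c): two fair coins, and c = 1 when they match.
JOINT = {(x, y, int(x == y)): Fraction(1, 4) for x, y in product((0, 1), repeat=2)}


def prob(**fixed):
    """Probability that the named variables take the given values."""
    return sum(
        p for outcome, p in JOINT.items()
        if all(outcome[NAMES.index(name)] == value for name, value in fixed.items())
    )


def independent(given=None):
    """Check P(x, y | given) == P(x | given) P(y | given) for all x, y."""
    given = given or {}
    p_given = prob(**given)
    for x, y in product((0, 1), repeat=2):
        lhs = prob(x=x, y=y, **given) / p_given
        rhs = (prob(x=x, **given) / p_given) * (prob(y=y, **given) / p_given)
        if lhs != rhs:
            return False
    return True


print("X independent of Y:", independent())
print("X independent of Y given C=1:", independent({"c": 1}))
```

Exact fractions avoid any floating-point tolerance: marginally the coins are independent, but once C is known, X determines Y.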

Non relevant spouse (artifact). Nodes shown: Lung cancer, Biomarker2, Biomarker1.

Another case of confounder. Nodes shown: Lung cancer, Biomarker2, Systematic noise, Biomarker1.

Truly relevant spouse. Nodes shown: Lung cancer, Allergy, Coughing.

Sampling bias. Nodes shown: Lung cancer, Metastasis, Hormonal factor.

Causal feature relevance (full graph). Nodes shown: Genetic factor1, Smoking, Genetic factor2, Tar in lungs, Other cancers, Anxiety, Lung cancer, Biomarker2, Systematic noise, Allergy, Metastasis, Coughing, Biomarker1, Hormonal factor.

Conclusion
Feature selection focuses on uncovering subsets of variables X1, X2, … that are predictive of the target Y.
Multivariate feature selection is in principle more powerful than univariate feature selection, but not always in practice.
Taking a closer look at the type of dependencies in terms of causal relationships may help refine the notion of variable relevance.

Acknowledgements and references
1) Feature Extraction, Foundations and Applications. I. Guyon et al., Eds. Springer, 2006. http://clopinet.com/fextract-book
2) Causal feature selection. I. Guyon, C. Aliferis, A. Elisseeff. In “Computational Methods of Feature Selection”, Huan Liu and Hiroshi Motoda, Eds., Chapman and Hall/CRC Press, 2007. http://clopinet.com/isabelle/Papers/causalFS.pdf