Ambiguous Manipulations


Causal Inference and Ambiguous Manipulations
Richard Scheines, Grant Reaber, Peter Spirtes
Carnegie Mellon University

1. Motivation
Wanted: answers to causal questions:
- Does attending day care cause aggression?
- Does watching TV cause obesity?
- How can we answer these questions empirically?
- When and how can we estimate the size of the effect?
- Can we know our estimates are reliable?

Causation & Intervention
Conditioning is not the same as intervening:
P(Lung Cancer | Tar-stained teeth = no) ≠ P(Lung Cancer | Tar-stained teeth set= no)
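The inequality above can be checked by exact enumeration in a tiny fork model. The numbers below are hypothetical, chosen only to illustrate the point: smoking (S) causes both tar-stained teeth (T) and lung cancer (LC), while teeth have no causal effect on cancer.

```python
# Hypothetical CPTs for the fork S -> T, S -> LC.
P_S = {0: 0.7, 1: 0.3}
P_T_given_S = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.2, 1: 0.8}}    # P(T=t | S=s)
P_LC_given_S = {0: {0: 0.99, 1: 0.01}, 1: {0: 0.85, 1: 0.15}}  # P(LC=c | S=s)

def p_joint(s, t, c):
    return P_S[s] * P_T_given_S[s][t] * P_LC_given_S[s][c]

# Conditioning: P(LC=1 | T=0).  Observing white teeth is evidence
# against smoking, so the cancer probability drops.
num = sum(p_joint(s, 0, 1) for s in (0, 1))
den = sum(p_joint(s, 0, c) for s in (0, 1) for c in (0, 1))
p_cond = num / den

# Intervening: setting T severs S -> T, so P(LC=1 | T set= 0) = P(LC=1).
p_do = sum(P_S[s] * P_LC_given_S[s][1] for s in (0, 1))

print(p_cond, p_do)  # the two probabilities differ
```

Conditioning exploits the evidential back-path through S; intervening cuts it, which is exactly why the two quantities come apart.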

Causal Inference: Experiments
Gold standard: Randomized Clinical Trials.
- Intervene: randomly assign treatment
- Observe response
Estimate P(Response | Treatment assigned).

Causal Inference: Observational Studies
Collect a sample on:
- potential causes (X)
- response (Y)
- covariates (potential confounders Z)
Estimate P(Y | X, Z). Highly unreliable: we can estimate sampling variability, but we don't know how to estimate specification uncertainty from data.

2. Progress: 1985 – Present
- Representing causal structure, and connecting it to probability
- Modeling interventions
- Indistinguishability and discovery algorithms

Representing Causal Structures
Causal graph G = {V, E}. Each edge X → Y represents a direct causal claim: X is a direct cause of Y relative to V.
Notes:
1. We don't define causality, but will introduce axioms connecting probability to causality.
2. Many fields proceed without agreement on a definition: probability, "force" in mechanics, the interpretation of quantum mechanics, etc.
3. A number of different kinds of graphs represent probability distributions and independence; an advantage of directed graphs is that they also represent causal relations.
4. Several extensions will be introduced.

Direct Causation
X is a direct cause of Y relative to S iff ∃ z, x1 ≠ x2 such that
P(Y | X set= x1, Z set= z) ≠ P(Y | X set= x2, Z set= z), where Z = S − {X, Y}.

Causal Bayes Networks
The joint distribution factors according to the causal graph, i.e., for all X in V:
P(V) = ∏ P(X | Immediate Causes of X)
Example (S with children YF and LC):
P(S = 0) = .7    P(S = 1) = .3
P(YF = 0 | S = 0) = .99    P(YF = 1 | S = 0) = .01
P(YF = 0 | S = 1) = .20    P(YF = 1 | S = 1) = .80
P(LC = 0 | S = 0) = .95    P(LC = 1 | S = 0) = .05
P(LC = 0 | S = 1) = .80    P(LC = 1 | S = 1) = .20
P(S, YF, LC) = P(S) P(YF | S) P(LC | S)
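The factorization on this slide is directly computable. A minimal sketch using the slide's CPTs (reading S, YF, LC as the usual smoking, yellow fingers, lung cancer example, which is an assumption on our part):

```python
# CPTs from the slide: S is a parent of both YF and LC, so the
# joint factors as P(S, YF, LC) = P(S) P(YF | S) P(LC | S).
P_S = {0: 0.7, 1: 0.3}
P_YF = {0: {0: 0.99, 1: 0.01}, 1: {0: 0.20, 1: 0.80}}  # P_YF[s][yf]
P_LC = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.80, 1: 0.20}}  # P_LC[s][lc]

joint = {(s, yf, lc): P_S[s] * P_YF[s][yf] * P_LC[s][lc]
         for s in (0, 1) for yf in (0, 1) for lc in (0, 1)}

total = sum(joint.values())                                   # should be 1
p_lc1 = sum(p for (s, yf, lc), p in joint.items() if lc == 1)  # marginal P(LC=1)
print(total, p_lc1)
```

Marginalizing the factored joint recovers any quantity of interest; here P(LC = 1) = .7 × .05 + .3 × .20 = .095.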

Modeling Ideal Interventions
Interventions on the effect, and interventions on the cause. [Diagrams: pre- and post-experimental systems relating Room Temperature and Wearing Sweater.]

Interventions & Causal Graphs
Model an ideal intervention by adding an "intervention" variable outside the original system, then erasing all arrows pointing into the variable intervened upon.
Example: intervene to change Inf. Pre-intervention graph vs. post-intervention graph?
"Fat hand" interventions: e.g., a cholesterol drug that also affects arrhythmia.
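The graph-surgery recipe on this slide is mechanical, so it can be sketched in a few lines. Representing the graph as a list of (parent, child) edges is our own choice; the Exp → Inf → Rash chain is the example used on the next slide.

```python
# Graph surgery for an ideal intervention on `target`:
# 1) erase every arrow pointing into `target`,
# 2) add an intervention variable "I" pointing at `target`.
def intervene(edges, target):
    kept = [(a, b) for (a, b) in edges if b != target]
    return kept + [("I", target)]

pre = [("Exp", "Inf"), ("Inf", "Rash")]   # pre-intervention graph
post = intervene(pre, "Inf")              # post-intervention graph
print(post)
```

After surgery the edge Exp → Inf is gone, Inf → Rash survives, and I → Inf records the intervention, matching the slide's recipe.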

Calculating the Effect of Interventions
Pre-manipulation joint distribution: P(Exp, Inf, Rash) = P(Exp) P(Inf | Exp) P(Rash | Inf)
Intervention on Inf.
Post-manipulation joint distribution: P(Exp, Inf, Rash) = P(Exp) P(Inf | I) P(Rash | Inf)
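The post-manipulation distribution is just the pre-manipulation factorization with the P(Inf | Exp) term swapped for the intervention's policy. A small numeric sketch, with CPT values that are entirely hypothetical:

```python
# Hypothetical CPTs for the chain Exp -> Inf -> Rash.
P_Exp = {0: 0.6, 1: 0.4}
P_Inf_given_Exp = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}
P_Rash_given_Inf = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.2, 1: 0.8}}

def p_rash1_do_inf(inf):
    # Truncated factorization: once Inf is set, the P(Inf | Exp)
    # term is replaced, and Exp becomes irrelevant to Rash.
    return P_Rash_given_Inf[inf][1]

def p_rash1_observational():
    # Pre-manipulation marginal P(Rash = 1), summing out Exp and Inf.
    return sum(P_Exp[e] * P_Inf_given_Exp[e][i] * P_Rash_given_Inf[i][1]
               for e in (0, 1) for i in (0, 1))

print(p_rash1_do_inf(1), p_rash1_observational())
```

In this chain there is no confounding, so P(Rash | Inf set= 1) coincides with P(Rash | Inf = 1); the truncated factorization is what licenses that, not an accident of the numbers.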

Causal Discovery from Observational Studies

Equivalence Class with Latents: PAGs (Partial Ancestral Graphs)
Assumptions: acyclic graphs, latent variables, sample selection bias.
Equivalence: independence over measured variables.
Notes:
1. A PAG represents a set of conditional-independence and distribution-equivalent graphs.
2. Same adjacencies.
3. Undirected edges mean some members contain the edge one way, some the other.
4. A directed edge means they all go the same way.
5. Pearl and Verma; complete rules for generating them from Meek, Andersson, Perlman, Madigan, and Chickering.
6. An instance of a chain graph.
7. Since the data can't distinguish them, in the absence of background knowledge a PAG is the right output for search.
8. What are they good for?

Causal Inference from Observational Studies
Knowing when we know enough to calculate the effect of interventions: the Prediction Algorithm (SGS, 2000).

Causal Discovery from Observational Studies

3. The Ambiguity of Manipulation
Assumptions:
- Causal graph known (Cholesterol is a cause of Heart Condition)
- No unmeasured common causes
Therefore the manipulated and unmanipulated distributions are the same:
P(H | TC = x) = P(H | TC set= x)

The Problem with Predicting the Effects of Acting
Problem: the cause is a composite of causes that don't act uniformly, e.g., Total Blood Cholesterol (TC) = HDL + LDL.
The observed distribution over TC is determined by the unobserved joint distribution over HDL and LDL.
Ideally intervening on TC does not determine a joint distribution for HDL and LDL.

The Problem with Predicting the Effects of Setting TC
P(H | TC set1= x) puts NO constraints on P(H | TC set2= x).
P(H | TC = x) puts NO constraints on P(H | TC set= x).
Nothing in the data tips us off about our ignorance, i.e., we don't know that we don't know.

Examples Abound

Possible Ways Out
- The causal graph is not known: Cholesterol does not really cause Heart Condition.
- Confounders (unmeasured common causes) are present: LDL and HDL are confounders.
Notes on faithfulness:
0. Pearl calls it stability.
1. All conditional independencies that hold are entailed by the Markov assumption.
2. A kind of simplicity assumption.
3. The graph the distribution is faithful to has more degrees of freedom than other graphs that fit the distribution.
4. Violation has zero Lebesgue measure.
5. On a variety of scoring rules, the faithful graph does best in the limit.
6. Doesn't say anything about "almost" violations.

Cholesterol is not really a cause of Heart Condition?
Relative to a set of variables S (and a background), X is a cause of Y iff ∃ x1 ≠ x2 such that P(Y | X set= x1) ≠ P(Y | X set= x2).
By this definition, Total Cholesterol is a cause of Heart Disease.

Cholesterol is not really a cause of Heart Condition?
Is Total Cholesterol a direct cause of Heart Condition relative to {TC, LDL, HDL, HD}?
TC is logically related to LDL and HDL, so manipulating it once LDL and HDL are set is impossible.

LDL, HDL are confounders?
There is no way to manipulate TC without affecting HDL and LDL; HDL and LDL are logically related to TC.

Logico-Causal Systems
S: atomic variables, independently manipulable, with unambiguous effects for all manipulations.
S': defined variables, defined logically from variables in S.
For example: S: LDL, HDL, HD, Disease1, Disease2; S': TC.

Logico-Causal Systems: Adding Edges
S: LDL, HDL, HD, D1, D2; S': TC.
System over S vs. system over S ∪ S': add the edge TC → HD iff manipulations of TC are unambiguous with respect to HD.

Logico-Causal Systems: Unambiguous Manipulations
For each variable X' in S', let Parents(X') be the set of variables in S that logically determine X', i.e., X' = f(Parents(X')), e.g., TC = LDL + HDL.
Inv(x') = the set of all values p of Parents(X') such that f(p) = x'.
A manipulation of a variable X' in S' to a value x' with respect to another variable Y is unambiguous iff ∀ p1 ≠ p2 ∈ Inv(x'): P(Y | set= p1) = P(Y | set= p2).
Add the edge TC → HD iff all manipulations of TC are unambiguous with respect to HD.
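The definition on this slide is directly checkable: enumerate Inv(x') and compare the resulting outcome probabilities. The parent grid and both response functions below are hypothetical, chosen to show one unambiguous and one ambiguous case.

```python
# Inv(x'): all parent settings p with f(p) = x'.
def inv(f, parent_values, x):
    return [p for p in parent_values if f(p) == x]

# Unambiguous iff every p in Inv(x') yields the same P(Y | set= p).
def unambiguous(f, parent_values, x, p_y_given_parents, tol=1e-9):
    probs = [p_y_given_parents(p) for p in inv(f, parent_values, x)]
    return all(abs(q - probs[0]) < tol for q in probs)

def tc(p):                      # TC = LDL + HDL
    return p[0] + p[1]

grid = [(l, h) for l in range(3) for h in range(3)]  # small parent grid

def uniform(p):                 # H depends only on the sum: unambiguous
    return 0.1 * (p[0] + p[1])

def skewed(p):                  # H weighs LDL alone: ambiguous
    return 0.1 + 0.1 * p[0]

print(unambiguous(tc, grid, 2, uniform), unambiguous(tc, grid, 2, skewed))
```

For x' = 2, Inv contains (0,2), (1,1), (2,0); `uniform` assigns all three the same probability while `skewed` does not, so only the first manipulation is unambiguous with respect to H.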

Logico-Causal Systems: Removing Edges
S: LDL, HDL, HD, D1, D2; S': TC.
System over S vs. system over S ∪ S': remove the edge LDL → HD iff LDL _||_ HD | TC.

Logico-Causal Systems: Faithfulness
Faithfulness: independences are entailed by structure, not by special parameter values. Crucial to inference.
Effect of TC on HD unambiguous.
Unfaithfulness: LDL _||_ HDL | TC, because LDL and TC determine HDL, and similarly, HDL and TC determine LDL.

Effect on Prediction Algorithm
Still sound, but less informative. Observed system: TC, HD, D1, D2.
[Table: for each manipulated variable (Disease 1, Disease 2, TC) and each target (e.g., HD), the predicted effect assuming the manipulation is unambiguous vs. when it may be ambiguous; entries include "None" and "Can't tell".]

Effect on Prediction Algorithm
Observed system: TC, HD, D1, D2, X. Not completely sound.
There is no general characterization of when the Prediction Algorithm, suitably modified, is still informative and sound; conjectures, but no proof yet.
Example: if the observed system has no deterministic relations, all orientations due to marginal independence relations are still valid.

Effect on Causal Inference of Ambiguous Manipulations
Experiments (e.g., RCTs): manipulating treatment is unambiguous → sound; ambiguous → unsound.
Observational studies (e.g., the Prediction Algorithm): manipulation is unambiguous → potentially sound; ambiguous → potentially sound.

References
- Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, Prediction, and Search, 2nd Edition. MIT Press.
- Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press.
- Spirtes, P., Scheines, R., Glymour, C., Richardson, T., and Meek, C. (2004). "Causal Inference," in Handbook of Quantitative Methodology in the Social Sciences, ed. David Kaplan, Sage Publications, 447-478.
- Spirtes, P., and Scheines, R. (2004). "Causal Inference of Ambiguous Manipulations," in Proceedings of the Philosophy of Science Association Meetings, 2002.
- Reaber, Grant (2005). The Theory of Ambiguous Manipulations. Master's Thesis, Department of Philosophy, Carnegie Mellon University.