Causal Inference and Ambiguous Manipulations
Richard Scheines, Grant Reaber, and Peter Spirtes
Carnegie Mellon University
1. Motivation
Wanted: Answers to causal questions:
- Does attending day care cause aggression?
- Does watching TV cause obesity?
- How can we answer these questions empirically?
- When and how can we estimate the size of the effect?
- Can we know our estimates are reliable?
Causation & Intervention
Conditioning is not the same as intervening:
P(Lung Cancer | Tar-stained teeth = no)
≠ P(Lung Cancer | Tar-stained teeth set= no)
(Tar-stained teeth are an effect of smoking, not a cause of lung cancer, so setting them changes nothing.)
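A minimal sketch (not from the slides) making the contrast concrete: in a model where Smoking causes both Tar-stained teeth and Lung Cancer, conditioning on teeth changes the probability of cancer, while setting teeth leaves it untouched. All numbers are hypothetical.

```python
# Hypothetical model: Smoking -> Teeth (tar stains), Smoking -> Cancer.
# All probabilities are made up for illustration.
P_S = {1: 0.3, 0: 0.7}                               # P(Smoking)
P_T = {1: {1: 0.9, 0: 0.1}, 0: {1: 0.05, 0: 0.95}}   # P(Teeth | Smoking)
P_C = {1: {1: 0.2, 0: 0.8}, 0: {1: 0.01, 0: 0.99}}   # P(Cancer | Smoking)

def p_cancer_given_teeth(t):
    """P(Cancer=1 | Teeth=t): ordinary conditioning via Bayes' rule."""
    num = sum(P_S[s] * P_T[s][t] * P_C[s][1] for s in (0, 1))
    den = sum(P_S[s] * P_T[s][t] for s in (0, 1))
    return num / den

def p_cancer_given_set_teeth(t):
    """P(Cancer=1 | Teeth set= t): setting Teeth cuts the Smoking -> Teeth
    edge, so t is unused and Cancer keeps its base rate."""
    return sum(P_S[s] * P_C[s][1] for s in (0, 1))

print(p_cancer_given_teeth(0))      # ~0.018: conditioning lowers the risk
print(p_cancer_given_set_teeth(0))  # 0.067: setting leaves the base rate
```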
Causal Inference: Experiments
Gold standard: randomized clinical trials
- Intervene: randomly assign treatment
- Observe response
- Estimate P(Response | Treatment assigned)
Causal Inference: Observational Studies
Collect a sample on:
- potential causes (X)
- response (Y)
- covariates (potential confounders Z)
Estimate P(Y | X, Z).
Highly unreliable: we can estimate sampling variability, but we don't know how to estimate specification uncertainty from data.
2. Progress, 1985 – Present
- Representing causal structure, and connecting it to probability
- Modeling interventions
- Indistinguishability and discovery algorithms
Representing Causal Structures
Causal graph G = {V, E}: each edge X → Y represents a direct causal claim: X is a direct cause of Y relative to V.
Notes:
1. We don't define causality, but we will introduce axioms connecting probability to causality.
2. Many fields proceed without agreement on a definition: probability, "force" in mechanics, the interpretation of quantum mechanics, etc.
3. A number of different kinds of graphs represent probability distributions and independence; the advantage of directed graphs is that they also represent causal relations.
4. We will introduce several extensions.
Direct Causation
X is a direct cause of Y relative to S iff
∃ z, x1 ≠ x2 such that P(Y | X set= x1, Z set= z) ≠ P(Y | X set= x2, Z set= z), where Z = S − {X, Y}.
Causal Bayes Networks
The joint distribution factors according to the causal graph, i.e., P(V) = ∏_{X in V} P(X | Immediate Causes of X).
Example (Smoking S, Yellow Fingers YF, Lung Cancer LC):
P(S = 0) = .7    P(S = 1) = .3
P(YF = 0 | S = 0) = .99    P(YF = 1 | S = 0) = .01
P(YF = 0 | S = 1) = .20    P(YF = 1 | S = 1) = .80
P(LC = 0 | S = 0) = .95    P(LC = 1 | S = 0) = .05
P(LC = 0 | S = 1) = .80    P(LC = 1 | S = 1) = .20
P(S, YF, LC) = P(S) P(YF | S) P(LC | S)
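A small sketch (not on the slide) that computes this factored joint directly from the tables above; only the variable names and numbers from the slide are used.

```python
from itertools import product

# CPTs from the slide: Smoking (S) -> Yellow Fingers (YF), Smoking -> Lung Cancer (LC).
P_S  = {0: 0.7, 1: 0.3}
P_YF = {0: {0: 0.99, 1: 0.01}, 1: {0: 0.20, 1: 0.80}}  # P(YF | S)
P_LC = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.80, 1: 0.20}}  # P(LC | S)

# P(S, YF, LC) = P(S) P(YF | S) P(LC | S)
joint = {(s, yf, lc): P_S[s] * P_YF[s][yf] * P_LC[s][lc]
         for s, yf, lc in product((0, 1), repeat=3)}

assert abs(sum(joint.values()) - 1.0) < 1e-12  # a proper distribution
print(joint[(1, 1, 1)])  # P(S=1, YF=1, LC=1) = .3 * .8 * .2 = .048
```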
Modeling Ideal Interventions
Interventions on the effect
[diagram: pre- and post-experimental system, Room Temperature → Wearing Sweater, intervention acting on Wearing Sweater]
Modeling Ideal Interventions
Interventions on the cause
[diagram: pre- and post-experimental system, Room Temperature → Wearing Sweater, intervention acting on Room Temperature]
Interventions & Causal Graphs
Model an ideal intervention by adding an "intervention" variable outside the original system.
Erase all arrows pointing into the variable intervened upon.
Example: intervene to change Inf — compare the pre-intervention graph with the post-intervention graph.
Note: contrast with "fat hand" interventions, e.g., a cholesterol drug that also causes arrhythmia.
Calculating the Effect of Interventions
Pre-manipulation joint distribution: P(Exp, Inf, Rash) = P(Exp) P(Inf | Exp) P(Rash | Inf)
Intervention on Inf.
Post-manipulation joint distribution: P(Exp, Inf, Rash) = P(Exp) P(Inf | I) P(Rash | Inf)
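A hedged sketch of this calculation for binary Exp → Inf → Rash: the intervention's distribution P(Inf | I) replaces P(Inf | Exp), and the other factors are left alone. The CPT numbers are hypothetical; only the factorization comes from the slide.

```python
from itertools import product

# Hypothetical CPTs for Exposure -> Infection -> Rash (numbers made up).
P_Exp  = {0: 0.6, 1: 0.4}
P_Inf  = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}    # P(Inf | Exp)
P_Rash = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.2, 1: 0.8}}  # P(Rash | Inf)

def joint(p_inf_given_parent):
    """Factor the joint according to the (possibly manipulated) graph."""
    return {(e, i, r): P_Exp[e] * p_inf_given_parent(e)[i] * P_Rash[i][r]
            for e, i, r in product((0, 1), repeat=3)}

pre = joint(lambda e: P_Inf[e])           # P(Exp) P(Inf | Exp) P(Rash | Inf)

# Intervention on Inf: the I -> Inf distribution replaces P(Inf | Exp),
# e.g. set Inf = 1 with certainty.
post = joint(lambda e: {0: 0.0, 1: 1.0})  # P(Exp) P(Inf | I) P(Rash | Inf)

print(sum(p for (e, i, r), p in post.items() if r == 1))  # P(Rash=1) = 0.8
```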
Causal Discovery from Observational Studies
Equivalence Class with Latents — PAGs: Partial Ancestral Graphs
Assumptions:
- acyclic graphs
- latent variables
- sample selection bias
Equivalence: independence over measured variables.
Notes:
1. A PAG represents a set of conditional-independence and distribution-equivalent graphs.
2. Same adjacencies.
3. Undirected edge ends mean some equivalent graphs orient the edge one way, some the other.
4. A directed edge means all equivalent graphs orient it the same way.
5. Complete rules for generating them come from Pearl and Verma; Meek; Andersson, Perlman, and Madigan; and Chickering.
6. An instance of a chain graph.
7. Since the data can't distinguish among the graphs, in the absence of background knowledge a PAG is the right output for search.
8. What are they good for?
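For readers who want to produce a PAG from data today: one option is the FCI algorithm as implemented in the causal-learn package. This is not mentioned on the slide; the import path and call below reflect that library's documented API and should be checked against your installed version.

```python
import numpy as np
from causallearn.search.ConstraintBased.FCI import fci

# data: n_samples x n_variables array of measurements (random noise here,
# just to make the call runnable; substitute your own sample).
data = np.random.randn(500, 4)

# FCI allows latent variables and selection bias and returns a PAG.
pag, edges = fci(data, alpha=0.05)
for edge in edges:
    print(edge)  # edge marks: -> (directed), o-> / o-o (undetermined ends)
```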
Causal Inference from Observational Studies
Knowing when we know enough to calculate the effect of interventions: the Prediction Algorithm (SGS, 2000).
Causal Discovery from Observational Studies
3. The Ambiguity of Manipulation
Assumptions:
- causal graph known (Cholesterol is a cause of Heart Condition)
- no unmeasured common causes
Therefore the manipulated and unmanipulated distributions are the same:
P(H | TC = x) = P(H | TC set= x)
The Problem with Predicting the Effects of Acting
Problem: the cause is a composite of causes that don't act uniformly, e.g., Total Blood Cholesterol (TC) = HDL + LDL.
The observed distribution over TC is determined by the unobserved joint distribution over HDL and LDL.
Ideally intervening on TC does not determine a joint distribution for HDL and LDL, as the sketch below illustrates.
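A small illustration under made-up numbers: two ways of realizing the same manipulated value TC = 300 that disagree about the risk of heart disease, so "set TC = 300" alone does not fix P(HD).

```python
# Hypothetical risk model: heart disease depends on LDL and HDL separately,
# not on their sum. All numbers are invented for illustration.
def p_hd(ldl, hdl):
    """P(HD = 1 | LDL set= ldl, HDL set= hdl) under a made-up model."""
    return min(1.0, max(0.0, 0.002 * ldl - 0.004 * hdl))

# Two manipulations that both achieve TC = LDL + HDL = 300:
print(p_hd(ldl=250, hdl=50))   # 0.3 -- high LDL, low HDL
print(p_hd(ldl=150, hdl=150))  # 0.0 -- same TC, very different risk
```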
The Problem with Predicting the Effects of Setting TC
P(H | TC set1= x) puts NO constraints on P(H | TC set2= x), where set1 and set2 are two different ways of setting TC to x.
P(H | TC = x) puts NO constraints on P(H | TC set= x).
Nothing in the data tips us off about our ignorance, i.e., we don't know that we don't know.
Examples Abound
Possible Ways Out
The causal graph is not known:
- Cholesterol does not really cause Heart Condition.
- Confounders (unmeasured common causes) are present: LDL and HDL are confounders.
Cholesterol is not really a cause of Heart Condition?
Relative to a set of variables S (and a background), X is a cause of Y iff ∃ x1 ≠ x2 such that P(Y | X set= x1) ≠ P(Y | X set= x2).
By this definition, Total Cholesterol is a cause of Heart Disease.
Cholesterol is not really a cause of Heart Condition?
Is Total Cholesterol a direct cause of Heart Condition relative to {TC, LDL, HDL, HD}?
TC is logically related to LDL and HDL, so manipulating it once LDL and HDL are set is impossible.
LDL, HDL are confounders?
There is no way to manipulate TC without affecting HDL and LDL: HDL and LDL are logically related to TC.
Logico-Causal Systems
S: atomic variables
- independently manipulable
- effects of all manipulations are unambiguous
S': defined variables, defined logically from variables in S
For example — S: LDL, HDL, HD, Disease1, Disease2; S': TC
Logico-Causal Systems: Adding Edges
S: LDL, HDL, HD, D1, D2; S': TC
[diagram: system over S vs. system over S ∪ S']
Add TC → HD iff manipulations of TC are unambiguous with respect to HD.
Logico-Causal Systems: Unambiguous Manipulations
For each variable X' in S', let Parents(X') be the set of variables in S that logically determine X', i.e., X' = f(Parents(X')), e.g., TC = LDL + HDL.
Inv(x') = the set of all values p of Parents(X') such that f(p) = x'.
A manipulation of a variable X' in S' to a value x' with respect to another variable Y is unambiguous iff
∀ p1, p2 ∈ Inv(x'): P(Y | Parents(X') set= p1) = P(Y | Parents(X') set= p2).
Add TC → HD iff all manipulations of TC are unambiguous with respect to HD. (A sketch of this check follows.)
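A minimal sketch of this check, assuming discrete parent domains and access to a hypothetical interventional distribution p_y over a full setting of Parents(X'); the names f, p_y, and the tolerance are mine, not the slides'.

```python
from itertools import product

def unambiguous(x_prime, f, parent_domains, p_y, tol=1e-9):
    """True iff setting X' = x_prime is unambiguous w.r.t. Y: every
    p in Inv(x') = {p : f(p) = x_prime} yields the same P(Y | set= p)."""
    inv = [p for p in product(*parent_domains) if f(p) == x_prime]
    assert inv, "x_prime is not attainable from these parent values"
    risks = [p_y(p) for p in inv]
    return all(abs(r - risks[0]) < tol for r in risks)

# Hypothetical example: TC = LDL + HDL on coarse-grained scales, with
# P(HD=1 | LDL set= l, HDL set= h) depending on the split (made up).
f = lambda p: p[0] + p[1]                            # TC = LDL + HDL
domains = [(100, 150, 200), (50, 100, 150)]          # LDL values, HDL values
p_y = lambda p: max(0.0, 0.002 * p[0] - 0.002 * p[1])

print(unambiguous(250, f, domains, p_y))  # False: the three splits of 250 disagree
```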
Logico-Causal Systems: Removing Edges
S: LDL, HDL, HD, D1, D2; S': TC
[diagram: system over S vs. system over S ∪ S']
Remove LDL → HD iff LDL _||_ HD | TC.
Logico-Causal Systems: Faithfulness
Faithfulness: independences are entailed by structure, not by special parameter values. Crucial to inference.
If the effect of TC on HD is unambiguous, we get unfaithfulness: LDL _||_ HD | TC, because LDL and TC determine HDL, and similarly HDL and TC determine LDL.
Notes on faithfulness:
0. Pearl calls it stability.
1. All conditional independencies that hold are entailed by the Markov assumption.
2. A kind of simplicity assumption.
3. The graph a distribution is faithful to has more degrees of freedom than other graphs that fit the distribution.
4. Violations have Lebesgue measure zero.
5. On a variety of scoring rules, the faithful graph does best in the limit.
6. Doesn't say anything about "almost" violations.
Effect on Prediction Algorithm
Still sound, but less informative.
Observed system: TC, HD, D1, D2.
[table (layout lost): for each manipulated variable, the predicted effect on the others — where the manipulation is assumed unambiguous the algorithm gives an answer (e.g., "None" for Disease 1 and Disease 2); where the manipulation of TC may be ambiguous it answers "Can't tell" for HD]
Effect on Prediction Algorithm
Observed system: TC, HD, D1, D2, X.
Not completely sound. There is no general characterization of when the Prediction Algorithm, suitably modified, is still informative and sound — conjectures, but no proof yet.
Example: if the observed system has no deterministic relations, all orientations due to marginal independence relations are still valid.
Effect on Causal Inference of Ambiguous Manipulations
Experiments, e.g., RCTs:
- manipulation of treatment unambiguous → sound
- ambiguous → unsound
Observational studies, e.g., the Prediction Algorithm:
- manipulation unambiguous → potentially sound
- ambiguous → potentially sound
References
- Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, Prediction, and Search, 2nd Edition. MIT Press.
- Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press.
- Spirtes, P., Scheines, R., Glymour, C., Richardson, T., and Meek, C. (2004). "Causal Inference," in Handbook of Quantitative Methodology in the Social Sciences, ed. David Kaplan. Sage Publications.
- Spirtes, P., and Scheines, R. (2004). "Causal Inference of Ambiguous Manipulations," in Proceedings of the Philosophy of Science Association Meetings, 2002.
- Reaber, Grant (2005). The Theory of Ambiguous Manipulations. Masters Thesis, Department of Philosophy, Carnegie Mellon University.