CAUSAL REASONING FOR DECISION AIDING SYSTEMS COGNITIVE SYSTEMS LABORATORY UCLA Judea Pearl, Mark Hopkins, Blai Bonet, Chen Avin, Ilya Shpitser
PRESENTATIONS
Judea Pearl: Robustness of Causal Claims
Ilya Shpitser and Chen Avin: Experimental Testability of Counterfactuals
Blai Bonet: Logic-based Inference on Bayes Networks
Mark Hopkins: Inference using Instantiations
Chen Avin: Inference in Sensor Networks
Blai Bonet: Report from Probabilistic Planning Competition
FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES
Probability and statistics deal with static relations: Data → joint distribution → inferences from passive observations.
Causal analysis deals with changes (dynamics): Data, causal assumptions, experiments → causal model → 1. Effects of interventions 2. Causes of effects 3. Explanations
TYPICAL CAUSAL MODEL [Diagram: variables X, Y, Z between INPUT and OUTPUT]
TYPICAL CLAIMS 1. Effects of potential interventions 2. Claims about attribution (responsibility) 3. Claims about direct and indirect effects 4. Claims about explanations
ROBUSTNESS: MOTIVATION The effect of smoking on cancer is, in general, non-identifiable (from observational studies). [Diagram: x (Smoking) → y (Cancer) with coefficient α; u (Genetic Factors, unobserved) affects both x and y] In linear systems: y = αx + ε; α is non-identifiable.
ROBUSTNESS: MOTIVATION Z: instrumental variable; cov(z, u) = 0. [Diagram: Z (Price of Cigarettes) → x (Smoking) → y (Cancer); u (Genetic Factors, unobserved) affects both x and y] α is identifiable.
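The identification claim on this slide can be checked numerically. The following is a minimal sketch, not taken from the slides: it assumes a linear model in which α is the coefficient of x in the equation for y, and it uses the standard instrumental-variable ratio cov(z, y) / cov(z, x) as the estimand. The coefficients 0.8 and 0.5, the sample size, and the names `simulate`, `naive_estimate`, and `iv_estimate` are all hypothetical choices made for illustration.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)

def simulate(alpha, n=50_000, seed=0):
    """Draw (z, x, y) from a linear model with an unobserved confounder u."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)                    # instrument, independent of u
        ui = rng.gauss(0, 1)                    # unobserved confounder
        xi = 0.8 * zi + ui + rng.gauss(0, 1)    # x depends on z and u
        yi = alpha * xi + ui + rng.gauss(0, 1)  # y depends on x and u
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y

ALPHA = 0.5  # hypothetical true effect of x on y
z, x, y = simulate(ALPHA)

# Regressing y on x alone absorbs the confounding path through u.
naive_estimate = cov(x, y) / cov(x, x)
# The instrument reaches y only through x, so this ratio isolates alpha.
iv_estimate = cov(z, y) / cov(z, x)

print("naive regression slope:", round(naive_estimate, 3))
print("instrumental estimate: ", round(iv_estimate, 3))
```

Under these assumptions the naive slope is biased upward by the confounder, while the instrumental ratio lands close to the chosen α. The sketch relies on cov(z, u) = 0 holding by construction, which is exactly the assumption the next slide questions.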
ROBUSTNESS: MOTIVATION Problem with instrumental variables: the model may be wrong! [Diagram: Z (Price of Cigarettes), x (Smoking), y (Cancer), u (Genetic Factors, unobserved)]
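ROBUSTNESS: MOTIVATION Solution: invoke several instruments. [Diagram: Z1 (Price of Cigarettes) and Z2 (Peer Pressure) → x (Smoking) → y (Cancer); u (Genetic Factors, unobserved) affects both x and y] Surprise: α_1 = α_2 ⇒ model is likely correct, where α_i is the estimate of α obtained from instrument Z_i.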
ROBUSTNESS: MOTIVATION [Diagram: Z1 (Price of Cigarettes), Z2 (Peer Pressure), and further instruments Z3, …, Zn (e.g., Anti-smoking Legislation) → x (Smoking) → y (Cancer); u (Genetic Factors, unobserved) affects both x and y] Greater surprise: α_1 = α_2 = α_3 = … = α_n = q. The claim α = q is highly likely to be correct.
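The "equality surprise" can be illustrated with a small simulation. This is a sketch under assumptions that are not in the slides: three hypothetical instruments z1, z2, z3 feed into x, and a `leak` parameter lets z3 correlate with the unobserved u, which violates its instrument assumption. When all three assumptions hold, the three ratio estimates agree; when one fails, the agreement breaks.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)

def simulate(alpha, leak, n=100_000, seed=1):
    """Linear model with three candidate instruments; leak != 0 breaks z3."""
    rng = random.Random(seed)
    data = {"z1": [], "z2": [], "z3": [], "x": [], "y": []}
    for _ in range(n):
        u = rng.gauss(0, 1)                 # unobserved confounder
        z1 = rng.gauss(0, 1)
        z2 = rng.gauss(0, 1)
        z3 = rng.gauss(0, 1) + leak * u     # valid instrument only if leak == 0
        x = 0.8 * z1 + 0.6 * z2 + 0.7 * z3 + u + rng.gauss(0, 1)
        y = alpha * x + u + rng.gauss(0, 1)
        for name, value in (("z1", z1), ("z2", z2), ("z3", z3), ("x", x), ("y", y)):
            data[name].append(value)
    return data

def iv_estimates(data):
    """One ratio estimate of alpha per candidate instrument."""
    return [cov(data[k], data["y"]) / cov(data[k], data["x"]) for k in ("z1", "z2", "z3")]

def spread(values):
    """Gap between the largest and smallest estimate."""
    return max(values) - min(values)

ALPHA = 0.5  # hypothetical true effect
valid = iv_estimates(simulate(ALPHA, leak=0.0))
invalid = iv_estimates(simulate(ALPHA, leak=0.5))

print("all instruments valid:", [round(v, 3) for v in valid])
print("z3 correlated with u: ", [round(v, 3) for v in invalid])
```

Agreement among estimates that rest on different assumptions is what lends credibility to the common value; a single discordant estimate signals that at least one assumption is violated, though by itself it does not say which one.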
ROBUSTNESS: MOTIVATION Symptoms do not act as instruments: α remains non-identifiable. [Diagram: x (Smoking) → y (Cancer) → s (Symptom); u (Genetic Factors, unobserved) affects both x and y] Why? Taking a noisy measurement (s) of an observed variable (y) cannot add new information.
ROBUSTNESS: MOTIVATION Adding many symptoms does not help: α remains non-identifiable. [Diagram: x (Smoking) → y (Cancer) → symptoms S1, S2, …, Sn; u (Genetic Factors, unobserved) affects both x and y]
ROBUSTNESS: MOTIVATION Given a parameter α in a general graph [Diagram: x → y with coefficient α], find if α can evoke an equality surprise α_1 = α_2 = … = α_n associated with several independent estimands of α. Formulate: surprise, over-identification, independence. Robustness: the degree to which α is robust to violations of model assumptions.
ROBUSTNESS: FORMULATION Bad attempt: Parameter α is robust (over-identified) if: [equation missing from source], where f_1, f_2 are two distinct functions.
ROBUSTNESS: FORMULATION [Diagram: chain x → y → z with coefficients b (x → y) and c (y → z) and error terms e_x, e_y, e_z]
x = e_x
y = bx + e_y
z = cy + e_z
R_yx = b, R_zx = bc, R_zy = c
Constraint: R_zx = R_yx R_zy
y → z is irrelevant to the derivation of b.
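The three regression slopes and the constraint they imply can be verified on simulated data. The sketch below assumes independent standard normal error terms and hypothetical values b = 0.7 and c = 1.5; `slope(dep, indep)` is an illustrative helper for the regression coefficient of `dep` on `indep`, written R_dep,indep on the slide.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)

def slope(dep, indep):
    """Regression coefficient of dep on indep."""
    return cov(dep, indep) / cov(indep, indep)

def simulate_chain(b, c, n=50_000, seed=2):
    """Draw (x, y, z) from the chain x -> y -> z with independent errors."""
    rng = random.Random(seed)
    xs, ys, zs = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)            # x = e_x
        y = b * x + rng.gauss(0, 1)    # y = b x + e_y
        z = c * y + rng.gauss(0, 1)    # z = c y + e_z
        xs.append(x)
        ys.append(y)
        zs.append(z)
    return xs, ys, zs

B, C = 0.7, 1.5  # hypothetical structural coefficients
x, y, z = simulate_chain(B, C)

r_yx = slope(y, x)   # estimates b
r_zx = slope(z, x)   # estimates b * c
r_zy = slope(z, y)   # estimates c

# Two ways to estimate b: directly, or through the testable constraint.
b_direct = r_yx
b_via_constraint = r_zx / r_zy

print("R_yx:", round(r_yx, 3), " R_zx:", round(r_zx, 3), " R_zy:", round(r_zy, 3))
print("b from R_yx:", round(b_direct, 3), " b from R_zx / R_zy:", round(b_via_constraint, 3))
```

The second expression for b is a different function of the data, but it is obtained only by invoking the constraint, so it does not rest on a separate set of assumptions about b. This is why counting distinct functions is a poor measure of robustness.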
RELEVANCE: FORMULATION Definition 8 Let A be an assumption embodied in model M, and p a parameter in M. A is said to be relevant to p if and only if there exists a set of assumptions S in M such that S and A sustain the identification of p but S alone does not sustain such identification. Theorem 2 An assumption A is relevant to p if and only if A is a member of a minimal set of assumptions sufficient for identifying p.
ROBUSTNESS: FORMULATION Definition 5 (Degree of over-identification) A parameter p (of model M) is identified to degree k (read: k-identified) if there are k minimal sets of assumptions, each yielding a distinct estimand of p.
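ROBUSTNESS: FORMULATION [Diagram: chain x → y → z with coefficients b and c] Minimal assumption sets for c: [Diagrams: graphs G1, G2, G3 over x, y, z, each containing the edge y → z labeled c]. Minimal assumption sets for b: [Diagram: graph over x, y, z containing the edge x → y labeled b].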
FROM MINIMAL ASSUMPTION SETS TO MAXIMAL EDGE SUPERGRAPHS
FROM PARAMETERS TO CLAIMS
Definition: A claim C is identified to degree k in model M (graph G) if there are k edge supergraphs of G that permit the identification of C, each yielding a distinct estimand.
e.g., Claim: (Total effect) TE(x,z) = q [Diagram: graph over x, y, z]
[Diagrams: two edge supergraphs over x, y, z, one per estimand] TE(x,z) = R_zx and TE(x,z) = R_yx R_zy·x
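The idea of two estimands for one claim can be illustrated with a simulation. The sketch below assumes the chain x → y → z with total effect bc and compares two estimands: the slope of z on x, and the product of the slope of y on x with the partial slope of z on y given x (R_zy·x). The `confound` parameter, which adds a latent common cause of x and z, is an assumption of this example chosen to show that the two estimands rest on different assumptions; it is not taken from the slides.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)

def slope(dep, indep):
    """Regression coefficient of dep on indep."""
    return cov(dep, indep) / cov(indep, indep)

def partial_slope(dep, reg, ctrl):
    """Coefficient of reg when dep is regressed on both reg and ctrl."""
    num = cov(dep, reg) * cov(ctrl, ctrl) - cov(dep, ctrl) * cov(reg, ctrl)
    den = cov(reg, reg) * cov(ctrl, ctrl) - cov(reg, ctrl) ** 2
    return num / den

def simulate(b, c, confound, n=50_000, seed=3):
    """Chain x -> y -> z; confound != 0 adds a latent common cause of x and z."""
    rng = random.Random(seed)
    xs, ys, zs = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                        # latent variable
        x = confound * u + rng.gauss(0, 1)
        y = b * x + rng.gauss(0, 1)
        z = c * y + confound * u + rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
        zs.append(z)
    return xs, ys, zs

def total_effect_estimands(x, y, z):
    """Return (R_zx, R_yx * R_zy.x), two candidate estimands of TE(x, z)."""
    return slope(z, x), slope(y, x) * partial_slope(z, y, x)

B, C = 0.7, 1.5  # hypothetical coefficients; the total effect is B * C
clean_direct, clean_product = total_effect_estimands(*simulate(B, C, confound=0.0))
conf_direct, conf_product = total_effect_estimands(*simulate(B, C, confound=1.0))

print("no confounding:  ", round(clean_direct, 3), round(clean_product, 3))
print("x-z confounding: ", round(conf_direct, 3), round(conf_product, 3))
```

In this sketch both estimands agree when the chain holds exactly. Once x and z share a latent cause, the product estimand still tracks bc while the slope of z on x does not, which is the sense in which each estimand depends on its own set of assumptions.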
FROM PARAMETERS TO CLAIMS: NONPARAMETRIC e.g., Claim: (Total effect) TE(x,z) = q [Diagrams: graph over x, y, z and its edge supergraphs, nonparametric case]
CONCLUSIONS 1. Formal definition of ROBUSTNESS of causal claims. 2. Graphical criteria and algorithms for computing the degree of robustness of a given causal claim.