ASSESSING CAUSAL QUANTITIES FROM EXPERIMENTAL AND NONEXPERIMENTAL DATA Judea Pearl Computer Science and Statistics UCLA www.cs.ucla.edu/~judea/


OUTLINE
From probability to causal analysis:
  – The differences
  – The barriers
  – The benefits
Assessing the effects of actions and policies
Determining the causes of effects
Distinguishing direct from indirect effects
Generating explanations

FROM PROBABILITY TO CAUSAL ANALYSIS: 1. THE DIFFERENCES
Probability and statistics deal with static relations:
  Data → joint distribution → predictions from passive observations.
Causal analysis deals with changes (dynamics):
  Data + causal assumptions → causal model → 1. effects of interventions (experiments), 2. causes of effects, 3. explanations.

FROM PROBABILITY TO CAUSAL ANALYSIS: 1. THE DIFFERENCES (CONT)
CAUSAL concepts: spurious correlation, randomization, confounding / effect, instrument, holding constant, explanatory variables.
STATISTICAL concepts: regression, association / independence, "controlling for" / conditioning, odds and risk ratios, collapsibility.
1. Causal and statistical concepts do not mix.
2. No causes in – no causes out (Cartwright, 1989): causal conclusions require causal assumptions in addition to data and statistical assumptions.
3. Causal assumptions cannot be expressed in the mathematical language of standard statistics.
4. Non-standard mathematics:
   a) Structural equation models (SEM)
   b) Counterfactuals (Neyman-Rubin)
   c) Causal diagrams (Wright, 1920)

FROM PROBABILITY TO CAUSAL ANALYSIS: 2. THE MENTAL BARRIERS
1. Reliance on non-probabilistic assumptions (all conclusions are conditional).
2. The use of new mathematical languages:
   2.1 Structural equation models (SEM)
   2.2 Counterfactuals (Neyman-Rubin framework)
   2.3 Causal diagrams (Wright, 1920)

WHAT'S IN A CAUSAL MODEL?
Oracle that assigns truth values to causal sentences:
  Action sentences: B if we do A.
  Counterfactuals: B would be different if A were true.
  Explanation: B occurred because of A.
Optional: with what probability?

CAUSAL MODELS: WHY THEY ARE NEEDED
[Diagram: an input/output device relating the variables X, Z, and Y.]

STRUCTURAL EQUATION MODELS (HAAVELMO 1943; DUNCAN 1970)
CAUSAL MODEL / PATH DIAGRAMS (WRIGHT 1920):
[Diagram: supply-demand model with variables I (income), W (wage), Q (demand), P (price) and disturbances U1, U2.]

CAUSAL MODELS AND COUNTERFACTUALS
Definition: A causal model is a 3-tuple M = ⟨V, U, F⟩ with a mutilation operator do(x): M → M_x, where:
(i) V = {V_1, ..., V_n} are endogenous variables,
(ii) U = {U_1, ..., U_m} are background variables,
(iii) F is a set of n functions f_i: V \ V_i × U → V_i, written v_i = f_i(pa_i, u_i), with PA_i ⊆ V \ V_i and U_i ⊆ U,
(iv) M_x = ⟨U, V, F_x⟩, X ⊆ V, x a realization of X, where F_x = {f_i : V_i ∉ X} ∪ {X = x}
(replace all functions f_i corresponding to X with the constant functions X = x).
Definition (Probabilistic Causal Model): ⟨M, P(u)⟩, where P(u) is a probability assignment to the variables in U.

COMMON QUESTIONS ON CAUSAL DIAGRAMS
[Example diagram: genetic factors and weight affecting blood pressure, which affects cardiovascular disease.]
Q: What do causal diagrams represent?
A: Qualitative assumptions about the absence of effects.
Q: Where do these assumptions come from?
A: Commonsense, substantive knowledge, other studies.
Q: What if we can't make any causal assumptions?
A: Quit.

CAUSAL INFERENCE MADE EASY
1. Inference with nonparametric structural equations made possible through graphical analysis.
2. Mathematical underpinning of counterfactuals through nonparametric structural equations.
3. Graphical-counterfactual symbiosis.

NON-PARAMETRIC STRUCTURAL MODELS
Given P(x,y,z), should we ban smoking?
[Diagram: Smoking (X) → Tar in Lungs (Z) → Cancer (Y), with disturbances U_1, U_2, U_3; U_1 also affects Y.]
Linear analysis:        x = u_1,  z = αx + u_2,  y = βz + γu_1 + u_3.   Find: αβ
Nonparametric analysis: x = f_1(u_1),  z = f_2(x, u_2),  y = f_3(z, u_1, u_3).   Find: P(y | do(x))

NON-PARAMETRIC STRUCTURAL MODELS
Given P(x,y,z), should we ban smoking?
Linear analysis:        x = u_1,  z = αx + u_2,  y = βz + γu_1 + u_3.   Find: αβ
Nonparametric analysis: x = const.,  z = f_2(x, u_2),  y = f_3(z, u_1, u_3).
Find: P(y | do(x)) = P(Y = y) in the new (mutilated) model.

CAUSAL MODELS AND CAUSAL EFFECTS
Definition: A causal model is a 3-tuple M = ⟨V, U, F⟩ with a mutilation operator do(x): M → M_x, as above (endogenous variables V, background variables U, functions F, and submodel M_x = ⟨U, V, F_x⟩ obtained by replacing the functions for X with the constants X = x).
Definition (Causal Effect P(y | do(x))): The causal effect of X on Y is given by the probability of Y = y induced by the submodel ⟨M_x, P(u)⟩.
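As an illustration of the definition above, here is a minimal sketch (mine, not part of the original slides) of a structural causal model in Python; the parametrization of the smoking example is made up, and do() implements the mutilation operator by replacing the mechanism of the intervened variable with a constant.

    import random

    class SCM:
        """A causal model M = <V, U, F>: an exogenous sampler plus one function per endogenous variable."""
        def __init__(self, functions, sample_u):
            self.functions = functions      # dict: name -> f(values_dict), one entry per endogenous V_i
            self.sample_u = sample_u        # callable returning a dict of background (U) values

        def do(self, **intervention):
            """Mutilation: return M_x, where each intervened variable is set to a constant."""
            new_f = dict(self.functions)
            for var, val in intervention.items():
                new_f[var] = (lambda v, val=val: val)
            return SCM(new_f, self.sample_u)

        def sample(self):
            """Draw u ~ P(u) and solve the equations (assumed listed in causal order)."""
            vals = self.sample_u()
            for name, f in self.functions.items():
                vals[name] = f(vals)
            return vals

    # Smoking -> Tar -> Cancer with a hidden confounder u1 (hypothetical, made-up parametrization).
    M = SCM(
        functions={
            "X": lambda v: v["u1"],                   # smoking
            "Z": lambda v: v["X"] ^ v["u2"],          # tar deposits
            "Y": lambda v: int(v["Z"] or v["u1"]),    # cancer
        },
        sample_u=lambda: {"u1": random.random() < 0.5, "u2": random.random() < 0.1},
    )
    Mx = M.do(X=1)   # submodel M_x; P(y | do(X=1)) is the probability of Y=1 under Mx
    print(sum(Mx.sample()["Y"] for _ in range(10000)) / 10000)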

IDENTIFIABILITY
Definition: Let Q(M) be any quantity defined on a causal model M, and let A be a set of assumptions. Q is identifiable relative to A iff
  P_M1(v) = P_M2(v)  ⇒  Q(M1) = Q(M2)
for all M1, M2 that satisfy A.
In other words, Q can be determined uniquely from the probability distribution P(v) of the endogenous variables V together with the assumptions A.

THE FUNDAMENTAL THEOREM OF CAUSAL INFERENCE
Causal Markov Theorem: Any distribution generated by a Markovian structural model M (recursive, with independent disturbances) can be factorized as
  P(v_1, ..., v_n) = ∏_i P(v_i | pa_i),
where pa_i are the (values of the) parents of V_i in the causal diagram associated with M.
Corollary (Truncated factorization; Manipulation Theorem): The distribution generated by an intervention do(X = x) in a Markovian model M is given by the truncated factorization
  P(v_1, ..., v_n | do(x)) = ∏_{i: V_i ∉ X} P(v_i | pa_i),   for v consistent with X = x.
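For concreteness, a small numerical sketch (mine, with made-up conditional probability tables for a binary model W → X → Y, W → Y) showing the Markov factorization and its truncated counterpart:

    # Markovian model W -> X -> Y, W -> Y, all variables binary; the tables are made up.
    P_w = {0: 0.6, 1: 0.4}
    P_x_given_w = {(0, 0): 0.8, (1, 0): 0.2, (0, 1): 0.3, (1, 1): 0.7}            # key: (x, w)
    P_y_given_xw = {(1, 0, 0): 0.1, (1, 1, 0): 0.5, (1, 0, 1): 0.4, (1, 1, 1): 0.9}  # key: (y=1, x, w)

    def p_joint(w, x, y):
        """Markov factorization: P(v) = prod_i P(v_i | pa_i)."""
        py1 = P_y_given_xw[(1, x, w)]
        return P_w[w] * P_x_given_w[(x, w)] * (py1 if y == 1 else 1 - py1)

    def p_y_do_x(y, x):
        """Truncated factorization: drop the factor P(x | w) and sum over the remaining parents."""
        total = 0.0
        for w in (0, 1):
            py1 = P_y_given_xw[(1, x, w)]
            total += P_w[w] * (py1 if y == 1 else 1 - py1)
        return total

    p_x1 = sum(p_joint(w, 1, y) for w in (0, 1) for y in (0, 1))
    p_y1_given_x1 = sum(p_joint(w, 1, 1) for w in (0, 1)) / p_x1
    print(p_y1_given_x1, p_y_do_x(1, 1))   # conditioning (0.78) differs from intervening (0.66)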

RAMIFICATIONS OF THE FUNDAMENTAL THEOREM
Given P(x,y,z), should we ban smoking?
Pre-intervention:  Smoking (X) → Tar in Lungs (Z) → Cancer (Y), with U (unobserved) affecting both X and Y.
Post-intervention: X is set to x; the arrow from U into X is removed.
By the truncated factorization, P(y, z | do(x)) = P(z | x) Σ_u P(y | z, u) P(u).
To compute P(y, z | do(x)), we must eliminate u.

THE BACK-DOOR CRITERION
Graphical test of identification: P(y | do(x)) is identifiable in G if there is a set Z of variables such that Z d-separates X from Y in the subgraph G_X formed by deleting all arrows emanating from X.
[Example diagram: covariates Z_1, ..., Z_6 between X and Y, with a set Z blocking all back-door paths.]
Moreover, P(y | do(x)) = Σ_z P(y | x, z) P(z)   ("adjusting" for Z).
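A minimal sketch (my own, not from the slides) of "adjusting for Z", estimated from a finite sample of (x, y, z) records, assuming Z satisfies the back-door criterion; the toy data at the end are purely illustrative.

    from collections import Counter

    def backdoor_adjust(records, x, y=1):
        """Estimate P(Y=y | do(X=x)) = sum_z P(Y=y | X=x, Z=z) P(Z=z) from (x, y, z) tuples."""
        n = len(records)
        z_counts = Counter(r[2] for r in records)
        total = 0.0
        for z, nz in z_counts.items():
            xz = [r for r in records if r[0] == x and r[2] == z]
            if not xz:
                continue  # in practice positivity is needed: P(x | z) > 0 for every z
            p_y_given_xz = sum(1 for r in xz if r[1] == y) / len(xz)
            total += p_y_given_xz * nz / n
        return total

    # illustrative toy data: tuples (x, y, z) from some observational study
    data = [(1, 1, 0), (1, 0, 0), (0, 0, 0), (0, 1, 1), (1, 1, 1), (0, 0, 1)]
    print(backdoor_adjust(data, x=1))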

RULES OF CAUSAL CALCULUS
Rule 1 (Ignoring observations): P(y | do{x}, z, w) = P(y | do{x}, w)
  if Y and Z are d-separated by {X, W} in the graph obtained by deleting all arrows entering X.
Rule 2 (Action/observation exchange): P(y | do{x}, do{z}, w) = P(y | do{x}, z, w)
  if Y and Z are d-separated by {X, W} in the graph obtained by deleting all arrows entering X and all arrows leaving Z.
Rule 3 (Ignoring actions): P(y | do{x}, do{z}, w) = P(y | do{x}, w)
  if Y and Z are d-separated by {X, W} in the graph obtained by deleting all arrows entering X and all arrows entering Z(W), where Z(W) is the set of Z-nodes that are not ancestors of any W-node in the X-arrow-deleted graph.

DERIVATION IN CAUSAL CALCULUS
[Diagram: Smoking → Tar → Cancer, with an unobserved Genotype affecting both Smoking and Cancer.]
P(c | do{s})
  = Σ_t P(c | do{s}, t) P(t | do{s})                      (Probability axioms)
  = Σ_t P(c | do{s}, do{t}) P(t | do{s})                  (Rule 2)
  = Σ_t P(c | do{s}, do{t}) P(t | s)                      (Rule 2)
  = Σ_t P(c | do{t}) P(t | s)                             (Rule 3)
  = Σ_s' Σ_t P(c | do{t}, s') P(s' | do{t}) P(t | s)      (Probability axioms)
  = Σ_s' Σ_t P(c | t, s') P(s' | do{t}) P(t | s)          (Rule 2)
  = Σ_s' Σ_t P(c | t, s') P(s') P(t | s)                  (Rule 3)
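The end result of this derivation is the front-door formula. The sketch below (mine, with a made-up parametrization of the smoking-tar-cancer model) checks it numerically: the interventional quantity computed from the full model, genotype included, agrees with the formula evaluated from the observational joint P(s, t, c) alone.

    from itertools import product

    # Hypothetical parametrization: S (smoking) -> T (tar) -> C (cancer), with an
    # unobserved genotype U confounding S and C (all binary); the numbers are made up.
    P_u = {0: 0.7, 1: 0.3}
    P_s_given_u = {0: 0.2, 1: 0.8}        # P(S=1 | u)
    P_t_given_s = {0: 0.1, 1: 0.9}        # P(T=1 | s)
    P_c_given_tu = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.3, (1, 1): 0.8}  # P(C=1 | t, u)

    def bern(p, v):
        return p if v == 1 else 1 - p

    def p_stc(s, t, c):
        """Observational joint P(s, t, c), with U summed out."""
        return sum(P_u[u] * bern(P_s_given_u[u], s) * bern(P_t_given_s[s], t)
                   * bern(P_c_given_tu[(t, u)], c) for u in (0, 1))

    def p_c_do_s_truth(s):
        """Ground truth from the full model: P(C=1 | do(S=s))."""
        return sum(P_u[u] * bern(P_t_given_s[s], t) * P_c_given_tu[(t, u)]
                   for u, t in product((0, 1), repeat=2))

    def p_c_do_s_frontdoor(s):
        """Front-door estimate using only the observed joint P(s, t, c)."""
        p_s = {v: sum(p_stc(v, t, c) for t, c in product((0, 1), repeat=2)) for v in (0, 1)}
        def p_t_given_s_obs(t, v):
            return sum(p_stc(v, t, c) for c in (0, 1)) / p_s[v]
        def p_c1_given_ts(t, v):
            return p_stc(v, t, 1) / sum(p_stc(v, t, c) for c in (0, 1))
        return sum(p_t_given_s_obs(t, s) * sum(p_c1_given_ts(t, v) * p_s[v] for v in (0, 1))
                   for t in (0, 1))

    print(p_c_do_s_truth(1), p_c_do_s_frontdoor(1))   # the two numbers agree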

THE INTERPRETATION OF STRUCTURAL MODELS
What is the meaning of y = a + bx + ε? Of the coefficient b? Of the disturbance ε?
The meaning of a SEM is not expressible in standard probability syntax.

OUTLINE
From probability to causal analysis:
  – The differences
  – The barriers
  – The benefits
Assessing the effects of actions and policies
Determining the causes of effects

DETERMINING THE CAUSES OF EFFECTS (The Attribution Problem)
Your Honor! My client (Mr. A) died BECAUSE he used that drug.
Court to decide if it is MORE PROBABLE THAN NOT that A would be alive BUT FOR the drug!
  P(? | A is dead, took the drug) > 0.50

THE PROBLEM
Theoretical problems:
1. What is the meaning of PN(x,y), the "probability that event y would not have occurred if it were not for event x, given that x and y did in fact occur"?
2. Under what conditions can PN(x,y) be learned from statistical data, i.e., observational, experimental, and combined?

COUNTERFACTUALS FROM MUTILATED MODELS
Example: 2-man firing squad. U = court order, relayed by the Captain to riflemen X and W; Y = death.
Would the prisoner die (Y = y) if rifleman X were to shoot (X = x) in situation U = u?
Answer: Y_x(u), computed in the mutilated model in which X is set to x.

3 STEPS TO COMPUTING COUNTERFACTUALS
S5: If the prisoner is dead, he would still be dead if A were not to have shot:  D ⇒ D_{¬A}.
Step 1 (Abduction): use the evidence (D is true) to determine the background state U.
Step 2 (Action): mutilate the model, setting A to false.
Step 3 (Prediction): compute D in the mutilated model; it remains true.
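A small sketch (my own, with a hypothetical Boolean encoding of the firing-squad story) of the abduction-action-prediction recipe:

    # Illustrative firing squad: U = court order, C = captain, A, B = riflemen, D = death.
    def solve(u, do_A=None):
        C = u
        A = C if do_A is None else do_A     # the action step replaces A's mechanism
        B = C
        D = int(A or B)
        return {"C": C, "A": A, "B": B, "D": D}

    # Step 1 (abduction): find the u's compatible with the evidence D = 1.
    compatible = [u for u in (0, 1) if solve(u)["D"] == 1]
    # Steps 2-3 (action + prediction): in those worlds, set A = 0 and recompute D.
    print(all(solve(u, do_A=0)["D"] == 1 for u in compatible))
    # True: "if the prisoner is dead, he would still be dead had A not shot" (D => D_{not A}).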

AXIOMS OF CAUSAL COUNTERFACTUALS
1. Definiteness: there exists x ∈ X such that X_y(u) = x.
2. Uniqueness: (X_y(u) = x) & (X_y(u) = x')  ⇒  x = x'.
3. Effectiveness: X_{xw}(u) = x.
4. Composition: W_x(u) = w  ⇒  Y_{xw}(u) = Y_x(u).
5. Reversibility: (Y_{xw}(u) = y) & (W_{xy}(u) = w)  ⇒  Y_x(u) = y.

PROBABILITIES OF COUNTERFACTUALS
Probabilistic causal model: ⟨M, P(u)⟩; P(u) induces a unique distribution P(v).
The probability of the counterfactual Y_x = y is the probability of Y = y induced by the submodel M_x:
  P(Y_x = y) = Σ_{u : Y_x(u) = y} P(u).

COMPUTING PROBABILITIES OF COUNTERFACTUALS
P(S5): The prisoner is dead. How likely is it that he would be dead if A were not to have shot?  P(D_{¬A} | D) = ?
Abduction: update P(u) to P(u | D).
Action: set A to false.
Prediction: compute the probability of D in the mutilated model under P(u | D), yielding P(D_{¬A} | D).
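Continuing the previous sketch (and reusing its solve() function), the same three steps with an arbitrary prior P(u) give the counterfactual probability by reweighting the background worlds on the evidence D = 1:

    # Prior over the background variable (court order), chosen arbitrarily for illustration.
    P_U = {0: 0.3, 1: 0.7}

    # Abduction: P(u | D=1) by Bayes' rule over the worlds in which the model yields D = 1.
    evid = {u: P_U[u] for u in P_U if solve(u)["D"] == 1}
    P_u_given_D = {u: p / sum(evid.values()) for u, p in evid.items()}

    # Action + prediction: probability that D would hold had A not shot, given D.
    print(sum(p * solve(u, do_A=0)["D"] for u, p in P_u_given_D.items()))   # P(D_{not A} | D)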

CAN FREQUENCY DATA DECIDE LEGAL RESPONSIBILITY?
Nonexperimental data: drug usage predicts longer life.
Experimental data: drug has negligible effect on survival.
                  Experimental           Nonexperimental
                  do(x)     do(x')       x         x'
  Deaths (y)         16        14          2        28
  Survivals (y')    984       986        998       972
                  1,000     1,000      1,000     1,000
Plaintiff: Mr. A is special.
1. He actually died.
2. He used the drug by choice.
Court to decide (given both data): Is it more probable than not that A would be alive but for the drug?

IDENTIFIABILITY FROM EXPERIMENTS
Definition: Let Q(M) be any quantity defined on a causal model M, let M_exp be a modification of M induced by some experiment exp, and let Y be the set of variables observed under exp. Q is identifiable from experiment exp, given A, iff
  P_{M1 exp}(y) = P_{M2 exp}(y)  ⇒  Q(M1) = Q(M2)
for all M1, M2 satisfying A.
In other words, Q can be determined uniquely from the probability distribution that the observed variables Y attain under the experimental conditions created by exp.

TYPICAL RESULTS
– Bounds given combined nonexperimental and experimental data
– Bounds under exogeneity
– Identifiability under exogeneity and monotonicity
– Identifiability under monotonicity (combined data)

BOUNDING BY LINEAR PROGRAMMING
Collapse the U-space into response types for Y: Y = 0, Y = X, Y = ¬X, Y = 1.
Notation: p_ijk = P(y_x = i, y_{x'} = j, X = k), e.g., p_100 = P(y_x, y'_{x'}, x').
Maximize (minimize) the quantity of interest (e.g., p_101 = P(y_x, y'_{x'}, x), the numerator of PN), subject to:
Nonexperimental constraints:
  p_101 + p_111 = P(x, y)       p_001 + p_011 = P(x, y')
  p_010 + p_110 = P(x', y)      p_000 + p_100 = P(x', y')
Experimental constraints:
  P(y_x)   = p_100 + p_101 + p_110 + p_111
  P(y_{x'}) = p_010 + p_011 + p_110 + p_111
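A sketch of this linear program using scipy (my own formulation and variable ordering; the input numbers are the drug data from the table above). Here p_ijk stands for P(y_x = i, y_{x'} = j, X = k), and the objective is P(y_x, y'_{x'}, x), the numerator of PN:

    import numpy as np
    from scipy.optimize import linprog

    # illustrative encoding: position of p_{ijk} = P(y_x = i, y_{x'} = j, X = k) in the 8-vector
    idx = lambda i, j, k: 4 * i + 2 * j + k

    # Data from the drug example: nonexperimental P(x, y) and experimental P(y_x), P(y_{x'}).
    P_xy = {(1, 1): 0.001, (1, 0): 0.499, (0, 1): 0.014, (0, 0): 0.486}
    P_y_do = {1: 0.016, 0: 0.014}

    A_eq, b_eq = [], []
    # Nonexperimental constraints: P(X=k, Y=y) ties the observed y to y_x (k=1) or y_{x'} (k=0).
    for k in (0, 1):
        for y in (0, 1):
            row = np.zeros(8)
            for i in (0, 1):
                for j in (0, 1):
                    if (i if k == 1 else j) == y:
                        row[idx(i, j, k)] = 1
            A_eq.append(row); b_eq.append(P_xy[(k, y)])
    # Experimental constraints: P(y_x = 1) and P(y_{x'} = 1).
    for pos, target in ((0, P_y_do[1]), (1, P_y_do[0])):
        row = np.zeros(8)
        for i in (0, 1):
            for j in (0, 1):
                for k in (0, 1):
                    if (i, j)[pos] == 1:
                        row[idx(i, j, k)] = 1
        A_eq.append(row); b_eq.append(target)
    A_eq.append(np.ones(8)); b_eq.append(1.0)          # probabilities sum to one

    c = np.zeros(8); c[idx(1, 0, 1)] = 1               # objective: P(y_x, y'_{x'}, x)
    lo = linprog(c, A_eq=np.array(A_eq), b_eq=b_eq, bounds=[(0, 1)] * 8)
    hi = linprog(-c, A_eq=np.array(A_eq), b_eq=b_eq, bounds=[(0, 1)] * 8)
    print(lo.fun / P_xy[(1, 1)], -hi.fun / P_xy[(1, 1)])   # bounds on PN; here both equal 1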

TYPICAL THEOREMS (Tian and Pearl, 2000)
Bounds given combined nonexperimental and experimental data:
  max{0, [P(y) - P(y_{x'})] / P(x, y)}  ≤  PN  ≤  min{1, [P(y'_{x'}) - P(x', y')] / P(x, y)}
Identifiability under monotonicity (combined data):
  PN = [P(y | x) - P(y | x')] / P(y | x)  +  [P(y | x') - P(y_{x'})] / P(x, y),
i.e., the excess risk ratio, corrected for confounding (the "corrected excess-risk-ratio").

SOLUTION TO THE ATTRIBUTION PROBLEM (Cont)
WITH PROBABILITY ONE:  PN = P(y'_{x'} | x, y) = 1.
From population data to the individual case.
Combined data tell more than each study alone.

OUTLINE
From probability to causal analysis:
  – The differences
  – The barriers
  – The benefits
Assessing the effects of actions and policies
Determining the causes of effects
Distinguishing direct from indirect effects

QUESTIONS ADDRESSED
– The direct/indirect distinction is ubiquitous in our language. Why?
– What is the semantics of direct and indirect effects?
– Can we estimate them from data? From experimental data?
– What can we do with these estimates?

WHY DECOMPOSE EFFECTS?
1. A direct (or indirect) effect may be more transportable.
2. Indirect effects may be prevented or controlled.
3. A direct (or indirect) effect may be forbidden.
[Examples: Pill → Pregnancy → Thrombosis; Gender → Qualification → Hiring.]

TOTAL, DIRECT, AND INDIRECT EFFECTS HAVE SIMPLE SEMANTICS IN LINEAR MODELS
Model: z = bx + ε_1,  y = ax + cz + ε_2   (X → Z with coefficient b, Z → Y with coefficient c, X → Y with coefficient a).
Total effect: a + bc.   Direct effect: a.   Indirect effect: bc.
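A quick Monte-Carlo check (my own sketch, with arbitrarily chosen coefficients and unit-normal disturbances) that the total effect equals a + bc, the natural direct effect equals a, and the natural indirect effect equals bc:

    import random

    a, b, c = 0.5, 2.0, -0.3                    # arbitrary, made-up coefficients

    def model(x, e1, e2, z=None):
        z = b * x + e1 if z is None else z      # z = b x + e1 (unless held fixed)
        return z, a * x + c * z + e2            # y = a x + c z + e2

    N = 100_000
    te = nde = nie = 0.0
    for _ in range(N):
        e1, e2 = random.gauss(0, 1), random.gauss(0, 1)
        z0, y0 = model(0, e1, e2)               # world under x = 0
        z1, y1 = model(1, e1, e2)               # world under x = 1
        te  += y1 - y0                          # total effect ~ a + b*c
        nde += model(1, e1, e2, z=z0)[1] - y0   # natural direct effect ~ a
        nie += model(0, e1, e2, z=z1)[1] - y0   # natural indirect effect ~ b*c
    print(te / N, nde / N, nie / N)             # approximately -0.1, 0.5, -0.6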

SEMANTICS BECOMES NONTRIVIAL IN NONLINEAR MODELS (even when the model is completely specified)
Model: z = f(x, ε_1),  y = g(x, z, ε_2).
Direct effect: dependent on z? Void of operational meaning?

NEED OF FORMALIZATION
What is the direct effect of X on Y? The indirect effect?
[Example: X → Z → Y and X → Y, with Y = X AND Z.]

NEED OF FORMALIZATION (Who cares?)
What is the direct effect of X on Y?
What is the effect of the drug on recovery if drug-induced headache is eliminated?
[DRUG → ASPIRIN → RECOVERY, plus a direct DRUG → RECOVERY path.]

NEED OF FORMALIZATION (Who cares?)
What is the direct effect of X on Y?
The effect of Gender on Hiring if sex discrimination is eliminated.
[GENDER → QUALIFICATION → HIRING, plus a direct GENDER → HIRING path; the indirect path is marked IGNORE.]

TWO CONCEPTIONS OF DIRECT AND INDIRECT EFFECTS: CONTROLLED vs. NATURAL
[Example: X → Z → Y and X → Y, with Y = X AND Z.]
Starting from X = 0 (and Z = 0, Y = 0):
Total effect: change X from 0 to 1, and test the change in Y.
Controlled DE: keep Z constant at Z = 0, or at Z = 1, and change X from 0 to 1.
Controlled IE: none.
Natural DE: keep Z constant at its current value, and change X to 1.
Natural IE: keep X at 0, but set Z to what it would be if X were 1.

LEGAL DEFINITIONS TAKE THE NATURAL CONCEPTION (FORMALIZING DISCRIMINATION)
"The central question in any employment-discrimination case is whether the employer would have taken the same action had the employee been of a different race (age, sex, religion, national origin, etc.) and everything else had been the same." [Carson versus Bethlehem Steel Corp. (70 FEP Cases 921, 7th Cir. (1996))]
x = male, x' = female;  y = hire, y' = not hire;  z = applicant's qualifications.
NO DIRECT EFFECT:  Y_{x', Z_x} = Y_x  and  Y_{x, Z_{x'}} = Y_{x'}.

SEMANTICS OF NESTED COUNTERFACTUALS
Consider Q = E(Y_{x, Z_{x*}}): the expected value of Y had X been x and had Z been whatever value it would have attained had X been x*.
Given u, Z_{x*}(u) is the solution for Z in M_{x*}; call it z. Then Y_{x, Z_{x*}}(u) is the solution for Y in M_{xz}.
Given ⟨M, P(u)⟩, Q is well defined. Can the quantity be estimated from data?

TWO CONCEPTIONS OF AVERAGE DIRECT AND INDIRECT EFFECTS: POPULATION-LEVEL DEFINITIONS
Probabilistic causal model ⟨M, P(u)⟩ with X → Z → Y and X → Y, y = f(x, z, u), where u stands for all other parents of Y (and of Z).
Starting from X = x* (and Z = Z_{x*}(u), Y = Y_{x*}(u)):
Total effect:   TE(x, x*; Y)    = E(Y_x) - E(Y_{x*})
Controlled DE:  CDE_z(x, x*; Y) = E(Y_{xz}) - E(Y_{x*z})
Controlled IE:  none.
Natural DE:     NDE(x, x*; Y)   = E(Y_{x, Z_{x*}}) - E(Y_{x*})
Natural IE:     NIE(x, x*; Y)   = E(Y_{x*, Z_x}) - E(Y_{x*})

GRAPHICAL CONDITION FOR EXPERIMENTAL IDENTIFICATION OF AVERAGE NATURAL DIRECT EFFECTS
Theorem: If there exists a set W such that (Y ⊥⊥ Z | W) in G_XZ and W ⊆ ND(X ∪ Z) (nondescendants of X and Z), then
  NDE(x, x*; Y) = Σ_{w,z} [E(Y_{xz} | w) - E(Y_{x*z} | w)] P(Z_{x*} = z | w) P(w).
[Example diagram: G and G_XZ with background variables U_1, ..., U_4 and covariate W.]

GRAPHICAL CONDITION FOR NONEXPERIMENTAL IDENTIFICATION OF AVERAGE NATURAL DIRECT EFFECTS
  NDE(x, x*; Y) = Σ_{w,z} [E(Y_{xz} | w) - E(Y_{x*z} | w)] P(Z_{x*} = z | w) P(w)    (20)
Identification conditions:
1. There exists a W such that (Y ⊥⊥ Z | W) in G_XZ.
2. There exist additional covariates that render all counterfactual terms identifiable.
[Example diagram: G and G_XZ as before.]

IDENTIFICATION IN MARKOVIAN MODELS
Corollary 3: The average natural direct effect in Markovian models is identifiable from nonexperimental data, and it is given by
  NDE(x, x*; Y) = Σ_s Σ_z [E(Y | x, z) - E(Y | x*, z)] P(z | x*, s) P(s),
where S stands for all parents of X (or another sufficient set).
Example (X → Z → Y, X → Y, S = ∅):
  NDE(x, x*; Y) = Σ_z [E(Y | x, z) - E(Y | x*, z)] P(z | x*).
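A sketch (mine, with made-up values of E(Y | x, z) and P(z | x)) of the S = ∅ case: both natural effects computed from purely observational quantities, checking at the end that TE(x, x*) = NDE(x, x*) - NIE(x*, x), as stated in Theorem 5 below.

    # Made-up observational quantities for binary X (treatment), Z (mediator), Y (outcome).
    E_Y = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.6, (1, 1): 0.9}   # E(Y | x, z)
    P_Z1 = {0: 0.4, 1: 0.8}                                      # P(Z = 1 | x)

    def p_z(z, x):
        return P_Z1[x] if z == 1 else 1 - P_Z1[x]

    def nde(x, x_star):
        # NDE(x, x*; Y) = sum_z [E(Y | x, z) - E(Y | x*, z)] P(z | x*)
        return sum((E_Y[(x, z)] - E_Y[(x_star, z)]) * p_z(z, x_star) for z in (0, 1))

    def nie(x, x_star):
        # NIE(x, x*; Y) = sum_z E(Y | x*, z) [P(z | x) - P(z | x*)]
        return sum(E_Y[(x_star, z)] * (p_z(z, x) - p_z(z, x_star)) for z in (0, 1))

    te = (sum(E_Y[(1, z)] * p_z(z, 1) for z in (0, 1))
          - sum(E_Y[(0, z)] * p_z(z, 0) for z in (0, 1)))        # total effect (no confounding)
    print(nde(1, 0), nie(1, 0), te, nde(1, 0) - nie(0, 1))       # the last two values agree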

POLICY QUESTION ANSWERED BY NATURAL DIRECT EFFECT
How effective would the drug be if we eliminate its side-effect (headache)?
[Diagram: Drug (X) → Headache (Z) → Recovery (Y), plus X → Y, with Aspirin (W) acting on the headache path.]
  NDE(x, x*; Y) = Σ_z E(Y | x, z) P(z | x*) - E(Y | x*).

POLICY-BASED INTERPRETATION OF INDIRECT EFFECTS
[Diagram: X (advertisement budget) → Z (competitor's budget) → Y (sales), plus X → Y.]
NIE(x, x*; Y) = expected increase in sales if we bluff the competitor into believing that X is about to change from x* to x.
For Markovian models:
  NIE(x, x*; Y) = Σ_z E(Y | x*, z) [P(z | x) - P(z | x*)].

RELATIONS BETWEEN TOTAL, DIRECT, AND INDIRECT EFFECTS
Theorem 5: The total, direct, and indirect effects obey the following equality:
  TE(x, x*; Y) = NDE(x, x*; Y) - NIE(x*, x; Y).
In words, the total effect (on Y) associated with the transition from x* to x is equal to the difference between the direct effect associated with this transition and the indirect effect associated with the reverse transition, from x to x*.

GENERAL PATH-SPECIFIC EFFECTS
[Model: X → Z → W → Y, with additional direct edges among X, Z, W, Y.]
Find the effect of X on Y transmitted through the subgraph g: X → Z → W → Y.

GENERAL PATH-SPECIFIC EFFECTS (Def.)
Form a new model, specific to the active subgraph g: along edges in g the active value x propagates, while along edges outside g the variables are evaluated at the reference value x* (e.g., z* = Z_{x*}(u)).
Definition: the g-specific effect is the effect computed in this modified model.
Path-specific effects are nonidentifiable even in Markovian models.

ANSWERS TO QUESTIONS
– Operational semantics of direct and indirect effects, not based on variable fixing.
– Conditions for estimability from experimental / nonexperimental data.
– Useful in answering new types of policy questions involving signal blocking instead of variable fixing.

CONCLUSIONS
From probability to causal analysis:
  – The differences
  – The barriers
  – The benefits
Assessing the effects of actions and policies
Determining the causes of effects
Distinguishing direct from indirect effects
Generating explanations

ACTUAL CAUSATION AND THE COUNTERFACTUAL TEST
"We may define a cause to be an object followed by another, ..., where, if the first object had not been, the second never had existed." (Hume, Enquiry, 1748)
Lewis (1973): "x CAUSED y" if x and y are true, and y is false in the closest non-x world.
Structural interpretation:
  (i) X(u) = x
  (ii) Y(u) = y
  (iii) Y_{x'}(u) ≠ y for x' ≠ x.

PROBLEM WITH THE COUNTERFACTUAL DEFINITION
Back-up (W) is to shoot iff the Captain (X) does not shoot at 12:00 noon; Y = prisoner dies.
Scenario: the Captain shot before noon (X = 1), so the Back-up did not shoot (W = 0), and the prisoner is dead (Y = 1).
Q: Is the Captain's shot the cause of death?
A: Yes, but the counterfactual test fails! (Had the Captain not shot, the Back-up would have, and Y would still be 1.)
Intuition: the Back-up might fall asleep, a structural contingency.

SELECTED STRUCTURAL CONTINGENCIES AND SUSTENANCE
x sustains y against W iff:
  (i) X(u) = x;
  (ii) Y(u) = y;
  (iii) Y_{xw}(u) = y for all w; and
  (iv) Y_{x'w'}(u) = y' for some x' ≠ x and some w'.
[Firing-squad example: with X = 1, Y = 1 for every setting of W, while X = 0 together with W = 0 gives Y = 0.]
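A tiny sketch (mine, with the obvious Boolean encoding of the Captain/Back-up story) showing the failed but-for test and the sustenance conditions:

    # Illustrative encoding: X = Captain shoots at noon, W = Back-up shoots iff Captain does not,
    # Y = prisoner dies.
    def Y(x, w=None):
        w = (1 - x) if w is None else w     # W's mechanism, unless held fixed by a contingency
        return int(x or w)

    # Actual scenario: X = 1, hence W = 0, Y = 1.
    print(Y(1))                 # 1: prisoner dies
    # But-for (counterfactual) test: Y would still be 1 had the Captain not shot.
    print(Y(0))                 # 1: the back-up fires, so the test fails
    # Sustenance: under the contingency W = 0 (back-up asleep), X = 1 keeps Y = 1,
    # while X = 0 under some contingency (here the same w = 0) makes Y = 0.
    print(Y(1, w=0), Y(0, w=0)) # 1, 0: conditions (iii) and (iv) hold, so X = 1 sustains Y = 1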

THE DESERT TRAVELER (The actual scenario)
Enemy-2 shoots the canteen (X = 1); Enemy-1 poisons the water (P = 1).
Dehydration D = 1, cyanide intake C = 0, death Y = 1.
[Diagram: X and P feed into D (dehydration) and C (cyanide intake); D and C feed into Y (death).]

THE DESERT TRAVELER (Constructing a causal beam)
C's mechanism: cyanide is ingested only if the canteen is not shot. With X = 1 sustaining the value C = 0, P is inactive, so the beam replaces C's function by C = ¬X.

THE DESERT TRAVELER (Constructing a causal beam)
Y's mechanism: Y = D ∨ C. With D = 1 sustaining Y = 1, C is inactive, so the beam replaces Y's function by Y = D.

THE DESERT TRAVELER (The final beam)
In the beam: C = ¬X and Y = D, hence Y = X: the shot (X = 1, Enemy-2) is the actual cause of death.

DYNAMIC MODEL UNDER ACTION: do(Fire-1), do(Fire-2)
[Space-time diagram: Fire-1 and Fire-2 advancing toward a house located at x*, up to time t*.]

THE RESULTING SCENARIO
Fire propagation: S(x, t) = f[S(x, t-1), S(x+1, t-1), S(x-1, t-1)].

THE DYNAMIC BEAM
Actual cause: Fire-1.

EXPLANATION VS. CAUSE (Halpern and Pearl, 2001)
Explanation: knowledge needed for identifying causes.
Definition: x is an explanation of y if:
  (1) X(u) ≠ x is conceivable for some u ∈ K, and
  (2) x is a cause of y in every conceivable u for which X(u) = x.
Example 1: FIRE = Match AND Oxygen.
  Causes: 1. Match; 2. Oxygen.   Explanations: none needed.

EXPLANATION VS. CAUSE (Halpern and Pearl, 2001)
Example 2: FIRE = (Match AND Oxygen) OR Fuse.
  Causes: 1. Match; 2. Oxygen; 3. Fuse.   Explanations: 1. Match; 2. Fuse.

EXPLANATORY POWER (Halpern and Pearl, 2001)
Definition: The explanatory power of a proposition X = x relative to an observed event Y = y is given by P(K_{x,y} | x), the pre-discovery probability of the set of contexts K in which x is the actual cause of y.
Example (FIRE = Match AND Oxygen): K_{O,F} = K_{M,F} = K, yet
  EP(Oxygen) = P(K | O) << 1,   EP(Match) = P(K | M) ≈ 1.

REFERENCES
1. Pearl, J., Causality: Models, Reasoning, and Inference, Cambridge University Press, New York, 2000.
2. Pearl, J., "Reasoning with Cause and Effect," Proceedings of IJCAI-99.
3. Pearl, J., "Bayesianism and Causality, or Why I Am Only a Half-Bayesian."
4. Tian, J. and Pearl, J., "Probabilities of Causation: Bounds and Identification," Annals of Mathematics and Artificial Intelligence, 287-313, 2000. ftp://ftp.cs.ucla.edu/pub/stat_ser/R284.pdf
5. Pearl, J., "Direct and Indirect Effects," Proceedings of UAI-01.
6. Halpern, J. and Pearl, J., "Causes and Explanations: A Structural-Model Approach — Part I: Causes," Proceedings of UAI-01; "... Part II: Explanations," Proceedings of IJCAI-01. ftp://ftp.cs.ucla.edu/pub/stat_ser/R266-part2.pdf

A RECENT IDENTIFICATION RESULT
Theorem [Tian and Pearl, 2001]: The causal effect P(y | do(x)) is identifiable whenever the ancestral graph of Y contains no confounding path between X and any of its children.
[Example diagrams (a), (b), (c) with X, Y, Z_1, Z_2.]