Causal Networks Denny Borsboom
Overview
– The causal relation
– Causality and conditional independence
– Causal networks
– Blocking and d-separation
– Exercise
The causal relation
What constitutes the “secret connexion” of causality is one of the big questions of philosophy.
Philosophical proposals: A causes B means that…
– B invariably follows A (David Hume)
– A is an Insufficient but Nonredundant part of an Unnecessary but Sufficient condition for B (INUS condition; John Mackie)
– B counterfactually depends on A: if A had not happened, B would not have happened (David Lewis)
– …
Drawbacks of philosophical accounts
– They can’t cope well with noisy data (i.e., they can’t cope with data)
– Almost all causal relations are observed through statistical analysis, that is, through probabilities
– Probabilities didn’t sit well with the philosophical analyses, and neither did data
– For a long time, causal inference was therefore done in a theoretical vacuum
An alternative
Recently, Judea Pearl suggested an alternative approach based on the statistical method of structural relations.
He argues that causal relations should be framed in terms of interventions on a model: given a causal model, what would happen to B if we changed A?
This is a simple idea, but it turned out to be very powerful.
Pearl’s approach (I)
A causal relation is encoded in a structural equation that says how B would change if A were changed.
This can be coded with the do operator or the symbol :=
So B:=2A means that B would change by 2 units if A were to change by one unit.
Note that this relation is asymmetric: B:=2A does not imply that A:=B/2
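The asymmetry of := can be made concrete with a small sketch. The function name run_model, the default value of A, and the do_a / do_b arguments are illustrative assumptions, not part of the slides; the only thing taken from the text is the structural equation B:=2A.

```python
def run_model(a_default=3.0, do_a=None, do_b=None):
    """Evaluate a toy model with one structural equation, B := 2A.

    do_a / do_b mimic interventions: they overwrite the equation
    for that variable with a fixed value (hypothetical interface).
    """
    # A has no causes in this toy model; it keeps its default unless we set it.
    a = a_default if do_a is None else do_a
    # B listens to A through B := 2A, unless we intervene on B directly.
    b = 2 * a if do_b is None else do_b
    return a, b


print(run_model())           # no intervention
print(run_model(do_a=4.0))   # changing A changes B
print(run_model(do_b=10.0))  # changing B leaves A untouched
```

Intervening on A propagates to B, but intervening on B does not propagate back to A, which is why B:=2A cannot be rewritten as A:=B/2.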
Pearl’s approach (II)
The structural equations can be represented in a graph, by drawing a directed arrow from A to B whenever (in the model structure) changing A affects B but not vice versa: A->B
Can we relate such a system to data? That is, under which conditions can we actually determine the causal relations from the data?
Pearl’s approach (III)
The classic problem of induction then presents itself as an identification problem:
Given only two variables, it is not possible to deduce from the data whether A->B or B->A (or some other structure) generated the dependence: all are equally consistent with the data.
If temporal precedence distinguishes A->B from B->A, then the skeptic may argue that this is all there is to know (really hardcore skeptics generalize this to experiments).
This is the root of the platitude that “correlation does not equal causation”.
Pearl’s approach (IV)
However, where there’s correlational smoke, there is often a causal fire… How to identify that fire?
20th century statistics struggled with this issue; at the end of the 20th century many had given up.
Pearl and Glymour et al. then simultaneously developed the insight that not correlations or conditional probabilities but conditional independence relations are key to the identification of causal structure.
Pearl’s approach (V)
Trick: shift attention from bivariate to multivariate systems and then ask two new questions:
1) Which conditional independence relations are implied by a given causal structure?
2) Which causal structures are implied by a given set of conditional independence relations?
Common cause: B <- A -> C
Example: Village size (A) causes babies (B) and storks (C)
CI: B and C conditionally independent given A

Chain: A -> B -> C
Example: Smoking (A) causes tar (B) causes cancer (C)
CI: A and C conditionally independent given B

Collider: B -> A <- C
Example: Firing squad (B & C) shoot prisoner (A)
CI: B and C conditionally dependent given A
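The three patterns can be checked on simulated data. The sketch below is an illustration under strong assumptions that are not in the slides: all relations are linear with unit coefficients, all noise is standard normal, and a partial correlation near zero is used as a stand-in for conditional independence (which is only justified for linear Gaussian systems). The function names corr, partial_corr, simulate and summarize are hypothetical.

```python
import math
import random


def corr(x, y):
    """Pearson correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y after partialling out z."""
    rxy, rxz, ryz = corr(x, y), corr(x, z), corr(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def simulate(structure, n=20000, seed=1):
    """Draw n cases from one of the three small networks (assumed linear Gaussian)."""
    rng = random.Random(seed)
    A, B, C = [], [], []
    for _ in range(n):
        if structure == "common_cause":   # B <- A -> C
            a = rng.gauss(0, 1)
            b = a + rng.gauss(0, 1)
            c = a + rng.gauss(0, 1)
        elif structure == "chain":        # A -> B -> C
            a = rng.gauss(0, 1)
            b = a + rng.gauss(0, 1)
            c = b + rng.gauss(0, 1)
        elif structure == "collider":     # B -> A <- C
            b = rng.gauss(0, 1)
            c = rng.gauss(0, 1)
            a = b + c + rng.gauss(0, 1)
        else:
            raise ValueError(f"unknown structure: {structure}")
        A.append(a)
        B.append(b)
        C.append(c)
    return A, B, C


def summarize(structure):
    """Return (marginal correlation, partial correlation) for the outer pair."""
    A, B, C = simulate(structure)
    if structure == "chain":
        return corr(A, C), partial_corr(A, C, B)   # A and C, given B
    return corr(B, C), partial_corr(B, C, A)       # B and C, given A


for name in ("common_cause", "chain", "collider"):
    marginal, partial = summarize(name)
    print(f"{name:13s} marginal={marginal:+.2f} partial={partial:+.2f}")
```

Under these assumptions the common cause and the chain should show a clear marginal correlation that vanishes after conditioning, whereas the collider should show the reverse: no marginal correlation, but a negative one after conditioning on A.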
So…
– If we can cleverly combine these small networks to build larger networks, then we might have a graphical criterion to deduce implied CI relations from a causal graph (i.e., we could look at the graph rather than solve equations)
– If we have a dataset, we can establish which of a set of possible causal graphs could have generated the CI relations observed
– If certain links cannot be deleted from the graph (i.e., are necessary to represent the CI relations), then it is in principle possible to establish causal relations from non-experimental data
To work!
Conditional independence (CI) (see handout)
Therefore
Now suppose we are prepared to make some causal assumptions, most importantly:
– there are no omitted variables that generate dependencies, and
– all causal relations are necessary to establish the pattern of CI
Then we can deduce causal relations from correlational data (at least in principle).
Quite a nice result!
Blocking and d-separation
It would be nice if we could just look at the graph and see which CI relations it entails.
This turns out to be possible.
Rule: if you want to know whether, in a directed acyclic graph, two variables A and B are independent given C, see if they are d-separated.
For this you have to (a) check all the paths between A and B, and (b) see if they are all blocked.
If all paths are blocked by C, then C d-separates A and B, and you can predict that A is independent of B given C.
Blocking and d-separation
A path between two variables is formed by a series of edges that you can travel to reach one variable from the other.
[Figure: a path between B and F]
When is a path blocked?
A path between A and B is said to be blocked by a variable C if:
– A and B are connected by a chain in which C is the middle node (so here that would be A->C->B or A<-C<-B), or
– A and B are connected by a common cause, and C is that common cause (here: A<-C->B), or
– A and B are connected by a common effect (‘collider’), but C is not that common effect, and C is not one of the effects of the common effect.
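The three blocking rules can be written as one small check on a single middle node. This is a sketch, not a full d-separation routine: the graph is assumed to be a dict mapping each node to a list of its children, the function names are hypothetical, and the node D in the last example is an invented effect of the collider, added only to illustrate the rule about effects of the common effect.

```python
def descendants(dag, node):
    """All nodes reachable from `node` by following arrows forward."""
    seen, stack = set(), list(dag.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(dag.get(current, []))
    return seen


def middle_blocks(dag, left, mid, right, given):
    """Is the connection left - mid - right blocked when conditioning on `given`?

    Assumes left and right are both adjacent to mid in the graph.
    """
    given = set(given)
    is_collider = mid in dag.get(left, []) and mid in dag.get(right, [])
    if is_collider:
        # Blocked only if neither the common effect nor any of its effects is given.
        return mid not in given and not (descendants(dag, mid) & given)
    # Chain or common cause: blocked exactly when the middle node is given.
    return mid in given


chain = {"A": ["C"], "C": ["B"]}              # A -> C -> B
fork = {"C": ["A", "B"]}                      # A <- C -> B
collider = {"A": ["C"], "B": ["C"], "C": ["D"]}  # A -> C <- B, C -> D (D is hypothetical)

print(middle_blocks(chain, "A", "C", "B", {"C"}))     # chain, middle node given
print(middle_blocks(fork, "A", "C", "B", {"C"}))      # common cause given
print(middle_blocks(collider, "A", "C", "B", set()))  # collider, nothing given
print(middle_blocks(collider, "A", "C", "B", {"D"}))  # effect of the collider given
```

The first three calls report a blocked connection; the last one does not, because conditioning on an effect of the collider opens the path.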
So…
If you have a causal network that consists of variables coupled through (directed) structural relations…
…then you can tell which conditional independence patterns will arise…
…just by looking at the picture!
So…
And in the other direction: if you have a set of conditional independencies, you can search for the causal network that could have produced them.
This is material Lourens will cover next week.
Recipe: are A and B independent given C?
1. List every path between A and B
2. For every path, check whether C blocks it
3. If C blocks all the paths in step (2), then C d-separates A and B, and A is conditionally independent of B given C
4. If C does not block all the paths in step (2), then C does not d-separate A and B. In this case anything may happen: we don’t know.
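The recipe can be sketched as a brute-force procedure that lists every path and tests each one. This is meant for small teaching graphs only, since enumerating all paths does not scale. The function name d_separated, the dict-of-children graph format, and the five-node example graph are assumptions made for illustration; they do not come from the slides. The conditioning variable is passed as a set so that more than one variable can be given.

```python
def d_separated(dag, x, y, given):
    """True if every path between x and y is blocked by the set `given`.

    `dag` maps each node to a list of its children (assumed acyclic).
    """
    given = set(given)
    nodes = set(dag) | {kid for kids in dag.values() for kid in kids}

    # Undirected neighbours, used only to walk along paths.
    neighbours = {node: set() for node in nodes}
    for parent, kids in dag.items():
        for kid in kids:
            neighbours[parent].add(kid)
            neighbours[kid].add(parent)

    def descendants(node):
        seen, stack = set(), list(dag.get(node, []))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(dag.get(current, []))
        return seen

    def paths(current, visited):
        # Step 1: list every path between x and y (no node visited twice).
        if current == y:
            yield visited
            return
        for nxt in neighbours[current]:
            if nxt not in visited:
                yield from paths(nxt, visited + [nxt])

    def blocked(path):
        # Step 2: a path is blocked if at least one middle node blocks it.
        for left, mid, right in zip(path, path[1:], path[2:]):
            is_collider = mid in dag.get(left, []) and mid in dag.get(right, [])
            if is_collider:
                if mid not in given and not (descendants(mid) & given):
                    return True
            elif mid in given:
                return True
        return False

    # Steps 3 and 4: d-separated only if all paths are blocked.
    return all(blocked(path) for path in paths(x, [x]))


# Hypothetical example graph: A -> B, A -> C, B -> D, C -> D, D -> E
example = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}

print(d_separated(example, "B", "C", {"A"}))       # both paths blocked
print(d_separated(example, "B", "C", {"A", "D"}))  # conditioning on the collider D opens a path
print(d_separated(example, "A", "D", {"B", "C"}))  # both chains blocked
```

In the example graph, B and C are d-separated by A alone, but adding the collider D (or its effect E) to the conditioning set opens the path B->D<-C again.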
Practice!