Center for Causal Discovery: Summer Short Course/Datathon


1 Center for Causal Discovery: Summer Short Course/Datathon - 2018
June 13, 2018 Causal Search II

2 Outline
Models → Data
  Bridge Principles: Markov Axiom and D-separation
  Model Equivalence
Model Search
  For Patterns
  For PAGs
Multiple Regression vs. Model Search
Mixed Variables – Discrete and Continuous
Orienting via the degree of Gaussianity

3 Independence Equivalence Classes: Patterns & PAGs
Patterns (Verma and Pearl, 1990): graphical representation of d-separation equivalence among models with no latent common causes
PAGs (Richardson, 1994): graphical representation of a d-separation equivalence class that includes models with latent common causes and sample selection bias that are d-separation equivalent over a set of measured variables X

4 PAGs: Partial Ancestral Graphs
1. Represents a set of conditional independence and distribution equivalent graphs
2. Same adjacencies
3. Undirected edges mean some graphs contain the edge one way, some contain it the other way
4. A directed edge means the edge goes the same way in all of them
5. Pearl and Verma; complete rules for generating from Meek; Andersson, Perlman, and Madigan; and Chickering
6. Instance of a chain graph
7. Since data can't distinguish among them, in the absence of background knowledge this is the right output for search
8. What are they good for?

7 PAGs: Partial Ancestral Graphs
What PAG edges mean.

8 PAG Search: Orientation
[Figure: unshielded triple X – Y – Z. Left panel, Collider: X and Z are not independent given Y. Right panel, Non-Collider: X _||_ Z | Y.]
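The collider versus non-collider distinction on an unshielded triple can be checked numerically. The following is a minimal sketch, not Tetrad's implementation: it assumes linear Gaussian data, uses partial correlation as the independence measure, and all variable names and coefficients are hypothetical.

```python
import numpy as np

def partial_corr(a, b, c):
    """Partial correlation of a and b given a single variable c."""
    r_ab = np.corrcoef(a, b)[0, 1]
    r_ac = np.corrcoef(a, c)[0, 1]
    r_bc = np.corrcoef(b, c)[0, 1]
    return (r_ab - r_ac * r_bc) / np.sqrt((1 - r_ac**2) * (1 - r_bc**2))

rng = np.random.default_rng(0)
n = 50_000

# Collider (assumed example): X -> Y <- Z
x_c = rng.normal(size=n)
z_c = rng.normal(size=n)
y_c = x_c + z_c + rng.normal(size=n)

# Non-collider chain (assumed example): X -> Y -> Z
x_n = rng.normal(size=n)
y_n = x_n + rng.normal(size=n)
z_n = y_n + rng.normal(size=n)

collider_marginal = np.corrcoef(x_c, z_c)[0, 1]   # near 0: X _||_ Z
collider_given_y = partial_corr(x_c, z_c, y_c)    # clearly nonzero
chain_marginal = np.corrcoef(x_n, z_n)[0, 1]      # clearly nonzero
chain_given_y = partial_corr(x_n, z_n, y_n)       # near 0: X _||_ Z | Y

print("collider: r(X,Z) =", round(collider_marginal, 3),
      " r(X,Z|Y) =", round(collider_given_y, 3))
print("chain:    r(X,Z) =", round(chain_marginal, 3),
      " r(X,Z|Y) =", round(chain_given_y, 3))
```

In the collider case, conditioning on Y creates a dependence between X and Z; in the chain, conditioning on Y removes it. That asymmetry is what the orientation step exploits.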

9 PAG Search: Orientation
After Adjacency Phase
Collider Test: X1 – X3 – X2, with X1 _||_ X2
Away from Collider Test: X1 → X3 – X4 and X2 → X3 – X4, with X1 _||_ X4 | X3 and X2 _||_ X4 | X3

10 Tetrad Demo and Hands-on: College Plans
Use FCI to search the College Plans data, α = .01
Repeat with bootstrap

11 Interesting Cases
[Figure: example graphs, including M1 and M2, over nodes labeled X, Y, Z, X1, X2, Y1, Y2, Z1, Z2, L, L1, L2.]

12 Tetrad Demo and Hands-on
1. Create new session.
2. Select “Simulate from a given graph, then search” from the Pipelines menu.
3. Build a graph for interesting case M1, parameterize as you wish, and generate a large sample (e.g., N=1,000).
4. Execute a Pattern search, e.g., PC.
5. Execute a PAG search, e.g., FCI.
6. Explain the results.
7. Repeat for M2.
[Figure: graph M1 over L, X, Y, Z; graph M2 over X1, X2, Y1, Y2, L1.]

13 Detectable Instrumental Variables
[Figure: graph M3 over X, Y, Z1, Z2, and L.]

14 Observational Studies - Instrumental Variables
Goal: Estimate b2
[Figure: path diagram with SES, Traffic Density Upwind (IV), Lead, IQ, and C1, C2, ..., Cn; coefficients a, g, b1, b2.]
Idea: find a “natural” randomizer, i.e., an “instrument”
Required Assumptions:
1. IV adjacent and into cause
2. Any trek from IV to effect goes through cause
3. IV causally disconnected from, i.e., independent of, every confounder

15 Observational Studies - Instrumental Variables
[Figure: path diagram with SES, Traffic Density Upwind (IV), Lead, IQ, and C1, C2, ..., Cn; coefficients a, g, b1, b2.]
In a standardized SEM:
r_{IV,IQ} / r_{IV,Lead} = (b1 * b2) / b1 = b2
r_{IV,effect} / r_{IV,cause} = IV estimate of cause → effect coefficient
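The ratio-of-correlations estimate can be tried on simulated data. This is a minimal sketch under stated assumptions: a standardized linear Gaussian SEM with one instrument, one cause, one effect, and a single unmeasured confounder. The coefficient values, the names `c` and `d`, and the helper `iv_estimate` are all hypothetical and not taken from the slides.

```python
import numpy as np

def iv_estimate(corr, iv, cause, effect):
    """IV estimate of the cause -> effect coefficient: r(IV,effect) / r(IV,cause)."""
    return corr[iv, effect] / corr[iv, cause]

rng = np.random.default_rng(1)
n = 200_000

# Assumed coefficients: IV -> cause (b1), cause -> effect (b2),
# confounder -> cause (c), confounder -> effect (d).
b1, b2 = 0.5, 0.3
c, d = 0.5, 0.5

iv = rng.normal(size=n)
confounder = rng.normal(size=n)

# Error variances are chosen so every variable has variance 1 (standardized SEM).
cause = b1 * iv + c * confounder + rng.normal(
    scale=np.sqrt(1 - b1**2 - c**2), size=n)
effect = b2 * cause + d * confounder + rng.normal(
    scale=np.sqrt(1 - b2**2 - d**2 - 2 * b2 * c * d), size=n)

# Sample correlation matrix; rows are variables in the order IV, cause, effect.
corr = np.corrcoef(np.vstack([iv, cause, effect]))
IV, CAUSE, EFFECT = 0, 1, 2

iv_est = iv_estimate(corr, IV, CAUSE, EFFECT)   # should be close to b2
naive = corr[CAUSE, EFFECT]                     # inflated by the confounder

print("IV estimate of b2:", round(iv_est, 3))
print("naive r(cause, effect):", round(naive, 3))
```

The naive correlation between cause and effect absorbs the confounded path, while the IV ratio recovers b2 as long as the three instrument assumptions hold in the simulated graph.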

16 Tetrad Demo and Hands-on
1. Simulate from M3 interpreted as a standardized SEM.
2. Execute a PAG search, e.g., FCI.
3. Use the sample correlation matrix to compute the IV estimate of the X → Y coefficient using Z1, and again separately for Z2.
[Figure: graph M3 over X, Y, Z1, Z2, and L; sample correlation matrix.]

17 Regression & Causal Inference
[Figure: graph M4 over X, Y, Z1, and Z2.]

18 Regression & Causal Inference
X  Y ?? Size of the effect? Typical (non-experimental) strategy: Establish a prima facie case (X associated with Y) But, omitted variable bias X Y Z ? So, identifiy and measure potential confounders Z that are: prior to X, associated with X, associated with Y 3. Statistically control/adjust for Z (multiple regression) 1

19 Regression & Causal Inference
Multiple regression or any similar strategy is provably unreliable for causal inference regarding X → Y, with covariates Z, unless:
1. X is truly prior to Y
2. X, Z, and Y are causally sufficient (no confounding)
Multiple regression: Response Y, Independent Variable X, Covariates Z
Coefficient bX is an unbiased estimate of the causal influence of X on Y if and only if: in the true causal graph, after removing any edge that exists between X and Y, X and Y are d-separated by Z.
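The d-separation criterion can be illustrated with a small simulation. This is a sketch on a hypothetical graph, not necessarily M4: unmeasured T1 causes X and Z, unmeasured T2 causes Z and Y, and X has no effect on Y. Here X and Y are d-separated by the empty set but not by {Z}, so adjusting for Z should introduce bias. The helper `ols_coefs` is an assumed name.

```python
import numpy as np

def ols_coefs(y, *predictors):
    """Least-squares slopes of y on the predictors (intercept fitted, then dropped)."""
    design = np.column_stack([np.ones_like(y), *predictors])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs[1:]

rng = np.random.default_rng(2)
n = 100_000

# Hypothetical graph: T1 -> X, T1 -> Z, T2 -> Z, T2 -> Y; no edge X -> Y.
t1 = rng.normal(size=n)
t2 = rng.normal(size=n)
x = t1 + rng.normal(size=n)
z = t1 + t2 + rng.normal(size=n)
y = t2 + rng.normal(size=n)

b_x_alone = ols_coefs(y, x)[0]        # X only: close to the true effect, 0
b_x_adjusted = ols_coefs(y, x, z)[0]  # X and Z: biased away from 0

print("bX, Y ~ X:     ", round(b_x_alone, 3))
print("bX, Y ~ X + Z: ", round(b_x_adjusted, 3))
```

In this setup Z is prior to nothing it needs to be and is associated with both X and Y, so it looks like a confounder by the usual checklist, yet controlling for it is exactly what breaks the estimate.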

20 Tetrad Demo and Hands-on
1. Simulate from M4 parameterized as a standardized SEM, N=1,000.
2. Execute FCI search.
3. Execute regression: Y response, X only predictor.
4. Execute regression: Y response, X and Z1 and Z2 as predictors.

21 Outline
Models → Data
  Bridge Principles: Markov Axiom and D-separation
  Model Equivalence
Model Search
  For Patterns
  For PAGs
Multiple Regression vs. Model Search
Mixed Variables – Discrete and Continuous (Joe Ramsey)
Orienting via the degree of Gaussianity (Joe Ramsey)

22 Summary of Search

23 Causal Search from Passive Observation
PC, FGS → Patterns (Markov equivalence class - no latent confounding)
FCI → PAGs (Markov equivalence - including confounders and selection bias)
CCD → Linear cyclic models (no confounding)
Lingam → unique DAG (no confounding – linear non-Gaussian – faithfulness not needed)
BPC, FOFC, FTFC → (Equivalence class of linear latent variable models)
LVLingam → set of DAGs (confounders allowed)
CyclicLingam → set of DGs (cyclic models, no confounding)
Non-linear additive noise models → unique DAG
Most of these algorithms are pointwise consistent – uniformly consistent algorithms require stronger assumptions
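Why non-Gaussianity can single out one DAG may be easier to see in a two-variable case. The sketch below is not the Lingam algorithm itself; it only illustrates the underlying idea under assumed conditions (linear model, uniform errors, no confounding). In the true direction the regression residual is independent of the predictor; in the wrong direction it is uncorrelated but still dependent. The dependence score (correlation of squares) and the name `residual_dependence` are hypothetical choices.

```python
import numpy as np

def residual_dependence(cause, effect):
    """Regress effect on cause; return corr(residual^2, cause^2) as a crude dependence score."""
    cause = cause - cause.mean()
    effect = effect - effect.mean()
    slope = np.dot(cause, effect) / np.dot(cause, cause)
    resid = effect - slope * cause
    return np.corrcoef(resid**2, cause**2)[0, 1]

rng = np.random.default_rng(3)
n = 50_000

# Assumed true model: X -> Y, linear, with uniform (non-Gaussian) errors.
x = rng.uniform(-1, 1, size=n)
y = x + rng.uniform(-1, 1, size=n)

forward = residual_dependence(x, y)   # true direction: score near 0
backward = residual_dependence(y, x)  # wrong direction: score clearly nonzero

print("X -> Y residual dependence:", round(forward, 3))
print("Y -> X residual dependence:", round(backward, 3))
```

With Gaussian errors both scores would be near zero, which is why the two directions cannot be told apart in the linear Gaussian case.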

24 Causal Search from Manipulations/Interventions
What sorts of manipulations/interventions have been studied?
Do(X=x): replace P(X | parents(X)) with P(X=x) = 1.0
Randomize(X): replace P(X | parents(X)) with PM(X), e.g., uniform
Soft interventions: replace P(X | parents(X)) with PM(X | parents(X), I), PM(I)
Simultaneous interventions (reduces the number of experiments required to be guaranteed to find the truth with an independence oracle from N-1 to 2 log(N))
Sequential interventions
Sequential, conditional interventions
Time sensitive interventions
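The effect of Randomize(X) can be sketched in a simulation. The example assumes a hypothetical linear SEM with an unmeasured common cause U of X and Y. Randomizing X replaces P(X | parents(X)) with a distribution that ignores U. The coefficient 0.5 and the helper `slope` are assumptions for illustration only.

```python
import numpy as np

def slope(x, y):
    """Simple regression slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

rng = np.random.default_rng(4)
n = 100_000
true_effect = 0.5  # assumed X -> Y coefficient

# Passive observation: U -> X, U -> Y, X -> Y.
u = rng.normal(size=n)
x_obs = u + rng.normal(size=n)
y_obs = true_effect * x_obs + u + rng.normal(size=n)

# Randomize(X): X is drawn from PM(X), independent of U; Y's mechanism is unchanged.
u_rand = rng.normal(size=n)
x_rand = rng.normal(size=n)
y_rand = true_effect * x_rand + u_rand + rng.normal(size=n)

observational_slope = slope(x_obs, y_obs)   # confounded
randomized_slope = slope(x_rand, y_rand)    # close to the true effect

print("observational slope:", round(observational_slope, 3))
print("randomized slope:   ", round(randomized_slope, 3))
```

Under passive observation the slope mixes the causal path with the path through U; after randomization the U → X edge is gone and the association reflects only X → Y.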

25 Randomization → Association = Causation
[Figure: two graphs. First: Randomizer, Treatment, Response, and Dropout, annotated Treatment _||_ Response | Dropout = no. Second: Randomizer, Assignment, Treatment, Response, and U, annotated Treatment _||_ Response.]

