Center for Causal Discovery: Summer Short Course/Datathon

Slides:

Advertisements

Similar presentations

CS188: Computational Models of Human Behavior

Advertisements

1 Learning Causal Structure from Observational and Experimental Data Richard Scheines Carnegie Mellon University.

Discovering Cyclic Causal Models by Independent Components Analysis Gustavo Lacerda Peter Spirtes Joseph Ramsey Patrik O. Hoyer.

Topic Outline Motivation Representing/Modeling Causal Systems

Weakening the Causal Faithfulness Assumption

Outline 1)Motivation 2)Representing/Modeling Causal Systems 3)Estimation and Updating 4)Model Search 5)Linear Latent Variable Models 6)Case Study: fMRI.

Peter Spirtes, Jiji Zhang 1. Faithfulness comes in several flavors and is a kind of principle that selects simpler (in a certain sense) over more complicated.

Dynamic Bayesian Networks (DBNs)

1 Automatic Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour Dept. of Philosophy & CALD Carnegie Mellon.

BAYESIAN INFERENCE Sampling techniques

The Simple Linear Regression Model: Specification and Estimation

CORRELATION. Overview of Correlation u What is a Correlation? u Correlation Coefficients u Coefficient of Determination u Test for Significance u Correlation.

© 2005 The McGraw-Hill Companies, Inc., All Rights Reserved. Chapter 14 Using Multivariate Design and Analysis.

1 Graphical Models in Data Assimilation Problems Alexander Ihler UC Irvine Collaborators: Sergey Kirshner Andrew Robertson Padhraic Smyth.

1Causality & MDL Causal Models as Minimal Descriptions of Multivariate Systems Jan Lemeire June 15 th 2006.

Graphical Models Lei Tang. Review of Graphical Models Directed Graph (DAG, Bayesian Network, Belief Network) Typically used to represent causal relationship.

Ambiguous Manipulations

Independent Component Analysis (ICA) and Factor Analysis (FA)

1 gR2002 Peter Spirtes Carnegie Mellon University.

Causal Models, Learning Algorithms and their Application to Performance Modeling Jan Lemeire Parallel Systems lab November 15 th 2006.

Causal Modeling for Anomaly Detection Andrew Arnold Machine Learning Department, Carnegie Mellon University Summer Project with Naoki Abe Predictive Modeling.

1 Day 2: Search June 9, 2015 Carnegie Mellon University Center for Causal Discovery.

Review for Final Exam Some important themes from Chapters 9-11 Final exam covers these chapters, but implicitly tests the entire course, because we use.

CAUSAL SEARCH IN THE REAL WORLD. A menu of topics  Some real-world challenges:  Convergence & error bounds  Sample selection bias  Simpson’s paradox.

Bayes Net Perspectives on Causation and Causal Inference

1 Part 2 Automatically Identifying and Measuring Latent Variables for Causal Theorizing.

1 Tetrad: Machine Learning and Graphcial Causal Models Richard Scheines Joe Ramsey Carnegie Mellon University Peter Spirtes, Clark Glymour.

Using Bayesian Networks to Analyze Expression Data N. Friedman, M. Linial, I. Nachman, D. Hebrew University.

Causal Inference and Graphical Models Peter Spirtes Carnegie Mellon University.

Lecture 8: Generalized Linear Models for Longitudinal Data.

CJT 765: Structural Equation Modeling Class 7: fitting a model, fit indices, comparingmodels, statistical power.

L 1 Chapter 12 Correlational Designs EDUC 640 Dr. William M. Bauer.

1 Tutorial: Causal Model Search Richard Scheines Carnegie Mellon University Peter Spirtes, Clark Glymour, Joe Ramsey, others.

1 Causal Data Mining Richard Scheines Dept. of Philosophy, Machine Learning, & Human-Computer Interaction Carnegie Mellon.

Nov. 13th, Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour, and many others Dept. of Philosophy & CALD Carnegie Mellon.

Mediation: Solutions to Assumption Violation

Ch 8. Graphical Models Pattern Recognition and Machine Learning, C. M. Bishop, Revised by M.-O. Heo Summarized by J.W. Nam Biointelligence Laboratory,

Penn State - March 23, The TETRAD Project: Computational Aids to Causal Discovery Peter Spirtes, Clark Glymour, Richard Scheines and many others.

Learning With Bayesian Networks Markus Kalisch ETH Zürich.

INTERVENTIONS AND INFERENCE / REASONING. Causal models  Recall from yesterday:  Represent relevance using graphs  Causal relevance ⇒ DAGs  Quantitative.

CS Statistical Machine learning Lecture 24

Exploratory studies: you have empirical data and you want to know what sorts of causal models are consistent with it. Confirmatory tests: you have a causal.

Talking Points Joseph Ramsey. LiNGAM Most of the algorithms included in Tetrad (other than KPC) assume causal graphs are to be inferred from conditional.

Lecture 2: Statistical learning primer for biologists

The Visual Causality Analyst: An Interactive Interface for Causal Reasoning Jun Wang, Stony Brook University Klaus Mueller, Stony Brook University, SUNY.

1 Chapter 8: Model Inference and Averaging Presented by Hui Fang.

Causal inferences This week we have been discussing ways to make inferences about the causal relationships between variables. One of the strongest ways.

1Causal Performance Models Causal Models for Performance Analysis of Computer Systems Jan Lemeire TELE lab May 24 th 2006.

1 Day 2: Search June 9, 2015 Carnegie Mellon University Center for Causal Discovery.

1Causal Inference and the KMSS. Predicting the future with the right model Stijn Meganck Vrije Universiteit Brussel Department of Electronics and Informatics.

1 Day 2: Search June 14, 2016 Carnegie Mellon University Center for Causal Discovery.

CS498-EA Reasoning in AI Lecture #19 Professor: Eyal Amir Fall Semester 2011.

Methods of Presenting and Interpreting Information Class 9.

8 Experimental Research Design.

Day 3: Search Continued Center for Causal Discovery June 15, 2015

CJT 765: Structural Equation Modeling

Making Sense of Advanced Statistical Procedures in Research Articles

Markov Properties of Directed Acyclic Graphs

Center for Causal Discovery: Summer Short Course/Datathon

Ten things about Experimental Design

Migration and the Labour Market

Causal Data Mining Richard Scheines

Causal Discovery Richard Scheines Peter Spirtes, Clark Glymour,

The Nonexperimental and Quasi-Experimental Strategies

Searching for Graphical Causal Models of Education Data

Principles of Experimental Design

Structural Equation Modeling

Presentation transcript:

Center for Causal Discovery: Summer Short Course/Datathon - 2018 June 13, 2018 Causal Search II

Outline Models  Data Bridge Principles: Markov Axiom and D-separation Model Equivalence Model Search For Patterns For PAGs Multiple Regression vs. Model Search Mixed Variables – Discrete and Continuous Orienting via the degree of Gaussianity

Independence Equivalence Classes: Patterns & PAGs Patterns (Verma and Pearl, 1990): graphical representation of d-separation equivalence among models with no latent common causes PAGs: (Richardson 1994) graphical representation of a d-separation equivalence class that includes models with latent common causes and sample selection bias that are d-separation equivalent over a set of measured variables X

PAGs: Partial Ancestral Graphs 1. represents set of conditional independence and distribution equivalent graphs 2. same adjacencies 3. undirected edges mean some contain edge one way, some contain other way 4. directed edge means they all go same way 5. Pearl and Verma -complete rules for generating from Meek, Andersson, Perlman, and Madigan, and Chickering 6. instance of chain graph 7. since data can’t distinguish, in absence of background knowledge is right output for search 8. what are they good for?

PAGs: Partial Ancestral Graphs 1. represents set of conditional independence and distribution equivalent graphs 2. same adjacencies 3. undirected edges mean some contain edge one way, some contain other way 4. directed edge means they all go same way 5. Pearl and Verma -complete rules for generating from Meek, Andersson, Perlman, and Madigan, and Chickering 6. instance of chain graph 7. since data can’t distinguish, in absence of background knowledge is right output for search 8. what are they good for?

PAGs: Partial Ancestral Graphs 1. represents set of conditional independence and distribution equivalent graphs 2. same adjacencies 3. undirected edges mean some contain edge one way, some contain other way 4. directed edge means they all go same way 5. Pearl and Verma -complete rules for generating from Meek, Andersson, Perlman, and Madigan, and Chickering 6. instance of chain graph 7. since data can’t distinguish, in absence of background knowledge is right output for search 8. what are they good for?

PAGs: Partial Ancestral Graphs What PAG edges mean. 1. represents set of conditional independence and distribution equivalent graphs 2. same adjacencies 3. undirected edges mean some contain edge one way, some contain other way 4. directed edge means they all go same way 5. Pearl and Verma -complete rules for generating from Meek, Andersson, Perlman, and Madigan, and Chickering 6. instance of chain graph 7. since data can’t distinguish, in absence of background knowledge is right output for search 8. what are they good for?

PAG Search: Orientation Y Unshielded X Y Z X _||_ Z | Y X _||_ Z | Y Collider Non-Collider X Y Z X Y Z

PAG Search: Orientation After Adjacency Phase Collider Test: X1 – X3 – X2 X1 _||_ X2 Away from Collider Test: X1  X3 – X4 X2  X3 – X4 X1 _||_ X4 | X3 X2 _||_ X4 | X3

Tetrad Demo and Hands-on: College Plans Use FCI to search College Plans data, a = .01 Repeat with bootstrap

Interesting Cases X1 X2 L M1 M2 X Y Z Y1 Y2 L1 L L1 Z1 L2 X Y Z1 X Z2

Tetrad Demo and Hands-on Create new session Select “Simulate from a given graph, then search” from Pipelines menu Build a graph for interesting case M1, parameterize as you wish, and generate a large sample (e.g., N=1,000). Execute a Pattern search, e.g., PC Execute a PAG search, e.g., FCI. Explain the results Repeat for M2 M2 X1 Y2 L1 Y1 X2 M1 L X Y Z

Detectable Instrumental Variables X Y Z1 M3 Z2 L

rZ,Obesity rZ,TV Observational Studies - Instrumental Variables Goal: Estimate b2 SES a g b2 Traffic Density Upwind (IV) b1 Lead IQ C1 C2 Cn Idea: find a “natural” randomizer, i.e., an “instrument” rZ,Obesity rZ,TV Required Assumptions: IV adjacent and into cause Any trek from IV to effect goes through cause IV causally disconnected from, i.e., independent of,every confounder Z direct cause of Contract $ Z _||_ Every Confounder 1

rZ,Obesity rZ,TV Observational Studies - Instrumental Variables SES a g Traffic Density Upwind (IV) b1 b2 Lead IQ C1 C2 Cn In a standardized SEM: rZ,Obesity rZ,TV rIV,IQ rIV,Lead b1 * b2 = b2 = b1 rIV,effect rIV,cause = IV estimate of cause  effect coefficient Z direct cause of Contract $ Z _||_ Every Confounder 1

Tetrad Demo and Hands-on Simulate from M3 interpreted as a standardized SEM Execute a PAG search, e.g., FCI. Use the sample correlation matrix to compute IV estimate of XY coefficient using Z1, and again separately for Z2. X Y Z1 M3 Z2 L Sample Correlation Matrix

Regression & Causal Inference Z1 X Z2 M4 Y

Regression & Causal Inference X  Y ?? Size of the effect? Typical (non-experimental) strategy: Establish a prima facie case (X associated with Y) But, omitted variable bias X Y Z ? So, identifiy and measure potential confounders Z that are: prior to X, associated with X, associated with Y 3. Statistically control/adjust for Z (multiple regression) 1

Regression & Causal Inference Multiple regression or any similar strategy is provably unreliable for causal inference regarding X  Y, with covariates Z, unless: X truly prior to Y X, Z, and Y are causally sufficient (no confounding) Multiple regression : Response Y, Independent Variable X, Covariates Z Coefficient bX is an unbiased estimate of the causal influence of X on Y, if and only if: in the true causal graph, after removing any edge that exists between X and Y, X and Y are d-separated by Z 1

Tetrad Demo and Hands-on Simulate from M4 parameterized as a standardized SEM, N=1,000. Execute FCI search Execute regression: Y response, X only predictor Execute regression: Y response, X and Z1 and Z2 as predictors

Outline Models  Data Bridge Principles: Markov Axiom and D-separation Model Equivalence Model Search For Patterns For PAGs Multiple Regression vs. Model Search Mixed Variables – Discrete and Continuous (Joe Ramsey) Orienting via the degree of Gaussianity (Joe Ramsey)

Summary of Search

Causal Search from Passive Observation PC, FGS  Patterns (Markov equivalence class - no latent confounding) FCI  PAGs (Markov equivalence - including confounders and selection bias) CCD  Linear cyclic models (no confounding) Lingam  unique DAG (no confounding – linear non-Gaussian – faithfulness not needed) BPC, FOFC, FTFC  (Equivalence class of linear latent variable models) LVLingam  set of DAGs (confounders allowed) CyclicLingam  set of DGs (cyclic models, no confounding) Non-linear additive noise models  unique DAG Most of these algorithms are pointwise consistent – uniform consistent algorithms require stronger assumptions 1

Causal Search from Manipulations/Interventions What sorts of manipulation/interventions have been studied? Do(X=x) : replace P(X | parents(X)) with P(X=x) = 1.0 Randomize(X): (replace P(X | parents(X)) with PM(X), e.g., uniform) Soft interventions (replace P(X | parents(X)) with PM(X | parents(X), I), PM(I)) Simultaneous interventions (reduces the number of experiments required to be guaranteed to find the truth with an independence oracle from N-1 to 2 log(N) Sequential interventions Sequential, conditional interventions Time sensitive interventions 1

Randomization  Association = Causation Randomizer Treatment Response Dropout Treatment _||_ Response | Dropout = no Treatment Response Randomizer U Assignment Treatment _||_ Response 1