1 Searching for Causal Models Richard Scheines Philosophy, Machine Learning, Human-Computer Interaction Carnegie Mellon University.

Presentation transcript:


2 Goals
1. Basic Familiarity with Causal Model Search:
   o What it is
   o What it can and cannot do
2. Basic Familiarity with Tetrad IV:
   o What it is
   o What it can and cannot do

3 Outline
1. Motivation
2. Representing Causal Systems
3. Strategies for Causal Inference
4. Causal Model Search
5. Examples
6. Causal Model Search with Latent Variables

1. Motivation
Conditioning ≠ Intervening: P(Y | X = x) ≠ P(Y | X set= x)
When and how can we use non-experimental data to tell us about the effect of a future intervention?

Motivation: Rumsfeld Problem
Do we know what we don’t know? Can we tell when there is not enough information in the data + background knowledge to infer causation?

Motivation: Example Online Course: As good or better than lecture? What student behaviors cause learning?

Full Semester Online Course in Causal & Statistical Reasoning

Course is tooled to record certain events:
- Logins, page requests, print requests, quiz attempts, quiz scores, voluntary exercises attempted, etc.
Each event was associated with attributes:
- Time
- student-id
- Session-id

9 Experiments
2000: Online vs. Lecture, UCSD
- Winter (N = 180)
- Spring (N = 120)
2001: Online vs. Lecture, Pitt & UCSD
- UCSD, winter (N = 190)
- Pitt (N = 80)
- UCSD, spring (N = 110)

10 Online vs. Lecture Delivery
Online:
- No lecture / one recitation per week
- Required to finish approximately 2 online modules / week
Lecture:
- 2 lectures / one recitation per week
- Printed-out modules as reading, plus extra assignments
Same material, same exams:
- 2 paper-and-pencil midterms
- 1 paper-and-pencil final exam

11 Pitt Variables
- Pre-test (%)
- Midterm 1 (%)
- Midterm 2 (%)
- Final Exam (%)
- Recitation attendance (%)
- Lecture attendance (%)
- Gender
- Online (1 = online, 0 = lecture)

12 Online vs. Lecture - Pitt
- Online students did 1/2 a St.Dev better than lecture students (p = .059)
- Factors affecting performance: Practice Questions Attempted
- Cost: Online condition costs 1/3 less per student
df = 2, $\chi^2$ = 0.08, p-value = .96

13 Printing and Voluntary Comprehension Checks

14 2. Representing Causal Systems
1. Causal structure - qualitatively
2. Interventions
3. Statistical Causal Models
   1. Causal Bayes Networks
   2. Structural Equation Models

15 Causal Graphs
Causal Graph G = {V, E}
Each edge $X \rightarrow Y$ represents a direct causal claim: X is a direct cause of Y relative to V
[Figure: example causal graph including Chicken Pox]

16 Causal Graphs
- Do not need to be Cause Complete
- Do need to be Common Cause Complete
[Figure: graph with nodes Omitted Causes 1 and Omitted Causes 2]

17 Modeling Ideal Interventions: Interventions on the Effect
[Figure panels: Pre-experimental System; Post. Variables: Sweaters On, Room Temperature]

18 Modeling Ideal Interventions: Interventions on the Cause
[Figure panels: Pre-experimental System; Post. Variables: Sweaters On, Room Temperature]

19 Interventions & Causal Graphs
Model an ideal intervention by adding an “intervention” variable outside the original system as a direct cause of its target.
[Figure panels: Pre-intervention graph; Intervene on Income: “Soft” Intervention, “Hard” Intervention]

20 Causal Bayes Networks
The Joint Distribution Factors According to the Causal Graph, i.e.,
$P(V) = \prod_{X \in V} P(X \mid \text{Immediate Causes of}(X))$
For this graph: P(S, YF, LC) = P(S) P(YF | S) P(LC | S)

P(S = 0) = .7
P(S = 1) = .3

P(YF = 0 | S = 0) = .99    P(LC = 0 | S = 0) = .95
P(YF = 1 | S = 0) = .01    P(LC = 1 | S = 0) = .05
P(YF = 0 | S = 1) = .20    P(LC = 0 | S = 1) = .80
P(YF = 1 | S = 1) = .80    P(LC = 1 | S = 1) = .20
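The factorization on this slide can be checked numerically. The following is only an illustrative sketch: the probabilities come from the slide, while the dictionary layout and the names `p_s`, `p_yf_given_s`, `p_lc_given_s`, `joint`, and `prob` are assumptions introduced here.

```python
from itertools import product

# Conditional probability tables copied from the slide.
# The variable names S, YF, LC follow the slide; the data layout is illustrative.
p_s = {0: 0.7, 1: 0.3}
p_yf_given_s = {0: {0: 0.99, 1: 0.01}, 1: {0: 0.20, 1: 0.80}}  # indexed [s][yf]
p_lc_given_s = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.80, 1: 0.20}}  # indexed [s][lc]


def joint(s, yf, lc):
    """P(S, YF, LC) = P(S) P(YF | S) P(LC | S)."""
    return p_s[s] * p_yf_given_s[s][yf] * p_lc_given_s[s][lc]


def prob(**fixed):
    """Sum the joint over every cell consistent with the fixed values."""
    total = 0.0
    for s, yf, lc in product((0, 1), repeat=3):
        cell = {"s": s, "yf": yf, "lc": lc}
        if all(cell[name] == value for name, value in fixed.items()):
            total += joint(s, yf, lc)
    return total


total_mass = prob()                            # the factors define a proper joint
p_lc = prob(lc=1)                              # marginal P(LC = 1)
p_lc_given_yf = prob(lc=1, yf=1) / prob(yf=1)  # P(LC = 1 | YF = 1)
```

Under these numbers, observing YF = 1 raises the probability of LC = 1 above its marginal value, even though the graph contains no edge between YF and LC: the association flows through the common cause S.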

21 Tetrad Demo

22 Structural Equation Models
1. Structural Equations
2. Statistical Constraints
[Figure: Causal Graph; Statistical Model]

23 Structural Equation Models
- Structural Equations: one equation for each variable V in the graph, $V = f(\mathrm{parents}(V), \mathrm{error}_V)$; for SEM (linear regression), f is a linear function
- Statistical Constraints: joint distribution over the error terms
[Figure: Causal Graph]

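24 Structural Equation Models
Equations:
$\text{Education} = \varepsilon_{ed}$
$\text{Income} = \beta_1\,\text{Education} + \varepsilon_{income}$
$\text{Longevity} = \beta_2\,\text{Education} + \varepsilon_{Longevity}$
Statistical Constraints:
$(\varepsilon_{ed}, \varepsilon_{income}, \varepsilon_{Longevity}) \sim N(0, \Sigma^2)$, with $\Sigma^2$ diagonal and no variance equal to zero
[Figure: Causal Graph; SEM Graph (path diagram)]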
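A short worked consequence of these equations may help. It is a sketch under the slide's assumptions (independent, zero-mean errors), writing $\beta_1$ and $\beta_2$ for the coefficients of Education in the Income and Longevity equations; the symbol $\sigma^2_{ed}$ for the variance of the Education error is notation introduced here.

```latex
\begin{aligned}
\mathrm{Cov}(\text{Income}, \text{Longevity})
  &= \mathrm{Cov}\bigl(\beta_1\,\text{Education} + \varepsilon_{income},\;
                       \beta_2\,\text{Education} + \varepsilon_{Longevity}\bigr) \\
  &= \beta_1 \beta_2\, \mathrm{Var}(\text{Education})
   = \beta_1 \beta_2\, \sigma^2_{ed}, \\
\mathrm{Cov}(\text{Income}, \text{Longevity} \mid \text{Education})
  &= \mathrm{Cov}(\varepsilon_{income}, \varepsilon_{Longevity}) = 0 .
\end{aligned}
```

So Income and Longevity are associated whenever both coefficients are nonzero, yet independent once Education is held fixed, which is what the common-cause graph predicts.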

Calculating the Effect of Interventions
Pre-manipulation Joint Distribution (YF, S, L) + Intervention, Causal Graph $\rightarrow$ Post-manipulation Joint Distribution (YF, S, L)

Calculating the Effect of Interventions
Pre-manipulation: $P(YF, S, L) = P(S)\,P(YF \mid S)\,P(L \mid S)$
Post-manipulation: $P(YF, S, L)_m = P(S)\,P(YF \mid \text{Manip})\,P(L \mid S)$
Replace pre-manipulation causes with manipulation
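The replacement rule above can be sketched in a few lines. This is an illustration only: it reuses the numbers from the Causal Bayes Networks slide, treats the manipulation as a hard intervention that forces YF = 1, and the names `P_S`, `P_YF_GIVEN_S`, `P_L_GIVEN_S`, `p_l_given_yf`, `observed`, and `forced_yellow` are hypothetical.

```python
from itertools import product

# Pre-manipulation factors (numbers from the Causal Bayes Networks slide).
P_S = {0: 0.7, 1: 0.3}
P_YF_GIVEN_S = {0: {0: 0.99, 1: 0.01}, 1: {0: 0.20, 1: 0.80}}  # [s][yf]
P_L_GIVEN_S = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.80, 1: 0.20}}   # [s][l]


def p_l_given_yf(yf_factor):
    """Return P(L = 1 | YF = 1) for a joint built as P(S) * yf_factor * P(L | S).

    yf_factor(s, yf) is P(YF = yf | S = s) before manipulation, or
    P(YF = yf | Manip) after it (in which case it ignores s).
    """
    num = den = 0.0
    for s, l_val in product((0, 1), repeat=2):
        cell = P_S[s] * yf_factor(s, 1) * P_L_GIVEN_S[s][l_val]
        den += cell
        if l_val == 1:
            num += cell
    return num / den


def observed(s, yf):
    # Pre-manipulation factor: YF depends on its cause S.
    return P_YF_GIVEN_S[s][yf]


def forced_yellow(s, yf):
    # Assumed hard intervention: YF is set to 1 regardless of S.
    return 1.0 if yf == 1 else 0.0


before = p_l_given_yf(observed)       # conditioning on YF = 1
after = p_l_given_yf(forced_yellow)   # setting YF = 1
```

With these numbers, conditioning on YF = 1 gives a probability of L = 1 of about 0.196, while setting YF = 1 leaves it at the marginal 0.095: the manipulation cuts the dependence of YF on S, so YF no longer carries information about L.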

Modularity of Intervention/Manipulation
Structural Equations:
$\text{Education} = \varepsilon_{ed}$
$\text{Longevity} = f_1(\text{Education}) + \varepsilon_{Longevity}$
$\text{Income} = f_2(\text{Education}) + \varepsilon_{income}$
Manipulated Structural Equations:
$\text{Education} = \varepsilon_{ed}$
$\text{Longevity} = f_1(\text{Education}) + \varepsilon_{Longevity}$
$\text{Income} = f_3(M1)$
[Figure: Causal Graph; Manipulated Causal Graph with M1]

Modularity of Intervention/Manipulation
Structural Equations:
$\text{Education} = \varepsilon_{ed}$
$\text{Longevity} = f_1(\text{Education}) + \varepsilon_{Longevity}$
$\text{Income} = f_2(\text{Education}) + \varepsilon_{income}$
Manipulated Structural Equations:
$\text{Education} = \varepsilon_{ed}$
$\text{Longevity} = f_1(\text{Education}) + \varepsilon_{Longevity}$
$\text{Income} = f_4(M2, \text{Education}) + \varepsilon_{income}$
[Figure: Causal Graph; Manipulated Causal Graph with M2]

29 3. Strategies for Causal Inference

30 Goal: Causation ($X \rightarrow Y$)
- Problem: Association ≠ Causation
- Why? Mainly confounding
- Solutions (Designs)
  o Experiments
    - Controlled Trials
    - Randomized Trials
  o Observational Studies
    - Quasi-Experiments: Fortuitous Randomization
    - Instrumental Variables
    - Statistical Control
    - Quasi-Experiments: Blocking
    - Interrupted Time Series
  o Causal Model Search

31 Statistical Evidence - Question 1: Is there an Association?
$\rho_{TV,Obesity} \neq 0$ vs. $\rho_{TV,Obesity} = 0$

32 Statistical Evidence - Question 2: Is the Association Spurious?
$\rho_{TV,Obesity} \neq 0$ produced by: Spurious Association or Causal Association

33 The Problem of Confounding
[Figure: TV (Hours of TV) and Obesity (BMI), with possible confounders Permissiveness of Parents, C1, C2, ..., Cn]
[Figure: Contract $ and # IEDs, with possible confounders Ethnic Alignment with Central Govt., C1, C2, ..., Cn]

34 Randomized Trials eliminate Spurious Association
- Exposure (treatment) is assigned randomly
- In an RT, an association between exposure and outcome is strong evidence of causation

35 Designs for Dealing With Confounding
1) Experiments - Randomized Trials
[Figure: Randomizer, Contract $, # IEDs, with possible confounders Ethnic Alignment, C1, C2, ..., Cn]

36 Designs for Dealing With Confounding
1) Experiments - Randomized Trials
- All confounders removed
- Often Ethically or Practically Impossible
[Figure: Randomizer, Contract $, # IEDs, with possible confounders Ethnic Alignment, C1, C2, ..., Cn]

37 Designs for Dealing With Confounding
2a) Observational Studies - Statistical Control
$\rho_{\text{Contract\$},\,\text{\#IEDs} \,\cdot\, \text{EthnicAlignment}, C_1, C_2, \ldots, C_n}$
All confounders must be measured.
[Figure: Contract $, # IEDs, with possible confounders Ethnic Alignment, C1, C2, ..., Cn]

38 Eliminating Spurious Association without Randomizing/Assigning/Controlling Exposure
- All confounders measured? $\rho_{TV,Obesity \cdot Permissiveness} \neq 0$
- Confounders measured well? $\rho_{TV,Obesity \cdot PoorMeasure} \neq 0$
- Statistical Adjustment (controlling for covariates): $\rho_{TV,Obesity \cdot Permissiveness} = 0$, $\rho_{TV,Obesity \cdot} \neq 0$
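The role of measurement quality in statistical adjustment can be illustrated with the standard first-order partial correlation formula. This is a sketch with made-up correlations, not data from the talk: it assumes a linear, pure common-cause structure TV <- Permissiveness -> Obesity, and the names `partial_corr` and `proxy_quality` are hypothetical.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y given Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Hypothetical correlations. With Permissiveness as the only source of
# association, r(TV, Obesity) equals r(TV, P) * r(Obesity, P).
r_tv_p, r_ob_p = 0.6, 0.5
r_tv_ob = r_tv_p * r_ob_p

# Adjusting for the confounder itself removes the association.
well_measured = partial_corr(r_tv_ob, r_tv_p, r_ob_p)

# A noisy proxy that correlates 0.7 with true Permissiveness (assumed value,
# independent noise) has its correlations with TV and Obesity shrunk by 0.7.
proxy_quality = 0.7
poorly_measured = partial_corr(
    r_tv_ob, proxy_quality * r_tv_p, proxy_quality * r_ob_p
)
```

Here `well_measured` is zero, while `poorly_measured` stays near 0.18: adjusting for a poorly measured confounder leaves a residual, spurious association between TV and Obesity.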

39 Designs for Dealing With Confounding
2b) Observational Studies - Instrumental Variables
Instrument: Contracting Agent (Z)
Idea: Z is a partial natural randomizer
Needed Assumptions:
- Z is a direct cause of Contract $
- Z is independent of every confounder
[Figure: Z, Contract $, # IEDs, with possible confounders Ethnic Alignment with Central Govt., C1, C2, ..., Cn]
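Why an instrument helps can be seen in a small derivation. It is a sketch under an added assumption of linearity, which the slide does not state: X stands for Contract $, Y for # IEDs, C for a confounder, Z for the instrument, and the coefficients $\alpha, \beta, \gamma, \delta$ and errors $\eta, \varepsilon$ are notation introduced here. Z is assumed uncorrelated with C, $\eta$, and $\varepsilon$, and $\alpha \neq 0$.

```latex
\begin{aligned}
X &= \alpha Z + \delta C + \eta, \qquad Y = \beta X + \gamma C + \varepsilon, \\
\mathrm{Cov}(Z, Y) &= \beta\,\mathrm{Cov}(Z, X) + \gamma\,\mathrm{Cov}(Z, C) + \mathrm{Cov}(Z, \varepsilon)
                    = \beta\,\mathrm{Cov}(Z, X), \\
\beta &= \frac{\mathrm{Cov}(Z, Y)}{\mathrm{Cov}(Z, X)}, \qquad
\mathrm{Cov}(Z, X) = \alpha\,\mathrm{Var}(Z) \neq 0 .
\end{aligned}
```

The effect $\beta$ is recovered from two observable covariances without measuring C, which is the sense in which Z acts as a partial natural randomizer.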

40 Designs for Dealing With Confounding
2c) Observational Studies: Quasi-Experiments - Fortuitous Randomization
Random Assignment of Instructor
[Figure: Gender-matched Instructor, Learning, with possible confounders C1, C2, ..., Cn]

41 Designs for Dealing With Confounding
2c) Observational Studies: Quasi-Experiments - Fortuitous Randomization
Random Assignment of Instructor
[Figure: Gender-matched Instructor, Learning, with possible confounders C1, C2, ..., Cn]

42 Designs for Dealing With Confounding
2c) Quasi-Experiments - Blocking
Identical Twins: subset data to only twins
[Figure: TV, Obesity, with possible confounders Permissiveness of Parents, C1, C2, ..., Cn]

43 Strategies for Dealing With Confounding
2c) Quasi-Experiments - Blocking
Identical Twins: subset data to only twins
Compare TV, Obesity in Twin 1 vs. TV, Obesity in Twin 2
[Figure: TV, Obesity, with possible confounders Permissiveness of Parents, C1, C2, ..., Cn]

44 Regression & Causal Inference

45 Regression & Causal Inference
Typical (non-experimental) strategy:
1. Establish a prima facie case (X associated with Y)
2. Identify and measure potential confounders Z that are:
   a) prior to X,
   b) associated with X,
   c) associated with Y
3. Statistically adjust for Z (multiple regression)
But: omitted variable bias

46 Regression & Causal Inference
Strategy threatened by measurement error (ignore this for now)
Multiple regression is provably unreliable for causal inference unless:
- X is prior to Y
- X, Z, and Y are causally sufficient (no confounding)

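Examples
[Figure columns: Truth; Regression; Alternative?]
Regression results shown: $\beta_X = 0$, $\beta_Z \neq 0$; $\beta_X \neq 0$, $\beta_Z \neq 0$; $\beta_X \neq 0$, $\beta_{Z1} \neq 0$, $\beta_{Z2} \neq 0$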

48 Better Methods Exist Causal Model Search (since 1988): Provably Reliable Provably Rumsfeld

49 4. Causal Model Search

50 Causal Discovery
Statistical Data $\rightarrow$ Causal Structure, via Statistical Inference and Background Knowledge:
- X2 before X3
- no unmeasured common causes

51 Faithfulness
Constraints on a probability distribution P generated by a causal structure G hold for all parameterizations of G.
$\text{Revenues} = a\,\text{Rate} + c\,\text{Economy} + \varepsilon_{Rev}$
$\text{Economy} = b\,\text{Rate} + \varepsilon_{Econ}$
Faithfulness: a ≠ -bc
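The condition a ≠ -bc follows from substituting the Economy equation into the Revenues equation. The derivation below is a sketch that assumes Rate is uncorrelated with both error terms, written here as $\varepsilon_{Rev}$ and $\varepsilon_{Econ}$.

```latex
\begin{aligned}
\text{Revenues} &= a\,\text{Rate} + c\,(b\,\text{Rate} + \varepsilon_{Econ}) + \varepsilon_{Rev}
                 = (a + bc)\,\text{Rate} + c\,\varepsilon_{Econ} + \varepsilon_{Rev}, \\
\mathrm{Cov}(\text{Rate}, \text{Revenues}) &= (a + bc)\,\mathrm{Var}(\text{Rate}) .
\end{aligned}
```

If a = -bc, the direct path and the path through Economy cancel exactly, so Rate and Revenues are uncorrelated even though Rate is a cause of Revenues. Faithfulness rules out such parameter-specific cancellations.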

52 The Problem of Alternatives: Observationally Equivalent Models Given an Experimental Setup, and Background Knowledge, and Theory, and a set of independence relations, what are all the models that would entail those independence relations that are consistent with BK and Theory?

53 Equivalence Classes
- Independence (d-separation equivalence)
  o DAGs: Patterns
  o PAGs: Latent variable models
- Intervention Equivalence Classes
- Measurement Model Equivalence Classes
- Linear Non-Gaussian Model Equivalence Classes
Equivalence:
- Independence: M1 ╞ X _||_ Y | Z ⇔ M2 ╞ X _||_ Y | Z
- Distribution: $\forall \theta_1\, \exists \theta_2:\; M_1(\theta_1) = M_2(\theta_2)$

54 Representations of Independence Equivalence Classes We want the representations to: Characterize the Independence Relations Entailed by the Equivalence Class Represent causal features that are shared by every member of the equivalence class

55 Patterns & PAGs Patterns (Verma and Pearl, 1990): graphical representation of Markov equivalence - with no latent variables. PAGs: (Richardson 1994) graphical representation of an equivalence class including latent variable models and sample selection bias that are Markov equivalent over a set of measured variables X
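For DAGs without latent variables, Markov equivalence has a simple graphical test: two DAGs over the same variables are equivalent exactly when they share the same adjacencies and the same unshielded colliders. The sketch below illustrates that criterion; the edge-set representation and the function names are assumptions made for this example, not Tetrad's API.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected adjacencies of a DAG given as (parent, child) pairs."""
    return {frozenset(edge) for edge in edges}


def unshielded_colliders(edges):
    """Triples (x, z, y) with x -> z <- y where x and y are not adjacent."""
    adjacent = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pa in parents.items():
        for x, y in combinations(sorted(pa), 2):
            if frozenset((x, y)) not in adjacent:
                found.add((x, child, y))
    return found


def markov_equivalent(dag_a, dag_b):
    """Same skeleton and same unshielded colliders (same node set assumed)."""
    return (skeleton(dag_a) == skeleton(dag_b)
            and unshielded_colliders(dag_a) == unshielded_colliders(dag_b))


chain = {("X", "Y"), ("Y", "Z")}          # X -> Y -> Z
reverse_chain = {("Y", "X"), ("Z", "Y")}  # X <- Y <- Z
fork = {("Y", "X"), ("Y", "Z")}           # X <- Y -> Z
collider = {("X", "Y"), ("Z", "Y")}       # X -> Y <- Z
```

The chain, reverse chain, and fork all entail X _||_ Z | Y and are mutually equivalent, so a pattern represents them with undirected edges; the collider entails X _||_ Z instead and forms its own class.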

56 Patterns

57 Patterns

58 PAGs: Partial Ancestral Graphs

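Regression vs. PAGs
[Figure columns: Truth; Regression; PAG. Variables: X, Y, Z1, Z2]
Regression results shown: $\beta_X = 0$, $\beta_Z \neq 0$; $\beta_X \neq 0$, $\beta_Z \neq 0$; $\beta_X \neq 0$, $\beta_{Z1} \neq 0$, $\beta_{Z2} \neq 0$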

60 Causal Model Search Background Knowledge PC, GES, CPC FCI, CFCI Impossible

61 Overview of Search Methods
Constraint Based Searches
- TETRAD (SGS, PC, FCI)
- Very fast: capable of handling 1,000 variables
- Pointwise, but not uniformly consistent
Scoring Searches
- Scores: BIC, AIC, etc.
- Search: Hill Climb, Genetic Alg., Simulated Annealing
- Difficult to extend to latent variable models
- Meek and Chickering Greedy Equivalence Class (GES)
- Very slow: max N ~
- Pointwise, but not uniformly consistent

62 5. Examples

63 Case Study 1: Foreign Investment
Does Foreign Investment in 3rd World Countries cause Political Repression?
Timberlake, M. and Williams, K. (1984). Dependence, political exclusion, and government repression: Some cross-national evidence. American Sociological Review 49,
N = 72
- PO: degree of political exclusivity
- CV: lack of civil liberties
- EN: energy consumption per capita (economic development)
- FI: level of foreign investment

64 Case Study 1: Foreign Investment
[Table: correlations among po, fi, en, cv]

65 Case Study 1: Foreign Investment
Regression Results:
po = .227*fi - .176*en + .880*cv
SE:  (.058)    (.059)    (.060)
Interpretation: foreign investment increases political repression

Alternatives Case Study 1: Foreign Investment There is no model with testable constraints (df > 0) in which FI has a positive effect on PO that is not rejected by the data.

67 Case Study 2: Charitable Giving
Cryder & Loewenstein (in prep)
Variables:
- Tangibility/Concreteness (Exp manipulation)
- Imaginability (Likert 1-7)
- Impact (avg. of 2 Likerts)
- Sympathy (Likert)
- Donation ($)

68 Case Study 2: Charitable Giving
Theoretical Model, study 1 (N = 94): df = 5, $\chi^2$ = 52.0, p =

69 Case Study 2: Charitable Giving
GES Outputs:
- study 1: df = 5, $\chi^2$ = 5.88, p = 0.32
- study 1: df = 5, $\chi^2$ = 3.99, p = 0.55

70 Case Study 2: Charitable Giving
Theoretical Model, study 2 (N = 115): df = 5, $\chi^2$ = 62.6, p =
- study 2: df = 5, $\chi^2$ = 8.23, p = 0.14
- study 2: df = 5, $\chi^2$ = 7.48, p = 0.18

71 Case Study 2: Charitable Giving
GES Outputs:
- study 1: df = 5, $\chi^2$ = 5.88, p = 0.32; study 2: df = 5, $\chi^2$ = 8.23, p = 0.14
- study 1: df = 5, $\chi^2$ = 3.99, p = 0.55; study 2: df = 5, $\chi^2$ = 7.48, p = 0.18

Lead and IQ: Variable Selection
Final Variables (Needleman):
- lead: baby teeth
- fab: father’s age
- mab: mother’s age
- nlb: number of live births
- med: mother’s education
- piq: parent’s IQ
- ciq: child’s IQ

Needleman Regression
ciq regressed on lead, fab, nlb, med, mab, piq (standardized coefficients; t-ratios in parentheses):
lead (2.32), fab (1.79), nlb (2.30), med (3.08), mab (1.97), piq (3.87)
All variables significant at .1
$R^2$ = .271

TETRAD Variable Selection
Tetrad:
- mab _||_ ciq
- fab _||_ ciq
- nlb _||_ ciq | med
Regression:
- mab _||_ ciq | { lead, med, piq, nlb, fab }
- fab _||_ ciq | { lead, med, piq, nlb, mab }
- nlb _||_ ciq | { lead, med, piq, mab, fab }

Regressions (standardized coefficients; t-ratios in parentheses; p-value for significance)
Needleman ($R^2$ = .271): ciq on lead (2.32), fab (1.79), nlb (2.30), med (3.08), mab (1.97), piq (3.87)
TETRAD ($R^2$ = .243): ciq on lead (2.89, p < 0.01), med (3.50, p < 0.01), piq (3.59, p < 0.01)

Measurement Error
- Measured regressor variables are proxies that involve measurement error
- Errors-in-all-variables model for Lead’s influence on IQ: underidentified
Strategies:
- Sensitivity Analysis
- Bayesian Analysis

Prior over Measurement Error
Proportion of Variance from Measurement Error:
- Measured Lead: Mean = .2, SD = .1
- Parent’s IQ: Mean = .3, SD = .15
- Mother’s Education: Mean = .3, SD = .15
Prior otherwise uninformative

Posterior Zero Robust over similar priors

Using Needleman’s Covariates With similar prior, the marginal posterior: Very Sensitive to Prior Over Regressors TETRAD eliminated Zero

80 6. Causal Model Search with Latent Variables

81 The Causal Theory Formation Problem for Latent Variable Models Given observations on a number of variables, identify the latent variables that underlie these variables and the causal relations among these latent concepts. Example: Spectral measurements of solar radiation intensities. Variables are intensities at each measured frequency. Example: Quality of a Child’s Home Environment, Cumulative Exposure to Lead, Cognitive Functioning

82 The Most Common Automatic Solution: Exploratory Factor Analysis
- Chooses “factors” to account linearly for as much of the variance/covariance of the measured variables as possible
- Great for dimensionality reduction
- Factor rotations are arbitrary
- Gives no information about the statistical and thus the causal dependencies among any real underlying factors
- No general theory of the reliability of the procedure

83 Other Solutions
- Independent Components, etc.
- Background Theory
- Scales

84 Other Solutions: Background Theory Key Causal Question Thus, key statistical question: Lead _||_ Cog | Home ? Specified Model

85 Lead _||_ Cog | Home ? Yes, but statistical inference will say otherwise. Other Solutions: Background Theory True Model “Impurities”

86 Purify Specified Model

87 Purify True Model

88 Purify True Model

89 Purify True Model

90 Purify True Model

91 Purify Purified Model

92 Scale = sum(measures of a latent) Other Solutions: Scales

93 True Model Other Solutions: Scales Pseudo-Random Sample: N = 2,000

94 Scales vs. Latent variable Models
Regression: Cognition on Home, Lead
[Table: Predictor (Constant, Home, Lead) by Coef, SE Coef, T, P]
S =
R-Sq = 61.1%, R-Sq(adj) = 61.0%
Insig.
[Figure: True Model]

95 Scales vs. Latent variable Models
Scales:
homescale = (x1 + x2 + x3)/3
leadscale = (x4 + x5 + x6)/3
cogscale = (x7 + x8 + x9)/3
[Figure: True Model]

96 Scales vs. Latent variable Models
Regression: Cognition on homescale, Lead
[Table: Predictor (Constant, homescal, Lead) by Coef, SE Coef, T, P]
Sig.
[Figure: True Model]

97 Scales vs. Latent variable Models Modeling Latents True Model Specified Model

98 Scales vs. Latent variable Models
($\chi^2$ = 29.6, df = 24, p = .19)
B5 = .0075, which at t = .23, is correctly insignificant
[Figure: True Model; Estimated Model]

99 Scales vs. Latent variable Models
Mixing Latents and Scales
($\chi^2$ = 14.57, df = 12, p = .26)
B5 = -.137, which at t = 5.2, is incorrectly highly significant (P < .001)
[Figure: True Model]

100 Build Pure Clusters Output - provably reliable (pointwise consistent): Equivalence class of measurement models over a pure subset of measures True Model Output

101 Build Pure Clusters
Qualitative Assumptions:
1. Two types of nodes: measured (M) and latent (L)
2. $M \not\to L$ (measured don’t cause latents)
3. Each $m \in M$ measures (is a direct effect of) at least one $l \in L$
4. No cycles involving M
Quantitative Assumptions:
1. Each $m \in M$ is a linear function of its parents plus noise
2. P(L) has second moments, positive variances, and no deterministic relations

102 Case Study 4: Stress, Depression, and Religion
MSW Students (N = 127), 61-item survey (Likert Scale):
- Stress: St1 - St21
- Depression: D1 - D20
- Religious Coping: C1 - C20
Specified Model: p = 0.00

103 Build Pure Clusters Case Study 4: Stress, Depression, and Religion

104 Assume Stress temporally prior: MIMbuild to find Latent Structure: p = 0.28 Case Study 4: Stress, Depression, and Religion

105 Case Study 5: Test Anxiety Bartholomew and Knott (1999), Latent variable models and factor analysis 12th Grade Males in British Columbia (N = 335) 20 - item survey (Likert Scale items): X 1 - X 20 : Exploratory Factor Analysis:

106 Build Pure Clusters : Case Study 5: Test Anxiety

107 Build Pure Clusters: p-value = 0.00 p-value = 0.47 Exploratory Factor Analysis: Case Study 5: Test Anxiety

108 Case Study 5: Test Anxiety
MIMbuild: p = .43
Uninformative Scales: No Independencies or Conditional Independencies

109 Other Cases
Economics
- Bessler, Pork Prices
- Hoover, multiple
Educational Research
- Easterday, Bias & Recall
- Laski, Numerical coding
Climate Research
- Glymour, Chu, Teleconnections
Biology
- Shipley,
- SGS, Spartina Grass
Neuroscience
- Glymour & Ramsey, fMRI
Epidemiology
- Scheines, Lead & IQ

Software
Education:
- Causality Lab:
- Web Course on Causal and Statistical Reasoning, and Empirical Research Methods:
Research:
- Tetrad:

References
- Causation, Prediction, and Search, 2nd Edition (2000), by P. Spirtes, C. Glymour, and R. Scheines (MIT Press)
- Causality: Models, Reasoning, and Inference (2000), by Judea Pearl, Cambridge Univ. Press
- Computation, Causation, & Discovery (1999), edited by C. Glymour and G. Cooper, MIT Press

112 References: Biology
- Chu, Tianjiao, Glymour, C., Scheines, R., & Spirtes, P. (2002). A Statistical Problem for Inference to Regulatory Structure from Associations of Gene Expression Measurement with Microarrays. Bioinformatics, 19:
- Shipley, B. Exploring hypothesis space: examples from organismal biology. In Computation, Causation and Discovery, C. Glymour and G. Cooper (eds). Cambridge, MA, MIT Press.
- Shipley, B. (1995). Structured interspecific determinants of specific leaf area in 34 species of herbaceous angiosperms. Functional Ecology 9.

113 References
- Scheines, R. (2000). Estimating Latent Causal Influences: TETRAD III Variable Selection and Bayesian Parameter Estimation: the effect of Lead on IQ. Handbook of Data Mining, Pat Hayes, editor, Oxford University Press.
- Jackson, A., and Scheines, R. (2005). Single Mothers' Self-Efficacy, Parenting in the Home Environment, and Children's Development in a Two-Wave Study. Social Work Research, 29, 1, pp
- Timberlake, M. and Williams, K. (1984). Dependence, political exclusion, and government repression: Some cross-national evidence. American Sociological Review 49,

114 References: Economics
- Akleman, Derya G., David A. Bessler, and Diana M. Burton. (1999). ‘Modeling corn exports and exchange rates with directed graphs and statistical loss functions’, in Clark Glymour and Gregory F. Cooper (eds) Computation, Causation, and Discovery, American Association for Artificial Intelligence, Menlo Park, CA and MIT Press, Cambridge, MA, pp
- Awokuse, T. O. (2005). “Export-led Growth and the Japanese Economy: Evidence from VAR and Directed Acyclical Graphs,” Applied Economics Letters 12(14),
- Bessler, David A. and N. Loper. (2001). “Economic Development: Evidence from Directed Acyclical Graphs,” Manchester School 69(4),
- Bessler, David A. and Seongpyo Lee. (2002). ‘Money and prices: U.S. data (a study with directed graphs)’, Empirical Economics, Vol. 27, pp
- Demiralp, Selva and Kevin D. Hoover. (2003). “Searching for the Causal Structure of a Vector Autoregression,” Oxford Bulletin of Economics and Statistics 65(supplement), pp
- Haigh, M.S., N.K. Nomikos, and D.A. Bessler. (2004). “Integration and Causality in International Freight Markets: Modeling with Error Correction and Directed Acyclical Graphs,” Southern Economic Journal 71(1),
- Sheffrin, Steven M. and Robert K. Triest. (1998). ‘A new approach to causality and economic growth’, unpublished typescript, University of California, Davis.

115 References: Economics
- Swanson, Norman R. and Clive W.J. Granger. (1997). ‘Impulse response functions based on a causal approach to residual orthogonalization in vector autoregressions’, Journal of the American Statistical Association, Vol. 92, pp
- Demiralp, S., Hoover, K., & Perez, S. (2008). A Bootstrap Method for Identifying and Evaluating a Structural Vector Autoregression. Oxford Bulletin of Economics and Statistics, 70, (4),
- Hoover, Kevin D., Selva Demiralp, and Stephen J. Perez. Empirical Identification of the Vector Autoregression: The Causes and Effects of U.S. M2. Paper written for presentation at the Conference in Honour of David F. Hendry at Oxford University, 23-25 August
- Demiralp, Selva and Kevin D. Hoover. (2003). Searching for the Causal Structure of a Vector Autoregression. Oxford Bulletin of Economics and Statistics, 65, Supplement.
- Moneta, A. and P. Spirtes. (2006). “Graphical Models for the Identification of Causal Structures in Multivariate Time Series Model”, Proceedings of the 2006 Joint Conference on Information Sciences, JCIS 2006, Kaohsiung, Taiwan, ROC, October 8-11, 2006, Atlantis Press.

References
- Eberhardt, F., and Scheines, R. (2007). “Interventions and Causal Inference”, in PSA-2006, Proceedings of the 20th biennial meeting of the Philosophy of Science Association
- Silva, R., Glymour, C., Scheines, R. and Spirtes, P. (2006). “Learning the Structure of Latent Linear Structure Models,” Journal of Machine Learning Research, 7,