Outline
1) Motivation
2) Representing/Modeling Causal Systems
3) Estimation and Updating
4) Model Search
5) Linear Latent Variable Models
6) Case Study: fMRI
Goals
1. To extract from imaging data as much information as we can, as accurately as we can, about which brain regions influence which others in the course of psychological tasks.
2. To generalize over tasks.
3. To specialize over groups of people.
What Are the Brain Variables?
In current studies, anywhere from 20,000+ voxels down to a small number of ROIs (ROI = region of interest).
Question: How sensitive are causal inferences to brain variable selection?
How are ROIs constructed (FSL)?
- Define an experimental variable (box function).
- Use a generalized linear model to determine which voxels "light up" in correlation with the experimental variable.
- Add a group-level step if voxels lighting up for the group is desired.
- Cluster the resulting voxels into connected clusters (see the sketch after this list):
  - Small clusters are eliminated.
  - Remaining clusters become the ROIs.
  - Symmetry constraints may be imposed.
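Below is a minimal Python sketch of that fit-threshold-cluster logic under simplifying assumptions. It is not FSL; the function name make_rois, the array shapes, and the threshold values are illustrative choices, not anything from the slides.

```python
# Minimal sketch of the ROI-construction idea (not FSL itself).
# Assumes: `data` is a 4D array (x, y, z, time) and `box` is the box-car
# experimental regressor; thresholds are illustrative.
import numpy as np
from scipy import ndimage

def make_rois(data, box, t_thresh=3.0, min_voxels=20):
    X = np.column_stack([np.ones_like(box), box])      # GLM design: intercept + task
    nx, ny, nz, nt = data.shape
    Y = data.reshape(-1, nt).T                          # time x voxels
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)        # per-voxel GLM fit
    resid = Y - X @ beta
    se = np.sqrt(resid.var(axis=0, ddof=X.shape[1]) *
                 np.linalg.inv(X.T @ X)[1, 1])
    tvals = beta[1] / se                                # t-stat for the task regressor
    active = (tvals > t_thresh).reshape(nx, ny, nz)     # voxels that "light up"
    labels, n = ndimage.label(active)                   # connected clusters
    rois = []
    for k in range(1, n + 1):
        mask = labels == k
        if mask.sum() >= min_voxels:                    # drop small clusters
            rois.append(mask)
    return rois
```

The real FSL pipeline adds prewhitening, spatial smoothing, group-level modeling, and proper cluster inference; the sketch only captures the voxelwise-GLM-then-cluster logic described above.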
Search Complexity: How Big is the Set of Possible Explanations?
Even for two variables X and Y there are several candidate graphical models (no edge, X → Y, Y → X, and variants with latent common causes or feedback).
For N variables, the number of possible graphical models grows super-exponentially in N (a quick count is sketched below).
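To make "super-exponential" concrete (this count is an addition, not from the slides): the number of DAGs on n labeled nodes satisfies Robinson's recurrence a_n = sum over k = 1..n of (-1)^(k+1) C(n,k) 2^(k(n-k)) a_(n-k), with a_0 = 1; allowing latent confounders or cycles only enlarges the space.

```python
# Number of DAGs on n labeled nodes (Robinson's recurrence), to illustrate
# how fast the space of candidate causal structures grows.
from math import comb
from functools import lru_cache

@lru_cache(maxsize=None)
def n_dags(n: int) -> int:
    if n == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * n_dags(n - k)
               for k in range(1, n + 1))

for n in (2, 5, 10):
    print(n, n_dags(n))   # 2 -> 3, 5 -> 29281, 10 -> ~4.2e18
```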
Statistical Complexity
- Graphical models are untestable unless parameterized into statistical models.
- Incomplete models of associations are likely to fail tests.
- Multiple testing problems.
- Multiple subjects / missing ROIs.
- No fast scoring method for mixed ancestral graphs that model feedback and latent common causes.
- Weak time-lag information.
Measurement Complexity
- Sampling rate is slower than causal interaction speed.
- Indirect measurement creates spurious associations among measured variables (see the toy simulation below).
[Diagram: neural variables N1, N2, N3 with measured variables X1, X2, X3; regression of X3 on X1, X2.]
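Here is a toy simulation of the second point (my own illustration, with made-up coefficients): each measured X_i is a noisy proxy of its neural source, and regressing X3 on X1 and X2 yields a clearly nonzero coefficient on X1 even though, in the simulated neural chain, N1 affects N3 only through N2.

```python
# Toy illustration (not from the slides): indirect, noisy measurement of a
# neural chain N1 -> N2 -> N3 leaves a spurious partial association between
# the measured X1 and X3 after regressing on X2.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
N1 = rng.standard_normal(n)
N2 = 0.8 * N1 + rng.standard_normal(n)
N3 = 0.8 * N2 + rng.standard_normal(n)
X1, X2, X3 = (N + 0.7 * rng.standard_normal(n) for N in (N1, N2, N3))

X = np.column_stack([np.ones(n), X1, X2])
b = np.linalg.lstsq(X, X3, rcond=None)[0]
print(b[1])   # clearly nonzero, although N3 is independent of N1 given N2
```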
Specification Strategies
1. Guess a model and test it.
2. Search the model space or some restriction of it:
   a. Search for the full parameterized structure.
   b. Search for graphical structure alone.
   c. Search for graphical features (e.g., adjacencies).
What Evidence Is There of What Works, and What Doesn't?
- Theory: limiting correctness of algorithms (PC, FCI, GES, LiNGAM, etc.), under assumptions that are usually incorrect for fMRI.
- Prior knowledge: do automated search results conform with established relationships?
- Animal experiments (limited).
- Simulation studies.
Brief Review: Smith's Simulation Study
- 5 to 50 variables.
- 28 simulation conditions, 50 subjects per condition.
- 38 search methods.
- Search 1 subject at a time.
Methods Tested by Smith
(DCM and SEM excluded; no search. Not completely true.)
- Full correlation in various frequency bands
- Partial correlation
- Lasso (ICOV)
- Mutual information, partial MI
- Granger causality
- Coherence
- Generalized synchronization
- Patel's conditional dependence measures: P(x|y) vs. P(y|x)
- Bayes net methods: CCD, CPC, FCI, PC, GES
- LiNGAM
Smith's Results
- Adjacencies: partial correlation methods (GLASSO) and several "Bayes net" methods from CMU get ~90% correct in most simulations.
- Edge directions: Smith: "None of the methods is very accurate, with Patel's τ performing best at estimating directionality, reaching nearly 65% d-accuracy, all other methods being close to chance." (p. 883)
- Most of the adjacencies for Patel's τ are false.
Simulation conditions (see handout)…
Simulation 2 (10 variables, 11 edges)
Simulation 4 (50 variables, 61 edges)
Simulation 7: 250 minutes, 5 variables
Simulation 8: Shared Inputs
Simulation 14: 5-Cycle
Simulation 15: Stronger Connections
Simulation 16: More Connections
Simulation 22: Nonstationary Connection Strengths
Simulation 24: One Strong External Input
Take-Away Conclusion? Nothing works!
- Methods that get adjacencies (~90%) cannot get directions of influence.
- Methods that get directions (60%-70%) for normal session lengths cannot tell true adjacencies from false adjacencies.
- Even with unrealistically long sessions (4 hours), the best method gets 90% accuracy for directions but finds very few adjacencies.
Idea…
If we could:
- increase sample size (effectively) by using data from multiple subjects,
- focus on a method with strong adjacencies, and
- combine this with a method with strong orientations,
we may be able to do better (Ramsey, Hanson and Glymour, NeuroImage).
This is the strategy of the PC-LiNGAM algorithm of Hoyer and several of us, though there are other ways to pursue the same strategy.
Reminder: If noises are non-Gaussian, we can learn more than a pattern.
(1) Linear models, covariance data: pattern/CPDAG.
(2) Linear models, non-Gaussian noises (LiNG): directed graph.
Are noises for fMRI models non-Gaussian? Yes.
- This is controversial but shouldn't be.
- For the word/pseudoword data of Xue and Poldrack (Task 3), kurtosis ranges up to 39.3 for residuals.
- There is a view in the literature that noises are distributed (empirically) as Gamma, say with shape 19 and scale 20.
Are connection functions linear for fMRI data? You tell me: I've not done a thorough survey of studies.
Coefficients?
- One expects them to be positive.
  - Empirically, in linear models of fMRI data, there are very few negative coefficients (1 in 200, say), and they are only slightly negative when they occur.
  - This is consistent with negative coefficients occurring due to small-sample regression estimation errors.
- For the most part, they need to be less than 1.
  - Brain activations are cyclic and evolve over time.
  - Empirically, in linear models of fMRI, most coefficients are less than 1; to the extent that they are greater than 1, one suspects nonlinearity.
The IMaGES Algorithm
- Adaptation for multiple subjects of GES, a Bayes net method tested by Smith et al.
- Iterative model construction using Bayesian scores computed separately on each subject at each step; the edge with the best average score is added (a sketch of this scoring step follows this list).
- Tolerates ROIs missing in various subjects.
- Seeks feed-forward structure only.
- Finds adjacencies between variables with latent common causes.
- Forces sparsity by a penalized BIC score to avoid triangulated variables (see Measurement Complexity).
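A rough sketch of the cross-subject scoring idea (illustrative only; this is not the Tetrad/IMaGES implementation, and the helper names and penalty value are assumptions): each candidate parent set is scored by a penalized BIC on each subject's data separately, and the average score is what the greedy search compares.

```python
# Sketch of IMaGES-style cross-subject scoring (illustrative, not Tetrad):
# score a candidate set of parents for a variable by averaging a penalized
# BIC over subjects, so one structure is chosen for all subjects.
import numpy as np

def bic_score(y, X, penalty=2.0):
    """Penalized BIC for regressing y on the columns of X (X may be empty)."""
    n = len(y)
    if X.shape[1] == 0:
        resid = y - y.mean()
        k = 1
    else:
        Xd = np.column_stack([np.ones(n), X])
        beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
        resid = y - Xd @ beta
        k = Xd.shape[1]
    sigma2 = (resid ** 2).mean()
    return -n * np.log(sigma2) - penalty * k * np.log(n)

def average_score(datasets, child, parents):
    """Average the per-subject score of `parents -> child` across subjects."""
    return np.mean([bic_score(d[:, child], d[:, list(parents)]) for d in datasets])
```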
IMaGES/LOFS
- Smith (2011): "Future work might look to optimize the use of higher-order statistics specifically for the scenario of estimating directionality from fMRI data."
- LiNGAM orients edges by non-Normality of higher moments of the distributions of adjacent variables.
- LOFS uses the IMaGES adjacencies and the LiNGAM idea for directing edges (with a different score for non-Normality, and without independent components).
- Unlike IMaGES, LOFS can find cycles.
- LOFS (from our paper) is R1 and/or R2…
Procedure R1(S)
(You don't have to read these; I'll describe them!)
G <- empty graph over the variables of S
For each variable V:
    Find the combination C of adj(V, S) that maximizes NG(e_{V|C})
    For each W in C:
        Add W → V to G
Return G
Procedure R2(S)
G <- empty graph over the variables of S
For each pair of variables X, Y:
    Scores <- empty
    For each combination of adjacents C for X and Y:
        If NG(e_{X|Y}) < NG(X) & NG(e_{Y|X}) > NG(Y):
            score <- NG(X) + NG(e_{Y|X})
            Add <X → Y, score> to Scores
        If NG(e_{X|Y}) > NG(X) & NG(e_{Y|X}) < NG(Y):
            score <- NG(e_{X|Y}) + NG(Y)
            Add <Y → X, score> to Scores
    If Scores is empty:
        Add the undirected edge X -- Y to G
    Else:
        Add to G the edge in Scores with the highest score
Return G
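For concreteness, here is a self-contained toy version of the pairwise orientation idea (not the authors' LOFS code): the conditioning over adjacency sets C is omitted, and the Anderson-Darling A^2 statistic from the next slide stands in for the NG score.

```python
# Illustrative sketch (not the authors' LOFS code) of the pairwise R2 idea:
# orient X - Y by comparing non-Gaussianity (Anderson-Darling A^2) of each
# variable with the non-Gaussianity of its residual on the other variable.
import numpy as np
from scipy.stats import anderson

def ng(x):
    """Non-Gaussianity score: Anderson-Darling A^2 against a normal."""
    z = (x - x.mean()) / x.std()
    return anderson(z, dist='norm').statistic

def residual(y, x):
    """Residual of regressing y on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def orient(x, y):
    """Return '->', '<-', or '--' for the edge between x and y."""
    ng_x, ng_y = ng(x), ng(y)
    ng_ex, ng_ey = ng(residual(x, y)), ng(residual(y, x))
    if ng_ex < ng_x and ng_ey > ng_y:
        return '->'     # evidence for x -> y
    if ng_ex > ng_x and ng_ey < ng_y:
        return '<-'     # evidence for y -> x
    return '--'         # leave undirected

# Quick check on synthetic non-Gaussian data where x -> y:
rng = np.random.default_rng(1)
x = rng.laplace(size=5000)
y = 0.7 * x + rng.laplace(size=5000)
print(orient(x, y))     # expected '->' on most runs
```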
Non-Gaussianity Scores
- Log cosh – used in ICA
- Exp, -e^(-X^2/2) – used in ICA
- Kurtosis – ICA (one of the first tried, not great)
- Mean absolute – PC-LiNGAM
- E(e^X) – cumulant arithmetic: E(e^X) = e^(κ_1(X) + (1/2!) κ_2(X) + (1/3!) κ_3(X) + …)
- Anderson-Darling A^2 – LOFS: an empirical distribution function (EDF) score with heavy weighting on the tails. We're using this one!
Mixing Residuals
- We are assuming that residuals for ROIs from different subjects are drawn from the same population, so that they can be mixed.
- Sometimes we center residuals from different subjects before mixing, sometimes not (a minimal pooling helper is sketched below).
- For the Smith study it doesn't matter: the data are already centered!
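A minimal pooling helper, assuming (as above) that per-subject residuals for an ROI come from a common population; the function name is my own.

```python
# Sketch: pool per-subject residuals for one ROI, optionally centering each
# subject's residuals first (an assumption of a common population).
import numpy as np

def mix_residuals(residuals_by_subject, center=True):
    parts = [r - r.mean() if center else r for r in residuals_by_subject]
    return np.concatenate(parts)
```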
Precision and Recall
- Precision = true positives / all predicted positives: what fraction of the edges you found are correct?
- Recall = true positives / all true edges: what fraction of the correct edges did you find?
(A small worked example follows.)
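Applied to edge recovery, a tiny self-contained example (my own illustration):

```python
# Sketch: precision/recall for recovered directed edges against a true graph.
def precision_recall(true_edges, found_edges):
    true_edges, found_edges = set(true_edges), set(found_edges)
    tp = len(true_edges & found_edges)
    precision = tp / len(found_edges) if found_edges else float('nan')
    recall = tp / len(true_edges) if true_edges else float('nan')
    return precision, recall

truth = {('X1', 'X2'), ('X2', 'X3')}
found = {('X1', 'X2'), ('X3', 'X2')}
print(precision_recall(truth, found))   # (0.5, 0.5)
```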
Some Further Problems
- Discovering nearly canceling 2-cycles is hard (but we will try anyway…).
- Identifying latents for acyclic models.
- Reliability of search may be worse with event designs than with block designs.
- Subjects that differ in causal structure will yield poor results for multi-subject methods.
Hands On
1. Download fmridata.tet.
2. Attach a Search box to the data and run IMaGES.
3. Copy the layout from the layout graph provided into the search box (using menus).
4. Attach another Search box with IMaGES and Data as input and run LOFS.
5. Try variations!
Thanks!
References:
- Smith, S. M., Miller, K. L., Salimi-Khorshidi, G., Webster, M., Beckmann, C. F., Nichols, T. E., Ramsey, J. D., and Woolrich, M. W. (2011). Network modelling methods for fMRI. NeuroImage.
- Ramsey, J. D., Hanson, S. J., Hanson, C., Halchenko, Y. O., Poldrack, R. A., and Glymour, C. (2010). Six problems for causal inference from fMRI. NeuroImage.
- Ramsey, J. D., Hanson, S. J., and Glymour, C. Multi-subject search correctly identifies causal connections and most causal directions in the DCM models of the Smith et al. simulation study. NeuroImage.
- Xue, G., and Poldrack, R. (2007). The neural substrates of visual perceptual learning of words: implications for the visual word form area hypothesis. J. Cogn. Neurosci.
Thanks to the James S. McDonnell Foundation.