From Brain Images to Causal Connections Center for Causal Discovery (CCD) BD2K All Hands Meeting University of Pittsburgh Carnegie Mellon University Pittsburgh.

From Brain Images to Causal Connections Center for Causal Discovery (CCD)
BD2K All Hands Meeting
University of Pittsburgh, Carnegie Mellon University, Pittsburgh Supercomputing Center, Yale University
11/29/2016
The project is a collaborative effort among investigators at Pitt, UPMC, CMU, and the Pittsburgh Supercomputing Center. Some of the key personnel on the project have been collaborating on methods for causal modeling and discovery for more than 20 years. One driving biomedical problem will be the discovery of cell signaling pathways in several cancers, including breast, lung, and colon cancer. Understanding these pathways is central to developing drug therapies that can effectively treat the cancers. Another driving biomedical problem will be the discovery of the mechanisms of disease in COPD. Both of these biomedical problems are translational and involve using data that span from the molecular to the clinical.

Goals
From brain scans, infer the causal connections of activation between brain regions at rest and while performing tasks, with:
  Smallest spatially resolved regions (“voxels”)
  Clusters of voxels (“regions of interest”)
Identify causal connection anomalies in neurotypical populations

Causal Search Is Parameter Estimation
X <- Y -> Z: "Causes" table over X, Y, Z
X -> Y -> Z: "Causes" table over X, Y, Z
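One way to read this slide is that each candidate structure corresponds to a 0/1 "Causes" table, and search amounts to estimating the entries of that table. The sketch below builds such tables for the two structures named on the slide. The row-causes-column convention and the helper name `causes_matrix` are assumptions for illustration; the slide's own table layout did not survive in the transcript.

```python
# Variables in a fixed order. Assumed convention: entry [i][j] == 1 means
# "the row variable is a direct cause of the column variable".
VARS = ["X", "Y", "Z"]


def causes_matrix(edges):
    """Build a 0/1 'Causes' table from a list of (cause, effect) pairs."""
    index = {name: i for i, name in enumerate(VARS)}
    matrix = [[0] * len(VARS) for _ in VARS]
    for cause, effect in edges:
        matrix[index[cause]][index[effect]] = 1
    return matrix


common_cause = causes_matrix([("Y", "X"), ("Y", "Z")])  # X <- Y -> Z
chain = causes_matrix([("X", "Y"), ("Y", "Z")])         # X -> Y -> Z

for label, table in [("X <- Y -> Z", common_cause), ("X -> Y -> Z", chain)]:
    print(label)
    for name, row in zip(VARS, table):
        print(" ", name, row)
```

The two structures differ only in which entries of the table are 1, which is the sense in which choosing a structure is a parameter estimation problem.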

Causal Search is Estimating Many Distributions Not Sampled from a Distribution Sampled
From distribution (graph over X, Y, Z):
P(X,Y,Z) = P(X | Y) P(Z | X, Y) P(Y)
Predict the distribution if X were perturbed (graph over X, Y, Z):
P*(X,Y,Z) = P*(X) P(Z | X, Y) P(Y)
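The difference between the sampled and the perturbed distribution can be illustrated with a small simulation. This is a minimal sketch assuming a linear Gaussian model consistent with the factorization P(X | Y) P(Z | X, Y) P(Y); the coefficients `A`, `B`, `C` and the function names are hypothetical and not taken from the slides.

```python
import random

# Hypothetical coefficients: X = A*Y + noise, Z = B*X + C*Y + noise.
A, B, C = 1.0, 0.5, 1.0


def draw(n, perturb_x, rng):
    """Sample (x, z) pairs. If perturb_x, X is drawn from P*(X), ignoring Y."""
    xs, zs = [], []
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)                  # P(Y)
        if perturb_x:
            x = rng.gauss(0.0, 1.0)              # P*(X): set independently of Y
        else:
            x = A * y + rng.gauss(0.0, 1.0)      # P(X | Y)
        z = B * x + C * y + rng.gauss(0.0, 1.0)  # P(Z | X, Y), same in both cases
        xs.append(x)
        zs.append(z)
    return xs, zs


def slope(xs, zs):
    """Least-squares slope of z regressed on x."""
    n = len(xs)
    mx, mz = sum(xs) / n, sum(zs) / n
    cov = sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var


rng = random.Random(0)
observed_slope = slope(*draw(20000, False, rng))
perturbed_slope = slope(*draw(20000, True, rng))
print("slope of Z on X, sampled distribution:  ", round(observed_slope, 2))
print("slope of Z on X, perturbed distribution:", round(perturbed_slope, 2))
```

Under these assumed coefficients the sampled slope mixes the direct effect of X with the confounding path through Y, while the perturbed slope approximates the direct effect `B` alone.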

Causal Skeletons
Causal Structure    Causal Skeleton
X -> Y -> Z         X – Y – Z
X -> Y <- Z         X – Y – Z

Causation is Hard to Find
Correlation is not causation: X <- Y -> Z => X – Y – Z
Partial correlation is not causation: X -> Y <- Z => X – Y – Z
These methods do not even identify the causal skeletons correctly.
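A short simulation illustrates why neither method recovers the skeleton. It is a sketch under assumed unit-coefficient linear Gaussian models (not from the slides): with a common cause, X and Z are correlated although they are not adjacent; with a common effect, X and Z are uncorrelated but become partially correlated once Y is controlled for.

```python
import math
import random


def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def partial_corr(a, b, given):
    """Partial correlation of a and b controlling for one variable."""
    r_ab, r_ag, r_bg = corr(a, b), corr(a, given), corr(b, given)
    return (r_ab - r_ag * r_bg) / math.sqrt((1 - r_ag ** 2) * (1 - r_bg ** 2))


rng = random.Random(1)
n = 20000

# Common cause: X <- Y -> Z (no X - Z edge in the true skeleton).
y1 = [rng.gauss(0, 1) for _ in range(n)]
x1 = [y + rng.gauss(0, 1) for y in y1]
z1 = [y + rng.gauss(0, 1) for y in y1]
cc_corr = corr(x1, z1)
cc_pcorr = partial_corr(x1, z1, y1)

# Common effect: X -> Y <- Z (again no X - Z edge in the true skeleton).
x2 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rng.gauss(0, 1) for _ in range(n)]
y2 = [x + z + rng.gauss(0, 1) for x, z in zip(x2, z2)]
col_corr = corr(x2, z2)
col_pcorr = partial_corr(x2, z2, y2)

print("common cause:  corr(X,Z) =", round(cc_corr, 2),
      " partial corr(X,Z|Y) =", round(cc_pcorr, 2))
print("common effect: corr(X,Z) =", round(col_corr, 2),
      " partial corr(X,Z|Y) =", round(col_pcorr, 2))
```

Thresholding correlations adds a spurious X – Z edge in the first case, and thresholding partial correlations adds one in the second.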

Asymptotically Correct Methods
PC – Uses a sequence of hypothesis tests
GES – Uses a score (BIC)
LiNGAM – Uses Independent Components

Strategies
PC, GES are asymptotically correct but do not scale up
  Optimize and parallelize the algorithms to fastGES (FGES) and PC-Max
LiNGAM requires very large sample sizes for accuracy
  Use a fragment of the LiNGAM model for estimation of causal directions (LING)

FGES Is Accurate In High Dimensions
Fast Greedy Search: Precisions and Recalls, N = 1000, simulated Gaussian data. Even better accuracies with PC-Max
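For reference, precision and recall for a recovered graph can be computed as below. This sketch assumes the figures are adjacency (skeleton) precision and recall, which the transcript does not state; the function names and the toy edge lists are hypothetical.

```python
def skeleton(edges):
    """Drop orientation: each edge becomes an unordered pair of nodes."""
    return {frozenset(edge) for edge in edges}


def adjacency_precision_recall(true_edges, estimated_edges):
    """Precision: share of estimated adjacencies that are true.
    Recall: share of true adjacencies that were estimated."""
    truth, estimate = skeleton(true_edges), skeleton(estimated_edges)
    hits = len(truth & estimate)
    precision = hits / len(estimate) if estimate else 1.0
    recall = hits / len(truth) if truth else 1.0
    return precision, recall


# Toy example: two of the three estimated adjacencies are in the true graph.
true_edges = [("X", "Y"), ("Y", "Z"), ("Z", "W")]
estimated_edges = [("Y", "X"), ("Y", "Z"), ("X", "Z")]
precision, recall = adjacency_precision_recall(true_edges, estimated_edges)
print("precision:", round(precision, 3), "recall:", round(recall, 3))
```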

Example: The Causal Skeleton of a Resting State Brain with FGES
60 scans of the same individual taken over the course of a year.
Each scan produces approximately 51,000 voxel time series sampled at approximately 2-second intervals (~500 sample points).
Resting-state signals are not strictly Gaussian.
The series are not strictly stationary.

Single subject, whole brain: 51,212 voxels, 518 datapoints
Figure panels: Session 30 (FGS penalty 4) and Session 104
The colors represent the length of the edge, i.e., hotter colors mean short-distance connections; cooler colors mean long-distance connections.
The brains at the left are whole-brain, cortical voxel space, FGS penalty 4 graphs for 2 different sessions. The upper brain is from a session at the beginning of the experiment, and the lower brain is from the last session of the experiment, more than a year apart.
Session 30 has 138,287 edges; Session 104 has 114,536 edges.
The brains to the right were obtained by parcellating the brain into ROIs (each node represents the spatial centroid of an ROI); for visualization purposes we only draw edges for each pair of ROIs that are connected by > 50 edges at the voxelwise level (left figure).
The thicker the edge, the larger the number of edges connecting the pair of ROIs.
The structure recovered in both sessions is highly similar.
We need to ask Madelyn the approx. running time of FGS penalty 4 for these problems on the supercomputer.
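The ROI-level graphs described in these notes can be derived from a voxel-level edge list roughly as follows. This is a sketch, not the group's pipeline: the function names, the toy parcellation, and the choice to ignore edges inside a single ROI are assumptions. Only the "more than 50 voxelwise edges" rule comes from the notes.

```python
from collections import Counter


def roi_edge_counts(voxel_edges, voxel_to_roi):
    """Count voxel-level edges linking each unordered pair of distinct ROIs."""
    counts = Counter()
    for a, b in voxel_edges:
        roi_a, roi_b = voxel_to_roi[a], voxel_to_roi[b]
        if roi_a != roi_b:  # assumption: within-ROI edges are not drawn
            counts[tuple(sorted((roi_a, roi_b)))] += 1
    return counts


def roi_graph(counts, min_edges=50):
    """Keep ROI pairs connected by more than min_edges voxel-level edges.
    The retained count can serve as the drawn edge thickness."""
    return {pair: n for pair, n in counts.items() if n > min_edges}


# Toy data: six voxels in three ROIs, with a small threshold for illustration.
voxel_to_roi = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C", 5: "C"}
voxel_edges = [(0, 2), (1, 3), (0, 3), (2, 4), (0, 1)]
counts = roi_edge_counts(voxel_edges, voxel_to_roi)
drawn = roi_graph(counts, min_edges=2)
print(dict(counts))
print(drawn)
```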

FGES: Penalty and Complexity
The more connections per node in the true graph, the longer the FGES search takes. If the true graph is very dense, the search may never return. A penalty term in the BIC score can be increased to force sparsity on the graph discovered and increase speed. We can run a single scan of the whole brain in about 12 hours with penalty 4. Result: 134,000 edges.
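The effect of the penalty can be sketched with the textbook linear Gaussian BIC, where a multiplier on the complexity term decides whether adding an edge improves the score. The exact score used by FGES is not given in the slides; the `penalty` argument, the simulated data, and the function names below are illustrative assumptions.

```python
import math
import random


def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def bic_gain_for_edge(xs, ys, penalty):
    """Change in BIC from adding the edge X -> Y to an empty graph.

    Fit term: n * log(var(Y) / residual var) = -n * log(1 - r^2).
    Cost term: penalty * log(n) for the one extra parameter.
    A positive gain means the search would add the edge.
    """
    n = len(xs)
    r = corr(xs, ys)
    return -n * math.log(1.0 - r * r) - penalty * math.log(n)


rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(1000)]
ys = [0.3 * x + rng.gauss(0, 1) for x in xs]  # hypothetical weak dependence

gains = {penalty: bic_gain_for_edge(xs, ys, penalty) for penalty in (1, 4, 50)}
for penalty, gain in gains.items():
    print("penalty", penalty, "-> edge added:", gain > 0)
```

Raising the penalty lowers the gain of every candidate edge by the same amount, so weak dependencies drop out first and the recovered graph becomes sparser.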

Penalties and Sparsity
Relationships found at higher penalties are largely retained at lower penalties.

What’s Wrong with This Graph?
We know there are feedback structures in the brain.
FGES estimations of directions of influence assume acyclicity (no feedback).
For feedback there is an asymptotically correct algorithm called Cyclic Causal Discovery (CCD; Richardson, 1996), but it is slow and not very accurate.
Same strategy: optimize and speed up.

Carnegie Mellon CCD fMRI Group
Ruben Sanchez Romero, Madelyn Glymour, Biwei Huang, Kun Zhang, Joseph Ramsey, Clark Glymour

Figure axes:
X axis: Threshold for number of voxel-voxel connections between ROIs
Y axis: Number of ROI-ROI edges exceeding a threshold in at least 59 of 60 scans
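The quantity on the Y axis can be computed from per-scan ROI-pair counts roughly as follows. This is a sketch with hypothetical names and toy numbers: three scans stand in for the 60, and `min_scans=2` stands in for 59.

```python
def stable_roi_edges(counts_per_scan, threshold, min_scans):
    """ROI pairs whose voxel-voxel connection count exceeds the threshold
    in at least min_scans of the scans."""
    pairs = set().union(*(scan.keys() for scan in counts_per_scan))
    return {
        pair
        for pair in pairs
        if sum(1 for scan in counts_per_scan if scan.get(pair, 0) > threshold)
        >= min_scans
    }


# Toy counts: ROI pair -> number of voxel-voxel connections, one dict per scan.
counts_per_scan = [
    {("A", "B"): 60, ("B", "C"): 10},
    {("A", "B"): 55, ("B", "C"): 70},
    {("A", "B"): 40, ("A", "C"): 80},
]

# One point of the curve per threshold on the X axis.
curve = {
    t: len(stable_roi_edges(counts_per_scan, t, min_scans=2)) for t in (5, 50, 100)
}
print(curve)
```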

Work to Do: Easier
Identify the voxel-level connections that are invariant across all or almost all 60 scans.
Difficult at the voxel level because voxels cannot be identified perfectly in different scans, even of the same person.
  Motion correct all scans to a single scan
  Use a voxel distance function
Identify the voxels or small voxel regions that have the greatest numbers of connections.

Work to Do: More Difficult
Estimate directions of influence:
The Cyclic Causal Discovery algorithm (CCD) recovers equivalence classes of directed causal structures in linear feedback systems.
Recent unpublished improvements in speed and accuracy.
Apply it to the 60 scans.
Several edge-by-edge orientation methods are available that take advantage of non-Gaussian components of the signal.

Prospects
A real Causome.
Identification of functional differences among voxels, e.g., in two causally connected “regions of interest,” which voxels are doing the preponderance of the signaling?
Discovery of mechanisms of anomaly, e.g., autism.

Signaling Sub-Regions of the Hippocampus

Autism and the Cerebellum

Issue
Anatomical studies argue for a role for the cerebellum in autism.
fMRI studies based on differential signals (normal/ASD) find no such relation.
Markov blanket methods on fMRI data from the U. of Utah medical school perfectly distinguish normal/ASD by cerebellum connections. (Sample way too small: 10 normal, 10 ASD.)
Most fMRI scans do not obtain good measurements of the cerebellum.
We are currently examining 500+ scans from the ABIDE network for good cerebellum measurements to be used in out-of-sample testing.
