Describing High-Order Statistical Dependence Using “Concurrence Topology,” with Application to Functional MRI Brain Data

Steven P. Ellis, Columbia University, ellisst@nyspi.columbia.edu
Arno Klein, Sage Bionetworks

Abstract: We propose a new nonparametric method, “Concurrence Topology (CT)”, for describing dependence among dichotomous variables. CT starts by translating the data into a “filtration”, i.e., a series of shapes. Holes in the filtration correspond to relatively weak or negative association among the variables. CT uses computational topology to describe the pattern of holes in the filtration. CT is able to describe high-order dependence while avoiding combinatorial explosion. We employed CT to investigate brain functional connectivity based on dichotomized functional MRI data. The data set includes subjects diagnosed with ADHD and healthy controls. In an exploratory analysis, working in both the time and Fourier domains, CT found a number of differences between ADHD subjects and controls in the topology of their filtrations. This poster is based on the paper Ellis and Klein (2014).

CT AND NETWORK ANALYSIS: CT goes beyond network analysis in that Curto-Itskov filtrations often contain simplices that connect more than two variables (often 60 or more variables in our data). A figure like Figure 4 is impossible for a network.

HOLES: Holes in a filtration are like stairwells in a building. Holes come in different dimensions (0, 1, 2, …). Holes in a Curto-Itskov filtration indicate relatively weak or negative association among the variables. Holes of dimension d pertain to order of dependence at least d+2.

Figure 4: Persistence plot for same subject’s fMRI data in dimension 2.

PERSISTENCE: A stairwell might span several floors. Working down from the top floor, the floor where the stairwell begins is the floor of its “birth”. The floor where it ends is the floor of the “death” of the stairwell.
In general, a hole in a filtration might “persist” through several frames. The “lifespan” of the hole is its birth minus its death. The plot of death vs. birth of the holes of a given hole dimension is a “persistence plot”. The lifespan of a hole is the vertical distance from the diagonal death=birth line to the point corresponding to the hole. Figure 2 shows the persistence plot in dimension 1 for the filtration shown in Figure 1.

DATA SET: We worked with publicly available resting-state fMRI data. The data included 25 ADHD subjects and 41 healthy controls. For every subject, the data included “BOLD” values in 92 brain regions at 192 time points.

ANALYSIS STRATEGY: Dependence among regional BOLD series describes “functional connectivity” of the brain. (Holes reflect brain function, not brain anatomy.) We performed CT analysis of the fMRI data for each subject separately. In the “time domain” an observation is dichotomized BOLD in all regions at a single time point. In the “Fourier domain” an observation is dichotomized power in all regions at a single Fourier frequency. We used summary statistics of CT analyses as subject-wise variables in conventional statistical analyses. For each subject separately we performed CT analyses in both time and Fourier domains in dimensions 0 to 5. Our main interest was in trying out CT. Therefore, our analyses are exploratory. Significance at the 0.05 level was used as a flag for identifying potentially interesting findings. We plan to test our findings in larger, independent data sets.

SOME FINDINGS BASED ON PERSISTENCE PLOTS: We observed differences between ADHD and control groups in the Fourier domain, dimensions 1 and 2 (especially dimension 1), and in the time domain, dimensions 4 and 5 (especially dimension 4): 64% of ADHD subjects had holes in the time domain in dimension 4 compared to 93% of controls. Note: Holes in dimension 4 reflect order of dependence at least 6.
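
ILLUSTRATION (not from the poster): The lifespan rule in the PERSISTENCE section can be sketched in a few lines of Python. The (birth, death) pairs below are made-up numbers, and the summary fields (`n_holes`, `max_lifespan`, `has_hole`) are hypothetical examples of subject-wise summary statistics; the poster does not say which summaries the authors used.

```python
from typing import Dict, List, Tuple

# A hole is recorded as (birth, death) in frequency levels.
# Working down from the top frame, birth >= death.
Hole = Tuple[int, int]

def lifespans(holes: List[Hole]) -> List[int]:
    """Lifespan of each hole: birth minus death."""
    return [birth - death for birth, death in holes]

def summarize(holes: List[Hole]) -> Dict[str, object]:
    """Hypothetical subject-wise summary of one persistence plot."""
    spans = lifespans(holes)
    return {
        "n_holes": len(holes),
        "max_lifespan": max(spans, default=0),
        "has_hole": bool(holes),
    }

# Made-up persistence pairs for one subject in one dimension.
example_holes = [(9, 4), (6, 5), (3, 1)]
example_summary = summarize(example_holes)
print(lifespans(example_holes), example_summary)
```

A `has_hole` flag of this kind is the sort of per-subject variable that a comparison such as "64% of ADHD subjects vs. 93% of controls" would be computed from.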
LOCALIZATION: Holes involve all variables, but some variables are more directly involved than others. The variables in “short cycles” are the most directly implicated. Not all holes have short cycles, but most do. Short cycles allow interpretation of holes. In dimension d, a short cycle contains d+2 variables (regions, in this case). The hole corresponding to the dot marked by an asterisk in Figure 3, call it hole a, has a short cycle that appears in 13 subjects. This is nominally statistically significant. The 16 most common short cycles belonging to hole a distinguish ADHD subjects from controls: 76% of the ADHD subjects have at least one of the 16, but only 44% of controls do.

ORDER OF DEPENDENCE: A feature of a joint distribution that can be seen when looking at k variables at a time but that cannot be seen when looking at fewer than k reflects “kth-order dependence” among the variables. E.g., a kth-order interaction in a log-linear model reflects kth-order dependence. “High-order” dependence means order of dependence 3 or larger.

PROBLEM: Describing high-order dependence among dozens or more variables “agnostically” (i.e., treating all variables the same a priori) often leads to a combinatorial explosion.

CONCURRENCE TOPOLOGY (CT): “Concurrence topology” is a new nonparametric method for describing dependence among dichotomous variables that solves this problem. (Other methods exist that also solve the problem, but CT gives a very different view of the dependence structure. Initial inspiration for CT came from Curto and Itskov, 2008.)

FILTRATION: CT translates multivariate dichotomous data into a filtration, which is a series of shapes (“frames”), none larger than the preceding one. Analogy: A filtration is like a building. The frames are like floors in the building. Figure 1 shows a low-dimensional filtration.

Figure 2: Persistence plot in dimension 1 for filtration in Figure 1. Line is death=birth line.
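
ILLUSTRATION (not from the poster): The ORDER OF DEPENDENCE and PROBLEM sections can be made concrete with a standard textbook example, sketched here in Python. Take two independent fair bits x and y and set z = x XOR y. Every pair of the three variables is independent, yet the three together are not, so the dependence is visible only when looking at 3 variables at a time. The second part counts how many k-variable subsets an exhaustive search over 92 regions would have to inspect. The XOR example and all function names are assumptions chosen for illustration, not the authors' analysis.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

# The four equally likely outcomes (x, y, z) with z = x XOR y.
outcomes = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]

def prob(event) -> Fraction:
    """Probability of an event (a predicate on outcomes) under the uniform law."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def is_independent(indices) -> bool:
    """True if the joint law of the chosen variables equals the product of marginals."""
    for values in product((0, 1), repeat=len(indices)):
        joint = prob(lambda o: all(o[i] == v for i, v in zip(indices, values)))
        marginals = Fraction(1)
        for i, v in zip(indices, values):
            marginals *= prob(lambda o, i=i, v=v: o[i] == v)
        if joint != marginals:
            return False
    return True

# Looking at 2 variables at a time shows nothing; looking at 3 does.
pairwise_independent = all(is_independent(p) for p in combinations(range(3), 2))
triple_independent = is_independent((0, 1, 2))

# Number of k-variable subsets among 92 regions, for k = 2..6.
subsets_to_inspect = {k: comb(92, k) for k in range(2, 7)}

print(pairwise_independent, triple_independent)
for k, n in subsets_to_inspect.items():
    print(k, n)
```

The subset counts grow quickly with k, which is the combinatorial explosion that an agnostic, subset-by-subset search for high-order dependence runs into.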
COMPUTATIONAL TOPOLOGY: Specialized computational topology software is needed to identify holes, with their births and deaths, in a Curto-Itskov filtration. CT code, written in R, that does this is available from the authors.

LONG LIFESPANS: The longer the lifespan of a hole in a Curto-Itskov filtration, the more likely it is to be “real”, not just a product of “noise.”

REAL DATA PERSISTENCE PLOTS: Figures 3 and 4 are persistence plots of a research subject’s fMRI data in dimensions 1 and 2. The asterisk in Figure 3 indicates a long-lived hole that proves to be interesting. (See below.)

Figure 1: “Toy” filtration.

CONCURRENCES: Suppose binary variables are coded “0” and “1”. The “concurrence” corresponding to a multivariate binary observation is the list of variables that are “1” in that observation. CT represents concurrences as simplices and uses them to build the “Curto-Itskov filtration” for the data. The frames (indexed by “frequency level”) correspond to the frequencies with which the concurrences appear in the data.

REFERENCES:
C. Curto and V. Itskov (2008) “Cell groups reveal structure of stimulus space,” PLoS Computational Biology, 4.
S.P. Ellis and A. Klein (2014) “Describing high-order statistical dependence using ‘concurrence topology,’ with application to functional MRI brain data,” Homology, Homotopy and Applications, 16, 245-264.

Figure 3: Persistence plot for a subject’s fMRI data in dimension 1. Large dot indicates multiple holes with same birth and death. Asterisk indicates interesting hole discussed below.
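
ILLUSTRATION (not from the poster): The CONCURRENCES construction can be sketched on toy data in Python. The sketch assumes one plausible reading of the frame definition: a set of variables belongs to the frame at frequency level f if all of its variables are “1” together in at least f observations. That reading, the toy data, and every function name are assumptions; this is not the authors' R code, and enumerating every face as done here is feasible only for very small examples.

```python
from collections import Counter
from itertools import combinations

def concurrence(observation, names):
    """Variables coded 1 in a single binary observation."""
    return frozenset(n for n, bit in zip(names, observation) if bit == 1)

def face_counts(data, names):
    """For each nonempty set of variables, count observations where all are 1."""
    counts = Counter()
    for row in data:
        members = sorted(concurrence(row, names))
        for size in range(1, len(members) + 1):
            for face in combinations(members, size):
                counts[frozenset(face)] += 1
    return counts

def frames(data, names):
    """Frame at level f: all faces seen in at least f observations (assumed rule)."""
    counts = face_counts(data, names)
    top = max(counts.values(), default=0)
    return {
        level: {face for face, c in counts.items() if c >= level}
        for level in range(1, top + 1)
    }

# Toy data: 3 binary variables, 4 observations (made up for illustration).
names = ("A", "B", "C")
rows = [(1, 1, 0), (1, 1, 0), (0, 1, 1), (1, 0, 1)]
toy_frames = frames(rows, names)

for level in sorted(toy_frames):
    print(level, sorted("".join(sorted(face)) for face in toy_frames[level]))
```

In this toy example the frame at level 1 contains the edges AB, BC and AC but not the triangle ABC, because no observation has all three variables equal to 1. That hollow triangle is a hole of dimension 1, in line with the poster's remark that holes indicate relatively weak or negative association. Each frame is contained in the one at the level below it, so the frames shrink as the frequency level rises.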