The General Linear Model (GLM): the marriage between linear systems and stats FFA.

Presentation transcript:

fMRI Data Processing Stream: raw scanner data, then preprocessing, then identifying task-related activity.

The General Linear Model (GLM). Jargon/terms to remember: General Linear Model (GLM); HRF (HIRF): hemodynamic (impulse) response function; design matrix; predictors; betas (β); residual error; deconvolution. How can we relate the hemodynamic response for a single event to the hemodynamic response during a blocked stimulation? Will the rise time be the same? Will the duration of the response be the same? We need a model. We will spend a lecture explaining the current model for the linear systems approach, which provides a means to predict the expected BOLD response from the experimental paradigm. We will discuss both the appeal and the constraints of this model. But first we will give some intuition.

Convolution: How do I predict the BOLD response to any stimulus? According to the linear systems approach, we need to know the impulse response function. The predicted response is the convolution of the input with the impulse response function.

Examples of model hemodynamic impulse response functions (HRFs) estimated from human fMRI measurements. Panel labels: no undershoot (appears on two panels); derived empirically in V1, shortest stimulus is 3 s; mathematical model based on the Turner balloon model of BOLD responses; derived empirically in V1, shortest stimulus is 1 s, used deconvolution.

Convolution, or: how do I predict the BOLD response to any stimulus? Given a stimulus, what is the predicted BOLD response? Turn the stimulus into a series of impulses, then sum up the time-shifted impulse responses. In other words, convolve your stimulus with the HRF.

Convolution: convolve the stimulus with the HRF to obtain the predicted BOLD response. We do this numerically in MATLAB with the conv function (Convolution_tutorial.m).
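The lecture's Convolution_tutorial.m is not reproduced in this transcript. As a rough stand-in, here is a minimal Python/NumPy sketch of the same idea: summing one time-shifted copy of the impulse response per stimulus sample gives the same result as a library convolution. The `hrf` values, the `stimulus` vector, and the helper name `predict_by_shifting` are made up for illustration.

```python
import numpy as np

# Toy impulse response (illustrative values only, not a measured HRF).
hrf = np.array([0.0, 0.5, 1.0, 0.6, 0.2, 0.0])

# Stimulus as a series of impulses: 1 where a stimulus occurs, 0 elsewhere.
stimulus = np.array([1, 0, 0, 1, 1, 0, 0, 0, 0, 0], dtype=float)

def predict_by_shifting(stimulus, hrf):
    """Sum one time-shifted, scaled copy of the HRF per stimulus sample."""
    n = len(stimulus)
    predicted = np.zeros(n + len(hrf) - 1)
    for t, amplitude in enumerate(stimulus):
        predicted[t:t + len(hrf)] += amplitude * hrf
    return predicted[:n]  # keep only as many samples as the stimulus has

by_shifting = predict_by_shifting(stimulus, hrf)

# np.convolve computes the same sum; truncate it to the stimulus length.
by_convolution = np.convolve(stimulus, hrf)[:len(stimulus)]

print(np.allclose(by_shifting, by_convolution))
```

NumPy's `np.convolve` plays the role that `conv` plays in MATLAB; both return the full-length convolution, which is why the result is truncated here.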

Awesome fMRI experiment. Hypothesis: there are neurons in the brain that respond more (increase firing) to moving than to still visual stimuli. Experiment: scan subjects as they view 6 alternating blocks of moving vs. stationary stimuli. Prediction: the BOLD response in regions containing motion-sensitive neurons will be stronger during blocks of moving dots than during blocks of still dots. Note that I generated a vector of 1s and 0s indicating the stimulus at each time point (0 = still; 1 = moving). This is called the design matrix. (Figure: stimulus time course alternating between moving dots and still dots over time.)

Based on our linear model, we generate a prediction by convolving the design matrix with an HRF; the result is the predictor g(t). How well does my prediction match the data? (Figure: the predictor g(t) and the time course data from an example voxel.)
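A sketch of how such a predictor could be built, assuming Python/NumPy rather than the lecture's MATLAB. The gamma-shaped `toy_hrf`, the block length of 10 scans, and the choice to start with a still block are all assumptions for illustration; none of them come from the slides.

```python
import numpy as np

def toy_hrf(n_samples=16):
    """Gamma-shaped toy HRF sampled once per scan (an assumed shape)."""
    t = np.arange(n_samples, dtype=float)
    h = t ** 3 * np.exp(-t)   # rises, peaks at t = 3, then decays
    return h / h.sum()        # unit sum, so the predictor stays within [0, 1]

# 6 alternating blocks: still (0), moving (1), ...; 10 scans per block (assumed).
block_len = 10
design = np.tile(np.r_[np.zeros(block_len), np.ones(block_len)], 3)

# Predictor g(t): design vector convolved with the HRF, cut to the scan length.
g = np.convolve(design, toy_hrf())[:len(design)]
```

Because of the convolution, `g` rises and falls a few scans after each block boundary instead of switching instantly the way `design` does.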

I want to write a mathematical notation relating the prediction to the data for 2 reasons: (1) to estimate the model parameters and (2) to test how good the fit is. Because this is a linear model, I predict that my time course y(t) is a scaled version of the predictor g(t), plus an offset term b0: y(t) = b1·g(t) + b0 + e(t). The model has 2 parameters: b1 scales the predictor and b0 shifts it from baseline. It also has an error term e(t), the residual error: what the linear model doesn't explain in the data. We can solve this with a simple linear regression!
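A minimal sketch of that regression in Python/NumPy, on simulated data. The smoothed boxcar standing in for g(t), the "true" parameter values, and the noise level are all invented for the example; only the model form (a scaled predictor plus an offset plus an error term) comes from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in predictor g(t): any regressor works; here a smoothed boxcar.
boxcar = np.tile(np.r_[np.zeros(10), np.ones(10)], 3)
g = np.convolve(boxcar, np.ones(4) / 4)[:len(boxcar)]

# Simulated voxel time course with known (made-up) parameters.
true_b1, true_b0 = 2.0, 0.5
y = true_b1 * g + true_b0 + 0.1 * rng.standard_normal(len(g))

# Simple linear regression: columns are the predictor and a constant.
X = np.column_stack([g, np.ones_like(g)])
(b1, b0), *_ = np.linalg.lstsq(X, y, rcond=None)

residual = y - (b1 * g + b0)   # e(t): what the model leaves unexplained
```

The estimates `b1` and `b0` should land close to the simulated values, and `residual` is the e(t) term of the model.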

(Figure: the design matrix, i.e. the stimulus used to generate the predictor, is convolved (*) with the HRF to give the predictor g(t); an offset term (b0) is also shown.)

Solution (b1): how much the predictor is scaled to match the time course. Here the signal is around zero, so b0 is negligible. In any case, researchers rarely report b0 because they are interested in the effect of the stimulus and not in the baseline brain signal.

The residual error e(t) is the difference between the data and the model's prediction. The error term can be used to estimate how much variance in the data is explained by the model.
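One common way to turn the residual into a "variance explained" figure is the coefficient of determination, R². The slides do not say which formula was used, so the sketch below (Python/NumPy, with a hypothetical helper `variance_explained`) is an assumption.

```python
import numpy as np

def variance_explained(y, y_hat):
    """R^2 = 1 - SS_residual / SS_total, with SS_total taken about the mean."""
    residual = y - y_hat
    ss_res = np.sum(residual ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

y = np.array([1.0, 2.0, 3.0, 4.0])
print(variance_explained(y, y))                      # perfect fit -> 1.0
print(variance_explained(y, np.full(4, y.mean())))   # mean-only model -> 0.0
```

Multiplying the result by 100 gives a percentage of the time course variance explained.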

How does the HRF affect results?

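The General Linear Model: expand the regression model to have more than one predictor: y(t) = b0 + b1·g1(t) + ... + bp·gp(t) + e(t), where y(t) is the time course of the voxel, gi(t) is the i-th factor, bi is the coefficient of the i-th factor, b0 is the shift from baseline, and e(t) is additive noise.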

General Linear Model: Matrix Notation. Y = G·b + e, where Y is the time course column vector (n x 1), n is the number of time samples, G is the matrix of concatenated predictors (n x p), p is the number of predictors (number of experimental conditions), b is the vector of factor coefficients (p x 1), and e is additive noise. Least squares solution: b = (G^T G)^(-1) G^T Y. As an experimenter you generate the design matrix, which then gets convolved with the HRF to generate the predictors gi.
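A small Python/NumPy sketch of the matrix form and its least squares solution. The sizes, the random columns standing in for convolved predictors, and the "true" coefficients are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 120, 3                      # time samples, predictors (made-up sizes)

# G: n x p matrix of predictors; random columns stand in for convolved regressors.
G = rng.standard_normal((n, p))
b_true = np.array([1.5, -0.5, 0.0])
Y = G @ b_true + 0.2 * rng.standard_normal(n)

# Normal-equations form of the least squares solution: b = (G'G)^-1 G'Y.
b_normal = np.linalg.solve(G.T @ G, G.T @ Y)

# Numerically preferable equivalent that avoids forming G'G explicitly.
b_lstsq, *_ = np.linalg.lstsq(G, Y, rcond=None)
```

Both routes give the same coefficients when G has full column rank; `np.linalg.lstsq` is usually the safer choice when predictors are nearly collinear.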

Example GLM with 4 predictors and 2 baselines, one for run 1 and one for run 2. The predictors are scaled by the betas.

Example GLM with 4 predictors and 2 baselines for run 1 and run 2: the GLM explains 70.5% of the time course variance.

Example GLM with 4 predictors and 2 baselines for run 1 and run 2: the GLM explains 46.7% of the time course variance.

Correlated Predictors: Avoid predictors that are correlated with one another. This is why we NEVER include a baseline predictor: the baseline predictor is almost completely correlated (r = -1) with the sum of the other existing predictors. If we included a baseline predictor, the model would have problems assigning variance to stimulus predictors vs. baseline predictors. For example, the model could not distinguish between two possibilities (e.g., Beta1 = 1, Beta2 = 0 vs. Beta1 = 0, Beta2 = -1). (Figure: stimulus predictor and baseline predictor.)
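The ambiguity can be checked numerically. The Python/NumPy sketch below uses unconvolved 0/1 regressors so that the dependence is exact (with convolved predictors it would only be approximate), and it assumes the model also contains a constant offset term b0, which is what absorbs the difference between the two solutions.

```python
import numpy as np

# Unconvolved toy regressors so the dependence is exact (assumption for clarity).
stimulus = np.tile(np.r_[np.zeros(5), np.ones(5)], 3)
baseline = 1.0 - stimulus          # "on" exactly when the stimulus is off
constant = np.ones_like(stimulus)  # offset term b0

X = np.column_stack([stimulus, baseline, constant])
rank = np.linalg.matrix_rank(X)    # 2, not 3: stimulus + baseline == constant

# Two different coefficient sets [Beta1, Beta2, b0] give the same prediction.
fit_a = X @ np.array([1.0, 0.0, 0.0])
fit_b = X @ np.array([0.0, -1.0, 1.0])
same_prediction = np.allclose(fit_a, fit_b)

r = np.corrcoef(stimulus, baseline)[0, 1]   # correlation of the two regressors
```

Since both coefficient sets predict the same time course, no amount of data can tell them apart; dropping the baseline predictor restores a unique solution.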

Deconvolution Can you solve the inverse problem? If I measure the summed response of several impulses, can I recover the hemodynamic response to a single event?

Series of identical single events that are closely spaced in time. Can I recover the hemodynamic response of a single event? Deconvolution: compute the hemodynamic response hi in a time window by solving the GLM at each time point rather than assuming an HRF. (Figure: overlapping single-event responses with samples h1 to h7, summing to the measured samples y1 to y8.)

Not if the events are presented at a fixed interval.

Yes, if the events are displaced (jittered) in time.
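One common way to set up deconvolution as a GLM is a finite-impulse-response (FIR) design, in which each column is the event train delayed by one more sample and each beta is one point hi of the response. The slides do not show their implementation, so the Python/NumPy sketch below is an assumption; `fir_design`, the onset times, and `h_true` are hypothetical.

```python
import numpy as np

def fir_design(onsets, n_samples, window):
    """FIR design matrix: column j is the event train delayed by j samples."""
    events = np.zeros(n_samples)
    events[np.asarray(onsets)] = 1.0
    X = np.zeros((n_samples, window))
    for j in range(window):
        X[j:, j] = events[:n_samples - j]
    return X

n_samples, window = 60, 8
h_true = np.array([0.0, 0.3, 0.8, 1.0, 0.7, 0.3, 0.1, 0.0])  # made-up response

# Jittered onsets (hypothetical): responses overlap, but at varying lags.
jittered = [0, 5, 11, 14, 22, 25, 33, 39, 42, 50]
X_jit = fir_design(jittered, n_samples, window)
y = X_jit @ h_true                       # noiseless summed response
h_est, *_ = np.linalg.lstsq(X_jit, y, rcond=None)
recovered = np.allclose(h_est, h_true)

# Fixed interval of 4 samples: ignoring the first `window` rows (run start),
# columns j and j + 4 are identical, so only 4 of the 8 h_i are identifiable.
fixed = np.arange(0, n_samples, 4)
X_fix = fir_design(fixed, n_samples, window)
rank_fixed = np.linalg.matrix_rank(X_fix[window:])
```

In this noiseless toy case the jittered design recovers all 8 points of the response, while the fixed-interval design in steady state has rank 4, so overlapping parts of the response cannot be separated.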

Estimating βs using the standard GLM

Deconvolution: estimating βs in a time window without assuming a specific HRF

Back to nonlinearities: re-evaluating the GLM when it fails (Birn et al., NeuroImage 2001). The linear model fails for brief and rapid stimuli: (1) responses are nonlinear for durations shorter than 2 s; (2) the model substantially underestimates responses for brief stimuli.

Problem: the GLM relates the stimulus to BOLD, not neural responses to BOLD. (Figure labels: stimulus, neural response, hemodynamics, MRI scanner, Gaussian noise (+), fMRI response, black box.) If nonlinearities have a neural origin, then incorporating a model of neural responses that accounts for these nonlinearities may predict BOLD responses better than the standard GLM.

We sought to test nonlinearities by measuring brain responses to combinations of sustained and transient visual stimuli, and by comparing the predictions of the GLM to those of a new encoding model that takes nonlinear neural responses into account.

Predicted V1 responses from the standard GLM (Stigliani, Jeska & Grill-Spector, bioRxiv 2017)

V1 responses to transient stimuli differ from the predictions of the standard GLM (Stigliani, Jeska & Grill-Spector, bioRxiv 2017)

We developed a 2 temporal-channel encoding model of neural responses in the visual system and tested whether it predicts BOLD responses better than the standard GLM (Stigliani, Jeska & Grill-Spector, bioRxiv 2017)

The 2 temporal-channel model predicts V1 responses to time-varying stimuli better than the standard GLM (Stigliani, Jeska & Grill-Spector, bioRxiv 2017)

The 2 temporal-channel model also explains other data, such as the data from the Birn et al. paper. Supplementary Figure S4: The 2 temporal-channel model explains response nonlinearities for briefly presented stimuli. (a) Figure adapted from Birn et al. (y-axis values are unreported in the original version). Top left: measured V1 responses to brief (250–2000 ms) presentations of a checkerboard stimulus that was contrast-inverted at 8 Hz in all trial durations. Top right: predicted V1 responses based on a standard linear model solved using responses to longer presentations of the checkerboard stimulus. Bottom: same data as above, except that the measured and predicted fMRI responses are superimposed for each trial duration. (b) Simulated V1 responses to the stimuli used by Birn et al., derived with the β weights solved using models fit to V1 data from Experiments 1 and 2 of the present study. Left: predictions of the 2 temporal-channel model for each trial duration. Right: predictions of the standard model for each trial duration. Bottom: same data as above, except that the predictions of the two models are superimposed for each trial duration. The simulations show that the standard model replicates Birn et al.'s linear model and underestimates responses. In contrast, the 2 temporal-channel model better explains the measured responses (a, left) and predicts higher responses than the standard model at each duration (b, bottom).