Basics of Experimental Design for fMRI: Event-Related Designs
Jody Culham, Brain and Mind Institute, Department of Psychology, Western University
http://www.fmri4newbies.com/
Last Update: February 22, 2013
Last Course: Psychology 9223, W2013, Western University
Event-Related Averaging (can be used for block or event-related designs)
Event-Related Averaging
In this example an “event” is the start of a block. In single-trial designs, an event may be the start of a single trial.
First, we compute an event-related average for the blue condition:
  Define a time window before (2 volumes) and after (15 volumes) the event
  Extract the time course for every event (here there are four events in one run)
  Average the time courses across all the events
Second, we compute an event-related average for the gray condition.
Third, we can plot the average ERA for the blue and gray conditions on the same graph.
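The three averaging steps can be sketched in a few lines of Python. This is a minimal illustration, not what BV does internally: the function name, the toy signal, and the choice to skip events whose window runs past the end of the run are all assumptions.

```python
def event_related_average(signal, onsets, pre=2, post=15):
    """Average signal segments time-locked to each event onset.

    signal: one value per volume; onsets: volume indices where events start;
    pre/post: number of volumes kept before/after each onset.
    """
    segments = []
    for onset in onsets:
        start, stop = onset - pre, onset + post + 1
        if start < 0 or stop > len(signal):
            continue  # assumption: skip events with an incomplete window
        segments.append(signal[start:stop])
    if not segments:
        raise ValueError("no event has a complete window")
    n = len(segments)
    # Average across events at each time point of the window
    return [sum(seg[i] for seg in segments) / n for i in range(pre + post + 1)]


# Toy run (made-up data): 100 volumes, four events, signal = 1 for 8 volumes per event
onsets = [10, 30, 50, 70]
signal = [0.0] * 100
for onset in onsets:
    for v in range(onset, onset + 8):
        signal[v] = 1.0

era = event_related_average(signal, onsets, pre=2, post=15)
```

The result has pre + post + 1 = 18 points; index 2 corresponds to x=0 (the event onset).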
Event-Related Averaging in BV
  Define which subjects/runs to include
  Set time window
  Define which conditions to average (usually exclude baseline)
  Determine how you want to define the y-axis values, including zero
  We can tell BV where to put the y=0 baseline. Here it’s the average of the two circled data points at x=0.
But what if the curves don’t have the same starting point?
In the data shown, the curves started at the same level, as we expect they should, because both conditions were always preceded by a resting baseline period.
But what if the data looked like this? …or this?
Epoch-based averaging
In the latter two cases, we could simply shift the curves so they all start from the same (zero) baseline.
FILE-BASED AVERAGING: zero baseline determined across all conditions (for 0 to 0: points in red circles)
EPOCH-BASED AVERAGING: zero baselines are specific to each epoch
File-based vs. Epoch-based Averaging
File-based Averaging
  zero is based on the average starting point of all curves
  time courses may start at different points because of different event histories or noise
  works best when low frequencies have been filtered out of your data
  similar to what your GLM stats are testing
Epoch-based Averaging
  each curve starts at zero (e.g., set EACH curve such that at time=0, %BSC=0)
  can be risky with noisy data
  only use it if you are fairly certain your pre-stim baselines are valid (e.g., you have a long ITI and/or your trial orders are counterbalanced)
  can yield very different conclusions than GLM stats
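A small sketch of the two baseline choices, using made-up numbers. The function names are hypothetical, and the correction is simplified to a subtraction (a real %BSC computation would also divide by a baseline value).

```python
def file_based(curves, baseline_idx):
    """Subtract ONE shared baseline: the mean of the baseline points over all curves."""
    points = [curve[i] for curve in curves.values() for i in baseline_idx]
    shared = sum(points) / len(points)
    return {name: [v - shared for v in curve] for name, curve in curves.items()}


def epoch_based(curves, baseline_idx):
    """Subtract a separate baseline from each curve: its own mean over the baseline points."""
    out = {}
    for name, curve in curves.items():
        own = sum(curve[i] for i in baseline_idx) / len(baseline_idx)
        out[name] = [v - own for v in curve]
    return out


# Made-up averaged curves whose starting points differ
curves = {"blue": [1.0, 3.0, 4.0], "gray": [3.0, 4.0, 5.0]}
fb = file_based(curves, baseline_idx=[0])   # blue: [-1, 1, 2], gray: [1, 2, 3]
eb = epoch_based(curves, baseline_idx=[0])  # blue: [0, 2, 3],  gray: [0, 1, 2]
```

With these numbers the file-based curves end with gray above blue, while the epoch-based curves end with blue above gray, which is how the two methods can lead to different conclusions.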
What if…?
This design has the benefit that each condition epoch is preceded by a baseline, which is nice for making event-related averages.
However, we might decide that this design takes too much time because we are spending over half of the time on the baseline. Perhaps we should use the following paradigm instead…?
This regular triad sequence has some nice features, but it can make ERAs more complicated to understand.
Regular Ordering and ERAs We might have a time course that looks like this
Example of ERA Problems
If you make an ERA the usual way, you might get something that looks like this:
[Figure: File-Based ERA (Pre=2, Post=10, baseline 0 to 0) with curves for Intact, Scrambled, and Fixation]
One common newbie mistake is to make ERAs for all conditions, including the baseline (Fixation). This situation illustrates some of the confusion that causes.
Initially, some people can be confused about how to interpret this ERA because the pre-event activation looks wonky.
Example of ERA Problems
[Figure: File-Based ERAs for Intact, Scrambled, and Fixation with two time windows: (Pre=2, Post=10, baseline 0 to 0) and (Pre=8, Post=18, baseline 0 to 0)]
If you make the ERA over a longer time window, the situation becomes clearer. You have three curves that are merely shifted in time with respect to one another.
Example of ERA Problems
[Figure: File-Based ERA (Pre=2, Post=10, baseline 0 to 0); labels: Intact, Scrambled, Fixation, End of Intact, End of Scrambled, End of Fixation]
Now you should realize that the different pre-epoch baselines result from the fact that each condition has a different preceding condition:
  Intact is always preceded by Fixation
  Scrambled is always preceded by Intact
  Fixation is always preceded by Scrambled
Example of ERA Problems
[Figure: File-Based ERA (Pre=2, Post=10, baseline 0 to 0) with curves for Intact, Scrambled, and Fixation]
Because of the different histories, changes with respect to baseline are hard to interpret. Nevertheless, ERAs can show you how much the conditions differed once the BOLD response stabilized.
This period shows, rightly so, Intact > Scrambled > Fixation.
Example of ERA Problems
[Figure: Epoch-Based ERA (Pre=2, Post=10, baseline -2 to -2)]
Because the pre-epoch baselines are so different (due to differences in preceding conditions), here it would be really stupid to do epoch-based averaging (e.g., with x=-2 as the y=0 baseline).
In fact, it would lead us to conclude (falsely!) that there was more activation for Fixation than for Scrambled.
Example of ERA Problems
In a situation with a regular sequence like this, instead of making an ERA with a short time window and curves for all conditions, you can make one single time window long enough to show the series of conditions (and here you can also pick a sensible y=0 based on x=-2).
[Figure: File-Based average for Intact condition only (Pre=2, Post=23, baseline -2 to -2), spanning the Intact, Scrambled, and Fixation epochs]
Partial confounding
In the case we just considered, the histories for various conditions were completely confounded:
  Intact was always preceded by Fixation
  Scrambled was always preceded by Intact
  Fixation was always preceded by Scrambled
We can also run into problems (less obvious but with the same ERA issues) if the histories of conditions are partially confounded (e.g., quasi-random orders):
  Intact is preceded by Scrambled 3X and by Fixation 3X
  Scrambled is preceded by Intact 4X and Fixation 1X
  Fixation is preceded by Intact 2X, by Scrambled 2X and by nothing 1X
  No condition is ever preceded by itself
ERAs: Take-home messages
ERAs can be a valuable way to look at activation over time
BUT
  you have to carefully select the baseline (file-based vs. epoch-based; which time points to take as baseline)
  you have to know whether the history of your conditions is counterbalanced
    if it’s not counterbalanced, ERAs can be misleading
This problem of “history” will come up again…
Basics of Event-Related Designs
Block Designs
[Figure: Block Design; symbols mark trials of one type (e.g., face image) and trials of another type (e.g., place image)]
Early Assumption: Because the hemodynamic response delays and blurs the response to activation, the temporal resolution of fMRI is limited.
[Figure: BOLD Response (% signal change) over Time following a Stimulus; labelled features: Initial Dip, Positive BOLD response, Overshoot, Post-stimulus Undershoot]
WRONG!!!!!
What are the temporal limits?
What is the briefest stimulus that fMRI can detect?
  Blamire et al. (1992): 2 sec
  Bandettini (1993): 0.5 sec
  Savoy et al. (1995): 34 msec
[Figures: 2 s stimuli; single events. Data: Blamire et al., 1992, PNAS (figure: Huettel, Song & McCarthy, 2004); data: Robert Savoy & Kathy O’Craven (figure: Rosen et al., 1998, PNAS)]
Although the shape of the HRF is delayed and blurred, it is predictable.
Event-related potentials (ERPs) are based on averaging small responses over many trials. Can we do the same thing with fMRI?
Predictor Height Depends on Stimulus Duration
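To illustrate, the sketch below convolves boxcars of different durations with a two-gamma HRF and reports the peak of each predictor. The HRF parameters (peak gamma of shape 6, undershoot gamma of shape 16 scaled by 1/6), the 0.1 s sampling step, and the function names are assumptions chosen for illustration, not values taken from the slides.

```python
import math


def hrf(t):
    """Two-gamma HRF (assumed parameters: peak near 5 s, undershoot near 15 s)."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0


def predictor_peak(duration, dt=0.1, length=40.0):
    """Peak height of a boxcar of the given duration (s) convolved with the HRF."""
    n = int(round(length / dt))
    kernel = [hrf(i * dt) for i in range(n)]
    n_on = int(round(duration / dt))
    peak = 0.0
    for i in range(n):
        # Discrete convolution: sum the kernel over the samples where the boxcar is on
        value = sum(kernel[i - j] for j in range(min(n_on, i + 1))) * dt
        peak = max(peak, value)
    return peak


peaks = {d: predictor_peak(d) for d in (0.5, 2.0, 8.0, 16.0)}
for duration, height in peaks.items():
    print(f"{duration:5.1f} s stimulus -> predictor peak {height:.3f}")
```

Under these assumptions the predictor peak grows with stimulus duration, with smaller gains once the stimulus outlasts the positive lobe of the HRF.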
Design Types
[Figure: Block Design, Slow ER Design, Rapid Jittered ER Design, and Mixed Design; symbols mark trials of one type (e.g., face image) and trials of another type (e.g., place image)]
Detection vs. Estimation
detection: determination of whether activity of a given voxel (or region) changes in response to the experimental manipulation
estimation: measurement of the time course within an active voxel in response to the experimental manipulation
[Figure: % Signal Change vs. Time (sec)]
Definitions modified from: Huettel, Song & McCarthy, 2004, Functional Magnetic Resonance Imaging
Block Designs: Poor Estimation Huettel, Song & McCarthy, 2004, Functional Magnetic Resonance Imaging
Pros & Cons of Block Designs
Pros
  high detection power
  has been the most widely used approach for fMRI studies
  accurate estimation of hemodynamic response function is not as critical as with event-related designs
Cons
  poor estimation power
  subjects get into a mental set for a block
  very predictable for subject
  can’t look at effects of single events (e.g., correct vs. incorrect trials, remembered vs. forgotten items)
  becomes unmanageable with too many conditions (e.g., more than 4 conditions + baseline)
Slow Event-Related Designs
[Figure: Slow ER Design]
Convolution of Single Trials Neuronal Activity BOLD Signal Haemodynamic Function Time Time Slide from Matt Brown
BOLD Summates Neuronal Activity BOLD Signal Slide adapted from Matt Brown
Slow Event-Related Design: Constant ITI
Bandettini et al. (2000): What is the optimal trial spacing (duration + intertrial interval, ITI) for a Spaced Mixed Trial design with constant stimulus duration?
[Figure: 2 s stim, vary ISI; Block vs. Event-related average. Source: Bandettini et al., 2000]
Optimal Constant ITI
Source: Bandettini et al., 2000
Brief (< 2 sec) stimuli: optimal trial spacing = 12 sec
For longer stimuli: optimal trial spacing = 8 + 2*stimulus duration
Effective loss in power of event-related design: -35%
  i.e., for 6 minutes of block design, run ~9 min ER design
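The rule of thumb and the run-length arithmetic can be written out as below. The function names are hypothetical, and the assumption that a 35% loss in power is offset by scaling run time by 1 / (1 - 0.35) is a simplification used here only to reproduce the "~9 min" figure.

```python
def optimal_trial_spacing(stim_duration_s):
    """Rule of thumb from the slide: 12 s for brief stimuli, else 8 + 2 * duration."""
    if stim_duration_s < 2:
        return 12.0
    return 8.0 + 2.0 * stim_duration_s


def matched_er_minutes(block_minutes, power_loss=0.35):
    """ER run length that offsets a fractional power loss (simple scaling assumption)."""
    return block_minutes / (1.0 - power_loss)


print(optimal_trial_spacing(1.0))           # brief stimulus
print(optimal_trial_spacing(4.0))           # 8 + 2 * 4
print(round(matched_er_minutes(6.0), 1))    # roughly 9 minutes
```

Note that the two branches agree at a 2 s stimulus (8 + 2*2 = 12 s).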
Trial to Trial Variability Huettel, Song & McCarthy, 2004, Functional Magnetic Resonance Imaging
How Many Trials Do You Need?
Standard error of the mean decreases with the square root of the number of trials
Number of trials needed will vary with effect size
Function begins to asymptote around 15 trials
Source: Huettel, Song & McCarthy, 2004, Functional Magnetic Resonance Imaging
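A quick numeric check of the square-root relationship, assuming independent trials with a fixed trial-to-trial standard deviation (the value 1.0 below is arbitrary):

```python
import math


def sem(trial_sd, n_trials):
    """Standard error of the mean for n independent trials."""
    return trial_sd / math.sqrt(n_trials)


for n in (1, 4, 16, 64):
    print(n, sem(1.0, n))
```

Each fourfold increase in trials only halves the standard error, which is why the benefit of adding trials flattens out.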
Effect of Adding Trials Huettel, Song & McCarthy, 2004, Functional Magnetic Resonance Imaging
Pros & Cons of Slow ER Designs
Pros
  excellent estimation
  useful for studies with delay periods
  very useful for designs with motion artifacts (grasping, swallowing, speech) because you can tease out artifacts
  analysis is straightforward
Example: Delayed Hand Actions (Singhal et al., under revision)
[Figure: time courses with Visual Response, Delay, and Action Execution phases for Grasp Go (G), Reach Go (R), Grasp Stop (GS), and Reach Stop (RS) trials; action-related artifact marked; really long delay: 18 s]
Cons
  poor detection power because you get very few trials per condition by spending most of your sampling power on estimating the baseline
  subjects can get VERY bored and sleepy with long inter-trial intervals
[Figure: Effect of this design on our subject]
“Do You Wanna Go Faster?”
[Figure: Rapid Jittered ER Design]
Yes, but we have to test assumptions regarding linearity of BOLD signal first
Linearity of BOLD response
“Do things add up?”
[Figure: red = 2 - 1, green = 3 - 2; sync each trial response to start of trial. Source: Dale & Buckner, 1997]
Not quite linear but good enough!
Linearity is okay for events every ~4+ s
Why isn’t BOLD totally linear?
In part because neurons aren’t totally linear either
  “Phasic” (or “transient”) neural responses
  Adaptation or habituation… stay tuned
May depend on factors like stimulus duration and stimulus intensity
[Figure: Spikes/ms vs. Time (ms). Ganmor et al., 2010, Neuron]
Optimal Rapid ITI
Rapid Mixed Trial Designs
Short ITIs (~2 sec) are best for detection power
Do you know why?
Source: Dale & Buckner, 1997
Efficiency (Power)
Two Approaches
Detection: find the blobs
  Business as usual
  Model predicted activation using square-wave predictor functions convolved with assumed HRF
  Extract beta weights for each condition; contrast betas
  Drawback: Because trials are packed so closely together, any misestimates of the HRF will lead to imperfect GLM predictors and betas
Estimation: find the time course
  make a model that can estimate the volume-by-volume time courses through a deconvolution of the signal
BOLD Overlap With Regular Trial Spacing Neuronal activity from TWO event types with constant ITI Partial tetanus BOLD activity from two event types Slide from Matt Brown
BOLD Overlap with Jittering Neuronal activity from closely-spaced, jittered events BOLD activity from closely-spaced, jittered events Slide from Matt Brown
Fast fMRI Detection A) BOLD Signal B) Individual Haemodynamic Components C) 2 Predictor Curves for use with GLM (summation of B) Slide from Matt Brown
Why jitter?
Yields larger fluctuations in signal
Without jittering, predictors from different trial types are strongly anticorrelated
  When pink is on, yellow is off: pink and yellow are anticorrelated
With jittering, the design includes cases when both pink and yellow are off: less anticorrelation
As we know, the GLM doesn’t do so well when predictors are correlated (or anticorrelated)
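A toy check of the anticorrelation argument. The two trial sequences below are made up, and the predictors are left as simple on/off indicators per volume (not convolved with an HRF) to keep the arithmetic transparent.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def indicators(sequence):
    """On/off predictors for pink (P) and yellow (Y); '-' means neither is on."""
    pink = [1.0 if s == "P" else 0.0 for s in sequence]
    yellow = [1.0 if s == "Y" else 0.0 for s in sequence]
    return pink, yellow


r_regular = pearson(*indicators("PYPYPYPYPYPY"))    # no gaps: one is always on
r_jittered = pearson(*indicators("P-YPY-Y-P-YP"))   # a third of volumes have both off

print(round(r_regular, 2), round(r_jittered, 2))
```

In this toy case the correlation moves from -1 (regular alternation) to -0.5 once a third of the volumes have neither trial type on.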
GLM: Tutorial data Just as in the GLM for a block design, we have one predictor for each condition other than the baseline
GLM: Output Faces > Baseline
How to Jitter: Vary Intertrial Interval (ITI)
[Figure legend: symbols mark trials of one type (e.g., face image) and trials of another type (e.g., place image)]
Stimulus Onset Asynchrony (SOA) = ITI + Trial Duration (TD)
  TD = 2 s, ITI = 0 s: SOA = 2 s
  TD = 2 s, ITI = 4 s: SOA = 6 s
May want to make TD (e.g., 2 s) and ITI durations (e.g., 0, 2, 4, 6 s) an integer multiple of TR (e.g., 2 s) for ease of creating protocol files
[Figure: Frequency of ITIs in Each Condition vs. ITI (s), for a Flat Distribution and an Exponential Distribution]
Another way to think about it… Include “Null” Trials
  null trial = nothing happens
  Can randomize or counterbalance distribution of three trial types
  Outcome may be similar to varying ISI
Assumption of HRF is More Problematic for Event-Related Designs
We know that the standard two-gamma HRF is a mediocre approximation for individual Ss’ HRFs (Handwerker et al., 2004, Neuroimage)
We know this isn’t such a big deal for block designs but it is a bigger issue for rapid event-related designs.
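A minimal sketch of drawing jittered onsets with SOA = TD + ITI. The function name, the seed, and the specific weights used to approximate an exponential-like distribution (each longer ITI half as frequent as the previous one) are assumptions for illustration.

```python
import random


def make_onsets(n_trials, itis, weights, trial_duration=2.0, seed=0):
    """Return trial onset times (s), drawing one ITI per trial from a weighted list."""
    rng = random.Random(seed)
    onsets, t = [], 0.0
    for _ in range(n_trials):
        onsets.append(t)
        iti = rng.choices(itis, weights=weights, k=1)[0]
        t += trial_duration + iti  # SOA = trial duration + ITI
    return onsets


itis = [0.0, 2.0, 4.0, 6.0]  # multiples of an assumed TR of 2 s
flat = make_onsets(20, itis, weights=[1, 1, 1, 1])
expo = make_onsets(20, itis, weights=[8, 4, 2, 1])  # short ITIs most frequent

flat_soas = [b - a for a, b in zip(flat, flat[1:])]
expo_soas = [b - a for a, b in zip(expo, expo[1:])]
```

Because TD and every ITI are multiples of the TR, every onset lands on a volume boundary, which keeps protocol files simple.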
One Approach to Estimation: Counterbalanced Trial Orders
Each condition must have the same history for preceding trials so that trial history subtracts out in comparisons
For example, if you have a sequence of Face, Place and Object trials (e.g., FPFOPPOF…), with 30 trials for each condition, you could make sure that the breakdown of trials (yellow) with respect to the preceding trial (blue) was as follows:
  Preceding trial   Current trial   Count
  …Face             Face            x 10
  …Place            Face            x 10
  …Object           Face            x 10
  …Face             Place           x 10
  …Place            Place           x 10
  …Object           Place           x 10
  …Face             Object          x 10
  …Place            Object          x 10
  …Object           Object          x 10
Most counterbalancing algorithms do not control for trial history beyond the preceding one or two items
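One way to check first-order counterbalancing is to count how often each trial type follows each other type. The sketch below uses hypothetical function names and short example sequences (F = Face, P = Place, O = Object); it only looks at the immediately preceding trial and ignores the first trial, which has no predecessor.

```python
from collections import Counter


def transition_counts(sequence):
    """Count (preceding, current) pairs over consecutive trials."""
    return Counter(zip(sequence, sequence[1:]))


def is_first_order_balanced(sequence):
    """True if every (preceding, current) pair occurs equally often."""
    counts = transition_counts(sequence)
    conditions = sorted(set(sequence))
    values = [counts[(a, b)] for a in conditions for b in conditions]
    return len(set(values)) == 1


balanced = "FFPFOPPOOF"   # each of the 9 pairs occurs exactly once
regular = "FPOFPOFPO"     # regular triad: histories completely confounded

print(is_first_order_balanced(balanced))
print(is_first_order_balanced(regular))
print(transition_counts(regular))
```

The regular triad fails the check because, for example, Place is only ever preceded by Face.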
Algorithms for Picking Efficient Designs Optseq2
Algorithms for Picking Efficient Designs Genetic Algorithms
You Can’t Always Counterbalance
You may be interested in variables for which you cannot control trial sequence
  e.g., subject errors can mess up your counterbalancing
  e.g., memory experiments: remembered vs. forgotten items
  e.g., decision-making: choice 1 vs. choice 2
  e.g., correlations with behavioral ratings
Post Hoc Trial Sorting Example Wagner et al., 1998, Science
Pros & Cons of Applying Standard GLM to Rapid-ER Designs
Pros
  high detection power
  trials can be put in unpredictable order
  subjects don’t get so bored
Cons and Caveats
  reduced detection compared to block designs
  requires stronger assumptions about linearity
    BOLD is non-linear with inter-event intervals < 6 sec
    nonlinearity becomes severe under 2 sec
  errors in HRF model can introduce errors in activation estimates
Design Types: Mixed Design
[Figure: Mixed Design; symbols mark trials of one type (e.g., face image) and trials of another type (e.g., place image)]
Example of Mixed Design
Otten, Henson, & Rugg, 2002, Nature Neuroscience
  used short task blocks in which subjects encoded words into memory
  In some areas, mean level of activity for a block predicted retrieval success
Pros and Cons of Mixed Designs
Pros
  allow researchers to distinguish between state-related and item-related activation
Cons
  sensitive to errors in HRF modelling
Deconvolution of Event-Related Designs Using the GLM
DEconvolution of Single Trials Neuronal Activity BOLD Signal Haemodynamic Function Time Time Slide from Matt Brown
Deconvolution Example time course from 4 trials of two types (pink, blue) in a “jittered” design
Summed Activation
Single Stick Predictor
(stick predictors are also called finite impulse response (FIR) functions)
single predictor for first volume of pink trial type
Predictors for Pink Trial Type
set of 12 predictors for subsequent volumes of pink trial type
need enough predictors to cover unfolding of HRF (depends on TR)
Predictor Matrix
Diagonal filled with 1’s
Predictors for Pink Trial Type
Predictors for the Blue Trial Type
Predictor x Beta Weights for Pink Trial Type sequence of beta weights for one trial type yields an estimate of the average activation (including HRF)
Predictor x Beta Weights for Blue Trial Type height of beta weights indicates amplitude of response (higher betas = larger response)
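The stick-predictor idea can be sketched with NumPy as below. This is an illustration under stated assumptions, not the BV implementation: the onsets, the "true" responses, and the use of 6 lags instead of 12 are made up, and the simulated signal is noise-free so that least squares recovers the responses exactly.

```python
import numpy as np


def fir_design(onsets_by_condition, n_volumes, n_lags):
    """One stick predictor per condition per post-onset volume (lag)."""
    names = sorted(onsets_by_condition)
    X = np.zeros((n_volumes, len(names) * n_lags))
    for c, name in enumerate(names):
        for onset in onsets_by_condition[name]:
            for lag in range(n_lags):
                if onset + lag < n_volumes:
                    X[onset + lag, c * n_lags + lag] = 1.0
    return X, names


# Hypothetical jittered design (onsets in volumes) and made-up true responses
onsets = {"blue": [4, 13, 27, 38], "pink": [0, 9, 20, 31]}
n_lags = 6
true = {"blue": np.array([0.0, 0.5, 1.0, 0.5, 0.2, 0.0]),
        "pink": np.array([0.0, 1.0, 2.0, 1.0, 0.4, 0.0])}

X, names = fir_design(onsets, n_volumes=50, n_lags=n_lags)
signal = X @ np.concatenate([true[n] for n in names])  # summed, overlapping activation
betas, *_ = np.linalg.lstsq(X, signal, rcond=None)     # one beta per stick predictor
estimated = {n: betas[i * n_lags:(i + 1) * n_lags] for i, n in enumerate(names)}
```

Even though several pink and blue responses overlap in the summed signal, the sequence of betas for each trial type reproduces that type's response, with the larger betas for pink reflecting its larger amplitude.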
Linear Deconvolution
Miezin et al., 2000
Jittering ITI also preserves linear independence among the hemodynamic components comprising the BOLD signal.
Decon GLM
14 predictors (time points) each for Cues, Face trials, House trials, and Object trials
To find areas that respond to all stims, we could fill the contrast column with +’s
…but that would be kind of dumb because we don’t expect all time points to be highly positive, just the ones at the peak of the HRF
Contrasts on Peak Volumes
We can search for areas that show activation at the peak (e.g., 3-5 s after stimulus onset)
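A peak contrast just puts +1 on the predictors that fall in the peak window and 0 elsewhere. In the sketch below the function names, the beta values, and the assumption of one predictor per second (so lags 3, 4, 5 cover 3-5 s) are all hypothetical.

```python
def peak_contrast(n_lags, peak_lags):
    """Contrast weights: +1 for predictors in the peak window, 0 elsewhere."""
    return [1 if lag in peak_lags else 0 for lag in range(n_lags)]


def contrast_value(betas, weights):
    """Weighted sum of betas under the contrast."""
    return sum(b * w for b, w in zip(betas, weights))


# Made-up deconvolution betas for one trial type, 14 time points
betas = [0.0, 0.1, 0.5, 1.0, 1.2, 0.9, 0.5, 0.2, 0.0, -0.1, -0.1, 0.0, 0.0, 0.0]
weights = peak_contrast(n_lags=14, peak_lags={3, 4, 5})
peak_sum = contrast_value(betas, weights)
```

Only the three peak betas contribute to the contrast; pre-peak and undershoot time points are ignored.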
Results: Peaks > Baseline
Graph beta weights for spike predictors
Get deconvolution time course
Why go to all this bother? Why not just generate an event-related average? …
Pros and Cons of Deconvolution
Pros
  Produces time course that dissociates activation from trial history
  Does not assume specific shape for hemodynamic function
  Robust against trial history biases (though not immune to them)
  Compound trial types possible (e.g., stimulus-delay-response)
    may wish to include “half-trials” (stimulus without response)
Cons
  Complicated
  Quite sensitive to noise
  Contrasts don’t take HRF fully into account, they just examine peaks
Not Mutually Exclusive
Convolution and deconvolution GLMs are not mutually exclusive
Example: use convolution GLM to detect blobs, use deconvolution to estimate time courses
Take-home message
Block designs: Great detection, poor estimation
Slow ER designs: Poor detection, great estimation
Fast ER designs: Good detection, very good estimation; excellent choice for designs where predictability is undesirable or where you want to factor in subject’s behavior