Group analyses of fMRI data Methods & models for fMRI data analysis 28 April 2009 Klaas Enno Stephan Laboratory for Social and Neural Systems Research Institute for Empirical Research in Economics University of Zurich Functional Imaging Laboratory (FIL) Wellcome Trust Centre for Neuroimaging University College London With many thanks for slides & images to: FIL Methods group, particularly Will Penny
Overview of SPM [Figure: SPM processing pipeline. Labels: image time-series, realignment, smoothing (kernel), normalisation (template), general linear model (design matrix), parameter estimates, statistical parametric map (SPM), statistical inference (Gaussian field theory, p < 0.05).]
Why hierarchical models? fMRI, single subject; fMRI, multi-subject; ERP/ERF, multi-subject; EEG/MEG, single subject. Hierarchical models for all imaging data! [Figure axis: time]
Reminder: voxel-wise time series analysis! [Figure: BOLD signal over time for a single-voxel time series.] Steps: model specification, parameter estimation, hypothesis, statistic, SPM.
The model: voxel-wise GLM. $y = X\beta + e$, where $y$ is the $N \times 1$ data vector, $X$ the $N \times p$ design matrix, $\beta$ the $p \times 1$ parameter vector and $e$ the $N \times 1$ error vector (N: number of scans; p: number of regressors). The model is specified by (1) the design matrix X and (2) assumptions about e. The design matrix embodies all available knowledge about experimentally controlled factors and potential confounds.
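The model on this slide can be sketched for a single voxel as follows. This is a minimal illustration with ordinary least squares; the function name `fit_glm` and the toy boxcar design are assumptions made for the example, not part of SPM.

```python
import numpy as np

def fit_glm(y, X):
    """Ordinary least-squares fit of y = X @ beta + e for one voxel."""
    beta = np.linalg.pinv(X) @ y                   # parameter estimates
    resid = y - X @ beta                           # residuals
    dof = X.shape[0] - np.linalg.matrix_rank(X)    # N minus rank of design
    sigma2 = resid @ resid / dof                   # error variance estimate
    return beta, sigma2

# Toy design (hypothetical): N = 8 scans, p = 2 regressors (boxcar + constant)
boxcar = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
X = np.column_stack([boxcar, np.ones(8)])
y = X @ np.array([2.0, 10.0])                      # noise-free data for clarity
beta, sigma2 = fit_glm(y, X)
```

With noise-free data the fit returns the generating parameters; with real data the residual variance feeds the statistics discussed on the following slides.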
GLM assumes Gaussian “spherical” (i.i.d.) errors. Sphericity = i.i.d.: the error covariance is a scalar multiple of the identity matrix, $\mathrm{Cov}(e) = \sigma^2 I$. Examples of non-sphericity: non-identity (unequal error variances) and non-independence (correlated errors).
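To make the two kinds of non-sphericity concrete, here is a small sketch that builds one covariance matrix of each type and checks it against the definition above. The helper `is_spherical` and the specific numbers are illustrative assumptions.

```python
import numpy as np

def is_spherical(C, tol=1e-10):
    """True if C is a scalar multiple of the identity matrix."""
    C = np.asarray(C, dtype=float)
    scale = np.trace(C) / C.shape[0]               # candidate sigma^2
    return bool(np.allclose(C, scale * np.eye(C.shape[0]), atol=tol))

n = 4
iid = 2.0 * np.eye(n)                              # spherical: sigma^2 = 2

# Non-identity: independent errors with unequal variances
non_identical = np.diag([1.0, 1.0, 4.0, 4.0])

# Non-independence: correlation decaying with lag (AR(1)-like pattern)
rho = 0.5
idx = np.arange(n)
non_independent = rho ** np.abs(idx[:, None] - idx[None, :])
```

Only the first matrix passes the check; the other two violate sphericity in the two ways named on the slide.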
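Multiple covariance components at 1st level. Enhanced noise model: $\mathrm{Cov}(e) \propto V$, with error covariance components $Q_i$ and hyperparameters $\lambda_i$: $V = \sum_i \lambda_i Q_i$ (in the figure: $V = \lambda_1 Q_1 + \lambda_2 Q_2$). Estimation of hyperparameters with ReML (restricted maximum likelihood).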
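A minimal sketch of hyperparameter estimation, assuming a textbook Fisher-scoring scheme for the restricted likelihood. This is not SPM's implementation; the function `reml`, the starting values and the toy two-component model (two blocks of scans with different noise variance) are assumptions for illustration.

```python
import numpy as np

def reml(y, X, Qs, n_iter=32, tol=1e-10):
    """Fisher-scoring ReML for Cov(e) = sum_i lam[i] * Qs[i]."""
    lam = np.ones(len(Qs))                         # starting values (assumption)
    for _ in range(n_iter):
        C = sum(l * Q for l, Q in zip(lam, Qs))
        iC = np.linalg.inv(C)
        iCX = iC @ X
        # P removes the fixed effects in the metric defined by C
        P = iC - iCX @ np.linalg.solve(X.T @ iCX, iCX.T)
        Py = P @ y
        # Gradient and expected information of the restricted log-likelihood
        grad = np.array([-0.5 * np.trace(P @ Q) + 0.5 * Py @ Q @ Py for Q in Qs])
        info = np.array([[0.5 * np.trace(P @ Qi @ P @ Qj) for Qj in Qs]
                         for Qi in Qs])
        step = np.linalg.solve(info, grad)
        lam = lam + step
        if np.max(np.abs(step)) < tol:
            break
    return lam

# Toy data (hypothetical): two blocks of 20 scans, each with its own mean
rng = np.random.default_rng(1)
n1, n2 = 20, 20
X = np.zeros((n1 + n2, 2))
X[:n1, 0] = 1.0
X[n1:, 1] = 1.0
Q1 = np.diag(np.r_[np.ones(n1), np.zeros(n2)])     # variance of block 1
Q2 = np.diag(np.r_[np.zeros(n1), np.ones(n2)])     # variance of block 2
noise = np.r_[rng.standard_normal(n1), 2.0 * rng.standard_normal(n2)]
y = X @ np.array([1.0, 2.0]) + noise
lam = reml(y, X, [Q1, Q2])
```

In this special case the ReML estimates coincide with the unbiased sample variance of each block, which gives a simple sanity check. The scheme does not constrain the hyperparameters to stay positive, so it is only suitable for well-behaved examples.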
t-statistic based on ML estimates. c: contrast vector; covariance hyperparameters: ReML estimates. [Equations not recovered from the slide.]
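Since the equations of this slide are not available, the following sketch uses the standard textbook form of a t-statistic under non-spherical errors, which may differ in detail from the slide: the data and design are whitened with $W = V^{-1/2}$, and the contrast is tested on the whitened model. The function `whitened_t` is a hypothetical helper, and V is treated as known (in practice it would be the ReML estimate).

```python
import numpy as np

def whitened_t(y, X, c, V):
    """t-statistic for contrast c when Cov(e) is proportional to V."""
    # W = V^(-1/2) via the eigendecomposition of the symmetric matrix V
    evals, evecs = np.linalg.eigh(V)
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Wy, WX = W @ y, W @ X
    pinv_WX = np.linalg.pinv(WX)
    beta = pinv_WX @ Wy                            # ML parameter estimates
    resid = Wy - WX @ beta
    dof = len(y) - np.linalg.matrix_rank(WX)
    sigma2 = resid @ resid / dof                   # residual variance
    se = np.sqrt(sigma2 * (c @ pinv_WX @ pinv_WX.T @ c))
    return (c @ beta) / se, dof

# Toy example: with V = I this reduces to an ordinary one-sample t-test
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.ones((5, 1))
c = np.array([1.0])
t_iid, dof = whitened_t(y, X, c, np.eye(5))
t_scaled, _ = whitened_t(y, X, c, 2.0 * np.eye(5))
```

Rescaling V by a constant leaves the statistic unchanged, which is why only the form of V (up to a scalar) needs to be specified.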
Group level inference: fixed effects (FFX). Assumes that parameters are “fixed properties of the population”. All variability is only intra-subject variability, e.g. due to measurement errors. Laird & Ware (1982): the probability distribution of the data has the same form for each individual and the same parameters. In SPM: simply concatenate the data and the design matrices. Lots of power (proportional to number of scans), but results are only valid for the group studied and cannot be generalized to the population.
Group level inference: random effects (RFX). Assumes that model parameters are probabilistically distributed in the population. Variance is due to inter-subject variability. Laird & Ware (1982): the probability distribution of the data has the same form for each individual, but the parameters vary across individuals. In SPM: hierarchical model. Much less power (proportional to number of subjects), but results can be generalized to the population.
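The difference between the two kinds of inference can be illustrated with a deliberately simplified sketch. It assumes each subject contributes one effect estimate with a known within-subject variance; the FFX statistic uses only that within-subject variance, whereas the RFX statistic uses the spread of the estimates across subjects. The function `ffx_rfx` and all numbers are hypothetical.

```python
import numpy as np

def ffx_rfx(effects, within_var):
    """Group statistics for the mean effect under FFX and RFX assumptions."""
    effects = np.asarray(effects, dtype=float)
    within_var = np.asarray(within_var, dtype=float)
    n = effects.size
    mean = effects.mean()
    # FFX: only measurement (within-subject) variance of the averaged estimate
    se_ffx = np.sqrt(within_var.sum()) / n
    # RFX: variability of the estimates across subjects
    se_rfx = effects.std(ddof=1) / np.sqrt(n)
    return mean / se_ffx, mean / se_rfx

# Hypothetical data: six subjects with precise but heterogeneous effects
effects = [1.0, 3.0, 0.5, 2.5, -0.5, 2.0]
within_var = [0.04] * 6
stat_ffx, stat_rfx = ffx_rfx(effects, within_var)
```

When subjects differ from each other much more than their individual measurement error, the FFX statistic is far larger than the RFX statistic, which is the power difference described on these two slides.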
Recommended reading Linear hierarchical models Mixed effect models
Linear hierarchical model. Hierarchical model with multiple variance components at each level. At each level, the distribution of parameters is given by the level above. What we don’t know: distribution of parameters and variance parameters (hyperparameters).
Example: Two-level model. First level: $y = X^{(1)}\beta^{(1)} + e^{(1)}$. Second level: $\beta^{(1)} = X^{(2)}\beta^{(2)} + e^{(2)}$.
Two-level model (Friston et al. 2002, NeuroImage). [Equation not recovered; term labels: fixed effects, random effects.]
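As a worked step, a two-level model can be collapsed into a single equation by substitution. The notation below ($\beta^{(i)}$ for parameters, $e^{(i)}$ for errors and $C^{(i)}$ for the error covariance at level $i$) is an assumption chosen for this sketch and may differ from the notation in Friston et al. 2002.

```latex
% Two-level model
y           = X^{(1)} \beta^{(1)} + e^{(1)}, \qquad
\beta^{(1)} = X^{(2)} \beta^{(2)} + e^{(2)}

% Substituting the second level into the first
y = X^{(1)} X^{(2)} \beta^{(2)} + X^{(1)} e^{(2)} + e^{(1)}

% Resulting covariance of the data, with Cov(e^{(i)}) = C^{(i)}
\mathrm{Cov}(y) = X^{(1)} C^{(2)} X^{(1)T} + C^{(1)}
```

The collapsed form shows that second-level errors are projected into the data through the first-level design, so the data covariance mixes variance components from both levels.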
Mixed effects analysis (Friston et al. 2005, NeuroImage). Non-hierarchical model. Variance components at 2nd level. Estimating 2nd level effects. Between-level non-sphericity. Within-level non-sphericity at both levels: multiple covariance components.
Estimation: EM algorithm (Friston et al. 2002, NeuroImage). E-step and M-step, assuming the model at voxel j [equations not recovered from the slide]. GN (Gauss-Newton) gradient ascent.
Algorithmic equivalence: EM = PEB = ReML. Hierarchical model: Parametric Empirical Bayes (PEB). Single-level model: Restricted Maximum Likelihood (ReML).
Mixed effects analysis (Friston et al. 2005, NeuroImage). Summary statistics approach versus EM approach. EM approach: Step 1, Step 2. Labels: non-hierarchical model, 1st level non-sphericity, 2nd level non-sphericity, pooling over voxels.
Practical problems. Most 2-level models are just too big to compute. And even if they can be computed, it takes a long time! Moreover, sometimes we are only interested in one specific effect and do not want to model all the data. Is there a fast approximation?
Summary statistics approach. [Figure: First level: data, design matrix, contrast images. Second level: one-sample t-test at 2nd level, SPM(t).]
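The two stages of the summary statistics approach can be sketched for one voxel as follows: fit each subject's GLM, keep one contrast value per subject, then run a one-sample t-test across subjects. The function names, the simulated data and the design are assumptions for illustration only.

```python
import numpy as np

def first_level_contrast(y, X, c):
    """Fit one subject's GLM and return the contrast of parameter estimates."""
    beta = np.linalg.pinv(X) @ y
    return c @ beta

def one_sample_t(values):
    """Second level: one-sample t-test on the per-subject contrast values."""
    values = np.asarray(values, dtype=float)
    n = values.size
    t = values.mean() / (values.std(ddof=1) / np.sqrt(n))
    return t, n - 1

# Simulated study (hypothetical): 12 subjects, 40 scans each, same design
rng = np.random.default_rng(0)
n_subjects, n_scans = 12, 40
boxcar = np.tile([0.0] * 5 + [1.0] * 5, n_scans // 10)
X = np.column_stack([boxcar, np.ones(n_scans)])
c = np.array([1.0, 0.0])                           # effect of the boxcar

cons = []
for _ in range(n_subjects):
    subject_effect = 1.0 + 0.5 * rng.standard_normal()   # varies over subjects
    y = X @ np.array([subject_effect, 100.0]) + rng.standard_normal(n_scans)
    cons.append(first_level_contrast(y, X, c))

t, dof = one_sample_t(cons)
```

Only one number per subject reaches the second level, which is what makes the approach fast compared with fitting the full two-level model.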
Validity of the summary statistics approach. The summary stats approach is exact if, for each session/subject: the within-session covariance is the same; the first-level design is the same; there is one contrast per session. All other cases: the summary stats approach seems to be fairly robust against typical violations.
Reminder: sphericity. „Sphericity“ means: $\mathrm{Cov}(e) = \sigma^2 I$, i.e. $\mathrm{Var}(e_i) = \sigma^2$ and $\mathrm{Cov}(e_i, e_j) = 0$ for $i \neq j$. [Figure: scans × scans covariance matrix.]
2nd level: non-sphericity. Error covariance: (a) errors are independent but not identical, e.g. different groups (patients, controls); (b) errors are not independent and not identical, e.g. repeated measures for each subject (like multiple basis functions).
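The two cases on this slide correspond to different sets of covariance components at the second level. The sketch below builds such component sets under stated assumptions: one variance component per group for the first case, and one variance component per condition plus one covariance component per pair of conditions for the second. The function names and the ordering of observations (condition by condition, subjects in the same order within each condition) are assumptions, not SPM's API.

```python
import numpy as np

def group_variance_components(group_sizes):
    """Independent, non-identical errors: one variance component per group."""
    n = sum(group_sizes)
    Qs, start = [], 0
    for size in group_sizes:
        Q = np.zeros((n, n))
        idx = np.arange(start, start + size)
        Q[idx, idx] = 1.0                          # ones on this group's diagonal
        Qs.append(Q)
        start += size
    return Qs

def repeated_measures_components(n_subjects, n_conditions):
    """Non-independent, non-identical errors for repeated measures."""
    eye = np.eye(n_subjects)
    Qs = []
    for k in range(n_conditions):                  # variance of condition k
        E = np.zeros((n_conditions, n_conditions))
        E[k, k] = 1.0
        Qs.append(np.kron(E, eye))
    for k in range(n_conditions):                  # covariance of conditions k, l
        for l in range(k + 1, n_conditions):
            E = np.zeros((n_conditions, n_conditions))
            E[k, l] = E[l, k] = 1.0
            Qs.append(np.kron(E, eye))             # same subject, two conditions
    return Qs

# Sizes matching the two examples in this deck
Qs_groups = group_variance_components([12, 11])    # controls, blind subjects
Qs_repeated = repeated_measures_components(12, 4)  # 12 subjects x 4 conditions
```

With four conditions this yields 4 variance and 6 covariance components; their hyperparameters would then be estimated with ReML.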
Example 1: non-identical & independent errors (Noppeney et al.). Stimuli: auditory presentation (SOA = 4 secs) of (i) words and (ii) words spoken backwards, e.g. “Book” and “Koob”. Subjects: (i) 12 control subjects, (ii) 11 blind subjects. Scanning: fMRI, 250 scans per subject, block design.
[Figure: 1st level and 2nd level; groups: controls, blind subjects.]
Example 2: non-identical & non-independent errors (Noppeney et al. 2003, Brain). Stimuli: auditory presentation (SOA = 4 secs) of words in four categories: 1. Motion (“jump”), 2. Sound (“click”), 3. Visual (“pink”), 4. Action (“turn”). Subjects: 12 control subjects. Scanning: fMRI, 250 scans per subject, block design. Question: What regions are generally affected by the semantic content of the words? Contrast: semantic decisions > auditory decisions on reversed words (gender identification task). Tasks: 1. Words referred to body motion. Subjects decided if the body movement was slow. 2. Words referred to auditory features. Subjects decided if the sound was usually loud. 3. Words referred to visual features. Subjects decided if the visual form was curved. 4. Words referred to hand actions. Subjects decided if the hand action involved a tool.
Repeated measures ANOVA. [Figure: 1st level: conditions 1. Motion, 2. Sound, 3. Visual, 4. Action. 2nd level: Motion ?= Sound ?= Visual ?= Action.]
Practical conclusions. Linear hierarchical models are used for group analyses of multi-subject imaging data. The main challenge is to model non-sphericity (i.e. non-identity and non-independence of errors) within and between levels of the hierarchy. This is done using EM or ReML (which are equivalent for linear models). The summary statistics approach is a robust approximation to a full mixed-effects analysis. Use a mixed-effects model only if seriously in doubt about the validity of the summary statistics approach.
Thank you