Hierarchical Models and Variance Components
Will Penny
Wellcome Department of Imaging Neuroscience, University College London, UK
Cyclotron Research Centre, University of Liege, April 2003
Outline
Random Effects Analysis: Summary statistic approach (t-tests @ 2nd level)
General Framework: Multiple variance components and Hierarchical models
Multiple variance components: F-tests and conjunctions @2nd level; Modelling fMRI serial correlation @1st level
Hierarchical models for Bayesian Inference: SPMs versus PPMs
Random Effects Analysis: Summary-Statistic Approach
[Figure: 1st level: data, design matrix and contrast images, with per-subject estimates. 2nd level: one-sample t-test on the contrast images, giving SPM(t).]
Validity of approach
The gold-standard approach is EM (see later), which estimates the population mean effect as MEAN_EM and the variance of this estimate as VAR_EM.
For N subjects, n scans per subject and equal within-subject variance we have
VAR_EM = Var-between/N + Var-within/(Nn)
In this case, the summary-statistic (SS) approach gives the same results, on average:
Avg[\hat{a}] = MEAN_EM
Avg[Var(\hat{a})] = VAR_EM
where \hat{a} is the SS estimate of the population effect.
In other cases, with N~12 and typical ratios of between-subject to within-subject variance found in fMRI, the SS approach will give very similar results to EM.
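The variance formula on this slide can be checked numerically. The following is a minimal sketch, assuming a balanced design with Gaussian between-subject and within-subject noise. The function names, the variances (1.0 and 4.0), N = 12 and n = 20 are illustrative assumptions, not values from the talk.

```python
import random
import statistics


def var_of_group_mean(var_between, var_within, n_subjects, n_scans):
    """Var-between/N + Var-within/(N*n) for a balanced design."""
    return var_between / n_subjects + var_within / (n_subjects * n_scans)


def summary_statistic(var_between, var_within, n_subjects, n_scans, rng):
    """Simulate one study; return (group mean, estimated variance of that mean)."""
    subject_means = []
    for _ in range(n_subjects):
        # Subject-specific effect around a true population mean of zero.
        effect = rng.gauss(0.0, var_between ** 0.5)
        scans = [effect + rng.gauss(0.0, var_within ** 0.5) for _ in range(n_scans)]
        subject_means.append(statistics.fmean(scans))
    group_mean = statistics.fmean(subject_means)
    # Second-level variance estimate: sample variance of the summaries over N.
    var_hat = statistics.variance(subject_means) / n_subjects
    return group_mean, var_hat


rng = random.Random(0)
runs = [summary_statistic(1.0, 4.0, 12, 20, rng) for _ in range(2000)]
avg_var_hat = statistics.fmean([v for _, v in runs])
empirical_var = statistics.pvariance([m for m, _ in runs])
expected = var_of_group_mean(1.0, 4.0, 12, 20)
print(expected, avg_var_hat, empirical_var)
```

Under these assumptions the averaged second-level variance estimate and the empirical variance of the group mean should both sit close to the value given by the formula.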
Two populations
[Figure: contrast images from two populations and their estimated population means, compared with a two-sample t-test @2nd level.]
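The second-level tests named on these slides reduce, at a single voxel, to ordinary t statistics on one contrast value per subject. The sketch below is illustrative only: the function names and the contrast values are hypothetical, and the two-sample version assumes a pooled (equal) variance across groups.

```python
import math
import statistics


def one_sample_t(values):
    """t statistic and degrees of freedom for H0: population mean = 0."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return statistics.fmean(values) / standard_error, n - 1


def two_sample_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for H0: equal means."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * statistics.variance(group_a)
                  + (nb - 1) * statistics.variance(group_b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    diff = statistics.fmean(group_a) - statistics.fmean(group_b)
    return diff / standard_error, na + nb - 2


# Hypothetical contrast values at one voxel, one number per subject.
group_one = [1.0, 2.0, 3.0]
group_two = [2.0, 3.0, 4.0]
t_one, df_one = one_sample_t(group_one)
t_two, df_two = two_sample_t(group_one, group_two)
print(t_one, df_one, t_two, df_two)
```

The pooled variance is exactly the sphericity assumption that the later non-sphericity slides relax.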
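The General Linear Model
y = X\beta + e
Dimensions: y is N x 1, X is N x L, \beta is L x 1, e is N x 1
Error covariance (N x N): C_e = \sigma^2 I
Basic assumptions: errors are identically distributed and independent
We assume ‘sphericity’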
Multiple variance components
y = X\beta + e
Dimensions: y is N x 1, X is N x L, \beta is L x 1, e is N x 1
Error covariance (N x N): C_e = \sum_k \lambda_k Q_k
Errors can now have different variances and there can be correlations
We allow for ‘nonsphericity’
Non-Sphericity
The error covariance can take two forms:
1) Errors are independent but not identical
2) Errors are not independent and not identical
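The two kinds of non-sphericity can be pictured as error covariance matrices built from a weighted sum of fixed basis matrices. The following is a minimal sketch using NumPy. The names `lambdas` and `components` borrow the λ and Q symbols of the EM slide, and the sizes and weights are made-up illustrative values.

```python
import numpy as np


def error_covariance(lambdas, components):
    """Weighted sum of covariance components: sum_k lambda_k * Q_k."""
    return sum(lam * Q for lam, Q in zip(lambdas, components))


n = 4
identity = np.eye(n)

# Sphericity: a single component, the identity.
spherical = error_covariance([2.0], [identity])

# Independent but not identical: each group has its own variance.
Q_group1 = np.diag([1.0, 1.0, 0.0, 0.0])
Q_group2 = np.diag([0.0, 0.0, 1.0, 1.0])
non_identical = error_covariance([1.0, 3.0], [Q_group1, Q_group2])

# Not independent and not identical: add an off-diagonal component.
Q_offdiag = np.ones((n, n)) - identity
correlated = error_covariance([1.0, 3.0, 0.5], [Q_group1, Q_group2, Q_offdiag])

print(spherical, non_identical, correlated, sep="\n\n")
```

Only the weights change between cases; the basis matrices are fixed in advance by the design.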
General Framework
Hierarchical models, with multiple variance components at each level.
With hierarchical models we can define priors and make Bayesian inferences.
If we know the variance components we can compute the distributions over the parameters at each level.
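Estimation: EM algorithm
E-Step:
C_{\theta|y} = (X^T C_e^{-1} X)^{-1}
\eta_{\theta|y} = C_{\theta|y} X^T C_e^{-1} y
M-Step:
r = y - X \eta_{\theta|y}
for i and j {
g_i = -\frac{1}{2} tr\{C_e^{-1} Q_i\} + \frac{1}{2} tr\{C_{\theta|y} X^T C_e^{-1} Q_i C_e^{-1} X\} + \frac{1}{2} r^T C_e^{-1} Q_i C_e^{-1} r
J_{ij} = \frac{1}{2} tr\{Q_j C_e^{-1} Q_i C_e^{-1}\}
}
\lambda = \lambda + J^{-1} g
C_e = \sum_k \lambda_k Q_k
Friston, K. et al. (2002), Neuroimage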
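The E-step and M-step can be sketched in a few lines of NumPy. This is an illustrative Fisher-scoring sketch under stated assumptions, not the SPM implementation: it writes the gradient with a residual-forming matrix P, uses J_ij = 1/2 tr{P Q_i P Q_j} as the curvature, has no priors on the parameters, and does nothing to keep the hyperparameters positive. The function name `reml` and the simulated two-group data are hypothetical.

```python
import numpy as np


def reml(y, X, Qs, n_iter=32, tol=1e-8):
    """Estimate hyperparameters lam of C_e = sum_k lam_k Q_k by Fisher scoring."""
    lam = np.ones(len(Qs))
    for _ in range(n_iter):
        C = sum(l * Q for l, Q in zip(lam, Qs))
        iC = np.linalg.inv(C)
        # E-step: conditional covariance of the parameters.
        C_post = np.linalg.inv(X.T @ iC @ X)
        # M-step: gradient g and expected curvature J for each component.
        P = iC - iC @ X @ C_post @ X.T @ iC
        Py = P @ y
        g = np.array([-0.5 * np.trace(P @ Q) + 0.5 * Py @ Q @ Py for Q in Qs])
        J = np.array([[0.5 * np.trace(P @ Qi @ P @ Qj) for Qj in Qs] for Qi in Qs])
        step = np.linalg.solve(J, g)
        lam = lam + step
        if np.max(np.abs(step)) < tol:
            break
    # Final E-step with the converged hyperparameters.
    iC = np.linalg.inv(sum(l * Q for l, Q in zip(lam, Qs)))
    C_post = np.linalg.inv(X.T @ iC @ X)
    eta = C_post @ X.T @ iC @ y
    return lam, eta, C_post


# Simulated example: two groups of 50 with different means and variances.
rng = np.random.default_rng(0)
n = 50
y = np.concatenate([rng.normal(1.0, 1.0, n), rng.normal(2.0, 3.0, n)])
X = np.kron(np.eye(2), np.ones((n, 1)))      # one mean per group
Qs = [np.diag(np.r_[np.ones(n), np.zeros(n)]),
      np.diag(np.r_[np.zeros(n), np.ones(n)])]
lam, eta, C_post = reml(y, X, Qs)
print(lam, eta)
```

In this simple design the hyperparameter estimates coincide with the unbiased sample variances of the two groups, which gives a convenient sanity check.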
Pooling assumption
Decompose the error covariance at each voxel, i, into a voxel-specific term, r(i), and terms shared across all voxels.
The hyperparameters are then estimated once for the whole brain.
Algorithm Equivalence
Parametric Empirical Bayes (PEB): hierarchical model
Restricted Maximum Likelihood (ReML): single-level model
EM = PEB = ReML
Non-Sphericity
Error can be Independent but Non-Identical when…
1) One parameter but from different groups, e.g. patient and control groups
2) One parameter but design matrices differ across subjects, e.g. subsequent memory effect
Non-Sphericity
Error can be Non-Independent and Non-Identical when…
1) Several parameters per subject, e.g. repeated measurement design
2) Conjunction over several parameters, e.g. common brain activity for different cognitive processes
3) Complete characterization of the hemodynamic response, e.g. F-test combining HRF, temporal derivative and dispersion regressors
Example I (U. Noppeney et al.)
Stimuli: Auditory presentation (SOA = 4 secs) of (i) words and (ii) words spoken backwards
[Figure: example stimuli: jump, touch, koob, “click”]
Subjects: (i) 12 control subjects (ii) 11 blind subjects
Scanning: fMRI, 250 scans per subject, block design
Q. What are the regions that activate for real words relative to reverse words in both blind and control groups?
Independent but Non-Identical Error
1st Level: Controls; Blinds
2nd Level: Controls and Blinds; conjunction between the 2 groups
Example II (U. Noppeney et al.)
Stimuli: Auditory presentation (SOA = 4 secs) of words
[Figure: word categories with example stimuli: motion (“jump”), sound (“click”), visual (“pink”), action (“turn”)]
Subjects: 12 control subjects
Scanning: fMRI, 250 scans per subject, block design
Q. What regions are affected by the semantic content of the words?
Non-Independent and Non-Identical Error
1st Level: visual, sound, hand, motion
2nd Level: F-test
Example III (U. Noppeney et al.)
Stimuli: (i) Sentences presented visually (ii) False fonts (symbols)
Some of the sentences are syntactically primed
Scanning: fMRI, 250 scans per subject, block design
Q. Which brain regions of the “sentence reading system” are affected by priming?
Non-Independent and Non-Identical Error
1st Level: Sentence > Symbols; No-Priming > Priming (orthogonal contrasts)
2nd Level: Conjunction of 2 contrasts
[Figure: result in Left Anterior Temporal region]
Example IV Modelling serial correlation in fMRI time series Model errors for each subject as AR(1) + white noise.
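An AR(1) + white noise error model corresponds to a covariance matrix with two components. The sketch below builds it with NumPy. The function name, the choice to treat the AR coefficient `rho` as a known constant, and the numerical values are assumptions made for illustration.

```python
import numpy as np


def ar1_plus_white(n_scans, rho, var_ar, var_white):
    """Covariance of a stationary AR(1) process plus independent white noise."""
    scans = np.arange(n_scans)
    lags = np.abs(np.subtract.outer(scans, scans))
    Q_ar = rho ** lags            # AR(1) correlation decays geometrically with lag
    Q_white = np.eye(n_scans)     # white noise contributes only to the diagonal
    return var_ar * Q_ar + var_white * Q_white


C = ar1_plus_white(n_scans=5, rho=0.3, var_ar=1.0, var_white=0.5)
print(np.round(C, 3))
```

With `rho` held fixed, the two variances play the role of hyperparameters weighting the two components.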
The Interface
OLS: parameters (no priors)
REML: parameters and hyperparameters (no priors)
PEB: shrinkage priors
Bayes Rule
p(\theta | y) = p(y | \theta) p(\theta) / p(y)
Example 2: Univariate model
Likelihood and Prior
Posterior
Relative Precision Weighting
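Relative precision weighting can be made concrete for the simplest case. The sketch below assumes a single observation with a Gaussian likelihood and a Gaussian prior. The function name and the numbers are illustrative assumptions, not taken from the slide.

```python
def gaussian_posterior(observation, likelihood_var, prior_mean, prior_var):
    """Posterior mean and variance for a Gaussian likelihood and Gaussian prior."""
    likelihood_precision = 1.0 / likelihood_var
    prior_precision = 1.0 / prior_var
    # Precisions add; the posterior mean is a precision-weighted average.
    posterior_precision = likelihood_precision + prior_precision
    posterior_mean = (likelihood_precision * observation
                      + prior_precision * prior_mean) / posterior_precision
    return posterior_mean, 1.0 / posterior_precision


# Equal precisions: the posterior mean lies halfway between data and prior.
equal_mean, equal_var = gaussian_posterior(2.0, 1.0, 0.0, 1.0)
# More precise data: the posterior mean moves towards the observation.
sharp_mean, sharp_var = gaussian_posterior(2.0, 0.25, 0.0, 1.0)
print(equal_mean, equal_var, sharp_mean, sharp_var)
```

With a zero-mean prior this is a shrinkage estimator: the less precise the data, the more the estimate is pulled towards zero.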
Bayesian Inference
1st level = within-voxel: Likelihood
2nd level = between-voxels: Shrinkage Prior
In the absence of evidence to the contrary, parameters will shrink to zero
Bayesian Inference: Posterior Probability Maps (PPMs)
[Figure: likelihood, prior and posterior distributions, contrasting SPMs with PPMs.]
SPMs and PPMs
PPMs: show activations of a given size
SPMs: show voxels with non-zero activations
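The distinction can be illustrated with two hypothetical voxels that share the same estimate-to-standard-deviation ratio but differ in effect size. The sketch assumes a Gaussian posterior and, for simplicity, uses the same mean and variance for both the ratio and the posterior. The function names, the voxel values and the size threshold of 1.0 are assumptions.

```python
import math


def posterior_prob_exceeds(post_mean, post_var, threshold):
    """P(effect > threshold) under a Gaussian posterior."""
    z = (threshold - post_mean) / math.sqrt(post_var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def t_like_ratio(estimate, estimate_var):
    """Effect estimate divided by its standard deviation."""
    return estimate / math.sqrt(estimate_var)


# (mean, variance) for two hypothetical voxels.
small_precise = (0.2, 0.01)
large_noisy = (2.0, 1.0)

ratios = [t_like_ratio(m, v) for m, v in (small_precise, large_noisy)]
ppm = [posterior_prob_exceeds(m, v, threshold=1.0)
       for m, v in (small_precise, large_noisy)]
print(ratios, ppm)
```

Both voxels have the same ratio, so a statistic of that form treats them alike, whereas the probability of exceeding the chosen size separates them.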
PPMs
Advantages:
1) One can infer a cause DID NOT elicit a response
2) SPMs conflate effect-size and effect-variability
3) No multiple comparisons problem (hence no smoothing)
4) P-values don’t change with search volume
Disadvantages:
1) Use of shrinkage priors over voxels is computationally demanding
2) Utility of Bayesian approach is yet to be established
Summary
Random Effects Analysis: Summary statistic approach (t-tests @ 2nd level)
General Framework: Multiple variance components and Hierarchical models
Multiple variance components: F-tests and conjunctions @2nd level; Modelling fMRI serial correlation @1st level
Hierarchical models for Bayesian Inference: SPMs versus PPMs