1
The General Linear Model (for dummies…) Carmen Tur and Ashwani Jha 2009
2
Overview of SPM: image time-series → realignment → smoothing (kernel) → spatial normalisation (template) → general linear model (design matrix → parameter estimates) → statistical parametric map (SPM) → statistical inference (Gaussian field theory, p < 0.05).
3
What is the GLM? It is a model (i.e. an equation). It provides a framework that allows us to make refined statistical inferences taking into account: – the (preprocessed) 3D MRI images – time information (the BOLD time series) – user-defined experimental conditions – neurobiologically based priors (the HRF) – technical / noise corrections
4
How does it work? By creating a linear model. Step 1: collect the data (observations of X and Y).
5
How does it work? By creating a linear model. Step 2: generate a model, Y = bX + c.
6
How does it work? By creating a linear model. Step 3: fit the model, e.g. Y = 0.99X + 12 + e, choosing the parameters so that the sum of squares of the error, Σ eₜ² (t = 1…N), is a minimum.
7
How does it work? By creating a linear model. Step 4: test the fitted model, Y = 0.99X + 12 + e.
8
GLM matrix format. But the GLM works with lists of numbers (matrices): the fitted line Y = 0.99X + 12 + e becomes Y = β₁X₁ + C + e, with the data Y and the regressor X₁ written as column vectors (one entry per observation; the slide shows a table of 11 observations).
9
GLM matrix format. A second regressor can be added: Y = β₁X₁ + β₂X₂ + C + e. We need to put the data Y and the regressors (the design matrix X) into this matrix form; the error term carries the sphericity assumption discussed later.
10
The GLM in matrix format: y = Xβ + e, where y is the data vector, X the design matrix, β the parameter vector and e the error.
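To make this matrix form concrete, here is a minimal NumPy sketch (the numbers and regressors are purely illustrative, not taken from SPM) that stacks the data into a column vector y and the regressors, plus a constant column, into a design matrix X:

```python
import numpy as np

# Illustrative data: 11 observations of y
y = np.array([5.9, 15.0, 18.4, 12.3, 24.7, 23.2, 19.3, 13.6, 26.1, 21.6, 31.7])

# Two illustrative regressors plus a constant column (the "C" term)
x1 = np.arange(1.0, 12.0)                                  # a linearly increasing regressor
x2 = np.array([2, 0, 5, 4, 8, 8, 0, 9, 1, 5, 2], dtype=float)
const = np.ones_like(x1)

# Design matrix X: one row per observation, one column per regressor
X = np.column_stack([x1, x2, const])
print(y.shape, X.shape)                                    # (11,) and (11, 3): y = X @ beta + e
```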
11
fMRI example (from SPM course)…
12
A very simple fMRI experiment (one session): passive word listening versus rest; 7 cycles of rest and listening; blocks of 6 scans with a 7-second TR; the stimulus function marks the listening blocks. Question: is there a change in the BOLD response between listening and rest?
13
A closer look at the data (Y): look at each voxel's BOLD signal over time (the mass-univariate approach).
14
The rest of the model: the BOLD signal over time is modelled as β₁x₁ + β₂x₂ + error e, where x₂ is a constant (all-ones) regressor used instead of the constant C.
15
The GLM matrix format: y = Xβ + e.
16
How to solve the model (parameter estimation), and the assumptions we need (sphericity of the error)… easy!
17
Solving the GLM (finding β̂). As with Y = 0.99X + 12 + e, we actually estimate β̂, the 'best' β: the one with the lowest overall error, i.e. the one that makes the sum of squares of the error, Σ eₜ² (t = 1…N), a minimum. But how does this apply to the GLM, where X is a matrix?
18
…we need to visualise the GLM geometrically, in a space with N dimensions (one per scan): y = x₁β̂₁ + x₂β̂₂ + e.
19
…visualising the GLM geometrically (with N = 3 scans for illustration): the regressors x₁ and x₂ span a plane, the design space, and every fitted value ŷ = Xβ̂ lies in it. What about the actual data y?
20
…visualising the GLM geometrically (N = 3): the design space is defined by x₁ and x₂, the fitted values are ŷ = Xβ̂, and the actual data point y sits somewhere in the full 3-dimensional space.
21
Once again in 3D: the design (X) can only predict data values (Xβ̂) that lie within the design space. The actual data y usually lies outside this space; the 'error' e is the difference y − Xβ̂.
22
Solving the GLM (finding β̂) – ordinary least squares (OLS). To make Σ eₜ² a minimum, the error e must be orthogonal to the design space defined by X ("project the data onto the model"), i.e.: Xᵀe = 0, so Xᵀ(y − Xβ̂) = 0, so Xᵀy = XᵀXβ̂, giving β̂ = (XᵀX)⁻¹Xᵀy.
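As a sanity check on the algebra above, here is a minimal NumPy sketch of the OLS solution on simulated data (the design and noise are assumptions for illustration); the residuals come out orthogonal to the design, exactly as the projection argument says:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy design: a boxcar-like task regressor plus a constant column (84 scans)
n = 84
task = np.tile(np.r_[np.zeros(6), np.ones(6)], 7)          # off/on blocks
X = np.column_stack([task, np.ones(n)])

# Simulated data with known betas
beta_true = np.array([2.0, 100.0])
y = X @ beta_true + rng.normal(0, 1, n)

# Normal equations:  X'y = X'X beta_hat  ->  beta_hat = (X'X)^-1 X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Residuals are orthogonal to the design space: X'e is numerically ~ 0
e = y - X @ beta_hat
print(beta_hat, X.T @ e)                                    # betas near [2, 100]
```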
23
Assuming sphericity. We assume that the error: – has a mean of 0, – is normally distributed, – is independent (does not correlate with itself).
24
Assuming sphericity. Equivalently, the error covariance matrix is σ² times the identity matrix: e ~ N(0, σ²I).
25
Half-way re-cap… The GLM is y = Xβ + e; the ordinary least squares (OLS) solution, assuming i.i.d. error, is β̂ = (XᵀX)⁻¹Xᵀy.
26
Methods for Dummies 2009–10, London, 4th November 2009. Part II – Carmen Tur
27
Problems of this model. I. BOLD responses have a delayed and dispersed form. The hemodynamic response function (HRF) describes the expected BOLD signal when a neural stimulus takes place: expected BOLD response = input (stimulus) function ⊗ impulse response function (HRF).
28
Problems of this model. I. BOLD responses have a delayed and dispersed form. Solution: convolution. Transform the neural stimulus function into an expected BOLD signal by convolving it with a canonical hemodynamic response function (HRF).
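A rough sketch of this convolution step, using a simple double-gamma shape as a stand-in for the canonical HRF (the shape parameters and the boxcar stimulus below are illustrative assumptions, not SPM's canonical values):

```python
import numpy as np
from scipy.stats import gamma

TR = 7.0                           # seconds per scan, as in the example experiment
n_scans = 84                       # 7 cycles x 12 scans
t = np.arange(0, 32, TR)           # HRF sampled at the TR, out to 32 s

# Illustrative double-gamma HRF: a positive peak minus a small undershoot
hrf = gamma.pdf(t, 6) - 0.35 * gamma.pdf(t, 16)
hrf /= hrf.sum()

# Boxcar stimulus function: alternating blocks of 6 scans rest / 6 scans listening
stimulus = np.tile(np.r_[np.zeros(6), np.ones(6)], 7)

# Expected BOLD response = stimulus function convolved with the HRF
expected_bold = np.convolve(stimulus, hrf)[:n_scans]
print(expected_bold.shape)         # (84,): one predicted value per scan
```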
29
Problems of this model. II. The BOLD signal includes substantial amounts of low-frequency noise. Why? Multifactorial: biorhythms, coil heating, etc. What might our data look like? The measured BOLD intensity drifts slowly over time, whereas the predicted response does not take this low-frequency drift into account.
30
Problems of this model. II. The BOLD signal includes substantial amounts of low-frequency noise. Solution: high-pass filtering, using a discrete cosine transform (DCT) basis set to capture and remove the slow drift.
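A sketch of such a DCT drift basis; the number of cosines follows a common rule of thumb for an assumed 128-second cutoff, and the 84-scan, 7 s TR figures come from the example experiment:

```python
import numpy as np

def dct_drift_basis(n_scans, tr, cutoff=128.0):
    """Low-frequency DCT regressors modelling drift slower than `cutoff` seconds."""
    n_basis = max(int(np.floor(2 * n_scans * tr / cutoff)), 1)
    frames = np.arange(n_scans)
    # DCT-II basis functions: cos(pi * k * (2n + 1) / (2N)), k = 1..n_basis
    cols = [np.cos(np.pi * k * (2 * frames + 1) / (2 * n_scans))
            for k in range(1, n_basis + 1)]
    return np.column_stack(cols)

task = np.tile(np.r_[np.zeros(6), np.ones(6)], 7)        # boxcar task regressor (84 scans)
drift = dct_drift_basis(n_scans=84, tr=7.0)               # slow cosine regressors
X = np.column_stack([task, drift, np.ones(84)])           # task + drift + constant
print(X.shape)                                             # (84, 1 + n_basis + 1)
```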
31
Interim summary: GLM steps so far… 1. Acquire the data (Y). 2. Design the matrix (X). 3. Note the assumptions of the GLM. 4. Correct for the BOLD signal shape: convolution with the HRF. 5. Clean the data of low-frequency noise: high-pass filtering. 6. Estimate the βs. But our βs may still be wrong! Why? 7. Check the error: are all the assumptions about the error satisfied?
32
Problems of this model. III. The data are serially correlated. Temporal autocorrelation in y = Xβ + e: over time, the error e at time t is correlated with the error at time t−1. The error should fluctuate randomly around 0, but instead it varies in a correlated way from one scan to the next.
33
Problems of this model. III. The data are serially correlated. Temporal autocorrelation in y = Xβ + e over time. Why? Multifactorial…
34
Problems of this model. III. The data are serially correlated. Temporal autocorrelation in y = Xβ + e over time is described by an autoregressive model: eₜ = a·eₜ₋₁ + ε (assuming ε ~ N(0, σ²I)). In other words, the covariance (autocovariance) of the error at time t (eₜ) with the error at time t−1 (eₜ₋₁) is not zero.
35
Problems of this model. III. The data are serially correlated. The autoregressive model eₜ = a·eₜ₋₁ + ε (assuming ε ~ N(0, σ²I)) can be unrolled: since eₜ₋₁ = a·eₜ₋₂ + ε and eₜ₋₂ = a·eₜ₋₃ + ε, we get eₜ = a(a·eₜ₋₂ + ε) + ε = a²eₜ₋₂ + aε + ε = a³eₜ₋₃ + a²ε + aε + ε = … and because a is a number between 0 and 1, the influence of earlier errors fades with distance in time.
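A tiny simulation of this AR(1) process, with an assumed a = 0.6, showing that the lag-1 correlation of the resulting error series is clearly non-zero:

```python
import numpy as np

rng = np.random.default_rng(1)
n, a, sigma = 1000, 0.6, 1.0

e = np.zeros(n)
for t in range(1, n):
    e[t] = a * e[t - 1] + rng.normal(0, sigma)   # e_t = a*e_{t-1} + eps

# Lag-1 autocorrelation: correlation of e_t with e_{t-1}
lag1 = np.corrcoef(e[1:], e[:-1])[0, 1]
print(round(lag1, 2))                            # close to a = 0.6, i.e. clearly non-zero
```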
36
Problems of this model. In other words, the covariance of the error at time t (eₜ) and the error at time t−1 (eₜ₋₁) is not zero: the error covariance matrix (time × time, in scans) has non-zero entries off the diagonal.
37
Problems of this model. III. The data are serially correlated. Temporal autocorrelation in y = Xβ + e over time: the autoregressive model eₜ = a·eₜ₋₁ + ε (assuming ε ~ N(0, σ²I)) means the autocovariance of eₜ and eₜ₋₁ is not zero, which violates the assumption that e ~ N(0, σ²I).
38
Problems of this model. III. The data are serially correlated. Solution: 1. Use an enhanced noise model with hyperparameters for multiple error covariance components. The error should be eₜ = ε (a = 0), but in reality it is eₜ = a·eₜ₋₁ + ε with a ≠ 0. But what is a?
39
Problems of this model. III. The data are serially correlated. Solution: 1. Use an enhanced noise model with hyperparameters for multiple error covariance components. We would like to know the autocovariance (a) of the error, but we can only estimate it through the error covariance matrix V: V = Σᵢ λᵢQᵢ = λ₁Q₁ + λ₂Q₂, where λ₁ and λ₂ are hyperparameters and Q₁ and Q₂ are the error covariance components.
40
Problems of this model. III. The data are serially correlated (eₜ = a·eₜ₋₁ + ε, assuming ε ~ N(0, σ²I)). Solution: 1. Use an enhanced noise model with hyperparameters for multiple error covariance components. 2. Use the estimated autocorrelation to specify a filter matrix W for whitening the data: WY = WXβ + We.
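A sketch of the whitening idea under an AR(1) assumption (the value of a is simply assumed here; in practice it comes from the estimated covariance components): build the covariance V implied by a, take W = V^(−1/2), and fit the GLM to the whitened model WY = WXβ + We:

```python
import numpy as np

def whitening_matrix(n, a):
    """W = V^(-1/2) for an AR(1) error covariance V[i, j] proportional to a^|i-j|."""
    idx = np.arange(n)
    V = a ** np.abs(idx[:, None] - idx[None, :])        # AR(1) correlation matrix
    vals, vecs = np.linalg.eigh(V)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T        # symmetric inverse square root

rng = np.random.default_rng(2)
n, a = 84, 0.4
X = np.column_stack([np.tile(np.r_[np.zeros(6), np.ones(6)], 7), np.ones(n)])

# Simulate AR(1) noise and data with known betas
e = np.zeros(n)
for t in range(1, n):
    e[t] = a * e[t - 1] + rng.normal()
y = X @ np.array([2.0, 100.0]) + e

# Whiten both sides, then ordinary least squares on the whitened model
W = whitening_matrix(n, a)
Wy, WX = W @ y, W @ X
beta_hat = np.linalg.solve(WX.T @ WX, WX.T @ Wy)
print(beta_hat)                                          # close to [2, 100]
```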
41
Other problems – physiological confounds: head movements, arterial pulsations (particularly bad in the brain stem), breathing, eye blinks (visual cortex), adaptation effects, fatigue, fluctuations in concentration, etc.
42
Other problems – correlated regressors. Example: y = x₁β₁ + x₂β₂ + e. When there is high (but not perfect) correlation between regressors, the parameters can still be estimated, but the estimates will be inefficient (i.e. highly variable).
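A quick numerical illustration (with simulated regressors, not real fMRI ones) of why correlation inflates the variability of the estimates: the parameter variances scale with the diagonal of (XᵀX)⁻¹, which grows as two regressors become more similar.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)

for rho in (0.0, 0.9, 0.99):
    # Build x2 with (approximately) the requested correlation to x1
    x2 = rho * x1 + np.sqrt(1 - rho ** 2) * rng.normal(size=n)
    X = np.column_stack([x1, x2, np.ones(n)])
    # var(beta_hat) is proportional to diag((X'X)^-1)
    var_scale = np.diag(np.linalg.inv(X.T @ X))
    print(rho, var_scale[:2])   # variance of beta_1 and beta_2 grows with rho
```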
43
Other problems – variability in the HRF. The HRF varies substantially across voxels and subjects; for example, its latency can differ by ±1 second. Solution: multiple basis functions (another talk): the HRF can be understood as a linear combination of basis functions A, B and C.
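A sketch of the basis-function idea: instead of one fixed HRF shape, the response is modelled as a linear combination of a canonical shape and, for example, its temporal derivative, which soaks up small latency shifts (the double-gamma shape and the 0.5 s shift below are illustrative assumptions):

```python
import numpy as np
from scipy.stats import gamma

dt = 0.1
t = np.arange(0, 32, dt)
canon = gamma.pdf(t, 6) - 0.35 * gamma.pdf(t, 16)     # illustrative canonical HRF shape
deriv = np.gradient(canon, dt)                        # temporal derivative basis function

# A small latency shift is well approximated by canonical + derivative:
shifted = gamma.pdf(t - 0.5, 6) - 0.35 * gamma.pdf(t - 0.5, 16)
coef, *_ = np.linalg.lstsq(np.column_stack([canon, deriv]), shifted, rcond=None)
print(coef)   # weights on the two basis functions that reproduce the shifted HRF
```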
44
Ways to improve the model: model everything, and minimise the residual error variance. It is important to model all known variables, even if they are not experimentally interesting: the effects of interest (the regressors we are actually interested in) plus head movement, block and subject effects, global activity, and so on.
45
How to make inferences. Remember: the aim of modelling the measured data was to make inferences about effects of interest. Contrasts allow us to make such inferences. How? T-tests and F-tests (another talk!).
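As a preview of how a contrast turns the fitted βs into a statistic, here is a minimal t-test sketch under i.i.d. error assumptions, t = cᵀβ̂ / sqrt(σ̂² · cᵀ(XᵀX)⁻¹c), using simulated data and a "listening > rest" contrast:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 84
X = np.column_stack([np.tile(np.r_[np.zeros(6), np.ones(6)], 7), np.ones(n)])
y = X @ np.array([2.0, 100.0]) + rng.normal(0, 1, n)

# OLS estimates and residual variance
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
df = n - np.linalg.matrix_rank(X)
sigma2 = resid @ resid / df

# Contrast picking out the task regressor: "listening > rest"
c = np.array([1.0, 0.0])
t_stat = c @ beta_hat / np.sqrt(sigma2 * c @ np.linalg.inv(X.T @ X) @ c)
p_value = stats.t.sf(t_stat, df)               # one-sided p-value
print(t_stat, p_value)
```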
46
Summary, using an easy example…
47
Given data (the time series of one image voxel, y): Y = X·β + ε.
48
Different (rigid and known) predictors (the regressors x₁ … x₆ over time, i.e. the design matrix X): Y = X·β + ε.
49
Fitting our model to the data (estimation of the parameters β), using the regressors x₁ … x₆: Y = X·β + ε.
50
How? Fitting our model to the data (estimation of the parameters β) by minimising the residual error variance.
51
e (error) = y_observed − y_expected: we minimise the sum of squares of the error, i.e. of the differences between the predicted model and the observed data.
52
Fitting our model to the data (estimation of the parameters β): y = x₁β₁ + x₂β₂ + x₃β₃ + x₄β₄ + x₅β₅ + x₆β₆ + e, which in this example comes out as y = 6x₁ + 3x₂ + 1x₃ + 2x₄ + 1x₅ + 36x₆ + e (Y = X·β + ε). We must pay attention to the problems that the GLM has…
53
Y = X·β + ε, and then making inferences: our final goal! The end.
54
References: 1. Talks from previous years. 2. Human Brain Function. Thanks to Guillaume Flandin. Many thanks for your attention. London, 4th Nov 2009. General Linear Model – Methods for Dummies 2009–2010.