Published by Preston May. Modified over 9 years ago.
1
Statistical Inference, Multiple Comparisons and Random Field Theory
Andrew Holmes, SPM short course, May 2002
2
Overview…
3
Overview…
…a voxel by voxel hypothesis testing approach: reliably identify regions showing a significant experimental effect of interest
Assessment of statistic images
– multiple comparisons
– random field theory
– smoothness
– spatial levels of inference & power
– false discovery rate (later...)
Generalisability, random effects & population inference
– inferring to the population
– group comparisons
Non-parametric inference (later...)
4
[Figure: SPM processing overview. Labels: image data; realignment & motion correction; smoothing (kernel); normalisation (anatomical reference); General Linear Model (design matrix; model fitting, statistic image); parameter estimates; Statistical Parametric Map; random field theory; corrected p-values]
5
Statistical Parametric Mapping…
[Figure: images for condition 1 and condition 2]
voxel by voxel modelling: the parameter estimate, scaled by the variance estimate, gives the statistic image or SPM
6
Multiple comparisons…
7
Classical hypothesis testing…
Null hypothesis H
– test statistic
– null distributions
Hypothesis test
– control Type I error: incorrectly reject H
– test level $\alpha$: $\Pr(\text{“reject” } H \mid H) \le \alpha$
– test size: $\Pr(\text{“reject” } H \mid H)$
p-value
– min $\alpha$ at which H rejected
– $\Pr(T \ge t \mid H)$
– characterising “surprise”
[Figure: null distributions. t-distribution, 32 df; F-distribution, 10,32 df]
8
Multiple comparisons…
[Figure: $t_{59}$ image, Gaussian 10mm FWHM (2mm pixels), thresholded at p = 0.05]
Threshold at p ?
– expect (100 × p)% by chance
Surprise ?
– extreme voxel values: voxel level inference
– big suprathreshold clusters: cluster level inference
– many suprathreshold clusters: set level inference
Power & localisation
– sensitivity
– spatial specificity
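The "expect (100 × p)% by chance" point can be checked with a small simulation. This is a minimal sketch, assuming independent standard normal null voxels (real statistic images are spatially correlated, which changes the clustering but not the expected count); the function name and the voxel count are illustrative, not from the slides.

```python
import random
from statistics import NormalDist

def count_false_positives(n_voxels, p, seed=0):
    """Count null voxels that exceed the uncorrected one-sided threshold for level p."""
    rng = random.Random(seed)
    u = NormalDist().inv_cdf(1.0 - p)  # uncorrected threshold
    return sum(1 for _ in range(n_voxels) if rng.gauss(0.0, 1.0) > u)

n_voxels = 100_000  # hypothetical number of voxels in the search volume
p = 0.05
observed = count_false_positives(n_voxels, p)
print(f"expected by chance: {n_voxels * p:.0f}, observed: {observed}")
```

With no true effect anywhere, roughly 5% of the voxels still survive an uncorrected p = 0.05 threshold, which is why the threshold has to account for the whole family of tests.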
9
Multiple comparisons terminology…
Family of hypotheses
– $H^k$, $k \in \Omega = \{1,\ldots,K\}$
– $H^\Omega = \bigcap_k H^k$
Familywise Type I error
– weak control, omnibus test: $\Pr(\text{“reject” } H^\Omega \mid H^\Omega) \le \alpha$
  “anything, anywhere” ?
– strong control, localising test: $\Pr(\text{“reject” } H^W \mid H^W) \le \alpha$ for all $W$ with $W \subseteq \Omega$ and $H^W$ true
  “anything, & where” ?
Adjusted p-values
– test level at which reject $H^k$
10
Simple threshold tests…
[Figure panels: p = 0.05, p = 0.0000001, p = 0.0001, each marked with the threshold u]
Threshold u
– $t^k > u$: reject $H^k$
– reject any $H^k$: reject $H^\Omega$, i.e. reject $H^\Omega$ if $t_{\max} > u$
Valid test
– weak control: $\Pr(T_{\max} > u \mid H^\Omega) \le \alpha$
– strong control, since for any $W \subseteq \Omega$: $\Pr(T^W_{\max} > u \mid H^W) \le \alpha$
Adjusted p-values
– $\Pr(T_{\max} > t^k \mid H^\Omega)$
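A single-threshold test only needs the null distribution of the maximum statistic. The sketch below assumes K independent standard normal statistics, which is an assumption made here purely so that the simulated threshold can be compared with a closed form; the function names are illustrative.

```python
import random
from statistics import NormalDist

def simulated_fwe_threshold(K, alpha, n_sim=5000, seed=1):
    """Empirical (1 - alpha) quantile of T_max over n_sim simulated null families."""
    rng = random.Random(seed)
    maxima = sorted(max(rng.gauss(0.0, 1.0) for _ in range(K)) for _ in range(n_sim))
    return maxima[int((1.0 - alpha) * n_sim)]

def exact_fwe_threshold(K, alpha):
    """Independent tests only: Pr(T_max <= u) = Phi(u)^K, solved for u."""
    return NormalDist().inv_cdf((1.0 - alpha) ** (1.0 / K))

def adjusted_p(t_k, K):
    """Adjusted p-value Pr(T_max > t_k | H) under the same independence assumption."""
    return 1.0 - NormalDist().cdf(t_k) ** K

K, alpha = 100, 0.05
u_sim = simulated_fwe_threshold(K, alpha)
u_exact = exact_fwe_threshold(K, alpha)
print(f"simulated u = {u_sim:.2f}, exact u = {u_exact:.2f}")
print(f"adjusted p for t = 4.0: {adjusted_p(4.0, K):.4f}")
```

For smooth images the statistics are dependent, so the closed form no longer applies; the distribution of the maximum is then what random field theory approximates.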
11
The “Bonferroni” correction…
“The” Bonferroni inequality
Carlo Emilio Bonferroni (1936)
– For any set of events $A_k$: $\Pr\left(\bigcap_{k=1}^{K} A_k\right) \ge 1 - \sum_{k=1}^{K} \left[1 - \Pr(A_k)\right]$
Bonferroni correction
– $A_k$: correctly “accept” $H^k$, i.e. $T_k < u$ and $H^k$ true
– Assess $H^k$ at level $\alpha'$, with correction $\alpha' = \alpha / K$
– $u = \Phi^{-1}(1 - \alpha/K)$
Adjusted p-values
– $\min(1, K p_k)$
Conservative for correlated tests
– independent: K tests
– some dependence: ? tests
– totally dependent: 1 test
[Figure panels: 5mm, 10mm, 15mm]
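The Bonferroni threshold and adjusted p-values from this slide are easy to compute directly. A minimal sketch, assuming a standard normal statistic and one-sided tests; the function names and the example p-values are hypothetical.

```python
from statistics import NormalDist

def bonferroni_threshold(alpha, K):
    """Threshold u = Phi^{-1}(1 - alpha / K) for K one-sided tests on a N(0,1) statistic."""
    return NormalDist().inv_cdf(1.0 - alpha / K)

def bonferroni_adjust(p_values):
    """Adjusted p-values min(1, K * p_k)."""
    K = len(p_values)
    return [min(1.0, K * p) for p in p_values]

u_bonf = bonferroni_threshold(0.05, 100_000)  # hypothetical 100,000 voxels
adjusted = bonferroni_adjust([0.0001, 0.004, 0.02, 0.3])
print(f"Bonferroni threshold: {u_bonf:.2f}")
print("adjusted p-values:", [round(p, 4) for p in adjusted])
```

The threshold depends only on the number of tests, not on how correlated they are, which is why the correction becomes conservative as the image gets smoother.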
12
Random field theory…
13
SPM approach: Random fields…
Consider statistic image as lattice representation of a continuous random field
Use results from continuous random field theory
[Figure: lattice representation]
14
Euler characteristic…
Topological measure
– of excursion set: $\chi(A_u)$, where $A_u = \{x \in \Omega : Z(x) > u\}$, $\Omega \subset \mathbb{R}^3$
– # components - # “holes”
Single threshold test
– large u, near $T_{\max}$
– Euler char. $\approx$ # local max
– Expected Euler char $\approx$ p-value: $\Pr(Z_{\max} > u) \approx \Pr(\chi(A_u) > 0) \approx E[\chi(A_u)]$
– single threshold test: u s.t. $E[\chi(A_u)] = \alpha$
15
Expected Euler characteristic…
$E[\chi(A_u)] \approx \lambda(\Omega)\,|\Lambda|^{1/2}\,(u^2 - 1)\,\exp(-u^2/2) / (2\pi)^2$
– $\Omega$: large search region, $\Omega \subset \mathbb{R}^3$
– $\lambda(\Omega)$: volume
– $|\Lambda|^{1/2}$: smoothness
– $A_u$: excursion set, $A_u = \{x \in \Omega : Z(x) > u\}$
– $Z(x)$: Gaussian random field, $x \in \mathbb{R}^3$
  + Multivariate Normal finite dimensional distributions
  + continuous
  + strictly stationary
  + marginal N(0,1)
  + continuously differentiable
  + twice differentiable at 0
  + Gaussian ACF (at least near local maxima)
[Figure: excursion set $A_u$]
16
Smoothness, PRF, resels...
Smoothness $|\Lambda|$
– variance-covariance matrix of partial derivatives (possibly location dependent)
Point Response Function (PRF)
Full Width at Half Maximum (FWHM)
Gaussian PRF
– $\Sigma$: kernel var/cov matrix
– ACF $2\Sigma$
– $\Lambda = (2\Sigma)^{-1}$
– FWHM $f = \sigma\sqrt{8\ln 2}$
– ignoring covariances: $\Sigma = \frac{1}{8\ln 2}\,\mathrm{diag}(f_x^2, f_y^2, f_z^2)$, so $|\Lambda|^{1/2} = (4\ln 2)^{3/2} / (f_x f_y f_z)$
Resolution Element (RESEL)
– resel dimensions $(f_x \times f_y \times f_z)$
– $R_3(\Omega) = \lambda(\Omega) / (f_x f_y f_z)$ if strictly stationary
$E[\chi(A_u)] = R_3(\Omega)\,(4\ln 2)^{3/2}\,(u^2 - 1)\,\exp(-u^2/2) / (2\pi)^2$
c.f. $R_3(\Omega)\,(1 - \Phi(u))$ for high thresholds u
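The resel count and the three-dimensional expected Euler characteristic can be turned into a corrected threshold numerically. This is a sketch under stated assumptions: a stationary Gaussian field, only the 3D term, and a hypothetical search volume and FWHM; the function names are illustrative, and the bisection relies on the expected EC decreasing in u above about 1.73.

```python
import math

def resel_count(volume_mm3, fwhm_mm):
    """R_3 = volume / (f_x * f_y * f_z)."""
    fx, fy, fz = fwhm_mm
    return volume_mm3 / (fx * fy * fz)

def expected_ec_3d(u, resels):
    """E[chi(A_u)] = R_3 (4 ln 2)^{3/2} (u^2 - 1) exp(-u^2 / 2) / (2 pi)^2."""
    return (resels * (4.0 * math.log(2.0)) ** 1.5 * (u * u - 1.0)
            * math.exp(-u * u / 2.0) / (2.0 * math.pi) ** 2)

def ec_threshold(resels, alpha=0.05, lo=2.0, hi=10.0):
    """Bisection for u with E[chi(A_u)] = alpha; assumes the root lies in [lo, hi]."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if expected_ec_3d(mid, resels) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

resels = resel_count(1_000_000.0, (10.0, 10.0, 10.0))  # hypothetical volume and FWHM
u_rft = ec_threshold(resels)
print(f"resels = {resels:.0f}, corrected threshold u = {u_rft:.2f}")
```

Because the threshold depends on the number of resels rather than the number of voxels, a smoother image of the same volume gets a lower corrected threshold.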
17
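Component fields…
$Y = X\beta + \varepsilon$
[Figure: “image regression”. Labels: data matrix Y (scans × voxels); design matrix X; parameters $\beta$; errors $\varepsilon$; estimation gives parameter estimates $\hat\beta$ and residuals (estimated component fields); variance $\sigma^2$ and estimated variance $\hat\sigma^2$]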
18
Component fields…
[Figure: the T-statistic image expressed in terms of component fields]
19
Smoothness estimation…
Smoothness
– from standardised residuals
– empirical derivatives at each voxel
Resels per voxel (RPV): an “image” of smoothness
– correction for estimation of variance field $\sigma^2$
  function of degrees of freedom
– covariances often ignored
Euler Characteristics
– using discrete methods
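The idea of estimating smoothness from empirical derivatives can be illustrated in one dimension. This is a simplified sketch, not the SPM estimator: it smooths a single white-noise series with a Gaussian kernel of known FWHM, standardises it, and recovers the FWHM from the variance of first differences using $\Lambda = 4\ln 2 / f^2$. All names and the chosen FWHM are hypothetical, and the degrees-of-freedom correction mentioned on the slide is not applied.

```python
import math
import random

def gaussian_kernel(sigma):
    """Normalised discrete Gaussian kernel, truncated at 4 sigma."""
    radius = int(math.ceil(4 * sigma))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, kernel):
    """'Valid' convolution: only positions where the kernel fits entirely."""
    width = len(kernel)
    return [sum(k * signal[i + j] for j, k in enumerate(kernel))
            for i in range(len(signal) - width + 1)]

def estimate_fwhm(field):
    """FWHM (in voxels) from the variance of first differences of the standardised field."""
    n = len(field)
    mean = sum(field) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in field) / (n - 1))
    z = [(v - mean) / sd for v in field]
    diffs = [b - a for a, b in zip(z, z[1:])]
    lam = sum(d * d for d in diffs) / len(diffs)  # empirical variance of the derivative
    return math.sqrt(4.0 * math.log(2.0) / lam)

true_fwhm = 8.0  # voxels, hypothetical
sigma = true_fwhm / math.sqrt(8.0 * math.log(2.0))
rng = random.Random(42)
noise = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
estimated_fwhm = estimate_fwhm(smooth(noise, gaussian_kernel(sigma)))
print(f"true FWHM = {true_fwhm}, estimated FWHM = {estimated_fwhm:.2f}")
```

First differences slightly underestimate the derivative variance on a lattice, so the estimate tends to sit a little above the true value; the bias shrinks as the smoothness grows relative to the voxel size.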
20
Unified p-values…
General form for expected Euler characteristic
– $\chi^2$, F, & t fields
– restricted search regions
– D dimensions
$E[\chi(A_u)] = \sum_{d=0}^{D} R_d(\Omega)\,\rho_d(u)$
$R_d(\Omega)$: d-dimensional Minkowski functional of $\Omega$
– function of dimension, space $\Omega$ and smoothness:
  $R_0(\Omega) = \chi(\Omega)$, Euler characteristic of $\Omega$
  $R_1(\Omega)$ = resel diameter
  $R_2(\Omega)$ = resel surface area
  $R_3(\Omega)$ = resel volume
$\rho_d(u)$: d-dimensional EC density of Z(x)
– function of dimension and threshold, specific for RF type
– e.g. Gaussian RF (strictly stationary &c…):
  $\rho_0(u) = 1 - \Phi(u)$
  $\rho_1(u) = (4\ln 2)^{1/2}\,\exp(-u^2/2) / (2\pi)$
  $\rho_2(u) = (4\ln 2)\,u\,\exp(-u^2/2) / (2\pi)^{3/2}$
  $\rho_3(u) = (4\ln 2)^{3/2}\,(u^2 - 1)\,\exp(-u^2/2) / (2\pi)^2$
  $\rho_4(u) = (4\ln 2)^2\,(u^3 - 3u)\,\exp(-u^2/2) / (2\pi)^{5/2}$
[Figure: excursion set $A_u$]
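The unified formula is a sum of resel counts times EC densities. The sketch below implements the Gaussian EC densities for d = 0 to 3 and sums them; the resel counts in the example are hypothetical, and the d = 2 density includes the factor u of the standard Gaussian EC density.

```python
import math
from statistics import NormalDist

FOUR_LN2 = 4.0 * math.log(2.0)

def ec_density_gaussian(d, u):
    """EC density rho_d(u) of a stationary Gaussian field, for d = 0..3."""
    e = math.exp(-u * u / 2.0)
    if d == 0:
        return 1.0 - NormalDist().cdf(u)
    if d == 1:
        return FOUR_LN2 ** 0.5 * e / (2.0 * math.pi)
    if d == 2:
        return FOUR_LN2 * u * e / (2.0 * math.pi) ** 1.5
    if d == 3:
        return FOUR_LN2 ** 1.5 * (u * u - 1.0) * e / (2.0 * math.pi) ** 2
    raise ValueError("only d = 0..3 implemented in this sketch")

def expected_ec(u, resel_counts):
    """E[chi(A_u)] = sum over d of R_d * rho_d(u); resel_counts = [R_0, R_1, R_2, R_3]."""
    return sum(R * ec_density_gaussian(d, u) for d, R in enumerate(resel_counts))

resel_counts = [1.0, 30.0, 300.0, 1000.0]  # hypothetical R_0..R_3
ec_full = expected_ec(4.5, resel_counts)
ec_volume_only = expected_ec(4.5, [0.0, 0.0, 0.0, 1000.0])
print(f"all terms: {ec_full:.4f}, volume term only: {ec_volume_only:.4f}")
```

The lower-dimensional terms matter most for small or irregular search regions, where surface area and diameter are large relative to volume.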
21
Suprathreshold cluster tests…
Primary threshold u
– examine connected components of excursion set
– suprathreshold clusters
– reject $H^W$ for clusters of voxels W of size S > s
Localisation (strong control)
– at cluster level
– increased power
– esp. high resolutions (fMRI)
Thresholds, p-values
– $\Pr(S_{\max} > s \mid H^\Omega) \le \alpha$: Nosko, Friston, (Worsley)
– Poisson occurrence (Adler)
– assume form for $\Pr(S = s \mid S > 0)$
[Figure panels: 5mm FWHM, 10mm FWHM, 15mm FWHM (2mm² pixels)]
22
Poisson Clumping Heuristic
[Figure labels: expected number of clusters; p{cluster volume > k}; expected cluster volume; EC density; search volume (R); smoothness]
23
Levels of inference…
Parameters
– u = 3.09
– k = 12 voxels
– S = 32³ voxels
– FWHM = 4.7 voxels
– D = 3
[Figure: three suprathreshold clusters, n = 82, n = 32, n = 12]
set-level
– P(c ≥ 3 | n ≥ 12, u ≥ 3.09) = 0.019
cluster-level
– P(c ≥ 1 | n ≥ 82, t ≥ 3.09) = 0.029 (corrected)
– P(n ≥ 82 | t ≥ 3.09) = 0.019 (uncorrected)
voxel-level
– P(c ≥ 1 | n ≥ 0, t ≥ 4.37) = 0.048 (corrected)
– P(t ≥ 4.37) = 1 - Φ{4.37} < 0.001 (uncorrected)
omnibus
– P(c ≥ 7 | n ≥ 0, u ≥ 3.09) = 0.031
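The corrected cluster-level and voxel-level values can be approximated from the slide's parameters with the Poisson clumping heuristic. The formulas below are not given on the slides; they are written from memory of the spatial-extent approach of Friston et al. (1994) for Gaussian fields and should be read as assumptions: expected cluster count E[m] = S (2π)^{-(D+1)/2} W^{-D} u^{D-1} exp(-u²/2) with W = FWHM / √(4 ln 2), and cluster size tail exp(-β k^{2/D}) with β = (Γ(D/2 + 1) E[m] / E[N])^{2/D}. Function names are illustrative.

```python
import math
from statistics import NormalDist

def expected_clusters(S, u, fwhm, D):
    """Approximate expected number of suprathreshold clusters E[m] (assumed form)."""
    W = fwhm / math.sqrt(4.0 * math.log(2.0))
    return (S * (2.0 * math.pi) ** (-(D + 1) / 2.0) * W ** (-D)
            * u ** (D - 1) * math.exp(-u * u / 2.0))

def cluster_size_tail(k, S, u, fwhm, D):
    """Assumed form for Pr(n >= k) for one cluster: exp(-beta * k^(2/D))."""
    Em = expected_clusters(S, u, fwhm, D)
    EN = S * (1.0 - NormalDist().cdf(u))  # expected number of suprathreshold voxels
    beta = (math.gamma(D / 2.0 + 1.0) * Em / EN) ** (2.0 / D)
    return math.exp(-beta * k ** (2.0 / D))

def corrected_cluster_p(k, S, u, fwhm, D):
    """Poisson clumping: Pr(at least one cluster with n >= k)."""
    Em = expected_clusters(S, u, fwhm, D)
    return 1.0 - math.exp(-Em * cluster_size_tail(k, S, u, fwhm, D))

def corrected_voxel_p(t, S, fwhm, D):
    """Pr(at least one cluster above t), i.e. Pr(T_max >= t) under the heuristic."""
    return 1.0 - math.exp(-expected_clusters(S, t, fwhm, D))

S, fwhm, D = 32 ** 3, 4.7, 3  # parameters from the slide
p_cluster = corrected_cluster_p(82, S, 3.09, fwhm, D)
p_voxel = corrected_voxel_p(4.37, S, fwhm, D)
print(f"cluster-level corrected p = {p_cluster:.3f}")
print(f"voxel-level corrected p   = {p_voxel:.3f}")
```

Under these assumptions both values should come out close to the corrected cluster-level and voxel-level figures quoted on the slide, which suggests the slide's numbers were produced by a calculation of this general form.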
24
Summary: Levels of inference & power
25
SPM results...
26
SPM results…
27
SPM results...
29
SPM results…
30
SPM results...
32
Assumptions…
Model fit & assumptions
– valid distributional results
Multivariate normality
– of component images
Strict stationarity (pre SPM99)
– of component images
– homogeneous spatial structure
Smoothness
– smoothness » voxel size
  lattice approximation
  smoothness estimation
– practically FWHM ≥ 3 × VoxDim
– otherwise
  conservative (voxel level)
  lax (spatial extent)
spatial smoothing? temporal smoothing?
33
Random effects…
34
Random effects & variance components
Fixed effects
– Are you confident that a new observation from any of subjects 1-3 will be greater than zero ?
  Yes! using within-subjects variance
– infer for these subjects: case study
Random effects
– Are you confident that a new observation from a new subject will be greater than zero ?
  No! using between-subjects variance
– infer for any subject: population
35
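Multi-subject analysis…?
[Figure: per-subject estimates $\hat\beta_1, \ldots, \hat\beta_6$ combined into an estimated mean activation image, assessed against the estimated within-subject variance $\hat\sigma^2_\epsilon$ (c.f. $\sigma^2_\epsilon / nw$); SPM{t} shown at p < 0.001 (uncorrected) and p < 0.05 (corrected)]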
36
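Two-stage analysis of random effect…
[Figure: level-one (within-subject) fits to the timecourses at [03, -78, 00] give contrast images $\hat\beta_1, \ldots, \hat\beta_6$, with within-subject variance $\hat\sigma^2_\epsilon$; level-two (between-subject) analysis of the contrast images gives $\hat\sigma^2_\alpha$, an estimate of the mixed-effects model variance $\sigma^2_\alpha + \sigma^2_\epsilon / w$ (c.f. variance of the mean $\sigma^2_\alpha / n + \sigma^2_\epsilon / nw$); SPM{t} at p < 0.001 (uncorrected); no voxels significant at p < 0.05 (corrected)]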
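The difference between the fixed-effects and two-stage random-effects analyses can be seen at a single voxel. This is a minimal sketch with made-up numbers: the per-subject contrast estimates, the within-subject variance and the scans per subject are all hypothetical, and each subject's estimate is treated as a mean over its scans so that its variance is the within-subject variance divided by the number of scans.

```python
import math
from statistics import mean, stdev

# Hypothetical level-one contrast estimates at one voxel, one value per subject.
subject_effects = [0.8, 1.5, -0.2, 2.1, 0.4, 1.0]
within_var = 1.0         # hypothetical within-subject error variance
scans_per_subject = 50   # hypothetical w

n = len(subject_effects)
group_mean = mean(subject_effects)

# Fixed effects: the yardstick is within-subject variance only, sigma_eps^2 / (n w).
fixed_se = math.sqrt(within_var / (n * scans_per_subject))
t_fixed = group_mean / fixed_se

# Random effects (two-stage): one-sample t-test on the contrast estimates, so the
# yardstick is the between-subject spread of the estimates, n - 1 degrees of freedom.
random_se = stdev(subject_effects) / math.sqrt(n)
t_random = group_mean / random_se

print(f"fixed effects  t = {t_fixed:.2f}")
print(f"random effects t = {t_random:.2f} on {n - 1} df")
```

The fixed-effects statistic is much larger because it ignores subject-to-subject variability, which is why it supports inference about these subjects only and not about the population.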
37
Two stage random effects group comparison
[Figure: 12 subjects; level-one (within-subject) analyses give contrast images; level-two (between-subject) analysis is a two-sample t-test comparing the groups]
38
Multi-stage multi-level modelling…
[Figure: flow from parameter estimation to inference. Level-1 data, model & contrast(s); estimated contrasts from level-1 fits, level-2 model & level-2 contrasts; level-2 estimated contrasts and residual variance; level-2 (population) inference]
39
Conjunction analysis
[Figure labels: data; 1st level design matrix; contrasts; 2nd level design matrix; random effects analysis; SPM{T}; critical threshold; probability of conjunction]
40
Hypothesis testing fallacy…
41
Hypothesis testing !?
Why test?
– reliability
– genuine effects
– integrity of research (hopefully)
The fallacy…
– point null hypothesis (no change)
– things are never the same ! (always some small chance change)
– given enough observations can always reject null hypothesis !
– fMRI !? (lots of observations)
…testing, rather than estimating
– significant ≠ important !?
…and: “absence of evidence is not evidence of absence”
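The claim that enough observations will always reject a point null can be made concrete. This is a sketch assuming a one-sided z-test with known standard deviation, where the expected statistic is effect × √n / σ; the function name and the effect sizes are hypothetical.

```python
import math
from statistics import NormalDist

def observations_needed(effect, sigma=1.0, alpha=0.05):
    """Smallest n at which the expected z statistic, effect * sqrt(n) / sigma,
    reaches the one-sided critical value for level alpha."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    return math.ceil((z_crit * sigma / effect) ** 2)

n_small = observations_needed(0.01)   # effect of 1% of a standard deviation
n_tiny = observations_needed(0.001)   # effect of 0.1% of a standard deviation
print(f"effect 0.01 sd:  n = {n_small}")
print(f"effect 0.001 sd: n = {n_tiny}")
```

However small the true effect, the required n is finite, so a significant result by itself says little about whether the effect is large enough to matter; an estimate of effect size is needed for that.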
42
Hypothesis testing reviewed…
Assessing commonalities
– variance as a yardstick
– variability itself interesting ?
When is testing appropriate?
– when need specificity (…sensitivity?)
– small samples (need reliability)
  PET/SPECT
  multi-subject random effects
Augment significance with estimates of effect size
– estimate of effect size with confidence (confidence surfaces)
Alternatives
– modelling / characterisation (as opposed to imposition of a model for testing)
43
Multiple Comparisons, & Random Field Theory
Worsley KJ, Marrett S, Neelin P, Evans AC (1992) “A three-dimensional statistical analysis for CBF activation studies in human brain” Journal of Cerebral Blood Flow and Metabolism 12:900-918
Worsley KJ, Marrett S, Neelin P, Vandal AC, Friston KJ, Evans AC (1995) “A unified statistical approach for determining significant signals in images of cerebral activation” Human Brain Mapping 4:58-73
Friston KJ, Worsley KJ, Frackowiak RSJ, Mazziotta JC, Evans AC (1994) “Assessing the Significance of Focal Activations Using their Spatial Extent” Human Brain Mapping 1:214-220
Cao J (1999) “The size of the connected components of excursion sets of χ², t and F fields” Advances in Applied Probability (in press)
Worsley KJ, Marrett S, Neelin P, Evans AC (1995) “Searching scale space for activation in PET images” Human Brain Mapping 4:74-90
Worsley KJ, Poline J-B, Vandal AC, Friston KJ (1995) “Tests for distributed, non-focal brain activations” NeuroImage 2:183-194
Friston KJ, Holmes AP, Poline J-B, Price CJ, Frith CD (1996) “Detecting Activations in PET and fMRI: Levels of Inference and Power” NeuroImage 4:223-235
Ch5, Ch4
45
index
– overview
– multiple comparisons
– random field theory
– random effects
– hypothesis testing fallacy