1 Statistics Part II John VanMeter, Ph.D. Center for Functional and Molecular Imaging Georgetown University Medical Center

2 Multiple Comparisons Problem The p-value gives the likelihood that an observed change in the MRI signal could have occurred by chance rather than because of the task. If p = 0.05, there is a 5% chance that we wrongly classify a voxel as active. If we look at 100 voxels, then on average 5 of them will appear significant by chance alone.

3 Multiple Comparisons Problem A voxel-by-voxel analysis requires over 100,000 t-tests to be performed. A correction must be applied; otherwise ~5,000 voxels that are unrelated to the task will appear to be activated by chance. The number of t-tests can be reduced if we know where to test - such as only inside the brain or in specific regions.

4 Multiple Comparisons N can be over 100,000 in fMRI data: 4,758 brain voxels in this slice alone, 326,052 brain voxels in the whole volume, and 16,303 false positives expected with α = 0.05.

5 Thresholding Without Correction The area under the standard normal curve from Z = 1.64 to ∞ is ≈ 0.05, so voxels with Z ≥ 1.64 survive the uncorrected threshold.
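A minimal sketch of the arithmetic behind the two slides above, assuming Python with SciPy: the uncorrected one-sided Z threshold for α = 0.05 and the number of false positives expected when that threshold is applied at every one of the 326,052 brain voxels.

```python
from scipy.stats import norm

alpha = 0.05
n_voxels = 326_052                            # brain voxels in the example volume

z_uncorrected = norm.isf(alpha)               # one-sided threshold, ~1.64
expected_false_positives = alpha * n_voxels   # ~16,303 voxels by chance alone

print(f"Uncorrected Z threshold: {z_uncorrected:.2f}")
print(f"Expected false positives: {expected_false_positives:.0f}")
```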

6 Multiple Comparisons: Correction Approaches Wavelet Methods, (Gaussian) Random Field Theory Methods, Bonferroni, False Discovery Rate (FDR), Familywise Error Rate (FWE), Permutation / Randomization Tests, Resel, Unified P, Scale Space Search, etc.

7 Family-Wise Error (FWE) Family-wise null hypothesis: no activation in any voxel. If we reject the null hypothesis at any voxel, we reject the family-wise null hypothesis. Thus, an α = 0.05 means there is a 5% chance of finding a statistical map with at least 1 activated voxel when the family-wise null is true. A false positive anywhere in the image gives a family-wise error. Family-Wise Error rate = ‘corrected’ p-value.

8 Multiple Comparisons: Correction Approaches Wavelet Methods, (Gaussian) Random Field Theory Methods, Bonferroni, False Discovery Rate (FDR), Familywise Error Rate (FWE), Permutation / Randomization Tests, Resel, Unified P, Scale Space Search, etc.

9 Bonferroni Correction α_v ≈ α_d / N, where N = # of tests, α_d = desired overall p-value, and α_v = the p-value to use on each test. Carlo Emilio Bonferroni, 1892-1960.

10 Example: 2 t-tests N = 2 t-tests. We want the chance of a false positive across the 2 t-tests to be only α_d = 0.05. α_v ≈ α_d / N = 0.05 / 2 = 0.025. So, we should use a p-level of 0.025 on each of the 2 t-tests.

11 Problems with Bonferroni? See Perneger TV, What’s wrong with Bonferroni adjustments, British Medical Journal 316:1236-1238 (1998). “What tests should be included?” But see also Bender R, Lange S, Multiple test procedures other than Bonferroni's deserve wider use, British Medical Journal 318(7183):600-1 (1999).

12 Bonferroni in fMRI? Bonferroni is typically overly conservative - very little survives in most cases - because spatial autocorrelations mean neighboring voxels are not independent tests.

13 Bonferroni Correction Example # brain voxels = 326,052; α_d = 0.05; α_v = 1.5 x 10^-7; Critical Threshold = 5.1.
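A minimal sketch of this Bonferroni example, assuming Python with SciPy.

```python
from scipy.stats import norm

alpha_d = 0.05                 # desired family-wise alpha
n_voxels = 326_052             # number of tests (brain voxels)

alpha_v = alpha_d / n_voxels   # per-voxel alpha, ~1.5e-7
z_crit = norm.isf(alpha_v)     # one-sided critical threshold, ~5.1

print(f"alpha_v = {alpha_v:.2e}, critical Z = {z_crit:.2f}")
```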

14 Multiple Comparisons: Correction Approaches Wavelet Methods, (Gaussian) Random Field Theory Methods, Bonferroni, False Discovery Rate (FDR), Familywise Error Rate (FWE), Permutation / Randomization Tests, Resel, Unified P, Scale Space Search, etc.

15 3D Resolvable Element (“Resel”) V_resel = FWHM_x * FWHM_y * FWHM_z (estimated from the data). N_resel = N / V_resel, where N = total number of voxels.

16 Resel Correction Instead of dividing α_d by the number of voxels, divide by the number of resels: α_v = α_d / N_resel. N_resel = 5094 gives α_v = 9.8 x 10^-6.
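A minimal sketch of the resel correction, assuming Python; the 4-voxel FWHM in each direction is an illustrative assumption chosen so that 326,052 voxels divided by 64 voxels per resel reproduces roughly the ~5,094 resels quoted above.

```python
n_voxels = 326_052
fwhm_x = fwhm_y = fwhm_z = 4.0         # assumed smoothness, in voxel units
v_resel = fwhm_x * fwhm_y * fwhm_z     # voxels per resel (64 here)
n_resel = n_voxels / v_resel           # ~5,094 resels

alpha_d = 0.05
alpha_v = alpha_d / n_resel            # ~9.8e-6 per-test alpha

print(f"N_resel = {n_resel:.0f}, alpha_v = {alpha_v:.1e}")
```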

17 Multiple Comparisons: Correction Approaches Wavelet Methods, (Gaussian) Random Field Theory Methods, Bonferroni, False Discovery Rate (FDR), Familywise Error Rate (FWE), Permutation / Randomization Tests, Resel, Unified P, Scale Space Search, etc.

18 Random Field Theory (RFT) Random field theory is a method for estimating the multiple comparisons correction. Provides a method to determine the statistical height threshold while controlling the FWE rate. t-maps (or Z, F, and χ² maps) are modeled as realizations of a random process. Takes into account the number of voxels searched AND the smoothness.

19 Random Field Theory Treat the statistical image as a discretisation (voxels) of a continuous random field. The topology of the random field defines the error rate for a given height threshold.

20 Topology of Random Fields The Euler Characteristic (EC) summarizes the topology of a thresholded random field. In neuroimaging, the number of blobs (aka clusters) that remain after thresholding at a particular p-value, minus any holes/hollows, defines the EC.

21 Euler Characteristic (EC) Topological measure: threshold an image at height u; EC = # blobs - # holes/hollows.

22 Resels, RFT, EC, and α The expected EC of a random field solves the thresholding and multiple comparisons problem: the expected EC, E[EC], is the expected number of clusters above a threshold u, and the corrected probability α equals E[EC]. For the 2D case: α = E[EC] = R (4 ln 2)(2π)^(-3/2) u exp(-u²/2), where R = # of resels (recall this depends on smoothness) and u = the voxel-wise threshold. Work backwards to determine the u needed for a corrected threshold α based on R resels.

23 Example – 2D Gaussian Images α = R (4 ln 2)(2π)^(-3/2) u exp(-u²/2). For R = 100 and α = 0.05, RFT gives u = 3.8.
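A minimal sketch, assuming Python with SciPy, that numerically inverts the 2D expected-EC formula above to recover the height threshold u for R = 100 resels and α = 0.05.

```python
import numpy as np
from scipy.optimize import brentq

def expected_ec_2d(u, resels):
    """E[EC] for a 2D Gaussian random field thresholded at height u (formula above)."""
    return resels * (4 * np.log(2)) * (2 * np.pi) ** -1.5 * u * np.exp(-u ** 2 / 2)

R, alpha = 100, 0.05
# Solve E[EC](u) = alpha on the decreasing part of the curve (u > 1).
u = brentq(lambda x: expected_ec_2d(x, R) - alpha, 1.0, 10.0)
print(f"RFT height threshold u = {u:.2f}")   # ~3.8, matching the slide
```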

24 RFT Assumptions The entire image is multivariate Gaussian or derived from such. The discretely sampled statistical image is sufficiently smooth to approximate a continuous random field (smoothing at 3x the voxel size generally ensures this). The spatial autocorrelation function must have two derivatives at the origin. The data must be stationary. Requires a highly accurate smoothness estimate.

25 RFT Advantages Typically less stringent than Bonferroni, though not always. Adapts to the volume searched (number of voxels) and the smoothness. Computationally simple. Also applicable to joint inference on peak height and cluster size.

26 Random Field Theory Limitations Requires sufficient smoothness - FWHM 2-4 times the voxel size, more like ~10 times for low-d.f. data. The smoothness estimate is biased when images aren't sufficiently smooth. Multivariate normality assumption - difficult to check. Several layers of approximation.

27 Multiple Comparisons: Correction Approaches Wavelet Methods, (Gaussian) Random Field Theory Methods, Bonferroni, False Discovery Rate (FDR), Familywise Error Rate (FWE), Permutation / Randomization Tests, Resel, Unified P, Scale Space Search, etc.

28 Permutation Testing Non-parametric method. Standard (parametric) methods use theory to determine the null distribution - the p-value is the area under the curve. Nonparametric methods use the data itself to build an empirical null distribution - the p-value is the proportion of the null distribution at least as extreme as the observed statistic. Null hypothesis: “Each scan would have been the same whatever the condition, A or B.”

29 Basics of Permutation Test Permutation test approach: compute the statistic t for the real data; shuffle the labels and recompute the statistic, calling it t_j; repeat many times; the p-value is the proportion of t_j's greater than or equal to t.

30 Simplistic Example Data (labels A B A B A B): 103.00 90.48 99.93 87.83 99.76 96.06. t-Statistic = 9.44. Permute the labelings and recompute the statistic: AAABBB 4.81, ABABAB 9.44, BAAABB -1.49, BABBAA -6.86, AABABB -3.25, ABABBA 6.97, BAABAB 1.09, BBAAAB 3.14, AABBAB -0.67, ABBAAB 1.37, BAABBA -1.37, BBAABA 0.67, AABBBA -3.14, ABBABA -1.09, BABAAB -6.97, BBABAA 3.25, ABAABB 6.86, ABBBAA 1.49, BABABA -9.44, BBBAAA -4.81. Significance: only 1 of the 20 labelings has a statistic ≥ t, so the p-value is 1/20 = 0.05.
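A minimal sketch of the permutation logic on the toy data above, assuming Python with NumPy and using the difference of group means as the test statistic (the transcript does not make the exact statistic explicit, so treat that choice as an assumption); it enumerates all 20 labelings and recovers p = 1/20.

```python
from itertools import combinations
import numpy as np

values = np.array([103.00, 90.48, 99.93, 87.83, 99.76, 96.06])
observed_a = (0, 2, 4)                       # positions labelled A in the real experiment

def stat(a_idx):
    """Difference of group means for a given set of A positions (assumed statistic)."""
    a_mask = np.zeros(len(values), dtype=bool)
    a_mask[list(a_idx)] = True
    return values[a_mask].mean() - values[~a_mask].mean()

t_obs = stat(observed_a)                     # 9.44 for the true labeling

# Enumerate all C(6,3) = 20 ways of assigning three A labels.
null = np.array([stat(idx) for idx in combinations(range(len(values)), 3)])
p_value = np.mean(null >= t_obs)             # proportion of labelings with statistic >= observed
print(f"observed = {t_obs:.2f}, p = {p_value:.3f}")   # p = 1/20 = 0.05
```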

31 [Figure: example maps of the mean difference, variance, smoothed variance, t-statistic, and “pseudo” t-statistic.]

32 Multi-subject Comparison of Permutation and RFT

33 Permutation Test Assumptions and Disadvantages The only assumption required is exchangeability under H0. Not valid for single-subject fMRI because the data are temporally autocorrelated. Valid for multi-subject fMRI analyses (random effects). Main disadvantage: computationally intensive! Implemented in the SnPM toolbox.

34 Multiple Comparisons: Correction Approaches Wavelet Methods, (Gaussian) Random Field Theory Methods, Bonferroni, False Discovery Rate (FDR), Familywise Error Rate (FWE), Permutation / Randomization Tests, Resel, Unified P, Scale Space Search, etc.

35 False Discovery Rate (FDR) FDR: control the proportion of false positives among the rejected tests. Outcomes of V tests:

              Accept H0   Reject H0   Total
H0 is True      V_0A        V_0R
H0 is False     V_1A        V_1R
Total           N_A         N_R         V

Observed FDR: obsFDR = V_0R / (V_1R + V_0R) = V_0R / N_R. We only know N_R, not how many of those rejections are true or false, so control is placed on the expected FDR: FDR = E(obsFDR).

36 Benjamini & Hochberg Procedure Select the desired limit q on FDR. Order the p-values of all voxels: p(1) ≤ p(2) ≤ ... ≤ p(V). Find the largest i such that p(i) ≤ (i/V) · q/c(V); usually c(V) = 1 under the right circumstances. Accept (declare active) all voxels corresponding to p(1), ..., p(i). [Figure: ordered p-values p(i) plotted against i/V, with the line i/V · q/c(V).] Nichols, Holmes (2002) HBM 15:1-25.
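A minimal sketch of the step-up procedure above, assuming Python with NumPy, c(V) = 1, and synthetic toy p-values.

```python
import numpy as np

def bh_threshold(p_values, q=0.05):
    """Return the largest p-value threshold satisfying p(i) <= (i/V) * q."""
    p = np.sort(np.asarray(p_values).ravel())
    V = p.size
    i = np.arange(1, V + 1)
    below = p <= (i / V) * q
    if not below.any():
        return 0.0                       # nothing survives
    return p[below.nonzero()[0].max()]   # p(i) for the largest i meeting the criterion

# Toy example: declare active every voxel at or below the adaptive threshold.
rng = np.random.default_rng(0)
p_map = rng.uniform(size=1000) ** 3      # toy p-values skewed toward small (signal-like) values
thr = bh_threshold(p_map, q=0.05)
n_sig = (p_map <= thr).sum()
print(f"FDR threshold = {thr:.4g}, voxels declared active = {n_sig}")
```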

37 Controlling FDR: Varying Signal Extent. Signal Intensity 3.0, Signal Extent 1.0, Noise Smoothness 3.0; p = -, z = -

38 Controlling FDR: Varying Signal Extent. Signal Intensity 3.0, Signal Extent 2.0, Noise Smoothness 3.0; p = -, z = -

39 Controlling FDR: Varying Signal Extent. Signal Intensity 3.0, Signal Extent 3.0, Noise Smoothness 3.0; p = -, z = -

40 Controlling FDR: Varying Signal Extent. Signal Intensity 3.0, Signal Extent 5.0, Noise Smoothness 3.0; p = 0.000252, z = 3.48

41 Controlling FDR: Varying Signal Extent. Signal Intensity 3.0, Signal Extent 9.5, Noise Smoothness 3.0; p = 0.001628, z = 2.94

42 Controlling FDR: Varying Signal Extent. Signal Intensity 3.0, Signal Extent 16.5, Noise Smoothness 3.0; p = 0.007157, z = 2.45

43 Controlling FDR: Varying Signal Extent. Signal Intensity 3.0, Signal Extent 25.0, Noise Smoothness 3.0; p = 0.019274, z = 2.07

44 Controlling FDR: Varying Noise Smoothness. Signal Intensity 3.0, Signal Extent 5.0, Noise Smoothness 0.0; p = 0.000132, z = 3.65

45 Controlling FDR: Varying Noise Smoothness. Signal Intensity 3.0, Signal Extent 5.0, Noise Smoothness 1.5; p = 0.000169, z = 3.58

46 Controlling FDR: Varying Noise Smoothness. Signal Intensity 3.0, Signal Extent 5.0, Noise Smoothness 2.0; p = 0.000167, z = 3.59

47 Controlling FDR: Varying Noise Smoothness. Signal Intensity 3.0, Signal Extent 5.0, Noise Smoothness 3.0; p = 0.000252, z = 3.48

48 Controlling FDR: Varying Noise Smoothness. Signal Intensity 3.0, Signal Extent 5.0, Noise Smoothness 4.0; p = 0.000253, z = 3.48

49 Controlling FDR: Varying Noise Smoothness. Signal Intensity 3.0, Signal Extent 5.0, Noise Smoothness 5.5; p = 0.000271, z = 3.46

50 Controlling FDR: Varying Noise Smoothness. Signal Intensity 3.0, Signal Extent 5.0, Noise Smoothness 7.5; p = 0.000274, z = 3.46

51 FWE vs FDR: Working Memory Example FDR threshold = 3.83 (3,073 voxels); FWE permutation threshold = 7.67 (58 voxels).

52 FDR Properties Adaptive - the larger the signal, the lower the threshold and the more false positives. False positives are constant as a fraction of the rejected tests - not much of a problem with imaging's sparse signals. Smoothing the data is very helpful - smoothing introduces positive correlations.

53 Levels of Inference Set-level: P(c ≥ 3 | n ≥ 12, u ≥ 3.09) = 0.019 (at least 3 clusters above threshold). Cluster-level: P(c ≥ 1 | n ≥ 82, t ≥ 3.09) = 0.029, corrected (at least one cluster with at least 82 voxels above threshold). Voxel-level: P(c ≥ 1 | n > 0, t ≥ 4.37) = 0.048, corrected (at least one cluster with an unspecified number of voxels above threshold). [Figure: example clusters of n = 82, 32, and 12 voxels.]

54 SPM Statistics Report FWE-corr is the random field theory corrected p-value at the voxel level FDR-corr is the false discovery rate corrected p-value at the voxel level Cluster-level corrected p-value is RFT using both cluster size and maximum t-statistic

55 Group Analyses Single Subject #1, Single Subject #2, ..., Single Subject #N → Single-Subject Analysis for each → Group Analysis → Final Assessment of Significance (Resel, FWE, FDR, etc.)

56 Within-Subject Analysis Observed fMRI time series; the fitted boxcar gives the signal estimate S; the residual (Observed minus Fitted Boxcar) gives the between-scan variability SE; t = S / SE.
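A minimal sketch of this within-subject fit, assuming Python with NumPy and a synthetic voxel time series (the block lengths, amplitude, and noise level here are illustrative assumptions, not values from the lecture).

```python
import numpy as np

rng = np.random.default_rng(1)
n_scans = 100
boxcar = np.tile(np.r_[np.zeros(10), np.ones(10)], n_scans // 20)   # off/on blocks of 10 scans
y = 100 + 2.0 * boxcar + rng.normal(scale=1.5, size=n_scans)        # synthetic voxel time series

X = np.column_stack([boxcar, np.ones(n_scans)])                     # design matrix: [boxcar, intercept]
beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (n_scans - X.shape[1])                     # between-scan (residual) variance
cov = sigma2 * np.linalg.inv(X.T @ X)

S = beta[0]                    # fitted boxcar amplitude
SE = np.sqrt(cov[0, 0])        # its standard error
print(f"t = S/SE = {S / SE:.2f}")
```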

57 Pool Data Across Subjects? Perform a single-subject analysis on each subject, then show the results for each subject - basically, present the data as a collection of case reports. This does not provide any statistical significance for the group result. If 8 out of 10 subjects activate in an area - is that significant?

58 Giant GLM? Subject 1, Subject 2, ..., Subject N → across-subject General Linear Model → Statistical Map → Thresholding and Overlays → Final Significance

59 Fixed Effects Analysis Fitted boxcar for Subject 1, Subject 2, Subject 3: 100 scans each → 300 scans in a single multiple regression → df = 296??

60 Fixed Effects Analysis [Figure: fitted boxcar for Subjects 1-3, showing the between-scan variability within each subject and the between-subject variability across subjects.]

61 “Combining Brains” Subject 1, Subject 2, ..., Subject N → within-subject analysis creates map 1, map 2, ..., map N → combine maps → Combined Map → Thresholding and Overlays → Final Significance

62 Random Effects [Figure: fitted boxcar for Subjects 1-3; the per-subject signal estimates S_1, S_2, S_3 and the between-subject variability across them.]

63 Random Effects Friston et al., Neuroimage, 10:385-396 (1999). Friston et al., Neuroimage, 10:1-5 (1999).

64 1st-Level (“Fixed Effects”) Analysis Single-subject input: all EPI images (after alignment, smoothing, etc.) → statistical test → beta weight map and con(trast) map (statistical maps)

65 1st-Level (“Fixed Effects”) Analysis Motion correction, spatial low-pass filtering, spatial normalization, temporal high-pass filtering → within-subject contrasts → spmT_*.img (t-maps), con_*.img (contrast maps), beta_*.img (parameter estimate maps)

66 2nd-Level (“Random Effects”) Analysis Con map Subj 1, Con map Subj 2, ..., Con map Subj N → statistical test → statistical map

67 2nd-Level (“Random Effects”) Between-Groups Analysis Control group con images and test group con images → t-test or ANOVA → maps showing the group contrast or the group-by-task effect

68 “Mixed” Effects Analysis Subject 1, Subject 2, ..., Subject N → first-level fixed-effects analysis → con image(s) per subject → second-level random-effects analysis (statistical analysis, e.g., t-test or regression) → statistical maps

69 Random Effects Summary Pool data across subjects Inference generalizes to the population level Both inter-subject and inter-scan variability properly accounted for
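A minimal sketch of a second-level (random effects) test, assuming Python with NumPy/SciPy and synthetic per-subject contrast values: one number per subject per voxel, tested against zero so that between-subject variability supplies the error term.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_subjects, n_voxels = 12, 5000
# con[i, v] = subject i's first-level contrast estimate at voxel v (synthetic here).
con = rng.normal(loc=0.0, scale=1.0, size=(n_subjects, n_voxels))
con[:, :50] += 1.0                                  # a small patch with a true group effect

t_map, p_map = stats.ttest_1samp(con, popmean=0.0, axis=0)
print(f"max t = {t_map.max():.2f}, voxels with p < 0.001 (uncorrected) = {(p_map < 0.001).sum()}")
```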

70 Alternatives to Mixed Effects Analysis Conjunction Analysis “Stouffer’s Method” Average T Et cetera…

71 Worsley-Friston Conjunction Analysis (SPM99) Friston et al., Neuroimage, 10:385-396 (1999). Worsley and Friston, Statistics and Probability Letters, 47:135-140 (2000). [Figure: the per-subject Z-maps are reduced to a minimum Z-map (the least significant voxel values), which is then thresholded, e.g. > 2.0 (p < 0.05).]
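A minimal sketch of the minimum-Z conjunction idea, assuming Python with NumPy and synthetic Z-maps; the map sizes and embedded signal are illustrative assumptions, while the > 2.0 threshold follows the slide's example.

```python
import numpy as np

rng = np.random.default_rng(3)
z_maps = rng.normal(size=(5, 64, 64, 30))        # 5 subjects' Z-maps (synthetic)
z_maps[:, 30:34, 30:34, 14:16] += 3.0            # a region active in every subject

min_z = z_maps.min(axis=0)                       # least significant subject at each voxel
conjunction = min_z > 2.0                        # e.g., Z > 2.0 (p < 0.05) in every subject
print(f"voxels in the conjunction: {conjunction.sum()}")
```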

72 Inclusive Masking [Figure: each subject's Z-map is thresholded, e.g. > 2.0 (p < ?), and the thresholded maps are combined with a logical AND.]

73 Stouffer’s Method Combined Z Map = (Σ Z_i) / √N over the subject Z-maps (Z map 1, Z map 2, ..., Z map N). See: Vaina et al., Functional neuroanatomy of biological motion perception in humans, PNAS 98(20):11656-61. Bosch V, Statistical analysis of multi-subject fMRI data: assessment of focal activations, J Magn Reson Imaging. 2000 Jan;11(1):61-4.
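A minimal sketch of Stouffer's combination as written above, assuming Python with NumPy and synthetic per-subject Z-maps (sizes and signal are illustrative).

```python
import numpy as np

rng = np.random.default_rng(4)
n_subjects = 10
z_maps = rng.normal(size=(n_subjects, 64, 64, 30))   # per-subject Z-maps (synthetic)
z_maps[:, 30:34, 30:34, 14:16] += 1.0                # modest effect shared across subjects

group_z = z_maps.sum(axis=0) / np.sqrt(n_subjects)   # still ~N(0,1) under the null
print(f"max group Z = {group_z.max():.2f}")
```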

74 Linear Sum of Normally Distributed Numbers If L is the sum of X 1, X 2, …, X N, where the X i are normally distributed with variance 1 and are uncorrelated, then var{L} = N, and std{L} is sqrt(N). Snedecor GW and Cochran WG, Statistical Methods, 8th ed., Ames:Iowa State University Press (1989), pp. 190-191.

75 Average T Method (Σ T_i) / √N ≈ a Z map, computed over the subject t-maps (t map 1, t map 2, ..., t map N). Compare: Bosch V, Statistical analysis of multi-subject fMRI data: assessment of focal activations, J Magn Reson Imaging. 2000 Jan;11(1):61-4.

76 Data Driven Analysis Methods All of the statistical approaches described so far use a model of our expectation of the activity (e.g., the on-off periodicity of our block design). Data-driven methods make no such assumptions and instead use automated techniques to find changes of interest: Fourier Analysis, Independent Component Analysis (ICA), Partial Least Squares (PLS), etc.

77 Fourier Analysis Examines the temporal structure of the fMRI data. Analyzes the fMRI data in the frequency domain, which represents the data as power at each frequency component. Note this has nothing to do with image reconstruction or k-space. Advantage: makes no assumptions about the data. Disadvantage: can only be used with block-design paradigms.
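A minimal sketch of a frequency-domain check of a block design, assuming Python with NumPy; the TR, block length, and signal size are illustrative assumptions.

```python
import numpy as np

TR = 2.0                                     # seconds per scan (assumed)
n_scans = 160
block = np.tile(np.r_[np.zeros(10), np.ones(10)], n_scans // 20)    # 20-scan (40 s) on/off cycle

rng = np.random.default_rng(5)
voxel = 100 + 1.5 * block + rng.normal(scale=1.0, size=n_scans)     # synthetic time series

power = np.abs(np.fft.rfft(voxel - voxel.mean())) ** 2              # power spectrum
freqs = np.fft.rfftfreq(n_scans, d=TR)
task_freq = 1.0 / (20 * TR)                  # one on/off cycle every 20 scans = 0.025 Hz
peak = freqs[np.argmax(power)]
print(f"task frequency = {task_freq:.3f} Hz, spectral peak at {peak:.3f} Hz")
```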

78 Fourier Analysis Example

79 Independent Components Analysis (ICA) Model-free (i.e., data-driven) technique. A computational method to decompose the data into subcomponents that are mutually independent; when summed, they reproduce the original signal. All voxels that match a component's pattern are picked up regardless of spatial location.
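A minimal sketch of spatial ICA on a synthetic (time x voxel) matrix, assuming Python with scikit-learn's FastICA; the component count, data sizes, and embedded signal are illustrative assumptions, and voxels are treated as samples so the recovered sources are spatial maps with one time course per map.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(6)
n_scans, n_voxels = 120, 2000
data = rng.normal(size=(n_scans, n_voxels))                 # background noise
task = np.tile(np.r_[np.zeros(10), np.ones(10)], 6)         # hidden block-like time course
data[:, 100:150] += np.outer(task, np.ones(50)) * 2.0       # voxels carrying that signal

ica = FastICA(n_components=10, random_state=0)
spatial_maps = ica.fit_transform(data.T)     # (voxels x components): independent spatial maps
time_courses = ica.mixing_                   # (scans x components): one time course per map

# The component whose time course tracks the task is found without specifying a model.
corr = [abs(np.corrcoef(task, tc)[0, 1]) for tc in time_courses.T]
print(f"best component/task correlation: {max(corr):.2f}")
```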

