
1 Advanced Statistics Inference Methods & Issues: Multiple Testing, Nonparametrics, Conjunctions & Bayes Thomas Nichols, Ph.D. Department of Biostatistics University of Michigan http://www.sph.umich.edu/~nichols OHBM fMRI Course June 12, 2005 NIH Neuroinformatics / Human Brain Project

2 © 2005 Thomas Nichols 2 Overview Multiple Testing Problem –Which of my 100,000 voxels are “active”? Nonparametric Inference –Can I trust my P-value at this voxel? Conjunction Inference –Is this voxel “active” in all of my tasks? Bayesian vs. Classical Inference –What in the world is a posterior probability?

3 © 2005 Thomas Nichols 3 Overview Multiple Testing Problem –Which of my 100,000 voxels are “active”? Nonparametric Inference –Can I trust my P-value at this voxel? Conjunction Inference –Is this voxel “active” in all of my tasks? Bayesian vs. Classical Inference –What in the world is a posterior probability?

4 © 2005 Thomas Nichols 4 Hypothesis Testing Null Hypothesis H0 Test statistic T –t: observed realization of T α level –Acceptable false positive rate –Level α = P( T > uα | H0 ) –Threshold uα controls false positive rate at level α P-value –Assessment of t assuming H0 –P( T > t | H0 ): probability of obtaining a statistic as large or larger in a new experiment –P(Data|Null), not P(Null|Data) [Figure: null distribution of T, with threshold uα and the P-value as the shaded upper tail beyond t]
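The threshold-and-P-value relationship on this slide can be sketched numerically. A minimal sketch, assuming (for illustration only; the slide does not fix a null) that T follows a standard normal under H0:

```python
from statistics import NormalDist

# Illustrative assumption: T ~ N(0, 1) under the null hypothesis H0.
null = NormalDist(mu=0.0, sigma=1.0)

def threshold(alpha):
    """u_alpha such that P(T > u_alpha | H0) = alpha."""
    return null.inv_cdf(1.0 - alpha)

def p_value(t):
    """P(T > t | H0): chance of a statistic this large or larger."""
    return 1.0 - null.cdf(t)

u = threshold(0.05)   # about 1.64
p = p_value(2.0)      # about 0.023
```

By construction, `p_value(threshold(alpha))` recovers `alpha`: the threshold and the P-value are two views of the same tail probability.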

5 © 2005 Thomas Nichols 5 Hypothesis Testing in fMRI Massively Univariate Modeling –Fit model at each voxel –Create statistic images of effect Then find the signal in the image...

6 © 2005 Thomas Nichols 6 Assessing Statistic Images Where's the signal? Low threshold (t > 0.5): good power, poor specificity (risk of false positives). Medium threshold (t > 3.5). High threshold (t > 5.5): good specificity, poor power (risk of false negatives)...but why threshold?!

7 © 2005 Thomas Nichols 7 Blue-sky inference: What we'd like Don't threshold, model the signal! –Signal location? Estimates and CI's on (x,y,z) location –Signal magnitude? CI's on % change –Spatial extent? Estimates and CI's on activation volume Robust to choice of cluster definition...but this requires an explicit spatial model

8 © 2005 Thomas Nichols 8 Blue-sky inference: What we need Need an explicit spatial model No routine spatial modeling methods exist –High-dimensional mixture modeling problem –Activations don’t look like Gaussian blobs –Need realistic shapes, sparse representation Some work by Hartvig et al., Penny et al.

9 © 2005 Thomas Nichols 9 Real-life inference: What we get Signal location –Local maximum – no inference –Center-of-mass – no inference Sensitive to blob-defining-threshold Signal magnitude –Local maximum intensity – P-values (& CI’s) Spatial extent –Cluster volume – P-value, no CI’s Sensitive to blob-defining-threshold

10 © 2005 Thomas Nichols 10 Voxel-level Inference Retain voxels above α-level threshold uα Gives best spatial specificity –The null hyp. at a single voxel can be rejected [Figure: statistic image over space with threshold uα, marking significant vs. no significant voxels]

11 © 2005 Thomas Nichols 11 Cluster-level Inference Two-step process –Define clusters by arbitrary threshold u_clus –Retain clusters larger than α-level size threshold kα [Figure: clusters above u_clus; one smaller than kα (not significant), one larger (significant)]

12 © 2005 Thomas Nichols 12 Cluster-level Inference Typically better sensitivity Worse spatial specificity –The null hyp. of the entire cluster is rejected –Only means that one or more of the voxels in the cluster are active [Figure: same cluster illustration as previous slide]

13 © 2005 Thomas Nichols 13 Set-level Inference Count number of blobs c –Minimum blob size k Worst spatial specificity –Can only reject the global null hypothesis [Figure: here c = 1; only 1 cluster larger than k]

14 © 2005 Thomas Nichols 14 Voxel-wise Inference & Multiple Testing Problem (MTP) Standard Hypothesis Test –Controls Type I error of each test, at say 5% –But what if I have 100,000 voxels? 5,000 false positives on average! Must control false positive rate –What false positive rate? –Chance of 1 or more Type I errors? –Proportion of Type I errors?

15 © 2005 Thomas Nichols 15 MTP Solutions: Measuring False Positives Familywise Error Rate (FWER) –Familywise Error Existence of one or more false positives –FWER is probability of familywise error False Discovery Rate (FDR) –R voxels declared active, V falsely so Observed false discovery rate: V/R –FDR = E(V/R)

16 © 2005 Thomas Nichols 16 FWER MTP Solutions Bonferroni Maximum Distribution Methods –Random Field Theory –Permutation

17 © 2005 Thomas Nichols 17 FWER MTP Solutions: Bonferroni V voxels to test Corrected Threshold –Threshold corresponding to α = 0.05/V Corrected P-value –min{ P-value × V, 1 }
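Both halves of the Bonferroni slide, the corrected threshold and the corrected P-value, are one-liners:

```python
def bonferroni_threshold(alpha, V):
    """Per-voxel alpha that controls FWER at level alpha over V tests."""
    return alpha / V

def bonferroni_corrected_p(p, V):
    """Corrected P-value: min{ p * V, 1 }."""
    return min(p * V, 1.0)

# 100,000 voxels at FWER 0.05 -> per-voxel alpha of 5e-7
a = bonferroni_threshold(0.05, 100_000)
```

At 100,000 voxels the per-voxel level becomes extremely strict, which is why the maximum-distribution methods that follow (RFT, permutation) can be less conservative.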

18 © 2005 Thomas Nichols 18 FWER MTP Solutions Bonferroni Maximum Distribution Methods –Random Field Theory –Permutation

19 © 2005 Thomas Nichols 19 FWER MTP Solutions: Controlling FWER w/ Max FWER & distribution of maximum FWER = P(FWE) = P(one or more voxels ≥ u | H0) = P(max voxel ≥ u | H0) The 100(1−α)%ile of the max distribution controls FWER: FWER = P(max voxel ≥ uα | H0) ≤ α

20 © 2005 Thomas Nichols 20 FWER MTP Solutions Bonferroni Maximum Distribution Methods –Random Field Theory –Permutation

21 © 2005 Thomas Nichols 21 FWER MTP Solutions: Random Field Theory Euler Characteristic χu –Topological measure: #blobs − #holes –At high thresholds, just counts blobs (no holes, never more than 1 blob) –FWER = P(max voxel ≥ u | H0) = P(one or more blobs | H0) ≈ P(χu ≥ 1 | H0) ≈ E(χu | H0) [Figure: random field, threshold, suprathreshold sets]

22 © 2005 Thomas Nichols 22 RFT Details: Expected Euler Characteristic E(χu) ≈ λ(Ω) |Λ|^(1/2) (u²−1) exp(−u²/2) / (2π)² –Ω: search region, Ω ⊂ R³ –λ(Ω): volume –|Λ|^(1/2): roughness Assumptions –Multivariate Normal –Stationary* –ACF twice differentiable at 0 (*Stationarity: results valid without stationarity; more accurate when it holds) Only the very upper tail approximates 1−Fmax(u)

23 © 2005 Thomas Nichols 23 Random Field Theory Smoothness Parameterization E(χu) depends on |Λ|^(1/2) –Λ: roughness matrix Smoothness parameterized as Full Width at Half Maximum –FWHM of the Gaussian kernel needed to smooth a white-noise random field to roughness Λ [Figure: autocorrelation function and its FWHM]

24 © 2005 Thomas Nichols 24 Random Field Theory Smoothness Parameterization Instead of roughness |Λ|^(1/2), parameterize smoothness: Full Width at Half Maximum –FWHM of the Gaussian kernel needed to smooth a white-noise random field to roughness Λ RESELs – Resolution Elements –1 RESEL = FWHMx × FWHMy × FWHMz –RESEL count R = λ(Ω) |Λ|^(1/2) (4 log 2)^(−3/2), i.e. the volume of the search region in units of smoothness –The only data-dependent part of E(χu) –E.g.: 10 voxels, 2.5-voxel FWHM smoothness → 4 RESELs –RESELs are not the number of independent 'things' in the image
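The RESEL count amounts to dividing the search volume by the volume of one smoothness element. A minimal sketch (in practice SPM estimates FWHM from model residuals rather than taking it as given):

```python
import math

def resel_count(n_voxels, voxel_size_mm, fwhm_mm):
    """RESEL count: search volume in units of smoothness.
    1 RESEL = FWHM_x * FWHM_y * FWHM_z."""
    volume = n_voxels * math.prod(voxel_size_mm)   # total search volume, mm^3
    return volume / math.prod(fwhm_mm)             # volume per RESEL

# Slide's 1-D example: 10 voxels at 2.5-voxel FWHM -> 4 RESELs
r1d = 10 / 2.5
```

Note the caveat on the slide: RESELs scale the EC calculation but are not a count of independent tests.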

25 © 2005 Thomas Nichols 25 Random Field Intuition Corrected P-value for voxel value t Pc = P(max T > t) ≈ E(χt) ∝ λ(Ω) |Λ|^(1/2) t² exp(−t²/2) Statistic value t increases –Pc decreases (of course!) Search volume λ(Ω) increases –Pc increases (more severe MTP) Smoothness increases (|Λ|^(1/2) smaller) –Pc decreases (less severe MTP)
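The intuitions on this slide can be checked directly from the expected-EC formula. A sketch using only the highest-order (3-D) EC density for a Gaussian field in RESEL units (full implementations such as SPM also add lower-dimensional boundary terms, omitted here):

```python
import math

def expected_ec(u, resels):
    """Expected Euler characteristic of a 3-D Gaussian random field
    thresholded at u, with the search volume given in RESELs."""
    return (resels * (4 * math.log(2)) ** 1.5 * (2 * math.pi) ** -2
            * (u ** 2 - 1) * math.exp(-u ** 2 / 2))

def rft_corrected_p(t, resels):
    """P_c = P(max T > t) ~= E(EC_t); only valid in the upper tail."""
    return min(expected_ec(t, resels), 1.0)
```

As the slide says: raise t and the corrected P falls; enlarge the search region (more RESELs) and it rises; smooth more (fewer RESELs for the same volume) and it falls.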

26 © 2005 Thomas Nichols 26 Random Field Theory Strengths & Weaknesses Strengths –Closed-form results for E(χu): Z, t, F, Chi-Squared continuous random fields –Results depend only on volume & smoothness Weaknesses –Smoothness assumed known –Sufficient smoothness required: results are for continuous random fields, and the smoothness estimate becomes biased otherwise –Multivariate normality –Several layers of approximations: lattice image data → continuous random field

27 © 2005 Thomas Nichols 27 Real Data fMRI Study of Working Memory –12 subjects, block design Marshuetz et al (2000) –Item Recognition Active: view five letters (e.g. UBKDA), 2s pause, view probe letter (e.g. D), respond yes/no Baseline: view XXXXX, 2s pause, view Y or N, respond Second Level RFX –Difference image, A−B, constructed for each subject –One-sample t test

28 © 2005 Thomas Nichols 28 Real Data: RFT Result Threshold –S = 110,776 voxels –2×2×2 mm voxels, 5.1×5.8×6.9 mm FWHM –u = 9.870 Result –5 voxels above the threshold –0.0063 minimum FWE-corrected p-value [Figure: −log10 p-value map]

29 © 2005 Thomas Nichols 29 MTP Solutions: Measuring False Positives Familywise Error Rate (FWER) –Familywise Error Existence of one or more false positives –FWER is probability of familywise error False Discovery Rate (FDR) –FDR = E(V/R) –R voxels declared active, V falsely so Realized false discovery rate: V/R

30 © 2005 Thomas Nichols 30 False Discovery Rate For any threshold, all voxels can be cross-classified:

                            Accept Null   Reject Null   Total
  Null True (no effect)         V0A           V0R         m0
  Null False (true effect)      V1A           V1R         m1
  Total                          NA            NR          V

Realized FDR: rFDR = V0R/(V1R+V0R) = V0R/NR –If NR = 0, rFDR = 0 But we can only observe NR; we don't know V1R & V0R –So we control the expected rFDR: FDR = E(rFDR)

31 © 2005 Thomas Nichols 31 False Discovery Rate Illustration: Signal Signal+Noise Noise

32 © 2005 Thomas Nichols 32 [Figure: simulated statistic images under three error-control regimes. Control of Per Comparison Rate at 10%: percentage of null pixels that are false positives, roughly 9.5%–12.5% per realization. Control of Familywise Error Rate at 10%: occurrence of a familywise error in about 1 in 10 realizations. Control of False Discovery Rate at 10%: percentage of activated pixels that are false positives, roughly 6.7%–16.2% per realization.]

33 © 2005 Thomas Nichols 33 Benjamini & Hochberg Procedure Select desired limit q on FDR Order p-values, p(1) ≤ p(2) ≤... ≤ p(V) Let r be the largest i such that p(i) ≤ (i/V) × q Reject all hypotheses corresponding to p(1),..., p(r) [Figure: ordered p-values p(i) vs. i/V, with the rejection line of slope q] JRSS-B (1995) 57:289-300
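The step-up rule is short enough to write out. A sketch that also covers the arbitrary-dependence correction discussed on the next slide (replacing q with q/c(V), per Benjamini & Yekutieli):

```python
def bh_reject(pvals, q, arbitrary_dependence=False):
    """Benjamini-Hochberg step-up procedure.
    Returns the set of indices of rejected hypotheses."""
    V = len(pvals)
    if arbitrary_dependence:
        # Benjamini-Yekutieli: q / c(V), c(V) = sum 1/i ~ log(V) + 0.5772
        q = q / sum(1.0 / i for i in range(1, V + 1))
    order = sorted(range(V), key=lambda i: pvals[i])
    r = 0
    for rank, i in enumerate(order, start=1):   # rank = 1..V over sorted p's
        if pvals[i] <= rank / V * q:
            r = rank                            # largest rank satisfying the rule
    return set(order[:r])
```

Note the step-up character: every p-value up to the largest qualifying rank is rejected, even if some intermediate p(i) sits above its own line.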

34 © 2005 Thomas Nichols 34 Benjamini & Hochberg Procedure Details Method is valid under smoothness –Positive Regression Dependency on Subsets: P(X1 ≤ c1, X2 ≤ c2,..., Xk ≤ ck | Xi = xi) is non-decreasing in xi –Only required of test statistics for which the null is true Special cases include –Independence –Multivariate Normal with all positive correlations –Same, but studentized with common std. err. For arbitrary covariance structure –Replace q with q/c(V), where c(V) = Σi=1,...,V 1/i ≈ log(V) + 0.5772 Benjamini & Yekutieli (2001). Ann. Stat. 29:1165-1188

35 © 2005 Thomas Nichols 35 Adaptiveness of Benjamini & Hochberg FDR [Figure: ordered p-values p(i) vs. fractional index i/V] When no signal: P-value threshold ≈ α/V When all signal: P-value threshold ≈ α...FDR adapts to the amount of signal in the data

36 © 2005 Thomas Nichols 36 Benjamini & Hochberg: Key Properties FDR is controlled: E(rFDR) ≤ q·m0/V –Conservative if a large fraction of nulls are false Adaptive –Threshold depends on amount of signal: more signal → more small p-values → more p(i) less than (i/V)·q/c(V)

37 © 2005 Thomas Nichols 37 FDR Example Threshold –Indep/PosDep u = 3.83 Result –3,073 voxels above Indep/PosDep u –<0.0001 minimum FDR-corrected p-value [Figure: FDR threshold = 3.83, 3,073 voxels; FWER permutation threshold = 7.67, 58 voxels]

38 © 2005 Thomas Nichols 38 Overview Multiple Testing Problem –Which of my 100,000 voxels are “active”? Nonparametric Inference –Can I trust my P-value at this voxel? Conjunction Inference –Is this voxel “active” in all of my tasks? Bayesian vs. Classical Inference –What in the world is a posterior probability?

39 © 2005 Thomas Nichols 39 Nonparametric Permutation Test Parametric methods –Assume distribution of statistic under null hypothesis Nonparametric methods –Use data to find distribution of statistic under null hypothesis –Any statistic! 5% Parametric Null Distribution 5% Nonparametric Null Distribution

40 © 2005 Thomas Nichols 40 Permutation Test Toy Example Data from V1 voxel in visual stim. experiment A: Active, flashing checkerboard B: Baseline, fixation 6 blocks, ABABAB Just consider block averages: A 103.00, B 90.48, A 99.93, B 87.83, A 99.76, B 96.06 Null hypothesis Ho –No experimental effect, A & B labels arbitrary Statistic –Mean difference

41 © 2005 Thomas Nichols 41 Permutation Test Toy Example Under Ho –Consider all equivalent relabelings: AAABBB ABABAB BAAABB BABBAA AABABB ABABBA BAABAB BBAAAB AABBAB ABBAAB BAABBA BBAABA AABBBA ABBABA BABAAB BBABAA ABAABB ABBBAA BABABA BBBAAA

42 © 2005 Thomas Nichols 42 Permutation Test Toy Example Under Ho –Consider all equivalent relabelings –Compute all possible statistic values: AAABBB 4.82, ABABAB 9.45, BAAABB -1.48, BABBAA -6.86, AABABB -3.25, ABABBA 6.97, BAABAB 1.10, BBAAAB 3.15, AABBAB -0.67, ABBAAB 1.38, BAABBA -1.38, BBAABA 0.67, AABBBA -3.15, ABBABA -1.10, BABAAB -6.97, BBABAA 3.25, ABAABB 6.86, ABBBAA 1.48, BABABA -9.45, BBBAAA -4.82

43 © 2005 Thomas Nichols 43 Permutation Test Toy Example Under Ho –Consider all equivalent relabelings –Compute all possible statistic values –Find 95%ile of permutation distribution (with 20 relabelings, only the largest value, 9.45, lies beyond the 95%ile)

44 © 2005 Thomas Nichols 44 Permutation Test Toy Example Under Ho –Consider all equivalent relabelings –Compute all possible statistic values –Find 95%ile of permutation distribution (the observed labeling ABABAB gives 9.45, the most extreme value: P = 1/20 = 0.05)

45 © 2005 Thomas Nichols 45 Permutation Test Toy Example Under Ho –Consider all equivalent relabelings –Compute all possible statistic values –Find 95%ile of permutation distribution [Figure: histogram of the permutation distribution over roughly −8 to 8]
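The whole toy example fits in a few lines: enumerate the 20 relabelings, compute the mean difference for each, and locate the observed labeling in that distribution.

```python
from itertools import combinations

# Block averages from the slide, in scan order A B A B A B
data = [103.00, 90.48, 99.93, 87.83, 99.76, 96.06]

def mean_diff(a_idx):
    """Mean of blocks labeled A minus mean of blocks labeled B."""
    a = [data[i] for i in a_idx]
    b = [data[i] for i in range(6) if i not in a_idx]
    return sum(a) / 3 - sum(b) / 3

# All 20 equivalent relabelings: choose which 3 of 6 blocks are 'A'
perm_dist = sorted(mean_diff(set(c)) for c in combinations(range(6), 3))

observed = mean_diff({0, 2, 4})   # the true ABABAB labeling
p = sum(d >= observed for d in perm_dist) / len(perm_dist)   # 1/20 = 0.05
```

Because the observed labeling is itself one of the 20, the smallest attainable one-sided P-value here is 1/20.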

46 © 2005 Thomas Nichols 46 Controlling FWER: Permutation Test Parametric methods –Assume distribution of max statistic under null hypothesis Nonparametric methods –Use data to find distribution of max statistic under null hypothesis –Again, any max statistic! 5% Parametric Null Max Distribution 5% Nonparametric Null Max Distribution

47 © 2005 Thomas Nichols 47 Permutation Test & Exchangeability Exchangeability is fundamental –Def: Distribution of the data unperturbed by permutation –Under H 0, exchangeability justifies permuting data –Allows us to build permutation distribution Subjects are exchangeable –Under Ho, each subject’s A/B labels can be flipped fMRI scans are not exchangeable under Ho –If no signal, can we permute over time? –No, permuting disrupts order, temporal autocorrelation

48 © 2005 Thomas Nichols 48 Permutation Test & Exchangeability fMRI scans are not exchangeable –Permuting disrupts order, temporal autocorrelation Intrasubject fMRI permutation test –Must decorrelate data, model before permuting –What is correlation structure? Usually must use parametric model of correlation –E.g. Use wavelets to decorrelate Bullmore et al 2001, HBM 12:61-78 Intersubject fMRI permutation test –Create difference image for each subject –For each permutation, flip sign of some subjects

49 © 2005 Thomas Nichols 49 Permutation Test Other Statistics Collect max distribution –To find threshold that controls FWER Consider smoothed variance t statistic –To regularize low-df variance estimate

50 © 2005 Thomas Nichols 50 Permutation Test Smoothed Variance t Collect max distribution –To find threshold that controls FWER Consider smoothed variance t statistic: t = (mean difference) / sqrt(variance / n)

51 © 2005 Thomas Nichols 51 Permutation Test Smoothed Variance t Collect max distribution –To find threshold that controls FWER Consider smoothed variance t statistic: smoothed variance t = (mean difference) / sqrt(smoothed variance / n)

52 © 2005 Thomas Nichols 52 Permutation Test Example fMRI Study of Working Memory –12 subjects, block design Marshuetz et al (2000) –Item Recognition Active: view five letters (e.g. UBKDA), 2s pause, view probe letter (e.g. D), respond yes/no Baseline: view XXXXX, 2s pause, view Y or N, respond Second Level RFX –Difference image, A−B, constructed for each subject –One-sample, smoothed variance t test

53 © 2005 Thomas Nichols 53 Permutation Test Example Permute! –2^12 = 4,096 ways to flip 12 A/B labels –For each, note the maximum of the t image [Figure: permutation distribution of maximum t; maximum intensity projection of thresholded t]
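The sign-flipping scheme from the previous slides can be sketched end to end on simulated difference images (the real analysis uses 110,776 voxels; 50 simulated null voxels stand in here):

```python
from itertools import product
import numpy as np

rng = np.random.default_rng(0)
n_subj, n_vox = 12, 50
diff_imgs = rng.normal(0, 1, (n_subj, n_vox))   # simulated A-B difference images

def one_sample_t(x):
    """One-sample t image across subjects (axis 0)."""
    return x.mean(0) / (x.std(0, ddof=1) / np.sqrt(len(x)))

# 2^12 = 4,096 sign flips; record the maximum t over voxels for each
max_dist = []
for signs in product([1.0, -1.0], repeat=n_subj):
    flipped = diff_imgs * np.array(signs)[:, None]
    max_dist.append(one_sample_t(flipped).max())

u_fwe = np.percentile(max_dist, 95)       # FWER-controlling threshold
sig = one_sample_t(diff_imgs) > u_fwe     # FWE-significant voxels
```

The identity flip is one of the 4,096, so the observed statistic image is always compared against a distribution that contains it, which is what makes the test exact.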

54 © 2005 Thomas Nichols 54 Permutation Test Example Compare with Bonferroni –α = 0.05/110,776 Compare with parametric RFT –110,776 2×2×2 mm voxels –5.1×5.8×6.9 mm FWHM smoothness –462.9 RESELs

55 © 2005 Thomas Nichols 55 Results: t11 statistic –RF threshold uRF = 9.87 and Bonferroni uBonf = 9.80: 5 sig. vox. –Nonparametric threshold uPerm = 7.67: 58 sig. vox. Smoothed variance t statistic –Nonparametric threshold: 378 sig. vox. [Figure: test level vs. t11 threshold]

56 © 2005 Thomas Nichols Does this Generalize? RFT vs Bonf. vs Perm.

57 © 2005 Thomas Nichols Does this Generalize? RFT vs Bonf. vs Perm.

58 © 2005 Thomas Nichols 58 Overview Multiple Testing Problem –Which of my 100,000 voxels are “active”? Nonparametric Inference –Can I trust my P-value at this voxel? Conjunction Inference –Is this voxel “active” in all of my tasks? Bayesian vs. Classical Inference –What in the world is a posterior probability?

59 © 2005 Thomas Nichols 59 Conjunction Inference Consider several working memory tasks –N-Back tasks with different stimuli –Letter memory: D J P F D R A T F M R I B K –Number memory: 4 2 8 4 4 2 3 9 2 3 5 8 9 3 1 4 –Shape memory: (sequence of shapes) Interested in stimuli-generic response –What areas of the brain respond to all 3 tasks? –Don't want areas that only respond in 1 or 2 tasks

60 © 2005 Thomas Nichols 60 Conjunction Inference Notation –K: number of conjunctions –Hk: null hypothesis of the kth test; Hk = 0 if the kth null is true, 1 if false –Tk: statistic value of the kth test Conjunction null hypothesis: "Not all effects present," i.e. one or more effects absent (Σk Hk < K) Conjunction alternative: "All effects present" (Σk Hk = K)

61 © 2005 Thomas Nichols 61 Conjunction Inference For working memory example, K=3... –LettersH 1 T 1 –NumbersH 2 T 2 –ShapesH 3 T 3 –Test At least one of the three effects not present –versus All three effects present

62 © 2005 Thomas Nichols 62 Conjunction Inference Methods: Price & Friston Compute main effects, mask out interactions –Compute two statistic images: Main effect t image (avg. effect) Interaction F image (diffs. btw. effects) –Threshold & mask: Threshold t image as usual Mask out t voxels where F is significant (e.g. 0.01 uncorrected) Reference –SPM96 Conjunction –Price, C. J., & Friston, K. J. 1997. Cognitive conjunction: A new approach to brain activation experiments. NeuroImage, 5:261-270.

63 © 2005 Thomas Nichols 63 Conjunction Inference Methods: Price & Friston Strengths –Detects consistent effects –Can test dependent effects, from non-orthogonal contrasts Problems –Only detects consistent effects (to some authors, a strength [Caplan & Moo, NI 21:751-6, 2004]) –Depends on not rejecting the F-test, but not rejecting doesn't mean there is no interaction [Figure: all effects large → conjunction detected; all effects large but different → no conjunction detected]

64 © 2005 Thomas Nichols 64 Conjunction Inference Methods: Friston et al Use the minimum of the K statistics –Idea: only declare a conjunction if all of the statistics are sufficiently large –Declare a conjunction only when mink Tk ≥ u, i.e. when Tk ≥ u for all k

65 © 2005 Thomas Nichols 65 Conjunction Inference Methods: Friston et al Strengths –P-values easy to find Distribution of min k T k is trivial......assuming all K nulls true Problems –Needs K independent statistics –Inference assumes all K nulls are true! Wrong P-value!

66 © 2005 Thomas Nichols 66 Conjunction Inference Methods: Friston et al Problems (con't) –Friston's min inference tests the null "No effects present" versus the alternative "Some effects present"

67 © 2005 Thomas Nichols 67 Impact of Using the Wrong Null Hypothesis Consider varying the number of effects tested... For K=9, only need all positive responses for this min test to reject! –Reason: Easy to reject the “No effects present” null

68 © 2005 Thomas Nichols 68 Valid Conjunction Inference With the Minimum Statistic A valid test controls the worst-case false positive rate –θ is the parameter of interest (e.g. true effect magnitude) –Θ0 is the null subset of the parameter space –Must consider all possible null parameter values [Figure: usual single null hypothesis (θ = 0); incorrect conjunction null (the single point (0,0)); correct conjunction null (all of parameter space where at least one effect is null)]

69 © 2005 Thomas Nichols 69 Valid Conjunction Inference With the Minimum Statistic “Worst case” for conjunction inference –All but one effect present –Exactly one null true

70 © 2005 Thomas Nichols 70 Valid Conjunction Inference With the Minimum Statistic “Worst case” for conjunction inference –All but one effect present –Exactly one null true

71 © 2005 Thomas Nichols 71 Valid Conjunction Inference With the Minimum Statistic For valid inference, compare the min stat to uα –Assess the mink Tk image as if it were just T1 –E.g. u0.05 = 1.64 (or some corrected threshold) Equivalently, take an intersection mask –Threshold each statistic image at, say, 0.05 FWE corrected –Make a mask: 0 = below threshold, 1 = above threshold –Intersection of masks: conjunction-significant voxels
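The equivalence the slide asserts, min-statistic thresholding versus intersecting per-task masks, is easy to demonstrate on simulated statistic images:

```python
import numpy as np

rng = np.random.default_rng(1)
t1, t2, t3 = rng.normal(0, 1, (3, 1000))   # three simulated statistic images
u = 1.64                                    # e.g. u_0.05, or a corrected threshold

# Valid min-statistic test: conjunction-significant where min_k T_k >= u
conj_min = np.minimum.reduce([t1, t2, t3]) >= u

# Equivalent intersection-mask formulation
conj_mask = (t1 >= u) & (t2 >= u) & (t3 >= u)
assert (conj_min == conj_mask).all()
```

Either way, the min image is assessed against the single-test threshold, never against the too-lenient K-independent-statistics distribution of the earlier slides.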

72 © 2005 Thomas Nichols 72 Overview Multiple Testing Problem –Which of my 100,000 voxels are “active”? Nonparametric Inference –Can I trust my P-value at this voxel? Conjunction Inference –Is this voxel “active” in all of my tasks? Bayesian vs. Classical Inference –What in the world is a posterior probability?

73 © 2005 Thomas Nichols 73 Classical Statistics: Estimation –n = 12 subj. fMRI study Data at one voxel –Y, sample average % BOLD change Model –Y ~ N(μ, σ²/n) –μ is the true population mean BOLD % change –Likelihood p(Y|μ): relative frequency of observing Y for one given value of μ [Figure: likelihood of Y, p(Y|μ), as a function of Y]

74 © 2005 Thomas Nichols 74 Classical Statistics: MLE Estimating μ –Don't know μ in practice Maximum Likelihood Estimation –Find the μ that makes the data most likely –The MLE (μ̂) is our estimate of μ –Here, the MLE of the population mean μ is simply the BOLD sample mean, y [Figure: likelihood p(Y|μ), with the actual y observed in the experiment marked]

75 © 2005 Thomas Nichols 75 Classical Statistical Inference Level 95% Confidence Interval –Y ± 1.96 σ/√n –With many replications of the experiment, the CI will contain μ 95% of the time [Figure: likelihood p(Y|μ) with the observed CI]

76 © 2005 Thomas Nichols 76 Classical Statistics Redux Grounded in long-run frequency of observable phenomena –Data, over theoretical replications –Inference: confidence intervals, P-values Estimation based on likelihood Parameters are fixed –Can't talk about probability of parameters –P( pop. mean μ > 0 )??? The true population mean % BOLD μ is either > 0 or not; the only way to know is to scan everyone in the population

77 © 2005 Thomas Nichols 77 Bayesian Statistics Grounded in degrees of belief –"Belief" expressed with the grammar of probability –No problem making statements about unobservable parameters Parameters are regarded as random, not fixed Data are regarded as fixed, since you only have one dataset

78 © 2005 Thomas Nichols 78 Bayesian Essentials Prior Distribution –Expresses belief about parameters before seeing the data Likelihood –Same as with Classical Posterior Distribution –Expresses belief about parameters after seeing the data Bayes Theorem –Tells how to combine prior with likelihood (data) to create the posterior: Posterior ∝ Likelihood × Prior

79 © 2005 Thomas Nichols 79 Bayesian Statistics: From Prior to Posterior Prior p(μ) –μ ~ N(μ0, τ²) –μ0 = 0%: a priori belief that activation & de-activation are equally likely –τ = 1%: a priori belief that activation is small Data: y = 5% Posterior –Somewhere between prior and likelihood [Figure: prior, likelihood, and posterior densities over the population mean % BOLD change]

80 © 2005 Thomas Nichols 80 Bayesian Statistics: Posterior Inference All inference is based on the posterior E.g. posterior mean (instead of the MLE) –Weighted sum of the prior mean & the data (sample) mean –Weights based on prior & data precision [Figure: prior, likelihood, and posterior over the population mean % BOLD change, with the prior mean, data mean, and posterior mean marked]
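The precision-weighted update behind these slides is the conjugate normal-normal rule. A sketch of the arithmetic; the data standard error of 0.5% is an assumption chosen so the numbers reproduce the slides' posterior of about 4 ± 0.9 (the deck does not state it):

```python
prior_mean, prior_sd = 0.0, 1.0   # prior: mu ~ N(0, 1^2), in % BOLD
y, se = 5.0, 0.5                  # sample mean and its standard error (se assumed)

# Precisions (inverse variances) are the weights
prior_prec, data_prec = 1 / prior_sd**2, 1 / se**2
post_prec = prior_prec + data_prec

post_mean = (prior_prec * prior_mean + data_prec * y) / post_prec  # 4.0
post_sd = post_prec ** -0.5                                        # ~0.447
ci95 = (post_mean - 1.96 * post_sd, post_mean + 1.96 * post_sd)    # ~ 4 +/- 0.9
```

The posterior mean lands between the prior mean (0) and the data mean (5), pulled toward whichever is more precise; here the data are four times as precise as the prior, hence 4.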

81 © 2005 Thomas Nichols 81 Bayesian Inference: Posterior Inference But the posterior is just another distribution –Can ask any probability question E.g. "What's the probability, after seeing the data, that μ > 0?", or P(μ > 0 | y) –Here P(μ > 0 | y) ≈ 1 "Credible Intervals" –Here 4 ± 0.9 has 95% posterior prob. –No reference to repetitions of the experiment [Figure: posterior density over the population mean % BOLD change]

82 © 2005 Thomas Nichols 82 Bayesian vs. Classical Foundations –Classical How observable statistics behave in long-run –Bayesian Measuring belief about unobservable parameters Inference –Classical References other possible datasets not observed Requires awkward explanations for CI’s & P-values –Bayesian Based on posterior, combination of prior and data Allows intuitive probabilistic statements (posterior probabilities)

83 © 2005 Thomas Nichols 83 Bayesian vs. Classical Bayesian challenge: priors –I can set my prior to always find a result –"Objective" priors can be found; results are then often similar to Classical inference When are the two similar? –When n is large, the prior can be overwhelmed by the likelihood –One-sided P-value ≈ posterior probability of μ > 0 –Doesn't work with 2-sided P-value! [ P(μ ≠ 0 | y) = 1 ]

84 © 2005 Thomas Nichols 84 Bayesian vs. Classical SPM T vs SPM PPM Auditory experiment SPM: voxels with T > 5.5 PPM: voxels with posterior probability > 0.95 Qualitatively similar, but hard to equate thresholds [Figure: SPM{T} maximum intensity projection (height threshold T = 5.50, extent threshold k = 0 voxels) beside the PPM (height threshold P = 0.95, extent threshold k = 0 voxels), each with its design matrix] Slide: Alexis Roche, CEA, SHFJ

85 © 2005 Thomas Nichols 85 Conclusions Multiple Testing Problem –Choose a MTP metric (FDR, FWE) –Use a powerful method that controls the metric Nonparametric Inference –More power for small group FWE inferences Conjunction Inference –Use intersection mask, or treat min k T k as single T Bayesian Inference –Conceptually different, but simpler than Classical –Priors controversial, but objective ones can be used

86 © 2005 Thomas Nichols 86 References Multiple Testing Problem –Worsley, Marrett, Neelin, Vandal, Friston and Evans. A Unified Statistical Approach for Determining Significant Signals in Images of Cerebral Activation. Human Brain Mapping, 4:58-73, 1996. –Nichols & Hayasaka. Controlling the Familywise Error Rate in Functional Neuroimaging: A Comparative Review. Statistical Methods in Medical Research, 12:419-446, 2003. –CR Genovese, N Lazar and TE Nichols. Thresholding of Statistical Maps in Functional Neuroimaging Using the False Discovery Rate. NeuroImage, 15:870-878, 2002. Nonparametric Inference –TE Nichols and AP Holmes. Nonparametric Permutation Tests for Functional Neuroimaging: A Primer with Examples. Human Brain Mapping, 15:1-25, 2002. –Bullmore, Long and Suckling. Colored noise and computational inference in neurophysiological (fMRI) time series analysis: resampling methods in time and wavelet domains. Human Brain Mapping, 12:61-78, 2001. Conjunction Inference –TE Nichols, M Brett, J Andersson, TD Wager, J-B Poline. Valid Conjunction Inference with the Minimum Statistic. NeuroImage, 2005. –KJ Friston, WD Penny and DE Glaser. Conjunction Revisited. NeuroImage, 25:661-667, 2005. Bayesian Inference –LR Frank, RB Buxton, EC Wong. Probabilistic analysis of functional magnetic resonance imaging data. Magnetic Resonance in Medicine, 39:132-148, 1998. –Friston, Penny, Phillips, Kiebel, Hinton and Ashburner. Classical and Bayesian inference in neuroimaging: theory. NeuroImage, 16:465-483, 2002. (See also 484-512.) –Woolrich M, Behrens T, Beckmann C, Jenkinson M and Smith S. Multi-Level Linear Modelling for FMRI Group Analysis Using Bayesian Inference. NeuroImage, 21(4):1732-1747, 2004.

