
1 fMRI Techniques to Investigate Neural Coding: Multivoxel Pattern Analysis (MVPA) http://www.fmri4newbies.com/ Last Update: November 24, 2014 Last Course: Psychology 9223, F2014, Western University Jody Culham Brain and Mind Institute Department of Psychology Western University

2 Limitations of Subtraction Logic Example: we know that neurons in the brain can be tuned for individual faces, e.g., the "Jennifer Aniston" neuron in human medial temporal lobe (Quiroga et al., 2005, Nature).

3 Fusiform Face Area (FFA) fMRI spatial resolution: 1 voxel ≈ 3 mm. A voxel might contain millions of neurons, so the fMRI signal represents the population activity. [Figure: FFA activation map; colour bar from low to high activity]

4 Limitations of Subtraction Logic fMRI resolution is typically around 3 x 3 x 6 mm, so each sample comes from millions of neurons. Let's consider just three neurons: Neuron 1 "likes" Jennifer Aniston, Neuron 2 "likes" Julia Roberts, Neuron 3 "likes" Brad Pitt. Even though there are neurons tuned to each object, the population as a whole shows no preference. [Figure: firing rate of each neuron and summed activation for each face]

5 Two Techniques with "Subvoxel Resolution" "Subvoxel resolution" = the ability to investigate coding in neuronal populations smaller than the voxel size being sampled. 1. Multi-Voxel Pattern Analysis (MVPA, or decoding, or "mind reading") 2. fMR Adaptation (or repetition suppression or priming)

6 Multivoxel Pattern Analyses (or decoding or “mind reading”)

7 fMRI spatial resolution: 1 voxel (3 mm). [Figure: single voxel on an activation map; colour bar from low to high activity]

8 Region Of Interest (ROI): a group of voxels. [Figure: ROI outlined on an activation map; colour bar from low to high activity]

9 Voxel Pattern Information [Figure: patterns of activity across 3 mm voxels for Condition 1 and Condition 2; R/L orientation labels]

10 Spatial Smoothing Most conventional fMRI studies spatially smooth (blur) the data –increases signal-to-noise –facilitates intersubject averaging –but loses information about the patterns across voxels [Figure panels: No smoothing, 4 mm FWHM, 7 mm FWHM, 10 mm FWHM]

11 Effect of Spatial Smoothing and Intersubject Averaging 3 mm

12 Standard fMRI Analysis [Figure: trials 1-3 for FACES and trials 1-3 for HOUSES are each averaged into one summed activation value per condition]

13 Perhaps voxels contain useful information In traditional fMRI analyses, we average across the voxels within an area and assume that the area encodes a stimulus if it responds more. But these voxels may contain valuable information: perhaps encoding depends instead on the pattern of high and low activation across voxels.

14 Decoding for Dummies Kerri Smith, 2013, Nature, “Reading Minds”

15 Approaches to Multi-Voxel Pattern Analysis 1. MVPA classifier 2. MVPA correlation: basic approach 3. MVPA correlation: representational similarity analysis

16 Preparatory Steps

17 Initial Steps Step 1: Select a region of interest (ROI) –e.g., a cube centred on an activation hotspot: [15 mm]³ = [5 functional voxels]³ = 3,375 mm³ = 125 functional voxels –DO NOT SPATIALLY SMOOTH THE DATA Step 2: Extract a measure of brain activation from each of the functional voxels within the ROI –β weights (z-normalized or %-transformed) –% BOLD signal change (minus baseline) –t-values (β/error)
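
The extraction in Step 2 can be expressed compactly. Below is a minimal Python/NumPy sketch, assuming per-trial beta estimates are already available as a 4D array and the ROI is a boolean mask; the variable and function names are hypothetical.

import numpy as np

# Hypothetical inputs: betas is a 4D array (x, y, z, n_trials) of per-trial
# beta estimates computed from UNSMOOTHED data; roi_mask is a 3D boolean
# array marking the ~125 functional voxels of the ROI.
def extract_roi_patterns(betas, roi_mask):
    patterns = betas[roi_mask, :].T                      # (n_trials, n_voxels)
    # z-normalize each voxel across trials so no single voxel dominates
    patterns = (patterns - patterns.mean(axis=0)) / patterns.std(axis=0)
    return patterns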

18 MVPA Methods –Data: block or event-related designs –Resolution: works even with moderate resolution (e.g., 3 mm isovoxel); tradeoff between resolution and coverage, SNR –Preprocessing: usual steps apply (slice scan time correction, motion correction, low-pass temporal filter), EXCEPT no spatial smoothing! –Model single subjects, not combined group data (at least initially)

19 Classifier Approach

20 Classifier Approach Train on some trials (e.g., trials 1-3 of FACES and of HOUSES) and test on trials not in the training set. Can an algorithm correctly "guess" trial identity better than chance (50%)?

21 [Figure: scatterplot of activity in Voxel 1 vs. activity in Voxel 2. Each dot is one measurement (trial) from one condition, Faces (red circles) or Houses (green circles).]

22 [Figure: a classifier boundary is fit to the training set in the Voxel 1 vs. Voxel 2 space, separating Faces from Houses; a separate test set is held out]

23 Can the classifier generalize to untrained data? On the test set, Classifier Accuracy = Correct / (Correct + Incorrect) = 6 / 8 = 75%. [Figure: test-set Faces and Houses trials plotted against the trained boundary]

24 Iterative testing ("folds") Example: leave-one-pair-out –10 trials of faces + 10 trials of houses –There are 100 possible combinations of left-out trial pairs: (F1, H1), (F1, H2), …, (F2, H1), (F2, H2), …, (F10, H10) –We can train on 9/10 trials of each condition with 1/10 excluded, for 100 iterations –Average the accuracy across the 100 iterations (see the sketch below) Many options: e.g., leave-one-run-out; classify the average of several left-out trials
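
A minimal Python sketch of leave-one-pair-out testing with a linear support vector machine (scikit-learn), assuming the per-trial ROI patterns have already been extracted; the array names are hypothetical.

import numpy as np
from itertools import product
from sklearn.svm import SVC

# Hypothetical inputs: face_trials and house_trials are (10, n_voxels)
# arrays of per-trial ROI patterns for the two conditions.
def leave_one_pair_out_accuracy(face_trials, house_trials):
    accuracies = []
    for f, h in product(range(10), range(10)):           # 100 train/test splits
        train_X = np.vstack([np.delete(face_trials, f, axis=0),
                             np.delete(house_trials, h, axis=0)])
        train_y = np.array([0] * 9 + [1] * 9)
        test_X = np.vstack([face_trials[f:f + 1], house_trials[h:h + 1]])
        test_y = np.array([0, 1])
        clf = SVC(kernel="linear").fit(train_X, train_y)
        accuracies.append(clf.score(test_X, test_y))      # 0, 0.5 or 1 per fold
    return np.mean(accuracies)                            # chance = 0.5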

25 Simple 2D example Each dot is one measurement (trial) from one trial type (red circles) or the other (blue squares); white and black circles show examples of correct and erroneous classification in the test set. –The classifier can act on single voxels; a conventional fMRI analysis would also detect the difference. –The classifier cannot act on single voxels because the distributions overlap, but it can act on a combination of voxels using a linear decision boundary. –The classifier would require a curved decision boundary. With more voxels, the space grows: 9 voxels → 9 dimensions. Haynes & Rees, 2006, Nat Rev Neurosci

26 Where to “Draw the Line”? There are different approaches to determining what line/plane/hyperplane to use as the boundary between classes We want an approach with good generalization to untrained data The most common approach is the linear support vector machine (SVM)

27 Support Vector Machine (SVM) An SVM finds a linear decision boundary that discriminates between two sets of points, constrained to have the largest possible distance from the closest points on both sides. The response patterns closest to the decision boundary (yellow circles), which define the margins, are called "support vectors". Mur et al., 2009
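
A toy Python sketch of a linear SVM and its margin-defining support vectors (scikit-learn); the data here are random placeholders, not real fMRI patterns.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 2)),   # condition A patterns (2 voxels)
               rng.normal(2.0, 1.0, (20, 2))])  # condition B patterns
y = np.array([0] * 20 + [1] * 20)

clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(clf.coef_, clf.intercept_)       # the linear decision boundary
print(clf.support_vectors_)            # the patterns that define the margin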

28 Is decoding better than chance? Two options: 1. Use intersubject variability to determine significance [Figure: mean decoding accuracy with 95% CI relative to chance]

29 Permutation Testing –randomize all the condition labels –run SVMs on the randomized data –repeat this many times (e.g., 1000×) –get a distribution of expected decoding accuracies –test the null hypothesis (H0) that the decoding accuracy you found came from this permuted distribution (see the sketch below)
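
A minimal Python sketch of this permutation test, assuming a trials-by-voxels pattern matrix and condition labels; scikit-learn also offers permutation_test_score for the same purpose. Variable names are hypothetical.

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Hypothetical inputs: X is (n_trials, n_voxels); y holds the true labels;
# observed_accuracy is the cross-validated accuracy obtained with true labels.
def permutation_p_value(X, y, observed_accuracy, n_perm=1000, seed=0):
    rng = np.random.default_rng(seed)
    null = np.empty(n_perm)
    for i in range(n_perm):
        y_shuffled = rng.permutation(y)   # break the link between labels and patterns
        null[i] = cross_val_score(SVC(kernel="linear"), X, y_shuffled, cv=5).mean()
    # p = proportion of permuted accuracies at least as large as the observed one
    return (np.sum(null >= observed_accuracy) + 1) / (n_perm + 1)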

30 Is decoding better than chance? Two options: 2. Permutation testing [Figure: box plot of the permuted distribution showing its median (should be 33.3%), lower and upper quartiles, and the upper bound of the 95% confidence limits; our data point lies above it → reject H0]

31 Example of MVPA classifier approach: decoding future actions Gallivan et al., 2013, eLife

32 Conditions

33

34 Hand and Tool Decoding [Figure: decoding accuracy ± 1 SEM]

35 Cross-decoding Logic Task-Across-Effector –Train Grasp vs. Reach for one effector (e.g., Hand) –Test Grasp vs. Reach for the other effector (e.g., Tool) –If accuracy > chance, then the area codes the task regardless of effector (see the sketch below)
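
A minimal Python sketch of this cross-decoding logic (train on one effector, test on the other), with hypothetical array names.

from sklearn.svm import SVC

# Hypothetical inputs: X_hand and X_tool are (n_trials, n_voxels) pattern
# matrices; y_hand and y_tool code Grasp = 0 vs. Reach = 1 for each trial.
def cross_decode(X_hand, y_hand, X_tool, y_tool):
    clf = SVC(kernel="linear").fit(X_hand, y_hand)   # train on Hand trials
    return clf.score(X_tool, y_tool)                 # test on Tool trials

# If the returned accuracy is reliably above 0.5, the area carries task
# information (grasp vs. reach) that generalizes across effectors.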

36 Hand and Tool Decoding [Figure: % decoding accuracy ± 1 SEM in L SPOC]

37 Hand and Tool Decoding [Figure: % decoding accuracy ± 1 SEM in L SPOC, L aIPS, L SMG, L M1, L PMd, L PMv]

38 Single TR Decoding [Figure: % decoding accuracy ± 1 SEM as a function of time (volumes)]

39 Basic Correlation Approach

40 First Demonstration

41 MVPA correlation approach [Figure: trials 1-3 for Faces and Houses with their average summed activation]

42 MVPA correlation approach The same category evokes similar patterns of activity across trials. [Figure: voxel patterns for individual Faces and Houses trials, alongside their average summed activation]

43 MVPA correlation approach Similarity within the same category. [Figure: correlating voxel patterns between trials of the same category]

44 MVPA correlation approach Similarity between different categories. [Figure: correlating voxel patterns between trials of different categories]

45 If within-category similarity > between-category similarity, the brain area contains distinct information about faces and houses (see the sketch below).
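
A minimal Python sketch of the split-half (within- vs. between-category) correlation comparison, assuming mean patterns have been computed separately from odd and even runs; the variable names are hypothetical.

import numpy as np

# Hypothetical inputs: 1D arrays holding the mean ROI pattern (one value per
# voxel) for each category, computed separately from odd and even runs.
def category_information(odd_faces, even_faces, odd_houses, even_houses):
    within = np.mean([np.corrcoef(odd_faces, even_faces)[0, 1],
                      np.corrcoef(odd_houses, even_houses)[0, 1]])
    between = np.mean([np.corrcoef(odd_faces, even_houses)[0, 1],
                       np.corrcoef(odd_houses, even_faces)[0, 1]])
    # within > between implies the pattern distinguishes the categories
    return within, between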

46 Haxby et al., 2001, Science Category-specificity of patterns of response in the ventral temporal cortex

47 Category-specificity of patterns of response in the ventral temporal cortex (Haxby et al., 2001, Science) [Figure: similarity matrix between odd and even runs; within-category similarity on the diagonal; colour bar from low to high similarity]

48 Category-specificity of patterns of response in the ventral temporal cortex (Haxby et al., 2001, Science) [Figure: similarity matrix between odd and even runs; between-category similarity off the diagonal; colour bar from low to high similarity]

49 Correlation Approach Using Representational Similarity Analysis

50 Representational similarity analysis (RSA) Unlike the basic MVPA correlation approach, RSA does not separate stimuli into a priori categories (Kriegeskorte et al., 2008). [Figure: odd-run vs. even-run similarity matrices for the basic MVPA correlation approach and for RSA; colour bar from low to high similarity (correlation)]

51 With no class boundaries, the similarity matrix spans all conditions (C1 … C96 by C1 … C96), built from the individual trials (see the sketch below). [Figure: condition × condition similarity matrix; colour bar from low to high similarity]
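
A minimal Python sketch for building such a condition-by-condition (dis)similarity matrix from one mean pattern per condition; no category labels are used.

import numpy as np

# Hypothetical input: patterns is (n_conditions, n_voxels), e.g. 96 x n_voxels,
# one mean activity pattern per condition.
def representational_dissimilarity_matrix(patterns):
    similarity = np.corrcoef(patterns)    # condition x condition correlations
    return 1.0 - similarity               # RDM: 0 = identical, larger = more distinct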

52 Can compare theoretical models to data (Kriegeskorte et al., 2008). [Figure: predicted similarity matrices from theoretical models; colour bar from low to high similarity]

53 Can compare theoretical models to data: which prediction matrix is more similar to the real data? (Kriegeskorte et al., 2008) [Figure: real-data similarity matrix alongside model prediction matrices; colour bar from low to high similarity]

54 “Metacorrelations” Calculate correlation between model correlation matrix and data correlation matrix
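
A minimal Python sketch of such a metacorrelation: compare the off-diagonal entries of the data matrix and a model matrix, here with a Spearman rank correlation (a common choice, since only the ordering of the dissimilarities is assumed to be meaningful).

import numpy as np
from scipy.stats import spearmanr

# Hypothetical inputs: data_rdm and model_rdm are symmetric (n, n) matrices.
def metacorrelation(data_rdm, model_rdm):
    iu = np.triu_indices_from(data_rdm, k=1)   # each condition pair counted once
    rho, p = spearmanr(data_rdm[iu], model_rdm[iu])
    return rho, p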

55 Can look at metacorrelations to determine the best model or to see the similarity between areas: the right FFA pattern is similar to the left FFA pattern and to the face-animate prototype theoretical model, but not very similar to a low-level vision theoretical model.

56 Metacorrelation Matrix

57 Multidimensional Scaling (MDS) Input = matrix of distances (km here; upper triangle shown, the matrix is symmetric)

              Vancouver  Winnipeg  Toronto  Montreal  Halifax  St. John's  Yellowknife  Whitehorse
Vancouver             0      1869     3366      3694     4439        5046         1566        1484
Winnipeg                         0     1518      1825     2581        3250         1753        2463
Toronto                                    0       503     1266        2112         3078        4093
Montreal                                              0       792        1613         3194        4261
Halifax                                                         0         885         3768        4867
St. John's                                                                   0         4127        5233
Yellowknife                                                                                0        1109
Whitehorse                                                                                             0

58 Multidimensional Scaling (MDS) Output = representational space (2D here) [Figure: 2D configuration recovering the relative positions of Yellowknife, Vancouver, Winnipeg, Toronto, Montreal, Halifax, and St. John's]
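
A minimal Python sketch of MDS on a precomputed distance matrix (e.g., the intercity distances above, or a condition-by-condition RDM), using scikit-learn; the function name is hypothetical.

import numpy as np
from sklearn.manifold import MDS

# Hypothetical input: distances is a symmetric (n, n) matrix of pairwise
# distances or dissimilarities.
def embed_2d(distances, seed=0):
    mds = MDS(n_components=2, dissimilarity="precomputed", random_state=seed)
    return mds.fit_transform(distances)   # (n, 2) coordinates to plot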

59 [Figure: the recovered 2D configuration of the eight cities, now including Whitehorse]

60 MDS on MVPA Data [Figure: MDS applied to a condition similarity matrix yields a representational space]

61 Different Representational Spaces in Different Areas

62 Metacorrelation Matrix

63 MDS on Metacorrelations

64 Searchlights

65 Searchlight: 8 Voxel Example

66 Let’s zoom in on 8 voxels

67 Spherical Searchlight Cross-Section Ideally we'd like to test a spherical volume, but the functional brain image is voxelized, so we end up with a Lego-like sphere. Typical radius = 4 mm. Kriegeskorte, Goebel & Bandettini, 2006, PNAS
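
A minimal Python sketch of how the voxel offsets inside such a "Lego-like" sphere can be collected; the radius and voxel size are the ones quoted above, and the function name is hypothetical.

import numpy as np

def searchlight_offsets(radius_mm=4.0, voxel_size_mm=3.0):
    # All integer voxel offsets whose centres fall within the sphere radius.
    r_vox = int(np.ceil(radius_mm / voxel_size_mm))
    offsets = []
    for dx in range(-r_vox, r_vox + 1):
        for dy in range(-r_vox, r_vox + 1):
            for dz in range(-r_vox, r_vox + 1):
                if (dx ** 2 + dy ** 2 + dz ** 2) * voxel_size_mm ** 2 <= radius_mm ** 2:
                    offsets.append((dx, dy, dz))
    return np.array(offsets)   # add these to a centre voxel's coordinates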

68 Moving the Searchlight Each value in white is the decoding accuracy for a sphere of 5-voxel diameter centred on a given voxel (e.g., 55, 62, 73, 67, 60, 52, 48, 51 for the eight example voxels).

69 First- and Second-Level Analysis The same 8 voxels are in stereotaxic space (e.g., Talairach space) for every subject. First-level analysis: for each of the 15 subjects, an SVM classifier yields a decoding accuracy for the sphere centred at each of the eight voxels (e.g., S1: 55 62 73 67 60 52 48 51; S2: 46 52 65 69 60 59 53 48; S3: 48 55 62 70 58 52 50 49; … S15: 52 55 59 57 56 43 42 52). Second-level analysis: average the decoding accuracies across subjects, then do a univariate t-test (an RFX test based on intersubject variability) at each voxel to calculate the probability that the decoding accuracy is higher than chance; threshold at p < .05 (or use your favourite way of correcting for multiple comparisons).
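
A minimal Python sketch of the second-level random-effects test: a one-sample t-test against chance at each voxel, run on the subjects' searchlight accuracy maps in a common space. Array names are hypothetical, and multiple-comparison correction is left out.

import numpy as np
from scipy.stats import ttest_1samp

# Hypothetical input: accuracies is (n_subjects, n_voxels), the decoding
# accuracy of the searchlight sphere centred at each voxel for each subject.
def second_level_map(accuracies, chance=0.5, alpha=0.05):
    t, p = ttest_1samp(accuracies, popmean=chance, axis=0)
    p_one_tailed = np.where(t > 0, p / 2.0, 1.0)   # only above-chance decoding counts
    return t, p_one_tailed < alpha                  # t-map and thresholded map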

70 Thresholded t-map

71 Second-Level Analysis The principles of a second-level analysis are the same regardless of the dependent variable tested at the first level (per subject S1 … S15, per voxel V1 … V8): –Univariate voxelwise analysis: beta weights (or differences in beta weights = contrasts); are they significantly different from zero? –Multivariate searchlight analysis: decoding accuracies; are they significantly different from chance? –Multivariate searchlight analysis: correlations between a model and the MVPA data; are they significantly different from zero?

72 Regions vs. Brains Univariate ROI analysis is to univariate voxelwise analysis as multivariate ROI analysis is to multivariate searchlight analysis. There are no differences at the second-level analysis: –it's a way to find things by searching the whole brain –subjects' brains must be aligned (Talairach, MNI, or surface space) –the same problems and solutions for multiple comparisons arise –degrees of freedom = #Ss − 1 There are differences at the first-level analysis: –univariate voxelwise analyses are done one voxel at a time –multivariate searchlight analyses are done one sphere at a time

73 MVPA Searchlight Kriegeskorte, Goebel & Bandettini, 2006, PNAS

74 Activation- vs. information-based analysis (Kriegeskorte, Goebel & Bandettini, 2006) Activation-based (standard fMRI analysis): regions more strongly active during face than house perception. Information-based (searchlight MVPA analysis): regions whose activity pattern distinguished the two categories. 35% of voxels are marked only in the information-based map: category information is lost when data are smoothed.

75 Activation- vs. information-based analysis Mur et al., 2009, Social Cognitive and Affective Neuroscience

76 What Is MVPA Picking Up On?

77 Limitations of MVPA –MVPA will use whatever information is available, including confounds (e.g., reaction time) –MVPA works best for attributes that are coded at certain spatial scales (e.g., topography: retinotopy, somatotopy, etc.) –A failure to find effects does not mean that neural representations do not differ: information may be present at a finer scale, or the choice of classifier may not have been optimal (e.g., maybe a nonlinear classifier would work better) –Good classification indicates the presence of information, not necessarily neuronal selectivity (Logothetis, 2008), e.g., successful face decoding in primary visual cortex –Pattern-classifier analysis requires many decisions that affect the results (see Misaki et al., 2010) –Classifiers and correlations don't always agree

78 How can MVPA see patterns < 1 voxel? Data from: Kamitani & Tong, 2005, Nat Neurosci; figure from: Norman et al., 2006, TICS

79 “Mind-Reading”: Reconstructing new stimuli from brain activity

80 Reconstruct new images Miyawaki et al., 2008

81 Decoding Vision Gallant Lab, UC Berkeley

82 Lie detector (Davatzikos et al., 2005) A non-linear classifier applied to fMRI data to discriminate the spatial patterns of activity associated with lying vs. truth-telling in 22 individual participants: 88% accuracy in detecting lies in participants not included in the training set.

83 Lie detector A non-linear classifier applied to fMRI data to discriminate the spatial patterns of activity associated with lying vs. truth-telling in 22 individual participants: 88% accuracy in detecting lies in participants not included in the training set. But the real world is more complex!

84 Reconstruct dreams (Kamitani Lab, ATR, Japan) Brain activity was measured while 3 participants slept, and they were asked to describe their dream when awakened. Brain activity during sleep was compared with activity during viewing of pictures from frequently dreamt categories. Activity in higher-order visual areas (e.g., FFA) could successfully decode the dream contents (accuracy of 75-80%) from the 9 seconds before waking the participant!

85 Huth et al., 2012 Shared Semantic Space from brain activity during observation of movies Similar colors for categories similarly represented in the brain

86 Huth et al., 2012 Similar colors for categories similarly represented in the brain Shared Semantic Space from brain activity during observation of movies People and communication verbs are represented similarly

87 Continuous Semantic Space across the surface Each voxel is colored according to the part of the semantic space it is selective for. http://gallantlab.org/semanticmovies/

88 Continuous Semantic Space across the surface Click on each voxel (e.g., in the fusiform face area) to see which categories it represents.

89 MVPA Tutorial http://www.fmri4newbies.com/ Last Update: March 10, 2013 Last Course: Psychology 9223, W2013, Western University Jody Culham Brain and Mind Institute Department of Psychology Western University

90 Test Data Set Two runs: A and B (same protocol) 5 trials per condition for 3 conditions

91 Measures of Activity –β weights (z-normalized or %-transformed) –t-values (β/error) –% BOLD signal change (minus baseline) [Figure: colour scale from low activity (low t, low βz, low β%) to high activity (high t, high βz, high β%)]

92 Step 1: Trial Estimation Just as in the basic GLM, we are running one GLM per voxel. Now, however, each GLM estimates activation not across a whole condition but for each instance (trial or block) of a condition.
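
A minimal Python sketch of such a trial-wise GLM for one voxel, fit by ordinary least squares; the trial regressors are assumed to have been built already (one HRF-convolved predictor per instance), and the names are hypothetical.

import numpy as np

# Hypothetical inputs: voxel_ts is one voxel's time course (n_vols,);
# trial_regressors is (n_vols, n_trials), one HRF-convolved predictor per
# instance.  Constant and linear-trend confounds are appended here.
def trial_betas(voxel_ts, trial_regressors):
    n_vols = trial_regressors.shape[0]
    confounds = np.column_stack([np.ones(n_vols), np.linspace(-1, 1, n_vols)])
    X = np.column_stack([trial_regressors, confounds])
    betas, *_ = np.linalg.lstsq(X, voxel_ts, rcond=None)
    return betas[:trial_regressors.shape[1]]   # one beta estimate per instance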

93 Three Predictors Per Instance Each instance gets three predictors: a 2-gamma HRF predictor, a constant, and a linear trend within the trial. The run contains 5 instances of motor imagery, 5 instances of mental calculation, and 5 instances of mental singing.

94 Step 1: Trial Estimation Dialog

95 Step 1: Trial Estimation Output Now for each instance of each condition in each run, for each voxel we have an estimate of activation

96 Step 2: Support Vector Machine SVMs are usually run in a subregion of the brain –e.g., a region of interest (= volume of interest) sample data: SMA ROI sample data: 3 Tasks ROI

97 Step 2: Support Vector Machine Test data must be independent of training data: –leave-one-run-out –leave-one-trial-out –leave-one-trial-set-out Often we will run a series of iterations to test multiple combinations of leave-X-out –e.g., with two runs, we can run two iterations of leave-one-run-out –e.g., with 10 trials per condition and 3 conditions, we could run up to 10³ = 1000 iterations of leave-one-trial-set-out
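
A minimal Python sketch of leave-one-run-out cross-validation using scikit-learn's LeaveOneGroupOut, with the run label of each trial acting as the group; variable names are hypothetical.

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

# Hypothetical inputs: X is (n_trials, n_voxels); y holds the condition
# labels; runs holds the run label of each trial (e.g., "A" or "B").
def leave_one_run_out_accuracy(X, y, runs):
    cv = LeaveOneGroupOut()
    scores = cross_val_score(SVC(kernel="linear"), X, y, groups=runs, cv=cv)
    return scores.mean()   # chance = 1 / n_conditions (33.3% for 3 tasks)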

98 MVP file plots [Figure: 98 functional voxels × 15 trials; Run A = training set, Run B = test set; intensity = activation]

99 SVM Output: Train Run A; Test Run B [Figure: confusion matrices of guessed vs. actual condition for the two ROIs: 15/15 correct in one, 10/15 correct in the other (chance = 5/15)]

100 SVM Output: Train Run B; Test Run A

101 Permutation Testing –randomize all the condition labels –run SVMs on the randomized data –repeat this many times (e.g., 1000×) –get a distribution of expected decoding accuracies –test the null hypothesis (H0) that the decoding accuracy you found came from this permuted distribution

102 Output from Permutation Testing [Figure: box plot of the permuted distribution showing its median (should be 33.3%), lower and upper quartiles, and the upper bound of the 95% confidence limits; our data point lies above it → reject H0]

103 Voxel Weight Maps voxels with high weights contribute strongly to the classification of a trial to a given condition
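
A minimal Python sketch of reading per-voxel weights out of a trained linear SVM and writing them back into the brain volume to form a weight map; variable names are hypothetical, and a binary classification is assumed so that there is a single weight vector.

import numpy as np
from sklearn.svm import SVC

# Hypothetical inputs: X is (n_trials, n_voxels); y holds binary labels;
# roi_coords is (n_voxels, 3), the (x, y, z) indices of each ROI voxel.
def voxel_weight_map(X, y, roi_coords, volume_shape):
    clf = SVC(kernel="linear").fit(X, y)
    weights = clf.coef_.ravel()                   # one weight per voxel
    weight_map = np.zeros(volume_shape)
    weight_map[tuple(roi_coords.T)] = weights     # write weights into the volume
    return weight_map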

