
1 Multi-Voxel Pattern Analyses MVPA
[Figure: example multivoxel pattern map; color bar shows z-scores from -5 to 5.]

2 What Information Can You Infer from Multivoxel Patterns (MVP)?

3 What Information Can You Infer from Multivoxel Patterns (MVP)?
In traditional fMRI analyses, we average across the voxels within an area, but what if the voxels are not homogeneous in their response properties? Distributed activations across voxels may contain valuable information.

4 What Information Can You Infer from Multivoxel Patterns (MVP)?
In traditional fMRI analyses, we average across the voxels within an area, but what if the voxels are not homogeneous in their response properties? Distributed activations across voxels may contain valuable information. In traditional fMRI analyses, we also assume that an area encodes a stimulus if it responds more strongly to it than to other stimuli. However, encoding may depend on the distributed pattern of both high and low activations.
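To make the point concrete, here is a minimal sketch (all numbers made up) of two stimuli that are indistinguishable by the ROI average yet perfectly distinguishable by their multivoxel patterns:

```python
import numpy as np

# Toy illustration (hypothetical numbers): two stimuli evoke the SAME
# mean response in a 4-voxel ROI, so averaging across voxels cannot
# distinguish them -- but the distributed patterns clearly differ.
pattern_faces = np.array([1.8, 0.2, 1.6, 0.4])   # % signal per voxel
pattern_houses = np.array([0.2, 1.8, 0.4, 1.6])

print(pattern_faces.mean(), pattern_houses.mean())        # 1.0 and 1.0
print(np.corrcoef(pattern_faces, pattern_houses)[0, 1])   # r = -1: anti-correlated
```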

5 Object-, face- & place-selective regions in the human brain
face & object selective place & object selective Face-selective (fusiform) Place-selective (CoS & PHG) 3.5 % signal 2.5 2.5 1.5 1.5 0.5 % signal FUS FUS 0.5 houses houses -0.5 faces animals novel houses emptyscene text -0.5 Adult high level visual cortex is characterized by regions that respond selectively to different objects, including loc regions that respond more to objects than scrambled objects, regions that respond more strongly to faces than objects and regions that respond more strongly to places than objects and faces Grill-Spector, CONB 2003 > object-selective > lateral view face-selective > lateral view place-selective

6 Graphical Representation of Multi Voxel Patterns (MVP)

7 Graphical Representation of Multi Voxel Patterns (MVP)
[Figure: multivoxel activation map with a statistical threshold applied.]

8 Graphical Representation of Multi Voxel Patterns (MVP)
ROI in standard fMRI: an impoverished view of the activations. [Figure: thresholded multivoxel activation map.]

9 Is there an area in the brain for every object category?
Problem: Is there an area in the brain for every object category? Hypothesis: Distributed activation patterns across the ventral stream (rather than discrete areas) code object categories. Haxby et al. Science 2001

10 Haxby et al. Science 2001

11 Split-half analysis of multi voxel patterns:
within-category correlation = reproducibility. [Figure: % signal vs. voxel # for face, cat, house, and chair patterns, plotted separately for odd and even runs.] Haxby et al. Science 2001

12 Split-half analysis of multi voxel patterns:
between-category correlation = discriminability. [Figure: % signal vs. voxel # for face, cat, house, and chair patterns, plotted separately for odd and even runs.] Haxby et al. Science 2001

13 Haxby et al. Science 2001

14 Simple Correlation Analysis
Measure within-category correlations: within faces (F1:F2), within cats (C1:C2). Measure between-category correlations: between faces and cats (F1:C2; F2:C1). If within-category correlations > between-category correlations, conclude that the area encodes the different stimuli. [Figure: cross-correlation matrix with rows and columns labeled face, cat, house, chair.]
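A minimal numpy sketch of this split-half logic, using simulated patterns (all numbers hypothetical; Haxby et al. used real odd/even-run estimates):

```python
import numpy as np

# odd[c], even[c]: mean multivoxel pattern (one value per voxel) for
# category c, estimated from odd and even runs separately (simulated here).
rng = np.random.default_rng(0)
categories = ["face", "cat", "house", "chair"]
n_vox = 100
true = {c: rng.standard_normal(n_vox) for c in categories}
odd = {c: true[c] + 0.5 * rng.standard_normal(n_vox) for c in categories}
even = {c: true[c] + 0.5 * rng.standard_normal(n_vox) for c in categories}

# Cross-correlation matrix: rows = odd-run patterns, cols = even-run patterns.
# Diagonal = within-category correlations (reproducibility);
# off-diagonal = between-category correlations (discriminability).
rsm = np.array([[np.corrcoef(odd[a], even[b])[0, 1] for b in categories]
                for a in categories])
within = np.diag(rsm).mean()
between = rsm[~np.eye(len(categories), dtype=bool)].mean()
print(f"within = {within:.2f}, between = {between:.2f}")  # within > between
```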

This cross-correlation matrix later got a new name: the representational similarity matrix (RSM; Kriegeskorte 2008)

16 What values do you include to represent multivoxel patterns?
Beta values / raw amplitudes
Subtracted betas
T-values
Z-scores

17 Normalization: or why we want to discount between-voxels effects
[Figure: raw betas vs. amplitude relative to the mean response, for category and position conditions.] Sayres & Grill-Spector, JNP 2008

18 Normalization: or why we want to discount between-voxels effects
[Figure: raw amplitude vs. amplitude relative to the mean response.] Sayres & Grill-Spector, JNP 2008
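One simple way to discount between-voxel effects is to remove each voxel's mean response across conditions before comparing patterns. A sketch under that assumption (the exact normalization used in Sayres & Grill-Spector 2008 may differ):

```python
import numpy as np

# Hypothetical betas: rows = conditions, columns = voxels.
betas = np.array([[2.0, 0.5, 1.0],    # condition 1
                  [2.4, 0.3, 1.2],    # condition 2
                  [1.6, 0.7, 0.8]])   # condition 3

# Subtracting each voxel's mean across conditions removes overall
# responsiveness differences between voxels, keeping only each voxel's
# relative preferences -- the part MVPA cares about.
relative = betas - betas.mean(axis=0)

# Alternatively, z-score each voxel across conditions.
zscored = relative / betas.std(axis=0, ddof=1)
print(relative)
```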

19 Decoding: Can you classify what category the subject saw from brain MVPs?

20 Classification Training set: example multivoxel patterns (vectors whose dimensionality is the number of voxels), each labeled with a class.

21 Classification Training set: example multivoxel patterns (vectors whose dimensionality is the number of voxels), each labeled with a class. Find a rule that separates the classes.

22 Classification Training set: example multivoxel patterns (vectors whose dimensionality is the number of voxels), each labeled with a class. Find a rule that separates the classes. Testing set: new multivoxel patterns. For each pattern, determine to which class it belongs.

Training: block design, 1-back task. Testing: event-related design. [Figure: multivoxel patterns for subject S1, right and left hemispheres (A = anterior, P = posterior, M = medial, L = lateral); color bar shows z-scores from -6 to 6.] Winner-take-all (WTA) classifier: train on data from one session, then determine the category in a different session (and experiment) based on the highest correlation between the test pattern and the training patterns. Chance level is 25%.
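A minimal sketch of such a winner-take-all correlation classifier (variable names are hypothetical):

```python
import numpy as np

def wta_classify(train_patterns, test_pattern):
    """train_patterns: one training-session pattern per class
    (rows = classes, columns = voxels). Returns the index of the
    class whose pattern correlates most with the test pattern."""
    r = [np.corrcoef(p, test_pattern)[0, 1] for p in train_patterns]
    return int(np.argmax(r))

# With 4 categories, guessing is correct 1 time in 4 -- the 25% chance
# level quoted above.
```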

24 Classifier [% correct]
A winner-take-all classifier successfully decodes object category from distributed responses across the ventral stream, but not from V1 or an anatomical control ROI. [Figure: classifier accuracy (% correct; n = 7) for medial VTC, lateral VTC, whole VTC, V1, and the control ROI (OTS and CoS marked on anatomy), with the chance level indicated; * significantly > chance, P < 0.05.] Weiner & Grill-Spector, NeuroImage 2010
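One common way to test whether group decoding accuracy exceeds chance is a one-sample t-test across subjects; a sketch with made-up accuracies (the exact statistics used in the original paper may differ):

```python
import numpy as np
from scipy import stats

# Hypothetical per-subject decoding accuracies (n = 7) vs. 25% chance.
acc = np.array([0.55, 0.62, 0.48, 0.70, 0.58, 0.65, 0.52])
t, p = stats.ttest_1samp(acc, popmean=0.25)
print(f"t = {t:.2f}, p = {p:.4f}")   # two-sided p; halve for a one-sided test
```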

25 Classifier
An algorithm that finds the rule separating the classes. Based on this rule, new data points can be labeled "Class 1" or "Class 2". Suppose we have a 2D space of points (axes: voxel 1 and voxel 2). We want an algorithm that separates them into 2 classes.

26 Nearest Neighbor Classifier
Measure the distance from the new data point to each of your classes (usually to the class centers) and choose the closest one. [Figure: two classes of points in voxel 1 × voxel 2 space.]
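A sketch of this nearest-centroid rule in numpy (scikit-learn's NearestCentroid implements the same idea); all names are hypothetical:

```python
import numpy as np

def nearest_centroid_predict(X_train, y_train, x_new):
    """X_train: (n_patterns, n_voxels); y_train: class labels.
    Assign x_new to the class with the nearest mean pattern."""
    classes = np.unique(y_train)
    centers = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(centers - x_new, axis=1)   # Euclidean distances
    return classes[np.argmin(dists)]
```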

27 Linear Classifier
Find a line (a hyperplane, in higher dimensions) that separates the two classes. All points on one side of the line are labeled "class 1" and all points on the other side "class 2". [Figure: voxel 1 × voxel 2 space.]

28 Linear Classifier
H3 is not a separating line (hyperplane, for spaces with more than 2 dimensions). [Figure: candidate boundaries H1, H2, H3 in voxel 1 × voxel 2 space.]

29 Linear Classifier
H1 is a separating line (hyperplane), but note that some points lie close to the boundary. [Figure: "Class 1" and "Class 2" in voxel 1 × voxel 2 space.]

30 Linear Classifier
H2 is the optimal separating line (hyperplane) because its distance to the closest data points from each class is as large as possible. [Figure: "Class 1" and "Class 2" in voxel 1 × voxel 2 space.]

31 Support Vector Machine (SVM)
Maximum-margin hyperplane and margins for an SVM trained with samples from two classes. To find the separating hyperplane w·x + b = 0, you need to find w and b, given the training points x_i with known labels y_i ("class 1" = +1, "class 2" = -1). The distance between the classes is twice the distance from the hyperplane to the closest examples: margin = 2/‖w‖. To get the best separating hyperplane, you want to minimize ‖w‖ subject to y_i(w·x_i + b) ≥ 1 for every training point. An SVM is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. The points on the margin are called the support vectors.
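A minimal linear-SVM sketch on toy 2D "voxel" data using scikit-learn (the data are simulated; a large C approximates the hard-margin case above):

```python
import numpy as np
from sklearn.svm import SVC

# Two simulated classes in a 2D (voxel 1, voxel 2) space.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)),   # class 1
               rng.normal(2.0, 0.5, (20, 2))])  # class 2
y = np.array([0] * 20 + [1] * 20)

clf = SVC(kernel="linear", C=1e3).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]   # hyperplane: w . x + b = 0
print(2 / np.linalg.norm(w))             # width of the margin, 2/||w||
print(clf.support_vectors_)              # the points lying on the margin
print(clf.predict([[1.0, 1.0]]))         # classify a new pattern
```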

32 Support Vector Machine (SVM)
Support vector machines (SVMs) are a set of supervised learning methods used for classification.

33 Support Vector Machine (SVM)
Support vector machines (SVMs) are a set of supervised learning methods used for classification. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other.

34 Support Vector Machine (SVM)
Support vector machines (SVMs) are a set of supervised learning methods used for classification. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible.

35 Support Vector Machine (SVM)
Support vector machines (SVMs) are a set of supervised learning methods used for classification. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on. The separating plane between the two categories is called a hyperplane.

36 Support Vector Machine (SVM)
Support vector machines (SVMs) are a set of supervised learning methods used for classification. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on. The separating plane between the two categories is called a hyperplane. Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the nearest training data points of any class (the so-called functional margin), since in general the larger the margin, the lower the generalization error of the classifier. Training samples on the margin are called the support vectors.

37 Support Vector Machine (SVM)
Support vector machines (SVMs) are a set of supervised learning methods used for classification. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that predicts whether a new example falls into one category or the other. An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on. The separating plane between the two categories is called a hyperplane. Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the nearest training data points of any class (the so-called functional margin), since in general the larger the margin, the lower the generalization error of the classifier. Training samples on the margin are called the support vectors.

38 Average fMRI response from V1 does not have orientation information
One of the reasons MVPA became famous is that Kamitani & Tong (Nature Neuroscience 2005) claimed that you can decode orientation from V1 using MVPA, despite the fact that orientation columns are smaller than voxels. [Figure: orientation columns measured with optical imaging; the average fMRI response from V1 carries no orientation information.]

39 Kamitani & Tong, Nature Neuroscience 2005:
Decoding orientation from V1 MVPs. [Figure: orientation columns (optical imaging); small orientation biases in individual V1 voxels.]

40 So while each V1 voxel has only a small orientation bias, using MVPA one can decode from the MVP of V1 voxels which orientation the subject is viewing. Kamitani & Tong, Nature Neuroscience 2005

41 In each panel, the top inset shows the stimulus
In each panel, the top inset shows the stimulus; the bottom shows the predicted orientation based on 400 voxels from V1 and V2. The RMSE of classification was around 20 degrees per subject. Kamitani & Tong, Nature Neuroscience 2005

42 MVPA picks up between-voxel patterns (or maps) rather than within-voxel biases
However, Jeremy Freeman, David Heeger, and Elisha Merriam have pushed back on the interpretation that MVPA gives you subvoxel resolution. In fact, they show that what is being picked up in MVPA is between-voxel patterns (or maps) rather than within-voxel biases. Orientation and angular-position topographic maps for a single subject, shown on a flattened representation of the occipital lobe. A, Responses to phase-encoded oriented gratings (shown in inset). The stimulus cycled through 16 steps of orientation, ranging from 0° to 180° every 24 s. The map is thresholded at a coherence of 0.3. B, Responses to a double-wedge retinotopy stimulus (shown in inset). The stimulus cycled through 16 steps of angular position, ranging from 0° to 180° every 24 s. The map is thresholded at a coherence of 0.68 to account for differences in signal-to-noise ratio between the two experiments (see Materials and Methods). Color indicates the phase of the best-fitting sinusoid; white lines indicate the V1/V2 boundaries. Freeman J et al., J. Neurosci. 2011;31

43 Orientation map matches angular-position map.
The angular-position map is necessary and sufficient for classification; the orientation map matches the angular-position map. A, Circular correlation between orientation and angular-position maps: preferred orientation is plotted against preferred angular position. Each dot corresponds to a voxel from an annular region of interest in V1, defined using data from an independent eccentricity-mapping experiment. Data were combined across subjects (n = 3). Dot color indicates coherence from the orientation-mapping experiment. B, Response amplitude (percentage change in image intensity), averaged across voxels and across subjects (n = 3), as a function of temporal frequency for angular-position (top) and orientation mapping (bottom). The red dot indicates the stimulus frequency. C, Line-vector plot showing the radial bias of orientation preference in the visual field. Each line corresponds to a voxel: the center of each line is the voxel's retinotopic position (angular position and eccentricity), and the angle of each line indicates the voxel's orientation preference. Line color and size indicate coherence from the orientation-mapping experiment. Data were combined across n = 3 subjects. Freeman J et al., J. Neurosci. 2011;31

44 Limitations of decoding MVPs
Limited by voxel "biases" at the resolution you measure. These biases need to be reproducible.

45 Limitations of decoding MVPs
Limited by voxel "biases" at the resolution you measure. These biases need to be reproducible. Useful for applications like prosthetics and mind reading, but does it tell us how the brain works? We need generative/encoding models of the brain (e.g., Kay et al., Nature, 2008).

46 Limitations of decoding MVPs
Limited by voxel "biases" at the resolution you measure. These biases need to be reproducible. Useful for applications like prosthetics and mind reading, but does it tell us how the brain works? We need generative/encoding models of the brain (e.g., Kay et al., Nature, 2008). A brain area may contain distributed information, but the brain may not use this information the way a classifier does. In other words, does the brain operate like a classifier? Like an SVM? What about non-linear classifiers?

47 Neural responses to visual scenes reveal inconsistencies between fMRI adaptation and multivoxel pattern analysis. Epstein RA, Morgan LK. Neuropsychologia. 2012 Mar;50(4):530-43.

48 Epstein RA, Morgan LK. Neuropsychologia. 2012 Mar;50(4):530-43.
Neural responses to visual scenes reveal inconsistencies between fMRI adaptation and multivoxel pattern analysis. Epstein RA, Morgan LK. Neuropsychologia. 2012 Mar;50(4):530-43. MVPA reveals stronger category than landmark information; fMRI adaptation reveals adaptation to landmarks, not to category.

