Mind Reading with fMRI
Ken Norman, Department of Psychology, Princeton University
May 1, 2007

Brain Scanning
Today's topic: applying pattern classifiers to brain scanning data to decode the information represented in a person's brain at a particular point in time. This is NOT the standard approach.
Standard approach:
– Stick someone in the scanner
– Have them perform a cognitive task
– Explore which brain regions are engaged by the cognitive task

Brain Scanning
If you're interested in memory retrieval:
– Scan people while they're retrieving memories
– Scan people during a control condition
– Look at which brain regions respond differentially
This approach has been very productive for cognitive neuroscience.

Brain Scanning
Alternative approach to analyzing brain scanning data: use pattern classification algorithms, applied to distributed patterns of neural activity, to identify the neural signatures of particular thoughts and memories. Once we have trained the classifier to recognize a particular thought, we can use the classifier to track the comings and goings of that thought over time.

Motivation
Why pattern classification? Reason #1: improve the interface between fMRI and cognitive theories. Cognitive neuroscientists have developed very detailed theories of how information is processed in the brain:
– What information is represented in different brain structures?
– How is it represented?
– How is that information transformed at different stages of processing?
To directly test these theories, we need a way of decoding the informational contents of the subject's brain state.

Motivation
Reason #2: we aren't doing as good a job of data mining fMRI data as we could... We collect several GB of information from each subject, and there is a lot of information about subjects' thoughts buried in these big data files; the challenge is how to extract it. Machine learning researchers have developed tremendously powerful algorithms for extracting meaningful regularities from large data sets, yet these algorithms are not routinely used in fMRI data analysis.

Outline
– 3-minute overview of functional MRI
– Brief overview of existing research on fMRI pattern classification
– Technical challenges & machine learning issues

Brain Scanning 101

How do we image neural activity with functional MRI? Brain regions that are active use up more metabolic resources; in particular, they use up more oxygen from the blood. The MRI machine can be tuned to detect the difference between oxygenated and deoxygenated blood. By looking at which brain areas have deoxygenated vs. oxygenated blood, we can get a sense of which brain areas are active at a particular moment.

Brain Scanning 101
It takes approx. 2 seconds for the MRI machine to take a snapshot of blood flow (across the entire brain).

fMRI images
A big cube, made out of a grid of little cubes:
– Pixel = one square in a 2D grid (picture element)
– Voxel = one of the tiny little cubes in an fMRI image (like a volumetric pixel)
Voxels are approx. 3 millimeters on each side (neurons are ~10 micrometers), so each voxel reflects the aggregate activity of a very large number of neurons. We aren't directly measuring neural activity, we are measuring blood flow! The blood flow response is smeared out in time (peak response ~6 sec after neural activity).
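For concreteness, here is a minimal sketch (Python, assuming the numpy and nibabel libraries; the file name and voxel coordinates are purely illustrative) of what one of these 4D datasets looks like as an array:

```python
import nibabel as nib

# Hypothetical 4D functional image: three spatial dimensions plus time.
img = nib.load("subject01_run01.nii.gz")   # file name is illustrative only
data = img.get_fdata()                     # shape: (x, y, z, time), e.g. (64, 64, 34, 300)

n_x, n_y, n_z, n_trs = data.shape
print(f"{n_x * n_y * n_z} voxels, {n_trs} time points (one snapshot every ~2 s)")

# Each voxel's value at a given time point is a BOLD (blood flow) measurement,
# reflecting the aggregate activity of very many neurons, smeared over seconds.
voxel_timecourse = data[30, 30, 17, :]     # one voxel's signal across the run
```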

Patterns in the brain Key idea: Cognitive states correspond to distributed patterns of brain activity What do these patterns in the brain look like?

The Eight Categories Study (Haxby et al., 2001)
Faces, Cats, Scissors, Chairs, Houses, Bottles, Shoes, Scrambled Pictures
(slides courtesy of Jim Haxby)

Accuracy of Category Identification
[Figure: identification accuracy ± SE for each category, with the chance level marked.]
Overall accuracy = 96% (slides courtesy of Jim Haxby)

Our Studies
We set out to extend the basic pattern classification method. The brain patterns from the Haxby study correspond to several minutes' worth of brain activity; we wanted to see if we could classify cognitive states based on single brain images (reflecting ~2 seconds' worth of neural activity).

Pattern Classification Method
General approach: say that we want to be able to track the presence of two different cognitive states in the subject's brain (e.g., viewing shoes vs. bottles) using fMRI.

Pattern Classification Method
1. Acquire brain data while the subject is thinking about shoes or bottles

Pattern Classification Method
1. Acquire brain data
2. Convert each functional brain volume (~2 seconds' worth of data) into a vector that reflects the pattern of activity across voxels at that point in time. We typically do some kind of feature selection to cut down on the number of voxels.
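A sketch of step 2 (assuming the 4D `data` array from the earlier sketch plus a hypothetical boolean brain mask; not the lab's exact code):

```python
def volumes_to_patterns(data, mask):
    """Convert each ~2 s brain volume into a 1D vector of voxel activities.

    data: (x, y, z, time) array; mask: boolean (x, y, z) array of in-brain voxels.
    Returns an array of shape (n_timepoints, n_voxels), one pattern per volume.
    """
    patterns = data[mask]   # boolean indexing over space -> (n_voxels, n_timepoints)
    return patterns.T       # transpose so each row is one brain pattern
```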

Pattern Classification Method
1. Acquire brain data
2. Generate brain patterns
3. Label brain patterns according to whether the subject was viewing shoes vs. bottles (adjusting for the lag in the blood flow response)
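One simple way to do the lag adjustment in step 3 (a sketch, not necessarily the lab's exact procedure): shift the labels forward by a few TRs so that each brain pattern is paired with the stimulus presented roughly 6 seconds earlier.

```python
import numpy as np

def shift_labels_for_lag(labels, tr_seconds=2.0, lag_seconds=6.0):
    """Pair each brain volume with the stimulus shown ~lag_seconds earlier.

    labels: numpy array of condition labels, one per TR (e.g. 'shoe', 'bottle', 'rest').
    """
    lag_trs = int(round(lag_seconds / tr_seconds))   # ~3 TRs for a 2 s TR
    if lag_trs == 0:
        return labels.copy()
    shifted = np.empty_like(labels)
    shifted[:lag_trs] = labels[0]          # placeholder for the first few volumes
    shifted[lag_trs:] = labels[:-lag_trs]  # each volume gets the earlier stimulus label
    return shifted
```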

Pattern Classification Method
1. Acquire brain data
2. Generate brain patterns
3. Label brain patterns
4. Train a classifier to discriminate between bottle patterns and shoe patterns

Simple Neural Network Classifier (Logistic Regression)
To estimate how much subjects are thinking about bottles, compute a weighted sum of voxel activity values; do the same for shoes. Apply a decision rule (e.g., a sigmoid function). To train the classifier, we use a learning algorithm that sets the weights to maximize decision performance (e.g., backpropagation).
[Diagram: an input layer of voxels feeding an output layer with "Bottle" vs. "Shoe" units.]
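The two-unit network described here is essentially logistic regression on voxel activities. A minimal sketch with scikit-learn (the solver and regularization settings, and the `train_patterns`/`test_patterns` variable names, are illustrative assumptions rather than details from the talk):

```python
from sklearn.linear_model import LogisticRegression

# train_patterns: (n_timepoints, n_voxels); train_labels: 'shoe' or 'bottle' per time point.
# fit() plays the role of the learning algorithm that sets the voxel weights.
clf = LogisticRegression(C=1.0, max_iter=1000)
clf.fit(train_patterns, train_labels)

# "How much is the subject thinking about bottles right now?" -- a weighted sum of
# voxel activities passed through a sigmoid, one value per test-time brain pattern.
bottle_evidence = clf.predict_proba(test_patterns)[:, list(clf.classes_).index("bottle")]
```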

Pattern Classification Method
1. Acquire brain data
2. Generate brain patterns
3. Label brain patterns
4. Train the classifier
5. Apply the trained classifier to new brain patterns (not presented during training)
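Step 5 is a generalization test. A common way to organize it is leave-one-run-out cross-validation; this is a sketch under that assumption (variable names hypothetical), not a description of the lab's specific pipeline:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

# patterns: (n_timepoints, n_voxels); labels: condition per time point;
# runs: scanner-run index per time point (so test patterns come from a held-out run).
cv = LeaveOneGroupOut()
scores = cross_val_score(LogisticRegression(max_iter=1000),
                         patterns, labels, groups=runs, cv=cv)
print("Mean classification accuracy on held-out runs:", np.mean(scores))
```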

Free Recall & Mental Time Travel (Polyn et al., 2005)
How do we selectively retrieve memories from a particular event?
– Intuitively: we try to recapture our mindset from that event
– Concretely: we try to make our brain state during recall resemble our brain state during the original event ("mental time travel")
Goal of the study: use fMRI pattern analysis to image this process of mental time travel as it happens...

Imaging Mental Time Travel (Polyn et al., 2005)
Memory experiment: subjects study 3 types of stimuli (e.g., Jack Nicholson, the Giza pyramids, a flask). Recall test: recall items from all 3 categories, in any order.
Hypothesis: to recall a particular category, subjects try to recapture their mindset from the study phase. In concrete terms, subjects try to make their brain state at test resemble their brain state when they were studying that category. If subjects successfully recapture their brain state from the study phase, this will trigger recall of specific studied items...

Analysis strategy
Step 1: feed fMRI data from the study phase into a pattern classification algorithm. Train the classifier to recognize the brain patterns associated with studying faces vs. locations vs. objects.

Neural network classifier: mapping from voxel activity values to output units (one per category)

Analysis strategy
Step 2: apply the trained classifier to brain data from the retrieval phase. Use the classifier to track, second by second, how well the subject's brain state at retrieval matches their brain state when they were studying faces vs. locations vs. objects.
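A sketch of how second-by-second "traces" like the ones in the next figure can be produced (variable names are hypothetical; the actual study used a neural network classifier trained in-house):

```python
from sklearn.linear_model import LogisticRegression

# study_patterns / study_labels: brain patterns and category labels ('face',
# 'location', 'object') from the study phase; recall_patterns: one pattern per
# TR of the free-recall phase. All are assumed to exist already.
clf = LogisticRegression(max_iter=1000)
clf.fit(study_patterns, study_labels)

# One column per category: how well the current retrieval-phase brain state
# matches the brain states from studying faces vs. locations vs. objects.
traces = clf.predict_proba(recall_patterns)   # shape: (n_recall_trs, 3)
```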

Predictions
As subjects try to recall faces, locations, and objects, their brain state should come into alignment with the brain states associated with studying faces, locations, and objects. This neural measure of category-specific mental reinstatement should be predictive of recall.

Final free recall: classifier output
[Figure: classifier traces for Subject 9 during final free recall, showing the match to the face, location, and object study contexts over time.]

Other findings
Kamitani & Tong (2005): decoded the orientation of a striped pattern being viewed by the subject (accurate to within 20 degrees).

2006 Pittsburgh competition
Subjects were scanned while they watched 3 episodes of Home Improvement, and time-varying ratings were obtained for amusement, food, tools, faces... Goal: predict the ratings using brain data. Train a classifier using brain data + ratings from 2 episodes; then feed the trained classifier the brain data from the 3rd episode and use it to predict (in a second-by-second fashion) the subject's feature ratings.
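A hedged sketch of the train-on-two-episodes, test-on-the-third setup, using ridge regression as the predictor (one reasonable choice for continuous ratings, not necessarily what competition entrants used; variable names are illustrative):

```python
import numpy as np
from sklearn.linear_model import Ridge

# X_ep1, X_ep2, X_ep3: (n_trs, n_voxels) brain data for the three episodes;
# y_ep1, y_ep2, y_ep3: the time-varying rating for one feature (e.g. amusement).
model = Ridge(alpha=10.0)
model.fit(np.vstack([X_ep1, X_ep2]), np.concatenate([y_ep1, y_ep2]))

predicted = model.predict(X_ep3)
score = np.corrcoef(predicted, y_ep3)[0, 1]   # the competition's correlation metric
print("Correlation with the true ratings:", score)
```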

2006 competition
Some representative correlation values:
– Amusement: .46
– Faces: .67
– Language: .69
– Laughter: .58
– Motion: .49
– Music: .76
– Tools: .62

2007 Pittsburgh competition

Interim Summary
By applying classifiers to fMRI data, we can derive a time-varying estimate of the subject's cognitive state that relates in a meaningful way to their behavior.
Next: technical challenges

Technical Challenges
From the perspective of machine learning, fMRI classification is a particularly difficult problem (Mitchell et al., 2004, Machine Learning):
– Big patterns
– Noisy patterns
– Relatively few patterns
What can we do to improve classification?

Classifiers
We have tried lots of classifiers:
– Neural networks, correlation-based classifiers, support vector machines, Gaussian Naive Bayes, boosting, k-nearest-neighbor, linear discriminant analysis...
The exact classifier that we use doesn't seem to matter (much); nonlinear classifiers do not systematically outperform linear classifiers. Regularization helps (e.g., ridge regression outperforms normal regression).
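The ridge-versus-ordinary-regression point, as a sketch (labels coded as ±1 and regressed on voxel activities; data names and the alpha value are assumptions):

```python
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score

# patterns: (n_timepoints, n_voxels); y: labels coded as +1 (shoe) / -1 (bottle).
# With far more voxels than time points, unregularized regression overfits badly;
# the ridge (L2) penalty shrinks the voxel weights and generalizes better.
for name, model in [("ordinary regression", LinearRegression()),
                    ("ridge regression", Ridge(alpha=10.0))]:
    r2 = cross_val_score(model, patterns, y, cv=5).mean()
    print(f"{name}: mean cross-validated R^2 = {r2:.2f}")
```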

Feature Selection
Getting rid of noisy voxels greatly helps performance. Standard method: run a voxel-wise omnibus ANOVA on the conditions of interest (e.g., face vs. location vs. object) and get rid of voxels that don't vary significantly across conditions.
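A minimal sketch of ANOVA-based voxel selection with scikit-learn (the choice of k and the variable names are assumptions; the key point is that the selector is re-fit on training data only inside each fold):

```python
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# f_classif runs a one-way ANOVA per voxel across the conditions of interest;
# SelectKBest then keeps the k voxels with the strongest condition effect.
clf = make_pipeline(SelectKBest(f_classif, k=500),
                    LogisticRegression(max_iter=1000))
clf.fit(train_patterns, train_labels)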

Feature Selection
This ANOVA method helps, but it has several problems. The main benefit of linear classifiers is that they can aggregate weak signals across voxels; in light of this, it seems like a bad idea to discard individual voxels just because each voxel's signal is weak...

Feature Selection
What we really want is a multivariate means of voxel selection: we want to select sets of voxels that, in aggregate, carry useful information. Promising approach: searchlights (Kriegeskorte et al., 2006, PNAS).
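A bare-bones searchlight sketch (not Kriegeskorte's implementation; it assumes the 4D data, mask, and labels from the earlier sketches, uses a classifier rather than the original correlation-based statistic, and ignores efficiency):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def searchlight_map(data, mask, labels, radius=2):
    """Cross-validated accuracy for a small sphere of voxels centred on each voxel.

    data: (x, y, z, time) array; mask: boolean (x, y, z) array; labels: one per TR.
    Returns an accuracy map with the same spatial shape as mask.
    """
    acc_map = np.zeros(mask.shape)
    coords = np.argwhere(mask)                       # (n_mask_voxels, 3) coordinates
    for cx, cy, cz in coords:
        # All in-mask voxels within `radius` (in voxels) of the centre voxel.
        dist = np.linalg.norm(coords - np.array([cx, cy, cz]), axis=1)
        sphere = coords[dist <= radius]
        X = data[sphere[:, 0], sphere[:, 1], sphere[:, 2], :].T   # (n_trs, n_sphere_voxels)
        acc_map[cx, cy, cz] = cross_val_score(
            LogisticRegression(max_iter=1000), X, labels, cv=3).mean()
    return acc_map
```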

Dimensionality Reduction
We are also exploring different methods of re-coding the data. There is extensive redundancy across voxels (especially spatially proximal voxels). Is there a more efficient way to represent the input (i.e., with fewer dimensions)?
– Manifold learning
– Spatial wavelet decomposition
– ICA
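For the PCA/ICA end of this list, a short sketch (the component counts are arbitrary illustrations, not values from the talk):

```python
from sklearn.decomposition import PCA, FastICA

# patterns: (n_timepoints, n_voxels). Because nearby voxels are highly redundant,
# a few hundred components can capture most of the variance.
pca = PCA(n_components=100)
reduced = pca.fit_transform(patterns)          # (n_timepoints, 100)

ica = FastICA(n_components=40, max_iter=1000)  # unmix into statistically independent sources
components = ica.fit_transform(patterns)       # (n_timepoints, 40)
```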

Dimensionality reduction algorithms
Generative models (David Weiss & David Blei):
– Each brain state is modeled as a linear combination of neural "topics"
– Each topic = a pattern of voxel activity across the whole brain (positive and negative values are OK)
– To generate a brain state from topics, multiply each topic by a positive value
– Topics are constrained to be spatially sparse (L2 regularization; trying L1 also)

Next steps
We know a lot about the brain (in general), the fMRI response, and cognition that we are not telling the classifier...
– Currently: each brain pattern is treated as a distinct observation
– In actuality: there is massive correlation between adjacent time points; knowing the information represented at time n tells you a lot about the information represented at time n + 1
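The talk points toward richer models (graphical models, below) for exploiting this; as a crude illustration only, one could smooth the classifier's evidence over adjacent time points:

```python
import numpy as np

def smooth_traces(traces, window=3):
    """Moving average over time of classifier evidence.

    traces: (n_timepoints, n_categories) array, e.g. the predict_proba output above.
    Exploits the fact that the state at time n is informative about time n + 1.
    """
    kernel = np.ones(window) / window
    return np.column_stack([np.convolve(traces[:, k], kernel, mode="same")
                            for k in range(traces.shape[1])])
```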

Next steps
In addition to temporal correlation, there is extensive spatial correlation: nearby voxels tend to represent similar things. One way to address this is to spatially smooth the data (averaging together activity from nearby voxels); however, you can lose information this way. A more sophisticated approach would be to directly measure pairwise correlations between voxels and incorporate this information into the model.
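A one-line sketch of the simple spatial-smoothing option (assuming the 4D `data` array from earlier; the kernel width is arbitrary):

```python
from scipy.ndimage import gaussian_filter

# Smooth across the three spatial dimensions only (sigma=0 along time),
# averaging information from nearby, correlated voxels. As noted above,
# smoothing can also blur away fine-grained patterns.
smoothed = gaussian_filter(data, sigma=(1.5, 1.5, 1.5, 0.0))
```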

Next steps
Currently, our analyses are focused on single subjects. Is there some way to leverage data from other subjects to help with classification? If you run 10 subjects in the Haxby 8-category experiment, none of the subjects will have exactly the same shoe representation, but the shoe representations are not random either. It might be possible to draw on data from other subjects to set priors on which voxels will be involved in representing shoes.

Next steps
Also, there is an enormous body of evidence relating to which brain structures are involved in a given cognitive task (e.g., the face area, the place area). We can use this information to set priors on voxel weights in the classification process.

Next steps
The cognitive states that we are trying to classify often have a hierarchical structure: how you represent a stimulus depends on the task that you are performing. Informing the classifier about this hierarchical structure should boost classification.

Next steps
Different tasks (dangerous/safe, land/water) have different neural signatures. If we can detect the neural signatures of these tasks, we can conditionalize the classifier on which task representation is present in the subject's head.

Next steps
Lots of potential constraints:
– Temporal autocorrelation
– Correlation between nearby voxels
– Data from other subjects in the same experiment
– Data from other experiments
– Hierarchical structure of cognitive states
How do we inform the classifier of these constraints? Graphical models should provide a way of doing this...

Summary
By applying pattern classification algorithms to neuroimaging data, we can extract a tremendous amount of information about what subjects are thinking and how their thoughts evolve over time. There is plenty of room for improvement... Solving these problems will require meaningful contributions from several disciplines: cognitive psychology, neuroscience, machine learning, engineering, signal processing, statistics, and mathematics.

Computational Memory Lab: Michael Bannert, Melissa Carroll, Denis Chigirev, Greg Detre, Chris Moore, Ehren Newman, Joel Quamme, Susan Robison, Per Sederberg, Matt Weber, David Weiss, and many others.
Princeton Colleagues: David Blei (Comp. Sci.), Matt Botvinick (PSY), Jon Cohen (PSY), Ingrid Daubechies (Math), Jim Haxby (PSY), Fei-Fei Li (Comp. Sci.), Dan Osherson (PSY), Peter Ramadge (EE), Rob Schapire (Comp. Sci.), Greg Stephens (Physics)

My Princeton Multi-Voxel Pattern Analysis Toolkit is currently in public beta-testing.
The NiAM (NeuroImaging Analysis Methods) group meets Fridays, 2pm.