MVPD – Multivariate pattern decoding 23.4.2009 Christian Kaul MATLAB for Cognitive Neuroscience.

Outline
– What is MVPD?
  – What types of classifiers are there?
– MVPD in fMRI
– How to design an experiment – a few examples
– The MVPD MATLAB toolbox
– Common problems when thinking about MVPD of fMRI data
– Relevant introduction papers

What is MVPD?
– A methodology in which an algorithm is trained to tell two or more conditions apart.
– The trained algorithm is then presented with a new set of data and categorises/classifies it into the conditions learned previously.
– MVPD is a relatively new tool in fMRI; note, however, that pattern classification as such has long been developed and used in artificial intelligence and neural-network research.

What types of classifiers are there?
The most common classifiers used for fMRI data are:
– LDA (linear discriminant analysis)
– SVM (support vector machines)
– SLR (sparse logistic regression)
All of them generally perform well. SLR and LDA find solutions based on linear combinations of features only, whereas SVMs can also take non-linear effects into account. This is largely done by mapping the data into a higher-dimensional space (feature space). The SVM training objective: maximize the margin!
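For illustration, a minimal sketch of training and testing LDA and a linear SVM on an [examples x voxels] pattern matrix. This assumes MATLAB's Statistics and Machine Learning Toolbox (fitcdiscr/fitcsvm, from releases newer than this 2009 course); the variables Xtrain, ytrain, Xtest, ytest are hypothetical.
    % Xtrain/Xtest: [examples x voxels]; ytrain/ytest: numeric condition labels (assumed)
    ldaModel = fitcdiscr(Xtrain, ytrain);                    % linear discriminant analysis
    svmModel = fitcsvm(Xtrain, ytrain, ...
                       'KernelFunction', 'linear');          % linear SVM (max-margin)
    predLDA  = predict(ldaModel, Xtest);
    predSVM  = predict(svmModel, Xtest);
    accLDA   = mean(predLDA == ytest);                       % proportion correctly decoded
    accSVM   = mean(predSVM == ytest);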

Non-linear SVMs – feature space
(Two worked examples were shown on the slide.)
Downside of non-linear SVMs: there are more and more parameters to be optimized during learning.
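As a hedged illustration of those extra parameters, continuing the hypothetical Xtrain/ytrain example above: an RBF-kernel SVM adds a kernel scale on top of the regularisation constant, and both need tuning (values below are purely illustrative).
    % RBF-kernel SVM: KernelScale and BoxConstraint now both need to be optimized.
    svmRBF = fitcsvm(Xtrain, ytrain, ...
                     'KernelFunction', 'rbf', ...
                     'KernelScale', 5, ...         % width of the RBF kernel (illustrative)
                     'BoxConstraint', 1);          % regularisation strength (illustrative)
    accRBF = mean(predict(svmRBF, Xtest) == ytest);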

MVPD in fMRI
In situations where we do find a univariate effect, a multivariate analysis is unlikely to reveal anything new! But when conventional analysis is not feasible, multivariate analysis might be an option.
What are we actually measuring? What does a "pattern of brain activity" mean?
Example: visual feature-selective information is present in the BOLD signal.

fMRI of basic visual features
Conventional analysis was long thought to be infeasible because of fMRI's limited spatial resolution compared with invasive single-cell recordings. (Haynes & Rees, 2006)

fMRI of basic visual features
[Figures: mean-signal analysis vs. pattern-based multivariate decoding with LDA; Haynes & Rees (2005), Kamitani & Tong (2006).]
Often, multivariate results are presented ROI-specifically...

Multivariate pattern analysis – how to design an experiment
Does the pattern of activity contain meaningful information we can extract?
– It is not the level of brain activity that is addressed, but the pattern of information within the activity.
Questions that can be answered with multivariate pattern analysis:
– "What have I seen?" Decoding of visual input – the majority of publications.
– "What have I heard/felt/…?" Decoding of other sensory input should be possible.
– "What am I going to do next?" Decisions seem to be coded in distinctive patterns of brain activity.

More interesting questions?
Does the feature-selective information contained in the BOLD signal for an irrelevant stimulus change under different levels of attentional load in a central task?

Experiment 1
Prediction (from load theory):
– Feature-selective information should be reduced in the high-load condition.
[Predicted bar plot: decoding accuracy (% correct) for ROIs V1, V2, V3 under low vs. high load, with chance level indicated.]

Decoding result
Result: feature-selective information was NOT reduced.
[Bar plot: expected vs. actual decoding accuracy for V1, V2, V3 under low vs. high load, with chance level indicated.]

Example 2 – intentions
Question: At the beginning of each trial, the word "select" was presented, instructing the subjects to freely and covertly choose one of two possible tasks, addition or subtraction. From the later button press it was possible to determine the covert intention of the subject during the preceding delay period.
Decoding objective:
– Can the subjects' decision be decoded?
Haynes et al., 2007c

Example 2, result
In the anterior medial prefrontal cortex, decoding was highest during the delay (green bars) but at chance level during task execution (red bars), after onset of the task-relevant stimuli. In contrast, the posterior and superior medial prefrontal cortex (MPFCp) encoded the chosen task only once it had entered the stage of execution, not during the delay period.
Results were obtained with the "searchlight" approach: a spherical searchlight centered on each voxel in turn defines a local neighborhood, and decoding is performed on the voxels within that sphere.
Haynes et al., 2007c
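A rough sketch of how such a searchlight neighborhood can be defined in MATLAB. This is purely illustrative: mask, data, labels and the helper local_decode are hypothetical (any classifier with cross-validation could stand in for the helper), and the radius value is arbitrary.
    % mask: 3-D logical brain mask; data: [examples x voxels], columns in find(mask) order (assumed)
    radius = 3;                                              % searchlight radius in voxels (illustrative)
    [x, y, z] = ind2sub(size(mask), find(mask));             % coordinates of in-brain voxels
    coords = [x y z];
    accuracyMap = zeros(size(coords, 1), 1);
    for c = 1:size(coords, 1)
        d2 = sum(bsxfun(@minus, coords, coords(c, :)).^2, 2);   % squared distance to centre voxel
        sphere = d2 <= radius^2;                                % voxels inside this sphere
        % decode using only the local neighborhood (hypothetical helper function):
        accuracyMap(c) = local_decode(data(:, sphere), labels);
    end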

Example 3 – voxel-based tuning functions
Serences et al., 2008
Tuning functions resembling those from monkey recordings, obtained with fMRI!

Example 4 – real-time reconstruction of seen images
Miyawaki et al., 2008

The MVPD MATLAB "toolbox"
MATLAB functions to perform MVPD with "any" suitable data. Presented here is the basic control script:
– It is quite easy to follow the workflow in this control script, which demonstrates what MVPD using SLR can look like.
If anyone is interested in working with the code, please contact me directly:
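The toolbox script itself is not reproduced in this transcript. As a generic, hedged sketch of the kind of workflow such a control script follows (leave-one-run-out cross-validation over an [examples x voxels] matrix): the actual toolbox uses SLR, whereas here a linear SVM stands in for it, and data, labels and runs are illustrative variable names.
    % data: [examples x voxels]; labels: condition per example; runs: run index per example (assumed)
    runList = unique(runs);
    acc = zeros(numel(runList), 1);
    for r = 1:numel(runList)
        testIdx  = (runs == runList(r));                 % leave one run out for testing
        trainIdx = ~testIdx;
        mdl      = fitcsvm(data(trainIdx, :), labels(trainIdx), ...
                           'KernelFunction', 'linear');  % linear SVM in place of SLR
        pred     = predict(mdl, data(testIdx, :));
        acc(r)   = mean(pred == labels(testIdx));
    end
    fprintf('Mean decoding accuracy: %.1f%%\n', 100 * mean(acc));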

Common problems when thinking about MVPD of fMRI data
– Decoding of what? TRs, block averages, or betas?
– Overfitting – too many features for too few data samples.
– Voxel selection.

Common problems when thinking about MVPD of fMRI data: TR, BLOCK or BETA?
In principle there are three different strategies for obtaining your brain patterns: single TRs (raw data), averaged blocks of TRs, or betas (SPM estimates).
[Slide figure: trade-off between number of observations and noise for single TRs, averaged blocks, and betas.]
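For example, collapsing the single TRs of each block into one averaged pattern per block might look like the following sketch (illustrative variable names: X is a [TRs x voxels] time series and blockId assigns each TR to a block).
    blocks = unique(blockId);
    Xblock = zeros(numel(blocks), size(X, 2));           % one averaged pattern per block
    for b = 1:numel(blocks)
        Xblock(b, :) = mean(X(blockId == blocks(b), :), 1);   % average TRs within the block
    end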

Common problems when thinking about MVPD of fMRI data: OVERFITTING
– (1) An SVM classifier is unstable on a small training set.
– (2) The SVM's optimal hyperplane may be biased when the positive samples are far fewer than the negative samples.
– (3) Overfitting happens because the number of feature dimensions is much higher than the size of the training set.

Over-fitting and under-fitting
To avoid overfitting, cross-validation is used to evaluate the fit obtained with each parameter setting tried during the grid or pattern search.
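A hedged sketch of such a cross-validated grid search over the SVM box constraint, assuming fitcsvm/kfoldLoss from the Statistics and Machine Learning Toolbox; X, y and the grid values are illustrative.
    Cs = logspace(-3, 3, 7);                             % candidate regularisation values
    cvErr = zeros(size(Cs));
    for i = 1:numel(Cs)
        cvModel  = fitcsvm(X, y, 'KernelFunction', 'linear', ...
                           'BoxConstraint', Cs(i), ...
                           'KFold', 5);                  % 5-fold cross-validation
        cvErr(i) = kfoldLoss(cvModel);                   % mean classification error over folds
    end
    [~, best] = min(cvErr);
    bestC = Cs(best);                                    % parameter value with lowest CV error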

Common problems: VOXEL SELECTION (LDA & SVM)
To reduce the dimensionality of the feature input (the number of voxels) it is common to preselect voxels:
– ROI-based selection of voxels. But: the ROI must be defined independently of the classification.
– Threshold-based selection of voxels. But: the threshold must be computed independently of the classification (see the sketch below).
– Searchlight approach: a fixed sphere is moved over the brain, voxel by voxel. But: multiple comparisons!
SLR does not have this problem thanks to automatic relevance determination.
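For the ranking/threshold-based strategy, a minimal two-class sketch of selecting voxels on the training data only (assumes ttest2 from the Statistics and Machine Learning Toolbox; nKeep and the Xtrain/ytrain/Xtest variables are illustrative).
    nKeep = 500;                                              % number of voxels to keep (illustrative)
    [~, ~, ~, stats] = ttest2(Xtrain(ytrain == 1, :), ...     % column-wise t-tests across voxels,
                              Xtrain(ytrain == 2, :));        % computed on the TRAINING data only
    [~, order] = sort(abs(stats.tstat), 'descend');           % rank voxels by |t|
    keep = order(1:nKeep);
    XtrainSel = Xtrain(:, keep);
    XtestSel  = Xtest(:, keep);     % apply the SAME selection to the test data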

Relevant introduction papers
– Mur M, Bandettini PA, Kriegeskorte N. Revealing representational content with pattern-information fMRI – an introductory guide.
– Pereira F, Mitchell T, Botvinick M. Machine learning classifiers and fMRI: a tutorial overview.
– Yamashita O, Sato MA, Yoshioka T, Tong F, Kamitani Y. Sparse estimation automatically selects voxels relevant for the decoding of fMRI activity patterns.

Thanks – enjoy this sunny afternoon!