Pattern Recognition for NeuroImaging (PR4NI) Interpreting predictive models in terms of anatomically labelled regions J. Schrouff Stanford University,

Presentation transcript:

Pattern Recognition for NeuroImaging (PR4NI) Interpreting predictive models in terms of anatomically labelled regions J. Schrouff Stanford University, CA, USA

Workflow: Data → Feature extraction/selection → Model specification → Accuracy and significance → Model interpretation

Where in the brain is the information of interest? That is, which brain areas contribute most to the model’s prediction?

Computing “weight maps”:
- From linear (kernel) methods
- Attributing one weight per feature
- For an SVM model:
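
For a linear kernel machine, the weight map can be recovered from the dual coefficients as w = Σ_i α_i x_i; for an SVM the same mapping holds with α_i y_i in place of α_i, non-zero only for the support vectors. The sketch below illustrates the mapping under stated assumptions: it uses kernel ridge regression as a stand-in for the SVM (because it has a closed-form solution in NumPy), and the function name and toy data are hypothetical.

```python
import numpy as np

def linear_kernel_weight_map(X, y, ridge=1.0):
    """Fit kernel ridge regression with a linear kernel, then map the dual
    coefficients back to one weight per feature: w = X^T alpha.
    (Stand-in for an SVM; assumption made for illustration.)"""
    K = X @ X.T                                              # linear kernel, samples x samples
    alpha = np.linalg.solve(K + ridge * np.eye(len(y)), y)   # dual coefficients
    w = X.T @ alpha                                          # one weight per feature (voxel)
    return w, alpha

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 500))        # toy data: 20 scans, 500 voxels
y = np.where(X[:, 0] > 0, 1.0, -1.0)      # labels driven by voxel 0
w, alpha = linear_kernel_weight_map(X, y)
print(w.shape)                            # (500,)
```

The resulting vector has the same dimension as one image, so it can be reshaped and displayed as a map.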

Different strategies:
1. A priori knowledge: ROI selection
2. Searchlight
3. Feature selection
4. Sparse algorithms
5. Multiple Kernel Learning (MKL)
6. Permutation tests
7. A posteriori knowledge: weight summarization

1. A priori knowledge: ROI selection
E.g. choosing voxels/features based on the literature or on previous univariate results (from an independent dataset).

1. A priori knowledge: ROI selection
+ Good accuracy?
+ Anatomically/functionally relevant
- Need an anatomical hypothesis
- Cannot threshold the obtained map
- Arbitrary choice of region number and size
- Multiple models = multiple comparisons

2. Searchlight (Kriegeskorte et al., 2006)

2. Searchlight (Kriegeskorte et al., 2006)
+ Threshold for each neighborhood
+ Not a weight filter
- Not anatomically/functionally relevant (spheres)
- Locally multivariate
- Arbitrary choice of neighborhood size
- Multiple models = multiple comparisons
- Time-consuming
Review: Etzel et al., 2013
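
The searchlight idea can be sketched as follows: one model is trained and cross-validated per spherical neighborhood, and the accuracy is stored at the sphere's centre. This is only an illustration under assumptions: the nearest-class-mean classifier, the leave-one-out scheme, the radius and the toy data are all hypothetical choices, not those of the cited papers.

```python
import numpy as np

def nearest_mean_loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-class-mean classifier."""
    correct = 0
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        classes = np.unique(y[train])
        means = np.array([X[train & (y == c)].mean(axis=0) for c in classes])
        dists = np.linalg.norm(means - X[i], axis=1)
        correct += int(classes[np.argmin(dists)] == y[i])
    return correct / len(y)

def searchlight(data, y, radius=1.0):
    """data has shape (n_samples, nx, ny, nz); returns an accuracy map (nx, ny, nz)."""
    shape = data.shape[1:]
    coords = np.indices(shape).reshape(3, -1).T     # voxel coordinates, (n_voxels, 3)
    flat = data.reshape(len(y), -1)                 # same voxel ordering as coords
    acc = np.zeros(len(coords))
    for v, centre in enumerate(coords):
        sphere = np.linalg.norm(coords - centre, axis=1) <= radius
        acc[v] = nearest_mean_loo_accuracy(flat[:, sphere], y)
    return acc.reshape(shape)

rng = np.random.default_rng(0)
y = np.tile([0, 1], 10)                       # 20 samples, two classes
data = rng.standard_normal((20, 5, 5, 5))
data[y == 1, 2, 2, 2] += 5.0                  # signal in the central voxel only
acc_map = searchlight(data, y, radius=1.0)
print(acc_map.shape)                          # (5, 5, 5)
```

Each voxel yields its own model and accuracy, which is why the map can be thresholded but also why the multiple-comparisons and run-time costs listed above arise.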

3. Feature selection
Rank the features according to a specific criterion and select a subset. Examples:
- Univariate t-test
- Recursive Feature Elimination (RFE, De Martino, 2008) based on SVM weights

3. Feature selection
+ Data-driven
- Not anatomically/functionally relevant (voxel sparse)
- Time-consuming: nested CV
- Depends on the selected technique: flaws in feature selection are reflected in the model, so the user's choice must be informed.
- Different strategies = different patterns with the same accuracy.
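
A minimal sketch of univariate t-test ranking, assuming binary labels coded 0/1. The ranking is computed on the training samples only, which is the reason a nested cross-validation is needed in practice. Function names, the Welch form of the statistic and the toy data are assumptions made for illustration.

```python
import numpy as np

def t_scores(X, y):
    """Two-sample (Welch) t statistic per feature for binary labels 0/1."""
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return (b.mean(axis=0) - a.mean(axis=0)) / se

def select_top_k(X_train, y_train, k):
    """Indices of the k features with the largest |t|, from training data only."""
    return np.argsort(-np.abs(t_scores(X_train, y_train)))[:k]

rng = np.random.default_rng(0)
y = np.tile([0, 1], 20)                        # 40 samples
X = rng.standard_normal((40, 200))             # 200 features
X[y == 1, :5] += 3.0                           # features 0..4 carry the signal

train = np.arange(40) < 30                     # one train/test split
selected = select_top_k(X[train], y[train], k=5)
X_train_sel = X[train][:, selected]            # the model is then fitted on these
X_test_sel = X[~train][:, selected]            # and evaluated on these
print(sorted(selected.tolist()))
```

Selecting features on the full dataset before splitting would leak test information into the model, so the selection step is repeated inside every training fold.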

4. Sparse algorithms
Algorithms with regularization constraints such that the weights of some features are exactly zero. Examples:
- LASSO (Tibshirani, 1996)
- Elastic-net (Zou and Hastie, 2005)

4. Sparse algorithms
+ Data-driven
+ Embedded in model
- Not anatomically/functionally relevant (voxel sparse)
- Stability of selected voxels (Baldassarre et al., 2012): different regularization terms = different patterns for the same accuracy.

4. Sparse algorithms
Note: stability selection (Varoquaux et al., 2012; Rondina et al., 2014).
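
To illustrate how an L1 penalty sets some weights exactly to zero, here is a small LASSO solver based on proximal gradient descent (ISTA). It is a sketch under assumptions: the solver, the value of `lam`, the iteration count and the toy data are illustrative choices and are not taken from the cited papers.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrinks towards zero, clips at zero."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=1000):
    """Minimise (1/2n)||y - Xw||^2 + lam * ||w||_1 by proximal gradient descent."""
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2       # 1 / Lipschitz constant of the gradient
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = soft_threshold(w - step * grad, step * lam)
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 100))             # 50 samples, 100 features
w_true = np.zeros(100)
w_true[:3] = [2.0, -3.0, 1.5]                  # only three informative features
y = X @ w_true + 0.1 * rng.standard_normal(50)

w = lasso_ista(X, y, lam=0.2)
print(np.count_nonzero(w), "non-zero weights out of", w.size)
```

Changing `lam`, or swapping the penalty for an elastic-net term, changes which features survive, which is the stability issue noted above.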

5. Multiple Kernel Learning (MKL)

5. Multiple Kernel Learning (MKL)
+ Anatomically/functionally relevant
+ Uses the full pattern to build the model
+ Thresholded and sorted list of regions according to d_m
+ Hierarchical view of the model parameters
- Stability of the selected regions
- Depends on atlas
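
The sketch below shows the kernel construction assumed behind this strategy: one linear kernel K_m per atlas region, combined as K = Σ_m d_m K_m, with the regions then ranked by d_m. It does not learn the d_m (an MKL solver such as SimpleMKL would do that); the values of `d`, the toy atlas and all function names are hypothetical.

```python
import numpy as np

def region_kernels(X, labels):
    """One linear kernel per atlas region; labels[j] is the region of voxel j."""
    regions = np.unique(labels)
    kernels = np.stack([X[:, labels == r] @ X[:, labels == r].T for r in regions])
    return regions, kernels

def combine(kernels, d):
    """Weighted sum of kernels: K = sum_m d_m * K_m."""
    return np.tensordot(d, kernels, axes=1)

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 60))              # 10 scans, 60 voxels
labels = np.repeat([1, 2, 3], 20)              # toy atlas with 3 regions

regions, kernels = region_kernels(X, labels)
d = np.array([0.7, 0.3, 0.0])                  # hand-set here; learned by MKL in practice
K = combine(kernels, d)

order = np.argsort(-d)
ranked = [(int(regions[m]), float(d[m])) for m in order if d[m] > 0]
print(ranked)                                  # regions with a non-zero contribution
```

With all d_m equal to one, the combined kernel is the ordinary whole-brain linear kernel, which is one way to see that the full pattern is still used to build the model.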

6. Permutation test
Build weight maps from permutation tests (e.g. Mourão-Miranda et al., 2005) and associate a p-value with each weight.
Figure: simulated data. Image from Gaonkar and Davatzikos, 2013.

6. Permutation test
+ Map can be thresholded
+ Uses the full pattern to build the model
- Computationally expensive
- Need to correct for multiple comparisons (cannot assume independent voxels/weights)
Note: analytic approximation of the null distribution for SVM (Gaonkar and Davatzikos, 2013).
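
A minimal sketch of the permutation approach: the model is refitted on label-shuffled data many times, and each weight's p-value is the fraction of permutations whose absolute weight is at least as large as the observed one. Assumptions: a ridge-regularised linear model stands in for the SVM, and the number of permutations and the toy data are illustrative. No multiple-comparisons correction is applied here.

```python
import numpy as np

def ridge_weights(X, y, ridge=1.0):
    """Weights of a linear kernel ridge model (stand-in for SVM weights)."""
    alpha = np.linalg.solve(X @ X.T + ridge * np.eye(len(y)), y)
    return X.T @ alpha

def permutation_p_values(X, y, n_perm=200, seed=0):
    """Two-sided permutation p-value for every weight."""
    rng = np.random.default_rng(seed)
    w_obs = ridge_weights(X, y)
    exceed = np.zeros_like(w_obs)
    for _ in range(n_perm):
        w_perm = ridge_weights(X, rng.permutation(y))   # refit on shuffled labels
        exceed += np.abs(w_perm) >= np.abs(w_obs)
    return w_obs, (1.0 + exceed) / (1.0 + n_perm)

rng = np.random.default_rng(1)
X = rng.standard_normal((40, 20))              # 40 samples, 20 features
y = np.where(X[:, 0] > 0, 1.0, -1.0)           # only feature 0 is informative
w_obs, p = permutation_p_values(X, y)
print(p.shape)                                 # (20,)
```

Every permutation requires a full model fit, which is the computational cost listed above.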

7. A posteriori knowledge: weight summarization
AAL atlas (Tzourio-Mazoyer et al., 2002).

7. A posteriori knowledge: weight summarization
+ Anatomically/functionally relevant
+ Uses the full pattern to build the model
+ Sorted list of regions according to their proportion of NW_ROI
- No thresholding of the sorted list of regions
- Depends on atlas
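
The summarization step can be sketched as below. It assumes that the normalized weight of a region, NW_ROI, is the mean absolute weight over the voxels of that region, and that regions are ranked by their share of the total; this definition, the function name and the toy atlas are assumptions made for illustration.

```python
import numpy as np

def summarize_weights(w, labels):
    """Rank atlas regions by their proportion of normalized weight (NW_ROI)."""
    regions = np.unique(labels)
    # Assumed definition: mean absolute weight over the voxels of each region.
    nw = np.array([np.abs(w[labels == r]).mean() for r in regions])
    proportion = nw / nw.sum()
    order = np.argsort(-proportion)
    return [(int(regions[i]), float(proportion[i])) for i in order]

w = np.array([1.0, -1.0, 2.0, 0.0, 0.0, 0.0, 3.0])   # toy weight map, 7 voxels
labels = np.array([1, 1, 1, 1, 2, 2, 3])             # toy atlas, 3 regions
ranking = summarize_weights(w, labels)
print(ranking)   # [(3, 0.75), (1, 0.25), (2, 0.0)]
```

Dividing by the region size keeps large regions from dominating the ranking simply because they contain more voxels.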

How about electrophysiological recordings?
- All strategies explained previously apply.
- MKL can be applied on each dimension (i.e. channels, time points, frequency bands) or on a combination (e.g. channel + frequency band).

How about non-linear models?
- Statistical sensitivity maps (Rasmussen et al., 2011) for SVM, kernel logistic regression and kernel Fisher discriminant. For a linear model: sensitivity map = (weight filter)^2
- Use of non-linear kernels in neuroimaging? (Kragel et al., 2012)
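
As a sketch, a sensitivity map can be computed as the average over samples of the squared partial derivative of the decision function with respect to each feature. The code below assumes that definition, approximates the derivative by central finite differences, and checks that the linear case reduces to the squared weights. The function names, the RBF example and the toy data are hypothetical.

```python
import numpy as np

def sensitivity_map(f, X, eps=1e-5):
    """Mean over samples of (df/dx_j)^2, using central finite differences.
    f maps an (n_samples, n_features) array to n_samples decision values."""
    n, p = X.shape
    s = np.zeros(p)
    for j in range(p):
        step = np.zeros(p)
        step[j] = eps
        grad_j = (f(X + step) - f(X - step)) / (2 * eps)   # one value per sample
        s[j] = np.mean(grad_j ** 2)
    return s

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 4))

# Linear model: the sensitivity map equals the squared weights.
w, b = np.array([0.5, -2.0, 0.0, 1.0]), 0.3
s_linear = sensitivity_map(lambda Z: Z @ w + b, X)
print(np.allclose(s_linear, w ** 2))           # True

# Non-linear (RBF) model: the same function applies unchanged.
centres, coef = rng.standard_normal((5, 4)), rng.standard_normal(5)
def f_rbf(Z):
    d2 = ((Z[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * d2) @ coef
s_rbf = sensitivity_map(f_rbf, X)
```

Squaring removes the sign of the derivative, so the map shows how strongly a feature influences the prediction but not in which direction.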

Conclusions:
- Interpreting machine-learning-based models can be complex.
- Different strategies exist, but there is always a trade-off between different parameters:
  - Thresholded list of contributing voxels/regions
  - Anatomical/functional relevance
  - Multivariate power
  - Stability of the pattern
  - Computational time
Your (informed) choice!

Acknowledgments:
- Belgian American Educational Foundation (BAEF)
- Medical Foundation of the Liège Rotary Club
- Stanford University
- University College London
- Wellcome Trust
Figures created using PRoNTo version 2.0.

References:
- Baldassarre, L., J. Mourão-Miranda, M. Pontil. «Structured Sparsity Models for Brain Decoding from fMRI data.» Proceedings of the 2nd conference on Pattern Recognition in NeuroImaging (2012).
- Etzel, J.A., J.M. Zacks, T.S. Braver. «Searchlight analysis: promise, pitfalls, and potential.» NeuroImage 78 (2013).
- Filippone, M., A. Marquand, C. Blain, C. Williams, J. Mourão-Miranda, M. Girolami. «Probabilistic prediction of neurological disorders with a statistical assessment of neuroimaging data modalities.» Annals of Applied Statistics 6 (2012).
- Gaonkar, B., C. Davatzikos. «Analytic estimation of statistical significance maps for support vector machine based multi-variate image analysis and classification.» NeuroImage 78 (2013).
- Haufe, S., F. Meinecke, K. Görgen, S. Dähne, J.-D. Haynes, B. Blankertz, F. Biessmann. «On the interpretation of weight vectors of linear models in multivariate neuroimaging.» NeuroImage 87 (2014).
- Kragel, P., R. Carter, S. Huettel. «What makes a pattern? Matching decoding methods to data in multivariate pattern analysis.» Front Neurosci. 6 (2012).
- Kriegeskorte, N., R. Goebel, P. Bandettini. «Information-based functional brain mapping.» PNAS 103 (2006).
- Mourão-Miranda, J., A. Bokde, C. Born, H. Hampel, M. Stetter. «Classifying brain states and determining the discriminating activation patterns: Support Vector Machine on functional MRI data.» NeuroImage 28 (2005).
- Rakotomamonjy, A., F. Bach, S. Canu, and Y. Grandvalet. «SimpleMKL.» Journal of Machine Learning Research 9 (2008).
- Rasmussen, P., K. Madsen, T. Lund, L.K. Hansen. «Visualization of nonlinear kernel models in neuroimaging by sensitivity maps.» NeuroImage 55 (2011).
- Schrouff, J., J. Cremers, G. Garraux, L. Baldassarre, J. Mourão-Miranda, and C. Phillips. «Localizing and comparing weight maps generated from linear kernel machine learning models.» Proceedings of the 3rd workshop on Pattern Recognition in NeuroImaging.
- Tibshirani, R. «Regression shrinkage and selection via the lasso.» Journal of the Royal Statistical Society, Series B 58 (1996).
- Tzourio-Mazoyer, N., et al. «Automated Anatomical Labeling of activations in SPM using a Macroscopic Anatomical Parcellation of the MNI MRI single-subject brain.» NeuroImage 15 (2002).
- Varoquaux, G., A. Gramfort, B. Thirion. «Small-sample brain mapping: sparse recovery on spatially correlated designs with randomization and clustering.» Proceedings of ICML (2012). icml.cc/discuss/2012/688.html
- Zou, H., and T. Hastie. «Regularization and variable selection via the elastic net.» J. R. Statist. Soc. B 67 (2005).