C O R P O R A T E T E C H N O L O G Y Information & Communications Neural Computation Machine Learning Methods on functional MRI Data Siemens AG Corporate.

Machine Learning Methods on functional MRI Data
Siemens AG Corporate Technology, Dept. of Neural Computation (2003-2005)
Team: Dr. Janaina Mourao-Miranda, Dr. Martin Stetter
In cooperation with: Dr. Arun Bokde, Ludwig Maximilians University

Aim: Develop machine learning algorithms to train classifiers that detect differences in brain activity between two cognitive states or between groups of subjects (e.g. task 1 vs. task 2, or patients vs. healthy controls).
f : single fMRI scan -> cognitive state or group membership

Learning Methodology: Supervised Learning
Input (e.g. brain scans): $X_1, X_2, X_3, \dots$
Output (e.g. patients vs. controls): $y_1, y_2, y_3, \dots$
Learning/Training: given the training examples $(X_1, y_1), (X_2, y_2), \dots, (X_n, y_n)$, generate a function or hypothesis $f$ such that $f(X_i) \to y_i$.
Test/Prediction: for a test example $X_i$, predict $f(X_i) = y_i$.

Examples of neuroimaging ("brain scan") techniques:
– Computed Tomography (CT)
– Positron Emission Tomography (PET)
– Single Photon Emission Computed Tomography (SPECT)
– Structural Magnetic Resonance Imaging (MRI)
– Functional Magnetic Resonance Imaging (fMRI)
Among these imaging modalities, MRI and fMRI have become widely used because of their low invasiveness, lack of radiation exposure, and relatively wide availability.

MRI vs. fMRI
MRI studies brain anatomy. Functional MRI (fMRI) studies brain function.
Source: Jody Culham's fMRI for Dummies web site

Examples of brain scans
MRI: one image, high resolution (1 mm)
fMRI: many images (e.g., every 2 sec for 5 mins), low resolution (~3 mm but can be better)

fMRI: What does it measure?
When neurons fire in response to a sensory or cognitive process, a sequence of events follows that results in an increase in local cerebral metabolism. An increase in neural activity (and metabolism) causes an increased demand for oxygen. To compensate for this demand, the vascular system increases the amount of oxygenated haemoglobin relative to deoxygenated haemoglobin. fMRI measures changes in the Blood Oxygen Level Dependent (BOLD) signal caused by changes in neural activity.
Source: Arthurs & Boniface, 2002, Trends in Neurosciences

fMRI Setup

fMRI: relative measure
During a standard fMRI experiment, hundreds of volumes or scans comprising brain activations at thousands of locations (voxels) are acquired. Each brain scan is a 3D matrix of voxels.
[Figure: timeline of brain scans acquired alternately during task 1 and task 2.]

fMRI Data Analysis
The most popular method is the General Linear Model (GLM; Friston et al., 1995), in which a regression is performed on the signal value at each voxel to determine whether that voxel's activity is related to a stimulus or cognitive state.
Typical question: Which areas are related to a given stimulus or cognitive state?
Programs: SPM (FIL-UCL), AFNI (NIMH-NIH), …

fMRI time series
[Figure: fMRI data as slices of voxels acquired over time; for a single voxel, the BOLD signal intensity is plotted against time (single-voxel time series).]

Regression model for a single-voxel time series (intensity over time):
$$Y = \beta_1 x_1 + \beta_0 x_0 + \varepsilon, \qquad \text{in matrix form} \quad Y = X\beta + \varepsilon$$
Least-squares parameter estimate:
$$\hat{\beta} = (X^T X)^{-1} X^T Y$$
Null hypothesis: $\beta_1 = 0$, tested with
$$t = \hat{\beta}_1 / \mathrm{Std}(\hat{\beta}_1)$$
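The regression above can be sketched for one voxel as follows. This is a minimal illustration on synthetic data, not the SPM or AFNI implementation: the boxcar regressor `x1`, the noise level, and the function name `glm_t_statistic` are assumptions made for the example, and the regressor is not convolved with a haemodynamic response as a real analysis would do.

```python
import numpy as np

def glm_t_statistic(y, x1):
    """Fit Y = X beta + eps for one voxel; return (beta_hat, t) for H0: beta_1 = 0."""
    n = len(y)
    # Design matrix: constant column x0 (baseline) and task regressor x1.
    X = np.column_stack([np.ones(n), x1])
    # Least-squares estimate beta_hat = (X^T X)^{-1} X^T Y.
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    # Residual variance with n - p degrees of freedom (p = number of columns of X).
    residuals = y - X @ beta_hat
    sigma2 = residuals @ residuals / (n - X.shape[1])
    # Standard deviation of beta_1 and the resulting t statistic.
    std_beta1 = np.sqrt(sigma2 * XtX_inv[1, 1])
    return beta_hat, beta_hat[1] / std_beta1

# Synthetic voxel: 3 control blocks and 3 task blocks of 7 scans each (assumed values).
rng = np.random.default_rng(0)
x1 = np.tile(np.repeat([0.0, 1.0], 7), 3)            # 42 scans, 0 = control, 1 = task
y = 100.0 + 2.0 * x1 + rng.normal(0.0, 0.5, x1.size)  # baseline 100, task effect 2
beta_hat, t = glm_t_statistic(y, x1)
print("beta_hat:", beta_hat, "t:", t)
```

A large value of `t` leads to rejecting the null hypothesis for this voxel; the GLM repeats the same fit independently at every voxel.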

Multivariate pattern recognition methods
In these applications the fMRI scans are treated as spatial patterns, and machine learning methods are used to identify statistical properties of the data that discriminate between brain states (e.g. task 1 vs. task 2) or groups of subjects (e.g. patients vs. controls).

fMRI data as input to a classifier
Each fMRI volume is treated as a vector in an extremely high-dimensional space (~200,000 voxels or dimensions after the mask).
fMRI volume -> feature vector (dimension = number of voxels)
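Turning a volume into a feature vector amounts to keeping the in-brain voxels selected by the mask and laying them out in a fixed order. The sketch below uses a tiny random volume and a hypothetical rectangular mask; real data would have on the order of 200,000 in-mask voxels.

```python
import numpy as np

def volume_to_vector(volume, mask):
    """Return the in-mask voxels of one 3D volume as a 1D feature vector."""
    return volume[mask]

rng = np.random.default_rng(0)
shape = (4, 5, 3)                        # toy volume size (assumed)
mask = np.zeros(shape, dtype=bool)
mask[1:3, 1:4, :] = True                 # hypothetical "brain" region: 2*3*3 = 18 voxels

volume = rng.normal(size=shape)
x = volume_to_vector(volume, mask)       # one example, dimension = number of in-mask voxels

# A whole run of volumes becomes a (scans x voxels) data matrix in one step.
volumes = rng.normal(size=(10,) + shape)
X = volumes[:, mask]
print(x.shape, X.shape)
```

Using the same mask for every scan and every subject keeps each column of `X` tied to the same anatomical location, which is what later makes the weight vector interpretable as a map.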

Machine Learning Approach on fMRI data
ML training. Input: volumes from task 1 and volumes from task 2. Output: a map of the regions that discriminate between task 1 and task 2.
ML test. Input: a new example. Output: a prediction (task 1 or task 2).

Binary classification can be viewed as the task of finding a hyperplane.
Hyperplane H: $w \cdot X_i + b = 0$
$w \cdot X_i + b > 0$: task 1
$w \cdot X_i + b < 0$: task 2
where $X_i$ is an example (volume), $w$ is the learned weight vector, and $b$ is the offset.
[Figure: Machine Learning (Training): fMRI scans from task 1 and task 2 (volumes at times t1 to t4) plotted on axes voxel 1 and voxel 2, with the weight vector w. Test: an fMRI scan (volume) from a new subject.]
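Once $w$ and $b$ are known, classifying a volume is a single dot product. The sketch below applies the sign rule from the slide to two made-up two-voxel scans; the values of `w`, `b`, and `X` are illustrative assumptions, and a scan lying exactly on the hyperplane is arbitrarily assigned to task 2.

```python
import numpy as np

def predict_task(X, w, b):
    """Return 1 where w.X_i + b > 0 (task 1) and 2 otherwise (task 2)."""
    scores = X @ w + b
    return np.where(scores > 0, 1, 2)

w = np.array([1.0, -1.0])          # hypothetical weight vector
b = 0.0                            # hypothetical offset
X = np.array([[2.0, 1.0],          # score =  1.0 -> task 1
              [1.0, 3.0]])         # score = -2.0 -> task 2
labels = predict_task(X, w, b)
print(labels)
```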

First Approach: Fisher Linear Discriminant (FLD)
[Figure: two classes on axes voxel 1 and voxel 2; projections onto the learned weight vector w with threshold thr; panels show w for FLD with correction and for FLD without correction.]

Fisher Linear Discriminant (FLD)
1. Compute the mean vector of each class (i = 1, 2; task or group).
2. Find a (normalized) weight vector between the two means.
3. Correct the weight vector by the within-class covariance.
4. Project each volume onto the weight vector.
5. Choose a threshold thr = (m_2 + m_1) / 2 and classify.
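The five steps can be sketched as below. Several details are assumptions, since the slide's formulas did not survive: m_1 and m_2 are taken to be the projections of the two class means onto the weight vector, the covariance correction is written as a linear solve with the within-class scatter matrix, and a small `ridge` term is added so the solve stays well defined. The data are synthetic two-dimensional points, not fMRI volumes.

```python
import numpy as np

def fit_fld(X1, X2, ridge=1e-6):
    """Fisher Linear Discriminant for two classes; returns (w, thr)."""
    # Step 1: mean vector of each class.
    mean1, mean2 = X1.mean(axis=0), X2.mean(axis=0)
    # Steps 2-3: direction between the means, corrected by the
    # within-class scatter (ridge keeps the matrix invertible).
    S_w = (X1 - mean1).T @ (X1 - mean1) + (X2 - mean2).T @ (X2 - mean2)
    S_w += ridge * np.eye(S_w.shape[0])
    w = np.linalg.solve(S_w, mean2 - mean1)
    w /= np.linalg.norm(w)
    # Step 5 (threshold): midpoint of the projected class means.
    m1, m2 = mean1 @ w, mean2 @ w
    return w, (m1 + m2) / 2.0

def predict_fld(X, w, thr):
    """Step 4 and 5: project each example onto w and compare with thr."""
    return np.where(X @ w > thr, 2, 1)

rng = np.random.default_rng(0)
X1 = rng.normal(0.0, 1.0, size=(50, 2))   # synthetic class 1
X2 = rng.normal(3.0, 1.0, size=(50, 2))   # synthetic class 2
w, thr = fit_fld(X1, X2)
print("w:", w, "thr:", thr)
```

Because `w` points from class 1 towards class 2, projections above `thr` are assigned to class 2. With as many dimensions as a masked fMRI volume the scatter matrix cannot be inverted reliably, which is one reason to reduce dimensionality first.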

Optimal Hyperplane
Which of the linear separators is optimal? A classifier that does very well on the training data might not generalize well to unseen examples. The SVM selects, from the many possible solutions, the most robust one ("large margin classifier").

Largest Margin Classifier
Among all hyperplanes separating the data there is a unique optimal hyperplane: the one with the largest margin (the distance from the hyperplane to the closest points).
Given a training set with 6 examples, suppose that all test points are generated by adding bounded noise (of size at most $r$) to the training examples. If the optimal hyperplane has margin $\rho > r$, it will correctly separate the test points.
[Figure: six training examples, each surrounded by a noise region of radius r, separated by a hyperplane with margin ρ.]

Second Approach: Support Vector Machine (SVM)
Data: $(X_i, y_i)$, $i = 1, \dots, n$
Observations: $X_i \in \mathbb{R}^d$
Labels: $y_i \in \{-1, +1\}$
Finding the optimal hyperplane is a quadratic optimization problem with linear constraints. It can be formally stated as: determine $w$ and $b$ that minimize the functional
$$\tfrac{1}{2}\,\|w\|^2$$
subject to the constraints
$$y_i\,[(w \cdot X_i) + b] \ge 1, \quad i = 1, \dots, n.$$
The solution has the form
$$w = \sum_i \alpha_i y_i X_i, \qquad b = y_i - w \cdot X_i \ \text{ for any } X_i \text{ such that } \alpha_i \ne 0.$$
The examples $X_i$ for which $\alpha_i > 0$ are called the support vectors.
[Figure: optimal hyperplane with weight vector w, the margin, and the support vectors.]
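The form of the solution can be checked by hand on a toy problem. With the constraint $y_i[(w \cdot X_i) + b] \ge 1$, a support vector satisfies $w \cdot X_i + b = y_i$, so the offset follows as $b = y_i - w \cdot X_i$. In the sketch below the four points and the multipliers `alpha` are hand-picked assumptions for which the optimum is known in closed form; no solver is run.

```python
import numpy as np

# Toy data: the first two points are the closest pair and act as support vectors.
X = np.array([[1.0, 1.0], [-1.0, -1.0], [2.0, 3.0], [-3.0, -2.0]])
y = np.array([1.0, -1.0, 1.0, -1.0])
alpha = np.array([0.25, 0.25, 0.0, 0.0])   # assumed multipliers for this toy case

# w = sum_i alpha_i y_i X_i
w = (alpha * y) @ X
# b = y_i - w.X_i for any support vector (alpha_i > 0)
sv = np.flatnonzero(alpha > 0)
b = y[sv[0]] - w @ X[sv[0]]

# Check the constraints y_i (w.X_i + b) >= 1; support vectors meet them with equality.
functional_margins = y * (X @ w + b)
# Geometric margin: distance from the hyperplane to the closest points.
margin = 1.0 / np.linalg.norm(w)
print("w:", w, "b:", b, "margins:", functional_margins, "margin:", margin)
```

Only the two points with non-zero `alpha` enter `w`; moving the other two points farther from the hyperplane would leave the solution unchanged.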

How to interpret the weight vector w (Discriminating Volume)?
The value of each voxel in the weight vector indicates the importance of that voxel in discriminating between the two classes or brain states.
Example: weight vector (discriminating volume) W = [0.45 0.89]
[Figure: two-voxel example scans from task 1 and task 2, the hyperplane H, and the weight vector w with voxel values 0.45 and 0.89.]
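Because the weight vector has one entry per voxel, it can be written back into the 3D grid and viewed as a map. The sketch below assumes the vector was built from a boolean brain mask (a hypothetical `mask` here); voxels outside the mask are set to zero.

```python
import numpy as np

def weights_to_volume(w, mask):
    """Place each weight at its voxel position; out-of-mask voxels stay 0."""
    volume = np.zeros(mask.shape)
    volume[mask] = w
    return volume

mask = np.zeros((2, 2, 2), dtype=bool)
mask[0, 0, 0] = True
mask[1, 1, 0] = True
w = np.array([0.45, 0.89])            # the two-voxel example weights from the slide
discriminating_volume = weights_to_volume(w, mask)
print(discriminating_volume[0, 0, 0], discriminating_volume[1, 1, 0])
```

In this example the second voxel carries about twice the weight of the first, so a change in its signal moves the projection roughly twice as far along w.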

General Procedure
– Pre-processing: realignment, normalization, smoothing
– Dimensionality reduction (e.g. PCA) and/or feature selection (e.g. ROI)
– Split data: training and test
– ML training and test
– Outputs: 1. Accuracy 2. Discriminating maps (weight vector)
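The dimensionality-reduction step can be sketched with PCA computed from a singular value decomposition. This is an illustration under assumptions: the matrix sizes and the function names `fit_pca` and `apply_pca` are made up, and the choice to estimate the components from the training data only and then apply them to the test data is a common precaution rather than something the slides state.

```python
import numpy as np

def fit_pca(X_train, n_components):
    """Return the training mean and the leading principal axes (as rows)."""
    mean = X_train.mean(axis=0)
    # Economy-size SVD of the centred data; rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, Vt[:n_components]

def apply_pca(X, mean, components):
    """Project examples onto the principal axes."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 500))   # 20 scans x 500 voxels (toy sizes)
X_test = rng.normal(size=(3, 500))

mean, components = fit_pca(X_train, n_components=5)
Z_train = apply_pca(X_train, mean, components)
Z_test = apply_pca(X_test, mean, components)
print(Z_train.shape, Z_test.shape)
```

A weight vector learned in the reduced space can be mapped back to voxel space as `components.T @ w_reduced`, which gives a discriminating map with one value per voxel.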

Application 1
We used fMRI data from 16 healthy subjects and 16 patients with Mild Cognitive Impairment (MCI), acquired during two different experiments:
– Face Matching Experiment
– Location Matching Experiment

Experiment Design I: Face Matching
Control task: press button when image appears.
Face matching task: press button when faces are identical.
[Figure: timeline of scans (volumes) showing instruction, control task, and face matching task blocks.]

Experiment Design II: Location Matching
Control task: press button when image appears.
Location matching task: press button when the locations of the abstract images are different.
[Figure: timeline of scans (volumes) showing instruction, control task, and location matching task blocks.]

Data Description
Number of subjects: 16
First experiment: face matching task (3 blocks of 7 scans) vs. control task (3 blocks of 7 scans)
Second experiment: location task (3 blocks of 7 scans) vs. control task (3 blocks of 7 scans)
Pre-processing procedures: time shift correction, motion correction, and normalization to standard space (MNI template); correction for baseline and low-frequency components; mask to select voxels inside the brain.
Leave-one-out test: machine learning (training) on 15 subjects, test on 1 subject. This procedure was repeated 16 times and the results were averaged.
Sensitivity = TP/(TP+FN)
Specificity = TN/(TN+FP)
Error rate: the ratio of the number of misclassified data units to the total number of data units.
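The leave-one-out scheme and the three measures can be sketched as follows. The layout of `subject_ids` (16 subjects with 21 task and 21 control volumes each) follows the numbers on the slide; the function names, the label coding, and the small label arrays at the end are illustrative assumptions.

```python
import numpy as np

def leave_one_subject_out(subject_ids):
    """Yield (held-out subject, training indices, test indices) for each subject."""
    subject_ids = np.asarray(subject_ids)
    for s in np.unique(subject_ids):
        is_test = subject_ids == s
        yield s, np.flatnonzero(~is_test), np.flatnonzero(is_test)

def binary_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity, and error rate for a two-class problem."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    pos = y_true == positive
    tp = np.sum(pos & (y_pred == positive))
    fn = np.sum(pos & (y_pred != positive))
    tn = np.sum(~pos & (y_pred != positive))
    fp = np.sum(~pos & (y_pred == positive))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "error_rate": (fp + fn) / y_true.size,
    }

# 16 subjects x (21 task + 21 control) volumes.
subject_ids = np.repeat(np.arange(16), 42)
folds = list(leave_one_subject_out(subject_ids))
print(len(folds), folds[0][1].size, folds[0][2].size)

# Toy labels to show the metrics (made-up predictions).
metrics = binary_metrics([1, 1, 1, 0, 0], [1, 0, 1, 0, 1])
print(metrics)
```

Holding out every volume of one subject at a time keeps the test subject entirely unseen during training; in a full run the metrics of the 16 folds would be averaged.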

Training phase: the classifier is trained on 21 volumes x 15 subjects = 315 volumes of task 1 and 21 volumes x 15 subjects = 315 volumes of task 2. The output is a volume showing the most discriminative regions (the weight vector $w$).
Test phase: each fMRI volume $X$ from a new subject (21 volumes of task 1 and 21 volumes of task 2) is projected onto $w$ and classified as task 1 or task 2.

Healthy Subjects - FLD: test individual volumes
PCA & FLD, Face task vs. Control task
[Figure: learned weight vector; test projections for the Control task (negative, x) and the Face task (positive, o).]

Healthy Subjects - FLD: test individual volumes
PCA & FLD, Location task vs. Control task
[Figure: learned weight vector; test projections for the Control task (negative, x) and the Location task (positive, o).]

Healthy Subjects - FLD: test individual volumes
PCA & FLD, Face task vs. Location task
[Figure: learned weight vector; test projections for the Location task (negative, x) and the Face task (positive, o).]

Healthy Subjects - SVM: test individual volumes
PCA & SVM, Face task vs. Control task
[Figure: learned weight vector; test projections for the Control task (negative, x) and the Face task (positive, o).]

Healthy Subjects - SVM: test individual volumes
PCA & SVM, Location task vs. Control task
[Figure: learned weight vector; test projections for the Control task (negative, x) and the Location task (positive, o).]

Healthy Subjects - SVM: test individual volumes
PCA & SVM, Face task vs. Location task
[Figure: learned weight vector; test projections for the Location task (negative, x) and the Face task (positive, o).]

Patients vs. Healthy Subjects - FLD: test individual volumes
[Figure, Face task: test projections for Healthy Subjects (negative, x) and Patients (positive, o).]
[Figure, Location task: test projections for Healthy Subjects (negative, x) and Patients (positive, o).]

Patients vs. Healthy Subjects - SVM: test individual volumes
[Figure, Face task: test projections for Healthy Subjects (negative, x) and Patients (positive, o).]
[Figure, Location task: test projections for Healthy Subjects (negative, x) and Patients (positive, o).]

Test results by group (slide labels: Patients, Healthy Subjects; which label belongs to which table is not recoverable from the transcript)

First table
                       Test error rate   Sensitivity     Specificity
                       FLD     SVM       FLD     SVM     FLD     SVM
Face vs. Control       0.26    0.13      0.72    0.85    0.76    0.90
Location vs. Control   0.18    0.17      0.71    0.78    0.93    0.88
Face vs. Location      0.47    0.27      0.45    0.76    0.62    0.70

Second table
                       Test error rate   Sensitivity     Specificity
                       FLD     SVM       FLD     SVM     FLD     SVM
Face vs. Control       0.34    0.25      0.45    0.71    0.87    0.79
Location vs. Control   0.47    0.27      0.45    0.76    0.62    0.70
Face vs. Location      0.35    0.30      0.70    0.72    0.60    0.68
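When both classes contain the same number of test volumes (here 21 per task and subject), the error rate equals one minus the average of sensitivity and specificity. The sketch below applies that identity to the values printed on this slide as a consistency check; the balanced-class assumption and the 0.01 rounding tolerance are choices made for the example, and "first" and "second" refer to the order in which the two tables appear in the transcript.

```python
# Each row: (label, error FLD, error SVM, sens FLD, sens SVM, spec FLD, spec SVM)
rows = [
    ("first, Face vs. Control",      0.26, 0.13, 0.72, 0.85, 0.76, 0.90),
    ("first, Location vs. Control",  0.18, 0.17, 0.71, 0.78, 0.93, 0.88),
    ("first, Face vs. Location",     0.47, 0.27, 0.45, 0.76, 0.62, 0.70),
    ("second, Face vs. Control",     0.34, 0.25, 0.45, 0.71, 0.87, 0.79),
    ("second, Location vs. Control", 0.47, 0.27, 0.45, 0.76, 0.62, 0.70),
    ("second, Face vs. Location",    0.35, 0.30, 0.70, 0.72, 0.60, 0.68),
]

def balanced_error(sensitivity, specificity):
    """Error rate implied by sensitivity and specificity for equal class sizes."""
    return 1.0 - (sensitivity + specificity) / 2.0

mismatches = []
for label, err_fld, err_svm, sens_fld, sens_svm, spec_fld, spec_svm in rows:
    for method, err, sens, spec in (("FLD", err_fld, sens_fld, spec_fld),
                                    ("SVM", err_svm, sens_svm, spec_svm)):
        # Allow for the two-decimal rounding used on the slide.
        if abs(balanced_error(sens, spec) - err) > 0.01:
            mismatches.append((label, method))

print("rows inconsistent with the identity:", mismatches)
```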

Healthy Subjects vs. Patients
                 Test error rate   Sensitivity     Specificity
                 FLD     SVM       FLD     SVM     FLD     SVM
Location task    0.50    0.52      0.46    0.36    0.54    0.60
Face task: 0.44, 0.50, 0.55, 0.44, 0.57 (only five of the six values survive in the transcript, so they are not assigned to columns here)

Conclusions
The classifiers were able to distinguish between tasks for both groups:
– Face matching task vs. Control task
– Location matching task vs. Control task
– Face matching task vs. Location matching task
The classifiers were not able to distinguish between the groups:
– Location matching task: Healthy vs. Patients
– Face matching task: Healthy vs. Patients
Which method is better?
– Using 5 subjects, the results are similar for both classifiers.
– Using 16 subjects, SVM gave better results than FLD.
