Multivariate analysis (Machine learning)

Supervised: true answer is known
    Classification: answer is categorical (binary or multiclass)
    Regression: answer is continuous (ordered)
Unsupervised: similarities/structure in the data
    Clustering: data grouping, incl. hierarchy
    Dimension reduction: feature grouping

Discriminative (difference) vs. generative (data)
Bayesian (full distribution) vs. frequentist (estimates)
Independent and identically distributed (iid) vs. structured data

Examples
    SVM classification: supervised, binary classification, discriminative, frequentist, iid
    DCM: unsupervised, generative, Bayesian, structured
    Bayesian linear regression: supervised, regression, generative, Bayesian, iid
Multivariate analysis (Neuroscience)

Data points: trials, subjects, …
Features: activity of voxel subsets, GLM coefficients, DCM parameters, questionnaire data, …
Labels: stimulus, behavioral model state (e.g. learning rate), diagnoses, questionnaire score, …
Model: clustering, hierarchy, effective connectivity, …
Feature selection & preprocessing!

(Diagram labels: features, data points, label/target, method or model)
Multivariate analysis (Neuroscience)

Interpretation of results:
    Prediction: BCI, diagnostics, treatment outcome
    Hypothesis comparison by performance comparison: subsets of voxels, model structures, number of clusters, etc.
    Performance significance as evidence that information is encoded in the data
    Interpretation of parameters: feature weights, cluster centroids, most prominent dimensions, etc.
K-fold cross-validation
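As a minimal sketch of the idea, the data are split into k folds; each fold serves once as the test set while the remaining k-1 folds are used for training, and the test accuracies are averaged. The function names, the toy 1-D data, and the nearest-mean classifier below are illustrative assumptions, not part of the original slides.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Strided slicing gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def nearest_mean_predict(train_x, train_y, x):
    """Toy 1-D classifier (assumption): pick the class with the closest training mean."""
    means = {}
    for label in set(train_y):
        values = [v for v, y in zip(train_x, train_y) if y == label]
        means[label] = sum(values) / len(values)
    return min(means, key=lambda label: abs(x - means[label]))


def cross_validated_accuracy(xs, ys, k, seed=0):
    """Mean test accuracy across the k folds."""
    fold_accuracies = []
    for train_idx, test_idx in k_fold_splits(len(xs), k, seed):
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        correct = sum(
            nearest_mean_predict(train_x, train_y, xs[i]) == ys[i]
            for i in test_idx
        )
        fold_accuracies.append(correct / len(test_idx))
    return sum(fold_accuracies) / k


# Hypothetical toy data: two well-separated classes on a line.
xs = [0.1, 0.3, 0.2, 0.4, 0.0, 2.1, 2.3, 1.9, 2.2, 2.0]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
cv_accuracy = cross_validated_accuracy(xs, ys, k=5)
print(cv_accuracy)
```

Any preprocessing or feature selection that uses the labels would have to be repeated inside each training fold; otherwise information from the test fold leaks into training.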
Accuracy vs. balanced accuracy

Confusion matrix: actual + / actual - against predicted + / predicted -
Measures compared: accuracy, balanced accuracy

Brodersen, Ong, Stephan, Buhmann (2010) ICPR
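The difference between the two measures can be sketched on confusion-matrix counts. Accuracy is the fraction of all predictions that are correct, (TP + TN) / (TP + FP + FN + TN); balanced accuracy is the mean of the per-class accuracies, 0.5 * (TP / (TP + FN) + TN / (TN + FP)). The function names and the example counts below are hypothetical and chosen only to show the effect of class imbalance.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def balanced_accuracy(tp, fp, fn, tn):
    """Mean of the accuracies obtained on each class separately."""
    sensitivity = tp / (tp + fn)  # accuracy on actual positives
    specificity = tn / (tn + fp)  # accuracy on actual negatives
    return 0.5 * (sensitivity + specificity)


# Hypothetical imbalanced test set: 90 actual positives, 10 actual negatives,
# and a classifier that always predicts "+".
tp, fp, fn, tn = 90, 10, 0, 0
acc = accuracy(tp, fp, fn, tn)
bacc = balanced_accuracy(tp, fp, fn, tn)
print(acc, bacc)  # 0.9 0.5
```

In this example the always-positive classifier looks good on plain accuracy purely because of the class imbalance, while its balanced accuracy stays at chance level.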
Statistical inference on voxel coefficients

We assessed the informativeness of each voxel by comparing its coefficient to an empirical null distribution.

(Figure labels: voxel weight, frequency, empirical distribution, parametric model, based on true labels, based on permuted labels)
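A minimal sketch of a label-permutation test for a single voxel follows. It assumes, purely for illustration, that the voxel "weight" is the difference between class means; in the analysis described above the weight would be a coefficient of the fitted model, refit on each permutation. The data, function names, and number of permutations are hypothetical.

```python
import random


def class_mean_difference(values, labels):
    """Stand-in voxel weight (assumption): mean of class 1 minus mean of class 0."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)


def permutation_p_value(values, labels, n_permutations=2000, seed=0):
    """Compare the weight from true labels with weights from permuted labels."""
    rng = random.Random(seed)
    observed = class_mean_difference(values, labels)
    shuffled = list(labels)
    null = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the link between data and labels
        null.append(class_mean_difference(values, shuffled))
    # Two-sided p-value; the +1 terms count the observed weight itself,
    # so the estimate can never be exactly zero.
    n_extreme = sum(abs(w) >= abs(observed) for w in null)
    p_value = (n_extreme + 1) / (n_permutations + 1)
    return observed, null, p_value


# Hypothetical activity of one voxel over 12 trials, 6 per condition.
voxel = [2.1, 1.9, 2.4, 2.2, 2.0, 2.3, 0.9, 1.1, 1.0, 0.8, 1.2, 0.7]
labels = [1] * 6 + [0] * 6
observed, null, p_value = permutation_p_value(voxel, labels)
print(observed, p_value)
```

The list `null` plays the role of the empirical null distribution; a parametric model could be fitted to it when more permutations than are affordable would be needed to resolve small p-values.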