Extensions of Non-Negative Matrix Factorization (NMF) to Higher Order Data: HONMF (Higher Order Non-negative Matrix Factorization) and NTF2D/SNTF2D ((Sparse) Non-negative Tensor Factor 2D Deconvolution)

Increasing attention has lately been given to non-negative matrix factorization (NMF) due to its part-based representation and its ease of algorithmic implementation (Lee & Seung, 1999, 2001). However, NMF is in general not unique; it is only unique when the data adequately span the positive orthant (Donoho and Stodden, 2004). Consequently, constraints in the form of sparsity are useful for achieving unique decompositions (Hoyer, 2002, 2004; Eggert & Körner, 2004). As a result, algorithms for sparse coding using multiplicative updates have been derived (Eggert & Körner, 2004; Mørup & Schmidt, 2006b).

NMF approximates a non-negative data matrix V by the product WH of two non-negative matrices, as in the 1999 Nature article "Learning the parts of objects by non-negative matrix factorization" (Lee & Seung, 1999). NMF is based on gradient descent: each component is updated by a step in the negative gradient direction. NMF uses the concept of multiplicative updates: the derivative of the cost function is split into a positive part $\nabla^{+}_{i,d}$ and a negative part $\nabla^{-}_{i,d}$, i.e. $\nabla_{i,d} = \nabla^{+}_{i,d} - \nabla^{-}_{i,d}$. Choosing the step size as the ratio of $W_{i,d}$ to the positive part of the derivative $\nabla^{+}_{i,d}$ yields multiplicative updates, since the gradient step then cancels the $W_{i,d}$ term in the gradient-based update:

$W_{i,d} \leftarrow W_{i,d} - \frac{W_{i,d}}{\nabla^{+}_{i,d}}\left(\nabla^{+}_{i,d} - \nabla^{-}_{i,d}\right) = W_{i,d}\,\frac{\nabla^{-}_{i,d}}{\nabla^{+}_{i,d}}$

The resulting NMF updates are the least squares (LS) and Kullback-Leibler (KL) divergence updates derived from this multiplicative update approach (Lee & Seung, 2001):

LS: $\mathbf{W} \leftarrow \mathbf{W} \odot \frac{\mathbf{V}\mathbf{H}^{\top}}{\mathbf{W}\mathbf{H}\mathbf{H}^{\top}}$, $\quad \mathbf{H} \leftarrow \mathbf{H} \odot \frac{\mathbf{W}^{\top}\mathbf{V}}{\mathbf{W}^{\top}\mathbf{W}\mathbf{H}}$

KL: $\mathbf{W} \leftarrow \mathbf{W} \odot \frac{(\mathbf{V} \oslash \mathbf{W}\mathbf{H})\mathbf{H}^{\top}}{\mathbf{1}\mathbf{H}^{\top}}$, $\quad \mathbf{H} \leftarrow \mathbf{H} \odot \frac{\mathbf{W}^{\top}(\mathbf{V} \oslash \mathbf{W}\mathbf{H})}{\mathbf{W}^{\top}\mathbf{1}}$

where $\odot$ and $\oslash$ denote elementwise multiplication and division.

NMF is not in general unique: if the data do not adequately span the positive orthant, no unique solution can be obtained. [Figure in the original poster: red and green basis vectors both perfectly span the data points, but the green vectors represent the sparsest solution.]

Sparse Coding NMF
Sparse coding NMF regularizes H while keeping W normalized, such that the regularization is not simply achieved by letting H go to zero while W goes to infinity (Eggert & Körner, 2004; Mørup & Schmidt, 2006b). The sparsity cost $C_{\text{sparse}}(\mathbf{H})$ can be any function with positive derivative; a frequently used choice is the 1-norm.
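The following is a minimal NumPy sketch of the least-squares multiplicative updates and the sparse-coding variant described above. It is illustrative only: the function name sparse_nmf and its arguments are not from the original work, and the column normalization of W is handled here by a simple rescaling that leaves the product WH unchanged, rather than by the normalization-corrected updates of Eggert & Körner (2004) and Mørup & Schmidt (2006b).

```python
import numpy as np

def sparse_nmf(V, d, n_iter=500, lam=0.0, eps=1e-9, seed=0):
    """Least-squares NMF via multiplicative updates, with optional L1 sparsity on H.

    Sketch only: with lam=0 these are the standard LS updates (Lee & Seung, 2001);
    lam > 0 adds a 1-norm penalty on H, and W is rescaled to unit-norm columns
    (compensated in H) so the penalty cannot be evaded by letting W grow.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, d))
    H = rng.random((d, m))
    for _ in range(n_iter):
        # H update: negative over positive part of the gradient, plus the sparsity term
        H *= (W.T @ V) / (W.T @ W @ H + lam + eps)
        # W update: negative over positive part of the gradient
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
        # Rescale columns of W to unit norm and compensate in H (keeps W @ H fixed)
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H
```

Setting lam = 0 recovers plain LS NMF; the KL updates follow the same pattern, with the KL gradient split into its positive and negative parts.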
NTF (Non-negative Tensor Factorization)
Model: NTF is based on the PARAFAC model (Harshman, 1970; Carroll & Chang, 1970; FitzGerald et al., 2005). The PARAFAC model is a generalization of factor analysis to higher orders, where the data are explained by an outer product of factor effects pertaining to each modality. The general expression of the PARAFAC model for an N-order tensor is

$X_{i_1 i_2 \ldots i_N} \approx \sum_{d} A^{(1)}_{i_1,d} A^{(2)}_{i_2,d} \cdots A^{(N)}_{i_N,d}$

NTF2D/SNTF2D ((Sparse) Non-negative Tensor Factor 2D Deconvolution)
Model: The NTF2D is a PARAFAC model that is convolutive in two dimensions (Mørup & Schmidt, 2006c). [A table in the original poster gives the updates for the NTF2D; including the updates marked in gray imposes sparseness on H, forming the SNTF2D.]

HONMF (Higher Order Non-negative Matrix Factorization)
Model: The HONMF is based on the Tucker model (Tucker, 1966), where non-negativity is imposed on all modalities (Mørup et al., 2006e). The Tucker model accounts for all possible linear interactions between the factor effects pertaining to each modality; elementwise, for a third-order tensor (one of the three equivalent ways of stating the Tucker model given in the poster),

$X_{ijk} \approx \sum_{p,q,r} G_{pqr} A_{ip} B_{jq} C_{kr}$

Algorithms: [A table in the original poster gives how to update each modality of the models when imposing sparseness on, or normalizing, that modality.]
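To make the PARAFAC-based NTF concrete, here is a small NumPy sketch of least-squares multiplicative updates for a three-way non-negative PARAFAC model, written with mode-wise unfoldings and Khatri-Rao products. The helper names ntf_parafac and khatri_rao are hypothetical; this is a generic illustration of the model class, not the authors' algorithm or code.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker (Khatri-Rao) product of U (I x D) and V (J x D) -> (I*J x D)."""
    I, D = U.shape
    J, _ = V.shape
    return np.einsum('id,jd->ijd', U, V).reshape(I * J, D)

def ntf_parafac(X, D, n_iter=500, eps=1e-9, seed=0):
    """Non-negative three-way PARAFAC (NTF) via least-squares multiplicative updates."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A, B, C = rng.random((I, D)), rng.random((J, D)), rng.random((K, D))
    X1 = X.reshape(I, J * K)                     # mode-1 unfolding
    X2 = X.transpose(1, 0, 2).reshape(J, I * K)  # mode-2 unfolding
    X3 = X.transpose(2, 0, 1).reshape(K, I * J)  # mode-3 unfolding
    for _ in range(n_iter):
        # Each factor: negative over positive part of the LS gradient
        A *= (X1 @ khatri_rao(B, C)) / (A @ ((B.T @ B) * (C.T @ C)) + eps)
        B *= (X2 @ khatri_rao(A, C)) / (B @ ((A.T @ A) * (C.T @ C)) + eps)
        C *= (X3 @ khatri_rao(A, B)) / (C @ ((A.T @ A) * (B.T @ B)) + eps)
    return A, B, C
```

For the ITPC data described below, X would be channel x time-frequency x trial, and A, B, C would hold the spatial, time-frequency and trial loadings, respectively.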
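Similarly, a minimal sketch of a non-negative Tucker decomposition in the spirit of HONMF: least-squares multiplicative updates for the core and the three factor matrices, written with einsum, with an optional 1-norm penalty on the core. The function name honmf_sketch and its arguments are made up for this example, and the normalization of the individual modalities referred to in the poster's update table is omitted, so this is an illustrative approximation rather than the algorithm of Mørup et al. (2006e).

```python
import numpy as np

def honmf_sketch(X, dims, n_iter=500, lam_core=0.0, eps=1e-9, seed=0):
    """Non-negative Tucker decomposition (HONMF-style sketch) for a 3-way tensor X.

    dims = (P, Q, R) gives the number of components per modality; lam_core > 0
    adds an L1 penalty on the core G, which, as noted in the poster, can make
    the decomposition unique (e.g. revealing a PARAFAC structure in the core).
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    P, Q, R = dims
    A, B, C = rng.random((I, P)), rng.random((J, Q)), rng.random((K, R))
    G = rng.random((P, Q, R))

    def recon():
        # Tucker reconstruction: X_hat[i,j,k] = sum_{pqr} G[p,q,r] A[i,p] B[j,q] C[k,r]
        return np.einsum('pqr,ip,jq,kr->ijk', G, A, B, C)

    for _ in range(n_iter):
        Xh = recon()
        A *= np.einsum('ijk,pqr,jq,kr->ip', X, G, B, C) / (
             np.einsum('ijk,pqr,jq,kr->ip', Xh, G, B, C) + eps)
        Xh = recon()
        B *= np.einsum('ijk,pqr,ip,kr->jq', X, G, A, C) / (
             np.einsum('ijk,pqr,ip,kr->jq', Xh, G, A, C) + eps)
        Xh = recon()
        C *= np.einsum('ijk,pqr,ip,jq->kr', X, G, A, B) / (
             np.einsum('ijk,pqr,ip,jq->kr', Xh, G, A, B) + eps)
        Xh = recon()
        G *= np.einsum('ijk,ip,jq,kr->pqr', X, A, B, C) / (
             np.einsum('ijk,ip,jq,kr->pqr', Xh, A, B, C) + lam_core + eps)
    return G, A, B, C
```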
Data results
EEG: The algorithms were used on a dataset containing the inter-trial phase coherence (ITPC) of wavelet-transformed EEG data. Briefly stated, the data consist of 14 subjects recorded during a proprioceptive stimulus consisting of a weight change of the left hand during odd trials and of the right hand during even trials, giving a total of 14·2 = 28 trials. Consequently, the data have the form X: channel × time-frequency × trial (Mørup et al., 2006a). The NTF decomposition reveals a right parietal activity mainly present during odd trials, corresponding to left-hand stimuli, as well as a more frontal and a higher-frequency central parietal activity. While the HONMF is not unique when no sparseness is imposed, it becomes unique when sparseness is imposed on the core, here revealing that the appropriate model for this data is a PARAFAC model (Mørup et al., 2006e). Furthermore, the HONMF decomposition gives a more part-based representation that is easier to interpret than the solution found by the HOSVD (De Lathauwer et al., 2000).

Stereo music: The algorithms were also used to analyze the absolute value of the log-spectrogram of stereo recordings of music, i.e. data of the form X: channel × log-frequency × time. [Figure in the original poster: the result obtained by the SNTF2D algorithm (bottom panel) when decomposing the log-spectrogram of synthetically generated stereo music (middle panel), generated from the true components given in the top panel.] A real stereo recording of a flute and a harp playing "The Fog is Lifting" by Carl Nielsen was also decomposed (scores given at the top of the corresponding figure). Clearly, the SNTF2D separates the log-spectrogram into two components pertaining to the harp and the flute, respectively. By spectral masking of the log-spectrograms the two components are reconstructed, revealing that one component indeed pertains to the harp whereas the other pertains to the flute.

Flow injection analysis: The algorithms were further tested on a dataset of flow injection analysis (Nørgaard, 1994; Smilde, 1999), i.e. data of the form X: spectra × time × batch number. The HONMF with sparseness imposed on the core and on the third modality resulted in a very consistent decomposition of the flow injection data, capturing in an unsupervised manner the true concentrations present in each batch (given by modality 3).

Morten Mørup, Department of Signal Processing, Informatics and Mathematical Modelling, Technical University of Denmark.
Parts of the above work were done in collaboration with (see also references):
Lars Kai Hansen, Professor, Department of Signal Processing, Informatics and Mathematical Modelling, Technical University of Denmark
Mikkel N. Schmidt, PhD student, Department of Signal Processing, Informatics and Mathematical Modelling, Technical University of Denmark
Sidse M. Arnfred, Dr. Med., PhD, Cognitive Research Unit, Hvidovre Hospital, University Hospital of Copenhagen

References:
Carroll, J. D. and Chang, J. J. Analysis of individual differences in multidimensional scaling via an N-way generalization of "Eckart-Young" decomposition. Psychometrika, 1970.
Eggert, J. and Körner, E. Sparse coding and NMF. In Neural Networks, volume 4, 2004.
Eggert, J. et al. Transformation-invariant representation and NMF. In Neural Networks, volume 4, 2004.
FitzGerald, D. et al. Non-negative tensor factorization for sound source separation. In Proceedings of the Irish Signals and Systems Conference, 2005.
FitzGerald, D. and Coyle, E. C. Sound source separation using shifted non-negative tensor factorization. In ICASSP 2006, 2006.
FitzGerald, D. et al. Shifted non-negative matrix factorization for sound source separation. In Proceedings of the IEEE Conference on Statistical Signal Processing.
Harshman, R. A. Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multi-modal factor analysis. UCLA Working Papers in Phonetics, 16:1-84, 1970.
De Lathauwer, L., De Moor, B. and Vandewalle, J. A multilinear singular value decomposition. SIAM J. Matrix Anal. Appl., 21:1253-1278, 2000.
Lee, D. D. and Seung, H. S. Algorithms for non-negative matrix factorization. In NIPS, 2000.
Lee, D. D. and Seung, H. S. Learning the parts of objects by non-negative matrix factorization. Nature, 1999.
Mørup, M., Hansen, L. K. and Arnfred, S. M. Decomposing the time-frequency representation of EEG using non-negative matrix and multi-way factorization. Technical report, Institute for Mathematical Modelling, Technical University of Denmark, 2006a.
Mørup, M. and Schmidt, M. N. Sparse non-negative matrix factor 2-D deconvolution. Technical report, Institute for Mathematical Modelling, Technical University of Denmark, 2006b.
Mørup, M. and Schmidt, M. N. Non-negative tensor factor 2D deconvolution for multi-channel time-frequency analysis. Technical report, Institute for Mathematical Modelling, Technical University of Denmark, 2006c.
Schmidt, M. N. and Mørup, M. Non-negative matrix factor 2D deconvolution for blind single channel source separation. In ICA 2006, 2006d.
Mørup, M., Hansen, L. K. and Arnfred, S. M. Algorithms for Sparse Higher Order Non-negative Matrix Factorization (HONMF). Technical report, Institute for Mathematical Modelling, Technical University of Denmark, 2006e.
Nørgaard, L. and Ridder, C. Rank annihilation factor analysis applied to flow injection analysis with photodiode-array detection. Chemometrics and Intelligent Laboratory Systems, 23, 1994.
Schmidt, M. N. and Mørup, M. Sparse Non-negative Matrix Factor 2-D Deconvolution for Automatic Transcription of Polyphonic Music. Technical report, Institute for Mathematical Modelling, Technical University of Denmark, 2005.
Smaragdis, P. Non-negative matrix factor deconvolution; extraction of multiple sound sources from monophonic inputs. In International Symposium on Independent Component Analysis and Blind Source Separation (ICA), 2004.
Smilde, A. K., Tauler, R., Saurina, J. and Bro, R. Calibration methods for complex second-order data. Analytica Chimica Acta, 1999.
Kolda, T. G. Multilinear operators for higher-order decompositions. Technical report, Sandia National Laboratories, 2006.
Tucker, L. R. Some mathematical notes on three-mode factor analysis. Psychometrika, 31:279-311, 1966.
Welling, M. and Weber, M. Positive tensor factorization. Pattern Recognition Letters, 2001.