Non-negative Matrix Factor Deconvolution; Extraction of Multiple Sound Sources from Monophonic Inputs C.G. Puntonet and A. Prieto (Eds.): ICA 2004 Presenter.

Presentation transcript:

Non-negative Matrix Factor Deconvolution; Extraction of Multiple Sound Sources from Monophonic Inputs. In C.G. Puntonet and A. Prieto (Eds.): ICA 2004. Presenter: 張庭豪

Outline
• Introduction
• Non-negative Matrix Factorization
• Non-negative Matrix Factor Deconvolution
• Conclusions

Introduction
• Non-negative Matrix Factorization (NMF) was introduced as a concept independently by Paatero (1997), under the name Positive Matrix Factorization, and by Lee and Seung (1999), who also proposed some very efficient algorithms for its computation.
• Since its inception, NMF has been applied successfully to a variety of problems despite a hazy statistical underpinning.
• In this paper we introduce an extension of NMF for time series, which is useful for problems akin to source separation for single-channel inputs.

Non-negative Matrix Factorization
• The original formulation of NMF is defined as follows. Starting with a non-negative M × N matrix $V \in \mathbb{R}^{\geq 0, M \times N}$, the goal is to approximate it as a product of two non-negative matrices $W \in \mathbb{R}^{\geq 0, M \times R}$ and $H \in \mathbb{R}^{\geq 0, R \times N}$, where R ≤ M, such that the error of reconstructing V by W · H is minimized.
• The success of the reconstruction can be measured using a variety of cost functions. In this paper we use a cost function introduced by Lee and Seung (1999):

Non-negative Matrix Factorization
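Assuming the cost function meant here is the divergence of Lee and Seung, it and the corresponding multiplicative update rules can be written as follows. The symbol $\odot$ for element-wise multiplication is a notation chosen for this sketch, and all fractions are element-wise:

```latex
\[
D(V \,\|\, W H) = \sum_{i,j} \left( V_{ij} \ln \frac{V_{ij}}{(W H)_{ij}} - V_{ij} + (W H)_{ij} \right)
\]
\[
H \leftarrow H \odot \frac{W^{\top} \cdot \dfrac{V}{W \cdot H}}{W^{\top} \cdot \mathbf{1}},
\qquad
W \leftarrow W \odot \frac{\dfrac{V}{W \cdot H} \cdot H^{\top}}{\mathbf{1} \cdot H^{\top}}
\]
```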

• where $\mathbf{1}$ is an M × N matrix with all its elements set to unity, and the divisions are again element-wise. The variable R corresponds to the number of basis functions to extract.
• R is usually set to a small number so that NMF results in a low-rank approximation.
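The multiplicative updates described above can be sketched in NumPy as follows. This is a minimal sketch rather than the presenter's implementation: the names `nmf_kl` and `kl_divergence`, the random initialization, the iteration count, and the small constant `eps` that guards against division by zero are all assumptions made for illustration.

```python
import numpy as np


def kl_divergence(V, approx, eps=1e-9):
    """Divergence-style cost between V and its approximation (eps avoids log(0))."""
    return float(np.sum(V * np.log((V + eps) / (approx + eps)) - V + approx))


def nmf_kl(V, R, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative M x N matrix V into W (M x R) and H (R x N)."""
    rng = np.random.default_rng(seed)
    M, N = V.shape
    W = rng.random((M, R)) + eps      # random non-negative start (assumption)
    H = rng.random((R, N)) + eps
    ones = np.ones((M, N))            # the all-ones matrix "1" from the text

    for _ in range(n_iter):
        # Update H using the element-wise ratio V / (W H).
        ratio = V / (W @ H + eps)
        H *= (W.T @ ratio) / (W.T @ ones + eps)
        # Recompute the ratio with the new H, then update W.
        ratio = V / (W @ H + eps)
        W *= (ratio @ H.T) / (ones @ H.T + eps)
    return W, H
```

Because every update only multiplies by non-negative factors, W and H stay non-negative as long as they start that way, which is why no explicit projection step appears in the loop.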

Non-negative Matrix Factorization
NMF for Sound Object Extraction
• Consider a sound scene s(t) and its short-time Fourier transform packed into an M × N matrix F, where M is the DFT size and N is the overall number of frames computed.
• From the complex-valued matrix $F \in \mathbb{C}^{M \times N}$ we can extract the magnitude of the transform, $V = |F|$, $V \in \mathbb{R}^{\geq 0, M \times N}$, and then apply NMF to it.
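As an illustration of how the magnitude matrix V could be obtained, here is a minimal NumPy sketch. The function name `magnitude_spectrogram`, the Hann window, and the hop size are assumptions, since the slides specify only the DFT size M and the number of frames N. The signal is assumed to be at least one DFT frame long.

```python
import numpy as np


def magnitude_spectrogram(s, dft_size=256, hop=128):
    """Return V = |F|, an M x N matrix with M = dft_size rows and one column per frame."""
    s = np.asarray(s, dtype=float)
    window = np.hanning(dft_size)                 # window choice is an assumption
    n_frames = 1 + (len(s) - dft_size) // hop     # requires len(s) >= dft_size
    # Stack windowed frames as columns: shape (dft_size, n_frames).
    frames = np.stack(
        [s[i * hop : i * hop + dft_size] * window for i in range(n_frames)],
        axis=1,
    )
    F = np.fft.fft(frames, axis=0)                # complex STFT, M x N
    return np.abs(F)                              # non-negative magnitudes
```

For a real-valued signal the rows above M/2 mirror the rows below it, so in practice one might keep only the first M/2 + 1 rows before applying NMF; the full M rows are kept here to match the M × N shape used in the text.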

Non-negative Matrix Factorization

• The columns of W, shown in the figure, have energy only at the two frequencies that are present in the input spectrogram. We can interpret these two columns as basis functions for the spectra contained in the spectrogram.
• Likewise, the rows of H, shown at the top of the figure, have energy only at the time points where the two sinusoids do.
• We can interpret the rows of H as the weights of the spectral bases at each time.

Non-negative Matrix Factor Deconvolution
• The process described above works well for many audio tasks. It is, however, a weak model: it does not take into account the relative positions of the spectra, and so it discards temporal information.
• In this section we introduce an extended version of NMF which deals with this issue.
• In the previous section we used the model V ≈ W · H. In this section we will extend it to:

Non-negative Matrix Factor Deconvolution
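Assuming the extension referred to here is the standard NMFD model, it can be written as a sum of T factorizations in which H is progressively shifted in time. In this sketch, $\overset{t \rightarrow}{(\cdot)}$ shifts the columns of its argument t places to the right and fills the vacated columns with zeros, and $\overset{\leftarrow t}{(\cdot)}$ does the same to the left:

```latex
\[
V \approx \sum_{t=0}^{T-1} W_t \cdot \overset{t \rightarrow}{H},
\qquad
W_t \in \mathbb{R}^{\geq 0, M \times R},\quad
H \in \mathbb{R}^{\geq 0, R \times N}
\]
\[
A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \end{bmatrix},
\qquad
\overset{1 \rightarrow}{A} = \begin{bmatrix} 0 & 1 & 2 & 3 \\ 0 & 5 & 6 & 7 \end{bmatrix},
\qquad
\overset{\leftarrow 1}{A} = \begin{bmatrix} 2 & 3 & 4 & 0 \\ 6 & 7 & 8 & 0 \end{bmatrix}
\]
```

With T = 1 the sum has a single term and the model reduces to ordinary NMF, V ≈ W · H.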

Non-negative Matrix Factor Deconvolution
• Just as before, our objective is to find a set of matrices $W_t$ and a matrix H that approximate V as well as possible. We set $\Lambda$ to be the NMFD approximation of V and define the cost function:
• The update rules for this case are the same as those for NMF, applied once for each value of t, plus some shifting to line up the arguments appropriately:
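Under the assumption that the cost is the same divergence as in the NMF case with W · H replaced by the NMFD approximation, the quantities on this slide can be sketched as below. Arrows over a matrix denote zero-filled column shifts by t places (right for $t \rightarrow$, left for $\leftarrow t$), $\odot$ is element-wise multiplication (a notation chosen for this sketch), and fractions are element-wise:

```latex
\[
\Lambda = \sum_{t=0}^{T-1} W_t \cdot \overset{t \rightarrow}{H},
\qquad
D = \sum_{i,j} \left( V_{ij} \ln \frac{V_{ij}}{\Lambda_{ij}} - V_{ij} + \Lambda_{ij} \right)
\]
\[
H \leftarrow H \odot
\frac{W_t^{\top} \cdot \overset{\leftarrow t}{\left[ \dfrac{V}{\Lambda} \right]}}{W_t^{\top} \cdot \mathbf{1}},
\qquad
W_t \leftarrow W_t \odot
\frac{\dfrac{V}{\Lambda} \cdot \overset{t \rightarrow}{H}^{\top}}{\mathbf{1} \cdot \overset{t \rightarrow}{H}^{\top}},
\qquad t = 0, \dots, T-1
\]
```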

Non-negative Matrix Factor Deconvolution
• In complex cases it is often useful to average the updates of H over all values of t.
• The reason is that the multiplicative rules converge rapidly, so there is a danger that H ends up influenced more by the last $W_t$ used for its update than by the entire ensemble of $W_t$.
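The procedure, including the averaging of the H updates over t, can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the presenter's code: the names `shift`, `nmfd_approx`, `kl_divergence`, and `nmfd`, the storage of all $W_t$ in one array of shape T × M × R, the random initialization, and the constant `eps` are choices made here.

```python
import numpy as np


def shift(X, t):
    """Shift the columns of X by t places (right if t > 0, left if t < 0), zero-filled."""
    out = np.zeros_like(X)
    if t == 0:
        out[:] = X
    elif t > 0:
        out[:, t:] = X[:, :-t]
    else:
        out[:, :t] = X[:, -t:]
    return out


def nmfd_approx(W, H):
    """Approximation of V: sum over t of W[t] @ (H shifted right by t)."""
    return sum(W[t] @ shift(H, t) for t in range(W.shape[0]))


def kl_divergence(V, approx, eps=1e-9):
    """Divergence-style cost between V and its approximation."""
    return float(np.sum(V * np.log((V + eps) / (approx + eps)) - V + approx))


def nmfd(V, R, T, n_iter=100, eps=1e-9, seed=0):
    """Return W of shape (T, M, R) and H of shape (R, N) approximating V."""
    rng = np.random.default_rng(seed)
    M, N = V.shape
    W = rng.random((T, M, R)) + eps
    H = rng.random((R, N)) + eps
    ones = np.ones((M, N))

    for _ in range(n_iter):
        # H update: average the multiplicative factors obtained from every W[t],
        # so that no single W[t] dominates the result.
        ratio = V / (nmfd_approx(W, H) + eps)
        factor = np.zeros_like(H)
        for t in range(T):
            factor += (W[t].T @ shift(ratio, -t)) / (W[t].T @ ones + eps)
        H *= factor / T

        # W[t] updates: same form as NMF, with H shifted right by t.
        ratio = V / (nmfd_approx(W, H) + eps)
        for t in range(T):
            Ht = shift(H, t)
            W[t] *= (ratio @ Ht.T) / (ones @ Ht.T + eps)
    return W, H
```

Calling `nmfd` with T = 1 makes every shift a no-op, so the loop falls back to the plain NMF updates.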

Non-negative Matrix Factor Deconvolution
• We perform a two-component NMFD with T = 10. This results in one matrix H and T matrices $W_t$, each of size M × 2.
• The n-th column of the t-th matrix $W_t$ is the n-th basis offset by t spots in the left-right dimension (time in our case).
NMFD for Sound Object Extraction
• Using the above formulation of NMFD, we analyze a sound snippet which contains a set of drum sounds.
• NMFD was performed for 3 basis functions, each with a time extent of 10 DFT frames (R = 3 and T = 10).

Non-negative Matrix Factor Deconvolution
• The results are shown in figure 3. There are three types of drum sounds present in the scene: four instances of the bass drum (the low-frequency element), two instances of a snare drum (the two loud wideband bursts), and the hi-hat (the repeating high-band burst).
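One way to pull an individual drum sound out of such a decomposition, assuming a sound object is taken to be the part of the NMFD approximation contributed by a single basis, is sketched below. The function name `component_magnitude` and the array layout (`W` of shape T × M × R, `H` of shape R × N, with T ≤ N) are assumptions for illustration. Turning the resulting magnitudes back into audio would additionally require phase information, which is not covered here.

```python
import numpy as np


def component_magnitude(W, H, r):
    """Magnitude spectrogram contributed by basis r alone.

    W has shape (T, M, R), H has shape (R, N); T <= N is assumed.
    """
    T, M, _ = W.shape
    N = H.shape[1]
    out = np.zeros((M, N))
    for t in range(T):
        # Column r of W[t] times row r of H, with H delayed by t frames.
        out[:, t:] += np.outer(W[t, :, r], H[r, : N - t])
    return out
```

Summing `component_magnitude(W, H, r)` over all r gives back the full NMFD approximation of V, so the components partition the modeled energy among the bases.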

Non-negative Matrix Factor Deconvolution

Conclusions
• In this paper we presented a convolutional version of NMF.
• We have pinpointed some of the shortcomings of conventional NMF when analyzing temporal patterns and presented an extension which results in the extraction of more expressive basis functions.
• We have also shown how these basis functions can be used, in the same way spectral bases have been used on spectrograms, to extract sound objects from single-channel sound scenes.