Bayesian Nonparametric Matrix Factorization for Recorded Music


Bayesian Nonparametric Matrix Factorization for Recorded Music
Matthew D. Hoffman, David M. Blei, Perry R. Cook
Presented by Lu Ren, Electrical and Computer Engineering, Duke University

Outline
- Introduction
- GaP-NMF Model
- Variational Inference
- Evaluation
- Related Work
- Conclusions

Introduction
- Source separation: breaking audio spectrograms into separate sources of sound
  - Identifying individual instruments and notes
  - Predicting hidden or distorted signals
- Previous work requires specifying the number of sources in advance; here it is inferred with a Bayesian nonparametric prior
- Gamma Process Nonnegative Matrix Factorization (GaP-NMF)
- Computational challenge: non-conjugate pairs of distributions, chosen to fit spectrogram data rather than for computational convenience
- Solution: a bigger variational family that still admits an analytic coordinate-ascent algorithm

GaP-NMF Model
- Observation X: the Fourier power spectrogram of an audio signal, an M-by-N matrix of nonnegative reals
- X_mn: the power at time window n and frequency bin m
  - Take the DFT of a window of 2(M-1) samples, keep only the first M bins, and record the squared magnitude in each frequency bin
- Assume K static sound sources
- W describes these sources: W_mk is the average amount of energy source k exhibits at frequency m
- H describes the amplitude of each source changing over time: H_kn is the gain of source k at time n
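
A minimal sketch of this spectrogram computation, assuming non-overlapping rectangular windows (the slides do not specify windowing or overlap):

```python
import numpy as np

def power_spectrogram(x, M):
    """Power spectrogram: each column is the squared DFT magnitude of
    one window of 2(M-1) samples, keeping only the first M bins."""
    win_len = 2 * (M - 1)
    n_windows = len(x) // win_len
    X = np.empty((M, n_windows))
    for n in range(n_windows):
        frame = x[n * win_len:(n + 1) * win_len]
        spectrum = np.fft.rfft(frame)    # first M non-redundant DFT bins
        X[:, n] = np.abs(spectrum) ** 2  # power in each frequency bin
    return X

# 1 second of noise at 8 kHz, M = 129 frequency bins (windows of 256 samples)
signal = np.random.default_rng(0).standard_normal(8000)
X = power_spectrogram(signal, M=129)
```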

GaP-NMF Model
- Mixing K sound sources in the time domain, the spectrogram is (under certain assumptions) exponentially distributed,1 with mean sum_k theta_k W_mk H_kn at cell (m, n)
- Goal: infer both the characters and the number of latent audio sources
- K: truncation level
1 Abdallah & Plumbley (2004) and Fevotte et al. (2009)
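
The generative model can be sketched by sampling; the hyperparameter values a, b, alpha, c below are illustrative assumptions, and the exponential is parameterized by its mean, following the Itakura-Saito likelihood of Fevotte et al.:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, K = 64, 100, 8                  # frequencies, windows, truncation level
a, b, alpha, c = 0.1, 0.1, 1.0, 1.0   # assumed hyperparameter values

W = rng.gamma(a, 1.0 / a, size=(M, K))                   # W_mk ~ Gamma(a, a)
H = rng.gamma(b, 1.0 / b, size=(K, N))                   # H_kn ~ Gamma(b, b)
theta = rng.gamma(alpha / K, 1.0 / (alpha * c), size=K)  # per-source gains

mean = (W * theta) @ H           # sum_k theta_k W_mk H_kn for every (m, n)
X = rng.exponential(mean)        # X_mn ~ Exponential with that mean
```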

GaP-NMF Model
- As K goes to infinity, theta approximates an infinite sequence drawn from a gamma process
- The number of elements of theta greater than some epsilon > 0 is finite almost surely
- If c is sufficiently large relative to alpha, only a few elements of theta are substantially greater than 0
- Setting alpha and c therefore controls the expected number of active sources
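
A quick numerical check of this sparsity property (the truncation level K and the hyperparameter values below are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
K, alpha, c = 100, 1.0, 1.0

# theta_k ~ Gamma(alpha/K, alpha*c); as K grows this sequence approximates
# a draw from a gamma process, with E[sum_k theta_k] = 1/c regardless of K.
theta = rng.gamma(alpha / K, 1.0 / (alpha * c), size=K)

eps = 0.01
n_active = int(np.sum(theta > eps))  # finite, and small relative to K
```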

Variational Inference
- Variational distribution: an expanded family, the Generalized Inverse-Gaussian (GIG), with density q(y) proportional to y^(gamma - 1) exp(-rho y - tau / y); the normalizer involves K_gamma, a modified Bessel function of the second kind
- The Gamma family is the special case of the GIG family where tau = 0
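
The expected sufficient statistics of the GIG can be computed from the modified Bessel function of the second kind; a sketch using SciPy, with parameter names following the density q(y) proportional to y^(gamma-1) exp(-rho y - tau/y):

```python
import numpy as np
from scipy.special import kv  # modified Bessel function of the second kind

def gig_expectations(gamma, rho, tau):
    """E[y] and E[1/y] under GIG(gamma, rho, tau)."""
    s = 2.0 * np.sqrt(rho * tau)
    e_y = kv(gamma + 1, s) / kv(gamma, s) * np.sqrt(tau / rho)
    e_inv_y = kv(gamma - 1, s) / kv(gamma, s) * np.sqrt(rho / tau)
    return e_y, e_inv_y

e_y, e_inv_y = gig_expectations(2.0, 1.0, 1.0)
```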

Variational Inference
- Lower bound of the GaP-NMF model: the standard variational objective (ELBO), evaluated through expected sufficient statistics of the variational factors
- GIG family sufficient statistics: E[y], E[1/y], and E[log y]
- Gamma family sufficient statistics: E[y] and E[log y]

Variational Inference
- The likelihood term expands, for each entry X_mn, into a negative log of the sum sum_k theta_k W_mk H_kn and a negative reciprocal term X_mn divided by that sum
- With Jensen's inequality, the expectation of the reciprocal term is lower-bounded using auxiliary weights phi_mnk >= 0 with sum_k phi_mnk = 1

Variational Inference
- With a first-order Taylor approximation around omega_mn, an arbitrary positive point, the expectation of the negative log term is lower-bounded as well: the tangent of the convex function -log lies below it
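
A hedged reconstruction of the two bounds (the equation images are missing from this transcript), following the standard GaP-NMF derivation with the notation defined above:

```latex
% Exponential likelihood of the power spectrogram:
\log p(X \mid \theta, W, H)
 = \sum_{m,n}\left[ -\log \sum_k \theta_k W_{mk} H_{kn}
   - \frac{X_{mn}}{\sum_k \theta_k W_{mk} H_{kn}} \right].

% Jensen's inequality, with auxiliary weights
% \phi_{mnk} \ge 0, \sum_k \phi_{mnk} = 1:
\mathbb{E}\!\left[ \frac{-X_{mn}}{\sum_k \theta_k W_{mk} H_{kn}} \right]
 \ge -X_{mn} \sum_k \phi_{mnk}^2\,
     \mathbb{E}\!\left[ \frac{1}{\theta_k W_{mk} H_{kn}} \right].

% First-order Taylor bound around an arbitrary positive point \omega_{mn}
% (-\log is convex, so its tangent at \omega_{mn} lies below it):
\mathbb{E}\!\left[ -\log \sum_k \theta_k W_{mk} H_{kn} \right]
 \ge -\log \omega_{mn}
   - \frac{\sum_k \mathbb{E}[\theta_k]\,\mathbb{E}[W_{mk}]\,
           \mathbb{E}[H_{kn}] - \omega_{mn}}{\omega_{mn}}.
```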

Variational Inference
- The algorithm alternates between tightening the likelihood bound (optimizing the auxiliary parameters phi and omega) and optimizing the variational distributions by coordinate ascent
- For example, each coordinate update sets one variational factor's parameters from the expected sufficient statistics of the remaining factors
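
A sketch of the bound-tightening step: the expectation arrays below are random stand-ins with hypothetical names, but the two update rules are the closed-form optima of the Jensen and Taylor bounds above.

```python
import numpy as np

rng = np.random.default_rng(2)
K, M, N = 3, 4, 5

# Stand-ins for current variational expectations (hypothetical values):
# E_inv[k, m, n]  ~ E[1 / (theta_k W_mk H_kn)]
# E_mean[k, m, n] ~ E[theta_k W_mk H_kn]
E_inv = rng.gamma(2.0, 1.0, size=(K, M, N))
E_mean = rng.gamma(2.0, 1.0, size=(K, M, N))

# Optimal Jensen weights: phi_mnk proportional to 1 / E[1/(theta_k W_mk H_kn)]
phi = 1.0 / E_inv
phi /= phi.sum(axis=0, keepdims=True)  # normalize over sources k

# Optimal Taylor point: omega_mn = sum_k E[theta_k W_mk H_kn]
omega = E_mean.sum(axis=0)
```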

Evaluation
- Compare GaP-NMF to two variations:
  1. A finite Bayesian model
  2. A finite non-Bayesian model: Itakura-Saito Nonnegative Matrix Factorization (IS-NMF), which maximizes the likelihood in the formula above
- Compare with two other NMF algorithms:
  - EU-NMF: minimize the sum of squared Euclidean distances
  - KL-NMF: minimize the generalized KL-divergence
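
For reference, a minimal sketch of the EU-NMF baseline via the classic Lee-Seung multiplicative updates (the standard algorithm, not code from the paper):

```python
import numpy as np

def eu_nmf(X, K, n_iter=200, seed=0):
    """Minimize ||X - W H||_F^2 with multiplicative updates,
    which keep W and H nonnegative at every step."""
    rng = np.random.default_rng(seed)
    M, N = X.shape
    W = rng.random((M, K)) + 1e-3
    H = rng.random((K, N)) + 1e-3
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + 1e-12)
        W *= (X @ H.T) / (W @ H @ H.T + 1e-12)
    return W, H

# Factor a small nonnegative "spectrogram" into K = 3 parts
X = np.abs(np.random.default_rng(1).standard_normal((8, 10))) ** 2
W, H = eu_nmf(X, K=3)
err = np.linalg.norm(X - W @ H)
```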

Evaluation 1. Synthetic Data

Evaluation 2. Marginal Likelihood & Bandwidth Expansion

Evaluation 3. Blind Monophonic Source Separation

Conclusions
- Related work
- The Bayesian nonparametric model GaP-NMF
- Applicable to other types of audio