SCALE Speech Communication with Adaptive LEarning Computational Methods for Structured Sparse Component Analysis of Convolutive Speech Mixtures Volkan.

Presentation transcript:

SCALE: Speech Communication with Adaptive LEarning
Computational Methods for Structured Sparse Component Analysis of Convolutive Speech Mixtures
Volkan Cevher
Joint work with Afsaneh Asaei, Mike Davies, Hervé Bourlard
École Polytechnique Fédérale de Lausanne; The University of Edinburgh; Idiap Research Institute, Martigny, Switzerland
ICASSP 2012, International Conference on Acoustics, Speech and Signal Processing
Kyoto, Japan, March 29th, 2011

Key idea
- We cast under-determined speech separation as a sparse signal recovery problem and leverage compressive sensing theory to solve it.
- We incorporate the structures underlying the spectro-temporal representation of speech into sparse component analysis.
[Diagram labels: Speech Recovery; Speech Spectrographic Structures; Sparse Component Analysis; Model-based Sparse Component Analysis]

Compressive Sensing (CS)

In a nutshell
- CS is sensing via dimensionality reduction.
- Dimensionality reduction happens naturally in many problems, so we can leverage CS theory and algorithms.

Sparse signal acquisition and recovery (in theory)
I. Sparse representation
- Only N out of G coordinates are nonzero, N << G.
II. Compressive measurement
- Information/distance preserving; M < G.
III. Signal recovery
- Given the observations and the measurement matrix, find the sparsest signal that matches those observations.
[Figure label: N-planes]
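The first two steps above can be illustrated with a small numerical sketch. It assumes a random Gaussian measurement matrix, which is a common textbook choice and not the acoustic measurement matrix used in this work; the names `make_sparse_signal` and `gaussian_measurement_matrix` and the sizes G, N, M are illustrative only. The sketch checks that the distance between two N-sparse signals is roughly preserved after projecting from G down to M dimensions.

```python
import numpy as np

def make_sparse_signal(G, N, rng):
    """Return a length-G vector with exactly N nonzero (Gaussian) entries."""
    x = np.zeros(G)
    support = rng.choice(G, size=N, replace=False)
    x[support] = rng.standard_normal(N)
    return x

def gaussian_measurement_matrix(M, G, rng):
    """Random M x G matrix, scaled so that E[||Phi x||^2] = ||x||^2."""
    return rng.standard_normal((M, G)) / np.sqrt(M)

rng = np.random.default_rng(0)
G, N, M = 1000, 10, 200          # ambient dimension, sparsity, measurements

x1 = make_sparse_signal(G, N, rng)
x2 = make_sparse_signal(G, N, rng)
Phi = gaussian_measurement_matrix(M, G, rng)

# Compressive measurements: M < G numbers per signal.
y1, y2 = Phi @ x1, Phi @ x2

# Distance preservation: this ratio should be close to 1.
ratio = np.linalg.norm(y1 - y2) / np.linalg.norm(x1 - x2)
print(f"distance ratio after projection: {ratio:.3f}")
```

The third step (recovery) needs an algorithm that searches for the sparsest signal consistent with the measurements; this sketch only covers acquisition.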

Model-based CS, in practice …
- Compressible representation
  - Sorted coordinates decay according to a power law with rate r < 1.
  - A sparse representation of speech is obtained by Gabor expansion.
- Model-based signal recovery
  - Leveraging the structure underlying the sparse coefficients improves recovery performance and reduces the number of required measurements.
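One way structure can enter the recovery is through the approximation step: instead of keeping the largest individual coefficients, keep whole groups of coefficients. The sketch below shows this for a simple block model, assuming contiguous, equal-sized blocks; the function name `block_hard_threshold` and the block layout are illustrative assumptions, not the exact block-dependency model of the slides.

```python
import numpy as np

def block_hard_threshold(x, block_size, num_blocks):
    """Keep the num_blocks contiguous blocks with the largest energy; zero the rest.

    Assumes len(x) is a multiple of block_size.
    """
    x = np.asarray(x, dtype=float)
    blocks = x.reshape(-1, block_size)
    energy = np.sum(blocks ** 2, axis=1)       # energy of each block
    keep = np.argsort(energy)[-num_blocks:]    # indices of the strongest blocks
    out = np.zeros_like(blocks)
    out[keep] = blocks[keep]
    return out.reshape(x.shape)

x = np.array([0.1, 0.2, 3.0, 4.0, 0.0, 0.5, 2.0, 2.0])
x_hat = block_hard_threshold(x, block_size=2, num_blocks=2)
print(x_hat)
```

In the example, the second and fourth blocks carry the most energy, so they are kept and the other two are set to zero.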

Convolutive Speech Separation via Model-based Sparse Component Analysis

Insights from the 2000s
- Sparse component analysis [Yilmaz, Rickard; IEEE TSP’04 | Zibulevsky, Bofill; SP’01 | Saab et al.; IEEE TSP’07 | Gribonval; ICASSP’02 | O’Grady, Pearlmutter; ICA’04 | Georgiev et al.; IEEE TNN’05]
- Source localization by sparse recovery [Cevher et al.; IPSN’09 | Model and Zibulevsky; SP’06 | Malioutov, Cetin, and Willsky; IEEE TSP’05 | Guo et al.; MSSP’10 | Chen et al.; Proc. of IEEE’03]
Contribution of this work
- Model-based sparse recovery
- Model-based characterization of the convolutive acoustic measurements
- Importance of the ad-hoc microphone set-up

I. Sparse representation
- Spatial sparsity
  - Discretize the room into a dense grid of G cells.
  - Only very few cells have speech activity.
- Spatio-spectral representation
  - Process the signal in the spectro-temporal domain.
  - Block-dependency model
  - Harmonicity model

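II. Measurement matrix
- Natural compressive measurements arise from the Green’s function of the medium [Carin’09].
- Image model of the multipath effect, parameterized by the source and sensor positions [positions and equation not preserved in the transcript].
- Microphone array measurement matrix [equation not preserved in the transcript; its annotated terms are the reflection coefficient and the speed of sound].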
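Since the slide's equations did not survive, the following is only a sketch of how such a matrix could be assembled, under several assumptions that are not from the slides: a shoebox room, a single frequency, the free-space Green's function exp(-j2πfd/c)/(4πd), first-order images only, and one reflection coefficient `beta` shared by all walls. The function names are hypothetical.

```python
import numpy as np

def first_order_images(src, room):
    """Source position plus its six first-order mirror images in a shoebox room."""
    src = np.asarray(src, dtype=float)
    images = [src]
    for axis in range(3):
        for wall in (0.0, room[axis]):         # the two walls normal to this axis
            img = src.copy()
            img[axis] = 2.0 * wall - src[axis]
            images.append(img)
    return np.array(images)                    # shape (7, 3)

def measurement_matrix(mics, grid, room, freq, beta=0.8, c=343.0):
    """Phi[m, g]: response at microphone m to a unit source in grid cell g.

    beta is an assumed wall reflection coefficient, c the speed of sound in m/s.
    """
    mics = np.asarray(mics, dtype=float)
    grid = np.asarray(grid, dtype=float)
    gains = np.array([1.0] + [beta] * 6)       # direct path, then six reflections
    Phi = np.zeros((len(mics), len(grid)), dtype=complex)
    for g, cell in enumerate(grid):
        images = first_order_images(cell, room)
        for m, mic in enumerate(mics):
            d = np.linalg.norm(images - mic, axis=1)   # path lengths in metres
            Phi[m, g] = np.sum(gains * np.exp(-2j * np.pi * freq * d / c) / (4 * np.pi * d))
    return Phi

room = (3.0, 3.0, 3.0)
centers = 0.3 + 0.6 * np.arange(5)             # 0.6 m cells in a 3 m room
grid = np.array([[x, y, 1.5] for x in centers for y in centers])
mics = np.array([[1.0, 1.0, 1.0], [2.0, 1.0, 1.0], [1.0, 2.0, 1.0], [2.0, 2.0, 1.0]])
Phi = measurement_matrix(mics, grid, room, freq=1000.0)
print(Phi.shape)
```

Each column of `Phi` is the array response to one candidate source location, so an observation generated by a few active cells is a sparse combination of columns.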

III. Signal recovery
- Objective: recover the N-sparse signal, given
  - the array observation [symbol not preserved in the transcript]
  - the measurement matrix [symbol not preserved in the transcript]
- Challenge: the inverse problem is ill-posed; sparsity gives enough prior information to overcome this.
- The recovery algorithm seeks the sparsest solution.

III. Signal recovery, cont.
- Algorithms
  - Iterative Hard Thresholding (IHT)
  - Orthogonal Matching Pursuit (OMP)
  - Convex optimization (L1L2)
- Structures
  - Block-dependency
  - Harmonicity
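As a concrete reference for the first algorithm in the list, here is a minimal sketch of plain (unstructured) IHT on a synthetic, noise-free problem. It assumes a real-valued Gaussian measurement matrix and a fixed step size of 1/||Phi||^2; the structured variants would replace the `project` step with a model-based projection, which is not shown here.

```python
import numpy as np

def hard_threshold(x, N):
    """Keep the N largest-magnitude entries of x; zero the rest."""
    out = np.zeros_like(x)
    keep = np.argsort(np.abs(x))[-N:]
    out[keep] = x[keep]
    return out

def iht(y, Phi, N, num_iters=200, project=hard_threshold):
    """Iterative hard thresholding: a gradient step followed by projection."""
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2   # conservative fixed step size
    x = np.zeros(Phi.shape[1])
    for _ in range(num_iters):
        x = project(x + step * Phi.T @ (y - Phi @ x), N)
    return x

# Synthetic test problem (all sizes are illustrative).
rng = np.random.default_rng(1)
G, M, N = 256, 200, 3
Phi = rng.standard_normal((M, G)) / np.sqrt(M)
x_true = np.zeros(G)
x_true[rng.choice(G, size=N, replace=False)] = [1.0, -1.0, 1.0]
y = Phi @ x_true

x_hat = iht(y, Phi, N)
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(f"relative recovery error: {rel_err:.2e}")
```

Passing the projection as a parameter keeps the iteration unchanged when the sparsity model changes.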

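Speech separation set-up
- Reverberation time: 200 ms
- Grid resolution: 0.6 m × 0.6 m; room dimensions: 3 m × 3 m × 3 m
[Figure: room layout with the target speech source and the interference sources; the labelled distances (1.4 m, 1.5 m, 1.3 m, 1.5 m, 1.3 m, 0.2 m, 1 m, 0.86 m, 0.44 m) cannot be reliably matched to positions from the transcript.]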

Quality of the recovered speech
- Source to Distortion Ratio (SDR) obtained by different sparse recovery approaches
- Baseline SDR = -3 dB
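For orientation, a simplified SDR can be computed as the ratio of reference energy to distortion energy, in decibels. This is an assumption about the general form of the metric only: standard evaluation toolkits decompose the error into several terms before forming the ratio, and the slides do not say which implementation was used. The name `sdr_db` is illustrative.

```python
import numpy as np

def sdr_db(reference, estimate):
    """Simplified SDR: 10 * log10(reference energy / distortion energy)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    distortion = estimate - reference
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(distortion ** 2))

reference = np.ones(100)
# A 10% amplitude error means the distortion energy is 1% of the signal energy.
print(sdr_db(reference, 1.1 * reference))
```

Under this definition, a negative SDR such as the baseline above means the distortion carries more energy than the target signal.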

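Quality of the recovered speech, cont.
- PESQ: Perceptual Evaluation of Speech Quality
- PESQ ranges from -0.5 to 4.5 (clean speech)
- Baseline PESQ = [value not preserved in the transcript]
[Table: PESQ by microphone topology (rows: uniform, ad-hoc) and recovery method (columns: B-IHT, H-IHT, B-OMP, H-OMP, B-L1L2, H-L1L2); values not preserved in the transcript.]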

Conclusions
1. Information-bearing components of speech are sparse in the spectro-temporal domain.
   - Sparse component analysis is a potential approach to the problem of overlapping speech in realistic scenarios.
2. Structured sparsity models provide more efficient signal estimation from very few measurements.
   - This motivates incorporating speech models into multi-channel sparse component analysis.
3. Ad-hoc microphone arrays offer substantial improvement over compact microphone arrays.
Thank You!

II. Measurement matrix, cont.
- The first- and second-generation echoes are a unique signature of the room geometry*.
- We identify the early support of the room impulse response (RIR) by sparse approximation of a single source and its images in a free-space model.
- Room geometry is estimated as the best least-squares fit between the estimated early support of the RIR and the first- and second-generation virtual sources given by the image model.
* “Can one hear the shape of a room: The 2-D polygonal case”, I. Dokmanic, Y. M. Lu and M. Vetterli, ICASSP