Approaches of Interest in Blind Source Separation of Speech

Julien Bourgeois, DaimlerChrysler AG, Research and Technology, RIC/AD

Background
- Need for a speech-based human-machine interface in cars.
- Road noise and passengers' speech create adverse conditions for automatic speech recognition.

4 Approaches to the Cocktail Party Problem
1 - Computational Auditory Scene Analysis (CASA)
2 - Sparse Decomposition Approach
3 - Statistical Blind Source Separation
4 - Beamforming
Conclusion & Future plans

Computational Auditory Scene Analysis (CASA) - Generalities
- Aim: obtain an algorithmic description of higher auditory functions.
- Strong biological inspiration.
- One or two sensors (microphones) are considered.
- The microphone signal is filtered as in the human ear.
- Methods are variations on a segmentation-grouping scheme.

CASA - Segmentation
[Figure: time-frequency map, axes Time and Frequency Index]
Segmentation is based on temporal continuity.

CASA - Grouping
[Figure: time-frequency map, axes Time and Frequency Index]
Grouping rules are (1) harmonicity and (2) synchronous start or end. These rules agree with certain psychoacoustic phenomena.

CASA - Audio example
[Audio clips: mixture, separated]

Sparse Decomposition - Generalities
- Two sensor signals $x_1$ and $x_2$, each a mixture of $N$ acoustic sources $s_i$, are given.
- Aim: find an invertible transform $T$ such that the $N$ sources are disjoint in the transformed domain.
- DUET: $T$ = STFT (windowed short-time Fourier transform) works. Indeed, statistically the product $S_1(\omega,t)\,S_2(\omega,t)$ is small.

Sparse Decomposition - DUET
Assumption: "At each point $(\omega,t)$ of the spectrogram, only one source is active."
Which source $S_i$ is active at $(\omega,t)$? Look at the phase between $X_1(\omega,t)$ and $X_2(\omega,t)$, through the group delay $\angle\big(X_1(\omega,t)/X_2(\omega,t)\big)/\omega$.
[Figure: time-frequency map, axes Time and Frequency Index, with regions labelled Group delay 1 and Group delay 2]
Then set $S_i(\omega,t) = X_1(\omega,t)$.
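The masking step described on this slide can be sketched as follows. This is a toy illustration, not the DUET algorithm as published: the spectrograms are synthetic, the two sources are made exactly disjoint, and the two delay clusters are split with a fixed threshold (`split_delay`, a hypothetical parameter) instead of being found from a histogram. All function names are illustrative.

```python
import numpy as np

def duet_masks(X1, X2, omega, split_delay=0.0):
    """Assign each time-frequency point to one of two sources.

    X1, X2: complex spectrograms, shape (n_freq, n_frames).
    omega:  nonzero angular frequencies, shape (n_freq,).
    """
    # Phase of X1/X2, computed as angle(X1 * conj(X2)) to avoid dividing by zero.
    phase = np.angle(X1 * np.conj(X2))
    delay = phase / omega[:, None]      # the slide's Angle(X1/X2)/w
    mask1 = delay > split_delay         # assumption: two clusters, one on each side
    return mask1, ~mask1, delay

def make_toy_mixtures(n_freq=64, n_frames=50, d1=0.5, d2=-0.5, seed=0):
    """Two exactly disjoint sources; sensor 2 sees source i delayed by d_i samples."""
    rng = np.random.default_rng(seed)
    omega = np.linspace(0.1, 3.0, n_freq)   # rad/sample, |omega * d| stays below pi
    active1 = rng.random((n_freq, n_frames)) < 0.5
    S = (rng.standard_normal((2, n_freq, n_frames))
         + 1j * rng.standard_normal((2, n_freq, n_frames)))
    S1 = np.where(active1, S[0], 0)
    S2 = np.where(active1, 0, S[1])
    X1 = S1 + S2
    X2 = (S1 * np.exp(-1j * omega[:, None] * d1)
          + S2 * np.exp(-1j * omega[:, None] * d2))
    return omega, S1, S2, X1, X2

omega, S1, S2, X1, X2 = make_toy_mixtures()
mask1, mask2, delay = duet_masks(X1, X2, omega)
out1 = np.where(mask1, X1, 0)   # "set S_i(w,t) = X_1(w,t)" where source 1 is active
out2 = np.where(mask2, X1, 0)
```

With real speech the disjointness only holds approximately, so the outputs would be estimates of the sources, not exact copies.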

Sparse Decomposition - Audio example
[Audio clips: Mix 1, Mix 2, Out 1, Out 2]

Statistical Blind Source Separation
- Assumption: "The sources are decorrelated" or "The sources are independent" (ICA = Independent Component Analysis).
- Generally needs at least as many sensors as sources.
- Permutation and scale ambiguities: if $s_1$ and $s_2$ are independent, so are $s_2$ and $b\,s_1$ for any nonzero scalar $b$.

Statistical Blind Source Separation
Mixture model (convolutive, with mixing filters of order $K$):
$$x(n) = A(0)\,s(n) + \dots + A(K)\,s(n-K) = (A * s)(n)$$
In the time-frequency (TF) domain this becomes
$$X(\omega,t) = A(\omega)\,S(\omega,t)$$
Separation filters $W$: find $W(\omega)$ such that the components of $Y(\omega,t) = W(\omega)\,X(\omega,t)$ are independent or decorrelated ($Y$ estimates the sources $S$).
For a decorrelation criterion, the output $Y$ must be decorrelated at each $t$. One can find $W$ by minimizing the off-diagonal terms of
$$R_{YY}(\omega,t) = E\big[Y(\omega,t)\,Y^H(\omega,t)\big]$$
jointly for all $t$.
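A minimal sketch of the decorrelation criterion at a single frequency bin is given here. It only evaluates the cost (sum of squared off-diagonal terms of $R_{YY}$ over several time blocks) for a candidate $W$; the iterative minimization itself is not shown. The block-wise covariance estimate, the toy mixing matrix `A`, and the function names are assumptions made for illustration.

```python
import numpy as np

def block_covariances(X, n_blocks):
    """Estimate R_XX(t) = E[X X^H] on consecutive blocks. X: (channels, frames)."""
    blocks = np.array_split(X, n_blocks, axis=1)
    return [b @ b.conj().T / b.shape[1] for b in blocks]

def offdiag_cost(W, covs):
    """Sum over blocks of the squared off-diagonal terms of R_YY = W R_XX W^H."""
    cost = 0.0
    for R in covs:
        Ryy = W @ R @ W.conj().T
        off = Ryy - np.diag(np.diag(Ryy))
        cost += np.sum(np.abs(off) ** 2)
    return cost

# Toy data: two independent sources whose power changes from block to block.
rng = np.random.default_rng(0)
n_blocks, block_len = 4, 1000
stds = np.array([[1.0, 2.0, 1.0, 2.0],
                 [2.0, 1.0, 2.0, 1.0]])
S = np.concatenate(
    [stds[:, [k]] * (rng.standard_normal((2, block_len))
                     + 1j * rng.standard_normal((2, block_len)))
     for k in range(n_blocks)],
    axis=1,
)
A = np.array([[1.0, 0.6], [0.5, 1.0]], dtype=complex)  # hypothetical A(w) at one bin
X = A @ S

covs = block_covariances(X, n_blocks)
cost_identity = offdiag_cost(np.eye(2), covs)          # no separation
cost_inverse = offdiag_cost(np.linalg.inv(A), covs)    # ideal separation
```

The cost for the ideal `W = inv(A)` is small but not zero, because the covariances are estimated from a finite number of frames. Using several blocks with different source powers is what makes the joint criterion informative.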

Statistical Blind Source Separation
- Very few assumptions on the sources.
- But: in the frequency domain, the ambiguities occur independently at each frequency bin $\omega$.
- Can be CPU-expensive because of the iterative optimization.
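The per-bin ambiguity can be written out explicitly. In the sketch below, $P(\omega)$ (a permutation matrix) and $D(\omega)$ (an invertible diagonal matrix) are notation introduced here for illustration, not symbols from the slides.

```latex
% Assumption: W(\omega) is any solution satisfying the decorrelation or
% independence criterion at bin \omega.
Y(\omega,t) = W(\omega)\,X(\omega,t) = D(\omega)\,P(\omega)\,S(\omega,t)
```

Because the criterion is unchanged by $D(\omega)$ and $P(\omega)$, and both may differ from one bin to the next, the output components have to be reordered and rescaled consistently across frequencies before going back to the time domain.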

Statistical Blind Source Separation - Audio example
[Audio clips: Mix 1, Mix 2, Out 1, Out 2]

Beamforming - Array signal processing
- Spatial locations of the sources (direction of arrival, DOA) are mapped onto delays between sensors.
- Array signal processing addresses 3 estimation problems: 1) the number of sources, 2) their spatial locations, 3) spatial filtering.
- Can require more sensors than sources, depending on the spatial resolution.
[Figure: sources $s_1$, $s_2$ impinging on sensors $x_1, \dots, x_i, \dots, x_N$]
$$x_i(t) = s_1(t - d_{1,i}) + s_2(t - d_{2,i})$$
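How a direction of arrival maps onto inter-sensor delays can be sketched for the simplest geometry. The example assumes a far-field source, a linear array along the x axis, and a speed of sound of 343 m/s; none of these are stated on the slide, and the sign convention (angle measured from broadside, positive toward the negative-x end) is one choice among several.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

def doa_to_delays(theta_rad, mic_positions_m, c=SPEED_OF_SOUND):
    """Far-field delays in seconds at each microphone, relative to x = 0.

    theta_rad:       direction of arrival, measured from broadside.
    mic_positions_m: microphone x coordinates in metres (linear array).
    """
    return np.asarray(mic_positions_m, dtype=float) * np.sin(theta_rad) / c

# Three microphones spaced 10 cm apart, source 30 degrees off broadside.
delays = doa_to_delays(np.deg2rad(30.0), [0.0, 0.1, 0.2])
```

A broadside source (angle 0) gives identical arrival times at all microphones, which is why closely spaced sources near broadside are hard to resolve with a short array.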

Beamforming - Source Location
1/ Energy-based: search for the delays $d_i$ that maximize the output power $\sigma_y^2$ of a delay-sum beamformer,
$$y(t) = x_1(t + d_1) + \dots + x_N(t + d_N)$$
2/ Correlation-based: search for the delay $d$ that maximizes $E[x_i(t)\,x_j(t-d)]$ for some pairs $(i,j)$.
3/ High resolution: with $X(\omega,t) = A(\omega)\,S(\omega,t)$, the eigendecomposition of $R_{XX} = A\,R_{SS}\,A^H$ provides information on $A$, i.e. on the source locations. $R_{SS}$ is diagonal if the sources are decorrelated.
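The correlation-based search (item 2) can be sketched with integer sample delays. The expectation is replaced by a time average, the search range `max_delay` is a hypothetical parameter, and the test signal is white noise, which gives a much sharper correlation peak than speech would.

```python
import numpy as np

def correlation_delay(xi, xj, max_delay):
    """Return the integer d in [-max_delay, max_delay] maximizing mean(xi(t) * xj(t - d))."""
    n = len(xi)
    delays = np.arange(-max_delay, max_delay + 1)
    scores = []
    for d in delays:
        # Keep only the samples t for which both t and t - d are valid indices.
        t = np.arange(max(0, d), min(n, n + d))
        scores.append(np.mean(xi[t] * xj[t - d]))
    return int(delays[int(np.argmax(scores))])

# Toy example: sensor j receives the same signal 3 samples later than sensor i.
rng = np.random.default_rng(0)
s = rng.standard_normal(2000)
xi = s
xj = np.concatenate([np.zeros(3), s[:-3]])     # xj(t) = s(t - 3)
estimated = correlation_delay(xi, xj, max_delay=10)
```

With the slide's definition the peak sits at $d = -3$ here, since $x_j(t-d) = s(t-d-3)$ lines up with $x_i(t) = s(t)$ when $d = -3$.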

Beamforming - Spatial Filtering
[Figure: filter-and-sum beamformer: each sensor signal $x_i$ is delayed by $d_i$ and filtered by $F_i$, the $N$ branches are summed, and the array is steered toward the direction of interest]
1/ Data-independent: e.g. delay-sum beamforming.
2/ Statistically optimal: constrain the response in the direction of interest and minimize the output power.
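Both kinds of spatial filter can be illustrated on toy data. The first function is a delay-sum beamformer with integer delays (circular shifts are used for brevity, which is a simplification). The second computes, at one frequency bin, the weights that keep unit response toward a steering vector while minimizing output power, commonly known as the MVDR or Capon solution; the slide does not name a specific method, so this choice is an assumption, as are the steering vectors and the covariance matrix.

```python
import numpy as np

def delay_and_sum(x, steer_delays):
    """y(t) = mean_i x_i(t + d_i). x: (n_mics, n_samples), integer delays, circular shift."""
    aligned = [np.roll(xi, -d) for xi, d in zip(x, steer_delays)]
    return np.mean(aligned, axis=0)

def mvdr_weights(R, a):
    """Minimize w^H R w subject to w^H a = 1: w = R^{-1} a / (a^H R^{-1} a)."""
    Ri_a = np.linalg.solve(R, a)
    return Ri_a / np.vdot(a, Ri_a)

def output_power(w, R):
    return float(np.real(np.vdot(w, R @ w)))

# 1/ Data-independent: target s1 and interferer s2 reach 4 mics with different delays.
rng = np.random.default_rng(0)
s1, s2 = rng.standard_normal((2, 4000))
d1 = [0, 1, 2, 3]          # delays of s1 at each mic (samples)
d2 = [0, -2, -4, -6]       # delays of s2 at each mic (samples)
x = np.array([np.roll(s1, p) + np.roll(s2, q) for p, q in zip(d1, d2)])
y = delay_and_sum(x, d1)   # steer toward s1
corr_mic = np.corrcoef(x[0], s1)[0, 1]
corr_bf = np.corrcoef(y, s1)[0, 1]

# 2/ Statistically optimal, at one frequency bin with hypothetical steering vectors.
a_target = np.exp(-1j * 0.8 * np.arange(4))
a_interf = np.exp(-1j * 2.0 * np.arange(4))
R = (np.outer(a_target, a_target.conj())
     + 4 * np.outer(a_interf, a_interf.conj())
     + 0.1 * np.eye(4))
w_mvdr = mvdr_weights(R, a_target)
w_ds = a_target / len(a_target)    # delay-sum weights, also unit response to target
```

The delay-sum output is more correlated with the target than any single microphone because the target adds coherently while the interferer does not. The constrained weights go further by placing low gain on the interferer, at the price of needing an estimate of `R`.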

Beamforming - Audio example
[Audio clips: Mix 1, Mix 2, Out 1, Out 2]

Conclusion & Questions
- Different definitions of "source": perceptual, topological, statistical, spatial.
- The approaches are complementary.
- No perfect solution to the cocktail party problem.

Future plans in Hoarse
Combination of existing methods:
- DUET if the sources are disjoint
- ICA or beamforming if they overlap
Investigation of specific open questions:
- Estimation of the number of sources at each $(\omega,t)$ point.
- Sparse decomposition: optimal transform $T$? Extension to more than 2 mics?
- Theoretical boundaries?
- Equivalences between these approaches (e.g. second-order BSS and beamforming)?

Short Bibliography

CASA
- G. J. Brown and M. Cooke, "Computational Auditory Scene Analysis", Computer Speech and Language, vol. 8, no. 4, pp. 297-336, 1994.
- A. S. Bregman, "Auditory Scene Analysis", MIT Press, Cambridge, MA, 1990.
- G. Hu and D. Wang, "Monaural speech separation", NIPS 2002.

Sparse Decomposition - DUET
- M. Zibulevsky, B. A. Pearlmutter, P. Bofill, and P. Kisilev, "Blind Source Separation by Sparse Decomposition", chapter in S. J. Roberts and R. M. Everson (Eds.), Independent Component Analysis: Principles and Practice, Cambridge, 2001.
- O. Yilmaz and S. Rickard, "Blind Separation of Speech Mixtures via Time-Frequency Masking", submitted to the IEEE Transactions on Signal Processing, November 4, 2002.
- A. Jourjine, S. Rickard, and O. Yilmaz, "Blind Separation of Disjoint Orthogonal Signals: Demixing N Sources from 2 Mixtures", Proceedings of the 2000 IEEE Conference on Acoustics, Speech, and Signal Processing (ICASSP2000), vol. 5, pp. 2985-2988, Istanbul, Turkey, June 2000.

Statistical Blind Source Separation - ICA
- L. Parra and C. Spence, "Convolutive blind source separation of non-stationary sources", IEEE Trans. on Speech and Audio Processing, pp. 320-327, May 2000.
- T.-W. Lee, Independent Component Analysis: Theory and Applications, Kluwer Academic Publishers, September 1998.

Beamforming
- B. D. van Veen and K. M. Buckley, "Beamforming: A Versatile Approach to Spatial Filtering", IEEE ASSP Magazine, vol. 5, pp. 4-24, Apr. 1988.
- M. Brandstein and H. Silverman, "A practical methodology for speech source localization with microphone arrays", Computer, Speech and Language, vol. 11, no. 2, pp. 91-126, 1997.
- D. Ward and M. Brandstein (Eds.), Microphone Arrays: Techniques and Applications, Springer, Berlin, 2001, pp. 231-256.