May 3rd, 2010 Update

Outline
• Audio spatialization
• Performance evaluation (source separation)
• Source separation
• System overview
• Demonstration (system)
• Concentration measure and W-disjoint orthogonality
• Adaptive time-frequency representation (TFR)
• Demonstration (adaptive TFR)

Audio spatialization
• Audio spatialization – a spatial rendering technique that converts the available audio into the desired listening configuration
• Analysis – separating the individual sources
• Re-synthesis – re-creating the desired listener-end configuration
Block diagram: Available spatial audio (speakers) → Analysis (source separation) → separated sources → Re-synthesis (convolving with HRIRs) → Desired listener-end configuration (headphones)
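The re-synthesis step (convolving the separated sources with HRIRs) can be sketched as follows. This is a minimal illustration, not the system's actual code: the function name `render_binaural` and its argument layout are assumptions, and the HRIRs are taken as given arrays, one left/right pair per source.

```python
import numpy as np

def render_binaural(sources, hrirs_left, hrirs_right):
    """Convolve each source with its left/right HRIR and sum per ear.

    sources, hrirs_left, hrirs_right: lists of 1-D arrays, one entry per source.
    Returns an array of shape (2, n_out): row 0 is the left ear, row 1 the right.
    """
    n_out = max(len(s) + max(len(hl), len(hr)) - 1
                for s, hl, hr in zip(sources, hrirs_left, hrirs_right))
    left = np.zeros(n_out)
    right = np.zeros(n_out)
    for s, hl, hr in zip(sources, hrirs_left, hrirs_right):
        yl = np.convolve(s, hl)   # source as heard at the left ear
        yr = np.convolve(s, hr)   # source as heard at the right ear
        left[:len(yl)] += yl
        right[:len(yr)] += yr
    return np.stack([left, right])
```

Choosing a different HRIR pair per source is what places each separated source at its desired position in the headphone rendering.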

Performance evaluation [1]
Block diagram: Estimated source and original source → Performance evaluation block → Performance measures (ISR, SIR, SAR, SDR)
• ISR = Image to Spatial-distortion Ratio
• SIR = Source to Interference Ratio
• SAR = Source to Artifacts Ratio
• SDR = Source to Distortion Ratio

Performance evaluation
• The estimated source image can be decomposed into
  • the true source image
  • error components: spatial distortion, interference and artifacts
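The slide's own symbols did not survive in the transcript. In the notation commonly used in the source separation evaluation literature (an assumption here, with j indexing sources and i indexing channels), the decomposition can be written as:

```latex
\hat{s}^{\mathrm{img}}_{ij}(t)
  = s^{\mathrm{img}}_{ij}(t)
  + e^{\mathrm{spat}}_{ij}(t)
  + e^{\mathrm{interf}}_{ij}(t)
  + e^{\mathrm{artif}}_{ij}(t)
```

Here the first term on the right is the true source image and the remaining three terms are the spatial distortion, interference and artifacts components.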

Performance evaluation
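Given the four components of the decomposition, the measures are energy ratios in dB. The sketch below uses the ratio definitions commonly paired with this decomposition in the evaluation literature; the function names are hypothetical, and the harder step of actually obtaining the error components from an estimate is assumed to have been done already.

```python
import numpy as np

def energy_ratio_db(num, den):
    """10*log10 of the energy ratio between two signals (summed over all samples and channels)."""
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    return 10.0 * np.log10(np.sum(num ** 2) / np.sum(den ** 2))

def separation_measures(s_img, e_spat, e_interf, e_artif):
    """ISR, SIR, SAR and SDR (in dB) from an already decomposed estimate."""
    s_img, e_spat, e_interf, e_artif = (
        np.asarray(a, dtype=float) for a in (s_img, e_spat, e_interf, e_artif))
    return {
        "ISR": energy_ratio_db(s_img, e_spat),
        "SIR": energy_ratio_db(s_img + e_spat, e_interf),
        "SAR": energy_ratio_db(s_img + e_spat + e_interf, e_artif),
        "SDR": energy_ratio_db(s_img, e_spat + e_interf + e_artif),
    }
```

Higher values are better for all four measures; SDR is the overall figure because its denominator collects every error component.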

Source separation [2,3]
Block diagram: Mixtures (stereo) → Time-frequency transform → Source analysis → Source synthesis → Inverse time-frequency transform → Separated sources (>=2)
• Source separation – obtaining estimates of the underlying sources from a set of sensor observations
• Time-frequency transform
• Source analysis – estimation of the mixing parameters
• Source synthesis – estimation of the sources
• Inverse time-frequency transform

Mixing model
• Anechoic mixing model
• Mixtures, x_i
• Sources, s_j
• Under-determined (M < N)
• M = number of mixtures
• N = number of sources
Figure: Anechoic mixing model – audio is observed at the microphones with differing intensities and arrival times (because of propagation delays) but with no reverberation
Source: P. O'Grady, B. Pearlmutter and S. Rickard, “Survey of sparse and non-sparse methods in source separation,” International Journal of Imaging Systems and Technology, 2005
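In an anechoic model each mixture is a sum of scaled and delayed sources, x_i(t) = Σ_j a_ij s_j(t − d_ij). The sketch below generates such mixtures; the name `anechoic_mix` and the restriction to whole-sample, non-negative delays are simplifying assumptions for illustration only.

```python
import numpy as np

def anechoic_mix(sources, gains, delays):
    """Anechoic mixing: each mixture is a sum of attenuated, delayed sources.

    sources: array (N, T); gains: array (M, N);
    delays: integer array (M, N) of non-negative sample delays.
    Returns an array of shape (M, T + max delay).
    """
    sources = np.asarray(sources, dtype=float)
    gains = np.asarray(gains, dtype=float)
    delays = np.asarray(delays, dtype=int)
    n_src, n_samp = sources.shape
    n_mix = gains.shape[0]
    out = np.zeros((n_mix, n_samp + int(delays.max())))
    for i in range(n_mix):
        for j in range(n_src):
            d = delays[i, j]
            out[i, d:d + n_samp] += gains[i, j] * sources[j]
    return out
```

With M = 2 rows in `gains` and N = 3 source rows this produces the under-determined stereo case (M < N) considered in the slides.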

Mixtures
Panels: Source 1, Source 2, Source 3, Mixtures (stereo)

function – TFRStereo
Inputs
• Mixture (stereo)
• Sampling frequency
• DFT size
• Window size
• Hop size
Outputs
• Mixture TFRs
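A function with these inputs and outputs could look like the short-time Fourier transform below. It is a sketch under assumptions: the name `tfr_stereo`, the Hann window and the millisecond units for window and hop size are choices made here, not details taken from the slides.

```python
import numpy as np

def tfr_stereo(mixture, fs, n_fft, win_ms, hop_ms):
    """STFT of a stereo mixture.

    mixture: array (2, T); fs in Hz; n_fft is the DFT size;
    win_ms and hop_ms are the window and hop sizes in milliseconds.
    Returns a complex array of shape (2, n_frames, n_fft // 2 + 1).
    """
    mixture = np.asarray(mixture, dtype=float)
    win_len = int(round(win_ms * 1e-3 * fs))
    hop = int(round(hop_ms * 1e-3 * fs))
    if win_len > n_fft:
        raise ValueError("window must not be longer than the DFT size")
    if mixture.shape[1] < win_len:
        raise ValueError("mixture is shorter than one analysis window")
    window = np.hanning(win_len + 1)[:-1]            # periodic Hann window
    n_frames = 1 + (mixture.shape[1] - win_len) // hop
    tfr = np.empty((2, n_frames, n_fft // 2 + 1), dtype=complex)
    for ch in range(2):
        for m in range(n_frames):
            frame = mixture[ch, m * hop:m * hop + win_len] * window
            tfr[ch, m] = np.fft.rfft(frame, n=n_fft)  # zero-pads up to n_fft
    return tfr
```

The periodic Hann window is used because, at a hop of half the window length, its shifted copies sum to a constant, which keeps the transform easy to invert.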

Time-frequency transform

function – SourceAnalysis
Inputs
• Mixture TFRs
Outputs
• 2-D histogram
• Mixing parameters

Source analysis (estimation of mixing parameters)
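One way to obtain a 2-D histogram and mixing parameters from the mixture TFRs is the DUET-style analysis of [2]: every time-frequency point votes for a relative attenuation and delay between the two channels, and histogram peaks indicate sources. The sketch below follows that idea; the function names, bin counts and histogram ranges are assumptions, and only the single strongest peak is extracted (a full system would pick one peak per source).

```python
import numpy as np

def source_analysis(tfr, n_fft, n_bins=35, alpha_max=0.7, delta_max=3.6):
    """Weighted 2-D histogram of (symmetric attenuation, delay) over all TF points.

    tfr: complex array (2, n_frames, n_fft // 2 + 1).
    Returns (hist, attenuation bin edges, delay bin edges).
    """
    x1 = tfr[0][:, 1:]                       # drop the DC bin, where delay is undefined
    x2 = tfr[1][:, 1:]
    omega = 2.0 * np.pi * np.arange(1, x1.shape[1] + 1) / n_fft  # rad/sample
    eps = 1e-12                              # avoids division by exact zeros
    ratio = (x2 + eps) / (x1 + eps)
    mag = np.abs(ratio)
    alpha = mag - 1.0 / mag                  # symmetric attenuation
    delta = -np.angle(ratio) / omega         # relative delay in samples
    weights = np.abs(x1) * np.abs(x2)        # energetic points vote more strongly
    hist, a_edges, d_edges = np.histogram2d(
        alpha.ravel(), delta.ravel(), bins=n_bins,
        range=[[-alpha_max, alpha_max], [-delta_max, delta_max]],
        weights=weights.ravel())
    return hist, a_edges, d_edges

def strongest_peak(hist, a_edges, d_edges):
    """Attenuation and delay at the centre of the highest histogram bin."""
    i, j = np.unravel_index(np.argmax(hist), hist.shape)
    alpha = 0.5 * (a_edges[i] + a_edges[i + 1])
    delta = 0.5 * (d_edges[j] + d_edges[j + 1])
    atten = 0.5 * (alpha + np.sqrt(alpha ** 2 + 4.0))  # invert alpha = a - 1/a
    return atten, delta
```

The delay estimate is only unambiguous while the true delay times the highest frequency stays below π, which is why this kind of analysis suits closely spaced microphones.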

function – SourceSynthesis
Inputs
• Mixing parameters
• Mixture TFRs
• Estimation technique
  • DUET/LQBP
Outputs
• Estimated source masks
• Estimated source TFRs

Source synthesis (estimation of sources)
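For the DUET option, source synthesis amounts to binary masking: each time-frequency point is assigned to the source whose mixing parameters explain it best, and the source is then estimated from the masked mixtures. The sketch below illustrates this under assumptions (the function name and argument layout are hypothetical); the LQBP alternative of [3] is not shown.

```python
import numpy as np

def source_synthesis_duet(tfr, n_fft, atten, delay):
    """DUET-style binary masking.

    tfr: complex array (2, n_frames, n_fft // 2 + 1);
    atten, delay: length-N sequences of per-source attenuation and delay (samples).
    Returns (masks, est): boolean masks and complex source TFRs, both (N, n_frames, F).
    """
    x1, x2 = tfr[0], tfr[1]
    n_src = len(atten)
    omega = 2.0 * np.pi * np.arange(x1.shape[1]) / n_fft          # rad/sample
    a = np.asarray(atten, dtype=float)[:, None, None]
    d = np.asarray(delay, dtype=float)[:, None, None]
    steer = a * np.exp(-1j * d * omega)                            # shape (N, 1, F)
    # How poorly source j explains the pair (x1, x2) at each TF point
    cost = np.abs(steer * x1 - x2) ** 2 / (1.0 + a ** 2)
    winner = np.argmin(cost, axis=0)
    masks = winner[None] == np.arange(n_src)[:, None, None]
    # Combine both channels, aligned to the first one, inside each source's mask
    est = masks * (x1 + np.conj(steer) * x2) / (1.0 + a ** 2)
    return masks, est
```

The masks partition the time-frequency plane, so this approach relies on the sources rarely overlapping there, which is the W-disjoint orthogonality requirement.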

function – InverseTFR
Inputs
• Estimated source TFRs
• Sampling frequency
Outputs
• Estimated sources
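The inverse transform can be sketched as overlap-add of the inverse DFT of each frame. The names `stft_mono` and `inverse_tfr`, the Hann window and the extra window/hop arguments are assumptions made so that the example is self-contained; the forward helper is included only to show a round trip.

```python
import numpy as np

def stft_mono(x, fs, n_fft, win_ms, hop_ms):
    """Forward STFT of one channel (helper for the round-trip example)."""
    win_len = int(round(win_ms * 1e-3 * fs))
    hop = int(round(hop_ms * 1e-3 * fs))
    window = np.hanning(win_len + 1)[:-1]            # periodic Hann window
    n_frames = 1 + (len(x) - win_len) // hop
    return np.array([np.fft.rfft(x[m * hop:m * hop + win_len] * window, n=n_fft)
                     for m in range(n_frames)])

def inverse_tfr(tfr, fs, n_fft, win_ms, hop_ms):
    """Overlap-add inverse of stft_mono for one estimated source.

    tfr: complex array (n_frames, n_fft // 2 + 1). Returns a 1-D signal.
    """
    win_len = int(round(win_ms * 1e-3 * fs))
    hop = int(round(hop_ms * 1e-3 * fs))
    window = np.hanning(win_len + 1)[:-1]
    n_frames = tfr.shape[0]
    out = np.zeros((n_frames - 1) * hop + win_len)
    norm = np.zeros_like(out)
    for m in range(n_frames):
        frame = np.fft.irfft(tfr[m], n=n_fft)[:win_len]   # discard the zero padding
        out[m * hop:m * hop + win_len] += frame
        norm[m * hop:m * hop + win_len] += window
    covered = norm > 1e-8                                  # samples reached by a non-zero window
    out[covered] /= norm[covered]
    return out
```

Dividing by the accumulated window sum compensates for the analysis window, including at the signal edges where fewer frames overlap.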

Inverse time-frequency transform
Panels: Orig. source 1, Orig. source 2, Orig. source 3, Source 1, Source 2, Source 3

Demonstration (system)
Table layout: Mixture and Original, followed by SAR, SDR, SIR and ISR for No. of sources (2) and for No. of sources (3). All values are in dB.
Settings: DFT size = 2048, Window size = 50 ms, Hop size = 25 ms, Sampling frequency = Hz

Concentration measure
• Requirement for source separation
  • W-disjoint orthogonality (WDO)
• Sparsity is an indicator of WDO [4]
  • Thus a sparser TFR is expected to satisfy the WDO criterion to a greater extent
• Commonly used sparsity measures [5]
  • Kurtosis
  • Gini index
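Both sparsity measures can be computed directly from the magnitudes of the TFR coefficients. The sketch below uses the forms usually given in the sparsity-measure literature (a fourth-moment ratio for kurtosis and the sorted-coefficient form of the Gini index); the function names are hypothetical and the exact normalisation used in the original system is not known from the slides.

```python
import numpy as np

def kurtosis_measure(coeffs):
    """Kurtosis-type sparsity: sum(|c|^4) / (sum(|c|^2))^2. Larger means sparser."""
    m = np.abs(np.ravel(coeffs)).astype(float)
    energy = np.sum(m ** 2)
    if energy == 0.0:
        return 0.0
    return float(np.sum(m ** 4) / energy ** 2)

def gini_index(coeffs):
    """Gini index of the coefficient magnitudes: 0 for equal values, near 1 for one spike."""
    m = np.sort(np.abs(np.ravel(coeffs)).astype(float))   # ascending order
    total = np.sum(m)
    n = m.size
    if total == 0.0:
        return 0.0
    k = np.arange(1, n + 1)
    return float(1.0 - 2.0 * np.sum((m / total) * (n - k + 0.5) / n))
```

For a vector with a single non-zero entry the kurtosis-type measure is 1 and the Gini index is 1 − 1/N; for a constant vector they drop to 1/N and 0, so both grow as energy concentrates in fewer coefficients.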

Adaptive TFR
• Source separation demands (WDO)
  • Sparse time-frequency representation (TFR)
• Some observations
  • Music/speech signals – different frequency components are present at different time instants
  • Different analysis window lengths provide different sparsity [4]
• Therefore, to obtain a sparser TFR
  • At each time instant, use the analysis window length that gives the highest sparsity [6]
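The selection rule can be sketched as follows: at each frame centre, compute a spectrum with every candidate window length and keep the length whose spectrum is most concentrated. The names, the Hann windows and the kurtosis-type measure are assumptions for illustration; invertibility of the resulting window sequence is a separate constraint that this sketch does not enforce.

```python
import numpy as np

def concentration(spectrum):
    """Kurtosis-type concentration of a spectrum: larger means sparser."""
    m = np.abs(spectrum)
    return np.sum(m ** 4) / np.sum(m ** 2) ** 2

def select_windows(x, candidate_lens, hop, n_fft, measure=concentration):
    """Pick, per frame centre, the window length giving the sparsest spectrum.

    x: 1-D signal; candidate_lens: window lengths in samples (each <= n_fft);
    hop: distance between frame centres in samples.
    Returns (centres, chosen lengths).
    """
    x = np.asarray(x, dtype=float)
    half = max(candidate_lens) // 2 + 1
    centres = list(range(half, len(x) - half, hop))
    chosen = []
    for c in centres:
        best_len, best_val = None, -np.inf
        for length in candidate_lens:
            start = c - length // 2
            seg = x[start:start + length] * np.hanning(length)
            val = measure(np.fft.rfft(seg, n=n_fft))
            if val > best_val:
                best_len, best_val = length, val
        chosen.append(best_len)
    return centres, chosen
```

For a steady tone the longest window wins because it gives the narrowest spectral peak, while around transients shorter windows tend to score better because they avoid smearing the event over time.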

Adaptive TFR

function – TFRStereo (modified)
Inputs
• Mixture (stereo)
• Sampling frequency
• DFT size
• Window size
• Window size default
• Concentration measure
Outputs
• Mixture TFRs
• Adapted window sequence

Inverse adaptive TFR
• Constraint
  • The TFR should be invertible
• Solution
  • Select analysis windows such that they satisfy the constant overlap-add (COLA) criterion [7]
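For a single fixed window, the COLA criterion says that copies of the window shifted by the hop size must sum to a constant. The check below is a minimal sketch of that fixed-window case (the name `is_cola` and the tolerance are assumptions); the adaptive case in [7] additionally has to handle transitions between windows of different lengths, which is not covered here.

```python
import numpy as np

def is_cola(window, hop, tol=1e-10):
    """True if hop-shifted copies of the window sum to a constant."""
    window = np.asarray(window, dtype=float)
    n_seg = -(-len(window) // hop)                  # ceil(len / hop)
    padded = np.zeros(n_seg * hop)
    padded[:len(window)] = window
    # Entry n of `folded` is the steady-state overlap-add sum at offset n within a hop
    folded = padded.reshape(n_seg, hop).sum(axis=0)
    spread = folded.max() - folded.min()
    return bool(spread <= tol * max(1.0, np.abs(folded).max()))

periodic_hann = np.hanning(9)[:-1]                  # periodic Hann window, length 8
```

A periodic Hann window at 50% overlap passes the check, whereas the symmetric Hann window of the same length does not, which is why the periodic form is the usual choice for analysis-synthesis.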

Analysis windows (COLA)

function – InverseTFR (modified)
Inputs
• Estimated source TFRs
• Sampling frequency
• Adapted window sequence
• Window size default
Outputs
• Estimated sources

Demonstration (adaptive TFR)
Table layout: columns Source 1, Source 2, Source 3; rows Original, ATFR (20:10:90 ms) and TFR (60 ms), each with SAR, SDR, SIR and ISR. All values are in dB.

References
1. E. Vincent, R. Gribonval and C. Fevotte, “Performance measurement in blind audio source separation,” IEEE Transactions on Audio, Speech and Language Processing
2. A. Jourjine, S. Rickard and O. Yilmaz, “Blind separation of disjoint orthogonal signals: demixing n sources from 2 mixtures,” IEEE Conference on Acoustics, Speech and Signal Processing
3. R. Saab, O. Yilmaz, M. J. McKeown and R. Abugharbieh, “Underdetermined anechoic blind source separation via l_q basis pursuit with q < 1,” IEEE Transactions on Signal Processing, 2007
4. S. Rickard, “Sparse sources are separated sources,” European Signal Processing Conference
5. N. Hurley and S. Rickard, “Comparing measures of sparsity,” IEEE Transactions on Information Theory
6. D. L. Jones and T. Parks, “A high resolution data-adaptive time-frequency representation,” IEEE Transactions on Acoustics, Speech and Signal Processing
7. P. Basu, P. J. Wolfe, D. Rudoy, T. F. Quatieri and B. Dunn, “Adaptive short-time analysis-synthesis for speech enhancement,” IEEE Conference on Acoustics, Speech and Signal Processing, 2008

Questions? Thank you