Liverpool Keele Contribution.

Task 1.4. Envelope information and binaural processing (KEELE, RUB, PATRAS): KEELE will implement the use of envelope information (within & between channels) within an artificial system (CTK) and model its effects with respect to human listeners' data. They will study the effect of envelope information in other conditions and consider the respective contributions of envelope and other cues (pitch, ILD, ITD) to binaural processing, in conjunction with RUB and PATRAS.

Steve Greenberg: Band Experiments

Filter speech into bands
Present individual bands and combinations of bands to listeners; measure intelligibility
Delay individual channels
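The band-splitting step can be sketched with brick-wall FFT filtering. This is a minimal illustration: the real experiments would use sharper analogue-style filters, and the band edges below are my own octave-wide choices, not Greenberg's actual bands.

```python
import numpy as np

def split_into_bands(signal, fs, bands):
    """Split a signal into spectral bands by brick-wall FFT filtering."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    out = []
    for lo, hi in bands:
        # Zero everything outside [lo, hi) and transform back to time domain.
        masked = np.where((freqs >= lo) & (freqs < hi), spectrum, 0.0)
        out.append(np.fft.irfft(masked, n=len(signal)))
    return out

# Illustrative octave-wide band edges in Hz (hypothetical, for the sketch).
BANDS = [(240, 480), (480, 960), (960, 1920), (1920, 3840)]
```

Individual bands, sums of bands, or per-channel delayed versions can then be resynthesised and presented to listeners.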

Greenberg: combinations of bands carry more information than the linear sum of the individual bands
Band 2 or Band 3 alone: << 10 % intelligibility
Band 2 + Band 3 together: >> 60 % intelligibility

Intelligibility vs. MI measure

Spectrum rather than instantaneous amplitude

Delayed Bands & Spectrum ?

Delayed Bands & Time

Delayed Bands & sm. Phase

Next steps
Modulation spectrum / phase as input
Modulation spectrum to get time invariance
LF phase information to model the delay data
Experiments running – more later…
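The time-invariance point above can be illustrated in a few lines: the magnitude spectrum of a band's amplitude envelope discards absolute timing, so a delayed signal yields the same representation. The analytic-signal envelope and FFT choices below are my own sketch, not necessarily what the experiments use.

```python
import numpy as np

def modulation_spectrum(band_signal, fs):
    """Magnitude modulation spectrum of one band: the FFT of the amplitude
    envelope. Taking the magnitude discards envelope phase, which is what
    gives the representation its time-shift invariance."""
    n = len(band_signal)
    # Analytic signal via the FFT (same construction as scipy.signal.hilbert).
    spec = np.fft.fft(band_signal)
    h = np.zeros(n)
    h[0] = 1.0
    h[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        h[n // 2] = 1.0
    envelope = np.abs(np.fft.ifft(spec * h))
    mod_spec = np.abs(np.fft.rfft(envelope))
    mod_freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return mod_freqs, mod_spec
```

A 1 kHz carrier amplitude-modulated at 4 Hz produces a modulation-spectrum peak at 4 Hz, and that peak is unchanged if the signal is delayed.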

Task 4.2 Informing Speech Recognition (KEELE, DCAG, IDIAP) The aim is to combine classical and new noise estimation methods with a predictive element to allow the prediction and removal of time varying background noise. Novel noise estimation techniques will also be used to inform missing data techniques to obtain better recognition. In Blind Source Separation, the intention is to develop semi-blind algorithms which address the problems of echo compensation, noise reduction and de-reverberation.

Task 4.2: Our Interest
CASA / AMaps etc. are good techniques for noise estimation (followed by spectral subtraction)
BUT: significant processing delays –
interference with lip reading (100 ms)
interference with self-monitoring (40 ms)
Aim is to use prediction to compensate for the processing delays
Prediction can also help bridge segments where CASA fails (fricatives etc.)
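The noise-removal step mentioned above, spectral subtraction, is itself simple once a noise estimate is available. A minimal single-frame sketch (the over-subtraction factors and musical-noise smoothing used in practice are omitted):

```python
import numpy as np

def spectral_subtract(noisy_frame, noise_mag_est, floor=0.01):
    """One frame of magnitude spectral subtraction: subtract the estimated
    noise magnitude spectrum and resynthesise using the noisy phase."""
    spec = np.fft.rfft(noisy_frame)
    mag, phase = np.abs(spec), np.angle(spec)
    # Clamp to a small fraction of the noisy magnitude to avoid negative values.
    clean_mag = np.maximum(mag - noise_mag_est, floor * mag)
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(noisy_frame))
```

The quality of the result depends almost entirely on the noise estimate, which is why the estimation (and its delay) is the interesting problem.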

Key Questions
1) Can / do we use a predictive element in speech perception? (Elvira's talk)
2) Is 'noise' predictable?
Matched training / testing in ASR
Noise adaptation in human listeners
First results:

Predicting noise
Use Linear Prediction to estimate the noise spectrum 12.5 ms into the future (single channel).
Noise attenuation: the difference between subtracting the prediction and subtracting the long-term average.

noise type      history size: 63 ms  125 ms  250 ms  625 ms
1 car                          3.03    3.16    3.20    3.62
2 station                      3.04    3.78    4.07    5.13
3 street                       3.86    4.60    5.28    7.15
4 exhibition                   4.35    5.34    6.33    7.82
5 airport *                    5.61    7.10    8.52   10.06
6 station *                    6.77    9.06   11.26   12.48
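The prediction idea can be sketched per frequency bin: treat each bin's magnitude across past frames as a scalar track and extrapolate it one frame (one 12.5 ms hop) ahead. The least-squares AR fit and the order below are illustrative stand-ins for whatever LP variant the experiments actually used.

```python
import numpy as np

def predict_next(history, order=4):
    """Predict the next value of a scalar track (e.g. one frequency bin's
    magnitude over past frames) by least-squares linear prediction."""
    x = np.asarray(history, dtype=float)
    # Autoregression system: x[t] ~ a . x[t-order:t] for each past t.
    rows = np.array([x[t - order:t] for t in range(order, len(x))])
    targets = x[order:]
    a, *_ = np.linalg.lstsq(rows, targets, rcond=None)
    return float(x[-order:] @ a)

def predict_noise_spectrum(past_frames, order=4):
    """Predict the next frame's magnitude spectrum bin by bin from the
    history of past frames (shape: frames x bins)."""
    past = np.asarray(past_frames, dtype=float)
    return np.array([predict_next(past[:, b], order) for b in range(past.shape[1])])
```

The attenuation figures in the table above then compare subtracting this one-hop-ahead prediction against subtracting a long-term average spectrum.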

Tasks
Collect a database of environmental noises: Sennheiser in-ear microphones / DAT tape (extract circuit diagrams from Bochum again...)
Develop prediction algorithms: multichannel Linear Prediction; Neural Network

Crouzet, O. & Ainsworth, W. A. (2001). Envelope information in speech processing: Acoustic-phonetic analysis vs. auditory figure-ground segregation. Proceedings of Eurospeech 2001 (7th European Conference on Speech Communication and Technology), 3–7 September 2001, Aalborg, Denmark.

Crouzet, O. & Ainsworth, W. A. (2001). On the various influences of envelope information on the perception of speech in adverse conditions: An analysis of between-channel envelope correlation. CRAC Workshop (Consistent and Robust Acoustic Cues for sound analysis), 2 September 2001, Aalborg, Denmark.