The Fully Networked Car Geneva, 4-5 March 2009 1 Automotive Speech Enhancement of Today: Applications, Challenges and Solutions Tim Haulick Harman/Becker.

Slides:



Advertisements
Similar presentations
TWO STEP EQUATIONS 1. SOLVE FOR X 2. DO THE ADDITION STEP FIRST
Advertisements

Wideband Speech Coding for CDMA2000® Systems
Envelope Detector Conventional DSB-AM signals are easily demodulated by an envelope detector It consists of a diode and an RC circuit, which is a simple.
Acoustical parameters ISO 3382
6: Opportunistic Communication and Multiuser Diversity
1. Introduction.
Learning Introductory Signal Processing Using Multimedia 1 Outline Overview of Information and Communications Some signal processing concepts Tools available.
Iterative Equalization and Decoding
International Telecommunication Union The Fully Networked Car Geneva, 4-5 March 2009 Wrap-up Session Conclusions for Session 5 Session 5: Voice and audiovisual.
The Fully Networked Car Geneva, 4-5 March Jean-Pierre Jallet Car Active Noise Cancellation for improved car efficiency, From/In/To car voice communication.
Colombia, September 2013 QoS assessment for mobile speech services including user interfaces in vehicles and hands-free communication Dr.-Ing. H.
Addition Facts
Researches and Applications for Automotive Field Andrea Azzali, Eraldo Carpanoni, Angelo Farina University of Parma.
Sample questions from the Motorway Rules units of the Driving Theory syllabus. ESOL for Driving.
SIMS-201 The Telephone System Wired and Wireless.
(Data and Signals - cont)
Alertness and Attitude
1 Presenters notes... to help you … At the bottom right of each page there is a page number, when the slide has finished an automated sequence a small.
Acoustic Echo Cancellation for Low Cost Applications
Driving Theory Session 2 Cut and Paste answers
College of Engineering Capacity Allocation in Multi-cell UMTS Networks for Different Spreading Factors with Perfect and Imperfect Power Control Robert.
نیمسال اوّل افشین همّت یار دانشکده مهندسی کامپیوتر مخابرات سیّار (626-40) ظرفیت انتقال اطلاعات.
VMR-WB – Operation of the 3GPP2 Wideband Speech Coding Standard M. Jelinek†, R. Salami‡ and S. Ahmadi * †University of Sherbrooke, Canada ‡VoiceAge Corporation,
Speech Coding Workshop 2000 Jean-Marc Valin, Roch Lefebvre 1 IEEE Speech Coding Workshop Sept 17–20, 2000 Lake Lawn Resort Delavan, WI Jean-Marc Valin,
S Transmission Methods in Telecommunication Systems (5 cr)
Doc.: IEEE /1234r0 Submission November 2009 Sameer Vermani, QualcommSlide 1 Interference Cancellation for Downlink MU-MIMO Date: Authors:
ST/SEU-CO | | © Robert Bosch GmbH reserves all rights even in the event of industrial property rights. We reserve all rights of disposal such as copying.
univox Loop Systems, since 1965.
Addition 1’s to 20.
SUPRESSION OF LORAN-C NAVIGATION SIGNAL IN DIGITAL CAVE RADIOS (AN EXPERIMENTAL APPROACH) Mr. Antonio Muñoz Group of Technologies in hostile Environments.
Week 1.
Security Systems BU Communication Systems ST/SEU-CO 1 DCN SA SD Mix-M Mode Mix-Minus Mode  Prevent echo due to this feedback.  Audio-signals,
Acousteen, Herman Steeneken 1 Past, Present and Future of STI Herman J. M. Steeneken (
4/11/20151 Mobile Computing COE 446 Wireless Multiple Access Tarek Sheltami KFUPM CCSE COE Principles.
AVQ Automatic Volume and eQqualization control Interactive White Paper v1.6.
August 2004Multirate DSP (Part 2/2)1 Multirate DSP Digital Filter Banks Filter Banks and Subband Processing Applications and Advantages Perfect Reconstruction.
Speech Enhancement through Noise Reduction By Yating & Kundan.
How ClearOne Microphone Technology Improves Speech Recognition Results SpeechTEK 2007 Kurt Olsen Director of Product Marketing ClearOne.
Speech Compression. Introduction Use of multimedia in personal computers Requirement of more disk space Also telephone system requires compression Topics.
Room Acoustics: implications for speech reception and perception by hearing aid and cochlear implant users 2003 Arthur Boothroyd, Ph.D. Distinguished.
Abstract Binaural microphones were utilised to detect phonation in a human subject (figure 1). This detection was used to cut the audio waveform in two.
Analysis and Synthesis of Shouted Speech Tuomo Raitio Jouni Pohjalainen Manu Airaksinen Paavo Alku Antti Suni Martti Vainio.
© 2006 AudioCodes Ltd. All rights reserved. AudioCodes Confidential Proprietary Signal Processing Technologies in Voice over IP Eli Shoval Audiocodes.
1 TAC2000/ IP Telephony Lab Perceptual Evaluation of Speech Quality (PESQ) Speaker: Wen-Jen Lin Date: Dec
1 New Technique for Improving Speech Intelligibility for the Hearing Impaired Miriam Furst-Yust School of Electrical Engineering Tel Aviv University.
The use of FM systems with Cochlear Implants- How has research had an impact on practice? Sarah Flynn and Elizabeth Wood South of England Cochlear Implant.
1 Recent development in hearing aid technology Lena L N Wong Division of Speech & Hearing Sciences University of Hong Kong.
Department of Electrical Engineering | University of Texas at Dallas Erik Jonsson School of Engineering & Computer Science | Richardson, Texas ,
For 3-G Systems Tara Larzelere EE 497A Semester Project.
The Fully Networked Car Geneva, 4-5 March Wideband Speech Communications: the Good, the Bad, and the Ugly Scott Pennock Sr. Hands-Free Standards.
Microphone Integration – Can Improve ARS Accuracy? Tom Houy
Technical Seminar Presented by :- Debabandana Apta (EC ) National Institute of Science and Technology [1] “ECHO CANCELLATION” Presented.
Noise Canceling Headphones Team Members Doan Thanh Khiet Tran Jasmine Khadem Kalina Guentcheva.
Colombia, September 2013 The importance of models and procedures for planning, monitoring and control in the provision of communications services.
Definition and Coordination of Signal Processing Functions for telephone connections involving automotive speakerphones Scott Pennock Senior Hands-Free.
New Developments in Hearing Technology Dave Gordey, M. Sc. AUD (c)
Nico De Clercq Pieter Gijsenbergh.  Problem  Solutions  Single-channel approach  Multichannel approach  Our assignment Overview.
RockSat-C 2012 ISTR Individual Subsystem Testing Report Minnesota Sound Wreckers University of Minnesota 2/13/12 1 Alexander Richman Jacob Schultz Justine.
Hearing Amplification. Hearing loss due to Inner ear pathologies.
Timo Haapsaari Laboratory of Acoustics and Audio Signal Processing April 10, 2007 Two-Way Acoustic Window using Wave Field Synthesis.
1 Reconfigurable Acceleration of Microphone Array Algorithms for Speech Enhancement Ka Fai Cedric Yiu, Yao Lu, Xiaoxiang Shi The Hong Kong Polytechnic.
In-car Speech Recognition Using Distributed Microphones Tetsuya Shinde Kazuya Takeda Fumitada Itakura Center for Integrated Acoustic Information Research.
Laboratory for Experimental ORL K.U.Leuven, Belgium Dept. of Electrotechn. Eng. ESAT/SISTA K.U.Leuven, Belgium Combining noise reduction and binaural cue.
ARTIFICIAL INTELLIGENCE FOR SPEECH RECOGNITION. Introduction What is Speech Recognition?  also known as automatic speech recognition or computer speech.
Institut für Nachrichtengeräte und Datenverarbeitung Prof. Dr.-Ing. P. Vary On the Use of Artificial Bandwidth Extension Techniques in Wideband Speech.
Motorola presents in collaboration with CNEL Introduction  Motivation: The limitation of traditional narrowband transmission channel  Advantage: Phone.
Speech Enhancement Summer 2009
The Importance of In-Mask Communications
– Workshop on Wideband Speech Quality in Terminals and Networks
Govt. Polytechnic Dhangar(Fatehabad)
Presentation transcript:

The Fully Networked Car Geneva, 4-5 March Automotive Speech Enhancement of Today: Applications, Challenges and Solutions Tim Haulick Harman/Becker Automotive Systems

The Fully Networked Car Geneva, 4-5 March Automotive Speech Enhancement Applications Phone Speech dialog system Send side processing: beamforming, echo and noise suppression Receive side processing: gain control In-car communication: feedback suppression, automatic gain adjustment Communication channel to vehicle Communication channel from vehicle Communication channel in vehicle

The Fully Networked Car Geneva, 4-5 March Extended Speech Enhancement Mobile Phone Speech dialog system Send side processing: echo & noise suppression, wind-noise suppression speech reconstruction, beamforming/postfilter Receive side processing: gain control, bandwidth extension, adaptive equalization Speaker independent speech knowledge Speaker (in-)dependent speech knowledge

The Fully Networked Car Geneva, 4-5 March Bandwidth extension for wideband speech signals (bandwidth 7 kHz, e.g. AMR wideband codec G.722.2) – extension of high frequency components up to 11kHz. Narrowband connection: Wideband connection: Bandwidth extension for narrowband speech signals (bandwidth 3.4 …3.8 kHz) – extension of low frequency components and extension of high frequency components up to 5.5 or 8 kHz. Wideband input Wideband output Narrowband output Narrowband input Bandwidth Extension – Examples

The Fully Networked Car Geneva, 4-5 March Bandwidth Extension - Enhancements o Adaptation to different transmission channels (variable bandwidth, different signal-to-noise ratios) o Adaptation to different sound amplifiers o Adaptation depending on the background noise in the vehicle

The Fully Networked Car Geneva, 4-5 March Speech Reconstruction Motivation: At medium and high driving speed low frequency speech components are often masked by the background noise. However, standard noise suppression methods of today often fail in these situations. As a result the processed speech signal sounds thin and distorted. For further improvement of the speech quality a reconstruction approach is an alternative. However, speech recon- struction starts where conventional noise reduction fails …

The Fully Networked Car Geneva, 4-5 March Phone (downlink) Phone (uplink) Loud- speaker Micro- phone Receive side processing Speaker (in-)dependent speech knowledge Analysis filter bank Residual echo and noise suppression Mixer Echo cancellation Synthesis filter bank Speech reconstruction Speech Reconstruction – Algorithmic Overview

The Fully Networked Car Geneva, 4-5 March Microphone signal Conven- tional Recon- structed Time in seconds Frequency in Hz Time in seconds Before CDMA coding Time in seconds Frequency in Hz Time in seconds Frequency in Hz Microphone signal Conven- tional Recon- structed Conven- tional Recon- structed Conven- tional Recon- structed After CDMA coding 120 km/h 160 km/h Speech Reconstruction – Audio Examples

The Fully Networked Car Geneva, 4-5 March Speech Reconstruction – Evaluation (CDMA) o A subjective evaluation was performed by 10 trained listeners. The signals have been coded and decoded with the CDMA enhanced variable rate codec (EVRC) prior to listening. o The test set comprised 20 different test scenarios with SNRs ranging from 3 dB to 12 dB.

The Fully Networked Car Geneva, 4-5 March Beamformer/Postfilter o Beamformers perform a directional filtering: signals from the desired speaker direction are passed while signals arriving from other directions are attenuated. o The achievable noise reduction is dependent on the spatio-temporal properties of the soundfield and the number of microphones. o By extending the beamformer with a spatial postfilter a high directionality can even be achieved with a small number of microphones. Spatial postfilter Ratio computat. MAP approxi- mation Fixed beamf. Micro- phone spectra Blocking matrix Interference canceller Noise power estimation Output spectrum |…|² GSC BeamformerPostfilter

The Fully Networked Car Geneva, 4-5 March © 2008 Harman International Industries, Incorporated. All rights reserved. Page 11 Signal of the first microphone Beamformer/Postfilter – Audio Example 2-channel processing (beamformer/postfilter) Audio example: 2-channel microphone array, 120 km/h, driver and passenger are talking

The Fully Networked Car Geneva, 4-5 March By extending the adaptive beamformer with a spatial postfilter the voice recognition performance can be improved significantly without impairing the speech quality. Reference system: 2-channel adaptive beamformer (GSC) without postfilter Beamformer/Postfilter – Evaluation Results 120 km/h100 km/h window 1/4 open 120 km/h double-talk relative enhancement of WER [%] Voice Recognition Performance 120 km/h100 km/h window 1/4 open 120 km/h double-talk Log-Spectral Distance Speech Distortion Adaptive Beamformer with Postfilter Adaptive Beamformer

The Fully Networked Car Geneva, 4-5 March Wind Noise Suppression o Problem: Due to design reasons and lack of space the standard wind shield of hands-free microphones is often insufficient. For this reason the microphone signal is often impaired by wind noise caused by the fan or an open top of a convertible. o Solution: Suppression of wind noise by algorithmic means taking advantage of the statistical properties of the noise Microphone Signal 2 channel processed output signal

The Fully Networked Car Geneva, 4-5 March In-Car Communication (ICC) Current Situation: o Communication between passengers is difficult, because of the acoustic loss (especially front to back). o Front passengers have to speak louder than normal – longer conversations will be tiring. o Driver turns around – road safety is reduced. Solution: o Improve the speech quality and intelligibility by means of an intercom system. Application: o Mid and high class automobiles, which are already equipped with the necessary audio and signal processing components. o Minibuses, Vans, etc. (cars more than 2 rows of seats). Passenger compartment *Acoustic loss (referred to the ear of the driver) -5…-15dB*

The Fully Networked Car Geneva, 4-5 March Configurations One-Way System o 2-4 microphones o 2-4 loudspeakers Two-Way System o 4-8 microphones o 6-8 loudspeakers ICC System ICC System

The Fully Networked Car Geneva, 4-5 March Subjective Evaluation Driving Scenarios o 0 km/h beside motorway o 130 km/h on motorway o Prerecorded speech sentences with different Lombard levels were played back via an artificial mouth. o Binaural recordings were made by means of a HEAD acoustics NoiseBook on the seat behind the driver.

The Fully Networked Car Geneva, 4-5 March Results of the Comparison Mean Opinion Score Test 0 km/h, vehicle parked close to a motorway: o 19.7 % prefer the system to be switched off o 29.7 % have no preference o 50.6 % prefer an activated system 130 km/h, motorway: o 4.3 % prefer the system to be switched off o 7.1 % have no preference o 88.6 % prefer an activated system (25 signal pairs for each driving situation / 15 listeners per scenario):

The Fully Networked Car Geneva, 4-5 March Results of Modified Rhyme Tests (MRT) 0 km/h, vehicle parked close to a motorway: o No significant difference (95.2 % system off versus 95.0 % system on) o Due to the automatic gain adjustment the intercom system operates with only very small gain at these noise levels 130 km/h, motorway: o Significant improvement of the MRT error rate o Nearly 50 % error reduction (85.4 % correct answers increased to 92.2 % correct answers) (48 utterances were presented to each listener per driving situation):