The Fully Networked Car, Geneva, 4-5 March 2009. Wideband Speech Communications: the Good, the Bad, and the Ugly. Scott Pennock, Sr. Hands-Free Standards Specialist.

Presentation transcript:

The Fully Networked Car, Geneva, 4-5 March 2009
Wideband Speech Communications: the Good, the Bad, and the Ugly
Scott Pennock, Sr. Hands-Free Standards Specialist, QNX Software Systems (Wavemakers)

Outline
o Introduction
o The Good
o The Bad
o The Ugly
o Conclusions

Introduction
o What is wideband (WB) speech?
    - Speech has energy from around Hz
    - Traditional narrowband (NB) terminals and networks bandlimit speech down to around Hz
    - WB speech in this presentation refers to a bandwidth of Hz
o Why is WB speech important to automotive?
    - More robust to vehicle noise
    - Reduces driver distraction
    - Helps enable spatial auditory displays
o This presentation will review the benefits, challenges, and unresolved issues with WB speech in an automotive environment

The “Good”
o Improves task performance
    - Better speech comprehension
    - Reduced driver distraction
    - Improved talker identification
    - Better speech localization
    - Other potential task improvements
o Preferred by users
    - Higher quality
    - Less listening-effort
    - More comfortable loudness-level
    - Other factors influencing preference
o Task performance benefits alone make a compelling argument for deploying WB speech in the vehicle

WB speech provides extra frequency and temporal information
Figure: this “difference spectrogram” was calculated by subtracting the NB spectrogram from the WB spectrogram of someone saying “the juice of lemons makes fine punch”.
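
The subtraction described in the caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used to produce the slide: the band edges (300-3400 Hz for NB, 50-7000 Hz for WB) are commonly cited telephony values rather than figures taken from the slides, the 16 kHz sampling rate and 256-sample frame are arbitrary, band-limiting is approximated by zeroing DFT bins, magnitudes are linear rather than in dB, and two tones stand in for recorded speech.

```python
import cmath
import math

FS = 16000                  # sampling rate in Hz (assumed)
FRAME = 256                 # frame length in samples (assumed)
NB_BAND = (300.0, 3400.0)   # assumed narrowband passband in Hz
WB_BAND = (50.0, 7000.0)    # assumed wideband passband in Hz


def dft_magnitudes(frame):
    """Return one-sided DFT magnitudes of a real frame (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def band_limit(mags, band, frame_len):
    """Zero every bin whose centre frequency lies outside the band."""
    lo, hi = band
    return [m if lo <= k * FS / frame_len <= hi else 0.0 for k, m in enumerate(mags)]


def difference_spectrogram(signal):
    """Return WB minus NB magnitude spectrograms, one row per frame."""
    rows = []
    for start in range(0, len(signal) - FRAME + 1, FRAME):
        mags = dft_magnitudes(signal[start:start + FRAME])
        wb = band_limit(mags, WB_BAND, FRAME)
        nb = band_limit(mags, NB_BAND, FRAME)
        rows.append([w - n for w, n in zip(wb, nb)])
    return rows


# Synthetic stand-in for speech: one tone inside both bands, one only in WB.
signal = [
    math.sin(2 * math.pi * 1000 * t / FS) + math.sin(2 * math.pi * 5000 * t / FS)
    for t in range(2 * FRAME)
]
diff = difference_spectrogram(signal)
bin_1k = round(1000 * FRAME / FS)   # bin index of the 1 kHz tone
bin_5k = round(5000 * FRAME / FS)   # bin index of the 5 kHz tone
print(diff[0][bin_1k], diff[0][bin_5k])
```

The 1 kHz tone lies in both bands and cancels in the difference, while the 5 kHz tone survives because only the WB band contains it. That surviving energy is the kind of extra information the figure highlights.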

WB speech increases intelligibility and is more robust to vehicle noise
Figure: probability of correct response by bandwidth and SNR.

WB speech improves speech comprehension and reduces driver distraction
Figure: this figure illustrates auditory streaming of speech. Shapes represent phonetic units that have been recognized. Dotted lines show information that would be missing without wideband speech.

The “Bad”
o Users are more sensitive to WB echo and noise due to perceptual effects
    - Ear is most sensitive to high frequency region of WB speech
    - Loudness of echo and noise in new frequency regions will add to loudness in narrowband region
    - High frequency echo is not masked as effectively by one’s own voice
o Acoustic Echo Cancellers (AEC) have a more difficult time removing high frequency echo
    - Poor excitation signal makes it harder to drive echo canceller to convergence
    - High frequency distortion is falsely classified as driver’s speech and can prevent AEC from training
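
The point about poor excitation can be illustrated with a generic adaptive filter. The sketch below uses normalized LMS (NLMS), a common textbook choice, and is not a description of any particular AEC product. The 4-tap echo path, step size, and test signals are all made up for the illustration. It compares a spectrally rich excitation (white noise) with a poor one (a single tone) and reports how far the estimated echo path ends up from the true one.

```python
import math
import random


def nlms_echo_canceller(far_end, mic, taps=4, mu=0.5, eps=1e-8):
    """Adapt an FIR echo-path estimate with NLMS; return (weights, residual)."""
    weights = [0.0] * taps
    history = [0.0] * taps          # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate        # what remains after echo removal
        power = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / power for w, h in zip(weights, history)]
        residual.append(error)
    return weights, residual


def simulate_echo(far_end, path):
    """Convolve the far-end signal with a (hypothetical) echo path."""
    return [
        sum(path[k] * far_end[n - k] for k in range(len(path)) if n - k >= 0)
        for n in range(len(far_end))
    ]


def misalignment(weights, path):
    """Euclidean distance between the estimate and the true echo path."""
    return math.sqrt(sum((w - p) ** 2 for w, p in zip(weights, path)))


ECHO_PATH = [0.5, -0.3, 0.2, 0.1]   # made-up 4-tap echo path
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(2000)]            # rich excitation
tone = [math.sin(2 * math.pi * 0.05 * n) for n in range(2000)]   # poor excitation

w_noise, _ = nlms_echo_canceller(noise, simulate_echo(noise, ECHO_PATH))
w_tone, _ = nlms_echo_canceller(tone, simulate_echo(tone, ECHO_PATH))
print(misalignment(w_noise, ECHO_PATH), misalignment(w_tone, ECHO_PATH))
```

With noise the estimate converges to the true path. With the tone the filter only learns the path's response at one frequency, so the estimate stays far from the true path even though the residual for that tone becomes small. A frequency region with little far-end energy behaves in a similar way: the canceller has little to learn from there.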

The challenges presented by WB speech can be addressed
o Good electro-acoustic design of vehicle platforms
    - Careful acoustic design of vehicle cabin
    - Proper selection, placement, orientation, and mounting of microphones and loudspeakers
    - High quality signal transport (e.g., optical, differential)
o High performance speech enhancement algorithms
    - AEC
    - Noise Reduction (NR)
    - Low-complexity compression for devices with limited resources

The “Ugly”
o Interoperability issues
    - WB terminal users will experience inconsistent loudness and quality
    - NB terminal users will become less satisfied with quality because of exposure to WB speech
o Long transition period

Users of WB terminals will experience inconsistent loudness and quality
o Solution for inconsistent loudness is to use Receive Automatic Gain Control (AGC) based on perceived loudness instead of RMS or peak levels
o Differences in quality can be reduced by using BandWidth Extension (BWE) and High Frequency Encoding (HFE) techniques
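
Why an RMS-based AGC can leave WB and NB calls at different perceived levels can be shown with a toy model. The sketch below is an assumption-laden illustration, not a standardized loudness model: it treats loudness as the sum of per-band energies raised to a compressive exponent of 0.3 (a Stevens-style power law), uses only two hypothetical bands, and takes made-up band energies as input. Standardized loudness models are considerably more detailed.

```python
import math

LOUDNESS_EXPONENT = 0.3   # assumed compressive exponent (Stevens-style power law)


def rms_gain(band_energies, target_energy):
    """Gain that brings the total energy to the target, ignoring spectral shape."""
    return math.sqrt(target_energy / sum(band_energies))


def loudness(band_energies):
    """Toy loudness: sum of compressed per-band energies."""
    return sum(e ** LOUDNESS_EXPONENT for e in band_energies)


def loudness_gain(band_energies, target_loudness):
    """Gain g such that loudness of the g-scaled signal equals the target.

    Scaling amplitude by g scales every band energy by g**2, so
    loudness scales by g**(2 * LOUDNESS_EXPONENT).
    """
    ratio = target_loudness / loudness(band_energies)
    return ratio ** (1.0 / (2.0 * LOUDNESS_EXPONENT))


nb_call = [1.0, 0.0]   # all energy in the low band (NB far end)
wb_call = [0.5, 0.5]   # same total energy spread over two bands (WB far end)

print(rms_gain(nb_call, 1.0), rms_gain(wb_call, 1.0))
print(loudness_gain(nb_call, 1.0), loudness_gain(wb_call, 1.0))
```

Both calls have the same total energy, so the RMS-based gain is identical for them. Under the toy loudness model the WB call is louder at equal energy, so the loudness-based gain for it is lower. This is the inconsistency that the slide suggests a loudness-based Receive AGC would remove.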

There will be a long transition period
o Deployment has already started
o Not clear when WB speech will take off, but automotive is already well positioned
    - Vehicle audio systems are currently wideband capable
    - WB microphones are available and easy to drop in
    - Several WB speech coders are already standardized
o Even after WB speech takes hold, hybrid WB/NB connections will be around for a long time
    - NB network equipment and terminals are built to last
    - Continued use in certain areas

Conclusions
o WB speech improves task performance
o Users prefer WB speech
o WB speech is important to automotive
    - More robust to vehicle noise
    - Reduces driver distraction
    - Helps enable spatial auditory displays
o WB speech will be a key differentiator for automotive OEMs and service providers

Conclusions (continued)
o Successful automotive deployment depends on:
    - Attention to the design of vehicle platforms
    - High performance speech enhancement algorithms (e.g., AEC, NR, etc.)
o Interoperability issues will eventually get worked out
o NB network equipment/terminals will be in use in certain areas for a long time