17.0 Distributed Speech Recognition and Wireless Environment


References:
1. “Quantization of Cepstral Parameters for Speech Recognition over the World Wide Web,” IEEE Journal on Selected Areas in Communications, Jan
2. “Exploiting Temporal Correlation of Speech for Error Robust and Bandwidth Flexible Distributed Speech Recognition,” IEEE Trans. on Audio, Speech and Language Processing, May
3. “Robust speech recognition over mobile and IP networks in burst-like packet loss,” IEEE Trans. on Audio, Speech and Language Processing, Jan
4. “An Integrated Solution for Error Concealment in DSR Systems over Wireless Channels,” Interspeech
5. “A Unified Probabilistic Approach to Error Concealment for Distributed Speech Recognition,” Interspeech 2005

Distributed Speech Recognition (DSR) and Wireless Environment
An Example Partition of Speech Recognition Processes into Client/Server
– compressed and encoded feature parameters transmitted in packets
Client/Server Structure
[Figure: clients connected to a server through a wireless network.]
[Figure: block diagram of the recognition processes with Client and Server labels marking the partition. Blocks: Input Speech, Front-end Signal Processing, Feature Vectors, Linguistic Decoding and Search Algorithm, Output Sentence, Acoustic Models, Lexicon, Language Model, Speech Corpora, Acoustic Model Training, Text Corpora, Language Model Construction, Lexical Knowledge-base, Grammar.]

Possible Models for Distributed Speech Recognition (DSR) under Wireless Environment
Problems with Wireless Networks
– limited/dynamic bandwidth, low/time-varying bit rates
– higher/time-varying error rates, random/bursty errors
Client-Only Model
[Figure: hand-held device (client): Speech Signal, Feature Extraction, Recognition; recognition results sent over the Wireless Network to Server: Applications.]
– speech recognition independent of the wireless environment
– limited by the computation requirements on the hand-held device
Server-Only Model
[Figure: hand-held device (client): Speech Signal, Speech Encoder (for voice communications); Wireless Network; server: either (A) Speech Decoder, Recovered speech, Feature Extraction, Recognition, or (B) Feature Extraction, Recognition; then Applications.]
– compatible with existing voice communications
– seriously degraded recognition accuracy (A)
– need to find recognition-efficient feature parameters out of perceptually efficient feature parameters (B)
Client-Server Model
[Figure: hand-held device (client): Speech Signal, Feature Extraction, Feature Compression; compressed feature parameters sent over the Wireless Network; server: Feature Recovery, Recognition, Applications.]
– proper division of computation requirements between client and server
– bandwidth saving
– not compatible with existing wireless voice communications
– original speech can’t be recovered from MFCC

Client-Server Model
Split Vector Quantization for Feature Parameters (as an example)
[Figure: the feature parameters C1, C2, ..., C12 and E shown twice: grouped into sub-vectors for V.Q., and quantized individually by Scalar Quantization (with optimized bit allocation).]
– delta parameters evaluated at server
– bit rate minimized
– computation requirements minimized
– recognition accuracy degradation minimized with acoustic models trained on quantized parameters (matched condition)
Error Protection
– different error correction/detection schemes applied to V.Q. bit patterns for different parameters, or to different bits for scalar quantization
– important bits well protected while the extra bit rate for error protection is minimized
– correct identification of errors is very helpful but comes at higher cost
– compatibility with existing wireless networking platforms needed
Error Concealment Examples
– extrapolation
– interpolation also possible
– performed only on those sub-vectors with errors (if identifiable)
x_t: feature vector at time index t
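The split V.Q. and error concealment ideas on this slide can be illustrated with a small sketch. It is not the scheme from the lecture or from any standard: the slide's concealment formulas did not survive in the transcript, so repetition of the nearest good frame stands in for "extrapolation" and linear interpolation between the neighbouring good frames stands in for "interpolation". The function names, the toy 4-dimensional frames, the two-way split, and the tiny codebooks are all hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Split = Tuple[int, int]  # (start, end) slice of one sub-vector inside a frame


def nearest_index(codebook: Sequence[Vector], v: Sequence[float]) -> int:
    """Index of the codeword closest to v in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], v)),
    )


def split_vq_encode(frame: Vector, splits: Sequence[Split],
                    codebooks: Sequence[Sequence[Vector]]) -> List[int]:
    """Client side: quantize each sub-vector with its own codebook."""
    return [nearest_index(cb, frame[lo:hi]) for (lo, hi), cb in zip(splits, codebooks)]


def split_vq_decode(indices: Sequence[int],
                    codebooks: Sequence[Sequence[Vector]]) -> Vector:
    """Server side: look up each codeword and concatenate the sub-vectors."""
    frame: Vector = []
    for idx, cb in zip(indices, codebooks):
        frame.extend(cb[idx])
    return frame


def conceal(frames: Sequence[Vector], bad: Sequence[Sequence[bool]],
            splits: Sequence[Split]) -> List[Vector]:
    """Replace only the sub-vectors flagged as erroneous.

    bad[t][k] is True when sub-vector k of frame t was identified as corrupted.
    Assumed stand-ins: linear interpolation when good frames exist on both
    sides, otherwise repetition of the nearest good frame.
    """
    out = [list(f) for f in frames]
    for k, (lo, hi) in enumerate(splits):
        good = [t for t in range(len(frames)) if not bad[t][k]]
        for t in range(len(frames)):
            if not bad[t][k]:
                continue
            prev = max((g for g in good if g < t), default=None)
            nxt = min((g for g in good if g > t), default=None)
            if prev is not None and nxt is not None:
                w = (t - prev) / (nxt - prev)
                out[t][lo:hi] = [(1 - w) * a + w * b
                                 for a, b in zip(frames[prev][lo:hi], frames[nxt][lo:hi])]
            elif prev is not None:
                out[t][lo:hi] = frames[prev][lo:hi]
            elif nxt is not None:
                out[t][lo:hi] = frames[nxt][lo:hi]
            # If no good copy of this sub-vector exists, it is left unchanged.
    return out


if __name__ == "__main__":
    splits = [(0, 2), (2, 4)]
    codebooks = [
        [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
        [[0.0, 1.0], [1.0, 0.0]],
    ]
    indices = split_vq_encode([0.9, 1.2, 0.8, 0.1], splits, codebooks)
    print(indices, split_vq_decode(indices, codebooks))
```

Only the index list crosses the network, which is where the bit-rate saving comes from, and because each sub-vector has its own index, an error confined to one sub-vector can be concealed without touching the others.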