17.0 Distributed Speech Recognition and Wireless Environment


1 17.0 Distributed Speech Recognition and Wireless Environment

References:
1. “Quantization of Cepstral Parameters for Speech Recognition over the World Wide Web,” IEEE Journal on Selected Areas in Communications, Jan. 1999
2. “Exploiting Temporal Correlation of Speech for Error Robust and Bandwidth Flexible Distributed Speech Recognition,” IEEE Trans. on Audio, Speech and Language Processing, May 2007
3. “Robust speech recognition over mobile and IP networks in burst-like packet loss,” IEEE Trans. on Audio, Speech and Language Processing, Jan. 2006
4. “An Integrated Solution for Error Concealment in DSR Systems over Wireless Channels,” Interspeech 2006
5. “A Unified Probabilistic Approach to Error Concealment for Distributed Speech Recognition,” Interspeech 2005

2 Distributed Speech Recognition (DSR) and Wireless Environment

An Example Partition of Speech Recognition Processes into Client/Server
–compressed and encoded feature parameters transmitted in packets

Client/Server Structure
[Figure: block diagram of a speech recognizer partitioned between clients and a server connected by a wireless network. Recognition blocks: Input Speech, Front-end Signal Processing, Feature Vectors, Linguistic Decoding and Search Algorithm, Output Sentence, Acoustic Models, Lexicon, Language Model, Grammar. Training blocks: Speech Corpora, Acoustic Model Training, Text Corpora, Language Model Construction, Lexical Knowledge-base.]
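The idea of sending encoded feature parameters in packets can be sketched as below. This is only an illustration under stated assumptions: the packet layout (a 2-byte sequence number, a 1-byte frame count, then one byte per code) and the names `pack_frames` and `unpack_frames` are hypothetical and do not come from the slides. Each frame is assumed to have already been reduced to a short list of small integer codes.

```python
import struct

# Hypothetical header (an assumption, not from the slides):
# 2-byte sequence number + 1-byte frame count, network byte order.
HEADER = struct.Struct("!HB")


def pack_frames(seq, frames):
    """Client side: serialize frames, each a list of integer codes in 0..255."""
    width = len(frames[0])
    if any(len(f) != width for f in frames):
        raise ValueError("all frames must carry the same number of codes")
    payload = bytes(code for frame in frames for code in frame)
    return HEADER.pack(seq, len(frames)) + payload


def unpack_frames(packet):
    """Server side: recover the sequence number and the per-frame codes."""
    seq, count = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:]
    width = len(payload) // count if count else 0
    frames = [list(payload[k * width:(k + 1) * width]) for k in range(count)]
    return seq, frames
```

A sequence number of this kind would let the server notice missing packets, which is one way the bursty losses discussed on the next slide could be detected before any concealment is attempted.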

3 Possible Models for Distributed Speech Recognition (DSR) under Wireless Environment

Problems with Wireless Networks
–limited/dynamic bandwidth, low/time-varying bit rates
–higher/time-varying error rates, random/bursty errors

Client-Only Model
[Figure: Speech Signal → Feature Extraction → Recognition (hand-held device, client) → recognition results → Wireless Network → Server: Applications]
–speech recognition independent of the wireless environment
–limited by the computation requirements of the hand-held device

Server-Only Model
[Figure: Speech Signal → Speech Encoder for voice communications (hand-held device, client) → Wireless Network → Server, with two paths: (A) Speech Decoder → Recovered speech → Feature Extraction → Recognition; (B) Feature Extraction → Recognition; then Applications]
–compatible with existing voice communications
–seriously degraded recognition accuracy (A)
–need to find recognition-efficient feature parameters out of perceptually efficient feature parameters (B)

Client-Server Model
[Figure: Speech Signal → Feature Extraction → Feature Compression (hand-held device, client) → compressed feature parameters → Wireless Network → Server: Feature Recovery → Recognition → Applications]
–proper division of computation requirements between client and server
–bandwidth saving
–not compatible with existing wireless voice communications
–original speech can’t be recovered from MFCC

4 Client-Server Model

Split Vector Quantization for Feature Parameters (as an example)
[Figure: the feature vector components C1, C2, ..., C12, E are grouped into sub-vectors. Labels: Sub-vectors, V.Q., Scalar Quantization (with optimized bit allocation)]
–delta parameters evaluated at server
–bit rate minimized
–computation requirements minimized
–recognition accuracy degradation minimized with acoustic models trained on quantized parameters (matched condition)

Error Protection
–different error correction/detection schemes applied to V.Q. bit patterns for different parameters, or to different bits for scalar quantization
–important bits well protected while the extra bit rate for error protection is minimized
–correct identification of errors very helpful but at higher cost
–compatibility with the existing wireless networking platform needed

Error Concealment Examples
–extrapolation
–interpolation also possible
–performed only on those sub-vectors with errors (if identifiable)
x_t: feature vector at time index t
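The split vector quantization and error concealment steps on this slide can be sketched as follows. This is a minimal illustration, not the method of the cited papers. The slide does not give the sub-vector grouping, the codebooks, or the concealment formulas, so everything specific here is an assumption: consecutive components are grouped in pairs, codebooks are plain lists of codewords, extrapolation is taken to mean repeating the nearest error-free sub-vector, and interpolation is linear between the nearest error-free frames on either side. All function names are hypothetical.

```python
def split(vec, size=2):
    """Split a feature vector into consecutive sub-vectors (assumed grouping)."""
    return [vec[i:i + size] for i in range(0, len(vec), size)]


def nearest(codebook, sub):
    """Index of the codeword closest to `sub` in squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda k: sum((c - s) ** 2 for c, s in zip(codebook[k], sub)))


def encode(vec, codebooks, size=2):
    """Client side: one codebook index per sub-vector."""
    return [nearest(cb, sub) for cb, sub in zip(codebooks, split(vec, size))]


def decode(indices, codebooks):
    """Server side: concatenate the selected codewords."""
    out = []
    for cb, k in zip(codebooks, indices):
        out.extend(cb[k])
    return out


def conceal(frames, bad, size=2):
    """Conceal erroneous sub-vectors in a sequence of decoded frames.

    `bad` is a set of (t, j) pairs: sub-vector j of frame t is known to be
    in error. Only those sub-vectors are touched.
    """
    dim = len(frames[0])
    spans = [(i, min(i + size, dim)) for i in range(0, dim, size)]
    out = [list(f) for f in frames]
    for j, (lo, hi) in enumerate(spans):
        good = [t for t in range(len(frames)) if (t, j) not in bad]
        for t in range(len(frames)):
            if (t, j) not in bad:
                continue
            prev = max((g for g in good if g < t), default=None)
            nxt = min((g for g in good if g > t), default=None)
            if prev is not None and nxt is not None:
                # Interpolation between the nearest error-free frames.
                w = (t - prev) / (nxt - prev)
                out[t][lo:hi] = [(1 - w) * a + w * b
                                 for a, b in zip(frames[prev][lo:hi],
                                                 frames[nxt][lo:hi])]
            elif prev is not None:
                # Extrapolation (assumed form): repeat the last good sub-vector.
                out[t][lo:hi] = frames[prev][lo:hi]
            elif nxt is not None:
                out[t][lo:hi] = frames[nxt][lo:hi]
    return out
```

With a 13-component vector (C1 to C12 plus E) and pairs, this grouping yields seven sub-vectors, the last holding a single component. Because concealment works per sub-vector, an error in one sub-vector leaves the correctly received components of the same frame untouched.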

