Voice Coding in 3G Networks
S-38.130 Postgraduate Course in Telecommunications, Spring 2001
Tommi Koistinen, Nokia Networks
Contents
PART I
- Short introduction to 3GPP reference architecture models
- Media Gateway (MG)
- Multimedia Resource Functions (MRF)
PART II
- Speech compression – why?
- Tandem avoidance
- Adaptive Multirate (AMR) speech codec
- Wideband speech coding (AMR-WB)
- Demonstrations
3GPP Release 99 (R99): the first phase of 3G. Entities involved with speech processing are circled in red.
3GPP R4 splits the MSC into an MSC Server and a Media Gateway
3GPP R4…R5 IP Multimedia Subsystem (IMS)
Media Gateway
- support for several interfaces (A-interface for 2G and Iu-interface for 3G) and for several transmission protocols (ATM, IP, TDM)
- support for several codecs, including the Adaptive Multirate (AMR) codec and upcoming wideband codecs
- electrical and acoustic echo cancellation
- announcement services
- DTMF and call progress tone generation and detection
- support for fax/modem/data protocols
- support for Tandem Free Operation (TFO) and Transcoder Free Operation (TrFO)
- bad frame handling
- IP protocol handling (RTP/RTCP, encryption, QoS support)
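The DTMF generation and detection item can be illustrated with a minimal sketch. It assumes the standard DTMF row/column frequency pairs and an 8 kHz sampling rate, and it uses the Goertzel algorithm for detection. The function names, the 50 ms tone length, and the amplitude are illustrative choices, not taken from any Media Gateway implementation; a real detector would also check energy thresholds, twist, and tone duration.

```python
import math

# Standard DTMF frequency grid (Hz): one row tone plus one column tone per key.
ROW_FREQS = [697, 770, 852, 941]
COL_FREQS = [1209, 1336, 1477, 1633]
KEYPAD = ["123A", "456B", "789C", "*0#D"]


def dtmf_tone(key, fs=8000, duration_s=0.05):
    """Generate the two-tone signal for one key (illustrative amplitude 0.5 per tone)."""
    for r, row in enumerate(KEYPAD):
        c = row.find(key)
        if c >= 0:
            f_lo, f_hi = ROW_FREQS[r], COL_FREQS[c]
            n_samples = int(fs * duration_s)
            return [
                0.5 * math.sin(2 * math.pi * f_lo * i / fs)
                + 0.5 * math.sin(2 * math.pi * f_hi * i / fs)
                for i in range(n_samples)
            ]
    raise ValueError("not a DTMF key: %r" % key)


def goertzel_power(samples, freq, fs=8000):
    """Signal power at a single frequency, computed with the Goertzel recursion."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / fs)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def detect_key(samples, fs=8000):
    """Pick the strongest row and column tone; no threshold or twist checks."""
    r = max(range(4), key=lambda i: goertzel_power(samples, ROW_FREQS[i], fs))
    c = max(range(4), key=lambda i: goertzel_power(samples, COL_FREQS[i], fs))
    return KEYPAD[r][c]


if __name__ == "__main__":
    for key in "159#":
        print(key, "->", detect_key(dtmf_tone(key)))
```

The Goertzel recursion evaluates only the eight frequencies of interest, which is why it is a common choice over a full FFT for tone detection.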
Media Resource Functions Unit
- audio/video conferencing services
- speech enhancements?
Tandem Avoidance in 2G Current status: no Tandem Free Operation (TFO)
Tandem Avoidance in 2G Better speech quality with Tandem Free Operation (TFO)
Tandem Avoidance in 3G: Transcoder Free Operation (TrFO). AMR modes are negotiated by an inband procedure.
Speech Compression – Why?
- to save transmission capacity
- to save radio resources
- to save storage capacity
- more compression (40%) with voice activity detection (VAD) and discontinuous transmission (DTX)
- error robustness with bad frame handling (BFH)
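A rough worked example of the savings listed above. It assumes 64 kbps G.711 PCM as the uncompressed reference and the 12.2 kbps AMR mode as the coded rate, and it reads the slide's 40% figure as a reduction in average bit rate. Comfort noise (SID) frames sent during silence are ignored, so this is a back-of-the-envelope sketch rather than a capacity calculation.

```python
PCM_RATE_KBPS = 64.0   # G.711: 8000 samples/s * 8 bits per sample
AMR_RATE_KBPS = 12.2   # highest AMR-NB mode
DTX_SAVING = 0.40      # the slide's "more compression (40%)", assumed to apply to average rate


def compression_ratio(reference_kbps, coded_kbps):
    """How many times smaller the coded stream is than the reference."""
    return reference_kbps / coded_kbps


def average_rate_with_dtx(coded_kbps, saving=DTX_SAVING):
    """Average rate when VAD/DTX suppresses transmission during silence."""
    return coded_kbps * (1.0 - saving)


ratio = compression_ratio(PCM_RATE_KBPS, AMR_RATE_KBPS)
avg_kbps = average_rate_with_dtx(AMR_RATE_KBPS)

if __name__ == "__main__":
    print("compression ratio vs. PCM: %.2f" % ratio)
    print("average rate with DTX: %.2f kbps" % avg_kbps)
```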
Speech coding techniques
Waveform coders
- exploit the correlation between adjacent samples
- G.711, G.726 ADPCM, etc.
Analysis-by-synthesis coders
- Code Excited Linear Prediction (CELP)
- G.723, G.729, GSM EFR, GSM AMR
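The correlation between adjacent samples that waveform coders exploit can be made concrete with a small sketch. It uses a 200 Hz sine sampled at 8 kHz as a stand-in for voiced speech and predicts each sample from the previous one, which is a much simplified version of the differential idea behind ADPCM (no adaptive quantiser, no adaptive predictor). The test signal and function names are illustrative assumptions.

```python
import math


def lag1_correlation(x):
    """Normalised correlation between each sample and its predecessor."""
    num = sum(a * b for a, b in zip(x[:-1], x[1:]))
    den = sum(a * a for a in x)
    return num / den


def prediction_residual(x):
    """Residual after predicting each sample by the previous sample."""
    return [x[0]] + [x[n] - x[n - 1] for n in range(1, len(x))]


def energy(x):
    return sum(v * v for v in x)


# Illustrative test signal: 100 ms of a 200 Hz tone at 8 kHz sampling.
samples = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(800)]
residual = prediction_residual(samples)
prediction_gain_db = 10 * math.log10(energy(samples) / energy(residual))

if __name__ == "__main__":
    print("lag-1 correlation: %.3f" % lag1_correlation(samples))
    print("prediction gain: %.1f dB" % prediction_gain_db)
```

Because the residual carries far less energy than the signal itself, it can be quantised with fewer bits for the same noise level, which is where the bit-rate saving of differential coders comes from.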
The CELP model [figure: block diagram with parts labelled "vocal tract" and "glottis"]
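The source-filter idea behind the CELP model can be sketched as follows: an excitation signal (playing the role of the glottis) drives an all-pole linear prediction synthesis filter (playing the role of the vocal tract), and the encoder picks the excitation whose synthesised output best matches the input speech. The filter coefficients, the pitch periods, and the three-entry codebook below are hypothetical values chosen for illustration. A real CELP coder uses adaptive and fixed codebooks with gains and a perceptual weighting filter, all omitted here.

```python
import math


def pulse_train(n_samples, pitch_period):
    """Crude 'glottis' excitation: one unit pulse every pitch_period samples."""
    return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]


def synthesis_filter(excitation, lpc):
    """All-pole 'vocal tract' filter 1/A(z), with A(z) = 1 + sum_k lpc[k-1] * z^-k."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a_k in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a_k * out[n - k]
        out.append(acc)
    return out


def squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_codebook_entry(target, codebook, lpc):
    """Analysis-by-synthesis: synthesise every candidate, keep the closest one."""
    errors = [squared_error(target, synthesis_filter(entry, lpc)) for entry in codebook]
    return errors.index(min(errors))


# Hypothetical single resonance near 500 Hz at 8 kHz sampling (pole radius 0.9).
R, THETA = 0.9, 2 * math.pi * 500 / 8000
LPC = [-2 * R * math.cos(THETA), R * R]

# Hypothetical codebook of three excitations with different pitch periods.
CODEBOOK = [pulse_train(40, p) for p in (20, 10, 8)]

voiced = synthesis_filter(CODEBOOK[1], LPC)
chosen = best_codebook_entry(voiced, CODEBOOK, LPC)

if __name__ == "__main__":
    print("chosen codebook index:", chosen)
```

Only the codebook index and the filter parameters need to be transmitted, not the waveform, which is what makes this family of coders efficient at low bit rates.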
Adaptive Multirate (AMR) speech codec
- the only mandatory codec for 3G
- improved speech quality in both half-rate and full-rate modes by means of codec mode adaptation, i.e. varying the balance between speech and channel coding for the same gross bit rate
- ability to trade speech quality against capacity smoothly and flexibly by combining channel and codec mode adaptation; the network operator can control this on a cell-by-cell basis
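Codec mode adaptation can be sketched as a split of a fixed gross bit rate between speech coding and channel coding. The sketch below assumes the eight AMR narrowband source rates and a GSM full-rate traffic channel with a gross rate of 22.8 kbps (a figure not stated on the slides). The C/I thresholds in the mode selector are hypothetical values for illustration only; actual thresholds are set by the network.

```python
# AMR narrowband source rates in kbps.
AMR_NB_MODES_KBPS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]

# Assumed gross rate of a GSM full-rate traffic channel, in kbps.
GROSS_FULL_RATE_KBPS = 22.8

# Hypothetical (minimum C/I in dB, mode in kbps) pairs, best channel first.
HYPOTHETICAL_THRESHOLDS = [(13.0, 12.2), (10.0, 10.2), (7.0, 7.40), (4.0, 5.90)]
FALLBACK_MODE_KBPS = 4.75


def channel_coding_rate(mode_kbps, gross_kbps=GROSS_FULL_RATE_KBPS):
    """Bit rate left for channel coding once the speech codec has taken its share."""
    return gross_kbps - mode_kbps


def select_mode(c_over_i_db):
    """Pick the highest source rate whose (hypothetical) C/I threshold is met."""
    for min_ci_db, mode_kbps in HYPOTHETICAL_THRESHOLDS:
        if c_over_i_db >= min_ci_db:
            return mode_kbps
    return FALLBACK_MODE_KBPS


if __name__ == "__main__":
    for ci in (26, 12, 8, 4, 1):
        mode = select_mode(ci)
        print("C/I %2d dB -> %5.2f kbps speech, %5.2f kbps channel coding"
              % (ci, mode, channel_coding_rate(mode)))
```

As the channel degrades, the selector moves to a lower source rate, so more of the same gross rate goes to error protection.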
AMR source rates
Structure of AMR encoder
Encoder output
Structure of AMR decoder
Demonstration I: Full Rate vs. AMR-NB
Erroneous channel (C/I = 26…4 dB):
1. sample: FR 13 kbps
2. sample: AMR-NB 5.9-12.2 kbps
Wideband speech coding
- Narrowband: 300 – 3400 Hz
- Wideband: 50 – 7000 Hz
- Wideband AMR speech codec (3GPP R5)
AMR-WB source rates
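The rate table for this slide did not survive extraction. As a sketch, the snippet below lists the nine AMR-WB source rates as commonly given for the codec (only 6.6, 14.25 and 23.85 kbps appear elsewhere in these slides, so treat the rest as an assumption to check against the specification) and converts each to speech bits per frame, assuming 20 ms frames.

```python
# AMR-WB source rates in kbps (assumed list; verify against the codec specification).
AMR_WB_MODES_KBPS = [6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85]

FRAME_MS = 20  # assumed speech frame length in milliseconds


def bits_per_frame(rate_kbps, frame_ms=FRAME_MS):
    """Speech bits carried in one frame: kbps * ms gives bits."""
    return round(rate_kbps * frame_ms)


BITS_PER_FRAME = {rate: bits_per_frame(rate) for rate in AMR_WB_MODES_KBPS}

if __name__ == "__main__":
    for rate, bits in BITS_PER_FRAME.items():
        print("%5.2f kbps -> %3d bits per %d ms frame" % (rate, bits, FRAME_MS))
```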
EFR vs. AMR-NB vs. AMR-WB (in 16 kbps full rate traffic channel) [figure: subjective speech quality (Unacceptable, Poor, Good, Very good, Excellent) versus carrier-to-interference ratio (error-free, 13, 10, 7, 4 dB), with curves for EFR and AMR-NB]
Demonstration II: AMR-NB vs. AMR-WB
Clean speech (highest modes):
1. sample: AMR-NB 12.2 kbps
2. sample: AMR-WB 23.85 kbps
Demonstration III: GSM EFR vs. AMR-WB
Erroneous channel:
1. sample: GSM EFR 12.2 kbps
2. sample: AMR-WB 6.6-14.25 kbps
Demonstration IV: AMR-NB vs. AMR-WB
Music (highest modes):
1. sample: AMR-NB 12.2 kbps
2. sample: AMR-WB 23.85 kbps