CELP / FS-1016 – 4.8kbps Federal Standard in Voice Coding


1 CELP / FS-1016 – 4.8kbps Federal Standard in Voice Coding
Mehmet Umut Demircin

2 History Code-excited linear prediction (CELP) was first introduced by B. S. Atal and M. R. Schroeder at the 1984 ICC. In 1988 the DoD selected the CELP algorithm developed by AT&T Bell Laboratories as the basis for the Federal Standard 4.8 kbps voice coder (FS-1016). It produced low-rate coded speech comparable to that of medium-rate waveform coders.

3 What is new in CELP?
Analysis-by-Synthesis Linear Prediction: the excitation sequence is selected from a codebook by closed-loop optimization, using adaptive and stochastic codebooks.
Long-term Linear Prediction: the pitch (fine) structure of the speech is predicted. Closed-loop long-term prediction was a milestone that improved quality significantly.
Perceptual Weighting (Filtering): shapes the error so that quantization noise is masked by the high-energy formants. This exploits the human auditory system.
A Hybrid Coder: CELP combines the properties of model-based coders, by representing the formant and pitch structure of the speech, with those of waveform coders, by matching the input speech waveform.
Other Standards: VSELP (vector sum excited linear prediction), whose 8 kbps version was adopted for the North American Digital Cellular system, and LD-CELP (low-delay CELP), used in the 16 kbps G.728 coder.

4 Generic CELP Coder Block Diagram
Perceptual filtering. The innovation sequence is used as the excitation sequence. Open-loop prediction searches the autocorrelation for the pitch lag T (20 ≤ T ≤ 147). Closed-loop search is expensive, so the lag is generally estimated open-loop first and then refined closed-loop. Equations on the slide (not preserved in this transcript): short-term linear prediction, long-term prediction, open-loop search, closed-loop search, choosing the excitation sequence.
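The open-loop stage described above can be sketched as a search for the lag that maximizes a normalized autocorrelation. This is a minimal illustration only: the function name `open_loop_pitch` and the particular normalization are assumptions, not taken from the slides or the standard, and the closed-loop refinement is omitted.

```python
import math

MIN_LAG, MAX_LAG = 20, 147  # lag range T quoted on the slide


def open_loop_pitch(signal, min_lag=MIN_LAG, max_lag=MAX_LAG):
    """Return the lag T in [min_lag, max_lag] with the highest
    normalized correlation between the signal and its delayed copy.

    Hypothetical helper for illustration; not the FS-1016 procedure.
    """
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        current = signal[lag:]                   # s(n)
        delayed = signal[:len(signal) - lag]     # s(n - T)
        corr = sum(c * d for c, d in zip(current, delayed))
        norm = math.sqrt(sum(c * c for c in current) *
                         sum(d * d for d in delayed))
        if norm == 0.0:
            continue  # silent segment, no usable correlation
        score = corr / norm
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


if __name__ == "__main__":
    # Synthetic periodic test signal with a period of 80 samples.
    test = [math.sin(2 * math.pi * n / 80) +
            0.5 * math.sin(4 * math.pi * n / 80) for n in range(320)]
    print(open_loop_pitch(test))
```

A closed-loop refinement would then re-evaluate only a few lags around this estimate through the synthesis filter, which is why the two-stage approach is cheaper than a full closed-loop search.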

5 Linear Prediction (LP)
Computed for 30 ms frames; captures the formant structure. A 10th-order autocorrelation LPC analysis is performed. LP parameters are represented as Line Spectrum Pairs (LSP). The LSP frequencies are quantized with 4 bits for each of f2 – f5 and 3 bits for each of the others (34 bits in total), using empirically determined probability density functions. Filter transitions are smoothed by linearly interpolating a new set of LSP frequencies every ¼ frame. No pre-emphasis is used in FS-1016. In LPC-10, by contrast, pre-emphasis (a first-order high-pass FIR filter) is used to reduce the effects of fixed-point arithmetic. LSPs code the frequencies of the zeros of modified polynomials generated from A(z). They are less sensitive to quantization errors and can be ordered by significance, so more bits can be spent on the more significant ones.
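As a rough sketch of the autocorrelation method mentioned above, the following computes 10th-order LP coefficients for one 30 ms frame with the Levinson-Durbin recursion, and checks the 34-bit LSP budget. The function names, the sign convention A(z) = 1 + Σ a[i] z^-i, and the absence of windowing are assumptions made for illustration; the conversion from LP coefficients to LSP frequencies is not shown.

```python
import random

SAMPLE_RATE = 8000                        # Hz, as stated for the codebook
FRAME_LENGTH = SAMPLE_RATE * 30 // 1000   # 30 ms frame -> 240 samples
LPC_ORDER = 10
# 3 bits for f1, 4 bits each for f2..f5, 3 bits each for f6..f10
LSP_BITS = [3, 4, 4, 4, 4, 3, 3, 3, 3, 3]


def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    return [sum(frame[n] * frame[n - k] for n in range(k, len(frame)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Levinson-Durbin recursion.

    Returns (a, err) where a[0] == 1 and A(z) = 1 + sum(a[i] * z**-i),
    and err is the final prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                    # reflection coefficient
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a, err


if __name__ == "__main__":
    rng = random.Random(0)
    frame = [rng.gauss(0.0, 1.0) for _ in range(FRAME_LENGTH)]
    coeffs, error = levinson_durbin(autocorrelation(frame, LPC_ORDER), LPC_ORDER)
    print(len(coeffs) - 1, "coefficients;", sum(LSP_BITS), "LSP bits per frame")
```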

6 Code Books The codebook originally consisted of Gaussian sequences: 1024 unit-variance vectors of 40 samples each (5 ms at 8 kHz sampling). Schroeder and Atal claim that the prediction errors after short-term and long-term prediction have a Gaussian pdf. The codebook is searched exhaustively.
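A simplified sketch of an exhaustive search over such a Gaussian codebook follows. It is an assumption-laden illustration: the synthesis and perceptual weighting filters are left out, so each code vector is matched directly against a target vector with its optimal gain, and the names `make_gaussian_codebook` and `exhaustive_search` are hypothetical.

```python
import random


def make_gaussian_codebook(size=1024, dim=40, seed=0):
    """Build `size` vectors of `dim` unit-variance Gaussian samples."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(size)]


def exhaustive_search(target, codebook):
    """Return (index, gain) minimizing ||target - gain * codebook[index]||^2.

    For a fixed vector c the best gain is <t, c> / <c, c>, and the
    remaining error is <t, t> - <t, c>^2 / <c, c>, so it is enough to
    maximize <t, c>^2 / <c, c> over all entries.
    """
    best_index, best_gain, best_match = 0, 0.0, -1.0
    for index, vec in enumerate(codebook):
        corr = sum(t * c for t, c in zip(target, vec))
        energy = sum(c * c for c in vec)
        if energy == 0.0:
            continue
        match = corr * corr / energy
        if match > best_match:
            best_index, best_gain, best_match = index, corr / energy, match
    return best_index, best_gain


if __name__ == "__main__":
    book = make_gaussian_codebook()
    target = [2.5 * c for c in book[321]]  # scaled copy of one entry
    print(exhaustive_search(target, book))
```

Every one of the 1024 entries is visited for every 5 ms block, which is the cost that later structured codebooks try to reduce.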

7 FS-1016 Standard Block Diagram

8 FS-1016 Codebook Contains two parts: the Adaptive Codebook and the Stochastic Codebook.
The adaptive codebook is also called the long-term predictor. It extracts the pitch information.

9 Adaptive Codebook (ACB)
It is a delayed version of the previous excitation samples multiplied by a gain, f. The delay m is in the range 20 ≤ m ≤ 147, so 7 bits are needed (pitch frequency 400 Hz > f0 > 54 Hz at 8 kHz sampling). Some coders use fractional m for improved resolution at high pitch frequencies; this requires interpolation.
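The delay range and the delayed-excitation idea can be sketched as follows, for integer delays only. The helper names are hypothetical, and the periodic extension used when the delay is shorter than the 60-sample vector is a common convention assumed here, not something stated on the slide.

```python
SAMPLE_RATE = 8000          # Hz
MIN_LAG, MAX_LAG = 20, 147  # delay range m from the slide


def lag_to_index(m):
    """Map a delay m to its 7-bit index (0..127)."""
    return m - MIN_LAG


def pitch_hz(m):
    """Pitch frequency corresponding to a delay of m samples."""
    return SAMPLE_RATE / m


def adaptive_codevector(past_excitation, m, gain, length=60):
    """Return gain * e(n - m) for n = 0..length-1.

    When m < length, the delayed samples that would fall inside the
    current vector are filled by repeating the last m past samples
    (assumed convention).
    """
    if not MIN_LAG <= m <= MAX_LAG:
        raise ValueError("delay out of range")
    if len(past_excitation) < m:
        raise ValueError("not enough past excitation")
    segment = []
    for n in range(length):
        if n < m:
            segment.append(past_excitation[len(past_excitation) - m + n])
        else:
            segment.append(segment[n - m])  # periodic extension
    return [gain * s for s in segment]


if __name__ == "__main__":
    history = [float(v) for v in range(1, 148)]
    print(adaptive_codevector(history, 20, 0.5)[:3])
    print(MAX_LAG - MIN_LAG + 1, "delays ->", pitch_hz(MIN_LAG), "Hz down to",
          round(pitch_hz(MAX_LAG), 1), "Hz")
```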

10 Stochastic Codebook (SCB)
Each code-vector contains 60 samples. The stochastic codebook contains 1082 independent random values from the set {–1, 0, +1} with probabilities {0.1, 0.8, 0.1}. The index k is in the range 0 ≤ k ≤ 511, so 9 bits are needed. The code-vector (k-value) that gives the minimum weighted error is selected.
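The numbers above fit an overlapped layout: 512 vectors of 60 samples drawn from one 1082-sample sequence implies a shift of (1082 − 60) / 511 = 2 samples between consecutive vectors. The sketch below assumes that layout (the shift of 2 is inferred from the arithmetic, not stated on the slide), and the randomly generated sequence is a stand-in, not the actual standardized codebook.

```python
import random

CODEBOOK_LENGTH = 1082
VECTOR_LENGTH = 60
NUM_VECTORS = 512
# Assumed overlap: inferred from (1082 - 60) / 511 = 2
SHIFT = (CODEBOOK_LENGTH - VECTOR_LENGTH) // (NUM_VECTORS - 1)


def make_ternary_sequence(length=CODEBOOK_LENGTH, seed=0):
    """Draw values from {-1, 0, +1} with probabilities {0.1, 0.8, 0.1}."""
    rng = random.Random(seed)
    return rng.choices([-1, 0, 1], weights=[0.1, 0.8, 0.1], k=length)


def code_vector(sequence, k):
    """Return the 60-sample code vector for index k (0 <= k <= 511)."""
    if not 0 <= k < NUM_VECTORS:
        raise ValueError("index out of range")
    start = SHIFT * k
    return sequence[start:start + VECTOR_LENGTH]


if __name__ == "__main__":
    seq = make_ternary_sequence()
    zeros = seq.count(0) / len(seq)
    print("shift:", SHIFT, "fraction of zeros: %.2f" % zeros)
    print(code_vector(seq, 511)[:10])
```

With roughly 80% of the samples equal to zero and neighbouring vectors sharing 58 samples, most of the filtering work for one index can be reused for the next, which keeps the search cheaper than a dense Gaussian codebook.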

11 Weighting Filter The listener will not notice noise at the formant frequencies because the signal energy is higher there. Errors at more noticeable frequencies are emphasized.
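The slide does not give the filter itself. A form widely used in CELP coders is W(z) = A(z) / A(z/γ) with 0 < γ < 1, which de-emphasizes the error near the formants; the sketch below assumes that form and an illustrative γ = 0.8, neither of which is confirmed by this transcript. The function names are hypothetical.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): each a[i] is scaled by gamma**i."""
    return [coef * gamma ** i for i, coef in enumerate(a)]


def weighting_filter(signal, a, gamma=0.8):
    """Apply W(z) = A(z) / A(z/gamma) to `signal`.

    `a` holds the LP coefficients with a[0] == 1 and
    A(z) = 1 + sum(a[i] * z**-i). Zero initial state is assumed.
    """
    num = a
    den = bandwidth_expand(a, gamma)
    out = []
    for n in range(len(signal)):
        acc = sum(num[i] * signal[n - i]
                  for i in range(len(num)) if n - i >= 0)
        acc -= sum(den[i] * out[n - i]
                   for i in range(1, len(den)) if n - i >= 0)
        out.append(acc)
    return out


if __name__ == "__main__":
    # First-order example: impulse response of the weighting filter.
    print(weighting_filter([1.0, 0.0, 0.0], [1.0, -0.9]))
```

Setting γ = 1 makes the numerator and denominator cancel, so the error is left unweighted; smaller γ moves more of the allowed noise under the formant peaks.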

12 Error Protection Not all bits of the CELP parameters affect speech intelligibility to the same degree. The most significant bits of the ACB are protected with a (15,11) Hamming code.
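A (15,11) Hamming code carries 11 data bits in a 15-bit codeword and corrects any single bit error. The sketch below uses the textbook construction with parity bits at positions 1, 2, 4 and 8; the bit ordering and the mapping of ACB bits onto the data positions in FS-1016 are not given in the slides, so this layout is an assumption for illustration.

```python
PARITY_POSITIONS = (1, 2, 4, 8)
DATA_POSITIONS = [p for p in range(1, 16) if p not in PARITY_POSITIONS]


def syndrome(codeword):
    """XOR of the (1-based) positions of all set bits; 0 means no error."""
    s = 0
    for pos in range(1, 16):
        if codeword[pos - 1]:
            s ^= pos
    return s


def hamming_encode(data_bits):
    """Encode 11 data bits into a 15-bit codeword."""
    if len(data_bits) != 11:
        raise ValueError("expected 11 data bits")
    codeword = [0] * 15
    for pos, bit in zip(DATA_POSITIONS, data_bits):
        codeword[pos - 1] = bit
    s = syndrome(codeword)
    for p in PARITY_POSITIONS:
        if s & p:                 # set parity so the syndrome becomes 0
            codeword[p - 1] = 1
    return codeword


def hamming_decode(codeword):
    """Correct up to one flipped bit and return the 11 data bits."""
    received = list(codeword)
    s = syndrome(received)
    if s:
        received[s - 1] ^= 1      # syndrome is the error position
    return [received[pos - 1] for pos in DATA_POSITIONS]


if __name__ == "__main__":
    data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1]
    word = hamming_encode(data)
    word[6] ^= 1                  # inject a single bit error
    print(hamming_decode(word) == data)
```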

13 FS-1016 Bit Allocation

14 Quality Comparison
Labels recovered from the comparison chart: Continuously Variable Slope Delta Modulation (CVSD), 16,000 bps; H250 microphone; 1% random bit errors; 0.5% random block errors; P3C Orion aircraft; mobile command environment.
MOS MOS-2.95

15 Implementation Issues
CODEBOOK SIZE   CODE SEARCH   TOTAL COMPL.   QUALITY (DAM)
128             2.4 MIPS      6.8 MIPS       65
256             4.8 MIPS      9.2 MIPS       66
512             9.5 MIPS      13.9 MIPS      67
1024            18.9 MIPS     23.3 MIPS      68
120 ms delay, average complexity ~16 MIPS
LPC-10e: ms delay and ~7 MIPS
MELP: ms delay and ~40 MIPS

16 References
A. Spanias, "Speech coding: A tutorial review," Proceedings of the IEEE, vol. 82, pp.  , October 1994.
V. C. Welch, T. E. Tremain, and J. P. Campbell, Jr., "A Comparison of U.S. Government Standard Voice Coders," IEEE Military Communications Conference (MILCOM) Conference Record, 1989, p
M. R. Schroeder and B. S. Atal, "Code-Excited Linear Prediction (CELP): High-Quality Speech at Very Low Bit Rates," ICASSP 85.
J. P. Campbell, Jr., T. E. Tremain, and V. C. Welch, "The Federal Standard bps CELP Voice Coder," Digital Signal Processing 1, no. 3 (1991):

