Published by Irene Gaines. Modified over 9 years ago.
1
DSP C5000 Chapter 23 Mobile Communication Speech Coders Copyright © 2003 Texas Instruments. All rights reserved.
2
ESIEE, Slide 2 Outline
- Speech Coding, CELP Coders
- Implementation using C54x
3
Outline – Speech Coding
- Generalities on speech and coding
- Linear Prediction based coders
  - Short term and long term prediction
  - Vector Quantization
- CELP coders
  - Structure and calculations
- Standards
4
Applications of Speech Coding
- Digital transmissions
  - On wired telephone: multiplexing, integration of services
  - On wireless channels: spectral efficiency, better protection against errors
- Voice mail/messaging
- Storage: telephone answering machine
- Secure phone
5
Characteristics of Coders
- Bit rate D: 50 bps < D < 96 kbps
- Coding delay ~ frame delay
- Quality
  - Objective measurements: SNR, PSQM
  - Subjective measurements: MOS (excellent, good, fair, poor, unacceptable)
  - Intelligibility: objective measure STI or subjective DRT
  - Acceptability: E model of ETSI standard, communicability
- Immunity to noise
- Complexity
6
Objective Evaluation of the Quality
- The PSQM method:
  - Objective evaluation
  - Based on a model of auditory perception
  - Takes into account the masking effects
- Good correlation with the MOS grade in "basic" conditions: low bit rate speech coding, tandem, transmission errors, ...
- But sometimes not very reliable: loss of frames, effect of the automatic control
- Still under development (PSQM+)
7
Subjective Evaluation of Quality using the ACR Method yielding MOS score
- A large number of listeners grade a large number of speech sequences.
  - Database with phonetically balanced sentences
  - Presentation in random order
  - Naive listeners
- Statistical processing of the results gives the MOS.
- MOS = Mean Opinion Score
8
Speech Production
9
Speech Signal
10
Speech Spectrum for a Voiced Sound
11
Speech Spectrogram
- Non-stationary
- Voiced / unvoiced
12
Calculation of Spectrograms
- Preac = pre-emphasis, enhances high frequencies
- Window = limits the edge effects
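The two front-end blocks named on this slide can be sketched as follows. This is only an illustration: the first-order pre-emphasis filter y(n) = x(n) - mu x(n-1), the value mu = 0.95, the Hamming window, and all function names are assumptions, since the slide does not say which filter or window is used.

```python
import math


def pre_emphasis(x, mu=0.95):
    """First-order high-pass y[n] = x[n] - mu * x[n-1] (mu is an assumed value)."""
    if not x:
        return []
    return [x[0]] + [x[n] - mu * x[n - 1] for n in range(1, len(x))]


def hamming(length):
    """Hamming window, one common choice for limiting edge effects."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (length - 1))
            for i in range(length)]


def windowed_frame(x, start, length):
    """Extract one analysis frame and taper it with the window."""
    w = hamming(length)
    return [x[start + i] * w[i] for i in range(length)]
```

Each windowed frame would then be passed to an FFT, and the squared magnitudes of successive frames stacked to form the spectrogram.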
13
Example: Time Signal and Spectrogram (figure panels: time signal and spectrogram; axes: time, frequency)
14
Equivalent Electrical Model
15
Simplified Speech Production Model
y(t) = h(t) * e(t), i.e. Y(z) = H(z)E(z)
16
All Pole Model of the Spectrum Shaping Filter
The filter H(z) represents the spectral envelope since the excitation has a white spectrum.
17
Short Term Linear Prediction
- The coefficients of H(z) = 1/A(z) can be obtained by linear prediction.
- Short term analysis on the speech signal x(n)
- Frames of 10 to 30 ms.
- The coefficients are chosen by a least square error criterion.
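As a sketch of the least square criterion, assuming the sign convention A(z) = 1 + a_1 z^{-1} + ... + a_p z^{-p} with prediction order p (the convention is an assumption, it is not stated in the slide text), the prediction error and the energy minimised over each frame can be written as:

```latex
e(n) = x(n) + \sum_{i=1}^{p} a_i \, x(n-i), \qquad
E = \sum_{n} e^{2}(n) \;\longrightarrow\; \min_{a_1, \dots, a_p}
```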
18
Determination of the Spectral Envelope by Linear Prediction
The prediction error e(n) (the residual) is nearly white, so the spectral envelope of x(n) can be approximated by a spectrum Sx(f) computed from the prediction coefficients.
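A usual form of this all-pole estimate, assuming A(z) = 1 + a_1 z^{-1} + ... + a_p z^{-p}, a residual variance written \sigma_e^2 and a sampling frequency F_S (these symbol names are assumptions), is:

```latex
S_x(f) \approx \frac{\sigma_e^{2}}{\left| A\!\left(e^{j 2\pi f / F_S}\right) \right|^{2}}
       = \frac{\sigma_e^{2}}{\left| 1 + \sum_{i=1}^{p} a_i \, e^{-j 2\pi f i / F_S} \right|^{2}}
```

The peaks of this smooth curve are the formants of the frame.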
19
Calculation of the Prediction Coefficients
- The prediction coefficients ai are the solution of the "normal equations".
- The Levinson-Durbin algorithm is often used to solve these equations.
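A minimal sketch of the Levinson-Durbin recursion follows. It assumes the convention A(z) = 1 + a_1 z^{-1} + ... + a_p z^{-p}, so that the normal equations read sum_j a_j R(|i-j|) = -R(i) for i = 1..p, where R is the autocorrelation of the frame. The function name and return layout are illustrative, not taken from any particular library.

```python
def levinson_durbin(r, order):
    """Solve the normal equations from autocorrelation values r[0..order].

    Returns (a, k, err): a[0] == 1 and a[1..order] are the predictor
    coefficients, k holds the reflection coefficients, err is the final
    prediction error energy.
    """
    a = [1.0] + [0.0] * order
    k = [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        # Correlation of the current order-(m-1) error with the new lag.
        acc = r[m]
        for i in range(1, m):
            acc += a[i] * r[m - i]
        km = -acc / err
        k[m - 1] = km
        # Order update of the coefficients.
        new_a = a[:]
        for i in range(1, m):
            new_a[i] = a[i] + km * a[m - i]
        new_a[m] = km
        a = new_a
        err *= 1.0 - km * km
    return a, k, err
```

For an autocorrelation R(k) = 0.9^k, the recursion returns a_1 = -0.9 and a_2 close to 0, as expected for a first-order autoregressive signal. The reflection coefficients come out as a by-product, which is convenient when they are the parameters that get quantized.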
20
Example of Linear Prediction (figure panels: amplitude of the speech signal; amplitude of the residual signal)
21
Example of Linear Prediction: Spectral Envelope Estimation (figure annotation: formants)
22
Estimation of the Pitch Period
- Pitch period T0 estimated by correlation of the speech signal or of the residual.
- Other methods exist (e.g. cepstrum).
- F0 = fundamental frequency = 1/T0
- Fractional pitch estimation if the precision is better than the sampling period.
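The correlation method can be sketched as below. The normalised correlation and the restriction to an integer lag range are choices made for this illustration, and the function name is hypothetical. The search range has to exclude multiples of the true period, which score equally well on a perfectly periodic signal.

```python
import math


def estimate_pitch(x, min_lag, max_lag):
    """Return the integer lag in [min_lag, max_lag] with the highest
    normalised correlation between the frame and its delayed copy."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        cur = x[lag:]
        past = x[:len(x) - lag]
        num = sum(c * p for c, p in zip(cur, past))
        den = math.sqrt(sum(c * c for c in cur) * sum(p * p for p in past))
        score = num / den if den > 0.0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic voiced-like signal with a 50-sample period, sampled at 8000 Hz.
fs = 8000
signal = [math.sin(2 * math.pi * n / 50) + 0.5 * math.sin(4 * math.pi * n / 50)
          for n in range(400)]
t0 = estimate_pitch(signal, 20, 80)
f0 = fs / t0
```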
23
Long Term Prediction (LTP)
- The idea is to predict one period of the signal from the preceding one.
- 2 unknowns: b and M. M is the pitch period (when voiced).
- Least square error criterion is used.
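Written out, and assuming the one-tap predictor B(z) = 1 - b z^{-M} that appears on the fractional pitch slide, the long term prediction error and the criterion are (the summation runs over the analysed subframe, an assumption of this sketch):

```latex
e(n) = x(n) - b \, x(n-M), \qquad
E(b, M) = \sum_{n} \bigl( x(n) - b \, x(n-M) \bigr)^{2}
```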
24
Long Term Prediction (LTP)
- For a given value of M, the optimal gain b has a closed form.
- The best M maximizes the resulting reduction of the error energy.
- All possible values of M must be tested.
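A sketch of this exhaustive search, under the assumption that the error is e(n) = x(n) - b x(n-M): for a fixed M the least square gain is b = sum x(n)x(n-M) / sum x(n-M)^2, and the error energy is reduced by (sum x(n)x(n-M))^2 / sum x(n-M)^2, which is the quantity maximised over M. Function and parameter names are illustrative.

```python
def ltp_search(x, start, length, min_lag, max_lag):
    """Return (M, b) minimising sum (x[n] - b * x[n - M])**2 over the
    subframe x[start : start + length]. Requires start - max_lag >= 0."""
    best_lag, best_gain, best_score = min_lag, 0.0, -1.0
    for lag in range(min_lag, max_lag + 1):
        corr = sum(x[n] * x[n - lag] for n in range(start, start + length))
        energy = sum(x[n - lag] ** 2 for n in range(start, start + length))
        if energy <= 0.0:
            continue  # nothing to predict from at this lag
        score = corr * corr / energy  # reduction of the error energy
        if score > best_score:
            best_lag, best_gain, best_score = lag, corr / energy, score
    return best_lag, best_gain


# Decaying pulse train: one pulse every 40 samples, each 0.8 times the last.
x = [1.0] + [0.0] * 39
for n in range(40, 160):
    x.append(0.8 * x[n - 40])
M, b = ltp_search(x, 120, 40, 20, 60)
```

On this synthetic signal the search finds the 40-sample period and a gain of 0.8.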
25
Example of Long Term Prediction
26
LPC 10 Vocoder
- One of the oldest speech coders is the LPC10 vocoder.
- The analysis (coder) calculates for each frame: pitch period, prediction coefficients, energy, voicing.
- The synthesis (decoder) uses these parameters to synthesize speech from the electrical equivalent model.
27
LPC 10 Vocoder (Order 10)
Frame = 22.5 ms
28
Prediction Spectral Parameters
- The ai coefficients are sensitive to coding and interpolation. They are replaced by other coefficients:
  - Reflection coefficients ki, log area ratios LARi.
  - Line spectrum frequencies LSFi.
- In the LPC10 vocoder:
  - The pitch and voicing are coded on 7 bits.
  - The log of energy on 5 bits.
  - The 10 prediction coefficients ai (transformed into ki and LARi) are coded on 41 bits.
  - A total of 53 bits, plus 1 synchronization bit, gives 54 bits per 22.5 ms frame = 2400 bps.
29
Vector Quantization (2-dimensional example)
Bit rate can be decreased by applying VQ to the coefficients.
30
Line Spectrum Frequencies LSF, LSP
- The Line Spectrum Frequencies fi and the Line Spectrum Pairs cos(2π fi) have good properties for quantization and interpolation.
- The LSF and LSP are derived from the inverse filter A(z).
- Build the symmetrical and antisymmetrical polynomials F1(z) and F2(z) from A(z) (for order 10).
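A common form of this construction for order 10, assuming A(z) = 1 + a_1 z^{-1} + ... + a_{10} z^{-10} (this is the standard textbook form, given here as an assumption about what the slide showed), is:

```latex
F_1(z) = A(z) + z^{-11} A(z^{-1}), \qquad
F_2(z) = A(z) - z^{-11} A(z^{-1})
```

With this definition A(z) = (F_1(z) + F_2(z))/2, so the decoder can rebuild the prediction filter from the roots of the two polynomials.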
31
LSF and LSP
- The roots of F1 and F2 lie on the unit circle and are interleaved.
- 5 conjugate roots $\exp(j\omega_i)$, $f_i = \omega_i/(2\pi)$.
32
Coders using Short Term and Long Term Prediction
- RELP
- MPE LP
- CELP
33
RPE-LTP GSM Full Rate Coders
- The GSM Full Rate coder is called RPE-LTP = Regular Pulse Excited, Long Term Prediction coder.
- The signal u = the best down-sampled version ( 4) of the residual signal r.
- In CELP coders, vector quantization is applied on the signal.
- CELP = Code Excited Linear Prediction coder
- Each frame of residual signal is compared to sequences of signal stored in a codebook.
- The codebook sequences are white and the codebook is called a stochastic codebook.
34
CELP Coder Basic Scheme
Analysis by synthesis (closed loop) to find the best excitation sequence.
35
Structure of CELP Coder: Perceptual Filter
- Perceptual filter: the reconstruction error is spectrally weighted, exploiting the noise masking properties of formants.
- $W(z) = A(z/\gamma_1)/A(z/\gamma_2)$, $0 \le \gamma_1, \gamma_2 \le 1$
- $A^*(z) = A(z/\gamma)$ (poles towards zero)
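Replacing z by z/gamma only rescales the coefficients: if A(z) = sum a_i z^{-i}, then A(z/gamma) = sum a_i gamma^i z^{-i}. The sketch below builds the numerator and denominator of W(z) that way. The function names are illustrative and the values 0.9 and 0.6 are only plausible example settings, not values taken from the slides.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma) given those of A(z) (a[0] is the z^0 term)."""
    return [coef * gamma ** i for i, coef in enumerate(a)]


def weighting_filter(a, gamma1, gamma2):
    """Return (numerator, denominator) coefficient lists of
    W(z) = A(z/gamma1) / A(z/gamma2)."""
    return bandwidth_expand(a, gamma1), bandwidth_expand(a, gamma2)


# Toy second-order A(z); gamma values are illustrative assumptions.
a = [1.0, -0.9, 0.2]
num, den = weighting_filter(a, 0.9, 0.6)
```

Because each root of A(z/gamma) is the corresponding root of A(z) multiplied by gamma, a smaller gamma pulls the roots towards the origin and flattens the formant peaks.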
36
CELP Coder with Perceptual Filter
37
Basic CELP Structure: Perceptual Filter Inserted in the 2 Branches
H(z) = W(z)/A(z)
38
CELP Structure: Memory of H(z)
- Memory of H(z) = output for a zero input
- hi = impulse response of H(z)
39
CELP Coder: Memory of H(z)
40
CELP: Adaptive Codebook
LTP can be realized by an adaptive codebook.
41
CELP with Stochastic Codebook
The adaptive codebook stores the past residual frames. It is called adaptive because its content changes with time.
42
CELP Decoder
43
CELP Equations Example: Searching through Codebooks
The main load is the filtering of all the codebook vectors.
44
Filtering Matrix H
- h(n) is the impulse response corresponding to H(z).
- N = length of the codebook vectors.
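Assuming the usual zero-state formulation, H is the N x N lower-triangular Toeplitz matrix with H[i][j] = h(i - j), so that the filtered version of a code vector c is f = H c. The sketch below builds it from a short, made-up impulse response; the function names are illustrative.

```python
def filtering_matrix(h, n):
    """Lower-triangular Toeplitz matrix with H[i][j] = h[i - j] for i >= j."""
    return [[h[i - j] if 0 <= i - j < len(h) else 0.0 for j in range(n)]
            for i in range(n)]


def filter_vector(H, c):
    """Zero-state response f = H c of the filter to the code vector c."""
    return [sum(row[j] * c[j] for j in range(len(c))) for row in H]


# Toy impulse response truncated to N = 3 samples.
H = filtering_matrix([1.0, 0.5, 0.25], 3)
f = filter_vector(H, [0.0, 1.0, 0.0])
```

Each product costs about N(N+1)/2 multiply-accumulates, which is why filtering every codebook vector dominates the processing load.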
45
Finding the Best Excitation in the Coder: Equation of the Solution
- J: least square criterion
- For a set of 2 vectors $c_{j,i(j)}$, F is the 2-column matrix of filtered vectors $f_{j,i(j)}$.
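A sketch of the least square solution, assuming p denotes the target vector and g = (g_1, g_2)^T the vector of the two codebook gains (standard linear least squares, not copied from the slide):

```latex
J = \lVert p - F g \rVert^{2}, \qquad
g_{\mathrm{opt}} = (F^{T} F)^{-1} F^{T} p, \qquad
J_{\min} = \lVert p \rVert^{2} - p^{T} F (F^{T} F)^{-1} F^{T} p
```

Minimising J over the choice of code vectors is therefore equivalent to maximising the second term of J_min, which is the squared norm of F g_opt.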
46
CELP Optimal Solution
- The optimal algorithm finds the best combination of code vectors maximizing the norm and finds the optimal gains gj.
- But the number of combinations of codebook vectors is very high and the complexity is also great.
- Example: M = 1024 for the stochastic codebook and M = 256 for the adaptive codebook leads to 262 144 solutions to test and 1280 vectors to filter.
47
Iterative Suboptimal Algorithm for 2 Codebooks
- First step: target vector = p
  - Find the best vector in the adaptive codebook and its gain.
  - Calculate the new target vector p1 by removing the scaled adaptive contribution from p.
- Second step: target vector = p1
  - Find the best vector in the stochastic codebook and its gain.
48
Iterative Algorithm
49
Operations of the Iterative Algorithm
At step j, the optimal codebook vector has index i.
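The two-step search can be sketched as follows, assuming the standard single-vector least square result: for a target t and a filtered vector f, the best gain is g = (t.f)/(f.f), and the best index maximises (t.f)^2/(f.f). The names p and p1 follow the iterative algorithm slide; everything else (function names, toy vectors) is illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def search_codebook(target, filtered):
    """Return (index, gain) of the filtered vector that best matches target."""
    best_i, best_gain, best_score = -1, 0.0, -1.0
    for i, f in enumerate(filtered):
        energy = dot(f, f)
        if energy == 0.0:
            continue
        corr = dot(target, f)
        score = corr * corr / energy
        if score > best_score:
            best_i, best_gain, best_score = i, corr / energy, score
    return best_i, best_gain


def two_stage_search(p, adaptive_f, stochastic_f):
    """Adaptive codebook first, then stochastic codebook on the new target."""
    ia, ga = search_codebook(p, adaptive_f)
    p1 = [t - ga * f for t, f in zip(p, adaptive_f[ia])]
    ic, gc = search_codebook(p1, stochastic_f)
    return (ia, ga), (ic, gc)


# Toy, already-filtered codebooks of length-3 vectors.
result = two_stage_search([2.0, 1.0, 0.0],
                          [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                          [[0.0, 0.0, 1.0], [0.0, 2.0, 0.0]])
```

The sequential search tests the two codebooks one after the other instead of every pair of vectors, at the price of a possibly suboptimal pair.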
50
Iterative Algorithm: Numerical Example
- FS = 8000 Hz
- M = 256: size of the stochastic codebook
- Ma = 128: size of the adaptive codebook
- Frame size NT = 160 (20 ms)
- Frames split in 4 subframes of N = 40 samples
- p = 10: linear prediction order
- 10 MIPS to filter the stochastic codebook.
51
Iterative Algorithm
- The main processing load is the filtering of the codebook vectors.
- Many algorithms have been proposed to decrease the computation load:
  - Special structures of the codebook:
    - VSELP: Vector Sum
    - Algebraic codebook: ACELP
    - Linear codebook (the adaptive codebook is linear).
  - Structure of H avoiding the filtering:
    - Diagonalization of $H^T H$
52
CELP Coding Standards from 4.8 kbps to 16 kbps
- Federal standard (DOD) (4.8 kbps)
  - frame = 240 samples (30 ms)
  - LPC 10 --> (LSP coding 34 bits)
  - adaptive codebook (256 vectors (fractional pitch))
  - stochastic codebook (512 vectors (-1,0,1))
53
VSELP (Vector Sum Excited Linear Prediction)
- Codebook vectors v are combinations of basis vectors (b1, b2, ..., bk): v = +/- b1 +/- b2 +/- ... +/- bk
- Only the basis vectors are filtered.
- Motorola (8 kbps)
- GSM (half rate) (5.6 kbps)
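Because filtering is linear, the filtered version of v = +/- b1 +/- ... +/- bk is the same signed sum of the filtered basis vectors. The sketch below enumerates the 2^k filtered code vectors from k already-filtered basis vectors; the function name and the toy basis are illustrative only.

```python
from itertools import product


def vselp_codevectors(filtered_basis):
    """All 2**k signed sums of the k filtered basis vectors."""
    n = len(filtered_basis[0])
    vectors = []
    for signs in product((1, -1), repeat=len(filtered_basis)):
        vectors.append([sum(s * b[j] for s, b in zip(signs, filtered_basis))
                        for j in range(n)])
    return vectors


# k = 2 toy basis vectors give 4 code vectors with only 2 filtering operations.
vectors = vselp_codevectors([[1, 0], [0, 2]])
```

With k basis vectors the coder filters k vectors instead of 2^k, and each code vector is identified by its k sign bits.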
54
Fractional Pitch
- The precision of the pitch period is a fraction of the sampling period $T_s$.
- An interpolation filter is used.
- $B(z) = 1 - b z^{-M_f}$ with $M_f = M + \alpha$, where $\alpha$ is the fractional part of the lag.
- $x(n - M - \alpha)$ can be written as: $\mathcal{F}^{-1}\{X(f)\, e^{-j 2\pi f (M+\alpha) T_s}\} = x(n-M) * \mathcal{F}^{-1}\{e^{-j 2\pi f \alpha T_s}\} = x(n-M) * h(n)$, where $*$ denotes convolution and h(n) is the interpolation filter.
55
Standards
56
Standards
- Wired telephony (ITU-T)
  - G.711 (1972): PCM, 64 kbps
  - G.721 (1984): ADPCM, 32 kbps
  - G.728 (1991): LD-CELP, 16 kbps
  - G.729: CS-ACELP, 8 kbps
- Mobile communications (ETSI - CTIA)
  - GSM (FR): RPE-LTP, 13 kbps
  - GSM (HR): VSELP, 5.6 kbps
  - GSM (EFR): ACELP, 12.2 kbps
  - UMTS (AMR): ACELP, 12.2 to 4.75 kbps
- Military applications (NATO)
  - FS 1015 (1976): LPC10, 2.4 kbps
  - FS 1016 (1991): CELP, 4.8 kbps
57
AMR Adaptive MultiRate Coder for 3G Application
- 8 narrow band (NB AMR) source coders: 12.2, 10.2, 7.95, 7.40, 6.70, 5.90, 5.15, 4.75 kbps
- 9 wide band (WB AMR) coders
- Based on ACELP
- Frame of 20 ms, fs = 8000 Hz
58
ITU Civil Standards
59
ETSI and Inmarsat Standards
60
TIA and RCR Standards
61
Implementation of CELP Coders on C54x
- Example of the G.729 Annex A.
- Specific instruction for codebook search
- Some functions of DSPLIB
62
Profiling Example for G.729 Annex A using C Compiler
- G.729 is a CS-ACELP coder (ITU 1995): 8 kbps with the quality of G.726 ADPCM at 32 kbps.
- G.729 Annex A: DSVD (Digital Simultaneous Voice & Data), voice over internet, voice e-mail.
63
G.729 Annex A Main Blocks of the Coder Algorithm
- Frame = 10 ms = 80 samples.
- Short term LPC analysis on 40 ms frame
- LSP derived from the ai coefficients and quantized using split VQ.
- Long term LTP analysis: 2 subframes of 40 samples; LTP lag and gain.
- LTP fractional lag (1/3): 8 bits for the 1st subframe and 5 bits for the 2nd.
- Fixed codebook search: 2 subframes of 40 samples; index and gains.
- Code length = 40 with 4 non-zero pulses ±1.
64
Structures of Frames
65
G.729 Annex A, Bit Allocation
66
G.729 Annex A Main Blocks of the Decoder Algorithm
- The serial received bits are converted into parameters: LSP vector, 2 fractional pitch lags and gains, 2 fixed codebook indices and gains.
- The LSP are converted to LP filter coefficients ai and interpolated at each subframe.
- At each subframe:
  - The excitation is constructed and scaled.
  - The speech is synthesized by filtering the excitation through the LP synthesis filter.
- Postprocessing by an adaptive postfilter.
67
Using the C Compiler
Use the C program of the standard and the C compiler with maximum optimization.
- Autocorrelation: 488 445 cycles
- Levinson: 164 843 cycles
- Conversion ai to LSF: 410 404 cycles
- LSF quantization: 883 853 cycles
- Synthesis filtering: 501 472 cycles
- Pitch open loop: 793 533 cycles
- Fractional pitch: 2 x 618 354 cycles
- Search algebraic code: 2 x 617 582 cycles
- Gains quantization: 2 x 108 480 cycles
68
Assembly Language Instructions for Codebook Search
- Better results can be obtained with assembly language than with C.
- Specific instructions for codebook search: conditional stores.
69
Assembly Language Codebook Search
70
Assembly Language Codebook Search
- $A = C(i)^2$
- $B = C(i)^2\, G_{opt}$, $T = G_{opt}$
- $B = C(i)^2\, G_{opt} - G(i)\, C_{opt}^2$
- If ($B > 0$) then conditional stores update $I_{opt}$ (from BRC), $G_{opt}$ (from T) and $C_{opt}^2$ (from A).
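The test on B is a division-free way of comparing the ratios C(i)^2/G(i) and Copt^2/Gopt: with positive energies, C(i)^2/G(i) > Copt^2/Gopt exactly when C(i)^2 Gopt - G(i) Copt^2 > 0. A sketch of the same loop in Python (the function name and the initialisation from the first entry are assumptions of this illustration; on the DSP the three updates are conditional stores rather than a branch):

```python
def search_best_index(c, g):
    """Return the index i maximising c[i]**2 / g[i] without any division.

    Assumes every g[i] > 0 (energies of the filtered code vectors).
    """
    i_opt, c2_opt, g_opt = 0, c[0] * c[0], g[0]
    for i in range(1, len(c)):
        a = c[i] * c[i]                # A = C(i)^2
        b = a * g_opt - g[i] * c2_opt  # B = C(i)^2 Gopt - G(i) Copt^2
        if b > 0:                      # candidate i beats the current optimum
            i_opt, c2_opt, g_opt = i, a, g[i]
    return i_opt
```

The strict comparison keeps the earlier index when two candidates give the same ratio.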