Mohamed Chibani, Roch Lefebvre and Philippe Gournay


Presentation on theme: "Mohamed Chibani, Roch Lefebvre and Philippe Gournay"— Presentation transcript:

1 RESYNCHRONIZATION OF THE ADAPTIVE CODEBOOK IN A CONSTRAINED CELP CODEC AFTER A FRAME ERASURE
Mohamed Chibani, Roch Lefebvre and Philippe Gournay, Université de Sherbrooke, Sherbrooke, Québec, Canada
Good morning. I'm pleased to present our work on a technique that speeds up the recovery of a constrained CELP codec after a frame erasure. It extends previous work in which the excitation search in a CELP codec is constrained to reduce the overall contribution of the adaptive codebook (ACB).

2 Outline
Basic CELP model; constrained optimization; resynchronization at the decoder; open-loop search of the shift (drift) of the ACB; closed-loop search of the shift; pitch contour modification; experimental results; conclusions.
I'll start by briefly presenting the conventional excitation in the CELP model and the constraint applied to the excitation search at the encoder. In the second part of the presentation I'll describe the resynchronization of the adaptive codebook after a frame erasure. Then I'll explain how we deal with the abrupt change in the pitch period at the frame boundary after the resynchronization. Finally, I'll present the experimental results and end with some conclusions.

3 Excitation Model in CELP Coding
In CELP coding, the excitation is the sum of a long-term excitation provided by the adaptive codebook (ACB) and an innovative excitation provided by the innovative codebook (ICB). The adaptive codebook contains the excitation of the past frames. The problem is that when a frame is lost, the adaptive codebook is not properly updated, and the error propagates over the following frames, especially when the contribution of the ACB is high. Even when the concealment procedure succeeds, to a certain extent, in perceptually concealing the lost frame, it generally fails to maintain synchrony between the encoder and the decoder.

4 Excitation Search in CELP Coding
To search for the excitation parameters, a target signal x1(n) is first built by weighting the input speech and removing the zero-input response of the weighting-synthesis filter. The optimization is done sequentially: first the optimal delay of the ACB is found, then the optimal gain. Y0 is the ACB contribution obtained by filtering the long-term excitation through the weighting-synthesis filter. Once the scaled ACB contribution is found, it is removed from the target x1(n) to obtain the target for the innovative codebook search.
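The target update described above can be sketched as follows. This is a minimal illustration, not the codec's implementation: it assumes the ACB delay has already been chosen, uses the plain least-squares gain (without any bounding or quantization a real codec applies), and the names `acb_gain` and `icb_target` are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def acb_gain(x1, y0):
    """Least-squares gain g minimizing the energy of x1 - g*y0."""
    energy = dot(y0, y0)
    return dot(x1, y0) / energy if energy > 0.0 else 0.0


def icb_target(x1, y0, gain):
    """Target for the innovative codebook search: x2 = x1 - g*y0."""
    return [x - gain * y for x, y in zip(x1, y0)]


# Toy signals (made up for illustration only).
x1 = [1.0, 2.0, 3.0, 4.0]   # target signal
y0 = [0.5, 1.0, 1.5, 2.5]   # filtered ACB contribution
g = acb_gain(x1, y0)
x2 = icb_target(x1, y0, g)
```

With the unconstrained least-squares gain, the new target x2 is orthogonal to the ACB contribution, which is why little of the pitch structure is left for the innovative codebook to model.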

5 At the Encoder…

6 Constrained Search of the Excitation Parameters
To allow the decoder to recover faster after a frame erasure, the optimization is modified as follows. First, the ACB contribution is evaluated by calculating the ratio between its energy and the energy of the target. If the ratio is greater than a threshold, the gain of the ACB is modified to limit the contribution of the ACB. As a result, part of the long-term contribution already modeled by the ACB remains in the new target x2 and is modeled by the ICB as well. This leads to a redundant representation of the pitch excitation.
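A rough sketch of the constrained gain is given below. The talk states only that the gain is modified when the energy ratio exceeds a threshold; the specific rule used here (rescaling the gain so that the ratio equals the threshold) and the threshold value are assumptions made for illustration, and `constrained_acb_gain` is a hypothetical name.

```python
import math


def energy(signal):
    """Sum of squared samples."""
    return sum(v * v for v in signal)


def constrained_acb_gain(x1, y0, gain, threshold):
    """Limit the ACB gain when the scaled ACB contribution is too strong.

    ratio = energy(gain * y0) / energy(x1). If ratio > threshold, the gain
    is rescaled so that the ratio equals the threshold (assumed rule).
    """
    e_target = energy(x1)
    if e_target == 0.0:
        return gain
    ratio = gain * gain * energy(y0) / e_target
    if ratio > threshold:
        return gain * math.sqrt(threshold / ratio)
    return gain


# Toy case: the ACB contribution matches the target exactly (ratio = 1).
x1 = [1.0, 2.0, 3.0, 4.0]
y0 = [1.0, 2.0, 3.0, 4.0]
limited = constrained_acb_gain(x1, y0, gain=1.0, threshold=0.64)
unchanged = constrained_acb_gain(x1, y0, gain=1.0, threshold=1.5)
```

Under this assumed rule, whatever the ACB no longer covers stays in the target x2 and is picked up by the ICB search, which is the redundancy the decoder exploits later.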

7 At the Decoder…
Using only the constraint already improves the recovery after a frame erasure without modifying the decoder. However, the constraint can be further exploited at the decoder to speed up the recovery after a frame erasure.

8 Prelude to the Resynchronization Algorithm
After a frame erasure, both the waveform and the position of the pitch pulses in the ACB memory are erroneous. For voiced speech, the pitch pulse waveform evolves slowly, so if the expected position of the last pitch pulse in the ACB memory can be determined, the ACB memory can be corrected. Because of the constraint, a good approximation of the pitch pulse can be obtained using only the parameters of the current frame. The underlying problem after a frame erasure is that the adaptive codebook memory was not properly updated.

9 The Excitation Signal Obtained After Setting to Zero the ACB Memory
First, we build an excitation signal using only the parameters of the current frame, in other words after setting the ACB memory to zero. The ICB excitation combines with the ACB excitation at every subframe to progressively shape the pitch pulse. Because of the constraint, we obtain a good approximation of the last pitch pulse, as illustrated.
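The construction of this excitation can be sketched as below. The sketch assumes integer ACB delays (a real codec typically uses fractional delays with interpolation) and already-decoded gains and ICB code vectors; `build_e0` and its parameter names are hypothetical.

```python
def build_e0(delays, acb_gains, icb_gains, icb_vectors, l_sbfr):
    """Build the frame excitation with the ACB memory set to zero.

    Per sample: e(n) = g_p * e(n - T) + g_c * c(n), where T, g_p, g_c and
    c(n) are the delay, ACB gain, ICB gain and ICB code vector of the
    subframe containing n.
    """
    max_delay = max(delays)
    buf = [0.0] * max_delay  # zeroed ACB memory
    for delay, g_p, g_c, code in zip(delays, acb_gains, icb_gains, icb_vectors):
        for n in range(l_sbfr):
            # len(buf) is the index of the sample being created, so
            # buf[len(buf) - delay] is the sample one delay earlier.
            buf.append(g_p * buf[len(buf) - delay] + g_c * code[n])
    return buf[max_delay:]


# Toy subframe of 8 samples: one ICB pulse, delay 3, ACB gain 0.5.
e0 = build_e0(
    delays=[3],
    acb_gains=[0.5],
    icb_gains=[1.0],
    icb_vectors=[[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]],
    l_sbfr=8,
)
```

In the toy case the single ICB pulse is repeated every 3 samples with decaying amplitude, which shows how the ACB recursion builds pitch structure even from an empty memory.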

10 Block Diagram of the Resynchronization Algorithm
This block diagram summarizes the resynchronization algorithm. First, we determine the position of the last pulse in the erroneous ACB memory. Then, we estimate the expected position of the last pitch pulse in the ACB memory. We use these two positions to estimate the shift needed to correct the ACB memory. The shift is then refined in closed-loop.

11 Determination of the Expected Pitch Pulse Position in the Erroneous ACB Memory
[Figure labels: ACB delays; P(0); P(-1), the last pulse in the ACB memory; the excitation e0(n); the correct excitation]
To determine the expected position of the last pitch pulse in the ACB memory, we go backward starting from the position of the last pitch pulse of the excitation e0(n). Here, we assume that the distance between successive pitch pulses is equal to the ACB delay of the corresponding subframe.
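The backward walk can be sketched as follows. The position of the last pulse of e0(n) is taken as an input because the talk does not say how it is located. The sketch also assumes integer delays and that each backward step uses the delay of the subframe containing the current pulse; `expected_pulse_position` is a hypothetical name.

```python
def expected_pulse_position(last_pulse, delays, l_sbfr):
    """Walk backward from the last pitch pulse of e0(n) into the ACB memory.

    Positions are sample indices relative to the start of the frame, so the
    returned value is negative: it lies in the past excitation, i.e. in the
    ACB memory.
    """
    pos = last_pulse
    while pos >= 0:
        pos -= delays[pos // l_sbfr]  # delay of the subframe containing pos
    return pos


# Toy frame: 4 subframes of 64 samples, slowly increasing delay.
p0 = expected_pulse_position(last_pulse=230, delays=[50, 52, 54, 56], l_sbfr=64)
```

In this toy case the pulses are traced back through positions 174, 120, 68 and 16 before landing at -34, which plays the role of P(0).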

12 Estimation of the Shift 0
P(0) : The expected position of the last pitch pulse in the ACB memory P(-1) : The actual position of the last pitch pulse in the ACB memory The shift is then simply the difference between the position P(0) and P(-1). Now, the expected position P(0) may not be very reliable especially if the pitch contour evolves within a subframe. Thus, the value of the shift is refined in closed-loop.

13 Closed-loop Search for the Optimal Shift
e is the excitation signal built after correcting the ACB for every shift candidate L_FRM=256 The measure we use to determine the optimal shift, is the correlation e0 built after setting to zero the ACB memory and the excitation signal built for every shift candidate (limited to 5 values around d0). The optimal shift is the one that maximize the correlation. The calculation of the correlation is limited to the two last subframe to the one pitch period if the pitch period is greater than two subframes. L=max(2*L_SBFR,T(3)) T(3) is the ACB delay of the 4th subframe

14 Example of a Resynchronized Excitation
[Figure panels: the correct excitation; the excitation e0(n); the excitation signal built using the erroneous ACB memory; the excitation signal built after correcting the ACB memory]
In this example we clearly see the effect of the resynchronization on the excitation signal. In c) we have the excitation signal built using the erroneous ACB memory and the one built with the resynchronized ACB memory. Compare both excitations to the correct excitation.

15 Modification of the Pitch Contour After the Resynchronization
[Figure panels: the correct excitation; the excitation after the resynchronization; the excitation after the modification of the pitch contour]
The correction of the ACB memory causes an abrupt change in the pitch contour at the frame boundary. One solution is to modify the pitch contour by distributing the shift d over the pitch periods in the frame, so that the pitch contour evolves smoothly. In this example d is negative. d_i is the shift of each interval, and Np is the number of pitch periods.
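One way to distribute the shift is sketched below. The exact distribution rule from the slide is not recoverable from the transcript, so an even integer split whose parts sum to d is assumed here, and `distribute_shift` is a hypothetical name.

```python
def distribute_shift(d, n_periods):
    """Split the total shift d into n_periods integer shifts d_i.

    The parts differ by at most one sample and sum exactly to d
    (assumed rule: as even a split as integers allow).
    """
    sign = -1 if d < 0 else 1
    base, extra = divmod(abs(d), n_periods)
    return [sign * (base + (1 if i < extra else 0)) for i in range(n_periods)]


# Toy case: a shift of -5 samples spread over 3 pitch periods.
shifts = distribute_shift(-5, 3)
```

Each pitch period is then adjusted by its own d_i instead of applying the whole shift at the frame boundary.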

16 The Effect of the Resynchronization Algorithm when Applied on Voiced Speech Segment
[Figure panels: error-free signal; standard codec; constrained codec; constr. + resynchro.]
In this example, we see that the standard codec, even though it manages to maintain a good waveshape during and even after the frame erasure, is completely desynchronized right after it. When using only the constraint, the decoder recovers after only three frames, but still suffers from the loss of synchrony after the frame erasure. Finally, by combining the constraint and the resynchronization, we manage to maintain a good waveshape throughout the frames following the erased frame.

17 Experimental Results
MUSHRA score by frame erasure rate:
Codec                  0%      5%      10%
Standard codec         74.88   42.36   29.99
Constrained codec      73.67   49.18   36.70
Constr. + Resynchro.   73.67   53.90   40.40
Test features: AMR-WB at mode 2 (12.65 kb/s); 10 listeners; 14 pairs of sentences for each condition; listening using binaural headphones.
To evaluate the performance of the method, we carried out subjective listening tests. Three conditions were used: error-free channel, 5% frame erasure, and 10% frame erasure. The improvement in quality over the standard codec is significant at both frame erasure rates, both when using the constraint alone and when using the constraint together with the resynchronization.

18 Conclusions
The resynchronization speeds up the recovery of the decoder after a frame erasure. The method (constraint + resynchronization) needs neither extra bits nor extra delay. The modified codec is completely interoperable with the standard (the bitstream is not modified). Only 10 to 15% of the frames following an erased frame are resynchronized. The only drawback is a minor loss of quality in error-free channels.
The main attraction of the method is that it is completely interoperable with the standard: the coder or decoder on either side of a communication link does not need to know whether the other side uses the standard or the modified version. The only drawback is a minor quality loss in clear-channel conditions, around 1 point on the MUSHRA scale. Thank you.

19 Thank you

