1
A Study on Scalable CELP
Graduate student: 鄒昇龍  Advisor: Dr. 尤信程  2018/12/3
2
Outline
LPC and the Fundamentals of CELP (Code-Excited Linear Prediction)
Scalable CELP
The Proposed Scalable Structure
Experiment Results
Conclusion
3
LPC (Linear Predictive Coding)
LPC is an important technique in speech coding. Why? Because a speech signal is highly correlated in the time domain.
5
LPC Analysis
Autocorrelation method
Levinson-Durbin recursive algorithm
Prediction order vs. prediction gain
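The autocorrelation method and the Levinson-Durbin recursion can be sketched as below. This is a minimal illustration, not the coder's actual implementation: the function names, the test signal, and the sign convention (a[0] = 1, prediction error e[n] = x[n] + sum of a[k] x[n-k]) are assumptions made for the example.

```python
import numpy as np

def autocorrelation(frame, order):
    """Return the autocorrelation lags r[0..order] of one frame."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    return np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])

def levinson_durbin(r, order):
    """Solve the LPC normal equations recursively.

    Returns (a, err): a[0] = 1 and err is the remaining
    prediction-error energy after `order` stages.
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient of stage i
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        # Update a[1..i] from the coefficients of the previous stage
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a, err

if __name__ == "__main__":
    # Hypothetical test signal: a strongly correlated AR(1) process
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(4000)
    x = np.zeros(4000)
    for n in range(1, 4000):
        x[n] = 0.9 * x[n - 1] + noise[n]
    r = autocorrelation(x, 10)
    for p in (1, 2, 4, 10):
        _, err = levinson_durbin(r, p)
        print(p, 10 * np.log10(r[0] / err))  # prediction gain in dB
```

The prediction gain is taken here as r[0] / err, so printing it for several orders gives the kind of order-versus-gain curve the slide refers to.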
6
Prediction order vs. prediction gain
7
Pitch Prediction (1)
The LPC prediction error still contains a periodic component.
8
Pitch Prediction (2)
We can use a pitch predictor to find the pitch delay t. The error of the pitch predictor is random noise. There are several approaches to encoding this random noise:
CELP
SELP (Self-Excitation Linear Prediction)
MPLPC (Multi-Pulse LPC)
RPLPC (Regular-Pulse LPC)
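A pitch-delay search can be sketched as an open-loop correlation search on the LPC residual. The slides do not say how the delay is searched, so this is only one common approach; the function name and the lag range of 20 to 147 samples (typical at 8 kHz) are assumptions.

```python
import numpy as np

def find_pitch_delay(residual, min_lag=20, max_lag=147):
    """Return the lag whose delayed copy best predicts the residual."""
    residual = np.asarray(residual, dtype=float)
    best_lag, best_score = min_lag, -np.inf
    for lag in range(min_lag, max_lag + 1):
        cur = residual[lag:]      # samples being predicted
        past = residual[:-lag]    # the same samples, `lag` earlier
        energy = np.dot(past, past)
        if energy == 0.0:
            continue
        corr = np.dot(cur, past)
        # Normalized squared correlation; negative correlation is ignored
        score = corr * corr / energy if corr > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

if __name__ == "__main__":
    # Hypothetical residual: a pulse train with period 57 plus weak noise
    rng = np.random.default_rng(1)
    res = 0.01 * rng.standard_normal(800)
    res[::57] += 1.0
    print(find_pitch_delay(res))
```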
9
Fundamentals of CELP
The concept of CELP is to encode the random noise with a codebook. AbS (Analysis-by-Synthesis) coding and the PW (Perceptual Weighting) filter are two important procedures in CELP.
10
AbS Coding
Adjust the parameters to minimize the error signal.
The main disadvantage of AbS coding is its high computational cost.
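The AbS idea can be sketched as an exhaustive codebook search: every code vector is synthesized and compared with the target, which is also why the cost is high. This is a minimal sketch under assumptions; the random codebook, the first-order synthesis filter, and the function names are hypothetical, and perceptual weighting is left out.

```python
import numpy as np

def synthesize(excitation, a):
    """Pass the excitation through the all-pole filter 1/A(z), a[0] = 1."""
    out = np.zeros(len(excitation))
    for n in range(len(excitation)):
        acc = excitation[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * out[n - k]
        out[n] = acc
    return out

def abs_search(target, codebook, a):
    """Return (index, gain, error) of the code vector with the least squared error."""
    best = (None, 0.0, np.inf)
    for idx, code in enumerate(codebook):
        y = synthesize(code, a)               # synthesis step
        energy = np.dot(y, y)
        if energy == 0.0:
            continue
        gain = np.dot(target, y) / energy     # optimal gain in closed form
        err = np.sum((target - gain * y) ** 2)  # analysis step
        if err < best[2]:
            best = (idx, gain, err)
    return best

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    codebook = rng.standard_normal((64, 40))  # 64 hypothetical code vectors
    a = np.array([1.0, -0.8])                 # hypothetical LPC coefficients
    target = 2.5 * synthesize(codebook[17], a)
    print(abs_search(target, codebook, a))
```

Each candidate costs one full filtering pass, so the work grows with codebook size times frame length times filter order.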
11
Perceptual Weighting Filter (1)
Perceptual masking effect: where the signal energy is greater than the noise energy, we are not sensitive to the noise; where the signal energy is less than the noise energy, we are sensitive to it. The PW filter is set according to
12
Perceptual Weighting Filter (2)
13
Perceptual Weighting Filter (3)
14
Combination of LPC and PW
Because the denominator of the LP synthesis filter is equal to the numerator of the PW filter, the two filters can be combined into one to reduce computation.
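The saving can be checked numerically. The sketch below assumes the commonly used weighting filter W(z) = A(z) / A(z/γ), so that the cascade of 1/A(z) and W(z) collapses to the single filter 1/A(z/γ), whose coefficients are a[k]·γ^k. The filter form, γ = 0.8, and the coefficient values are assumptions for illustration.

```python
import numpy as np

def iir_filter(b, a, x):
    """Direct-form filter B(z)/A(z) with a[0] = 1 and zero initial state."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y[n] = acc
    return y

def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): a[k] * gamma**k."""
    a = np.asarray(a, dtype=float)
    return a * gamma ** np.arange(len(a))

if __name__ == "__main__":
    a = np.array([1.0, -1.2, 0.5])   # hypothetical, stable A(z)
    a_w = bandwidth_expand(a, 0.8)   # assumed gamma
    x = np.random.default_rng(3).standard_normal(160)
    two_filters = iir_filter(a, a_w, iir_filter([1.0], a, x))  # 1/A(z), then W(z)
    one_filter = iir_filter([1.0], a_w, x)                     # combined 1/A(z/gamma)
    print(np.max(np.abs(two_filters - one_filter)))
```

The printed difference should be at rounding-error level, while the combined form needs only one filtering pass per candidate.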
15
Steps of CELP (1)
1. Calculate the LPC coefficients.
2. Determine the prediction error (LP synthesis).
3. Search for the pitch delay.
4. Determine the prediction error (pitch predictor).
5. Search for the optimum code vector by AbS iteration.
6. Pack all the parameters and send them out.
17
Scalable CELP (1)
18
Scalable CELP (2)
19
Scalable CELP (3)
The bitrate scalable coder consists of a core coder and a bitrate scalable tool. Its disadvantage is that the decoding sequence is fixed.
20
Scalable CELP (4)
21
Performance of Scalable CELP (1)
The quality at different core bitrates:
High core bitrate (8 kbps)
Low core bitrate (3850 bps)
Why is the quality poor at a low core bitrate?
Error signal
Pulse position
22
Performance of Scalable CELP (2)
23
Performance of Scalable CELP (3)
24
Experiment of Compensation
Which areas need compensation?
Areas with larger error
The start of a syllable
What is a syllable? A syllable consists of a consonant and a vowel.
Detecting syllables is important to our proposed scalable structure.
25
The Proposed Scalable Structure (1)
We define a unit, the block, for syllable detection; a block is 1600 samples long (0.2 s at 8 kHz). The length of a compensation area is set to 200 samples. There are 400 bits available in each block. Fixed or variable bitrate?
26
The Proposed Scalable Structure (2)
There are 4 procedures in our approach:
Error Buffer
Syllable Detection and Classification
Transform and Quantization
Source Coding
27
Error Buffer
Because the block length and the frame length differ, we need a buffer to store the error signal and the source signal. Once 5 frames (source signal and error signal) have been collected, we perform syllable detection.
28
Syllable Detection and Classification (1)
[Figure: one block of the source signal and of the error signal, each split into sub-blocks 1 to 40 of 40 samples; a per-sub-block energy ratio (threshold > 0.1); the slope of the energy ratio.]
29
Syllable Detection and Classification (3)
[Figure: compensation areas within block n and block n+1.]
30
Syllable Detection and Classification (4)
31
Transform and Quantization
Concentration vs. Uniform
32
How do we decide?
Calculate the DCT coefficients. Divide the coefficients into 10 segments and calculate the energy of each segment. If the energy sum of any 4 segments divided by the total energy is greater than 0.8, the area is classified as concentrative; otherwise it is uniform.
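The decision rule can be sketched as follows. "Any 4 segments" exceeding the threshold is equivalent to the 4 highest-energy segments exceeding it, so only those are summed. The unnormalized DCT-II, the function names, and the two test signals are assumptions for illustration.

```python
import numpy as np

def dct2(x):
    """Unnormalized DCT-II computed directly from its definition."""
    x = np.asarray(x, dtype=float)
    n = np.arange(len(x))
    basis = np.cos(np.pi * (n[None, :] + 0.5) * n[:, None] / len(x))
    return basis @ x

def classify_area(area, n_segments=10, n_top=4, threshold=0.8):
    """Label a compensation area 'concentrative' or 'uniform'."""
    coeffs = dct2(area)
    seg_energy = np.array([np.sum(s ** 2)
                           for s in np.array_split(coeffs, n_segments)])
    total = seg_energy.sum()
    if total == 0.0:
        return "uniform"
    # Energy of the 4 strongest segments relative to the total energy
    top = np.sort(seg_energy)[-n_top:].sum()
    return "concentrative" if top / total > threshold else "uniform"

if __name__ == "__main__":
    n = np.arange(200)                            # 200-sample compensation area
    tone = np.cos(np.pi * (n + 0.5) * 7 / 200)    # energy in one DCT bin
    noise = np.random.default_rng(4).standard_normal(200)
    print(classify_area(tone), classify_area(noise))
```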
33
Quantization (1)
Three situations may occur:
1. All compensation areas are concentrative.
2. All compensation areas are uniform.
3. Both concentrative and uniform areas exist simultaneously.
When situation 1 or 2 occurs, the quantization step size is the sum of all the coefficients divided by 240.
34
Quantization (2)
When situation 3 occurs (for example, areas classified as concentrative, concentrative, and uniform), two quantization step sizes are used:
The sum of all the coefficients divided by 180.
The sum of all the coefficients divided by 60.
35
Source Coding (1)
DCT coef. | Symbol | Format | Length of symbol | Total bits
0, ±1 | 00 | 00 + (2 bits length) | 1 ~ 4 | n+1
 | | 01 + (4 bits length) | 5 ~ 20 |
 | | 10 + (8 bits length) | 21 ~ 276 |
 | | 11 + (16 bits length) | 277 ~ 65812 |
±2 | 01 | sign bit + (1 bit length) | 1 ~ 2 | 4
±3 | 10 | sign bit + (1 bit length) | 1 ~ 2 | 4
±4 ~ 35 | 11 | sign bit + (5 bits magnitude) | 1 | 8
36
Source Coding (2)
[Figure: coding example. The run 00000000000 (11 symbols) is coded as 00 01 1011; the run -2 -2 is coded as 01 1 1.]
37
Bitrate Control
We adjust the step size to control the number of bits.
The step size is changed by multiplying or dividing it by 0.98.
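The step-size adjustment can be sketched as a simple loop. The slides give only the factor 0.98, so the stopping rule, the function names, and the toy bit-cost model in the demo (8 bits per nonzero quantized coefficient) are assumptions and stand in for the real source coder.

```python
import numpy as np

def fit_step_size(coeffs, step, budget_bits, count_bits, factor=0.98, max_iter=2000):
    """Scale the step by 0.98 (or 1/0.98) until the coded size just fits the budget."""
    coeffs = np.asarray(coeffs, dtype=float)

    def bits_at(s):
        return count_bits(np.round(coeffs / s).astype(int))

    if bits_at(step) > budget_bits:
        # Too many bits: enlarge the step (divide by 0.98) until it fits
        for _ in range(max_iter):
            step /= factor
            if bits_at(step) <= budget_bits:
                break
    else:
        # Bits to spare: shrink the step (multiply by 0.98) while it still fits
        for _ in range(max_iter):
            trial = step * factor
            if bits_at(trial) > budget_bits:
                break
            step = trial
    return step

if __name__ == "__main__":
    def toy_cost(q):
        # Hypothetical cost model, not the symbol table of the slides
        return 8 * int(np.count_nonzero(q))

    coeffs = np.random.default_rng(5).normal(0.0, 10.0, 200)
    step = fit_step_size(coeffs, 1.0, 400, toy_cost)
    print(step, toy_cost(np.round(coeffs / step).astype(int)))
```

A factor close to 1 changes the step by about 2 percent per iteration, so the bit count approaches the budget in small increments at the cost of more iterations.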
38
Bitstream Formatting (1)
We must record some information in the header:
The number of compensation areas
The quantization step size
The start position of each compensation area
The range of each compensation area
39
Bitstream Formatting (2)
Header:
Number of compensation areas, n: 2 bits
Which step size is selected for each area: n bits
Step size: 8 or 16 bits
Start of every compensation area: 12, 10, or 6 bits
Range of every compensation area: 3, 6, or 9 bits
40
Bitstream Formatting (3)
Number of compensation areas | Total number of start-position arrangements | Bits needed to encode
3 | 4060 | 12
2 | 595 | 10
1 | 40 | 6
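The bit counts follow from the number of arrangements as ceil(log2(count)). The arrangement counts themselves are reproduced by the binomial coefficient C(40 - 5(n-1), n); this formula is an observation that matches the three table entries, not a derivation taken from the slides, and the parameter names are hypothetical.

```python
from math import ceil, comb, log2

def start_arrangements(n_areas, n_slots=40, shrink=5):
    """Binomial count that matches the table: C(40 - 5*(n-1), n)."""
    return comb(n_slots - shrink * (n_areas - 1), n_areas)

def bits_needed(count):
    """Smallest number of bits that can index `count` arrangements."""
    return ceil(log2(count))

if __name__ == "__main__":
    for n in (3, 2, 1):
        count = start_arrangements(n)
        print(n, count, bits_needed(count))
```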
41
Coding Delay
Encoding delay: 5 frames + look-ahead
Decoding delay: 5 frames
Applications: broadcast, recorder
42
How to obtain variable bitrate in ISO CELP? (1)
43
How to obtain variable bitrate in ISO CELP? (2)
If n frames must be adjusted and there are 5 frames in a block, turn on ceil(5/n) layers.
Number of frames to be corrected | Number of bitrate-enhancement bitstream layers turned on
2 3 4 5 1
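The layer rule is a one-line computation. The sketch below evaluates ceil(5/n) for every possible number of frames in a block; the function and constant names are hypothetical.

```python
from math import ceil

FRAMES_PER_BLOCK = 5

def layers_to_enable(n_frames):
    """Number of enhancement layers to turn on when n_frames need adjustment."""
    if not 1 <= n_frames <= FRAMES_PER_BLOCK:
        raise ValueError("n_frames must be between 1 and 5")
    return ceil(FRAMES_PER_BLOCK / n_frames)

if __name__ == "__main__":
    for n in range(1, FRAMES_PER_BLOCK + 1):
        print(n, layers_to_enable(n))
```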
44
Experiment Result (1)
We use CMOS (Comparison Mean Opinion Score) to test the performance of our approach.
Ref: the source signal
A: our approach
B: the comparison target
Fifteen listeners took part in the experiment.
45
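Experiment Result (2)
No. | Name | Description
1 | Spfc | Chinese, female
2 | Spfe | English, female
3 | Spff | French, female
4 | Spfg | German, female
5 | Spfj | Japanese, female
6 | Spmc | Chinese, male
7 | Spme | English, male
8 | Spmf | French, male
9 | Spmg | German, male
10 | Spmj | Japanese, male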
46
Experiment Result (3)
Comparison result | Score
A is better than B | +1
A is the same as B | 0
A is worse than B | -1
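With this scoring, the mean score of a sample is the number of +1 votes minus the number of -1 votes, divided by the number of listeners. A minimal sketch (the function name is hypothetical): 11 votes of +1 and 4 votes of 0 from 15 listeners give 11/15, about 0.73.

```python
def cmos_mean(n_better, n_same, n_worse):
    """Mean comparison score from the counts of +1, 0, and -1 votes."""
    total = n_better + n_same + n_worse
    if total == 0:
        raise ValueError("no votes")
    return (n_better - n_worse) / total

if __name__ == "__main__":
    print(round(cmos_mean(11, 4, 0), 2))
```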
47
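Experiment Result (4)
Experiment | Description of A | Description of B
Experiment 1 | New method, core at 3850 bps | Variable bitrate
Experiment 2 | New method, core at 3850 bps | High bitrate, 6300 bps
Experiment 3 | New method, core at 3850 bps | High bitrate, 8300 bps
Experiment 4 | New method, core at 6300 bps | Variable bitrate
Experiment 5 | New method, core at 6300 bps | High bitrate, 8300 bps
Experiment 6 | New method, core at 8300 bps | Variable bitrate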
48
A : Our approach, core bitrate = 3850 bps B : Variable bitrate @ core bitrate = 3850 bps
Sample no. | Variable enh-bitrate | Count of +1 | Count of 0 | Count of -1 | Mean score
1 2411 11 4 0.73 2 2437 13 0.87 3 2422 9 0.47 2407 6 8 0.33 5 2400 12 0.80 2433 7 2415 10 0.60 2410 0.67
49
A : Our approach, core bitrate = 3850 bps B : High bitrate, 6300 bps
Sample no. | Count of +1 | Count of 0 | Count of -1 | Mean score
1 4 5 6 -0.13 2 0.13 3 -0.07 7 0.33 8 -0.33 9 -0.27 10 0.20
50
A : Our approach, core bitrate = 3850 bps B : High bitrate, 8300 bps
Sample no. | Count of +1 | Count of 0 | Count of -1 | Mean score
1 2 6 7 -0.33 9 5 -0.27 3 10 -0.53 4 8 -0.13 -0.40
51
A : Our approach, core bitrate = 6300 bps B : Variable bitrate @ core bitrate = 6300 bps
Sample no. | Variable enh-bitrate | Count of +1 | Count of 0 | Count of -1 | Mean score
1 2494 2 8 5 -0.20 2403 6 4 0.07 3 2465 7 0.33 2460 9 -0.27 2408 0.20 2427 -0.07 2393 2402 2415 10 2419
52
A : Our approach, core bitrate = 6300 bps B : High bitrate, 8300 bps
Sample no. | Count of +1 | Count of 0 | Count of -1 | Mean score
1 4 10 -0.60 2 3 12 -0.80 9 -0.33 5 8 7 -0.47 6 -0.53 -0.13
53
A : Our approach, core bitrate = 8300 bps B : Variable bitrate @ core bitrate = 8300 bps
Sample no. | Variable enh-bitrate | Count of +1 | Count of 0 | Count of -1 | Mean score
1 2468 5 10 -0.67 2 2438 9 -0.53 3 2444 6 -0.60 4 2437 2410 8 -0.40 2475 7 -0.33 2443 2416 11 2470 -0.47 2400
54
Conclusion
Our approach is effective in low-bitrate situations.
The limit of our approach is a core bitrate of approximately 6 kbps.
Our approach is also applicable to other CELP standards.