1
ITU-T G.722.1 Annex C: A New Low-Complexity 14 kHz Audio Coding Standard
Minjie Xie, Dave Lindbergh, and Peter Chu
ICASSP 2006
2
G.722.1C: First ITU-T Super-wideband Audio Coding Standard
Audio bandwidth: 14 kHz
Sample rate: 32 kHz
Bit rate: 24, 32, and 48 kbit/s
Algorithm: Transform coding (Siren14™)
Frame size: 20 ms
Algorithmic delay: 40 ms
Complexity: <11 WMOPS (encoder + decoder)
Very high audio quality
Suitable for video conferencing, teleconferencing, and Internet streaming
Available on royalty-free licensing terms
3
Overview of the G.722.1 Main Mode
Wideband coding standard approved by ITU-T in 1998
Provides 7 kHz audio bandwidth at 24 and 32 kbit/s
Based on transform coding, using a Modulated Lapped Transform (MLT)
Operates on frames of 20 ms, corresponding to 320 samples at a 16 kHz sampling rate
Look-ahead of 20 ms due to the 50% overlap between frames
Total algorithmic delay of 40 ms
Very low computational complexity (about 5.3 WMOPS)
4
G.722.1C: Extension Mode of G.722.1
Audio signal sampled at 32 kHz
Doubles the audio bandwidth from 7 kHz to 14 kHz
Same algorithmic steps as the main mode of G.722.1
Same frame size as G.722.1: 20 ms
Total algorithmic delay of 40 ms
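The frame and delay figures on these slides follow from simple arithmetic. The sketch below only illustrates that arithmetic; the function names are hypothetical, and it assumes the algorithmic delay is one frame plus the look-ahead and that a bit rate in kbit/s times a frame length in ms gives bits per frame.

```python
# Illustrative arithmetic only; names are hypothetical, values come from the slides.
FRAME_MS = 20        # frame size shared by G.722.1 and G.722.1C
LOOKAHEAD_MS = 20    # look-ahead caused by the 50% frame overlap


def samples_per_frame(sample_rate_hz: int, frame_ms: int = FRAME_MS) -> int:
    """Number of input samples in one frame."""
    return sample_rate_hz * frame_ms // 1000


def bits_per_frame(bit_rate_kbps: int, frame_ms: int = FRAME_MS) -> int:
    """Bit budget of one frame: kbit/s multiplied by ms yields bits."""
    return bit_rate_kbps * frame_ms


def algorithmic_delay_ms(frame_ms: int = FRAME_MS,
                         lookahead_ms: int = LOOKAHEAD_MS) -> int:
    """Assumed here to be one frame of buffering plus the look-ahead."""
    return frame_ms + lookahead_ms


if __name__ == "__main__":
    print(samples_per_frame(16_000))   # main mode: 320
    print(samples_per_frame(32_000))   # extension mode: 640
    print([bits_per_frame(r) for r in (24, 32, 48)])  # [480, 640, 960]
    print(algorithmic_delay_ms())      # 40
```

Doubling the sample rate at a fixed 20 ms frame doubles the samples per frame while leaving the delay unchanged, which matches the 40 ms quoted for both modes.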
5
Block Diagram of the G.722.1C Encoder
6
Block Diagram of the G.722.1C Decoder
7
Encoder of G.722.1 Annex C
Double the MLT transform length from 320 to 640 samples
Double the number of frequency regions from 14 to 28
Double the Huffman coding tables for encoding quantized region power indices
Double the threshold for adjusting the number of available bits from 320 to 640
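The doubling of regions and transform length can be checked against the stated audio bandwidths. This is a rough sketch with hypothetical names. It assumes 20 MLT coefficients per region, a figure taken from the G.722.1 Recommendation rather than from these slides, and it assumes the coefficients are spread evenly from 0 Hz to half the sample rate.

```python
# Assumption (not stated on the slides): each region holds 20 MLT coefficients.
COEFS_PER_REGION = 20


def coefficient_spacing_hz(sample_rate_hz: int, transform_length: int) -> float:
    """Frequency spacing of MLT coefficients spread evenly over 0..fs/2."""
    return (sample_rate_hz / 2) / transform_length


def coded_bandwidth_hz(sample_rate_hz: int, transform_length: int,
                       num_regions: int) -> float:
    """Bandwidth covered by the coded regions."""
    spacing = coefficient_spacing_hz(sample_rate_hz, transform_length)
    return num_regions * COEFS_PER_REGION * spacing


if __name__ == "__main__":
    # Main mode: 16 kHz sampling, 320 coefficients, 14 regions.
    print(coded_bandwidth_hz(16_000, 320, 14))   # 7000.0
    # Annex C: 32 kHz sampling, 640 coefficients, 28 regions.
    print(coded_bandwidth_hz(32_000, 640, 28))   # 14000.0
```

Under these assumptions the coefficient spacing stays at 25 Hz in both modes, so doubling the region count is what doubles the coded bandwidth.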
8
Decoder of G.722.1 Annex C
Double the number of frequency regions from 14 to 28
Double the threshold for adjusting the number of available bits from 320 to 640
Extend the centroid table for reconstruction of MLT coefficients
Double the IMLT transform length from 320 to 640 samples
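To show how an MLT/IMLT pair with 50% overlap fits together, here is a minimal textbook sketch using a sine window and overlap-add. It is not the bit-exact transform defined in G.722.1. The function names, the scaling, and the phase term are assumptions chosen so that the round trip reconstructs the input, and the direct O(N²) summation is only for readability.

```python
import math


def sine_window(n: int) -> list:
    """Length-2n sine window; it satisfies w[i]**2 + w[i + n]**2 == 1."""
    return [math.sin(math.pi * (i + 0.5) / (2 * n)) for i in range(2 * n)]


def _basis(i: int, k: int, n: int) -> float:
    """Cosine basis shared by the forward and inverse transforms."""
    return math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))


def mlt(block: list, n: int) -> list:
    """Forward transform: 2n windowed input samples -> n coefficients."""
    w = sine_window(n)
    scale = math.sqrt(2.0 / n)
    return [scale * sum(w[i] * block[i] * _basis(i, k, n) for i in range(2 * n))
            for k in range(n)]


def imlt(coefs: list, n: int) -> list:
    """Inverse transform: n coefficients -> 2n windowed samples for overlap-add."""
    w = sine_window(n)
    scale = math.sqrt(2.0 / n)
    return [scale * w[i] * sum(coefs[k] * _basis(i, k, n) for k in range(n))
            for i in range(2 * n)]


def roundtrip(signal: list, n: int) -> list:
    """Analyze and resynthesize a signal whose length is a multiple of n."""
    padded = [0.0] * n + list(signal) + [0.0] * n
    out = [0.0] * len(padded)
    # Blocks of 2n samples advance by n samples, giving 50% overlap.
    for start in range(0, len(padded) - n, n):
        rec = imlt(mlt(padded[start:start + 2 * n], n), n)
        for i, value in enumerate(rec):
            out[start + i] += value   # overlap-add cancels the time aliasing
    return out[n:-n]
```

In this sketch `n` plays the role of the transform length quoted on the slides (320 for the main mode, 640 for Annex C): each block spans two frames, which is where the one-frame look-ahead comes from.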
9
Computational Complexity and Memory Requirements of G.722.1C
Bit rate (kbit/s)   Encoder (WMOPS)   Decoder (WMOPS)   Enc.+Dec. (WMOPS)
24                  4.5               5.3               9.7
32                  4.8               5.5               10.3
48                  5.1               5.9               10.9

Memory requirements
RAM (K bytes)       18
ROM (K bytes)       30
10
Computational Complexity of G.722.1C versus the 3GPP Audio Codecs
Bit rate (kbit/s)   G.722.1C (WMOPS)   eAAC+ (WMOPS)   AMR-WB+ (WMOPS)
24                  9.7                40.8            80.1
32                  10.3               42.6            86.7
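The gap in the comparison above is easier to read as a ratio. The short sketch below uses only the WMOPS figures from this slide; the dictionary layout and the function name are hypothetical.

```python
# Encoder + decoder complexity in WMOPS, copied from the slide.
WMOPS = {
    24: {"G.722.1C": 9.7, "eAAC+": 40.8, "AMR-WB+": 80.1},
    32: {"G.722.1C": 10.3, "eAAC+": 42.6, "AMR-WB+": 86.7},
}


def complexity_ratio(bit_rate_kbps: int, codec: str,
                     baseline: str = "G.722.1C") -> float:
    """How many times more WMOPS `codec` needs than `baseline` at one bit rate."""
    row = WMOPS[bit_rate_kbps]
    return row[codec] / row[baseline]


if __name__ == "__main__":
    for rate in sorted(WMOPS):
        for codec in ("eAAC+", "AMR-WB+"):
            print(f"{rate} kbit/s: {codec} / G.722.1C = "
                  f"{complexity_ratio(rate, codec):.1f}x")
```

At both bit rates the ratios come out at roughly 4x for eAAC+ and roughly 8x for AMR-WB+.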
11
Algorithmic Delay of G.722.1C versus the 3GPP Audio Codecs
Codec       Algorithmic delay (ms)
G.722.1C    40.0
eAAC+       129.9 [1]
AMR-WB+     113.8 [2]

Note 1: Without bit-reservoir (see 3GPP TR V6.1.0)
Note 2: ISF = 25.6 kHz (see 3GPP TR V6.1.0)
12
ITU-T Subjective Characterization Tests
Subjective tests performed by France Telecom according to a test plan designed by ITU-T SG12 SQEG
Characterization test Phase 1: Speech (ACR for clean speech and DCR for noisy speech)
Characterization test Phase 2: Music and mixed content (MUSHRA method)
Reference codec: MPEG-4 AAC-LD PCEnc/DecPro
Additional reference codecs: 3GPP eAAC+ and AMR-WB+
Requirement: Not worse than the reference codec for a 99% confidence interval
13
ITU-T Subjective Test Results (Phase 1)
[Figure: Phase 1 results chart, scores in MOS]
14
ITU-T Subjective Test Results (Phase 1)
[Figure: Phase 1 results chart, scores in DMOS]
15
ITU-T Subjective Test Results (Phase 1)
[Figure: Phase 1 results chart, scores in DMOS]
16
ITU-T Subjective Test Results (Phase 1)
[Figure: Phase 1 results chart, scores in DMOS]
17
ITU-T Subjective Test Results (Phase 2)
[Figure: Phase 2 results chart, MUSHRA scores]
18
ITU-T Subjective Test Results (Phase 2)
[Figure: Phase 2 results chart, MUSHRA scores]
19
ITU-T Subjective Test Results (Phase 2)
[Figure: Phase 2 results chart, MUSHRA scores]
20
Conclusion
G.722.1C met all performance requirements
Phase 1 (clean and noisy speech)
- 24 kbit/s: Better than AAC-LD and Not Worse than eAAC+
- 32 kbit/s: Better than AAC-LD, Not Worse than eAAC+, and Not Worse than AMR-WB+ in most tests
- 48 kbit/s: Not Worse than AAC-LD at 48 and 64 kbit/s
Phase 2 (music and mixed content)
- 24 kbit/s: Better than AAC-LD
- 32 kbit/s: Better than AAC-LD
- 48 kbit/s: Better than AAC-LD at 48 and 64 kbit/s
Executables, audio samples, and more information available at:
21
Acknowledgment
The authors would like to acknowledge Claude Lamblin, ITU-T Q.10/SG16 Rapporteur, and Catherine Quinquis, ITU-T Q.7/SG12 Rapporteur, for their great work guiding this project to completion. In addition, the authors would like to thank the speech quality experts and staff who performed the subjective characterization tests at France Telecom.