PAC/AAC audio coding standard A. Moreno Georgia Institute of Technology ECE8873-Spring/2004
Overview Audio Recording; Coding: ultimate goal; AAC Encoder Block Diagram; Principles of Psychoacoustics; Perceptual Entropy; Quantization and Coding; Samples
Introduction "If a tree falls in the forest with no one around to hear it, does it make a sound?"
Audio Recording Edison, 1877
Audio Recording Philips, 1978 A/D Converter PCM
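Coding Ultimate Goal: reduce the number of bits needed to represent the data. Bitrate = F_sa x Wordlength, where F_sa is the sampling frequency and Wordlength is the number of bits per sample (per channel).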
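As a quick check of the bitrate formula, the sketch below computes the PCM rate of a 48 kHz stereo signal and its ratio to a 128 kbps coded stream (the rates quoted for the audio samples in this deck). The 16-bit word length and the explicit channel factor are assumptions, not taken from the slide.

```python
def pcm_bitrate(sample_rate_hz, wordlength_bits, channels=1):
    """Uncompressed PCM bitrate in bits per second."""
    return sample_rate_hz * wordlength_bits * channels

# Assumed: 16-bit samples, 2 channels (stereo).
pcm = pcm_bitrate(48_000, 16, channels=2)

# Ratio of the PCM rate to a 128 kbps coded stream.
ratio = pcm / 128_000

print(pcm, ratio)
```

Under these assumptions the coder has to remove roughly eleven out of every twelve bits, which is why the perceptual model matters.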
AAC Encoder Block Diagram [Block diagram. Input: s(n). Blocks: Perceptual Model; Gain Control; MDCT; TNS; Multi-Channel (M/S, Intensity); Prediction (z^-1); Scale Factor Extract; Quant; Entropy Coding; Iterative Rate Control Loop; Side information coding, Bitstream. Output: channel.]
Principles of Psychoacoustics Source localization. Two ears are necessary. The brain uses the intensity differences and the time delays between the two perceived signals.
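Principles of Psychoacoustics Absolute Hearing Threshold [Figure: threshold in quiet versus frequency; the region below the curve is inaudible, the region above it is audible.]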
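The shape of the absolute threshold curve can be sketched with the closed-form approximation commonly attributed to Terhardt and reproduced in Painter and Spanias [1]. The formula and the function name are not from the slide; treat this as an illustrative approximation for a young listener with acute hearing, not as the model used inside an AAC encoder.

```python
import math

def threshold_in_quiet_db_spl(f_hz):
    """Approximate absolute hearing threshold in dB SPL (Terhardt-style fit)."""
    khz = f_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)

for f in (100, 1000, 3300, 16000):
    print(f, round(threshold_in_quiet_db_spl(f), 1))
```

The curve dips to its minimum around 3 to 4 kHz, where the ear is most sensitive, and rises steeply toward both ends of the audible range.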
Principles of Psychoacoustics Human Ear Loudness Characteristic: Robinson and Dadson equal-loudness contours.
Principles of Psychoacoustics Critical Bands. Concept introduced by Harvey Fletcher. Frequency-to-place transform. The critical bandwidth is a function of frequency that quantifies the cochlear filter passbands. Example: the critical band centered at 1 kHz is about 160 Hz wide. A narrowband noise centered at 1 kHz (at constant sound pressure level) is perceived with the same loudness as long as its width stays below 160 Hz.
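The 160 Hz figure can be checked against the widely used Zwicker approximations for the Bark scale and the critical bandwidth, as given in Painter and Spanias [1]. These formulas and the function names are assumptions brought in for illustration; the slide itself does not say which fit it used.

```python
import math

def hz_to_bark(f_hz):
    """Frequency-to-place mapping: frequency in Hz to critical-band rate in Bark."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

def critical_bandwidth_hz(f_hz):
    """Approximate critical bandwidth in Hz around centre frequency f_hz."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69

print(round(hz_to_bark(1000.0), 2), round(critical_bandwidth_hz(1000.0), 1))
```

At 1 kHz the bandwidth comes out close to 160 Hz, in line with the example above, and the bandwidth grows with centre frequency.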
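Principles of Psychoacoustics Simultaneous Masking: Frequency [Figure: masking threshold versus frequency; components below the threshold are inaudible, components above it are audible.]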
Principles of Psychoacoustics Simplified Paradigms: Noise Masking Tone; Tone Masking Noise. [Figure: masking thresholds TH_N and TH_T, each drawn over a 1 Bark band; K = 3 dB ... 5 dB (constant).]
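The two paradigms are often reduced to simple offset rules, which the sketch below follows as stated in Painter and Spanias [1]: for a tone masking noise, TH_N = E_T - 14.5 - B, and for noise masking a tone, TH_T = E_N - K, with energies in dB and B the critical band number in Bark. Only K = 3 to 5 dB appears on the slide; the 14.5 + B rule, the default K = 4 dB, and the function names are assumptions taken from that survey or chosen for illustration.

```python
def noise_threshold_under_tone(e_tone_db, bark_band):
    """Tone masking noise: TH_N = E_T - 14.5 - B (dB)."""
    return e_tone_db - 14.5 - bark_band

def tone_threshold_under_noise(e_noise_db, k_db=4.0):
    """Noise masking tone: TH_T = E_N - K (dB), K typically 3 to 5 dB."""
    return e_noise_db - k_db

# An 80 dB masker in critical band 9 (roughly 1 kHz).
print(noise_threshold_under_tone(80.0, 9))   # tonal masker
print(tone_threshold_under_noise(80.0))      # noise masker
```

The asymmetry is the point: a noise masker hides a tone only a few dB below its own level, while a tonal masker hides noise only when the noise is far weaker.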
Principles of Psychoacoustics Spread of Masking [Figure: masking threshold th extending beyond the 1 Bark band of the masker.]
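One common analytical model for the spread of masking across critical bands is the Schroeder spreading function, quoted in Painter and Spanias [1]. The slide does not name a specific function, so the formula below is an assumption used to illustrate the shape; dz is the maskee position minus the masker position, in Bark.

```python
import math

def spreading_db(dz_bark):
    """Schroeder-style spreading function in dB; dz_bark = maskee - masker (Bark)."""
    x = dz_bark + 0.474
    return 15.81 + 7.5 * x - 17.5 * math.sqrt(1.0 + x * x)

for dz in (-2, -1, 0, 1, 2, 3):
    print(dz, round(spreading_db(dz), 1))
```

The function is close to 0 dB at the masker itself and falls off more slowly toward higher frequencies than toward lower ones, so a masker affects the bands above it more than the bands below it.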
Principles of Psychoacoustics Masking: Temporal
Perceptual Entropy Perceptual entropy is an objective metric of perceptually relevant information, introduced by J. Johnston. The perceived information from an audio signal is only a fraction of the total information emanating from the source.
Perceptual Entropy Procedure:
1. Window the signal and transform it to the frequency domain.
2. Compute the masking threshold using perceptual rules.
3. Determine the number of bits required to quantize the spectrum without injecting perceptible noise.
Perceptual Entropy [Block diagram. Input: s(n). Blocks: Hann Window; MDCT; Determine nature (noise-like or tone-like) using the Spectral Flatness Measure and the Coefficient of ‘Tonality’; Apply Thresholding rules (Offset). Output: JND Estimates.]
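The "determine nature" step can be sketched with the spectral flatness measure (SFM) and the coefficient of tonality as described in Painter and Spanias [1]: SFM is the ratio of the geometric to the arithmetic mean of the power spectrum, and alpha = min(SFM_dB / -60, 1). The -60 dB reference, the offset rule alpha * (14.5 + B) + (1 - alpha) * 5.5, and all names below are assumptions taken from that description, not from the slide. The lower clamp at 0 is only a guard against floating-point round-off.

```python
import math

def tonality_coefficient(power_spectrum, sfm_db_max=-60.0):
    """Coefficient of tonality: about 0 for noise-like, 1 for tone-like spectra.

    power_spectrum must contain strictly positive values.
    """
    n = len(power_spectrum)
    geometric = math.exp(sum(math.log(p) for p in power_spectrum) / n)
    arithmetic = sum(power_spectrum) / n
    sfm_db = 10.0 * math.log10(geometric / arithmetic)
    return max(0.0, min(sfm_db / sfm_db_max, 1.0))

def threshold_offset_db(alpha, bark_band):
    """Offset (dB) subtracted from the spread spectrum to obtain the threshold."""
    return alpha * (14.5 + bark_band) + (1.0 - alpha) * 5.5

flat = [1.0] * 8                    # noise-like: flat power spectrum
peaky = [1e6, 1e-3, 1e-3, 1e-3]     # tone-like: one dominant component
print(tonality_coefficient(flat), tonality_coefficient(peaky))
print(threshold_offset_db(0.0, 9), threshold_offset_db(1.0, 9))
```

A tone-like frame gets a larger offset, which lowers the masking threshold and therefore demands finer quantization than a noise-like frame at the same energy.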
Perceptual Entropy i: index of critical band; bl_i, bh_i: lower and upper bounds of band i; k_i: number of transform components in band i; T_i: masking threshold in band i; nint: rounding to the nearest integer.
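The symbols defined on this slide match the perceptual entropy expression attributed to Johnston as reproduced in Painter and Spanias [1]. The equation itself did not survive on the slide, so the form below is a reconstruction from that survey, assuming 25 critical bands and writing Re(ω) and Im(ω) for the real and imaginary parts of transform component ω:

```latex
\mathrm{PE} = \sum_{i=1}^{25} \sum_{\omega = bl_i}^{bh_i}
  \left[
    \log_2\!\left( 2 \left| \operatorname{nint}\!\left( \frac{\operatorname{Re}(\omega)}{\sqrt{6 T_i / k_i}} \right) \right| + 1 \right)
    +
    \log_2\!\left( 2 \left| \operatorname{nint}\!\left( \frac{\operatorname{Im}(\omega)}{\sqrt{6 T_i / k_i}} \right) \right| + 1 \right)
  \right]
  \quad \text{(bits per sample)}
```

Each term counts the bits needed for a uniform quantizer whose step size places the quantization noise in band i at the masking threshold T_i. Components that fall below the threshold round to zero and contribute no bits.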
Returning "If a tree falls in the forest with no one around to hear it, does it make a sound?" From a Perceptual Coding standpoint, if no one can hear it, THERE IS NO TREE.
AAC Encoder Block Diagram [Block diagram, repeated. Input: s(n). Blocks: Perceptual Model; Gain Control; MDCT; TNS; Multi-Channel (M/S, Intensity); Prediction (z^-1); Scale Factor Extract; Quant; Entropy Coding; Iterative Rate Control Loop; Side information coding, Bitstream. Output: channel.]
Quantization and Coding Power-law quantizer. Huffman coding (the table can be chosen). Global gain -> quantization step size. Scale factors -> noise shaping factor.
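A minimal sketch of a power-law quantizer of the kind used in AAC, assuming the usual 3/4 exponent, a step size of 2^(gain/4), and the rounding offset 0.4054 found in typical encoder descriptions. None of these constants appear on the slide, the gain index here is a simplified stand-in for global gain and scale factor combined, and bitstream offsets are omitted.

```python
import math

ROUNDING_OFFSET = 0.4054  # assumed encoder rounding constant, not from the slide

def quantize(x, gain_index):
    """Power-law quantization: compress |x| with exponent 3/4 after scaling."""
    step = 2.0 ** (gain_index / 4.0)   # larger gain index -> coarser step
    q = math.floor((abs(x) / step) ** 0.75 + ROUNDING_OFFSET)
    return int(math.copysign(q, x))

def dequantize(q, gain_index):
    """Inverse: expand |q| with exponent 4/3 and rescale."""
    step = 2.0 ** (gain_index / 4.0)
    return math.copysign(abs(q) ** (4.0 / 3.0) * step, q)

for gain in (0, 8):
    q = quantize(100.0, gain)
    print(gain, q, round(dequantize(q, gain), 2))
```

The 3/4 exponent makes the effective step grow with amplitude, so large coefficients carry proportionally more noise. Raising the gain index trades bits for noise across the whole frame, while per-band scale factors apply the same trade band by band.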
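Quantization and Coding Iterative rate/noise control (pseudocode):
    NOISE_CTL = 1;
    while NOISE_CTL
        % Inner (rate) loop: coarsen the step size until the frame fits the bit budget
        FINDING_RATE = 1;
        while FINDING_RATE
            Nr_bits = get_bits_needed();
            if (Nr_bits > max_bits)
                adjust_global_gain();
            else
                FINDING_RATE = 0;
            end
        end
        % Outer (noise) loop: compare the quantization noise with the masking threshold
        q_noise = get_quant_noise_level();
        if (q_noise > Th(band))
            adjust_band_scale_factor();
        else
            NOISE_CTL = 0;
        end
    end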
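The two nested loops can be turned into a small runnable toy. Everything in it is a stand-in chosen for illustration: the bit count is a log2-based estimate rather than real Huffman tables, the quantizer is a simplified power-law quantizer, and the stopping rules (all bands noisy, or an iteration cap) are assumptions added so the sketch always terminates.

```python
import math

def quantize(values, step):
    # Simplified power-law quantizer (3/4 exponent, assumed rounding offset).
    return [int(math.copysign(math.floor((abs(v) / step) ** 0.75 + 0.4054), v))
            for v in values]

def dequantize(indices, step):
    return [math.copysign(abs(q) ** (4.0 / 3.0) * step, q) for q in indices]

def bits_needed(quantized_bands):
    # Hypothetical stand-in for the Huffman bit count.
    return sum(math.log2(2 * abs(q) + 1) for band in quantized_bands for q in band)

def two_loop(bands, thresholds, max_bits, max_outer=60):
    gain = 0                  # global gain index: step = 2**(gain/4)
    sf = [0] * len(bands)     # per-band scale factor index: amplify by 2**(sf/4)
    for outer in range(max_outer):
        # Inner (rate) loop: raise the global gain until the frame fits.
        while True:
            step = 2.0 ** (gain / 4.0)
            amps = [2.0 ** (s / 4.0) for s in sf]
            quantized = [quantize([v * a for v in band], step)
                         for band, a in zip(bands, amps)]
            if bits_needed(quantized) <= max_bits:
                break
            gain += 1
        # Outer (noise) loop: find bands whose noise exceeds the threshold.
        noisy = []
        for b, band in enumerate(bands):
            rec = [r / amps[b] for r in dequantize(quantized[b], step)]
            noise = sum((x - y) ** 2 for x, y in zip(band, rec)) / len(band)
            if noise > thresholds[b]:
                noisy.append(b)
        if not noisy or len(noisy) == len(bands) or outer == max_outer - 1:
            break
        for b in noisy:
            sf[b] += 1        # amplify noisy bands so they are quantized more finely
    return gain, sf, quantized

bands = [[120.0, -80.0, 40.0], [6.0, -3.0, 2.0]]
gain, sf, quantized = two_loop(bands, thresholds=[50.0, 0.5], max_bits=20)
print(gain, sf, quantized)
```

Whatever the outcome of the noise loop, the returned coefficients always come straight from the rate loop, so the bit budget is respected even when the noise targets cannot all be met.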
Samples Castanets, Piano, Timpani. For each: original (48 kHz, stereo) versus 128 kbps AAC (stereo, 48 kHz).
References
[1] Ted Painter and Andreas Spanias. Perceptual coding of digital audio. Proceedings of the IEEE, 88(4), April
[2] Karlheinz Brandenburg. MP3 and AAC explained. AES 17th International Conference on High Quality Audio Coding,
[3] J.D. Johnston, A.J. Ferreira. Sum-Difference Stereo Transform Coding. Proc. ICASSP
[4] Deepen Sinha, James D. Johnston. Audio Compression at low bit rates using a Signal Adaptive switched Filterbank. Proc. of the ICASSP 1996, pp