1
III Digital Audio III.7 (W Nov 04) The MP3 frame format
2
The MP3 encoder chain
[Figure: block diagram of the MP3 encoder with the blocks Audio Data, Filter Bank (32 Subbands), Psychoacoustical Model, Quantization and Encoding (Check of Quantization loop), External Check, Encoding of Additional Information, Datastream Formatting to Frames, Additional Data, and Data Stream, plus the label "2*16 to Line".]
1. Digital Datastream
2. FFT with Filter Bank
3. Psychoacoustical Model (Perceptual-Audio-Coding Model PAC)
4. Quantization
5. Huffman Compression
6. Frame Outputstream Formatting
3
6. Frame Outputstream Formatting
[Figure: the encoder block diagram shown again, here for step 6, Frame Outputstream Formatting.]
4
MP3 file Identifier = ID3 Tag
At the end of the MP3 file, we often find a 128-byte identifier (ID3 tag). It is not part of the official standard, but it appears very often:
Bytes   Content
3       the characters "TAG" = identification as ID3 tag
30      title of piece
30      name of performer(s)
30      name of album
4       year of publication
30      comment
1       genre identification
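The byte layout of the tag can be sketched as a small parser. This is a minimal illustration, not a full ID3 reader: the function name parse_id3v1, the Latin-1 text decoding, and the zero-byte padding of text fields are assumptions that go beyond the slide.

```python
import struct

def parse_id3v1(tag: bytes) -> dict:
    """Split a 128-byte ID3 block into its seven fields (hypothetical helper)."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        raise ValueError("not a 128-byte ID3 tag")
    # 3 + 30 + 30 + 30 + 4 + 30 + 1 = 128 bytes; "<" disables alignment padding.
    _, title, artist, album, year, comment, genre = struct.unpack(
        "<3s30s30s30s4s30sB", tag
    )

    def text(raw: bytes) -> str:
        # Assumption: text fields are padded with zero bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre": genre,  # one-byte genre identification
    }

# Build an example tag in memory instead of reading a real file.
example = (
    b"TAG"
    + b"Song".ljust(30, b"\x00")
    + b"Artist".ljust(30, b"\x00")
    + b"Album".ljust(30, b"\x00")
    + b"2004"
    + b"".ljust(30, b"\x00")
    + bytes([17])
)
info = parse_id3v1(example)
print(info)
```

To use it on a real file, one would pass in the 128 bytes that hold the tag.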
5
Frame Outputstream Formatting
The MP3 format, whether used for streaming or for storage, is built from units called frames. A frame is an autonomous information package: every frame carries all the encoding data needed, so a file can be played from any given time onset. A frame's duration is 1/38.28 ≈ 1/40 sec, which enables virtually continuous playing for humans.
Each frame has these parts:
- a 32-bit header indicating the layer number (1-3), the bitrate, and the sample frequency;
- the Cyclic Redundancy Check (CRC) with 16 bits for error detection (without correction option; a frame is repeated until a correct frame appears);
- 12 bits of additional information for Huffman trees and quantization info;
- a main data sample block of 3344 bits for Huffman-encoded data.
6
32 bit Frame Header
Position  Task                                              Length in bits
A         Frame-SYNC (for playing and "jumping around")     11
B         MPEG Audio Version (MPEG-1, -2, etc.)             2
C         MPEG Layer (Layer I, II, III, etc.)               2
D         Protection                                        1
E         Bitrate Index                                     4
F         Sampling Frequency (e.g. 44.1 kHz)                2
G         Padding bit (compensates incomplete allocation)   1
H         Private bit (application-specific trigger)        1
I         Channel mode (Stereo, Joint Stereo)               2
J         Mode Extension (for Joint Stereo)                 2
K         Copyright                                         1
L         Original ("0" if copy, "1" if original)           1
M         Emphasis (outdated)                               2
Important: you can only play entire frames!
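The header fields A to M can be unpacked from the 32 bits with shifts and masks. The sketch below is an illustration under assumptions: the names FIELDS and parse_frame_header are hypothetical, the field widths are taken to sum to 32 bits in the order A to M, and the sample bytes FF FB 90 64 are just an example value. It returns the raw field values without translating indices into bitrates or frequencies.

```python
# (name, width in bits) in header order A..M; widths sum to 32.
FIELDS = [
    ("sync", 11),
    ("version", 2),
    ("layer", 2),
    ("protection", 1),
    ("bitrate_index", 4),
    ("sampling_frequency", 2),
    ("padding", 1),
    ("private", 1),
    ("channel_mode", 2),
    ("mode_extension", 2),
    ("copyright", 1),
    ("original", 1),
    ("emphasis", 2),
]

def parse_frame_header(header: bytes) -> dict:
    """Split a 4-byte frame header into raw integer fields (hypothetical helper)."""
    if len(header) != 4:
        raise ValueError("a frame header is exactly 4 bytes")
    word = int.from_bytes(header, "big")
    fields = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width  # move down to the lowest bit of this field
        fields[name] = (word >> shift) & ((1 << width) - 1)
    if fields["sync"] != 0x7FF:
        # The frame sync is 11 bits that are all set to 1.
        raise ValueError("frame sync not found")
    return fields

hdr = parse_frame_header(bytes.fromhex("fffb9064"))
print(hdr)
```

A player that "jumps around" in a file could scan for the 11 sync bits and then call such a parser on the four bytes found there.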
7
Frame Sequence with reservoir technique
[Figure: frame sequence for blocks 1 to 5. Each frame consists of a Header/Add. info part followed by 3344 bits that hold main data. The main data for a block can start in the bit reservoir of earlier frames; the bits in reservoir for block 1 = 0.]
8
Important formulas relating to frame capacities
Fixed data:
# frames/sec = 38.28
maximal audio data capacity per frame = 3,344 bit/frame
# frequency bands = 32
Recall that time-samples ~ frequency-samples.
First formula (maximal bitrate): 3,344 bit/frame × 38.28 frame/sec ≈ 128 kbit/sec, which guarantees CD quality.
Second formula (frequency samples per frame): 44,100 time-samples/sec / 38.28 frame/sec = 1152 frequency-samples/frame, which guarantees CD quality. This yields 1152/32 = 36 frequency-samples/band.
Observe: 625 Hz/band / … Hz = … frequ.-samples/band; we have overlapping info, but this is ok to minimize measurement errors.
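The two formulas can be checked with a few lines of arithmetic. This is a sketch under one assumption: the frame rate is derived here as 44,100 / 1152 = 38.28125 frames/sec, which matches the other figures on the slide. The function name frame_capacities is hypothetical.

```python
SAMPLE_RATE = 44_100        # time-samples per second
SAMPLES_PER_FRAME = 1152    # frequency-samples per frame
BITS_PER_FRAME = 3344       # maximal audio data capacity per frame
BANDS = 32                  # number of frequency bands

def frame_capacities() -> dict:
    """Recompute the slide's frame figures (hypothetical helper)."""
    frames_per_sec = SAMPLE_RATE / SAMPLES_PER_FRAME
    return {
        "frames_per_sec": frames_per_sec,
        "frame_duration_ms": 1000 / frames_per_sec,
        "max_bitrate_kbit": BITS_PER_FRAME * frames_per_sec / 1000,
        "samples_per_band": SAMPLES_PER_FRAME // BANDS,
    }

caps = frame_capacities()
for name, value in caps.items():
    print(f"{name}: {value:.2f}")
```

The frame duration comes out at roughly 26 ms, which is the "about 1/40 sec" quoted for a frame.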
9
Some performance values
MPEG procedure     compression   quality            bitrate (kbit/sec)   bandwidth (kHz)   mode
MPEG-1 layer-3     14:1-12:1     CD                 128                  >15               stereo
MPEG-1 layer-3     16:1          Approximately CD   96-112               15                stereo
MPEG-2 layer-3     16:1-24:1     Radio quality      56-64                11                stereo
MPEG-2 layer-3     24:1          Speech             32                   7.5               mono
MPEG-2 layer-3     48:1          Shortwave radio    16                   4.5               mono
MPEG-2.5 layer-3   96:1          Telephone          8                    2.5               mono
Input bitrate (2×768 kbit/sec) / output bitrate (128 kbit/sec) = 12
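The compression column can be reproduced from the closing formula. The sketch assumes that the input figure of 768 kbit/sec counts per channel, so that mono rows use 1×768 and stereo rows 2×768; the function name compression_ratio is hypothetical.

```python
PCM_KBIT_PER_CHANNEL = 768  # input bitrate per channel, from "2×768"

def compression_ratio(output_kbit: float, channels: int) -> float:
    """Input bitrate divided by output bitrate (hypothetical helper)."""
    return channels * PCM_KBIT_PER_CHANNEL / output_kbit

# (output bitrate in kbit/sec, number of channels) for some table rows
rows = [(128, 2), (96, 2), (64, 2), (32, 1), (16, 1), (8, 1)]
ratios = [compression_ratio(kbit, ch) for kbit, ch in rows]
for (kbit, ch), ratio in zip(rows, ratios):
    print(f"{kbit:>3} kbit/sec, {ch} channel(s): {ratio:.0f}:1")
```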
10
Remarks on Joint Stereo Coding
MP3 implements the Joint Stereo Coding compression method, which is based on these two principles:
- Mid/Side Stereo Coding (MSSC): instead of the left and right channels (L, R) we take the equivalent data (L+R, L-R). This exploits the fact that L and R are usually strongly correlated, so the difference is quite "tame".
- Intensity Stereo Coding (ISC): the sum L+R and the direction of the signal are encoded (replacing the L-R information). This coding method also uses the fact that the human ear is weak at localizing low frequencies. Since the direction is detected by phase differences, which are difficult to retrieve for low frequencies, these frequencies are encoded mono.
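The mid/side idea can be illustrated on a few integer samples. This is a sketch of the (L+R, L-R) transform as stated on the slide, not of the exact scaling an MP3 encoder applies; the function names ms_encode and ms_decode and the sample values are made up for the illustration.

```python
def ms_encode(left, right):
    """Replace (L, R) by the equivalent pair (L+R, L-R)."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Recover (L, R): L = (M+S)/2 and R = (M-S)/2."""
    # M+S = 2L and M-S = 2R are always even, so integer division is exact.
    left = [(m + s) // 2 for m, s in zip(mid, side)]
    right = [(m - s) // 2 for m, s in zip(mid, side)]
    return left, right

# Strongly correlated channels: the side signal stays close to zero.
left = [100, 102, 98, 101]
right = [99, 101, 99, 100]
mid, side = ms_encode(left, right)
print(mid)   # sum channel
print(side)  # difference channel, small values
```

Because the side values are small for correlated channels, they need fewer bits than a full second channel, and the transform loses nothing: decoding returns the original samples.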
11
Legal aspects
The license rights of Fraunhofer IIS are represented by the French company Technicolor SA, formerly Thomson Multimedia. Here are the figures:
- 0.50 USD per decoder
- 5.- USD per encoder
- 15,000.- USD annual lump sum
This means that an enterprise that sells a total of 25,000 copies of the encoder software per year pays 25,000 × 5.- + 15,000.- = 140,000.- USD for the first year and then 15,000.- USD in annual fees for every successive year.