Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu
What is JPEG? JPEG is a method for compressing image data so it takes less space to store or transmit across a network. JPEG is very efficient. A file that was 1 MB in size could be compressed to as little as 25 KB (1:40)! JPEG achieves such good compression ratios because it is lossy, but at reasonable quality settings the loss is barely perceptible.
Overview Images contain different frequencies; low frequencies correspond to slowly varying colors, high frequencies correspond to fine detail. The low frequencies are much more important than the high frequencies, so we can throw away some high frequencies to compress our data!
Overview Note that we aren’t talking about the frequencies of light, but of the light and dark areas in the image! We need a way to go from the color of pixels, which is essentially a number, to frequencies… This way is called the Discrete Cosine Transform (DCT).
A JPEG Encoder [Diagram: DCT → Quantizer → Entropy Encoder]
The Discrete Cosine Transform [Figure: a 16-pixel line drawn as a weighted sum of 16 cosine waves: line = x1 × wave 1 + x2 × wave 2 + … + x15 × wave 15 + x16 × wave 16]
The 2D DCT So far we’ve been talking about one-dimensional images, just one line of the picture… but an image has two dimensions. We can talk about frequencies in two dimensions, although it’s much harder to visualize.
Basis Remember we saw that every 16-pixel line can be written as the sum of 16 different waves? Those 16 waves formed a basis for the set of 16-pixel lines.
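To make the "sum of 16 waves" idea concrete, here is a minimal sketch in Python. It assumes the orthonormal DCT-II as the set of waves (the slides do not say which scaling they use), and the names `basis_wave`, `dct_1d` and `rebuild` as well as the sample pixel values are made up for illustration.

```python
import math

N = 16  # length of one image line, as in the slides

def basis_wave(k, n=N):
    """Return the k-th cosine wave (orthonormal DCT-II basis vector) of length n."""
    scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    return [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]

def dct_1d(line):
    """Compute the weights: how much of each wave is present in the line."""
    n = len(line)
    return [sum(p * w for p, w in zip(line, basis_wave(k, n))) for k in range(n)]

def rebuild(weights):
    """Add up the weighted waves to recover the pixel line."""
    n = len(weights)
    line = [0.0] * n
    for k, x in enumerate(weights):
        wave = basis_wave(k, n)
        for i in range(n):
            line[i] += x * wave[i]
    return line

# Hypothetical 16-pixel line (grey levels), not taken from the lecture.
line = [52, 55, 61, 66, 70, 61, 64, 73, 63, 59, 55, 90, 109, 85, 69, 72]
weights = dct_1d(line)       # 16 weights, one per wave
restored = rebuild(weights)  # matches `line` up to floating-point error
```

Because the 16 waves form a basis, the 16 weights carry exactly the same information as the 16 pixels; nothing has been lost or compressed yet.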
Basis When we are compressing a JPEG, we work in blocks of 8x8 pixels. That’s 64 numbers, so there are 64 different basis images. This means we can describe any 8x8 image as a combination (a sum) of those 64 images.
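Cutting an image into 8x8 blocks could look roughly like the sketch below. The function name `split_into_blocks` is hypothetical, the image is a plain list of rows of grey levels, and repeating the edge pixels to fill incomplete blocks is an assumption made here, not something the slides specify.

```python
def split_into_blocks(image, size=8):
    """Split a 2D list of pixel values into size x size blocks, row by row.

    Blocks that hang over the right or bottom edge are filled by repeating
    the nearest edge pixel (an assumption made for this sketch).
    """
    height, width = len(image), len(image[0])
    blocks = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            block = []
            for y in range(top, top + size):
                row = image[min(y, height - 1)]
                block.append([row[min(x, width - 1)] for x in range(left, left + size)])
            blocks.append(block)
    return blocks

# Hypothetical 20x10 test image: a simple diagonal gradient.
image = [[(x + y) % 256 for x in range(20)] for y in range(10)]
blocks = split_into_blocks(image)  # 3 blocks across, 2 blocks down
```

Each block then holds 64 numbers and is processed on its own.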
Basis
The 2D DCT
Summary The Discrete Cosine Transform (DCT) allows us to determine what frequencies make up an image. Into this stage we have 8x8 numbers that are the values of each pixel. Out of this stage we have 8x8 numbers that represent how much of each frequency (or how much of each basis) is in the image.
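This stage can be sketched in a few lines of Python: 8x8 pixel values go in, 8x8 frequency weights come out. The sketch assumes the orthonormal 2D DCT-II, computed by applying the one-dimensional transform to the columns and then to the rows; the helper names are illustrative only.

```python
import math

def dct_matrix(n=8):
    """Rows are the n one-dimensional cosine waves (orthonormal DCT-II)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(r) for r in zip(*m)]

def dct_2d(block):
    """Pixels in, frequency weights out: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))

def idct_2d(coeffs):
    """Frequency weights in, pixels out: C^T * coeffs * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

# A completely flat block contains only the lowest frequency.
flat_block = [[100] * 8 for _ in range(8)]
flat_coeffs = dct_2d(flat_block)

# A block with some variation survives a round trip unchanged.
ramp_block = [[10 * x + y for x in range(8)] for y in range(8)]
ramp_back = idct_2d(dct_2d(ramp_block))
```

For the flat block, only the top-left weight (the lowest frequency) is non-zero; every other weight is zero up to floating-point error.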
A JPEG Encoder [Diagram: DCT → Quantizer → Entropy Encoder]
Quantization So we still have 64 numbers to work with - we haven’t reduced the size at all! The reason we wanted the numbers as frequencies was because some frequencies are more important than others. The low frequencies are the most important, the high frequencies are not very important (think back to building up the image).
Quantization Before quantization, each frequency value covers a wide range, say 0 to 255 in this simplified picture (real DCT values can also be negative). To quantize, we divide each frequency by a number and round the result to a whole number, so that the range is reduced. For example, dividing by 8 shrinks the range to roughly 0 to 31. For high frequencies we divide by a higher number.
Quantization Before we had, say: 134, 113, 145, 117, 32, 11, 17, 5, … 4. After quantization, we might have: 116, 55, 55, 30, 1, 0, 0, … 0.
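The divide-and-round step can be sketched as follows. The table of divisors built by `make_steps` is invented for illustration (it simply grows with frequency) and is not the table used by real JPEG encoders; all function names are hypothetical.

```python
def make_steps(n=8, base=8, growth=4):
    """Made-up divisor table: small for low frequencies, larger for high ones."""
    return [[base + growth * (u + v) for v in range(n)] for u in range(n)]

def quantize(coeffs, steps):
    """Divide each frequency weight by its divisor and round to an integer."""
    return [[round(c / s) for c, s in zip(crow, srow)]
            for crow, srow in zip(coeffs, steps)]

def dequantize(quantized, steps):
    """Multiply back; the rounding error is the information thrown away."""
    return [[q * s for q, s in zip(qrow, srow)]
            for qrow, srow in zip(quantized, steps)]

steps = make_steps()

# Hypothetical weights: two low frequencies and one high frequency.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0] = 134.0   # lowest frequency, divided by 8
coeffs[0][1] = 113.0   # next frequency, divided by 12
coeffs[7][7] = 17.0    # highest frequency, divided by 64

quantized = quantize(coeffs, steps)
approx = dequantize(quantized, steps)
```

In this sketch the two low frequencies come back as 136 and 108, close to the originals, while the high-frequency weight of 17 is quantized to 0 and is lost entirely.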
Quantization
Quantization [Figure: Original (300 kB) next to Medium Quality JPEG (75 kB)]
Quantization [Figure: Original (300 kB) next to Low Quality JPEG (35 kB)]
Summary The degree of quantization dictates the amount of information “thrown away”. If you throw away more information, you will get better compression, but the picture will start to look bad. When you adjust the quality of a JPEG save from Photoshop, you are changing the quantization!
A JPEG Encoder [Diagram: DCT → Quantizer → Entropy Encoder]
Entropy Encoding Entropy encoding is another stage of compression that relies on statistical properties of the data, e.g. some numbers occurring much more frequently than others, or lots of the same number in a row. So we take the 64 numbers, do Run Length Encoding, then follow that with Huffman Coding! (Remember yesterday?)
Entropy Encoding These compression schemes now work very well, because quantization turns numbers like 132, 117, 78 into numbers more like 31, 31, 15. After quantization, the range of numbers is smaller, and there are often long runs of the same number (especially zeros), so the data can be highly compressed! This is where the actual reduction in size happens!
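A simplified sketch of the two steps, Run Length Encoding followed by Huffman Coding, is shown below. Real JPEG files use a more specific scheme for ordering and coding the 64 numbers; this version only illustrates the idea, and the function names and sample numbers are made up.

```python
import heapq
from collections import Counter
from itertools import groupby

def run_length_encode(values):
    """Turn a list into (value, run length) pairs."""
    return [(v, len(list(group))) for v, group in groupby(values)]

def huffman_code(symbols):
    """Build a prefix-free bit string for each symbol; frequent symbols get short codes."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tie breaker, {symbol: code built so far}).
    heap = [(c, i, {s: ""}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)   # two least frequent subtrees
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

# Hypothetical quantized numbers: a few values, then a long run of zeros.
quantized = [31, 31, 15, 4, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
runs = run_length_encode(quantized)
codes = huffman_code(runs)
bits = "".join(codes[r] for r in runs)
```

Here 16 numbers collapse into 5 runs, and the long run of zeros costs no more than any other run. The code table itself would also have to be stored or agreed on in advance, which this sketch ignores.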
Summary [Diagram: DCT → Quantizer → Entropy Encoder]
Summary We break up the image into 8x8 blocks. We calculate the frequencies in each block, which allows us to identify the important and less important data. We throw away some of the less important data. We compress the resulting data. The result: about 1:40 compression!