JPEG
Compresses real (photographic) images. Standard set by the Joint Photographic Experts Group in 1991.
Stages of JPEG
We will consider monochrome (black and white) images. Colour images are treated similarly, but the colour information is extracted and processed separately using a reduced number of samples. The whole image is divided into 8 x 8 pixel blocks.
Stages of JPEG
1. Transform each 8 x 8 pixel block into an 8 x 8 block of discrete cosine transform (DCT) co-efficients.
2. Quantise the DCT co-efficients (values), giving finer quantisation to the low frequency values.
3. Run-length code the zero values.
4. Apply Huffman coding to the run-length coded sequence.
Take 8 x 8 blocks and produce the 2 dimensional (8 x 8) DCT
The image is split up into 8 x 8 pixel blocks, each giving 64 values. The following (scary) formula is applied to each block, which gives an 8 x 8 DCT block.
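The formula itself is not reproduced in this text. For reference, the forward 8 x 8 DCT as it is usually written for JPEG is given below. The symbols are the conventional ones and are assumptions here, not taken from the slide: f(x, y) is a pixel value, F(u, v) is a DCT co-efficient and C(.) is a scaling factor.

```latex
F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y)\,
         \cos\!\left[\frac{(2x+1)\,u\,\pi}{16}\right]
         \cos\!\left[\frac{(2y+1)\,v\,\pi}{16}\right],
\qquad
C(k) = \begin{cases} 1/\sqrt{2}, & k = 0 \\ 1, & k > 0 \end{cases}
```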
The 2 dimensional (8 x 8) DCT
However, as before, all the equation does is calculate how much of each cosine pattern is present in the 8 x 8 pixel block. This time the patterns are two-dimensional and are shown in a following slide.
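To make "how much of each cosine pattern" concrete, here is a minimal Python sketch. It assumes the usual JPEG form of the 8 x 8 DCT, with a scaling factor of 1/4 C(u) C(v) where C(0) = 1/sqrt(2). The names dct2 and c are made up for this illustration, and the code is written for clarity, not speed.

```python
import math

N = 8  # JPEG block size


def c(k):
    """Scaling factor: 1/sqrt(2) for the zero-frequency term, 1 otherwise."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def dct2(block):
    """Forward 8 x 8 DCT of a block given as 8 lists of 8 pixel values."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            # Multiply the block by cosine pattern (u, v) and add everything up.
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


# A flat mid-grey block contains only the zero-frequency pattern.
flat = [[128] * N for _ in range(N)]
coeffs = dct2(flat)
```

For the flat block, only coeffs[0][0] is non-zero (apart from rounding error), because none of the other patterns is present in it.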
The 2 dimensional (8 x 8) DCT
These patterns are called basis functions or basis pictures, because they form the basis (building blocks) of any image. That is, any 8 x 8 block can be completely reconstructed from a weighted combination of these patterns.
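The reconstruction claim can be checked numerically. The sketch below assumes the usual JPEG scaling of the cosine patterns (1/4 C(u) C(v), with C(0) = 1/sqrt(2)). It works out the weight of each of the 64 basis pictures in a block and then adds the weighted pictures back together. The function names and the test block are made up for the illustration.

```python
import math

N = 8


def c(k):
    return 1 / math.sqrt(2) if k == 0 else 1.0


def basis(u, v):
    """Basis picture (u, v): an 8 x 8 cosine pattern."""
    return [[0.25 * c(u) * c(v)
             * math.cos((2 * x + 1) * u * math.pi / 16)
             * math.cos((2 * y + 1) * v * math.pi / 16)
             for y in range(N)] for x in range(N)]


def weight(block, pattern):
    """How much of `pattern` is present in `block` (one DCT co-efficient)."""
    return sum(block[x][y] * pattern[x][y] for x in range(N) for y in range(N))


def reconstruct(block):
    """Rebuild the block as a weighted sum of all 64 basis pictures."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            p = basis(u, v)
            w = weight(block, p)
            for x in range(N):
                for y in range(N):
                    out[x][y] += w * p[x][y]
    return out


# Arbitrary test block (made-up pixel values in the range 0..255).
block = [[(37 * x + 11 * y * y) % 256 for y in range(N)] for x in range(N)]
rebuilt = reconstruct(block)
max_error = max(abs(block[x][y] - rebuilt[x][y])
                for x in range(N) for y in range(N))
```

max_error comes out at the level of floating point rounding, so the transform on its own loses nothing. The loss in JPEG comes from the quantisation stage.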
The 2 dimensional (8 x 8) DCT
Each co-efficient (value) in the DCT represents a spatial frequency in the image, that is, how many times the brightness goes from light to dark (or from dark to light) across the block.
The 2 dimensional (8 x 8) DCT
The block can be seen to be separated into horizontal and vertical spatial frequencies. The value in the upper left corner represents zero frequency (dc): it is proportional to the average pixel brightness in the original 8 x 8 block.
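As a short worked step, assume the usual JPEG scaling of the transform, a factor of 1/4 C(u) C(v) with C(0) = 1/sqrt(2). The symbols f, F and C are the conventional ones and are not taken from the slide. With u = v = 0 both cosines equal 1, so the dc value is eight times the average pixel brightness:

```latex
F(0,0) = \frac{1}{4}\cdot\frac{1}{\sqrt{2}}\cdot\frac{1}{\sqrt{2}}
         \sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y)
       = \frac{1}{8}\sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y)
       = 8\,\bar{f},
\qquad
\bar{f} = \frac{1}{64}\sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y)
```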
The 2 dimensional (8 x 8) DCT
The dc co-efficient is processed separately from the others (called ac co-efficients) and is not subject to the lossy compression process which follows.
DCT basis functions (Basis pictures).
Why DCT?
We have performed this complex transformation, but the result still contains 8 x 8 = 64 values.
Why DCT?
However, while each of the original 8 x 8 pixel values could take any value between 0 and 255, it turns out that for most images the DCT values are large and significant towards the top left hand corner of the block and small (and less significant) towards the lower right hand corner. We use this statistical information to our advantage.
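A small experiment shows this concentration for one smooth block. This is a sketch only: the block is made up (brightness rising gently across and down), the transform assumes the usual JPEG scaling, and the names are chosen for this illustration. It measures what share of the sum of squared DCT values lies in the top left 4 x 4 quarter.

```python
import math

N = 8


def c(k):
    return 1 / math.sqrt(2) if k == 0 else 1.0


def dct2(block):
    """Forward 8 x 8 DCT (direct, unoptimised form)."""
    return [[0.25 * c(u) * c(v)
             * sum(block[x][y]
                   * math.cos((2 * x + 1) * u * math.pi / 16)
                   * math.cos((2 * y + 1) * v * math.pi / 16)
                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


# Made-up smooth block: brightness rises gently across and down the block.
smooth = [[100 + 10 * x + 5 * y for y in range(N)] for x in range(N)]
coeffs = dct2(smooth)

# Share of the total squared value held by the top left 4 x 4 quarter.
total = sum(f * f for row in coeffs for f in row)
top_left = sum(coeffs[u][v] ** 2 for u in range(4) for v in range(4))
share = top_left / total
```

For this block almost all of the total sits in the top left quarter. Real image blocks are less tidy, but the same tendency is what the later stages exploit.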
Typical DCT co-efficients.
Re-quantisation
The eye is less sensitive to high frequency noise (error), so we can afford to code the high frequencies less accurately by using larger quantisation steps. Since the high frequency values are usually small anyway, large quantisation steps reduce these values to zero. Zero values do not have to be stored individually in the coded representation, and so we have achieved (lossy) compression.
Re-quantisation
To re-quantise to larger steps we divide the actual value by a value larger than 1 and round the result. The following table shows a typical set of 8 x 8 quantisation values by which we divide the DCT values to achieve re-quantisation. Note that the low frequency values in this case receive no re-quantisation (they are divided by 1). Compression may be increased by increasing the re-quantisation values.
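The divide-and-round step can be sketched in a few lines of Python. The step sizes and co-efficients below are invented for the illustration. They are neither the table from the slide nor a table from the JPEG standard. They only follow the same idea: steps of 1 near the top left corner, and larger steps at higher frequencies.

```python
N = 8

# Hypothetical step sizes: 1 (no re-quantisation) next to the top left corner,
# growing with frequency. NOT the table from the slide.
steps = [[1 if u + v < 2 else 4 * (u + v) for v in range(N)] for u in range(N)]


def requantise(coeffs, steps):
    """Divide each co-efficient by its step size and round to an integer.

    Note: Python's round() sends exact halves to the nearest even integer.
    """
    return [[round(coeffs[u][v] / steps[u][v]) for v in range(N)]
            for u in range(N)]


def dequantise(levels, steps):
    """What the decoder does: multiply back by the step size."""
    return [[levels[u][v] * steps[u][v] for v in range(N)] for u in range(N)]


# Made-up co-efficients that shrink towards the lower right corner.
coeffs = [[round(400 / (1 + u + v) ** 2) for v in range(N)] for u in range(N)]

levels = requantise(coeffs, steps)
restored = dequantise(levels, steps)
zeros = sum(row.count(0) for row in levels)
```

With these made-up numbers most of the re-quantised values are zero, and each restored value differs from the original by at most half a step. Multiplying every step by a larger factor gives more zeros and larger errors.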
Typical re-quantisation values
DCT coefficients after re-quantisation
The effect of losing the high frequency co-efficients completely
Matlab program metdct1.m
Syntax: Metdct('image.bmp', mask)
'image.bmp' is the image to be compressed. mask is an 8 x 8 matrix of 1's and 0's which determines which co-efficients are used and which are not (unlike quantising, which allows the higher co-efficients to play a limited part).
The effect of losing the high frequency co-efficients completely
Use zeros(8, 8) and ones(8, 8) to build the mask. ones(8, 8) gives no deterioration (and no compression). Observe the result of using the dc component only. Experiment by adding ones to the mask to find the minimum number of co-efficients required to give a reasonable compressed image. Estimate the degree of compression.
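The same experiment can be tried on a single block without Matlab. The Python sketch below is not the metdct1.m program, whose source is not shown here. It is a made-up analogue that assumes the usual JPEG scaling of the DCT: transform a block, multiply the co-efficients by the mask, and transform back. All the names and the test block are hypothetical.

```python
import math

N = 8


def c(k):
    return 1 / math.sqrt(2) if k == 0 else 1.0


def cos_term(n, k):
    return math.cos((2 * n + 1) * k * math.pi / 16)


def dct2(block):
    """Forward 8 x 8 DCT."""
    return [[0.25 * c(u) * c(v)
             * sum(block[x][y] * cos_term(x, u) * cos_term(y, v)
                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coeffs):
    """Inverse 8 x 8 DCT."""
    return [[0.25 * sum(c(u) * c(v) * coeffs[u][v]
                        * cos_term(x, u) * cos_term(y, v)
                        for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def apply_mask(block, mask):
    """Keep only the co-efficients where mask is 1, then rebuild the block."""
    coeffs = dct2(block)
    kept = [[coeffs[u][v] * mask[u][v] for v in range(N)] for u in range(N)]
    return idct2(kept)


# Arbitrary test block (made-up pixel values in the range 0..255).
block = [[(37 * x + 11 * y * y) % 256 for y in range(N)] for x in range(N)]
ones = [[1] * N for _ in range(N)]
dc_only = [[1 if (u, v) == (0, 0) else 0 for v in range(N)] for u in range(N)]

unchanged = apply_mask(block, ones)   # all co-efficients kept
flat = apply_mask(block, dc_only)     # dc component only
mean = sum(map(sum, block)) / 64
```

With the all-ones mask the block comes back unchanged. With the dc-only mask every pixel becomes the block average. As a rough estimate, keeping k of the 64 co-efficients cuts the number of values to be coded by a factor of about 64/k, before any run-length or Huffman coding.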
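Zig-Zag scan
The dc component and the re-quantised co-efficients are placed in a string using a zig-zag scan. This cunning stunt leaves long runs of zeroes, because the scan runs from low to high frequencies and most of the high frequency values are zero after re-quantisation.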
Zig-zag scan
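The scan order can be generated by walking the anti-diagonals of the block and reversing direction on each one. This is a minimal sketch with made-up function names. Positions are (row, column) pairs counted from the top left corner.

```python
N = 8


def zigzag_order(n=N):
    """Return the (row, column) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + column: one anti-diagonal at a time
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Odd diagonals are walked with the row increasing (top right to
        # bottom left); even diagonals are walked the other way.
        for r in (rows if s % 2 else reversed(rows)):
            order.append((r, s - r))
    return order


def zigzag(block):
    """Read a square block out as a single string (list) of values."""
    return [block[r][col] for r, col in zigzag_order(len(block))]


order = zigzag_order()
```

The scan starts at the dc position (0, 0), moves to (0, 1), then (1, 0), (2, 0), (1, 1), (0, 2) and so on, and finishes at (7, 7), the highest frequency in both directions.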
Run-length encoding
The string of co-efficients has many zeroes at the end:
168, 45, 67, 12, 32, 7, 3, 3, 5, 3, 1, 1, 5, 3, 2, 0, 1, 3, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
Long runs of zeros can be run-length coded. A special (single) code is allocated to a run of 16 zeros, and another code, end of block (EOB), is used if there are no more non-zero values in the block.
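A minimal sketch of this step, applied to the string above, might look as follows. It assumes the dc value (the first in the string) is handled separately, and it represents each non-zero ac value as a pair (number of zeros before it, value). The names ZRL and EOB are labels chosen for this illustration, and the output is a Python list, not the real JPEG bit stream.

```python
ZRL = "ZRL"  # stands for a run of 16 zeros
EOB = "EOB"  # end of block: no more non-zero values


def run_length_code(ac):
    """Code the ac co-efficients as (zeros_before, value) pairs plus ZRL/EOB."""
    # Position of the last non-zero value (-1 if the block is all zeros).
    last = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    out, run = [], 0
    for v in ac[:last + 1]:
        if v == 0:
            run += 1
            if run == 16:  # a full run of 16 zeros gets its own code
                out.append(ZRL)
                run = 0
        else:
            out.append((run, v))
            run = 0
    if last < len(ac) - 1:  # only zeros remain, so finish with EOB
        out.append(EOB)
    return out


# The non-zero start of the string from the text, padded with zeros to 64 values.
head = [168, 45, 67, 12, 32, 7, 3, 3, 5, 3, 1, 1, 5, 3, 2, 0, 1, 3,
        0, 0, 1, 0, 0, 0, 0, 1]
scan = head + [0] * (64 - len(head))
dc, ac = scan[0], scan[1:]
coded = run_length_code(ac)
```

For this string the 63 ac values shrink to 18 pairs followed by EOB. The last pairs are (1, 1), (0, 3), (2, 1) and (4, 1), and the long tail of zeros costs only the single EOB code.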
Variable length coding
As we approach the end of the zig-zag scan, it is more likely that any non-zero value will have a run of zeros before it. Also, the non-zero value is more likely to be small than large, particularly if the run of zeroes is long. Some combinations of a non-zero value and its preceding run of zeros are therefore much more common than others, so these combinations can be Huffman coded, with the shortest codes given to the most common ones.
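The idea can be sketched with a textbook Huffman code built over (zeros before, value) pairs. This follows the simplified description above and is not the exact JPEG scheme. Real JPEG Huffman-codes the run together with a size category (the number of bits needed for the value) and then appends the value bits. The pairs and their counts below are invented for the illustration.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a Huffman code (symbol -> bit string) from a list of symbols."""
    counts = Counter(symbols)
    if len(counts) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker stops Python from ever comparing the dicts.
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n0, _, low = heapq.heappop(heap)   # the two least frequent groups
        n1, _, high = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in low.items()}
        merged.update({s: "1" + code for s, code in high.items()})
        heapq.heappush(heap, (n0 + n1, tie, merged))
        tie += 1
    return heap[0][2]


# Made-up (zeros_before, value) pairs; (0, 1) is deliberately the most common.
pairs = [(0, 1)] * 8 + [(0, 3)] * 4 + [(1, 1)] * 2 + [(2, 1), (4, 1)]
code = huffman_code(pairs)
bits = "".join(code[p] for p in pairs)
```

Here the most common pair gets a 1-bit code and the two rarest pairs get 4-bit codes, so the 16 pairs take 30 bits in total. A fixed-length code for five different symbols would need 3 bits each, or 48 bits.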
Further reading
mp.pdf
Art of Digital Video, Watkinson, Focal Press
Digital Image Processing, Gonzalez and Woods, Prentice Hall