Huffman Coding, Arithmetic Coding, and JBIG2
Illustrations by Arber Borici, 2010, University of Northern British Columbia
Huffman Coding
Entropy encoder for lossless compression.
Input: symbols and their corresponding probabilities.
Output: prefix-free codes with minimum expected length.
Prefix property: no code in the output is a prefix of another code.
Huffman coding is an optimal prefix-code construction algorithm.
Huffman Coding: Algorithm
1. Create a forest of leaf nodes, one for each symbol.
2. Take the two nodes with the lowest probabilities and make them siblings under a new internal node whose probability is the sum of its two children's probabilities.
3. Treat the new internal node as any other node in the forest.
4. Repeat steps 2 and 3 until a single tree remains.
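The four steps above can be sketched in Python. This is a minimal illustration, not the slides' own code; the heap and the tie-breaking index are implementation choices for resolving equal probabilities, which the algorithm statement leaves open.

```python
import heapq
from collections import Counter

def huffman_codes(freqs):
    """Build a Huffman code table from a symbol -> frequency map."""
    # Forest of leaves: each heap entry is (frequency, tie_breaker, {symbol: code_so_far})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate case: a single symbol still needs one bit
        return {next(iter(freqs)): "0"}
    count = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # lowest-probability node
        f2, _, right = heapq.heappop(heap)  # second-lowest node
        # Make them siblings: prepend 0 for the left subtree, 1 for the right
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, count, merged))  # new internal node
        count += 1
    return heap[0][2]

freqs = Counter("ARBER")        # A:1, B:1, E:1, R:2
codes = huffman_codes(freqs)
```

Ties can be broken either way, so the exact codewords may differ from the slides' tree, but any valid Huffman tree for ARBER yields the same total of 10 bits (2 bits per symbol on average).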
Huffman Coding: Example
Consider the string ARBER. The frequencies and probabilities of symbols A, B, E, and R are:

Symbol       A    B    E    R
Frequency    1    1    1    2
Probability  20%  20%  20%  40%

The initial forest thus comprises four leaf nodes. Now we apply the Huffman algorithm.
Generating Huffman Codes
[Figure series: building the Huffman tree. The two lowest-probability nodes, A (0.2) and B (0.2), are merged into an internal node with probability 0.4; that node is merged with E (0.2) into a node with probability 0.6; finally, the 0.6 node is merged with R (0.4) into the root with probability 1.0. Labeling each left branch 0 and each right branch 1 yields the codes:]

Symbol  Code
A       000
B       001
E       01
R       1
Huffman Codes: Decoding
[Figure series: decoding the bit stream by walking the tree from the root. Each 0 or 1 selects a branch; reaching a leaf emits that symbol and restarts the walk at the root. Bit by bit, the stream decodes to A, R, B, E, R.]
The prefix property ensures unique decodability.
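The tree walk can equivalently be expressed over the code table itself, since the prefix property guarantees that the first matching codeword is the right one. A sketch, using the codes derived in the example:

```python
def huffman_decode(bits, codes):
    """Decode a bit string symbol by symbol.
    The prefix property guarantees that a matched codeword is unambiguous."""
    inverse = {c: s for s, c in codes.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:       # a complete codeword has been read
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

# Codes from the example tree
codes = {"A": "000", "B": "001", "E": "01", "R": "1"}
encoded = "".join(codes[s] for s in "ARBER")   # "0001001011"
decoded = huffman_decode(encoded, codes)
```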
Arithmetic Coding
Entropy coder for lossless compression.
Encodes the entire input as a single subinterval of the real interval [0, 1).
Slightly more efficient than Huffman coding.
Implementation is harder: practical implementation variants have been proposed.
Arithmetic Coding: Algorithm
1. Assign each symbol an interval [low, high), based on cumulative probabilities.
2. Start with the interval of the first input symbol as the current interval [L, H).
3. For each subsequent symbol, scale that symbol's interval into the current one:
   New L = L + cum_low(symbol) × (H − L)
   New H = L + cum_high(symbol) × (H − L)
   where cum_low and cum_high are the symbol's cumulative-probability bounds.
Arithmetic Coding: Example
Consider the string ARBER. The intervals of symbols A, B, E, and R are A: [0, 0.2); B: [0.2, 0.4); E: [0.4, 0.6); and R: [0.6, 1).

Symbol  A    B    E    R
Low     0    0.2  0.4  0.6
High    0.2  0.4  0.6  1
[Figure series: the interval narrows once per symbol of ARBER.
After A: [0, 0.2)
After R: [0.12, 0.2)
After B: [0.136, 0.152)
After E: [0.1424, 0.1456)
After R: [0.14432, 0.1456)]
The final interval for the input string ARBER is [0.14432, 0.1456). To encode it in bits, one chooses a number in the interval and encodes the binary expansion of its fractional part; the point chosen on the slide has a 51-bit expansion.
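The narrowing steps can be reproduced with a short sketch. Floating-point arithmetic is used here purely for illustration; practical coders work with integer frequencies precisely because these interval bounds shrink so quickly.

```python
# Symbol intervals from the example: [cumulative low, cumulative high) per symbol
INTERVALS = {"A": (0.0, 0.2), "B": (0.2, 0.4), "E": (0.4, 0.6), "R": (0.6, 1.0)}

def final_interval(message, intervals):
    """Narrow [0, 1) once per symbol, as in the worked example."""
    low, high = 0.0, 1.0
    for s in message:
        s_low, s_high = intervals[s]
        width = high - low
        high = low + s_high * width   # update high first: it must read the old low
        low = low + s_low * width
    return low, high

low, high = final_interval("ARBER", INTERVALS)   # ≈ [0.14432, 0.1456)
```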
Arithmetic Coding
Practical implementations work with absolute (integer) frequencies, since the low and high interval values otherwise become vanishingly small. An END-OF-STREAM flag, assigned a very small probability, is usually required. Decoding is straightforward: start with the final interval and divide it proportionally to the symbol probabilities, proceeding until an END-OF-STREAM flag is reached.
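The proportional-division decoding described above can be sketched as follows. For brevity this sketch uses a fixed symbol count in place of an END-OF-STREAM flag, and 0.1445 is an arbitrarily chosen point inside the example's final interval [0.14432, 0.1456):

```python
INTERVALS = {"A": (0.0, 0.2), "B": (0.2, 0.4), "E": (0.4, 0.6), "R": (0.6, 1.0)}

def decode(value, n_symbols, intervals):
    """Recover symbols from a point inside the final interval."""
    out = []
    for _ in range(n_symbols):
        for s, (s_low, s_high) in intervals.items():
            if s_low <= value < s_high:   # which symbol's band contains the point?
                out.append(s)
                # Rescale: map [s_low, s_high) back onto [0, 1)
                value = (value - s_low) / (s_high - s_low)
                break
    return "".join(out)

decoded = decode(0.1445, 5, INTERVALS)
```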
JBIG-2
Lossless and lossy bi-level image compression standard.
Emerged from JBIG-1 (Joint Bi-level Image Experts Group).
Supports three coding modes: generic, halftone, and text.
The image is segmented into regions, which can be encoded using different methods.
JBIG-2: Segmentation
The image on the left is segmented into a binary image, text, and a grayscale image. [Figure: original image and its binary, text, and grayscale regions]
JBIG-2: Encoding
Arithmetic coding (the QM coder).
Context-based prediction, with larger contexts than JBIG-1.
Progressive compression (display): the predictive context uses previously coded information.
Adaptive coder. [Figure: context template, where X = pixel to be coded and A = adaptive pixels, which can be moved]
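Context-based prediction can be sketched as follows. The 4-pixel template and the count-based probability estimate below are illustrative simplifications of my own, not the actual JBIG-2 templates or the QM coder's state machine:

```python
def context_index(img, x, y):
    """Pack previously coded neighbors of pixel (x, y) into a context index.
    Toy 4-pixel template; JBIG-2 uses larger (10/13-pixel) templates."""
    def px(i, j):
        # Pixels outside the image are treated as white (0)
        return img[j][i] if 0 <= i < len(img[0]) and 0 <= j < len(img) else 0
    neighbors = [px(x - 1, y), px(x - 1, y - 1), px(x, y - 1), px(x + 1, y - 1)]
    ctx = 0
    for bit in neighbors:           # pack neighbor bits into an integer index
        ctx = (ctx << 1) | bit
    return ctx

# Adaptive estimate: P(black) per context, from counts seen so far
counts = {}  # ctx -> [white_count, black_count]
def p_black(ctx):
    w, b = counts.get(ctx, [1, 1])  # Laplace smoothing for unseen contexts
    return b / (w + b)

img = [[0, 1, 0],
       [1, 0, 0]]
ctx = context_index(img, 1, 1)
```

The arithmetic coder then codes each pixel against `p_black(ctx)` and updates the counts, which is what makes the model adaptive.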
JBIG-2: Halftone and Text
Halftone images are coded as multi-level images, along with pattern and grid parameters. Each text symbol is encoded in a dictionary and referenced by relative coordinates.
Color Separation
Images comprising discrete colors can be treated as multi-layered binary images: each color, against the image background, forms one binary layer. If there are N colors, one of which is the background, there are N − 1 binary layers. A map with a white background and four other colors thus yields 4 binary layers.
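The separation rule above can be sketched directly; the tiny two-color "map" here is a made-up example:

```python
def split_layers(image, background):
    """Split a discrete-color image into one binary layer per non-background color."""
    colors = {px for row in image for px in row} - {background}
    return {
        c: [[1 if px == c else 0 for px in row] for row in image]
        for c in colors
    }

# A 2x3 image with white ("W") background and two colors: N = 3, so N - 1 = 2 layers
image = [["W", "R", "G"],
         ["R", "W", "G"]]
layers = split_layers(image, "W")
```

Each layer can then be compressed independently as a binary image.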
Color Separation: Example
The following Excel graph comprises 34 colors plus the white background:
[Figures: binary layers 1, 5, and 12 of the separated image]
Comparison with JBIG2 and JPEG
Our method: 96%; JBIG2: 94%; JPEG: 91%
Our method: 98%; JBIG2: 97%; JPEG: 92%
Encoding Example
Original size: 64 × 3 = 192 bits. Codebook: RCRC. Some blocks are uncompressible. The saving is one minus the size of the encoded stream over the original size: 1 − ( ) / 192 = 56%.
Definitions (cont.)
Compression ratio: the number of bits after a coding scheme has been applied to the source data, over the original source-data size. It is expressed as a percentage or, when the source data is an image, as bits per pixel (bpp).
JBIG-2 is the standard binary-image compression scheme, based mainly on arithmetic coding with context modeling. Other methods in the literature are designed for specific classes of binary images. Our objective: design a coding method that works regardless of the nature of a binary image.
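Both measurements reduce to simple ratios; the image dimensions and bit counts below are hypothetical numbers chosen only to illustrate the definitions:

```python
def compression_ratio(compressed_bits, original_bits):
    """Ratio as defined above: compressed size over original size."""
    return compressed_bits / original_bits

def bits_per_pixel(compressed_bits, width, height):
    """The same measurement expressed in bits per pixel, for images."""
    return compressed_bits / (width * height)

# Hypothetical: a 100x100 binary image (10,000 bits at 1 bpp) compressed to 2,000 bits
ratio = compression_ratio(2000, 10000)     # 0.2, i.e. 20%
bpp = bits_per_pixel(2000, 100, 100)       # 0.2 bpp
```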