Huffman Coding, Arithmetic Coding, and JBIG2

1 Huffman Coding, Arithmetic Coding, and JBIG2
Illustrations: Arber Borici, 2010, University of Northern British Columbia

2 Huffman Coding
Entropy encoder for lossless compression
Input: symbols and their corresponding probabilities
Output: prefix-free codes with minimum expected length
Prefix property: no code in the output is a prefix of another code
Optimal encoding algorithm
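The prefix property can be checked mechanically: in a sorted code list, any prefix sits immediately before one of its extensions. A minimal Python sketch (not part of the original slides; the helper name is our own):

```python
def is_prefix_free(codes):
    """Return True if no code is a prefix of any other code."""
    codes = sorted(codes)  # a prefix sorts immediately before its extensions
    return all(not codes[i + 1].startswith(codes[i])
               for i in range(len(codes) - 1))

# The Huffman codes derived for ARBER later in this presentation:
print(is_prefix_free(["000", "001", "01", "1"]))   # True
# A code that is a prefix of another breaks the property:
print(is_prefix_free(["0", "01", "1"]))            # False
```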

3 Huffman Coding: Algorithm
1. Create a forest of leaf nodes, one for each symbol.
2. Take the two nodes with the lowest probabilities and make them siblings. The new internal node has a probability equal to the sum of the probabilities of its two child nodes.
3. The new internal node acts as any other node in the forest.
4. Repeat steps 2–3 until a single tree is established.
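The steps above can be sketched in Python with a priority queue (this is our illustration, not part of the slides; tie-breaking between equal probabilities may produce different but equally optimal codewords than the tree drawn later):

```python
import heapq
import itertools

def huffman_codes(probs):
    """Build Huffman codes from a {symbol: probability} map.

    Start with one leaf per symbol, then repeatedly merge the two
    lowest-probability nodes until a single tree remains.
    """
    counter = itertools.count()  # tie-breaker so dicts are never compared
    heap = [(p, next(counter), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)
        p2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}   # left branch: 0
        merged.update({s: "1" + c for s, c in right.items()})  # right: 1
        heapq.heappush(heap, (p1 + p2, next(counter), merged))
    return heap[0][2]

# The ARBER example from the following slides:
print(huffman_codes({"A": 0.2, "B": 0.2, "E": 0.2, "R": 0.4}))
```

For these probabilities several code assignments are optimal; all of them have an expected length of 2 bits per symbol.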

4 Huffman Coding: Example
Consider the string ARBER. The frequencies and probabilities of symbols A, B, E, and R are:
Symbol       A    B    E    R
Frequency    1    1    1    2
Probability  20%  20%  20%  40%
The initial forest thus comprises four nodes. Now, we apply the Huffman algorithm.

5–10 Generating Huffman Codes
[Tree-construction figures: A (0.2) and B (0.2) are merged into a 0.4 node; that node and E (0.2) are merged into a 0.6 node; finally the 0.6 node and R (0.4) form the root. Labeling left branches 0 and right branches 1 gives the code table:]
Symbol  Code
A       000
B       001
E       01
R       1

11–16 Huffman Codes: Decoding
[Figures: the encoded stream for ARBER is decoded by walking the tree from the root and emitting a symbol at each leaf: 000 → A, 1 → R, 001 → B, 01 → E, 1 → R.]
The prefix property ensures unique decodability.
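The decoding walk can be sketched with the code table instead of the explicit tree (our illustration, not from the slides; the prefix property is what makes the greedy match unambiguous):

```python
def huffman_decode(bits, codes):
    """Scan the bitstream, emitting a symbol whenever a codeword matches."""
    inverse = {code: sym for sym, code in codes.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:  # unambiguous thanks to the prefix property
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

# Codes from the tree built on the previous slides:
codes = {"A": "000", "B": "001", "E": "01", "R": "1"}
encoded = "".join(codes[s] for s in "ARBER")
print(encoded)                         # 0001001011
print(huffman_decode(encoded, codes))  # ARBER
```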

17 Arithmetic Coding
Entropy coder for lossless compression
Encodes the entire input data as a single real interval
Slightly more efficient than Huffman coding
Implementation is harder: practical implementation variants have been proposed

18 Arithmetic Coding: Algorithm
1. Create an interval for each symbol from the cumulative probabilities. The interval for a symbol is [low, high).
2. Given an input string, determine the interval of the first symbol.
3. For each subsequent symbol, scale its interval into the current interval [L, H):
New Low  = L + CumProb(n−1) × (H − L)
New High = L + CumProb(n) × (H − L)
where CumProb(n) is the cumulative probability up to and including symbol n.
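The interval-narrowing loop can be sketched as follows (our illustration, not part of the slides; symbols map to their cumulative-probability ranges):

```python
def arithmetic_encode(text, intervals):
    """Narrow [low, high) one symbol at a time, as in the steps above."""
    low, high = 0.0, 1.0
    for sym in text:
        sym_low, sym_high = intervals[sym]
        span = high - low
        high = low + sym_high * span  # scale the symbol's range into
        low = low + sym_low * span    # the current interval
    return low, high

# Intervals from the ARBER example on the following slides:
intervals = {"A": (0.0, 0.2), "B": (0.2, 0.4), "E": (0.4, 0.6), "R": (0.6, 1.0)}
low, high = arithmetic_encode("ARBER", intervals)
print(round(low, 6), round(high, 6))  # 0.14432 0.1456
```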

19 Arithmetic Coding: Example
Consider the string ARBER. The intervals of symbols A, B, E, and R are:
A: [0, 0.2); B: [0.2, 0.4); E: [0.4, 0.6); and R: [0.6, 1)
Symbol  A    B    E    R
Low     0    0.2  0.4  0.6
High    0.2  0.4  0.6  1

20–23 Arithmetic Coding: Example
[Figures: the interval narrows one symbol at a time. At each step the current interval is subdivided in proportion to the symbol probabilities (20%, 20%, 20%, 40%), and the sub-interval of the next symbol becomes the new interval:]
Start: [0, 1)
A: [0, 0.2)
R: [0.12, 0.2)
B: [0.136, 0.152)
E: [0.1424, 0.1456)
R: [0.14432, 0.1456)

24 Arithmetic Coding: Example
The final interval for the input string ARBER is [0.14432, 0.1456). In bits, one chooses a number in the interval and encodes its fractional part in binary.

25 Arithmetic Coding
Practical implementations involve absolute frequencies (integers), since the low and high interval values tend to become very small.
An END-OF-STREAM flag is usually required (with a very small probability).
Decoding is straightforward: find the symbol whose sub-interval contains the encoded number, output that symbol, rescale, and divide intervals proportionally to the symbol probabilities again. Proceed until an END-OF-STREAM control sequence is reached.
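The decoding procedure can be sketched as follows (our illustration; for simplicity it decodes a fixed symbol count instead of the END-OF-STREAM flag a practical decoder would use):

```python
def arithmetic_decode(value, intervals, n_symbols):
    """Recover symbols by locating `value` within successive sub-intervals."""
    out = []
    for _ in range(n_symbols):
        for sym, (low, high) in intervals.items():
            if low <= value < high:
                out.append(sym)
                value = (value - low) / (high - low)  # rescale to [0, 1)
                break
    return "".join(out)

# 0.145 lies inside the final ARBER interval [0.14432, 0.1456):
intervals = {"A": (0.0, 0.2), "B": (0.2, 0.4), "E": (0.4, 0.6), "R": (0.6, 1.0)}
print(arithmetic_decode(0.145, intervals, 5))  # ARBER
```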

26 JBIG-2
Lossless and lossy bi-level data compression standard
Emerged from JBIG-1 (Joint Bi-level Image Experts Group)
Supports three coding modes: generic, halftone, and text
The image is segmented into regions, which can be encoded using different methods

27 JBIG-2: Segmentation
[Figure: the image on the left is segmented into a binary region, a text region, and a grayscale region.]

28 JBIG-2: Encoding
Arithmetic coding (the QM coder)
Context-based prediction, with larger contexts than JBIG-1
Progressive compression (display)
Predictive context uses previous information
Adaptive coder
[Figure: context template. X = pixel to be coded; A = adaptive pixel, which can be moved.]
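Context-based prediction packs already-coded neighbour pixels into an index that selects an adaptive probability model. A sketch with a hypothetical 4-pixel template (our illustration; real JBIG2 templates are larger, up to 16 pixels, and include the movable adaptive pixels):

```python
def context_index(img, x, y):
    """Pack already-coded neighbour pixels into a small integer context.

    Hypothetical template: left, upper-left, upper, and upper-right of
    the pixel being coded.
    """
    def pixel(px, py):
        if 0 <= py < len(img) and 0 <= px < len(img[0]):
            return img[py][px]
        return 0  # pixels outside the image are treated as background

    neighbours = [pixel(x - 1, y), pixel(x - 1, y - 1),
                  pixel(x, y - 1), pixel(x + 1, y - 1)]
    ctx = 0
    for bit in neighbours:
        ctx = (ctx << 1) | bit
    return ctx  # 0..15; selects one of 16 adaptive probability estimates

img = [[0, 1, 1],
       [0, 1, 0]]
print(context_index(img, 1, 1))  # neighbours 0,0,1,1 -> 0b0011 = 3
```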

29 JBIG-2: Halftone and Text
Halftone images are coded as multi-level images, along with pattern and grid parameters.
Each text symbol is encoded in a dictionary along with its relative coordinates.

30 Color Separation
Images comprising discrete colors can be considered as multi-layered binary images:
Each color and the image background form one binary layer.
If there are N colors, where one color represents the image background, there will be N − 1 binary layers.
A map with a white background and four other colors will thus yield 4 binary layers.
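The separation step can be sketched as follows (our illustration, not from the slides; the image is a grid of small integer color indices):

```python
def color_layers(img, background):
    """Split an indexed-color image into one binary layer per color.

    Each non-background color becomes a binary mask; N colors with one
    background color yield N - 1 layers, as described above.
    """
    colors = {c for row in img for c in row if c != background}
    return {c: [[1 if px == c else 0 for px in row] for row in img]
            for c in sorted(colors)}

# A tiny map: 0 is the white background; colors 1 and 2 appear in it.
img = [[0, 1, 2],
       [1, 1, 0]]
layers = color_layers(img, background=0)
print(len(layers))  # 2 binary layers
print(layers[1])    # [[0, 1, 0], [1, 1, 0]]
```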

31 Color Separation: Example
The following Excel chart comprises 34 colors plus the white background.

32 Layer 1

33 Layer 5

34 Layer 12

35 Comparison with JBIG2 and JPEG
Compression results on two test images:
Image 1: Our method 96%; JBIG2 94%; JPEG 91%
Image 2: Our method 98%; JBIG2 97%; JPEG 92%

36 Encoding Example
Original size: 64 × 3 = 192 bits
Codebook: RCRC
Uncompressible
The compression saving is one minus the encoded-stream size over the original size: 1 − ( ) / 192 = 56%

37 Definitions (cont.)
Compression ratio: the number of bits after a coding scheme has been applied to the source data, over the original source-data size.
Expressed as a percentage, or, when the source data is an image, usually as bits per pixel (bpp).
JBIG-2 is the standard binary-image compression scheme, based mainly on arithmetic coding with context modeling.
Other methods in the literature are designed for specific classes of binary images.
Our objective: design a coding method that works regardless of the nature of a binary image.
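These two definitions amount to one-line formulas; a small sketch with hypothetical figures (the 16×12 image and 96-bit stream are our own example, chosen so the original size matches the 192 bits used earlier):

```python
def compression_ratio(encoded_bits, original_bits):
    """Encoded size over original size, as defined above."""
    return encoded_bits / original_bits

def bits_per_pixel(encoded_bits, width, height):
    """The same rate expressed in bpp for a width x height image."""
    return encoded_bits / (width * height)

# Hypothetical: a 16x12 binary image (192 bits) encoded into 96 bits.
print(compression_ratio(96, 192))   # 0.5, i.e. 50%
print(bits_per_pixel(96, 16, 12))   # 0.5 bpp
```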

