1
Huffman Coding
2
A simple example
Suppose we have a message of 10 characters drawn from 5 distinct symbols, e.g. [►♣♣♠☻►♣☼►☻]. How can we code this message using 0/1 so that the coded message has minimum length (for transmission or storage)?
With 5 distinct symbols, a fixed-length code needs at least 3 bits per symbol.
For this simple encoding, the length of the coded message is 10*3 = 30 bits.
3
A simple example – cont.
Intuition: symbols that are more frequent should get shorter codes. Since the codes then differ in length, there must be a way to tell where one code ends and the next begins.
For a Huffman code (►, ♣ and ☻ get 2 bits each; ♠ and ☼ get 3 bits each), the length of the encoded message ►♣♣♠☻►♣☼►☻ is 3*2 + 3*2 + 2*2 + 1*3 + 1*3 = 22 bits.
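The bit counts for this message can be checked with a few lines of Python. This is a minimal sketch: the per-symbol code lengths in `code_length` are read off the slide's sum (2 bits for the three frequent symbols, 3 bits for the two rare ones), and the variable names are illustrative only.

```python
from collections import Counter
from math import ceil, log2

message = "►♣♣♠☻►♣☼►☻"
counts = Counter(message)

# Fixed-length code: every symbol gets ceil(log2(number of distinct symbols)) bits.
fixed_bits_per_symbol = ceil(log2(len(counts)))
fixed_total = fixed_bits_per_symbol * len(message)

# Code lengths taken from the slide: frequent symbols get 2 bits, rare ones 3.
code_length = {"►": 2, "♣": 2, "☻": 2, "♠": 3, "☼": 3}
variable_total = sum(counts[s] * code_length[s] for s in counts)

print(fixed_total, variable_total)
```

The variable-length total is 3*2 + 3*2 + 2*2 + 1*3 + 1*3 = 22 bits, against 30 bits for the fixed-length code.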
4
Another Example
A = 0
B = 100
C = 1010
D = 1011
R = 11
ABRACADABRA = 01001101010010110100110
This is eleven letters in 23 bits.
A fixed-width encoding would require 3 bits for each of the five different letters, or 33 bits for 11 letters.
Notice that the encoded bit string can be decoded!
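As a quick check of this example, the sketch below encodes ABRACADABRA by concatenating codewords and decodes it again by reading bits until the buffer matches a codeword. The function names `encode` and `decode` are illustrative, not part of the slides.

```python
code = {"A": "0", "B": "100", "C": "1010", "D": "1011", "R": "11"}

def encode(text):
    # Encoding is just concatenation of the codewords.
    return "".join(code[ch] for ch in text)

def decode(bits):
    inverse = {word: ch for ch, word in code.items()}
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:      # no codeword is a prefix of another,
            out.append(inverse[buffer])  # so the first match is the only match
            buffer = ""
    return "".join(out)

encoded = encode("ABRACADABRA")
print(encoded, len(encoded), decode(encoded))
```

The round trip works only because no codeword is a prefix of another; with an ambiguous code the first match could be the wrong one.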
5
Huffman codes
Binary character code: each character is represented by a unique binary string.
A data file of 100 characters can be coded in two ways:
                      a    b    c    d    e     f
frequency (%)         45   13   12   16   9     5
fixed-length code     000  001  010  011  100   101
variable-length code  0    101  100  111  1101  1100
The first way needs 100*3 = 300 bits. The second way needs 45*1 + 13*3 + 12*3 + 16*3 + 9*4 + 5*4 = 224 bits.
2018/11/22
6
Variable-length code
A variable-length code must be read with some care.
(codewords: a=0, b=00, c=01, d=11.) Where to cut? 00 can be read as either aa or b.
Prefixes of 0011: 0, 00, 001, and 0011.
Prefix codes: no codeword is a prefix of any other codeword (prefix-free).
Prefix codes are simple to encode and decode.
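The ambiguity can be made concrete with a small sketch. `is_prefix_free` tests the prefix-free property by brute force over all ordered pairs, and `parsings` lists every way to cut a bit string into codewords. Both function names are illustrative, and the second code is assumed to be the variable-length code for a to f used elsewhere in these slides.

```python
from itertools import permutations

def is_prefix_free(codewords):
    # Prefix-free: no codeword is a prefix of a different codeword.
    return not any(b.startswith(a) for a, b in permutations(codewords, 2))

def parsings(bits, code):
    # Return every way to split `bits` into a sequence of codewords.
    if not bits:
        return [""]
    results = []
    for symbol, word in code.items():
        if bits.startswith(word):
            results += [symbol + rest for rest in parsings(bits[len(word):], code)]
    return results

ambiguous = {"a": "0", "b": "00", "c": "01", "d": "11"}
prefix_code = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}

print(is_prefix_free(ambiguous.values()), parsings("00", ambiguous))
print(is_prefix_free(prefix_code.values()), parsings("001011101", prefix_code))
```

For the ambiguous code, 00 has two parsings (aa and b); for the prefix code every bit string has at most one.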
7
Using the codewords in the table to encode and decode
Encode: abc = 0.101.100 = 0101100 (just concatenate the codewords.)
Decode: 001011101 = 0.0.101.1101 = aabe
(variable-length codewords: a=0, b=101, c=100, d=111, e=1101, f=1100)
8
Decode: 001011101 = 0.0.101.1101 = aabe (use the right-hand binary tree below: start at the root, go left on 0 and right on 1, and output a character at each leaf.)
[Figure: two binary trees over the leaves a:45, b:13, c:12, d:16, e:9, f:5. Left: tree for the fixed-length codewords (internal nodes 100, 86, 58, 28, 14). Right: tree for the variable-length codewords (internal nodes 100, 55, 25, 30, 14).]
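Decoding with the tree can be sketched as follows, assuming the variable-length codewords a=0, b=101, c=100, d=111, e=1101, f=1100. The nested-tuple representation (a leaf is a character, an internal node is a `(left, right)` pair) is a choice made for this sketch, not something the slides prescribe.

```python
# Variable-length code tree: index 0 is the left (0) branch, index 1 the right (1) branch.
tree = ("a", (("c", "b"), (("f", "e"), "d")))

def decode(bits, root):
    out, node = [], root
    for bit in bits:
        node = node[int(bit)]          # 0 -> left child, 1 -> right child
        if isinstance(node, str):      # reached a leaf: emit it, restart at the root
            out.append(node)
            node = root
    return "".join(out)

print(decode("001011101", tree))
print(decode("0101100", tree))
```

No separator between codewords is needed: reaching a leaf is what marks the end of a codeword.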
9
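Binary tree
An optimal code is always represented by a full binary tree: every nonleaf node has two children.
Why? If some nonleaf node had only one child, that node could be replaced by its child, which shortens every codeword below it. This is why the fixed-length code in our example is not optimal: its tree has a node with only one child.
The total number of bits required to encode a file is
$$B(T) = \sum_{c \in C} f(c)\, d_T(c)$$
f(c): the frequency (number of occurrences) of c in the file
d_T(c): the depth of c’s leaf in the tree T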
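The cost of a code tree can be evaluated directly from this definition, since the depth of a leaf equals the length of its codeword. The sketch below assumes the frequencies and the two codes from the a to f example (with variable-length codewords a=0, b=101, c=100, d=111, e=1101, f=1100); the helper name `cost` is illustrative.

```python
freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
fixed = {"a": "000", "b": "001", "c": "010", "d": "011", "e": "100", "f": "101"}
variable = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}

def cost(freq, code):
    # Sum of f(c) * d_T(c); the depth d_T(c) is the codeword length.
    return sum(freq[c] * len(code[c]) for c in freq)

print(cost(freq, fixed), cost(freq, variable))
```

With frequencies given per 100 characters, the fixed-length code costs 300 bits and the variable-length code 224 bits.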
10
Constructing an optimal coding scheme
Formal definition of the problem:
Input: a set of characters C = {c1, c2, …, cn}, where each c ∈ C has frequency f[c].
Output: a binary tree representing codewords such that the total number of bits required for the file is minimized.
Huffman proposed a greedy algorithm to solve the problem.
11
(a) Initial queue, sorted by frequency: f:5 e:9 c:12 b:13 d:16 a:45
(b) Merge f:5 and e:9 into a node of weight 14 (f on the 0 branch, e on the 1 branch). Queue: c:12 b:13 14 d:16 a:45
12
(c) Merge c:12 and b:13 into a node of weight 25. Queue: 14 d:16 25 a:45
(d) Merge 14 and d:16 into a node of weight 30. Queue: 25 30 a:45
13
(e) Merge 25 and 30 into a node of weight 55. Queue: a:45 55
(f) Merge a:45 and 55 into the root of weight 100. The tree is complete.
14
HUFFMAN(C)
1 n:=|C|
2 Q:=C
3 for i:=1 to n-1 do
4     z:=ALLOCATE_NODE()
5     x:=left[z]:=EXTRACT_MIN(Q)
6     y:=right[z]:=EXTRACT_MIN(Q)
7     f[z]:=f[x]+f[y]
8     INSERT(Q,z)
9 return EXTRACT_MIN(Q)
15
The Huffman Algorithm
This algorithm builds the tree T corresponding to the optimal code in a bottom-up manner.
C is a set of n characters, and each character c in C has a defined frequency f[c].
Q is a min-priority queue, keyed on f, used to identify the two least-frequent objects to merge together.
The result of the merger is a new object (internal node) whose frequency is the sum of the frequencies of the two merged objects.
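One possible rendering of HUFFMAN(C) in Python is sketched below, using the standard `heapq` module as the priority queue Q. The tuple layout `(frequency, tie_breaker, node)`, the nested-tuple tree, and the helper `codewords` are assumptions of this sketch rather than part of the pseudocode; the tie-breaker exists only so that the heap never has to compare two tree nodes.

```python
import heapq
from itertools import count

def huffman(freq):
    """Return the root of a Huffman tree. Leaves are characters;
    internal nodes are (left, right) pairs."""
    tie = count()
    heap = [(f, next(tie), ch) for ch, f in freq.items()]
    heapq.heapify(heap)                       # Q := C
    for _ in range(len(freq) - 1):            # n - 1 merges
        fx, _tx, x = heapq.heappop(heap)      # x := EXTRACT_MIN(Q)
        fy, _ty, y = heapq.heappop(heap)      # y := EXTRACT_MIN(Q)
        heapq.heappush(heap, (fx + fy, next(tie), (x, y)))  # INSERT(Q, z)
    return heap[0][2]

def codewords(node, prefix=""):
    # Walk the tree: 0 for the left branch, 1 for the right branch.
    if isinstance(node, str):
        return {node: prefix or "0"}          # a lone character still needs one bit
    left, right = node
    return {**codewords(left, prefix + "0"), **codewords(right, prefix + "1")}

freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
codes = codewords(huffman(freq))
print(codes)
```

For these frequencies no two weights in the queue are ever equal, so the tree shape does not depend on how ties are broken.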
16
Time complexity
Lines 4-8 are executed n-1 times.
Each heap operation in Lines 4-8 takes O(lg n) time.
Total time required is O(n lg n).
Note: The details of heap operations will not be tested. The time complexity O(n lg n) should be remembered.
17
A Complete Example
(An Introduction to Huffman Coding, March 21, 2000, Mike Scott)
Scan the original text:
Eerie eyes seen near lake.
What characters are present?
E e r i space y s n a l k .
18
Building a Tree
Scan the original text:
Eerie eyes seen near lake.
What is the frequency of each character in the text?
Char   Freq.   Char   Freq.   Char   Freq.
E      1       y      1       k      1
e      8       s      2       .      1
r      2       n      2
i      1       a      2
space  4       l      1
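The frequency scan can be reproduced with `collections.Counter`. In this sketch the variable names are illustrative; the sort is stable, so characters with equal counts stay in order of first appearance in the text.

```python
from collections import Counter

text = "Eerie eyes seen near lake."
freq = Counter(text)

# Order the characters by increasing frequency, as they would enter the queue.
queue = sorted(freq.items(), key=lambda item: item[1])
for ch, n in queue:
    print(repr(ch), n)
```

The text has 26 characters in total, 12 of them distinct, and uppercase E is counted separately from lowercase e.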
19
Building a Tree
[Animation, slides 19-42: the priority queue and the partial trees after each step.]
The queue after inserting all nodes, ordered by frequency:
E:1 i:1 y:1 l:1 k:1 .:1 r:2 s:2 n:2 a:2 sp:4 e:8
At each step, dequeue the two lowest-weight nodes, make them the children of a new node whose weight is their sum, and enqueue the new node:
1. E:1 + i:1 -> 2
2. y:1 + l:1 -> 2
3. k:1 + .:1 -> 2
4. r:2 + s:2 -> 4
5. n:2 + a:2 -> 4
6. 2 (E, i) + 2 (y, l) -> 4
7. 2 (k, .) + sp:4 -> 6
8. 4 (r, s) + 4 (n, a) -> 8
9. 4 (E, i, y, l) + 6 -> 10
10. e:8 + 8 -> 16
11. 10 + 16 -> 26
What is happening to the characters with a low number of occurrences?
After enqueueing the node of weight 26 there is only one node left in the priority queue.
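The weights created while building this tree also give the length of the encoded text: each merge adds one bit to every character beneath the new node, so the total number of bits equals the sum of the internal-node weights. The sketch below tracks only the weights, not the tree itself, and uses `heapq` as a stand-in for the priority queue in the slides.

```python
import heapq
from collections import Counter
from math import ceil, log2

text = "Eerie eyes seen near lake."
counts = Counter(text)
heap = list(counts.values())
heapq.heapify(heap)

merged = []                          # weight of each new internal node, in creation order
while len(heap) > 1:
    w = heapq.heappop(heap) + heapq.heappop(heap)
    merged.append(w)
    heapq.heappush(heap, w)

huffman_bits = sum(merged)           # total length of the Huffman-encoded text
fixed_bits = ceil(log2(len(counts))) * len(text)   # fixed-length code for comparison
print(merged, huffman_bits, fixed_bits)
```

Ties between equal weights may be broken differently than in the slides, but the total encoded length of an optimal code does not depend on that choice.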
43
Using heap:
(CS3335 Design and Analysis of Algorithms/WANG Lusheng)
[Animation, slides 43-76: the min-heap while building the tree for f:5, e:9, c:12, b:13, d:16, a:45. Each node is drawn as a record with fields P (parent), L (left child), R (right child), the letter, and the key.]
Internal nodes are created in this order: g = f + e (key 14), h = c + b (key 25), i = g + d (key 30), j = h + i (key 55), k = a + j (key 100).
Final records, where - means the link is unset:
P    L    R    letter  key
-    a    j    k       100
k    -    -    a       45
k    h    i    j       55
j    c    b    h       25
j    g    d    i       30
h    -    -    c       12
h    -    -    b       13
i    f    e    g       14
i    -    -    d       16
g    -    -    f       5
g    -    -    e       9
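The node records in this animation (parent, left, right, letter, key) can be sketched in Python as below. This is only an illustration of the record layout: it uses the standard `heapq` module in place of the course's own heap class, and the names `Node`, `build` and `codeword` are hypothetical. With parent links in place, a character's codeword is read by climbing from its leaf to the root.

```python
import heapq
from dataclasses import dataclass
from typing import Optional

@dataclass(eq=False)                 # eq=False: nodes are compared by identity only
class Node:
    key: int
    letter: str = ""
    parent: Optional["Node"] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def build(freq):
    leaves = {ch: Node(f, ch) for ch, f in freq.items()}
    # The serial number breaks ties so the heap never compares Node objects.
    heap = [(n.key, i, n) for i, n in enumerate(leaves.values())]
    heapq.heapify(heap)
    serial = len(heap)
    while len(heap) > 1:
        _, _, x = heapq.heappop(heap)
        _, _, y = heapq.heappop(heap)
        z = Node(x.key + y.key, left=x, right=y)
        x.parent = y.parent = z      # fill in the P field of both children
        heapq.heappush(heap, (z.key, serial, z))
        serial += 1
    return leaves, heap[0][2]

def codeword(leaf):
    bits, node = [], leaf
    while node.parent is not None:   # climb to the root, collecting branch labels
        bits.append("0" if node.parent.left is node else "1")
        node = node.parent
    return "".join(reversed(bits))

leaves, root = build({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5})
print({ch: codeword(leaf) for ch, leaf in leaves.items()})
```

Storing the parent link costs one extra field per node but lets codewords be produced without a separate tree traversal.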
77
Exercise
Modify MyHeap.java in Tutorial 6’s folder so that the class ArrayNode has five data fields: int key; char letter; ArrayNode parent; ArrayNode left; ArrayNode right; and use the modified MyHeap to construct a Huffman code tree. The program should read n pairs (ai, bi) from the keyboard, where ai is the number of times that character/letter bi appears, and construct the Huffman code tree for the n pairs.