Huffman Coding Dr. Ying Lu RAIK 283 Data Structures & Algorithms
Giving credit where credit is due: Most of the slides for this lecture are based on slides created by Dr. Richard Anderson, University of Washington. I have modified them and added new slides.
Coding theory: ASCII coding; conversion, encryption, compression; binary coding. Alphabet: A B C D E F. For fixed-length binary coding of a 6-character alphabet, how many bits are needed?
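The question above can be checked with a short calculation: a fixed-length code needs the smallest b with 2^b at least the alphabet size. The sketch below is only an illustration; the function name `fixed_length_bits` is made up here and does not come from the slides.

```python
def fixed_length_bits(alphabet_size: int) -> int:
    """Smallest number of bits b such that 2**b >= alphabet_size."""
    if alphabet_size < 1:
        raise ValueError("alphabet must contain at least one character")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floating point.
    # A one-character alphabet still gets 1 bit so every codeword is non-empty.
    return max(1, (alphabet_size - 1).bit_length())


bits_for_six = fixed_length_bits(6)  # 3, since 2**2 = 4 < 6 <= 8 = 2**3
```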
Coding theory (cont.): ASCII coding; conversion, encryption, compression; binary coding. A 000, B 001, C 010, D 011, E 100, F 101
Coding theory (cont.): ASCII coding; conversion, encryption, compression; binary coding; variable-length coding. Symbols A B C D E F with their probabilities. Average bits/character = ? Compression ratio = ?
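A minimal sketch of how the two quantities asked for above could be computed. The code table and probabilities below are hypothetical values chosen for illustration, not the ones from the slide. The compression ratio is taken here to mean the fraction of bits saved relative to a fixed-length code, which is one common definition and an assumption on my part.

```python
def average_bits(code, prob):
    """Expected codeword length: sum over symbols of p(symbol) * len(codeword)."""
    return sum(prob[s] * len(code[s]) for s in code)


def compression_ratio(avg_bits, fixed_bits):
    """Fraction of bits saved relative to a fixed-length code (assumed definition)."""
    return (fixed_bits - avg_bits) / fixed_bits


# Hypothetical variable-length code and probabilities, for illustration only.
code = {"A": "00", "B": "010", "C": "0110", "D": "0111", "E": "10", "F": "11"}
prob = {"A": 0.25, "B": 0.125, "C": 0.0625, "D": 0.0625, "E": 0.25, "F": 0.25}

avg = average_bits(code, prob)     # 2.375 bits per character
ratio = compression_ratio(avg, 3)  # about 0.208 compared with 3-bit fixed-length codes
```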
Decode the following. First code: E 0, T 11, N 100, I 1010, S. Second code: E 0, T 10, N 100, I 0111, S.
Prefix(-free) codes: no codeword is a prefix of another codeword; uniquely decodable. A 001, B, C, D, E, F
Prefix codes and binary trees: tree representation of prefix codes. A 00, B 010, C 0110, D 0111, E 10, F 11. [Figure: binary tree with leaves A, B, C, D, E, F]
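To illustrate why the tree view is useful, here is a small decoding sketch that uses the code table from this slide. The tree is stored as nested dictionaries keyed by bit; that representation and the names `build_tree` and `decode` are assumptions made for this example. Symbols are assumed to be single-character strings.

```python
def build_tree(code):
    """Build a binary tree (nested dicts keyed by '0'/'1') from a prefix code."""
    root = {}
    for symbol, word in code.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})  # create internal nodes as needed
        node[word[-1]] = symbol              # the last bit leads to a leaf
    return root


def decode(bits, code):
    """Walk the tree bit by bit; emit a symbol and restart at the root at each leaf."""
    root = build_tree(code)
    node, out = root, []
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):  # reached a leaf
            out.append(node)
            node = root
    if node is not root:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(out)


code = {"A": "00", "B": "010", "C": "0110", "D": "0111", "E": "10", "F": "11"}
message = decode("1100011010", code)  # "FACE"
```

Because the code is prefix-free, the decoder never has to look ahead or backtrack: each leaf marks the end of exactly one codeword.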
Minimum length code: A 1/4, B 1/8, C 1/16, D, E 1/2 (probability). How should the characters be coded so that the average bits/character is minimized?
Minimum length code (cont.): Huffman tree: the prefix-code tree with minimum weighted path length. C(T): weighted path length of tree T.
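The formula for C(T) is not spelled out in the text above. The usual definition, given here as an assumption about what the slide intended, sums each leaf's weight times its depth; the symbols w_i (weight of character i), d_T(i) (depth of its leaf in T), and n (alphabet size) are notation introduced for this sketch.

```latex
C(T) = \sum_{i=1}^{n} w_i \, d_T(i)
```

When the weights are probabilities, d_T(i) is the codeword length of character i, so C(T) is the average number of bits per character.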
Huffman code algorithm: derivation. The two rarest items will have the longest codewords. The codewords for the two rarest items differ only in the last bit. Idea: suppose the weights are w_1, w_2, ..., w_n, with w_1 and w_2 the smallest weights. Start with an optimal code for w_3, ..., w_n and w_1 + w_2. Extend the codeword for w_1 + w_2 by one bit to get codewords for w_1 and w_2.
Huffman code
  H = new minHeap()
  for each w_i
    T = new Tree(w_i)
    H.Insert(T)
  while H.Size() > 1
    T_1 = H.DeleteMin()
    T_2 = H.DeleteMin()
    T_3 = Merge(T_1, T_2)
    H.Insert(T_3)
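The pseudocode above can be sketched in Python with the standard `heapq` module standing in for the min-heap. This is one possible rendering, not the lecture's own implementation: trees are represented as nested tuples, a counter breaks ties so that trees are never compared directly, and symbols are assumed to be strings. The weights in the usage lines are illustrative.

```python
import heapq
from itertools import count


def huffman_code(weights):
    """weights: dict mapping symbol -> weight. Returns dict symbol -> codeword."""
    if not weights:
        return {}
    tiebreak = count()  # unique sequence numbers so equal weights never compare trees
    heap = [(w, next(tiebreak), sym) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)  # T_1 = H.DeleteMin()
        w2, _, t2 = heapq.heappop(heap)  # T_2 = H.DeleteMin()
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (t1, t2)))  # Merge and Insert
    tree = heap[0][2]

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: a symbol
            codes[node] = prefix or "0"  # a lone symbol still gets one bit

    walk(tree, "")
    return codes


# Illustrative weights only.
weights = {"A": 0.25, "B": 0.125, "C": 0.0625, "D": 0.0625, "E": 0.5}
codes = huffman_code(weights)
lengths = {s: len(c) for s, c in codes.items()}  # {'A': 2, 'B': 3, 'C': 4, 'D': 4, 'E': 1}
```

With a binary heap each DeleteMin and Insert costs O(log n), so the whole construction runs in O(n log n) time.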
Example: character A B C D _; probability
In-class exercises: P332, Exercises 9.4.1
In-class exercises: (1) What is the maximal length of a codeword possible in a Huffman encoding of an alphabet of n characters? (2) Show that a Huffman tree can be constructed in linear time if the alphabet’s characters are given in a sorted order of their frequencies.
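For the second exercise, one well-known approach is the two-queue technique, sketched below under the assumption that the input is already sorted by nondecreasing weight. It is offered as a hint rather than as the intended solution; the function name and the tuple-based tree representation are made up for this example. The key observation is that merged subtrees are created in nondecreasing weight order, so the smallest remaining tree is always at the front of one of two queues.

```python
from collections import deque


def huffman_tree_sorted(items):
    """items: list of (weight, symbol) sorted by nondecreasing weight.

    Returns (total_weight, tree), where tree is a symbol or a (left, right) tuple.
    """
    if not items:
        raise ValueError("need at least one symbol")
    leaves = deque(items)  # original leaves, already sorted
    merged = deque()       # merged subtrees, appended in nondecreasing weight order

    def pop_smallest():
        # Compare the two queue fronts; prefer a leaf on ties.
        if not merged or (leaves and leaves[0][0] <= merged[0][0]):
            return leaves.popleft()
        return merged.popleft()

    while len(leaves) + len(merged) > 1:
        w1, t1 = pop_smallest()
        w2, t2 = pop_smallest()
        merged.append((w1 + w2, (t1, t2)))
    return pop_smallest()


total, tree = huffman_tree_sorted([(1, "C"), (1, "D"), (2, "B"), (4, "A"), (8, "E")])
```

Each of the n - 1 merges does a constant number of queue operations, so the loop takes O(n) time once the input is sorted.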