Huffman Coding: The Most for the Least
Design Goals
– Encode messages parsimoniously
– No character code may be the prefix of another (the prefix-free property; see the decoding sketch below)
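To see why the prefix-free property matters, here is a minimal decoding sketch. The codebook is one possible assignment (it matches the codes derived at the end of these slides); with no codeword a prefix of another, a bit stream decodes greedily, without delimiters:

```python
# One possible prefix-free codebook for the sample alphabet.
CODE = {"A": "011", "B": "10", "C": "11", "D": "00", "E": "010"}
DECODE = {bits: sym for sym, bits in CODE.items()}

def decode(bitstream: str) -> str:
    """Greedy decoding: accumulate bits until they match a codeword.
    Unambiguous only because no codeword is a prefix of another."""
    out, buf = [], ""
    for bit in bitstream:
        buf += bit
        if buf in DECODE:          # a complete codeword has been read
            out.append(DECODE[buf])
            buf = ""
    return "".join(out)

assert decode("011" + "10" + "00") == "ABD"
```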
Requirements
– Message statistics (symbol frequencies)
– Data structures to create and store the new codes
Conventional Encoding Schemes
– Fixed-length codes, e.g., Unicode, ASCII, RGB
– Sample data: the symbol frequencies below
– The optimal average code length (in bits per symbol) is given by the entropy E = -Σ Freq · log2(Freq)

Symbol  Freq  log2(Freq)  -Freq·log2(Freq)
A       0.13  -2.94       0.38
B       0.27  -1.89       0.51
C       0.35  -1.51       0.53
D       0.17  -2.56       0.43
E       0.08  -3.64       0.29
Entropy => E ≈ 2.15 bits/symbol
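A quick check of the entropy figure (a minimal sketch using the sample frequencies above):

```python
import math

freq = {"A": 0.13, "B": 0.27, "C": 0.35, "D": 0.17, "E": 0.08}

# Shannon entropy: the optimal average code length in bits per symbol.
entropy = -sum(p * math.log2(p) for p in freq.values())
print(f"Entropy = {entropy:.2f} bits/symbol")   # -> 2.15
```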
Huffman Algorithm
While two or more trees remain in the Forest:
– Find the two least-weight trees i and j
– Create a code-tree node whose left child is the root of tree i and whose right child is the root of tree j
– Join the two trees under the new node and let the result replace tree i
– Delete tree j
(a runnable sketch follows)
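A minimal runnable sketch of the loop above, using Python's heapq as the Forest so the two least-weight trees can be popped directly. The nested-tuple node layout is an assumption for brevity, not the slides' array representation:

```python
import heapq
from itertools import count

def huffman(freq):
    """Build a Huffman code tree; returns the root as nested tuples.
    Leaves are symbols; internal nodes are (left, right) pairs."""
    tie = count()  # tie-breaker so the heap never compares trees directly
    forest = [(p, next(tie), sym) for sym, p in freq.items()]
    heapq.heapify(forest)
    while len(forest) >= 2:               # two or more trees in the Forest
        wi, _, i = heapq.heappop(forest)  # least-weight tree i
        wj, _, j = heapq.heappop(forest)  # next least-weight tree j
        # New node: left child = root of i, right child = root of j;
        # pushing it back replaces tree i, and tree j is gone.
        heapq.heappush(forest, (wi + wj, next(tie), (i, j)))
    return forest[0][2]

root = huffman({"A": 0.13, "B": 0.27, "C": 0.35, "D": 0.17, "E": 0.08})
print(root)   # (('D', ('E', 'A')), ('B', 'C'))
```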
Graphical Solution (Step 1)
Forest: E 0.08, A 0.13, D 0.17, B 0.27, C 0.35. Merge the two lightest trees, E (0.08) and A (0.13), into a subtree of weight 0.21.

Graphical Solution (Step 2)
Forest: D 0.17, (E,A) 0.21, B 0.27, C 0.35. Merge D (0.17) with the (E,A) subtree into a subtree of weight 0.38.

Graphical Solution (Step 3)
Forest: B 0.27, C 0.35, (D,(E,A)) 0.38. Merge B (0.27) and C (0.35) into a subtree of weight 0.62.

Graphical Solution (Step 4)
Forest: (D,(E,A)) 0.38, (B,C) 0.62. Merge the two remaining trees into the finished code tree of weight 1.00.
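The same four merges can be traced in code (a sketch that instruments the merge loop to print the Forest after each step; the output shown in comments mirrors the graphical solution):

```python
import heapq
from itertools import count

def trace_huffman(freq):
    """Run the merge loop, printing the Forest after every merge."""
    tie = count()                         # tie-breaker for equal weights
    forest = [(p, next(tie), sym) for sym, p in freq.items()]
    heapq.heapify(forest)
    step = 0
    while len(forest) >= 2:
        wi, _, i = heapq.heappop(forest)  # two least-weight trees
        wj, _, j = heapq.heappop(forest)
        heapq.heappush(forest, (round(wi + wj, 2), next(tie), (i, j)))
        step += 1
        print(f"Step {step}:", [(w, t) for w, _, t in sorted(forest)])

trace_huffman({"A": 0.13, "B": 0.27, "C": 0.35, "D": 0.17, "E": 0.08})
# Step 1: [(0.17, 'D'), (0.21, ('E', 'A')), (0.27, 'B'), (0.35, 'C')]
# Step 2: [(0.27, 'B'), (0.35, 'C'), (0.38, ('D', ('E', 'A')))]
# Step 3: [(0.38, ('D', ('E', 'A'))), (0.62, ('B', 'C'))]
# Step 4: [(1.0, (('D', ('E', 'A')), ('B', 'C')))]
```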
Interpreting the Code
Read each codeword off the finished tree: 0 for a left branch, 1 for a right branch.

Symbol  Code  Length  Freq  Length·Freq
A       011   3       0.13  0.39
B       10    2       0.27  0.54
C       11    2       0.35  0.70
D       00    2       0.17  0.34
E       010   3       0.08  0.24
Avg. code length: 2.21 bits/symbol (close to the 2.15-bit entropy bound)
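The table can be reproduced by walking the tree returned by huffman() above, emitting 0 on each left branch and 1 on each right branch (a sketch; the 0-left/1-right labeling is a convention, not mandated by the algorithm):

```python
def codewords(tree, prefix=""):
    """Walk the code tree: 0 = left branch, 1 = right branch."""
    if isinstance(tree, str):          # leaf: a symbol
        return {tree: prefix}
    left, right = tree
    return {**codewords(left, prefix + "0"),
            **codewords(right, prefix + "1")}

freq = {"A": 0.13, "B": 0.27, "C": 0.35, "D": 0.17, "E": 0.08}
codes = codewords(huffman(freq))
print(codes)   # {'D': '00', 'E': '010', 'A': '011', 'B': '10', 'C': '11'}

avg = sum(len(codes[s]) * p for s, p in freq.items())
print(f"Average length = {avg:.2f} bits/symbol")   # -> 2.21
```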
Data Structures
Three array-based structures, each with its own cursor, drive the construction:
– Forest: one entry per active tree, holding its weight and the index of its Root node in the Code Tree
– Source: one entry per symbol (A–E), holding the Symbol, its Probability, and the index of its Leaf in the Code Tree
– Code Tree: one entry per node, holding Left Child, Right Child, and Root pointers; codewords can be read by tracing a leaf back up to the root
(a sketch of this layout follows)
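A minimal sketch of this array layout (field names follow the slides; treating the Code Tree's Root column as a parent pointer is an assumption):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ForestEntry:      # one per active tree
    weight: float
    root: int           # index of the tree's root in the code tree

@dataclass
class SourceEntry:      # one per input symbol
    symbol: str
    probability: float
    leaf: int           # index of the symbol's leaf in the code tree

@dataclass
class TreeNode:         # one per code-tree node
    left: int = -1      # index of left child (-1 = none)
    right: int = -1     # index of right child (-1 = none)
    root: int = -1      # parent pointer; -1 marks the overall root

forest: List[ForestEntry] = []
source: List[SourceEntry] = []
tree:   List[TreeNode] = []

# Initialization: one leaf and one single-node tree per symbol.
for sym, p in {"A": 0.13, "B": 0.27, "C": 0.35, "D": 0.17, "E": 0.08}.items():
    tree.append(TreeNode())
    leaf = len(tree) - 1
    source.append(SourceEntry(sym, p, leaf))
    forest.append(ForestEntry(p, leaf))
```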
Steps 1–5
(Figures: successive snapshots of the Forest, Source, and Code Tree arrays, from the initial five single-node trees through each merge to the final single tree of weight 1.00, mirroring the graphical solution above.)
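Continuing the array sketch above, one merge step might look like the following; this is an assumed reconstruction of the slides' procedure (including the parent-pointer reading of the Root column), not their exact code:

```python
def merge_step(forest, tree):
    """One algorithm step on the array representation:
    join the two least-weight Forest entries into a new code-tree node."""
    forest.sort(key=lambda e: e.weight)        # least-weight trees first
    i, j = forest[0], forest[1]
    tree.append(TreeNode(left=i.root, right=j.root))
    parent = len(tree) - 1
    tree[i.root].root = parent                 # parent pointers let us
    tree[j.root].root = parent                 # trace codes leaf -> root
    forest[0] = ForestEntry(i.weight + j.weight, parent)  # replace tree i
    del forest[1]                              # delete tree j

while len(forest) >= 2:
    merge_step(forest, tree)

def codeword(sym_entry, tree):
    """Trace a symbol's leaf up to the root, then reverse the bits."""
    bits, node = [], sym_entry.leaf
    while tree[node].root != -1:
        parent = tree[node].root
        bits.append("0" if tree[parent].left == node else "1")
        node = parent
    return "".join(reversed(bits))

print(codeword(source[0], tree))   # A -> '011'
```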