Huffman Coding and Decoding
TAIABUL HAQUE, NAEEMUL HASSAN
Huffman Encoding

An encoding algorithm used for lossless data compression:
- Variable-length code
- Prefix code

Basic intuition:
- Symbols that are more frequent should have shorter codes.
- A special kind of tree, called a Huffman tree, is built by exploiting this property.
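The prefix property named above means that no code word is the beginning of another, which is what lets a decoder split a bit string without separators. A minimal sketch of a check for it follows; the function name `is_prefix_free` and the two sample code tables are hypothetical and only serve as illustration.

```python
def is_prefix_free(codes):
    """Return True if no code word is a prefix of another code word."""
    # After lexicographic sorting, any word that is a prefix of another word
    # is also a prefix of its immediate successor, so adjacent pairs suffice.
    words = sorted(codes)
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


# Hypothetical code tables used only for illustration.
print(is_prefix_free(["0", "10", "11"]))  # True
print(is_prefix_free(["0", "01", "11"]))  # False: "0" is a prefix of "01"
```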
Huffman Tree Creation

Character  Frequency
A          29
E          23
I          25
O          14
U          7

Merge steps (the two lowest weights are combined each time):
1. U(7) + O(14) = 21
2. 21 + E(23) = 44
3. I(25) + A(29) = 54
4. 44 + 54 = 98 (root)
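The tree construction for the vowel table can be reproduced with a priority queue. The sketch below uses Python's `heapq`, which is an assumption of this example rather than something the slides specify. The function name `huffman_code_lengths` is hypothetical. It returns the depth of each symbol (its code length) and the weights of the internal nodes in the order they are created.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return (code length per symbol, internal-node weights in creation order)."""
    tie = count()  # unique tie-breaker so symbol lists are never compared
    heap = [(weight, next(tie), [sym]) for sym, weight in freqs.items()]
    heapq.heapify(heap)
    depth = {sym: 0 for sym in freqs}
    merges = []
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)  # lowest weight
        w2, _, syms2 = heapq.heappop(heap)  # second-lowest weight
        for sym in syms1 + syms2:
            depth[sym] += 1  # every symbol under the new node moves one level down
        merges.append(w1 + w2)
        heapq.heappush(heap, (w1 + w2, next(tie), syms1 + syms2))
    return depth, merges


freqs = {"A": 29, "E": 23, "I": 25, "O": 14, "U": 7}
depth, merges = huffman_code_lengths(freqs)
print(merges)  # [21, 44, 54, 98]
print(depth)   # {'A': 2, 'E': 2, 'I': 2, 'O': 3, 'U': 3}
```

The two rarest symbols, U and O, end up with 3-bit codes, while A, E and I get 2-bit codes.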
Start
1. Accept training data
2. Scan data, keep tally
3. Make prioritized list
4. Create and draw tree
5. Traverse tree
6. Determine code words
7. Save code words
8. Accept test sentence
9. Encode with lookup
10. Display encoded string
11. Decode with traversal
12. Display decoded string
13. Calculate compression ratio
End
Training data: "analysis of algorithm"
Tally so far: a(3)
Test sentence: "algo"
Uncompressed size = 4 characters * 8 bits = 32 bits
Encoded size = 14 bits
Compression ratio = 14/32 * 100 = 43.75%
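The ratio above is the encoded size expressed as a percentage of the fixed-width size. A small sketch of that arithmetic follows. The function name `compression_ratio` and the default of 8 bits per character are assumptions taken from the figures on this slide.

```python
def compression_ratio(encoded_bits, n_chars, bits_per_char=8):
    """Encoded size as a percentage of the fixed-width (uncompressed) size."""
    return encoded_bits / (n_chars * bits_per_char) * 100


print(compression_ratio(14, len("algo")))  # 43.75
```

Under this definition a smaller value means better compression.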
Training-data counts for the letters of "algo": a(3), l(2), g(1), o(2)
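The flowchart steps (tally, build tree, determine code words, encode with lookup, decode with traversal) can be sketched end to end on the training data "analysis of algorithm" and the test sentence "algo". This is a minimal sketch, not the presenters' implementation. The helper names `build_tree`, `code_table`, `encode` and `decode` are hypothetical, and ties between equal weights are broken by first appearance, which is an assumption. It also assumes at least two distinct symbols in the training data.

```python
import heapq
from collections import Counter
from itertools import count


def build_tree(text):
    """Build a Huffman tree: leaves are characters, internal nodes are (left, right)."""
    tie = count()  # unique tie-breaker so tree nodes are never compared
    heap = [(w, next(tie), sym) for sym, w in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (left, right)))
    return heap[0][2]


def code_table(node, prefix=""):
    """Traverse the tree: append '0' going left and '1' going right."""
    if isinstance(node, str):
        return {node: prefix}
    left, right = node
    table = code_table(left, prefix + "0")
    table.update(code_table(right, prefix + "1"))
    return table


def encode(text, table):
    """Encode by looking up each character's code word."""
    return "".join(table[ch] for ch in text)


def decode(bits, tree):
    """Decode by walking from the root and restarting at each leaf."""
    out, node = [], tree
    for bit in bits:
        node = node[0] if bit == "0" else node[1]
        if isinstance(node, str):
            out.append(node)
            node = tree
    return "".join(out)


tree = build_tree("analysis of algorithm")
table = code_table(tree)
bits = encode("algo", table)
print(bits, len(bits))
print(decode(bits, tree))  # algo
```

With this sketch's tie-breaking, the encoded "algo" comes out at 14 bits, the same figure quoted in the compression-ratio calculation. A different tie-breaking rule can produce different code words and a different length.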
Frequency Analysis

E T A O I N S H R D L U is the approximate order of frequency of the twelve most commonly used letters in the English language.

Our observation:

File Size  Order of letters
338        E T I A O N S R C H L D
65         E T N O I A R S C L H D
70         E A O T R S I N L H D C
8          E O T A N R I S L H U D
677        E T A O N I R S H L D C
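A letter ordering like the rows above can be obtained by counting letters and sorting by count. The sketch below shows one way to do that. The function name `letter_order` is hypothetical, only the letters A to Z are counted, and ties keep first-appearance order, all of which are assumptions of this example and not details given on the slide.

```python
from collections import Counter


def letter_order(text, top=12):
    """Return the `top` most frequent letters of `text`, most frequent first."""
    counts = Counter(ch for ch in text.upper() if "A" <= ch <= "Z")
    return [letter for letter, _ in counts.most_common(top)]


# Hypothetical input used only for illustration.
print(" ".join(letter_order("aaabbc")))  # A B C
```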
THANK YOU