Abdullah Aldahami (11074595), April 6, 2010


1 Abdullah Aldahami (11074595), April 6, 2010

2
• Huffman coding is a simple algorithm that generates a set of variable-length codes with the minimum average length.
• Huffman codes are part of several data formats, such as ZIP, MPEG, and JPEG.
• The code is generated from the estimated probability of occurrence of each symbol.
• Huffman coding works by building an optimal binary tree of nodes, which can be stored in a regular array.

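3
• The method starts by building a list of all the alphabet symbols in descending order of their probabilities (frequencies of appearance).
• It then constructs a tree from the bottom up.
• Step by step, the two nodes with the smallest probabilities are selected and joined under a new parent node whose probability is their sum; the parent replaces them in the list.
• When the tree is complete, the codes of the symbols are assigned by labelling the two branches of each node 0 and 1 and reading the labels from the root down to each leaf.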
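The steps above can be sketched in Python as follows. This is a minimal illustration, not the presenter's implementation: the function name `huffman_code`, the use of a binary heap in place of a sorted list, and the tie-breaking counter are all assumptions made for the sketch.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(freqs):
    """Return {symbol: bit string} for a {symbol: weight} mapping."""
    if not freqs:
        return {}
    if len(freqs) == 1:
        # A lone symbol still needs one bit to be transmitted.
        return {next(iter(freqs)): "0"}

    tiebreak = count()  # unique counter so heap entries never compare nodes
    heap = [(w, next(tiebreak), ("leaf", s)) for s, w in freqs.items()]
    heapq.heapify(heap)

    # Repeatedly join the two lightest nodes under a new parent.
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), ("node", a, b)))

    # Walk the finished tree: left branch appends 0, right branch appends 1.
    codes = {}
    stack = [(heap[0][2], "")]
    while stack:
        node, prefix = stack.pop()
        if node[0] == "leaf":
            codes[node[1]] = prefix
        else:
            stack.append((node[1], prefix + "0"))
            stack.append((node[2], prefix + "1"))
    return codes


# Usage on the presentation's 40-character example string.
example_freqs = Counter("circuit elements in digital computations")
example_codes = huffman_code(example_freqs)
total_bits = sum(example_freqs[s] * len(example_codes[s]) for s in example_freqs)
print(total_bits)
```

The individual codewords depend on how ties are broken, so they need not match the table in the presentation. On this input the sketch yields 153 bits in total, one fewer than the 154 bits of the presentation's table, which suggests that the hand-built tree departs from the strict smallest-two rule at some step.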

4
• Example: circuit elements in digital computations
• The sum of the frequencies (number of events) is 40.

Character  Frequency
i          6
t          5
space      4
c          3
e          3
n          3
u          2
l          2
m          2
s          2
a          2
o          2
r          1
d          1
g          1
p          1

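5
• Example: circuit elements in digital computations
[Figure: Huffman tree for the example. The leaves are the 16 characters with their frequencies, the internal nodes carry the summed frequencies up to 40 at the root, and the two branches under each node are labelled 0 and 1.]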

6
• So, the code will be generated as follows:

Character  Frequency  Code    Length  Total length
i          6          010     3       18
t          5          000     3       15
space      4          110     3       12
c          3          0010    4       12
e          3          0110    4       12
n          3          101     3       9
u          2          00110   5       10
l          2          01110   5       10
m          2          01111   5       10
s          2          1001    4       8
a          2          1110    4       8
o          2          1111    4       8
r          1          001110  6       6
d          1          001111  6       6
g          1          10000   5       5
p          1          10001   5       5

• The total is 154 bits with Huffman coding, compared to 240 bits with no compression.
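As a quick check of the table, the following sketch encodes and decodes the example string with these codewords. It assumes the codeword for n is 101, so that no codeword is a prefix of another (with 100, n would be a prefix of s, g, and p and the bit stream could not be decoded unambiguously). The helper names `is_prefix_free`, `encode`, and `decode` are hypothetical.

```python
# Code table from the presentation; "n" is taken as 101 (assumption, see above).
CODES = {
    "i": "010", "t": "000", " ": "110", "c": "0010", "e": "0110", "n": "101",
    "u": "00110", "l": "01110", "m": "01111", "s": "1001", "a": "1110",
    "o": "1111", "r": "001110", "d": "001111", "g": "10000", "p": "10001",
}


def is_prefix_free(codes):
    """True if no codeword is a prefix of another codeword."""
    words = sorted(codes.values())
    # After sorting, a prefix is always immediately followed by a word it prefixes.
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))


def encode(text, codes):
    """Concatenate the codewords of all characters in text."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Read bits left to right, emitting a symbol whenever a codeword completes."""
    inverse = {word: sym for sym, word in codes.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    if buf:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)


message = "circuit elements in digital computations"
bits = encode(message, CODES)
print(len(bits), decode(bits, CODES) == message)
```

Because the code is prefix-free, the decoder needs no separators between codewords: the first codeword that matches is the only one that can match.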

7
Columns: input symbol; probability P(x); output code; code length L_i in bits; weighted path length L_i × P(x); optimality, given as the probability budget 2^(-L_i); information of a message I(x) = -log2 P(x); entropy contribution H(x) = -P(x) log2 P(x).

Symbol  P(x)   Code   L_i  L_i × P(x)  2^(-L_i)  I(x)  H(x)
i       0.15   010    3    0.45        1/8       2.74  0.411
t       0.125  000    3    0.375       1/8       3.00  0.375
space   0.1    110    3    0.3         1/8       3.32  0.332
c       0.075  0010   4    0.3         1/16      3.74  0.280
e       0.075  0110   4    0.3         1/16      3.74  0.280
n       0.075  101    3    0.225       1/8       3.74  0.280
u       0.05   00110  5    0.25        1/32      4.32  0.216
l       0.05   01110  5    0.25        1/32      4.32  0.216

Entropy is a measure defined in information theory that quantifies the information content of an information source. It indicates how well a data compression process can succeed: for a memoryless source, no lossless code can use fewer bits per symbol on average than the entropy.

8
Symbol  P(x)   Code    L_i  L_i × P(x)  2^(-L_i)  I(x)  H(x)
m       0.05   01111   5    0.25        1/32      4.32  0.216
s       0.05   1001    4    0.2         1/16      4.32  0.216
a       0.05   1110    4    0.2         1/16      4.32  0.216
o       0.05   1111    4    0.2         1/16      4.32  0.216
r       0.025  001110  6    0.15        1/64      5.32  0.133
d       0.025  001111  6    0.15        1/64      5.32  0.133
g       0.025  10000   5    0.125       1/32      5.32  0.133
p       0.025  10001   5    0.125       1/32      5.32  0.133
Sum     = 1                 3.85        = 1             3.787 bit/symbol

The sums run over all 16 symbols. For any prefix code, the sum of the probability budgets across all symbols is less than or equal to one (the Kraft inequality). In this example the sum is equal to one, so the code is termed a complete code. The code reaches 98.36% of the optimum: (3.787 / 3.85) × 100.
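The sums in the table can be recomputed with a short script. This is a sketch under the assumption that the code lengths are exactly those listed in the table; the function name `code_statistics` is hypothetical.

```python
import math
from collections import Counter

TEXT = "circuit elements in digital computations"

# Code lengths in bits, copied from the presentation's table.
LENGTHS = {
    "i": 3, "t": 3, " ": 3, "c": 4, "e": 4, "n": 3, "u": 5, "l": 5,
    "m": 5, "s": 4, "a": 4, "o": 4, "r": 6, "d": 6, "g": 5, "p": 5,
}


def code_statistics(freqs, lengths):
    """Return (average code length, entropy, Kraft sum) for a code."""
    total = sum(freqs.values())
    probs = {s: f / total for s, f in freqs.items()}
    # Weighted path length: sum of L_i * P(x).
    avg_len = sum(p * lengths[s] for s, p in probs.items())
    # Entropy: sum of -P(x) * log2 P(x), in bits per symbol.
    entropy = -sum(p * math.log2(p) for p in probs.values())
    # Probability budget: sum of 2^(-L_i); at most 1 for a prefix code.
    kraft = sum(2.0 ** -lengths[s] for s in probs)
    return avg_len, entropy, kraft


avg_len, entropy, kraft = code_statistics(Counter(TEXT), LENGTHS)
efficiency = entropy / avg_len
print(f"{avg_len:.3f} bits/symbol, entropy {entropy:.3f}, Kraft sum {kraft:.3f}")
print(f"efficiency {100 * efficiency:.1f}%")
```

The efficiency is computed here from the unrounded entropy, so its last digit can differ slightly from a figure derived from the rounded value 3.787.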

9
• Static probability distribution (static Huffman coding)
• Coding procedures with static Huffman codes operate with a predefined code tree, which is defined in advance for each type of data and is independent of the particular contents.
• The primary problem of a static, predefined code tree arises if the real probability distribution differs strongly from the assumed one. In this case the compression rate drops drastically.

10
• Adaptive probability distribution (adaptive Huffman coding)
• The adaptive coding procedure uses a code tree that is continuously adapted to the previously encoded or decoded data.
• It starts with an empty tree or a standard distribution.
• This variant needs very little header data, but the attainable compression rate is poor at the beginning of the coding process and for small files.
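The idea of adapting the tree on both sides can be illustrated with a deliberately naive sketch. It is not one of the incremental update algorithms used in practice: it simply rebuilds the whole tree from running counts before every symbol. It assumes the "standard distribution" start (every symbol of an alphabet agreed in advance, with at least two symbols, begins with a count of 1) and that the decoder is told the number of symbols; all function names are hypothetical.

```python
import heapq
from itertools import count


def build_codes(counts):
    """Deterministic Huffman code for a {symbol: count} mapping (2+ symbols)."""
    tick = count()
    # Sorting the symbols makes encoder and decoder build identical trees.
    heap = [(w, next(tick), ("leaf", s)) for s, w in sorted(counts.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tick), ("node", a, b)))
    codes, stack = {}, [(heap[0][2], "")]
    while stack:
        node, prefix = stack.pop()
        if node[0] == "leaf":
            codes[node[1]] = prefix
        else:
            stack.append((node[1], prefix + "0"))
            stack.append((node[2], prefix + "1"))
    return codes


def adaptive_encode(text, alphabet):
    """Encode each symbol with the code built from the data seen so far."""
    counts = {s: 1 for s in alphabet}  # assumed standard starting distribution
    out = []
    for ch in text:
        out.append(build_codes(counts)[ch])
        counts[ch] += 1  # adapt only after the symbol has been coded
    return "".join(out)


def adaptive_decode(bits, alphabet, n_symbols):
    """Mirror the encoder: same starting counts, same update after each symbol."""
    counts = {s: 1 for s in alphabet}
    out, pos = [], 0
    for _ in range(n_symbols):
        inverse = {word: sym for sym, word in build_codes(counts).items()}
        end = pos + 1
        while bits[pos:end] not in inverse:
            if end >= len(bits):
                raise ValueError("bit stream ended inside a codeword")
            end += 1
        sym = inverse[bits[pos:end]]
        out.append(sym)
        counts[sym] += 1
        pos = end
    return "".join(out)


message = "circuit elements in digital computations"
alphabet = sorted(set(message))
stream = adaptive_encode(message, alphabet)
print(len(stream), adaptive_decode(stream, alphabet, len(message)) == message)
```

No code table is transmitted here, which reflects the low header cost of the adaptive variant. The early symbols are coded with an uninformed tree, which is why short inputs compress poorly.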
