STATISTIC & INFORMATION THEORY (CSNB134)
MODULE 9 INFORMATION CONTENT, ENTROPY & CODING EFFICIENCY
Information Uncertainty
In Module 8, we learned the basic model of Information Theory: information is generated at the source, sent through a channel, and consumed at the drain (sink). In other words, information is transmitted from the sender to the receiver. Prior to the transmission, the receiver has no idea what the content of the information is. This implies the concept of information as a random variable, and of 'Information Uncertainty' (i.e. the receiver is uncertain of the content of the information until after he/she has received it over the transmission channel).
Information Uncertainty & Probability
Consider the following statements:
(a) Tomorrow, the sun will rise from the East.
(b) The phone will ring in the next 1 hour.
(c) It will snow in Malaysia this December.
Everybody knows the sun always rises in the East. The probability of this event is almost 1, so statement (a) hardly carries any information. The phone may or may not ring in the next 1 hour. The probability of this event is less than the probability of the event in statement (a), so statement (b) carries more information than statement (a). It has never snowed in Malaysia. The probability of this event is almost 0, so statement (c) carries the most information of all.
Information Uncertainty & Probability (cont.)
Therefore, we can conclude that the amount of information carried by each statement (or the information content of a single event) is inversely proportional to the probability of that event. The formula is:

I(x_i) = \log \frac{1}{P(x_i)} = -\log P(x_i)

The unit of I(x_i) is determined by the base of the logarithm. Since in the digital world the basic unit is the bit, the formula becomes:

I(x_i) = \log_2 \frac{1}{P(x_i)} = -\log_2 P(x_i) \quad \text{(bits)}
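A minimal sketch of this formula in Python; the function name `information_content` and the example probabilities are illustrative, not from the slides:

```python
import math

def information_content(p: float) -> float:
    """Information content I(x) = -log2 P(x), in bits."""
    if not 0 < p <= 1:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(p)

# A near-certain event carries almost no information.
print(information_content(0.999))  # ~0.0014 bits
# A rare event carries far more.
print(information_content(0.001))  # ~9.97 bits
```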
Entropy
Entropy is defined as the weighted sum of all information contents. Entropy is denoted H(X), and its formula in base 2 is:

H(X) = \sum_i P(x_i)\, I(x_i) = -\sum_i P(x_i) \log_2 P(x_i)

Note: it is weighted / normalized by multiplying each information content by the probability of the corresponding event.
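A corresponding sketch in Python (the helper name `entropy` and the example distributions are illustrative):

```python
import math

def entropy(probabilities) -> float:
    """Entropy H(X) = -sum P(x_i) * log2 P(x_i), in bits per symbol."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))  # 1.0   (fair coin)
print(entropy([0.9, 0.1]))  # ~0.469 (biased coin)
```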
Binary Entropy Function
Consider a discrete binary source that emits a sequence of statistically independent symbols. The output is a '0' with probability p, or a '1' with probability 1 - p. The entropy of the binary source is:

H_b(p) = -p \log_2 p - (1 - p) \log_2 (1 - p)

[Figure: plot of the binary entropy function H_b(p) against p; it is 0 at p = 0 and p = 1 and peaks at 1 bit at p = 0.5.]
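A small sketch to tabulate the binary entropy function (the sample values of p are arbitrary):

```python
import math

def binary_entropy(p: float) -> float:
    """H_b(p) = -p*log2(p) - (1-p)*log2(1-p), with H_b(0) = H_b(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    print(f"p = {p:.1f}  ->  H_b(p) = {binary_entropy(p):.3f} bits")
# The function is symmetric about p = 0.5, where it reaches its maximum of 1 bit.
```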
Exercise 1
Calculate the entropy of a binary source that has an equal probability of transmitting a '1' and a '0'. Equal probability of transmitting a '1' and a '0' means p = 0.5. Note: if your calculator does not have a log2 function, you may use the change-of-base formula: log_b x = ln x / ln b.
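A worked solution, for reference (not on the original slide):

\[ H(X) = -0.5\log_2 0.5 - 0.5\log_2 0.5 = 0.5 + 0.5 = 1 \ \text{bit per symbol} \]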
Exercise 2
Calculate the entropy of a binary source that only transmits '1' (i.e. transmits '1' all the time!). Always transmitting a '1' means its probability is 1 (and the probability of a '0' is 0).
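A worked solution, for reference (using the convention that 0 · log 0 = 0):

\[ H(X) = -1 \cdot \log_2 1 - 0 \cdot \log_2 0 = 0 \ \text{bits} \]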
Exercise 3
Calculate the entropy of a binary source that transmits a '1' after every series of nine '0's. Transmitting one '1' for every nine '0's means p = 0.1 and 1 - p = 0.9.
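A worked solution, for reference (treating the source as memoryless with these probabilities, as the slide does):

\[ H(X) = -0.1\log_2 0.1 - 0.9\log_2 0.9 \approx 0.332 + 0.137 \approx 0.469 \ \text{bits per symbol} \]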
Exercise 4
Assume a binary source is used to encode the outcome of casting a fair die. What is the entropy of the die in bits? A fair die has p = 1/6 for each of the 6 possible outcomes.
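A worked solution, for reference (not on the original slide):

\[ H(X) = \sum_{i=1}^{6} \frac{1}{6}\log_2 6 = \log_2 6 \approx 2.585 \ \text{bits} \]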
Recaps… Entropy is the weighted sum of all information contents.
In the example of the fair die whose outcome is represented in bits, the entropy is H(X) = log_2 6 ≈ 2.585 bits. This can be interpreted as follows: 2.585 bits is the recommended average number of bits needed to sufficiently describe the outcome of a single cast. However, we know that with fixed length coding we need an average of 3 bits to represent 6 symbols (note: we denote Ṝ as the average number of bits per symbol, thus Ṝ = 3). Since 2^3 = 8, it is sub-optimal to represent only 6 symbols with 3 bits. Note: there is a difference between what is recommended and what is used in practice!
Average Number of Bits per Symbol
Previously, we denoted Ṝ as the average number of bits per symbol. In fixed length coding (FLC), Ṝ is equal to n bits, where 2^n is the maximum number of symbols that can be represented with n bits; all symbols are represented by codewords of the same length. There is another type of coding, known as variable length coding (VLC), in which symbols need not all be represented by codewords of the same length. In both FLC and VLC, the formula for Ṝ is:

\bar{R} = \sum_i P(x_i)\, l(x_i)

where l(x_i) is the codeword length (in bits) of symbol x_i.
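A minimal sketch of this computation in Python; the symbol probabilities and codeword lengths below are made up for illustration and are not from the slides:

```python
import math

# Hypothetical VLC: more probable symbols get shorter codewords.
symbols = {
    # symbol: (probability, codeword length in bits)
    "a": (0.5, 1),
    "b": (0.25, 2),
    "c": (0.125, 3),
    "d": (0.125, 3),
}

# Average number of bits per symbol: R = sum P(x_i) * l(x_i)
R_bar = sum(p * length for p, length in symbols.values())

# Entropy of the same source: H = -sum P(x_i) * log2 P(x_i)
H = -sum(p * math.log2(p) for p, _ in symbols.values())

print(f"R_bar = {R_bar} bits/symbol")  # 1.75
print(f"H     = {H} bits/symbol")      # 1.75 -> this particular VLC is optimal
```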
Coding Efficiency
The formula for entropy (H) is:

H(X) = -\sum_i P(x_i) \log_2 P(x_i)

whereas the formula for the average number of bits per symbol (Ṝ) is:

\bar{R} = \sum_i P(x_i)\, l(x_i)

From these we can derive the coding efficiency (η) as:

\eta = \frac{H(X)}{\bar{R}}
Exercise 5
Calculate the coding efficiency of FLC for the example of the fair die, where the entropy is H(X) = log_2 6 ≈ 2.585 bits and the average number of bits per symbol using FLC is Ṝ = 3. An ideal optimal coding would yield an efficiency of 1.
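A worked solution, for reference (the value follows directly from the figures above):

\[ \eta = \frac{H(X)}{\bar{R}} = \frac{\log_2 6}{3} \approx \frac{2.585}{3} \approx 0.862 \]

i.e. the fixed length code is about 86% efficient for this source.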
STATISTIC & INFORMATION THEORY (CSNB134)
INFORMATION CONTENT, ENTROPY & CODING EFFICIENCY --END--