1 Chapter 5 A Measure of Information
2 Outline
5.1 Axioms for the uncertainty measure
5.2 Two interpretations of the uncertainty function
5.3 Properties of the uncertainty function
5.4 Entropy and coding
5.5 Shannon-Fano coding
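3 5.1 Axioms for the uncertainty measure
X: a discrete random variable taking the values $x_1, x_2, \dots, x_M$ with probabilities $p_1, p_2, \dots, p_M$
h(p): the uncertainty of an event with probability p
$h(p_i)$: the uncertainty of the event $\{X = x_i\}$
The average uncertainty of X: $H(p_1, \dots, p_M) = \sum_{i=1}^{M} p_i\, h(p_i)$
If $p_1 = p_2 = \dots = p_M = 1/M$, we say $f(M) = H(1/M, \dots, 1/M)$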
4 Axiom 1: f(M) should be a monotonically increasing function of M; that is, M < M' implies f(M) < f(M'). For example, f(2) < f(6).
Axiom 2: Let X: $(x_1, \dots, x_M)$ and Y: $(y_1, \dots, y_L)$ be independent experiments, each with equally likely outcomes. The joint experiment (X, Y) has $M \cdot L$ equally likely outcomes, and $f(M \cdot L) = f(M) + f(L)$.
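5 Axiom 3 (Group Axiom): $X = (x_1, x_2, \dots, x_r, x_{r+1}, \dots, x_M)$
Construct a compound experiment: first select one of two groups, $A = \{x_1, \dots, x_r\}$ or $B = \{x_{r+1}, \dots, x_M\}$, then select an outcome within the chosen group.
[Figure: tree for X branching into group A (leaves $x_1, \dots, x_r$) and group B (leaves $x_{r+1}, \dots, x_M$)]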
6 [Figure: the compound experiment with groups A and B]
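For reference, the grouping axiom is usually written as follows. This is the standard textbook form, assumed here to match the slide; the shorthand $P_A$ and $P_B$ for the group probabilities is introduced only for this sketch.

```latex
% Assumed shorthand: P_A and P_B are the probabilities of groups A and B.
\[
P_A = \sum_{i=1}^{r} p_i, \qquad P_B = \sum_{i=r+1}^{M} p_i
\]
\[
H(p_1, \dots, p_M)
  = H(P_A, P_B)
  + P_A \, H\!\left(\frac{p_1}{P_A}, \dots, \frac{p_r}{P_A}\right)
  + P_B \, H\!\left(\frac{p_{r+1}}{P_B}, \dots, \frac{p_M}{P_B}\right)
\]
```

The first term is the uncertainty of choosing a group; the other two are the uncertainties within each group, weighted by the probability of landing in that group.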
7 Axiom 4: H(p, 1-p) is a continuous function of p, i.e., a small change in p corresponds to a small change in uncertainty.
We can use the four axioms above to find the function H.
Thm 5.1: The only function satisfying the four given axioms is $H(p_1, \dots, p_M) = -C \sum_{i=1}^{M} p_i \log p_i$, where C > 0 and the logarithm base is > 1.
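8 For example, with C = 1 and base = 2:
[Figure: H(p, 1-p) plotted against p on [0, 1]; the curve is 0 at p = 0 and p = 1 and reaches its maximum of 1 at p = 1/2]
Coin: {tail, head}
Probabilities (1/2, 1/2): max. uncertainty
Probabilities (1, 0): min. uncertainty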
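The shape of the curve can be checked numerically. The following is a minimal sketch, assuming C = 1 and base 2 as in the example; the function name `binary_entropy` is illustrative and not taken from the slides.

```python
import math

def binary_entropy(p: float) -> float:
    """Uncertainty H(p, 1-p) in bits (C = 1, logarithm base 2)."""
    if p in (0.0, 1.0):
        return 0.0  # usual convention: 0 * log 0 is taken to be 0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Tabulate a few points: zero at the ends, largest at p = 1/2.
for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"p = {p:.1f}  H = {binary_entropy(p):.4f}")
```

A fair coin gives one full bit of uncertainty, while a coin that always lands the same way gives none.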
9 5.2 Two Interpretations of the uncertainty function
(1) $H(p_1, \dots, p_M)$ may be interpreted as the expectation of a random variable W = w(X), where $w(x_i) = -\log p_i$, so that $E[W] = -\sum_{i=1}^{M} p_i \log p_i$.
10 (2) $H(p_1, \dots, p_M)$ may be interpreted as the minimum average number of "yes"/"no" questions required to specify the value of X.
For example, H(X) = H(0.3, 0.2, 0.2, 0.15, 0.15) = 2.27
Question tree:
Does x = x1 or x2?
  Y: x = x1?
    Y: x1
    N: x2
  N: x = x3?
    Y: x3
    N: x = x4?
      Y: x4
      N: x5
11
Outcome   # of questions   Probability
x1        2                0.3
x2        2                0.2
x3        2                0.2
x4        3                0.15
x5        3                0.15
Avg # of questions = 2·0.7 + 3·0.3 = 2.3 > 2.27
H.W.: X = {x1, x2}, p(x1) = 0.7, p(x2) = 0.3. How many questions (on average) are required to specify the outcome of a joint experiment involving 2 independent observations of X?
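The two numbers compared above can be reproduced with a few lines of code. This is a small sketch using the probabilities and question counts from the example; the variable names are illustrative.

```python
import math

# Probabilities of x1..x5 and the number of yes/no questions
# the question tree needs to reach each outcome.
probs = [0.3, 0.2, 0.2, 0.15, 0.15]
questions = [2, 2, 2, 3, 3]

# Uncertainty in bits (C = 1, base 2).
H = -sum(p * math.log2(p) for p in probs)

# Average number of questions under the tree's strategy.
avg_q = sum(p * q for p, q in zip(probs, questions))

print(f"H(X) = {H:.2f} bits, average questions = {avg_q:.2f}")
```

The questioning strategy costs slightly more than H(X) on average, consistent with H(X) being a lower bound.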
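12 5.3 Properties of the uncertainty function
Lemma 5.2: Let $p_1, \dots, p_M$ and $q_1, \dots, q_M$ be arbitrary positive numbers with $\sum_{i=1}^{M} p_i = \sum_{i=1}^{M} q_i = 1$. Then $-\sum_{i=1}^{M} p_i \log p_i \le -\sum_{i=1}^{M} p_i \log q_i$, with equality iff $p_i = q_i$ for all i.
[Figure: the line y = x - 1 and the curve y = ln x, illustrating ln x ≤ x - 1 with equality only at x = 1]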
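The inequality ln x ≤ x - 1 from the figure is what drives the lemma. A sketch of the standard argument, assuming natural logarithms (any other base only rescales both sides by a positive constant):

```latex
\begin{align*}
\sum_{i=1}^{M} p_i \ln\frac{q_i}{p_i}
  &\le \sum_{i=1}^{M} p_i \left(\frac{q_i}{p_i} - 1\right)
  && \text{since } \ln x \le x - 1 \\
  &= \sum_{i=1}^{M} q_i - \sum_{i=1}^{M} p_i = 1 - 1 = 0 .
\end{align*}
% Rearranging gives the lemma:
\[
-\sum_{i=1}^{M} p_i \ln p_i \;\le\; -\sum_{i=1}^{M} p_i \ln q_i ,
\]
% with equality iff q_i / p_i = 1 for every i, because ln x = x - 1 only at x = 1.
```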
14 Thm 5.3: $H(p_1, \dots, p_M) \le \log M$, with equality iff $p_i = 1/M$ for all i.
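One way to see this bound, sketched here on the assumption that Lemma 5.2 is applied with the uniform choice $q_i = 1/M$:

```latex
\[
H(p_1, \dots, p_M)
  = -\sum_{i=1}^{M} p_i \log p_i
  \;\le\; -\sum_{i=1}^{M} p_i \log\frac{1}{M}
  = \log M \sum_{i=1}^{M} p_i
  = \log M ,
\]
% with equality iff p_i = q_i = 1/M for all i.
```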
15 5.4 Entropy and Coding
Noiseless Coding Theorem
X:            $x_1, x_2, \dots, x_M$
Probability:  $p_1, p_2, \dots, p_M$
Codeword:     $w_1, w_2, \dots, w_M$
Length:       $n_1, n_2, \dots, n_M$
Minimize the average codeword length: $\bar{n} = \sum_{i=1}^{M} p_i n_i$
Code alphabet: $\{a_1, a_2, \dots, a_D\}$
Ex. D = 2, {0, 1}
16 Thm (Noiseless Coding Thm)
– If $\bar{n} = \sum_{i=1}^{M} p_i n_i$ is the average codeword length of a uniquely decodable code for X, then $\bar{n} \ge H(X)/\log D$, with equality iff $p_i = D^{-n_i}$ for i = 1, 2, …, M.
Note:
– $H(X)/\log D$ is the uncertainty of X computed by using logarithms to the base D.
17 pf:
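A sketch of the standard argument follows. It assumes the Kraft-McMillan inequality $\sum_i D^{-n_i} \le 1$ for uniquely decodable codes, which the text of this chapter does not state, and it uses the Gibbs-type inequality of Lemma 5.2; the symbols S and $q_i$ are introduced only for this sketch.

```latex
% Assumed: S = \sum_j D^{-n_j} \le 1 (Kraft-McMillan), and q_i = D^{-n_i} / S.
\begin{align*}
H(X) = -\sum_{i=1}^{M} p_i \log p_i
  &\le -\sum_{i=1}^{M} p_i \log q_i
  && \text{(Lemma 5.2)} \\
  &= \sum_{i=1}^{M} p_i n_i \log D + \log S \\
  &\le \bar{n} \log D
  && \text{(since } S \le 1 \text{, so } \log S \le 0\text{)} .
\end{align*}
% Hence \bar{n} \ge H(X) / \log D. Equality needs p_i = q_i and S = 1,
% that is, p_i = D^{-n_i} for all i.
```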
18 A code is called "absolutely optimal" if it achieves the lower bound given by the noiseless coding thm.
Ex.
X     Prob.   Codeword
x1    1/2     0
x2    1/4     10
x3    1/8     110
x4    1/8     111
$H(X) = 7/4 = \bar{n}$
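The example can be verified directly. The sketch below assumes base 2 (D = 2) and uses exact fractions for the average length; the dictionary layout is illustrative.

```python
import math
from fractions import Fraction

# Probabilities and codewords from the example (binary code, D = 2).
code = {
    "x1": (Fraction(1, 2), "0"),
    "x2": (Fraction(1, 4), "10"),
    "x3": (Fraction(1, 8), "110"),
    "x4": (Fraction(1, 8), "111"),
}

# Average codeword length, computed exactly.
avg_len = sum(p * len(w) for p, w in code.values())

# Uncertainty in bits.
H = -sum(float(p) * math.log2(float(p)) for p, _ in code.values())

# Equality condition of the theorem: p_i = 2^(-n_i) for every codeword.
meets_condition = all(p == Fraction(1, 2 ** len(w)) for p, w in code.values())

print(avg_len, H, meets_condition)
```

Both the average length and the uncertainty come out to 7/4, and every probability is an exact power of 1/2, which is the equality condition of the theorem.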
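19 5.5 Shannon-Fano Coding
Select the integer $n_i$ s.t. $\log_D(1/p_i) \le n_i < \log_D(1/p_i) + 1$
=> $D^{-n_i} \le p_i$, so $\sum_{i=1}^{M} D^{-n_i} \le \sum_{i=1}^{M} p_i = 1$
=> An instantaneous code can be constructed with the lengths $n_1, n_2, \dots, n_M$ obtained from Shannon-Fano coding.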
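The length selection can be sketched in code. This version assumes the rule "smallest integer $n_i$ with $D^{-n_i} \le p_i$", which is the same as rounding $\log_D(1/p_i)$ up; it uses exact fractions to avoid floating-point rounding at powers of D. The function name `shannon_fano_lengths` is illustrative.

```python
from fractions import Fraction

def shannon_fano_lengths(probs, D=2):
    """Return, for each probability p, the smallest integer n with D**-n <= p."""
    lengths = []
    for p in probs:
        n = 0
        while Fraction(1, D ** n) > p:
            n += 1
        lengths.append(n)
    return lengths

# Probabilities that are exact powers of 1/2: lengths equal log2(1/p).
dyadic = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
print(shannon_fano_lengths(dyadic))

# Probabilities that are not powers of 1/2: lengths are rounded up.
probs = [Fraction(49, 100), Fraction(21, 100), Fraction(21, 100), Fraction(9, 100)]
lengths = shannon_fano_lengths(probs)
kraft_sum = sum(Fraction(1, 2 ** n) for n in lengths)
print(lengths, kraft_sum)
```

The sum of $D^{-n_i}$ stays at or below 1 in both cases, which is the condition that allows an instantaneous code with these lengths.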
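20 Thm: Given a random variable X with uncertainty H(X), there exists a base-D instantaneous code for X whose average codeword length $\bar{n}$ satisfies $H(X)/\log D \le \bar{n} < H(X)/\log D + 1$.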
21 In fact, we can always approach the lower bound as closely as desired if we are allowed to use "block coding".
Take a series of s observations of X and let $Y = (x_1, x_2, \dots, x_s)$.
Assign a codeword to each value of Y.
=> Block coding decreases the average codeword length per value of X.
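Why the bound can be approached is sketched below. The sketch assumes the s observations are independent, so that $H(Y) = s\,H(X)$, and that a Shannon-Fano code is used for Y; $\bar{n}_s$ is an assumed symbol for the average codeword length of that code.

```latex
% Assumed: independent observations, so H(Y) = s H(X);
% \bar{n}_s is the average codeword length of the code for Y.
\[
\frac{H(Y)}{\log D} \le \bar{n}_s < \frac{H(Y)}{\log D} + 1
\quad\Longrightarrow\quad
\frac{H(X)}{\log D} \le \frac{\bar{n}_s}{s} < \frac{H(X)}{\log D} + \frac{1}{s} .
\]
% The per-value length \bar{n}_s / s is within 1/s of the lower bound,
% so it can be pushed as close as desired by taking s large.
```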
22 Ex.
X     p_i    Codeword
x1    0.7    0
x2    0.3    1
The average codeword length is 1, but H(X) = 0.88129 (look up H(p) in a table at p = 0.3 or p = 0.7).

Y = (x1, x2)   p_i     Codeword
x1 x1          0.49    0
x1 x2          0.21    10
x2 x1          0.21    110
x2 x2          0.09    111
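The gain from coding pairs can be computed from the two tables. This is a minimal sketch in base 2 that assumes the two observations are independent, so pair probabilities are products; the variable names are illustrative.

```python
import math

# Single-symbol probabilities from the first table.
p = {"x1": 0.7, "x2": 0.3}

# Uncertainty of X in bits.
H = -sum(q * math.log2(q) for q in p.values())

# Codewords for pairs, taken from the second table.
pair_code = {
    ("x1", "x1"): "0",
    ("x1", "x2"): "10",
    ("x2", "x1"): "110",
    ("x2", "x2"): "111",
}

# Average codeword length per pair; independence gives P(a, b) = p[a] * p[b].
avg_pair = sum(p[a] * p[b] * len(w) for (a, b), w in pair_code.items())

# Average codeword length per value of X.
per_value = avg_pair / 2

print(f"H(X) = {H:.5f}, per pair = {avg_pair:.2f}, per value = {per_value:.3f}")
```

Coding one value at a time costs 1 code symbol per value; coding pairs costs about 0.905, which is closer to H(X) = 0.88129.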
24 How do we find the actual code symbols?
– We simply assign them in order.
– By S-F coding:
– We then assign
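One common way to "assign in order" is sketched below for a binary code: sort by length, count upward in binary, and extend the counter with zeros whenever the length grows. This canonical-style procedure is an assumption, not necessarily the exact procedure on the slide, and `assign_codewords` is an illustrative name.

```python
def assign_codewords(lengths):
    """Assign binary codewords in order to the given codeword lengths (each >= 1)."""
    if not lengths:
        return []
    # The lengths must satisfy the Kraft inequality for a prefix code to exist.
    if sum(2.0 ** -n for n in lengths) > 1.0:
        raise ValueError("lengths violate the Kraft inequality")

    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    codewords = [None] * len(lengths)
    value = 0                      # next unused codeword, read as a binary number
    prev_len = lengths[order[0]]
    for i in order:
        n = lengths[i]
        value <<= n - prev_len     # pad with zeros when the length increases
        codewords[i] = format(value, f"0{n}b")
        value += 1                 # move on to the next codeword in order
        prev_len = n
    return codewords

print(assign_codewords([1, 2, 3, 3]))
print(assign_codewords([2, 3, 3, 4]))
```

Because each new codeword is obtained by counting past the previous one, no codeword is a prefix of another, so the result is an instantaneous code.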
25 How bad is Shannon-Fano Coding?