Chapter 10 Shannon’s Theorem
Shannon’s Theorems
First theorem: $H(S) \le L_n(S^n)/n < H(S) + 1/n$, where $L_n$ is the average length of a certain code for the $n$-th extension $S^n$ of the source.
Second theorem: extends this idea to a channel with errors, allowing one to signal arbitrarily close to the channel capacity while simultaneously correcting almost all the errors.
Proof: it does so without constructing a specific code, relying instead on a random code.
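The first theorem can be checked numerically. The sketch below is only an illustration: the slide does not say which code $L_n$ refers to, so it assumes Shannon-Fano word lengths $\lceil \log_2(1/p) \rceil$ on the $n$-th extension, and the source probabilities (0.7, 0.3) and the function names are hypothetical.

```python
import math
from itertools import product

def entropy(probs):
    """Entropy H(S) in bits of a memoryless source."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def average_length_per_symbol(probs, n):
    """L_n(S^n)/n for assumed Shannon-Fano lengths ceil(log2(1/p)).

    Assumes every probability is strictly positive.
    """
    total = 0.0
    for block in product(probs, repeat=n):
        p = math.prod(block)                      # probability of this n-symbol block
        total += p * math.ceil(-math.log2(p))     # its assumed code word length
    return total / n

if __name__ == "__main__":
    source = [0.7, 0.3]                           # hypothetical source
    H = entropy(source)
    for n in range(1, 5):
        L = average_length_per_symbol(source, n)
        print(n, round(H, 4), round(L, 4), round(H + 1 / n, 4))
```

Each printed row should satisfy $H(S) \le L_n/n < H(S) + 1/n$, with the upper bound tightening as $n$ grows.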
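Review/Example
Choose a decision rule based on maximum likelihood: $d(b_1) = a_1$; $d(b_2)$ = arbitrary; $d(b_3) = a_2$.
[Figure: channel diagram from inputs $a_1, a_2, a_3$ to outputs $b_1, b_2, b_3$.]
The probability of making a mistake when $b_j$ is received is $P(E \mid b_j) = 1 - P(d(b_j) \mid b_j)$.
Assume all source symbols are equally likely, so $P(a_i) = 1/3$. Averaging over the received symbols gives $P_E = \sum_j P(E \mid b_j) P(b_j) = 1 - \frac{1}{3} \sum_j P(b_j \mid d(b_j))$.
Calculation for the above example: $P_E = 1 - \frac{1}{3}(\ldots) = 17/\ldots$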
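The decision rule and the error calculation can be sketched in code. The channel matrix below is an assumption, since the slide's matrix is not shown here: its entries are hypothetical values chosen only so that maximum likelihood reproduces the rule above ($b_1 \to a_1$, $b_2$ tied, $b_3 \to a_2$). The function names are likewise illustrative.

```python
from fractions import Fraction

# Hypothetical channel matrix P(b_j | a_i): rows a1..a3, columns b1..b3.
CHANNEL = [
    [Fraction(5, 10), Fraction(3, 10), Fraction(2, 10)],
    [Fraction(2, 10), Fraction(3, 10), Fraction(5, 10)],
    [Fraction(3, 10), Fraction(3, 10), Fraction(4, 10)],
]

def ml_decision_rule(channel):
    """For each output b_j pick the input a_i maximizing P(b_j | a_i).

    Ties (the 'arbitrary' case) are broken by taking the lowest index.
    """
    rule = []
    for j in range(len(channel[0])):
        column = [row[j] for row in channel]
        rule.append(column.index(max(column)))
    return rule

def error_probability(channel, rule):
    """P_E = 1 - (1/q) * sum_j P(b_j | d(b_j)) for q equally likely inputs."""
    prior = Fraction(1, len(channel))
    correct = sum(channel[rule[j]][j] for j in range(len(rule)))
    return 1 - prior * correct

if __name__ == "__main__":
    rule = ml_decision_rule(CHANNEL)
    print([f"b{j + 1} -> a{i + 1}" for j, i in enumerate(rule)])
    print(error_probability(CHANNEL, rule))
```

For this assumed matrix the error probability is 17/30; breaking the tie at $b_2$ differently leaves that value unchanged, because all three entries in that column are equal.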
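Random Codes (10.4)
Send an $n$-bit block code through a binary symmetric channel, in which each bit arrives intact with probability $P$ and is flipped with probability $Q = 1 - P$, where $Q < 1/2$. The channel capacity is $C = 1 - H_2(Q)$.
The code consists of $M$ distinct equiprobable $n$-bit blocks $A = \{a_i : i = 1, \ldots, M\}$, so each block carries $I_2(a_i) = \log_2 M$ bits.
Intuitively, each block comes through with $n \cdot C$ bits of information. To signal close to capacity, we want $I_2(a_i) = n(C - \varepsilon)$ for a small number $\varepsilon > 0$, that is, $M = 2^{n(C - \varepsilon)} = 2^{nC} / 2^{n\varepsilon}$.
Intuitively, $2^{nC}$ is the number of messages that can get through the channel, and by increasing $n$ the factor $2^{n\varepsilon}$ can be made arbitrarily large. So we can choose $M$ to use only a small fraction of the messages that could get through: redundancy. Excess redundancy gives us the room required to bring the error rate down.
For a large $n$, pick $M$ random code words from $\{0, 1\}^n$. The possible received blocks are $B = \{b_j : |b_j| = n, j = 1, \ldots, 2^n\}$.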
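A small sketch of the quantities on this slide: the binary entropy $H_2$, the capacity $C = 1 - H_2(Q)$, the code size $M = 2^{n(C - \varepsilon)}$ (rounded down to an integer), and a random choice of code words. The function names and the sample values of $n$, $Q$ and $\varepsilon$ are illustrative assumptions. Code words are stored as $n$-bit integers and, as with any purely random choice, are not forced to be distinct.

```python
import math
import random

def binary_entropy(q):
    """H_2(q) in bits; taken to be 0 at the endpoints q = 0 and q = 1."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def bsc_capacity(q):
    """Capacity C = 1 - H_2(Q) of a binary symmetric channel."""
    return 1.0 - binary_entropy(q)

def code_size(n, q, eps):
    """M = 2^(n(C - eps)), rounded down; requires eps < C."""
    return int(2 ** (n * (bsc_capacity(q) - eps)))

def random_code(n, m, seed=0):
    """Pick m code words uniformly at random from {0,1}^n (as integers)."""
    rng = random.Random(seed)
    return [rng.getrandbits(n) for _ in range(m)]

if __name__ == "__main__":
    n, q, eps = 30, 0.1, 0.2          # hypothetical parameters
    m = code_size(n, q, eps)
    print("C =", round(bsc_capacity(q), 4))
    print("M =", m, "out of", 2 ** n, "possible blocks")
    print([format(a, f"0{n}b") for a in random_code(n, 3)])
```

With these sample values $M$ is a tiny fraction of $2^n$, which is the redundancy the slide refers to.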
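With high probability, almost all $a_i$ will be a certain distance apart (provided $M \ll 2^n$). Picture the $a_i$ in $n$-dimensional Hamming space. As each $a_i$ goes through the channel, we expect $nQ$ errors on average. Consider a sphere of radius $n(Q + \varepsilon')$ about each $a_i$.
[Figure: sphere about $a_i$ of radius $nQ$ plus a margin $n\varepsilon'$, containing the received symbol.]
By the law of large numbers, the probability that the received symbol falls outside this sphere can be made $\ll \delta$ by taking $n$ large enough.
Similarly, consider the sphere $S(b_j)$ of the same radius around each received $b_j$. What is the probability that an uncorrectable error occurs? (10.4)
[Figure: sphere about $b_j$ of radius $nQ$ plus a margin $n\varepsilon'$, containing the sent symbol $a_i$ and another code word $a'$.]
An error occurs when there is too much noise, so that $a_i$ lies outside the sphere, or when another code word $a'$ is also inside it: $P_E \le P\{a_i \notin S(b_j)\} + \sum_{a' \ne a_i} P\{a' \in S(b_j)\}$.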
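The law-of-large-numbers step can be illustrated by computing the exact probability that the number of channel errors exceeds the sphere radius $n(Q + \varepsilon')$. This is a sketch under assumed sample values ($Q = 0.1$, $\varepsilon' = 0.05$); the function name is hypothetical. Terms are evaluated in log space so that large $n$ does not overflow.

```python
import math

def prob_outside_sphere(n, q, eps_prime):
    """P{number of errors > n(Q + eps')} for n independent bit flips.

    Exact binomial tail; requires 0 < q < 1.
    """
    radius = n * (q + eps_prime)
    first = int(math.floor(radius)) + 1          # smallest error count outside the sphere
    total = 0.0
    for k in range(first, n + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * math.log(q) + (n - k) * math.log(1 - q))
        total += math.exp(log_term)
    return total

if __name__ == "__main__":
    for n in (50, 200, 800):
        print(n, prob_outside_sphere(n, 0.1, 0.05))
```

The printed probabilities shrink rapidly as $n$ grows, which is why the "too much noise" term can be pushed below any chosen $\delta$.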
Idea
Pick the number of code words $M$ to be $2^{n(C - \varepsilon)}$, where $C$ is the channel capacity (the block size $n$ is as yet undetermined and depends on how closely, within $\varepsilon$, we wish to approach the channel capacity). The number of possible random codes is $(2^n)^M = 2^{nM}$, each equally likely. Let $P_E$ be the probability of error averaged over all random codes. The idea is to show that $P_E \to 0$ as $n \to \infty$. I.e., a code picked at random will most probably work!
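The averaging over random codes can be imitated by a Monte Carlo experiment: draw a fresh random code for every trial, send its first code word through the channel, and count a failure when the sent word is outside the sphere about the received block or another code word is inside it. This is only a sketch; the function names, the parameter values and the use of Python integers as $n$-bit blocks are assumptions made for illustration.

```python
import random

def hamming_distance(x, y):
    """Number of bit positions in which the integers x and y differ."""
    return bin(x ^ y).count("1")

def average_error(n, m, q, eps_prime, trials, seed=0):
    """Estimate the sphere-decoding error rate averaged over random codes."""
    rng = random.Random(seed)
    radius = n * (q + eps_prime)
    failures = 0
    for _ in range(trials):
        code = [rng.getrandbits(n) for _ in range(m)]       # a fresh random code
        sent = code[0]                                      # by symmetry, send the first word
        noise = sum(1 << k for k in range(n) if rng.random() < q)
        received = sent ^ noise                             # flip each bit with probability q
        too_noisy = hamming_distance(sent, received) > radius
        intruder = any(hamming_distance(c, received) <= radius for c in code[1:])
        if too_noisy or intruder:
            failures += 1
    return failures / trials

if __name__ == "__main__":
    # Hypothetical settings: Q = 0.05, eps' = 0.1, 8 code words.
    for n in (16, 32, 64):
        print(n, average_error(n, 8, 0.05, 0.1, trials=2000))
```

Here $M$ is held fixed for simplicity, so the rate falls as $n$ grows; in the theorem $M$ grows as $2^{n(C - \varepsilon)}$ and the error still tends to 0.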
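Proof
Suppose $a$ is what’s sent and $b$ what’s received. Let $X \in \{0, 1\}$ be a random variable representing an error in the channel, taking the value 0 with probability $P$ and 1 with probability $Q$. So if the error vector is $a \oplus b = (X_1, \ldots, X_n)$, then $d(a, b) = X_1 + \cdots + X_n$, and
$P\{d(a, b) > n(Q + \varepsilon')\} = P\{(X_1 + \cdots + X_n)/n - Q > \varepsilon'\} \to 0$ as $n \to \infty$ (by the law of large numbers).
N.B. $Q = E\{X\}$. Since $Q < 1/2$, pick $\varepsilon'$ so that $Q + \varepsilon' < 1/2$.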
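Since the $a'$ are randomly (uniformly) distributed throughout the space, the chance that some particular code word lands too close, i.e. inside the sphere $S(b)$ of radius $\lambda n$ about $b$ with $\lambda = Q + \varepsilon' < 1/2$, is by the binomial bound (10.5)
$P\{a' \in S(b)\} = \frac{\text{volume of sphere}}{\text{volume of whole space}} = 2^{-n} \sum_{k \le \lambda n} \binom{n}{k} \le 2^{-n[1 - H_2(\lambda)]}$.
The chance that any one is too close is at most
$(M - 1) \, 2^{-n[1 - H_2(\lambda)]} < 2^{n(C - \varepsilon)} \, 2^{-n[1 - H_2(\lambda)]} = 2^{-n[\varepsilon - (H_2(Q + \varepsilon') - H_2(Q))]}$, using $C = 1 - H_2(Q)$.
N.B. $H_2$ is concave with slope $e = \log_2(1/Q - 1) > 0$ at $Q$, so $H_2(Q + \varepsilon') - H_2(Q) \le \varepsilon' e$. We can choose $\varepsilon'$ with $\varepsilon' e < \varepsilon$, which makes the exponent negative, so this chance tends to 0 as $n \to \infty$.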
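The two inequalities used here can be checked numerically: the binomial bound on the volume of a Hamming sphere, $\sum_{k \le \lambda n} \binom{n}{k} \le 2^{n H_2(\lambda)}$ for $\lambda < 1/2$, and the positivity of the final exponent $\varepsilon - (H_2(Q + \varepsilon') - H_2(Q))$ once $\varepsilon' e < \varepsilon$. The sketch below uses assumed sample values and hypothetical function names; it is meant for small $n$, since $2^{n H_2(\lambda)}$ is evaluated as a float.

```python
import math

def binary_entropy(q):
    """H_2(q) in bits; taken to be 0 at the endpoints q = 0 and q = 1."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def sphere_volume(n, radius):
    """Number of n-bit words within Hamming distance `radius` of a fixed word."""
    return sum(math.comb(n, k) for k in range(int(math.floor(radius)) + 1))

def binomial_bound_holds(n, lam):
    """Check sum_{k <= lam*n} C(n, k) <= 2^(n H_2(lam)) for lam < 1/2."""
    return sphere_volume(n, lam * n) <= 2 ** (n * binary_entropy(lam))

def union_bound_exponent(q, eps, eps_prime):
    """Exponent eps - (H_2(Q + eps') - H_2(Q)); the bound is 2^(-n * exponent)."""
    return eps - (binary_entropy(q + eps_prime) - binary_entropy(q))

if __name__ == "__main__":
    q, eps, eps_prime = 0.1, 0.1, 0.02      # hypothetical values with eps' * e < eps
    e = math.log2(1 / q - 1)
    print("eps' * e =", round(eps_prime * e, 4), "< eps =", eps)
    print("bound holds for n = 20:", binomial_bound_holds(20, q + eps_prime))
    exponent = union_bound_exponent(q, eps, eps_prime)
    for n in (100, 500, 1000):
        print(n, 2 ** (-n * exponent))
```

A positive exponent means the chance that any other code word lands inside the sphere decays exponentially in $n$.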