Presentation transcript: "Error-Correcting Codes: Progress & Challenges" by Madhu Sudan, Microsoft/MIT (talk given 02/17/2010 at CMU).

1 Error-Correcting Codes: Progress & Challenges. Madhu Sudan, Microsoft/MIT.

2 Communication in the presence of noise. Sender → Noisy Channel → Receiver: "We are now ready" is sent, "We are not ready" arrives. If information is digital, reliability is critical.

3 Shannon's Model: Probabilistic Noise. Sender encodes via E: Σ^k → Σ^n (expand); Receiver decodes via D: Σ^n → Σ^k (compress?). Probabilistic noise: e.g., every letter is flipped to a random other letter of Σ w.p. p. Focus: design good Encode/Decode algorithms.
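
To make the E/D interface concrete, here is a minimal Python sketch (an illustration added to this transcript, not from the talk) using a 3× repetition code; its rate 1/3 is far from Shannon-optimal, but it shows the encode/expand, decode/compress shape of the problem.

```python
import random
from collections import Counter

# Toy instance of Shannon's setup over the binary alphabet: E expands k
# symbols to n = 3k; D compresses back by majority vote on each block.
def E(msg):                    # Sigma^k -> Sigma^n
    return [b for b in msg for _ in range(3)]

def D(word):                   # Sigma^n -> Sigma^k
    return [Counter(word[i:i + 3]).most_common(1)[0][0]
            for i in range(0, len(word), 3)]

def channel(word, p=0.1):      # flip each bit independently w.p. p
    return [b ^ 1 if random.random() < p else b for b in word]

msg = [1, 0, 1, 1]
assert D(E(msg)) == msg        # noiseless round trip
print(D(channel(E(msg))))      # usually recovers msg when p is small
```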

4 Hamming Model: Worst-case errors. Errors: up to t worst-case errors. Focus: the code C = Image(E) = {E(x) | x Є Σ^k} (note: not the encoding/decoding algorithms). Goal: design the code to correct every possible pattern of t errors.

5 Problems in Coding Theory, Broadly. Combinatorics: design the best possible error-correcting codes. Probability/Algorithms: design algorithms correcting random/worst-case errors.

6 Part I (of III): Combinatorial Results

7 Hamming Notions. Hamming distance: Δ(x,y) = |{i | x_i ≠ y_i}|. Distance of a code: Δ(C) = min over distinct x, y Є C of Δ(x,y). A code of distance 2t+1 corrects t errors. Main question: how do the four parameters relate: length n, message length k, distance d, alphabet size q = |Σ|? Want n small, k and d large, q small. Asymptotically, let R = k/n and δ = d/n: how do R, δ, q relate?
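
The two definitions translate directly into code; a quick sketch added for illustration, with a toy repetition code as input:

```python
from itertools import combinations

def hamming(x, y):
    # Delta(x, y) = |{i : x_i != y_i}|
    return sum(1 for a, b in zip(x, y) if a != b)

def code_distance(C):
    # Delta(C) = min over distinct pairs of codewords
    return min(hamming(x, y) for x, y in combinations(C, 2))

# The length-5 repetition code has distance 5 = 2*2 + 1,
# so it corrects t = 2 errors.
C = ["00000", "11111"]
print(code_distance(C))  # -> 5
```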

8 Simple results. Ball(x, r) = {y Є Σ^n | Δ(x,y) ≤ r}. Volume of a ball: Vol(q, n, r) = |Ball(x, r)|. Entropy function: H_q(δ) = c such that Vol(q, n, δn) ≈ q^{cn}. Hamming (Packing) Bound: balls of radius δn/2 around codewords are disjoint, so q^k · q^{H_q(δ/2)·n} ≤ q^n, i.e., R + H_q(δ/2) ≤ 1.
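
A numerical sanity check of the entropy approximation (illustrative; the closed form for H_q below is the standard one, not stated on the slide):

```python
from math import comb, log

def vol(q, n, r):
    # |Ball(x, r)| in Sigma^n with |Sigma| = q
    return sum(comb(n, i) * (q - 1) ** i for i in range(r + 1))

def H(q, d):
    # H_q(d) = d log_q(q-1) - d log_q(d) - (1-d) log_q(1-d)
    return d * log(q - 1, q) - d * log(d, q) - (1 - d) * log(1 - d, q)

q, n, d = 2, 1000, 0.2
print(log(vol(q, n, int(d * n)), q) / n)  # ~0.717 at n = 1000
print(H(q, d))                            # ~0.722; the two meet as n grows
```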

9 Simple results (contd.). Gilbert-Varshamov (Greedy) Bound: let C: Σ^k → Σ^n be a maximal code of distance d. Then balls of radius d − 1 around codewords cover Σ^n, so q^k · q^{H_q(δ)·n} ≥ q^n, i.e., R ≥ 1 − H_q(δ).
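
The greedy argument is constructive only in the weakest sense; a brute-force sketch added for illustration (exponential time; the open problem below asks for an explicit code matching this bound):

```python
from itertools import product

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def greedy_code(n, d, q=2):
    # Gilbert's greedy construction: keep any word at distance >= d
    # from everything chosen so far; maximality gives the GV bound.
    C = []
    for w in product(range(q), repeat=n):
        if all(hamming(w, c) >= d for c in C):
            C.append(w)
    return C

C = greedy_code(n=7, d=3)
print(len(C))  # GV guarantees >= 2^7 / Vol(2,7,2) ~ 4.4; greedy finds 16
```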

10 Simple results (Summary). For the best code: 1 − H_q(δ) ≤ R ≤ 1 − H_q(δ/2). After fifty years of research... we still don't know which is right.

11 Binary case (q = 2). Case of large distance, δ = ½ − ε, ε → 0: Ω(ε²) ≤ R ≤ O*(ε²) (GV/Chernoff below, LP bound above). Case of small (relative) distance: no bound better than R ≤ 1 − (1 − o(1)) · H(δ/2) (Hamming). Case of constant distance d: (d/2) log n ≥ n − k ≥ (1 − o(1)) · (d/2) log n (BCH above, Hamming below).

12 Binary case (Closer look). For general n, d: # codewords ≥ 2^n / Vol(2, n, d−1). Can we do better? Twice as many codewords? (Won't change the asymptotics of R, δ.) Recent progress [Jiang-Vardy]: # codewords ≥ d · 2^n / Vol(2, n, d−1).

13 Proof idea of [Jiang-Vardy]. Look at the Hamming distance-(d−1) graph: vertices = {0,1}^n, with u ↔ v iff Δ(u,v) < d. A code of distance d = an independent set in this graph. GV bound: independent-set size ≥ # vertices / degree. Jiang-Vardy: notice the number of triangles is small; use [AKS]: for graphs with no (or a small number of) triangles, the bound improves by a factor of log(degree).
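
On a toy scale the graph view looks as follows (a sketch added for illustration; the greedy independent set recovers exactly the greedy code above):

```python
from itertools import product, combinations

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

# Hamming distance-(d-1) graph on {0,1}^n: edge u~v iff dist(u,v) < d.
n, d = 7, 3
V = list(product((0, 1), repeat=n))
adj = {v: set() for v in V}
for u, v in combinations(V, 2):
    if hamming(u, v) < d:
        adj[u].add(v)
        adj[v].add(u)

# Greedy independent set = code of distance d (the GV argument).
indep = []
for v in V:
    if all(u not in adj[v] for u in indep):
        indep.append(v)
print(len(indep))  # -> 16, the [7,4] Hamming code again
```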

14 Major questions in binary codes. Give an explicit construction meeting the GV bound; specifically, codes with δ = ½ − ε and R = Ω(ε²). Is the Hamming bound tight as δ → 0? Do there exist codes of distance δ with R = 1 − c · (1 − o(1)) · δ log_2(1/δ) for some c < 1? (GV gives c = 1; the Hamming bound allows c as small as ½.) Is the LP bound tight?

15 Combinatorics (contd.): q-ary case. Fix δ and let q → ∞ (then fix q and let n → ∞). Surprising result ('80s): algebraic geometry yields R ≥ 1 − δ − 1/(√q − 1). (Also a negative surprise: BCH codes only yield 1 − R ≤ ((q−1)/q) · log_q n.) For comparison: 1 − δ − O(1/log q) ≤ R ≤ 1 − δ − 1/q (GV bound below; Plotkin, not Hamming, above).

16 Major questions: q-ary case. Suppose R = 1 − δ − f(q). What is the fastest-decaying function f(·)? (Somewhere between 1/√q and 1/q.) Give a simple explanation for why f(q) ≤ 1/√q. Fix d and let q → ∞: how does (n−k)/(d log_q n) grow in the limit? Is it 1 or ½? Or somewhere in between?

17 Part II (of III): Correcting Random Errors

18 Recall Shannon '48. Σ-symmetric channel with error probability p: transmits σ Є Σ as σ w.p. 1 − p, and as each τ Є Σ − {σ} w.p. p/(q−1). Shannon's Coding Theorem: can transmit at rate R = 1 − H_q(p) − ε, for every ε > 0. Precisely: if R = 1 − H_q(p) − ε, then for every n and k = Rn there exist E: Σ^k → Σ^n and D: Σ^n → Σ^k such that Pr_{Channel,x}[D(Channel(E(x))) ≠ x] ≤ exp(−Ω(n)). Converse Coding Theorem: cannot transmit at rate R = 1 − H_q(p) + ε. So: no mysteries?

19 Constructive versions. Shannon's functions: E random, D brute-force search. Can we get polynomial-time E, D? [Forney '66]: Yes! (Using Reed-Solomon codes correcting an ε fraction of errors, plus composition.) [Sipser-Spielman '92, Spielman '94, Barg-Zemor '97]: even in linear time! Still didn't satisfy practical needs. Why? [Berrou et al. '92] Turbo codes + belief propagation: no theorems; much excitement.

20 What is satisfaction? Articulated by [Luby, Mitzenmacher, Shokrollahi, Spielman '96]. Practically interesting question: n = 10000, q = 2, p = 0.1, desired error probability 10^−6; k = ? [Forney '66]: decoding time exp(1/(1 − H(p) − k/n)); rate = 90% (of capacity) ⇒ decoding time ≥ 2^100. Right question: reduce decoding time to poly(n, 1/ε), where ε = 1 − H(p) − k/n.
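
For concreteness, the capacity arithmetic behind the question (an illustrative computation added here; the 2^100 figure on the slide is Forney's bound and is not reproduced by this toy calculation):

```python
from math import log

def H(p):  # binary entropy
    return -p * log(p, 2) - (1 - p) * log(1 - p, 2)

n, p = 10000, 0.1
capacity = 1 - H(p)
print(capacity)          # ~0.531: no reliable scheme can beat rate 1 - H(p)
print(capacity * n)      # so k can be at most ~5310 here
eps = capacity - 0.5     # gap to capacity if we settle for rate 50%
print(eps, 1 / eps)      # ~0.031 -> exp(1/eps) ~ e^32 decoding steps
```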

21 Current state of the art. Luby et al. propose the study of codes based on irregular graphs ("Irregular LDPC Codes"). No theorems so far for erroneous channels. Strong analysis for the (much) simpler case of erasure channels (symbols are erased): decoding time O(n log(1/ε)). (Easy to get "composition"-based algorithms with decoding time O(n · poly(1/ε)).) There are some proposals for errors as well (with analysis by Luby et al., Richardson & Urbanke), but none known to converge to the Shannon limit.

22 Part III: Correcting Adversarial Errors

23 Motivation. As notions of communication/storage get more complex, modeling error as oblivious (to message/encoding/decoding) may be too simplistic. Need more general models of error, plus encoding/decoding for such models. Most pessimistic model: errors are worst-case.

24 Gap between worst-case and random errors. In the Shannon model, with a binary channel: can correct up to 50% (random) errors (a 1 − 1/q fraction of errors if the channel is q-ary). In the Hamming model, for a binary channel: a code with more than n codewords has distance at most 50%, so it corrects at most 25% worst-case errors (½(1 − 1/q) errors in the q-ary case). The Shannon model corrects twice as many errors: need new approaches to bridge the gap.

25 Approach: List-decoding. Main reason for the gap between Shannon and Hamming: the insistence on uniquely recovering the message.

26 List-decoding [Elias '57, Wozencraft '58]: a relaxed notion of recovery from error. The decoder produces a small list (of L codewords) guaranteed to include the message. A code is (p, L)-list-decodable if it corrects a p fraction of errors with lists of size L.
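
The definition is easy to state as (exponential-time) code; a brute-force sketch added for illustration:

```python
def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def list_decode(C, received, p):
    # (p, L)-list-decoding by brute force: return all codewords within
    # radius p*n of the received word. The code is (p, L)-list-decodable
    # if this list always has size <= L; the algorithmic results below
    # compute it efficiently for structured codes.
    n = len(received)
    return [c for c in C if hamming(c, received) <= p * n]

C = ["00000", "11111", "00111"]
print(list_decode(C, "01111", p=0.4))  # -> ['11111', '00111']
```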

27 What to do with the list? Probabilistic error: the list has size one w.p. nearly 1. General channel: side information of only O(log n) bits suffices to disambiguate [Guruswami '03]. (Alternatively, if sender and receiver share O(log n) bits, they can disambiguate [Langberg '04].) Computationally bounded error: model introduced by [Lipton; Ding-Gopalan-Lipton]; list-decoding results can be extended (assuming a PKI and some memory at the sender) [Micali et al.].

28 List-decoding: State of the art. [Zyablov-Pinsker / Blinovskii, late '80s]: there exist codes of rate 1 − H_q(p) − ε that are (p, O(1))-list-decodable. Matches Shannon's converse perfectly! (So one can't do better, even for random error!) But [ZP/B] is non-constructive.

29 Algorithms for List-decoding. Not examined till '88. First results: [Goldreich-Levin] for "Hadamard" codes (non-trivial in their setting). More recent work: [S. '96, Shokrollahi-Wasserman '98, Guruswami-S. '99, Parvaresh-Vardy '05, Guruswami-Rudra '06]: decode algebraic codes. [Guruswami-Indyk '00-'02]: decode graph-theoretic codes.

30 Results in List-decoding. q-ary case: [Guruswami-Rudra '06] give codes of rate R correcting a 1 − R − ε fraction of errors, with q = q(ε). Matches the Shannon bound (except for q(ε)). Binary case: there exist codes of rate ε^c correcting a ½ − ε fraction of errors; c = 4: Guruswami et al. 2000; c → 3: implied by Parvaresh-Vardy '05; c = 3: Guruswami-Rudra.

31 Few lines about Guruswami-Rudra. Code = collated Reed-Solomon code + concatenation. The code maps Σ^K → Σ^N for N ≈ q/C. Message: a degree C·K polynomial over F_q. Alphabet Σ = F_q^C; q → ∞, C constant.

32 Few lines about Guruswami-Rudra (contd.). Special properties: each S_i is an orbit S_i = {α, γ·α, ..., γ^{C−1}·α}, where γ satisfies x^q = γx mod h(x) for an irreducible h of degree CK. Is this code combinatorially good? Do balls of radius (1 − o(1))·(N − K) have few codewords? Algorithmically good!! (Uses ideas from [S. '96, GS '98, PV '05], plus new ones.) Can concatenate to reduce the alphabet size.

33 Few lines about Guruswami-Rudra (contd.). Warnings: K, N, and the partition are all very special. (Recall: the code maps Σ^K → Σ^N for N ≈ q/C; the message is a degree C·K polynomial P over F_q; the alphabet is Σ = F_q^C, q → ∞, C constant.) Encoding: first partition F_q into special sets S_0, S_1, ..., S_N, with |S_1| = ··· = |S_N| = C. Say S_1 = {α_1, ..., α_C}, S_2 = {α_{C+1}, ..., α_{2C}}, etc. The encoding of P is ⟨⟨P(α_1), ..., P(α_C)⟩, ⟨P(α_{C+1}), ..., P(α_{2C})⟩, ...⟩.
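
A drastically simplified toy of the collation pattern (hypothetical parameters: q a small prime and consecutive blocks of evaluation points, instead of the special orbits S_i that Guruswami-Rudra actually need):

```python
q, C = 13, 3                 # field size (toy prime), collation parameter
msg = [2, 7, 1, 0, 5]        # coefficients of a low-degree polynomial P

def P(x):
    # evaluate P at x over F_q
    return sum(c * pow(x, i, q) for i, c in enumerate(msg)) % q

# Group the q - 1 nonzero evaluation points into N blocks of C; each
# codeword symbol is a C-tuple of evaluations, i.e., lives in F_q^C.
N = (q - 1) // C
codeword = [tuple(P(1 + i * C + j) for j in range(C)) for i in range(N)]
print(codeword)              # N symbols, each in F_q^C
```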

34 Major open question. Construct (p, O(1))-list-decodable binary codes of rate 1 − H(p) − ε with polynomial-time list decoding. (Note: if the running time is poly(1/ε), then this also implies a solution to the random-error problem.)

35 Conclusions. Coding theory: very practically motivated problems; solutions influence (if not directly alter) practice. Many mysteries remain in the combinatorial setting. Significant progress in the algorithmic setting, but many more questions to resolve.

36 LDPC Codes. Defines E: {0,1}^k → {0,1}^n via a sparse bipartite graph: n left vertices, n − k right vertices. (Figure: such a graph, with left vertices labeled 010001111.) A codeword is a 0/1 assignment to the left vertices such that the neighborhood of each right vertex has even parity; the right vertices are parity checks. The graph has low density, hence Low-Density Parity-Check Codes.

37 LDPC Codes. (Figure only.)


39 LDPC Codes. Decoding intuition: a failed parity check ⇒ some neighbor is corrupted; few neighbors ⇒ assigning blame works. [Gallager '63, ..., Sipser-Spielman '92]: correct an Ω(1) fraction of errors. Current hope: picking the degrees carefully will lead to a code/algorithm correcting a p fraction of random errors.
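
The blame-assignment intuition in code: a small Gallager-style bit-flipping sketch (illustrative; H is a toy matrix, not a graph from the talk, and real LDPC decoders use carefully chosen irregular degrees or belief propagation):

```python
import numpy as np

H = np.array([[1, 1, 0, 1, 0, 0],   # rows = parity checks (right vertices)
              [0, 1, 1, 0, 1, 0],   # columns = codeword bits (left vertices)
              [1, 0, 1, 0, 0, 1]])

def bit_flip_decode(H, y, iters=10):
    y = y.copy()
    for _ in range(iters):
        syndrome = H @ y % 2        # which parity checks fail
        if not syndrome.any():
            break                   # all checks pass: done
        blame = H.T @ syndrome      # per-bit count of failed checks
        y[np.argmax(blame)] ^= 1    # flip the most-blamed bit
    return y

y = np.array([0, 0, 1, 0, 0, 0])    # all-zero codeword with bit 2 corrupted
print(bit_flip_decode(H, y))        # -> [0 0 0 0 0 0]
```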

