Repairable Fountain Codes
Megasthenis Asteris, Alexandros G. Dimakis
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 32, NO. 5, MAY 2014
2014/5/22
Outline
– Introduction
– Problem Description
– Prior Work
– Repairable Fountain Codes
– A Graph Perspective
– Simulations
– Conclusion
Introduction
Fountain codes [2], [3], [4] form a new family of linear erasure codes with several attractive properties. For a given set of k input symbols, a Fountain code produces a potentially limitless stream of output symbols. Given a randomly selected subset of (1+ɛ)k encoded symbols, a decoder should be able to recover the original k input symbols with high probability (w.h.p.) for some small overhead ɛ.
[2] J. W. Byers, M. Luby, M. Mitzenmacher, and A. Rege, “A digital fountain approach to reliable distribution of bulk data.”
[3] M. Luby, “LT codes.”
[4] A. Shokrollahi, “Raptor codes.”
Introduction
This paper designs a new family of Fountain codes that combines multiple properties appealing to distributed storage:
– systematic form
– efficient repair
– locality
Introduction
A key observation is that in a systematic linear code, locality is strongly connected to the sparsity of the parity symbols [11], i.e., the maximum number of input symbols combined in a parity symbol. A parity symbol together with the systematic symbols it covers forms a local group.
– The smaller the local group, the lower (better) the locality of the symbols in it.
[11] P. Gopalan, C. Huang, H. Simitci, and S. Yekhanin, “On the locality of codeword symbols.”
Introduction
The availability of a systematic symbol naturally extends the notion of locality by measuring the number of disjoint local groups the symbol belongs to.
– The availability of a systematic symbol is defined as the number of disjoint sets of encoded symbols that can each be used to reconstruct that symbol.
Problem Description
The generator matrix G should have the following properties:
– Systematic form
– Rateless property
– MDS property
– Low locality
– High availability
Problem Description
For any code, a sufficiently large subset of encoded symbols should allow recovery of the original data.
– For MDS codes, the information-theoretic minimum of k encoded symbols suffices to decode.
– In systematic form, the generator matrix of an MDS code can have no zero coefficient in the parity-generating columns, so every parity combines all k input symbols.
Problem Description
This paper requires that, for ɛ > 0, a set of k’ = (1+ɛ)k randomly selected encoded symbols suffices to decode with high probability.
– This property is called near-MDS.
It is impossible to recover the original message with high probability if the parities are linear combinations of fewer than Ω(log k) input symbols.
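A heuristic coverage calculation suggests where the Ω(log k) bound comes from. It is only a sketch under assumptions not stated on the slide: the decoder holds m = O(k) parities, each combining at most d input symbols drawn uniformly at random, and a fraction δ > 0 of the systematic symbols is missing. Every missing input must appear in at least one received parity.

```latex
\begin{align*}
\Pr[\text{a missing input } u \text{ appears in no received parity}]
  &\ge \left(1 - \tfrac{1}{k}\right)^{dm} \approx e^{-dm/k}, \\
\mathbb{E}[\#\,\text{uncovered missing inputs}]
  &\approx \delta k \, e^{-dm/k}, \\
\delta k \, e^{-dm/k} \to 0
  &\iff \frac{dm}{k} - \ln(\delta k) \to \infty .
\end{align*}
```

With m = O(k), the last condition forces d to grow at least like (k/m) ln(δk), that is, d = Ω(log k). This is the same effect as in the coupon collector problem.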
Prior Work
In LT codes, the average degree of the output symbols, i.e., the number of input symbols combined into an output symbol, is O(log k). However, sparsity in this case does not imply good locality, since LT codes lack systematic form.
Prior Work
Raptor codes are a different class of Fountain codes.
– The core idea is to precode the input symbols before applying an appropriate LT code.
– The k input symbols can be retrieved in linear time from a set of (1+ɛ)k encoded symbols.
However, the original Raptor design does not feature the highly desirable systematic form.
– Hence its sparsity does not imply good locality.
Prior Work
In [16], Gummadi proposes systematic variants of LT and Raptor codes that feature low (even constant) expected repair complexity.
– However, the overhead ɛ required for decoding is suboptimal: it cannot be made arbitrarily small.
[16] R. Gummadi, “Coding and scheduling in networks for erasures and broadcast.”
Repairable Fountain Codes
A new family of Fountain codes that are systematic and also have sparse parities.
– Each parity symbol is a random linear combination of up to d = Ω(log k) randomly chosen input symbols.
– A set of k’ = (1+ɛ)k randomly selected encoded symbols must suffice to decode, with probability of failure vanishing like 1/poly(k).
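The construction above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the prime field size `P = 257`, the constant `c = 4`, the use of nonzero coefficients, and the helper names `make_parity` and `encode_parity` are all assumptions made for the example.

```python
import math
import random

P = 257  # assumed prime field size; the slides do not fix a field


def make_parity(k, c=4, rng=random):
    """Return {input_index: coefficient} describing one sparse parity symbol."""
    d = max(1, math.ceil(c * math.log(k)))
    # d draws with replacement, so the parity touches at most d distinct inputs.
    support = {rng.randrange(k) for _ in range(d)}
    # Nonzero coefficients are an assumption of this sketch.
    return {i: rng.randrange(1, P) for i in support}


def encode_parity(data, parity):
    """Evaluate the parity as a linear combination of data symbols mod P."""
    return sum(coef * data[i] for i, coef in parity.items()) % P


rng = random.Random(0)
k = 100
data = [rng.randrange(P) for _ in range(k)]
parities = [make_parity(k, rng=rng) for _ in range(k)]  # k parities: rate 1/2
# Systematic form: the data symbols appear unchanged, followed by the parities.
codeword = data + [encode_parity(data, p) for p in parities]
```

Because the code is rateless, further parities can be generated at any time by calling `make_parity` again; nothing about earlier symbols changes.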
Repairable Fountain Codes
U is the set of k input symbols; V is the set of n encoded symbols.
Repairable Fountain Codes
In summary, decreasing d(k) improves the locality of the code.
– The question is how small d(k) can be while still ensuring that a randomly selected set of k’ = (1+ɛ)k symbols is decodable.
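To see why a small d(k) means cheap repair, consider rebuilding one lost systematic symbol from a single parity that covers it. The sketch below is illustrative only: the prime field `P = 257`, the toy data, and the parity `{0: 3, 2: 5, 4: 7}` are hypothetical values, and `repair` is a name introduced for this example.

```python
P = 257  # assumed prime field size


def repair(u, parity, parity_value, data):
    """Recover data[u] from one parity covering u and the other symbols it covers."""
    # Subtract the contribution of every other symbol in the local group.
    partial = sum(coef * data[i] for i, coef in parity.items() if i != u) % P
    inv = pow(parity[u], -1, P)  # modular inverse (Python 3.8+)
    return (parity_value - partial) * inv % P


data = [10, 20, 30, 40, 50]
parity = {0: 3, 2: 5, 4: 7}  # hypothetical parity covering symbols 0, 2 and 4
parity_value = sum(c * data[i] for i, c in parity.items()) % P

lost = 2
damaged = list(data)
damaged[lost] = None  # the symbol to be repaired is unavailable
recovered = repair(lost, parity, parity_value, damaged)
```

The repair reads the parity plus the other `len(parity) - 1` systematic symbols in its local group, so the number of symbols accessed is at most d(k).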
Repairable Fountain Codes
Theorem 3. Let rk be the total number of parities generated, with each parity symbol constructed as a linear combination of d(k) = c log(k) symbols selected independently and uniformly at random with replacement. The theorem gives the expected number of parities covering a systematic symbol u.
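A direct calculation under the sampling model stated in the theorem gives this expectation. It is a sketch that reads rk as r·k parities and uses only the with-replacement assumption; it is not a transcription of the theorem's own statement.

```latex
\begin{align*}
\Pr[\text{a given parity covers } u]
  &= 1 - \left(1 - \tfrac{1}{k}\right)^{d(k)}, \\
\mathbb{E}[\#\,\text{parities covering } u]
  &= rk \left[ 1 - \left(1 - \tfrac{1}{k}\right)^{d(k)} \right], \\
r\,d(k) - \frac{r\,d(k)\,(d(k)-1)}{2k}
  \;\le\; \mathbb{E}[\#\,\text{parities covering } u]
  &\;\le\; r\,d(k) = r\,c \log k .
\end{align*}
```

The bounds follow from 1 − dx ≤ (1 − x)^d ≤ 1 − dx + \binom{d}{2}x² with x = 1/k, so for d(k) = c log k the expectation grows like r c log k.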
Repairable Fountain Codes
Availability
A Graph Perspective
First, consider a balanced random bipartite graph where |U| = |V| = k and each vertex of V is randomly connected to d(k) = c log k nodes in U. A classical result by Erdős and Rényi [14] shows that such graphs have perfect matchings with high probability.
[14] P. Erdős and A. Rényi, “On random matrices.”
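The matching claim can be explored numerically. The sketch below samples such a balanced graph and computes a maximum matching with augmenting paths (Kuhn's algorithm), a standard technique chosen here for brevity and not taken from the slides. The values `k = 200` and `c = 4`, and the names `random_bipartite` and `max_matching`, are assumptions for illustration.

```python
import math
import random


def random_bipartite(k, c, rng):
    """Each of the k vertices in V picks d = ceil(c ln k) neighbours in U, with replacement."""
    d = math.ceil(c * math.log(k))
    return [sorted({rng.randrange(k) for _ in range(d)}) for _ in range(k)]


def max_matching(adj, k):
    """Size of a maximum matching between V (indices of adj) and U (0..k-1)."""
    match_u = [-1] * k  # match_u[u] = vertex of V matched to u, or -1 if free

    def try_augment(v, seen):
        for u in adj[v]:
            if u in seen:
                continue
            seen.add(u)
            # Take u if it is free, or if its current partner can be re-matched.
            if match_u[u] == -1 or try_augment(match_u[u], seen):
                match_u[u] = v
                return True
        return False

    return sum(try_augment(v, set()) for v in range(len(adj)))


rng = random.Random(1)
k = 200
adj = random_bipartite(k, c=4, rng=rng)
size = max_matching(adj, k)
is_perfect = size == k
```

A perfect matching corresponds to `is_perfect` being true; repeating the experiment over many seeds and smaller values of `c` shows how the matching probability depends on the degree.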
A Graph Perspective
However, the graphs of interest here are unbalanced:
– |U| = k and |V| = k’ = (1+ɛ)k vertices, for an overhead ɛ ≥ 0.
Simulations
This paper experimentally evaluates the probability that decoding fails when a randomly selected subset of encoded symbols is available at the decoder.
– The rate is set to ½.
– The generator matrix comprises the k columns of the identity matrix and k parity-generating columns.
– d(k) = c log(k).
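An experiment of this kind can be sketched as follows. This is not the authors' simulation code: the prime field `P = 257`, the small parameters, and the function names are assumptions, and decoding success is taken to mean that the received symbols have rank k over the field.

```python
import math
import random

P = 257  # assumed prime field size; the slides do not state the field used


def rank_mod_p(rows, p=P):
    """Rank of a matrix (list of rows) over GF(p) by Gaussian elimination."""
    rows = [r[:] for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)
        rows[rank] = [x * inv % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def encoded_symbols(k, c, rng):
    """k systematic symbols then k sparse parities, each as a length-k coefficient vector."""
    d = math.ceil(c * math.log(k))
    symbols = [[int(i == j) for j in range(k)] for i in range(k)]
    for _ in range(k):
        vec = [0] * k
        for _ in range(d):  # d draws with replacement
            vec[rng.randrange(k)] = rng.randrange(1, P)
        symbols.append(vec)
    return symbols


def failure_rate(k, c, eps, trials, rng):
    """Fraction of trials where ceil((1+eps)k) random symbols do not have rank k."""
    k_prime = math.ceil((1 + eps) * k)
    failures = 0
    for _ in range(trials):
        received = rng.sample(encoded_symbols(k, c, rng), k_prime)
        failures += rank_mod_p(received) < k
    return failures / trials


rate = failure_rate(k=40, c=6, eps=0.2, trials=20, rng=random.Random(0))
```

Each encoded symbol is stored as the corresponding column of the generator matrix, so the rank test is equivalent to checking whether the selected columns span all k input symbols. Larger `k` and more trials are needed for estimates comparable to the reported curves.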
Simulations
d(k) = c log(k), with c = 6; k = 100, 300, 500.
Expected decoding overhead ɛ = 1 − 2Pe.
Simulations
d(k) = c log(k), with c = 4; k = 100, 300, 500.
Conclusion
This paper introduces a new family of Fountain codes that are systematic and also have parity symbols with logarithmic sparsity.
Conclusion
The main technical contribution is a novel random matrix result: a family of random matrices with non-independent entries has full rank with high probability.
– The analysis builds on the connections between matrix determinants and flows on random bipartite graphs, using techniques from [14], [15].
[14] P. Erdős and A. Rényi, “On random matrices.”
[15] A. G. Dimakis, V. Prabhakaran, and K. Ramchandran, “Decentralized erasure codes for distributed networked storage.”
Conclusion
[13] D. Wiedemann, “Solving sparse linear equations over finite fields.”