1
Erasure Codes for Reading and Writing
Mario Vodisek (joint work with AG Schindelhauer)
HEINZ NIXDORF INSTITUTE, University of Paderborn, Algorithms and Complexity
2
Agenda
- Erasure (Resilient) Codes in storage networks
- The Read-Write-Coding-System
  - A Lower Bound and Perfect Codes
  - Requirements and Techniques
3
Erasure (Resilient) Coding
- An n-symbol message x with symbols from an alphabet $\Sigma$
- An m-symbol encoding y with symbols from $\Sigma$ (m > n)
- Erasure coding provides a mapping $\Sigma^n \to \Sigma^m$ such that
  - reading any r symbols of y, with $n \le r < m$, is sufficient for recovery
  - (mostly r = n, which is optimal for reading)
- Advantages:
  - m - r erasures can be tolerated
  - storage overhead is a factor of m/n
- Generally, erasure codes are used to guarantee information recovery for data transmission over unreliable channels (RS, Turbo, LT codes, …)
- Much research addresses code properties such as
  - scalability
  - encoding/decoding speed-up
  - ratelessness
- Also attractive for storage networks: downloads (P2P) and fault tolerance
[Figure: coding]
4
Erasure Codes for Storage (Area) Networks
SANs require:
- high system availability
  - disks fail or are blocked (probability correlates with size)
- efficient modification handling
  - slow devices $\Rightarrow$ expensive I/O operations
Properties:
- a fixed set E of existing errors can be considered at encoding time
- E may have changed to E' at decoding time
Additional requirements for erasure codes:
- tolerate a certain number of erasures
- allow modification of the codeword even if erasures occur
- consider E at encoding time and E' at decoding time
[Figure: network]
5
The Read-Write-Coding-System
An $(n, r, w, m)_b$ Read-Write-Coding System (RWC) is defined as follows:
- The base b: a b-symbol alphabet $\Sigma_b$ as the set of all used items
- $n \ge 1$ blocks of information $x_1, \dots, x_n \in \Sigma_b$
- $m \ge n$ code blocks $y_1, \dots, y_m \in \Sigma_b$
- any r code blocks, $n \le r \le m$, are sufficient to read the information
- any w code blocks, $n \le w \le m$, are sufficient to change the information by $\delta_1, \dots, \delta_n$
In the language of coding theory: given m, n, r, w, RW-Codes provide a (linear) code of dimension n and block length m such that for $n \le r, w \le m$:
- the minimum distance of the code is at least m - r + 1
- any two codewords $y_1, y_2$ are within a distance of at most w from one another
- $\mathrm{distance}(x, y) := |\{1 \le i \le m : x_i \ne y_i\}|$
[Figure: coding, labelled with m, r, w and n]
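The two distance conditions can be checked by brute force on a very small code. The sketch below is only an illustration: the toy code (two information bits plus one XOR parity bit) and the helper names `distance` and `distance_range` are assumptions, not part of the slides.

```python
from itertools import product

def distance(x, y):
    """Number of positions in which two equal-length words differ."""
    return sum(1 for a, b in zip(x, y) if a != b)

def distance_range(codewords):
    """Smallest and largest distance between two distinct codewords."""
    dists = [distance(a, b) for a in codewords for b in codewords if a != b]
    return min(dists), max(dists)

# Assumed toy code: two information bits plus one XOR parity bit (n = 2, m = 3).
codewords = [(x1, x2, x1 ^ x2) for x1, x2 in product((0, 1), repeat=2)]
d_min, d_max = distance_range(codewords)

# Parameters of the parity code read as an (n, r, w, m) system with r = n, w = m.
n, r, w, m = 2, 2, 3, 3
print("minimum distance >= m - r + 1:", d_min >= m - r + 1)
print("maximum distance <= w:", d_max <= w)
```

Enumerating all codeword pairs is only feasible for tiny codes; it is meant to make the definition concrete, not to serve as a test for realistic parameters.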
6
A Lower Bound for RW-Codes
Theorem: For r + w < n + m and any base b there does not exist any $(n, r, w, m)_b$-RWC system.
We know: $n \le r, w \le m$
Assume: r = w = n and $m \ge n + 1$; consider a write and a subsequent read.
Proof:
- Index sets (W, R): |W| = w, |R| = r, $S = W \cap R$, $|S| \in \{n, n-1\}$
- Assume |S| = n: there are $b^n$ possible change vectors to be encoded by `write` into S, which is the only basis for reading with r = n (notice: the code words in $R \setminus S$ remain unchanged)
- Assume |S| = n - 1 < n: at most $b^{n-1}$ possible change vectors can be encoded by `write` into S $\Rightarrow$ `read` will produce faulty output
[Figure: index sets W (size w) and R (size r)]
7
Codes at Lower Bound: Perfect Codes
- In the best case, $(n, r, w, m)_b$-RWC have parameters r + w = n + m (perfect codes)
- Unfortunately, perfect RWC do not always exist:
  - e.g. there is no $(1, 2, 2, 3)_2$-RWC, but there exists a $(1, 2, 2, 3)_3$-RWC
- But: all perfect RW-Codes exist if the alphabet is sufficiently large
Note on RAID:
- The definition of parity RAID (RAID 4/5) corresponds to an $(n, n, n+1, n+1)_2$-RWC
- From the lower bound it follows that there is no $(n, n, n, n+1)_2$-RWC $\Rightarrow$ there is no RAID system with improved access properties
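The parameter conditions and the RAID correspondence can be made concrete with a short sketch. The function names below are hypothetical, and the XOR parity scheme is the usual RAID 4/5 construction rather than anything specific to the slides. Passing the bound check does not imply that a code exists for a given base, as the $(1, 2, 2, 3)_2$ case shows.

```python
from functools import reduce
from operator import xor

def within_bound(n, r, w, m):
    """True if (n, r, w, m) is not ruled out by the bound r + w >= n + m."""
    return n <= r <= m and n <= w <= m and r + w >= n + m

def is_perfect(n, r, w, m):
    """True if the parameters meet the bound with equality."""
    return n <= r <= m and n <= w <= m and r + w == n + m

def parity_encode(data):
    """n data blocks followed by one XOR parity block (m = n + 1)."""
    return list(data) + [reduce(xor, data, 0)]

def parity_read(code, missing):
    """Recover the n data blocks from the n blocks other than index `missing`."""
    n = len(code) - 1
    rebuilt = reduce(xor, (c for i, c in enumerate(code) if i != missing), 0)
    return [rebuilt if i == missing else code[i] for i in range(n)]

n = 4
print(is_perfect(n, n, n + 1, n + 1))   # parity RAID parameters
print(within_bound(n, n, n, n + 1))     # the "improved" RAID parameters

code = parity_encode([5, 9, 12, 3])
print(parity_read(code, missing=1))
```

Reading works from any n of the n + 1 blocks (r = n), while changing arbitrary information touches all n + 1 blocks (w = n + 1), which matches the parameters quoted for parity RAID.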
8
The Model: Operations
Given:
- $X = x_1, \dots, x_n$, the n-symbol information vector over a finite alphabet $\Sigma$
- $Y = y_1, \dots, y_m$, the m-symbol code over $\Sigma$, with $b = |\Sigma|$
- P(M): the power set of M, $P_k(M) := \{S \in P(M) : |S| = k\}$
- Define $[m] := \{1, \dots, m\}$
An $(n, r, w, m)_b$-RWC-system consists of the following operations:
- Initial state: $X_0 \in \Sigma^n$, $Y_0 \in \Sigma^m$
- Read function: $f : P_r([m]) \times \Sigma^r \to \Sigma^n$
- Write function: $g : P_r([m]) \times \Sigma^r \times P_w([m]) \times \Sigma^n \to \Sigma^w$
- Differential write function: $\delta : P_w([m]) \times \Sigma^n \to \Sigma^w$
9
Initialization: Compute the Encoding $Y_0$
Given (in general):
- the information vector $X = x_1, \dots, x_n \in \Sigma_b$
- the encoded vector $Y = y_1, \dots, y_m \in \Sigma_b$
- internal variables $V = v_1, \dots, v_k$ for k = m - w = r - n, carrying no particular information
- a set of functions $M = M_1, \dots, M_m$ for encoding
Compute $y_i$ from X and V by function $M_i$; define $M_i$ as a linear combination of X and V:
$y_i = M_i(x_1, \dots, x_n, v_1, \dots, v_k) = \sum_{j=1}^{n} x_j M_{i,j} + \sum_{l=1}^{k} v_l M_{i,n+l}$
(Define M as some $m \times r$ matrix with the $M_i$ as rows. It follows: $M \, (X \mid V)^T = Y$)
RW-Codes are closely related to Reed-Solomon codes.
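The linear encoding above amounts to one matrix-vector product over the field. The following is a minimal sketch under stated assumptions: it works modulo a prime P = 257 instead of in F[2^v], it picks a matrix whose row i holds the powers of the point i + 1, and the names `vandermonde` and `encode` as well as the sample values are hypothetical.

```python
P = 257  # assumed prime, so that the integers mod P form a field with b = P elements

def vandermonde(m, r, p=P):
    """m x r matrix whose row i is (a^0, a^1, ..., a^(r-1)) with a = i + 1."""
    return [[pow(i + 1, j, p) for j in range(r)] for i in range(m)]

def encode(M, x, v, p=P):
    """Return Y with y_i = sum_j x_j M[i][j] + sum_l v_l M[i][n + l] (mod p)."""
    xv = list(x) + list(v)
    if any(len(row) != len(xv) for row in M):
        raise ValueError("M must have n + k = r columns")
    return [sum(c * s for c, s in zip(row, xv)) % p for row in M]

# Hypothetical parameters: n = 2, r = 3, w = 3, m = 4, so k = m - w = r - n = 1.
M = vandermonde(4, 3)
Y0 = encode(M, x=[10, 20], v=[7])
print(Y0)
```

The slack value v = 7 carries no information; any field element would do for the initial encoding.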
10
The Matrix Approach: $(n, r, w, m)_b$-RWC
Consider:
- the information vector $X = x_1, \dots, x_n \in \Sigma_b$
- the encoded vector $Y = y_1, \dots, y_m \in \Sigma_b$
- internal slack variables $V = v_1, \dots, v_k$ for k = m - w = r - n
Further:
- an $m \times r$ generator matrix M with $M_{i,j} \in \Sigma_b$
- the submatrix $(M_{i,j})_{i \in [m],\, j \in \{n+1, \dots, r\}}$ is called the variable matrix
[Figure: matrix equation $M \, (X \mid V)^T = Y$]
11
Efficient Encoding: $\Sigma_b = F[b]$ (Finite Fields)
- RWC requires efficient arithmetic on elements of $\Sigma_b$ for encoding $\Rightarrow$ set $\Sigma_b = F[b]$, the finite field with b elements (also written GF(b))
- $b = p^n$ for some prime number p and integer n $\Rightarrow$ $F[p^n]$ always exists
- Computation on binary words of length v: $b = 2^v$, $F[2^v] = \{0, \dots, 2^v - 1\}$
Features:
- F[b] is closed under addition and multiplication $\Rightarrow$ exact computation on field elements $\Rightarrow$ no more than v bits are needed to represent results
- Addition and subtraction via XOR (no rounding, no carry)
- Multiplication and division via mapping tables (analogous to logarithm tables for real numbers)
  - T: table mapping an integer to its logarithm in $F[2^v]$
  - IT: table mapping an integer to its inverse logarithm in $F[2^v]$
  - $\Rightarrow$ multiply or divide by adding or subtracting the logarithms and taking the inverse logarithm
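The table-based arithmetic can be sketched for v = 8 as follows. The choice of the primitive polynomial 0x11D (x^8 + x^4 + x^3 + x^2 + 1) with generator 2 is an assumption made for this example; the slides do not name a polynomial. The table names T and IT follow the slide, while the function names are hypothetical.

```python
V = 8
PRIM_POLY = 0x11D          # assumed primitive polynomial x^8 + x^4 + x^3 + x^2 + 1
ORDER = (1 << V) - 1       # number of nonzero field elements (255)

T = [0] * (1 << V)         # T[a]  = logarithm of a (undefined for a = 0)
IT = [0] * ORDER           # IT[e] = inverse logarithm, i.e. 2^e in the field

x = 1
for e in range(ORDER):
    IT[e] = x
    T[x] = e
    x <<= 1                # multiply by the generator 2
    if x & (1 << V):       # reduce modulo the primitive polynomial
        x ^= PRIM_POLY

def gf_add(a, b):
    """Addition and subtraction coincide and are a bitwise XOR."""
    return a ^ b

def gf_mul(a, b):
    """Multiply by adding logarithms and taking the inverse logarithm."""
    if a == 0 or b == 0:
        return 0
    return IT[(T[a] + T[b]) % ORDER]

def gf_div(a, b):
    """Divide by subtracting logarithms and taking the inverse logarithm."""
    if b == 0:
        raise ZeroDivisionError("division by zero in F[2^v]")
    if a == 0:
        return 0
    return IT[(T[a] - T[b]) % ORDER]

print(gf_mul(2, 128), gf_div(gf_mul(87, 19), 19))
```

Zero has no logarithm, so it is handled separately in both `gf_mul` and `gf_div`. Each table needs about 2^v entries, which is why the approach is practical for small word lengths such as 8 or 16 bits.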
12
The Vandermonde Matrix
- Consider M as an $m \times r$ Vandermonde matrix with $M_{i,j} = j^{i-1}$
- $X, Y, V \in F[b]$, $M_{i,j} \in F[b]$, and all elements are different
- The Vandermonde matrix is non-singular $\Rightarrow$ invertible
- Any $k' \times k'$ submatrix M' is also invertible
- Each device i in the SAN corresponds to a row of M and the element $y_i$
[Figure: matrix equation with the Vandermonde matrix M]
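The property that reading relies on, namely that any r rows of M form an invertible matrix, can be checked numerically for small parameters. The sketch below rests on assumptions that differ from the slide's notation: it works modulo a prime P = 257 rather than in F[2^v], and it uses one distinct nonzero evaluation point per row, so that row i is (a_i^0, ..., a_i^(r-1)). The helper names are hypothetical.

```python
from itertools import combinations

P = 257  # assumed prime field size

def vandermonde(m, r, p=P):
    """m x r matrix whose row i is (a^0, a^1, ..., a^(r-1)) with a = i + 1."""
    return [[pow(i + 1, j, p) for j in range(r)] for i in range(m)]

def det_mod(A, p=P):
    """Determinant of a square matrix modulo the prime p (Gaussian elimination)."""
    A = [[c % p for c in row] for row in A]
    size, det = len(A), 1
    for col in range(size):
        pivot = next((i for i in range(col, size) if A[i][col]), None)
        if pivot is None:
            return 0                       # no pivot: the matrix is singular
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            det = -det                     # a row swap flips the sign
        det = det * A[col][col] % p
        inv = pow(A[col][col], -1, p)      # modular inverse, Python 3.8+
        for i in range(col + 1, size):
            f = A[i][col] * inv % p
            A[i] = [(a - f * b) % p for a, b in zip(A[i], A[col])]
    return det % p

def any_r_rows_invertible(M, p=P):
    """Check every choice of r rows of the m x r matrix M."""
    r = len(M[0])
    return all(det_mod([M[i] for i in rows], p) != 0
               for rows in combinations(range(len(M)), r))

print(any_r_rows_invertible(vandermonde(6, 3)))
```

The check only covers full r-row selections; it says nothing about arbitrary square submatrices, which would need a separate test.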
13
Reading (or Recovery)
Read: given any r code entries from Y, compute X.
- Rearrange the rows of M and Y such that the first r entries of Y are the available ones, giving $M \to M'$ and $Y \to Y'$ (any r rows of M are linearly independent in a Vandermonde matrix)
- The first r rows of M' form an invertible $r \times r$ matrix M''
- X is computed by $(X \mid V)^T = (M'')^{-1} Y''$, where $Y''$ holds the first r entries of Y'
[Figure: M, (X | V) and Y, with the first r of the m rows selected as M' and Y']
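Instead of forming the inverse explicitly, the read step can solve the r x r linear system directly. This is a sketch under assumptions: arithmetic is modulo a prime P = 257 rather than in F[2^v], row i of M holds the powers of the point i + 1, and the names `solve` and `read` plus the sample data are hypothetical.

```python
P = 257  # assumed prime field size

def vandermonde(m, r, p=P):
    """m x r matrix whose row i is (a^0, a^1, ..., a^(r-1)) with a = i + 1."""
    return [[pow(i + 1, j, p) for j in range(r)] for i in range(m)]

def encode(M, xv, p=P):
    """Y = M (X | V)^T modulo p."""
    return [sum(c * s for c, s in zip(row, xv)) % p for row in M]

def solve(A, y, p=P):
    """Solve A z = y modulo p by Gauss-Jordan elimination (A assumed invertible)."""
    size = len(A)
    aug = [[c % p for c in row] + [rhs % p] for row, rhs in zip(A, y)]
    for col in range(size):
        pivot = next(i for i in range(col, size) if aug[i][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, p)    # modular inverse, Python 3.8+
        aug[col] = [a * inv % p for a in aug[col]]
        for i in range(size):
            if i != col and aug[i][col]:
                f = aug[i][col]
                aug[i] = [(a - f * b) % p for a, b in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]

def read(M, available, n, p=P):
    """available maps r row indices to their code entries; returns X."""
    rows = sorted(available)
    z = solve([M[i] for i in rows], [available[i] for i in rows], p)
    return z[:n]                           # the remaining entries are the slack V

# Hypothetical (n, r, w, m) = (2, 3, 3, 4) example with X = (10, 20), V = (7).
M = vandermonde(4, 3)
Y = encode(M, [10, 20, 7])
print(read(M, {0: Y[0], 2: Y[2], 3: Y[3]}, n=2))   # entry 1 is unavailable
```

Selecting the rows by index plays the role of the rearrangement into M' and Y'; the solved vector contains X followed by the slack variables V.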
14
Differential Write
Given:
- the change vector $\delta = \delta_1, \dots, \delta_n$ and w code entries from Y
- $X' = X + \delta$ is the new information vector $\Rightarrow$ X is changed without reading entries (XOR)
- compute the difference $\gamma$ for the w code entries of Y
Further:
- only choices w < r make sense
- rearrange the $m \times r$ matrix M and Y so that the w code entries to be written come first, $y_1, \dots, y_w$ (denote the results M' and Y')
- k = r - n (slack vector V)
15
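Differential Write (cont'd)
Define the following sub-matrices of M':
- $M_{\leftarrow\uparrow} = (M'_{i,j})_{i \in [w],\, j \in [n]}$
- $M_{\uparrow\rightarrow} = (M'_{i,j})_{i \in [w],\, j \in \{n+1, \dots, r\}}$
- $M_{\leftarrow\downarrow} = (M'_{i,j})_{i \in \{w+1, \dots, m\},\, j \in [n]}$
- $M_{\downarrow\rightarrow} = (M'_{i,j})_{i \in \{w+1, \dots, m\},\, j \in \{n+1, \dots, r\}}$
Block layout (rows 1..w above rows w+1..m, columns 1..n left of columns n+1..r):
$M' = \begin{pmatrix} M_{\leftarrow\uparrow} & M_{\uparrow\rightarrow} \\ M_{\leftarrow\downarrow} & M_{\downarrow\rightarrow} \end{pmatrix}$
$M_{\downarrow\rightarrow}$ is a $k \times k$ matrix, since k = m - w = r - n $\Rightarrow$ $M_{\downarrow\rightarrow}$ is invertible.
The vector Y can then be updated by a vector $\gamma = \gamma_1, \dots, \gamma_w$:
$\gamma = \left( M_{\leftarrow\uparrow} - M_{\uparrow\rightarrow} (M_{\downarrow\rightarrow})^{-1} M_{\leftarrow\downarrow} \right) \cdot \delta$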
16
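Differential Write: Proof
Use:
- vector $\rho = \rho_1, \dots, \rho_k$: the change of vector V
- vector $\gamma = \gamma_1, \dots, \gamma_w$: the change of vector Y
- $X' = X + \delta$, $V' = V + \rho$, $Y' = Y + \gamma$
Correctness follows by combining (with the rows of M arranged so that the w written entries come first):
$M \, (X' \mid V')^T = M \, (X \mid V)^T + M \, (\delta \mid \rho)^T = Y + (\gamma \mid 0)^T$
This equation is equivalent to:
$M_{\downarrow\rightarrow} \, \rho + M_{\leftarrow\downarrow} \, \delta = 0$, $\quad M_{\leftarrow\uparrow} \, \delta + M_{\uparrow\rightarrow} \, \rho = \gamma$
Since $\delta$ is given, $\rho$ is obtained as follows:
$\rho = (M_{\downarrow\rightarrow})^{-1} (-M_{\leftarrow\downarrow}) \cdot \delta$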
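The two equations of the proof translate into a two-step computation: first solve for the change of the slack vector so that the untouched code entries stay fixed, then compute the change of the w written entries. The sketch below is an illustration under assumptions: arithmetic is modulo a prime P = 257 rather than in F[2^v], row i of M holds the powers of the nonzero point i + 1 (which keeps the lower-right block invertible in this example), parameters satisfy m - w = r - n, and all function names and sample values are hypothetical.

```python
P = 257  # assumed prime field size

def vandermonde(m, r, p=P):
    """m x r matrix whose row i is (a^0, a^1, ..., a^(r-1)) with a = i + 1."""
    return [[pow(i + 1, j, p) for j in range(r)] for i in range(m)]

def encode(M, xv, p=P):
    """Y = M (X | V)^T modulo p."""
    return [sum(c * s for c, s in zip(row, xv)) % p for row in M]

def solve(A, y, p=P):
    """Solve A z = y modulo p by Gauss-Jordan elimination (A assumed invertible)."""
    size = len(A)
    aug = [[c % p for c in row] + [rhs % p] for row, rhs in zip(A, y)]
    for col in range(size):
        pivot = next(i for i in range(col, size) if aug[i][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, p)    # modular inverse, Python 3.8+
        aug[col] = [a * inv % p for a in aug[col]]
        for i in range(size):
            if i != col and aug[i][col]:
                f = aug[i][col]
                aug[i] = [(a - f * b) % p for a, b in zip(aug[i], aug[col])]
    return [row[-1] for row in aug]

def differential_write(M, W, delta, n, p=P):
    """Return (code_change, slack_change) for the written row indices W."""
    m, r = len(M), len(M[0])
    rest = [i for i in range(m) if i not in W]        # the k = m - w untouched rows
    if len(rest) != r - n:
        raise ValueError("sketch assumes m - w = r - n")
    # Untouched rows must not change: (lower-right block) * slack = -(lower-left) * delta
    rhs = [-sum(M[i][j] * delta[j] for j in range(n)) % p for i in rest]
    slack_change = solve([M[i][n:] for i in rest], rhs, p) if rest else []
    # Written rows: (upper-left block) * delta + (upper-right block) * slack
    code_change = {
        i: (sum(M[i][j] * delta[j] for j in range(n))
            + sum(M[i][n + l] * slack_change[l] for l in range(r - n))) % p
        for i in W
    }
    return code_change, slack_change

# Hypothetical (n, r, w, m) = (2, 3, 3, 4) example.
M = vandermonde(4, 3)
X, V = [10, 20], [7]
Y = encode(M, X + V)
W, delta = [0, 1, 3], [1, 2]
code_change, slack_change = differential_write(M, W, delta, n=2)
Y_new = [(y + code_change.get(i, 0)) % P for i, y in enumerate(Y)]
print(Y_new)
```

Neither X nor the current code entries are read to compute the update; only the change vector and the matrix are needed, and the entry outside W is left as it was.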
17
Thank you for your attention!
Heinz Nixdorf Institute & Computer Science Institute
University of Paderborn
Fürstenallee 11
33102 Paderborn, Germany
Tel.: +49 (0) 52 51/60 64 51
Fax: +49 (0) 52 51/62 64 82
E-Mail: vodisek@upb.de
http://www.upb.de/cs/ag-madh