Published by Oscar Lee. Modified over 9 years ago.
1
236601 - Coding and Algorithms for Memories, Lecture 13
2
Large Scale Storage Systems
Big Data Players: Facebook, Amazon, Google, Yahoo, …
(Figure: cluster of machines running Hadoop at Yahoo!. Source: Yahoo!)
Failures are the norm
3
Node failures at Facebook
(Figure: node failures plotted by date.)
XORing Elephants: Novel Erasure Codes for Big Data. M. Sathiamoorthy, M. Asteris, D. Papailiopoulos, A. G. Dimakis, R. Vadali, S. Chen, and D. Borthakur, VLDB 2013
4
Problem Setup
Disks are stored together in a group (rack)
Disk failures should be supported
Requirements:
– Support as many disk failures as possible
– And yet…
  Optimal and fast recovery
  Low complexity
5
Reed Solomon Codes
6
Advantages:
– Support the maximum number of disk failures
– Are very common in practice and have relatively efficient encoding/decoding schemes
Disadvantages:
– Require working over large fields
  Solution: EvenOdd Codes
– Need to read the entire contents of k surviving disks to recover even a single disk failure, so rebuild is not efficient
  Solution: ZigZag Codes
7
The Repair Problem
(Figure: data blocks 1 to 10 and parity blocks P1, P2, P3, P4.)
A disk is lost: repair job starts
Access, read, and transmit data of disks!
Overuse of system resources during single repair
Goal: Reduce repair cost in a single disk repair
Facebook's storage scheme (RS code):
– 10 data blocks
– 4 parity blocks
– Can tolerate any four disk failures
8
ZigZag Codes
Designed by Itzhak Tamo, Zhiying Wang, and Jehoshua Bruck
The goal: construct codes that correct the maximum number of erasures and yet allow efficient reconstruction if only a single drive fails
9
ZigZag Codes
Lower bound: the minimum amount of data that must be read to recover from a single drive failure
– (n,k) code: n drives, k information drives, and n-k redundancy drives
– M: size of a single drive in bits
For an (n,n-2) code it is required to read at least 1/2 of the remaining drives, that is, at least (1/2)(n-1)M bits
– The last example is optimal
In general, for an (n,n-r) code it is required to read at least 1/r of the remaining drives, that is, at least (1/r)(n-1)M bits
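To make the bound concrete, here is a small sketch that evaluates (1/r)(n-1)M and compares it with reading k = n-r surviving drives in full, which is what a plain MDS decode does. The function names are hypothetical, and sizes are measured in units of one drive (M = 1) purely for illustration.

```python
from fractions import Fraction


def min_rebuild_read(n: int, r: int, drive_bits: int) -> Fraction:
    """Lower bound (1/r)(n-1)M on the bits read to rebuild one drive of an (n, n-r) code."""
    return Fraction(n - 1, r) * drive_bits


def full_decode_read(n: int, r: int, drive_bits: int) -> int:
    """Bits read when k = n - r surviving drives are read in full."""
    return (n - r) * drive_bits


M = 1  # assumption: measure everything in units of one drive

# (4, 2) code: the bound is 3/2 drives, a full decode reads 2 drives.
print(min_rebuild_read(4, 2, M), full_decode_read(4, 2, M))

# 10 data + 4 parity blocks (n = 14, r = 4): bound 13/4 versus 10 drives.
print(min_rebuild_read(14, 4, M), full_decode_read(14, 4, M))
```

The gap between the two numbers grows with r, which is why the single-failure case is worth optimizing separately.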
10
ZigZag Codes Example
Columns: info 1, info 2, info 3, Row parity, ZigZag parity
0 2 1 0
1 3 0 1
2 0 3 2
3 1 2 3
11
Network Coding for Distributed Storage
Goal: show that, in general, for an (n,n-r) code it is required to read at least 1/r of the remaining drives, that is, (1/r)(n-1)M
Network Coding for Distributed Storage, by Dimakis, Godfrey, Wu, Wainwright, Ramchandran
File of size M is partitioned into k pieces of size M/k
The k pieces are encoded into n encoded pieces using an (n,k) MDS code
12
Network Coding for Distributed Storage
(Figure: nodes labeled y1, y2 and x1, x2, x3, x4.)
13
Network Coding for Distributed Storage
(Figure: nodes labeled y1, y2, x1, x2, x3, x4 and x5, with three edges labeled β; β = ?)
14
Network Coding for Distributed Storage
(Figure: flow graph with source S, nodes x1 to x4 each split into "in" and "out" vertices, node x5 "in" with incoming edges labeled β (β = ?), an edge labeled α = 1, edges labeled ∞, x5 "out", and DC.)
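The slide leaves β as a question. A minimal sketch of one way to answer it uses the cut-set condition from the cited network coding paper: a data collector that reads any k nodes must see a cut of at least the file size, where the cut is the sum over i = 0..k-1 of min(α, (d-i)β) and d is the number of nodes the new node contacts. The function names, and the choice d = n-1 = 3 for the pictured case, are assumptions made here for illustration.

```python
from fractions import Fraction


def cut_capacity(k: int, d: int, alpha: Fraction, beta: Fraction) -> Fraction:
    """Cut-set value: sum over i = 0..k-1 of min(alpha, (d - i) * beta)."""
    return sum((min(alpha, (d - i) * beta) for i in range(k)), Fraction(0))


def min_beta_mds(file_size: Fraction, k: int, d: int) -> Fraction:
    """Smallest beta meeting the cut-set condition when alpha = file_size / k."""
    return Fraction(file_size) / (k * (d - k + 1))


# Pictured case (assumed parameters): n = 4, k = 2, alpha = 1, new node contacts d = 3 nodes.
n, k, d = 4, 2, 3
alpha = Fraction(1)
file_size = k * alpha  # MDS storage: each node holds file_size / k

beta = min_beta_mds(file_size, k, d)
print(beta)      # per-edge download
print(d * beta)  # total download for the repair
```

With these parameters the total download d·β equals (1/r)(n-1) times the size of one stored piece, which is the bound stated for the general (n,n-r) case. Note that M on these slides is the file size, so one drive holds M/k.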
15
ZigZag Codes Example
a    b    a+b    a+2d
c    d    c+d    c+b
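The array above can be checked by brute force. The sketch below assumes arithmetic modulo 3 (the slide does not name the field; a field in which 2 is invertible and differs from 1 is needed) and uses hypothetical helper names. It rebuilds each information column from 3 of the 6 surviving symbols, which is the 1/2 fraction of the lower bound for r = 2, and confirms that any two columns determine the data.

```python
from itertools import combinations, product

P = 3  # assumption: symbols live in the integers mod 3


def encode(a, b, c, d):
    """Columns of the 2x4 array: info 1, info 2, row parity, zigzag parity."""
    return [(a, c), (b, d),
            ((a + b) % P, (c + d) % P),
            ((a + 2 * d) % P, (c + b) % P)]


def rebuild_info1(cols):
    """Recover (a, c) by reading only b, a+b and c+b."""
    b = cols[1][0]
    return (cols[2][0] - b) % P, (cols[3][1] - b) % P


def rebuild_info2(cols):
    """Recover (b, d) by reading only a, a+b and a+2d."""
    a = cols[0][0]
    b = (cols[2][0] - a) % P
    d = ((cols[3][0] - a) * 2) % P  # 2 is its own inverse mod 3
    return b, d


def any_two_columns_suffice():
    """MDS check: every pair of columns takes a distinct value for each data word."""
    for i, j in combinations(range(4), 2):
        seen = {(encode(*w)[i], encode(*w)[j]) for w in product(range(P), repeat=4)}
        if len(seen) != P ** 4:
            return False
    return True


for a, b, c, d in product(range(P), repeat=4):
    cols = encode(a, b, c, d)
    assert rebuild_info1(cols) == (a, c)
    assert rebuild_info2(cols) == (b, d)

print(any_two_columns_suffice())
```

Replacing the coefficient 2 with 1 makes the two parity columns insufficient on their own, which is why the example cannot be built over the binary field.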