
Network Coding for Distributed Storage Systems IEEE TRANSACTIONS ON INFORMATION THEORY, SEPTEMBER 2010 Alexandros G. Dimakis Brighten Godfrey Yunnan Wu.


1 Network Coding for Distributed Storage Systems. IEEE Transactions on Information Theory, September 2010. Alexandros G. Dimakis, Brighten Godfrey, Yunnan Wu, Martin J. Wainwright, Kannan Ramchandran

2 Outline • Introduction • Background • Analysis • Evaluation • Conclusion

3 Introduction • Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. • Storing erasure-coded data in distributed storage systems: ◦ the encoded data are spread across nodes. ◦ requires less redundancy than replication. ◦ stored data must be replaced periodically.

4 Introduction • Key issues in distributed storage systems: ◦ repair bandwidth ◦ storage space • How can encoded data be generated in a distributed way while transferring as little data as possible?

5 MDS Codes • A common practice for repairing a single node failure in an erasure-coded system: 1. a new node reconstructs the whole encoded data object. 2. it then generates just one encoded block. • Maximum Distance Separable (MDS) code ◦ (n, k)-MDS property ◦ the original file can be recovered from any k of the n encoded blocks.

6 MDS Codes • [Figure: a file of size M is divided into k pieces of size M/k, MDS-encoded, and stored at n nodes.]
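The (n, k)-MDS property can be illustrated with the simplest MDS code, a single XOR parity block with n = 3 and k = 2. This is only a sketch for intuition, not the code used in the paper; the function names `encode` and `decode` and the zero-padding choice are assumptions made for this example.

```python
from itertools import combinations


def encode(data: bytes) -> list[bytes]:
    """Split data into k = 2 halves and add one XOR parity block (n = 3)."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even length (illustrative choice)
    half = len(data) // 2
    a, b = data[:half], data[half:]
    parity = bytes(x ^ y for x, y in zip(a, b))
    return [a, b, parity]


def decode(blocks: dict[int, bytes]) -> bytes:
    """Recover the (padded) data from any two blocks, keyed by node index 0..2."""
    if 0 in blocks and 1 in blocks:
        a, b = blocks[0], blocks[1]
    elif 0 in blocks:  # have a and parity: b = a XOR parity
        a = blocks[0]
        b = bytes(x ^ y for x, y in zip(a, blocks[2]))
    else:  # have b and parity: a = b XOR parity
        b = blocks[1]
        a = bytes(x ^ y for x, y in zip(b, blocks[2]))
    return a + b


stored = encode(b"regenerating")
# Any k = 2 of the n = 3 blocks suffice to rebuild the file.
for pair in combinations(range(3), 2):
    assert decode({i: stored[i] for i in pair}) == b"regenerating"
```

Each node stores M/k bytes, but repairing one lost block this way still requires reading k blocks, that is, the whole file, which is the cost the rest of the deck tries to reduce.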

7 Introduction • Redundancy must be continually refreshed as nodes fail in distributed storage systems. ◦ this causes large data transfers across the network.

8 Introduction • Erasure codes can be repaired without communicating the whole data object. • (4, 2)-MSR example when a node fails: ◦ the surviving nodes generate smaller parity packets of their data. ◦ they forward them to the newcomer. ◦ the newcomer mixes the packets to generate two new packets.

9 Introduction • This paper identifies an optimal tradeoff curve between storage and repair bandwidth. ◦ smaller storage space => less redundancy => more repair bandwidth • The paper calls codes that lie on this optimal tradeoff curve regenerating codes.

10 Introduction • Minimum-Storage Regenerating (MSR) codes ◦ can be efficiently repaired. • Minimum-Bandwidth Regenerating (MBR) codes ◦ each storage node stores slightly more than M/k. ◦ the repair bandwidth can be reduced.

11 Outline • Introduction • Background • Analysis • Evaluation • Conclusion

12 Erasure Codes • Classical coding theory focuses on the tradeoff between redundancy and error tolerance. • In terms of the redundancy-reliability tradeoff, Maximum Distance Separable (MDS) codes are optimal. ◦ the best-known are Reed-Solomon codes.

13 Network Coding • Network coding allows ◦ intermediate nodes to generate output data by encoding previously received input data. ◦ information to be “mixed” at intermediate nodes. • This paper investigates the application of network coding to the repair problem in distributed storage. ◦ tradeoff between storage and repair network bandwidth

14 Distributed Storage Systems • Erasure codes could reduce bandwidth use by an order of magnitude compared with replication. • Hybrid strategy: ◦ one special storage node maintains one full replica. ◦ multiple erasure-encoded blocks are stored on other nodes. ◦ the replica node transfers only M/k bytes to create a new encoded block. ◦ a problem arises when the replica is lost.

15 Outline • Introduction • Background • Analysis • Evaluation • Conclusion

16 Information Flow Graph 16

17 Storage-Bandwidth Tradeoff • The normal redundancy we want to maintain requires n active storage nodes ◦ each storing α bits ◦ when a node fails, a newcomer downloads β bits each from any d surviving nodes ◦ total repair bandwidth is γ = dβ • For each set of parameters (n, k, d, α, γ), there is a family of information flow graphs, each of which corresponds to a particular evolution of node failures / repairs.

18 Storage-Bandwidth Tradeoff • Denote this family of directed acyclic graphs by G(n, k, d, α, γ). ◦ the point (4, 2, 3, 1 Mb, 1.5 Mb) is feasible.

19 Storage-Bandwidth Tradeoff • Theorem 1: For any α ≥ α*(n, k, d, γ), the points (n, k, d, α, γ) are feasible.
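The slide text does not preserve the definition of the threshold α*. The sketch below evaluates the piecewise closed form as recalled from the paper, so it should be checked against the published Theorem 1 before being relied on. The function name `alpha_star`, the helper names `f` and `g`, and the file size M = 2 Mb used in the usage line are assumptions for illustration; n enters only through the constraint d ≤ n − 1.

```python
def alpha_star(M: float, k: int, d: int, gamma: float) -> float:
    """Minimum per-node storage for repair bandwidth gamma (closed form, as recalled).

    alpha* = M / k                          if gamma >= f(0)
    alpha* = (M - g(i) * gamma) / (k - i)   if f(i) <= gamma < f(i - 1), i = 1..k-1
    Below f(k - 1) no amount of storage makes the point feasible.
    """

    def f(i: int) -> float:
        return 2 * M * d / ((2 * k - i - 1) * i + 2 * k * (d - k + 1))

    def g(i: int) -> float:
        return (2 * d - 2 * k + i + 1) * i / (2 * d)

    if gamma >= f(0):
        return M / k
    for i in range(1, k):
        if f(i) <= gamma < f(i - 1):
            return (M - g(i) * gamma) / (k - i)
    return float("inf")  # gamma < f(k - 1): infeasible for any alpha


# The (4, 2, 3, 1 Mb, 1.5 Mb) example, assuming a file of M = 2 Mb:
print(alpha_star(2.0, 2, 3, 1.5))  # 1.0
```

With these assumed parameters the example point sits exactly on the threshold: α = 1 Mb is the smallest storage that is feasible at γ = 1.5 Mb.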

20 Theorem Proof (1/4) 20

21 Theorem Proof (2/4)

22 Theorem Proof (3/4)

23 Theorem Proof (4/4)

24 Storage-Bandwidth Tradeoff • Code repair can be achieved if and only if the underlying information flow graph has sufficiently large min-cuts.
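The min-cut condition can be sketched numerically. In the paper, as recalled here, the smallest min-cut over all information flow graphs with d ≥ k is the sum over i = 0..k−1 of min((d − i)β, α), and a point is feasible when that sum is at least the file size M. The function names and the value M = 2 Mb are assumptions for this example.

```python
def min_cut_bound(k: int, d: int, alpha: float, gamma: float) -> float:
    """Worst-case min-cut between the source and a data collector (assumes d >= k)."""
    beta = gamma / d  # bits downloaded from each of the d helper nodes
    return sum(min((d - i) * beta, alpha) for i in range(k))


def is_feasible(M: float, k: int, d: int, alpha: float, gamma: float) -> bool:
    """A point is feasible when the worst-case min-cut can carry the whole file."""
    return min_cut_bound(k, d, alpha, gamma) >= M - 1e-12  # small float tolerance


# The (4, 2, 3, 1 Mb, 1.5 Mb) example, assuming a file of M = 2 Mb:
print(is_feasible(2.0, 2, 3, 1.0, 1.5))  # True
print(is_feasible(2.0, 2, 3, 0.9, 1.5))  # False: less storage than M / k
```

Each term of the sum caps what one more contacted node can contribute: either everything it stores (α), or what it could have downloaded from helpers not already counted ((d − i)β).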

25 Storage-Bandwidth Tradeoff • Optimal tradeoff curve between storage α and repair bandwidth γ ◦ (γ = 1, α = 0.2) (γ = 1, α = 0.1)

26 Special Cases (1/2) • Minimum-Storage Regenerating (MSR) Codes

27 Special Cases (2/2) • Minimum-Bandwidth Regenerating (MBR) Codes
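The formulas for the two extreme points did not survive in the slide text. The sketch below uses the expressions as recalled from the paper: the MSR point is (α, γ) = (M/k, Md / (k(d − k + 1))) and the MBR point has α = γ = 2Md / (2kd − k² + k). The function names and M = 2 Mb are assumptions for illustration.

```python
def msr_point(M: float, k: int, d: int) -> tuple[float, float]:
    """(alpha, gamma) at the minimum-storage end of the tradeoff curve."""
    alpha = M / k
    gamma = M * d / (k * (d - k + 1))
    return alpha, gamma


def mbr_point(M: float, k: int, d: int) -> tuple[float, float]:
    """(alpha, gamma) at the minimum-bandwidth end; storage equals repair traffic."""
    value = 2 * M * d / (2 * k * d - k * k + k)
    return value, value


M, k, d = 2.0, 2, 3  # the (4, 2) example with d = n - 1 = 3 helpers
alpha_msr, gamma_msr = msr_point(M, k, d)
alpha_mbr, gamma_mbr = mbr_point(M, k, d)
print(f"MSR: alpha={alpha_msr:.2f} Mb, gamma={gamma_msr:.2f} Mb")
print(f"MBR: alpha={alpha_mbr:.2f} Mb, gamma={gamma_mbr:.2f} Mb")
print(f"Naive MDS repair downloads the whole file: {M:.2f} Mb")
```

For these assumed parameters the MSR point reproduces the 1 Mb / 1.5 Mb example from the earlier slide, and the MBR point trades 20% more storage per node for a lower repair bandwidth.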

28 Outline • Introduction • Background • Analysis • Evaluation ◦ Node Dynamics and Objectives ◦ Model ◦ Quantitative Results • Conclusion

29 Node Dynamics and Objectives (1/2) • A permanent failure ◦ the permanent departure of a node from the system ◦ a disk failure resulting in loss of the data stored on the node • A transient failure ◦ node reboot ◦ temporary network disconnection

30 Node Dynamics and Objectives (2/2) • A file is available ◦ if it can be reconstructed from the data stored on currently available nodes. • A file is durable ◦ if, after permanent node failures, it may still be available at some point in the future.

31 Model (1/5) • The model has two key parameters, f and a. ◦ a fraction f of the nodes storing file data fail permanently per unit time. ◦ at any given time, a node storing data is available with some probability a. • The expected availability and maintenance bandwidth of various redundancy schemes for maintaining a file of M bytes can be computed.
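As a rough illustration of how availability follows from the parameter a, the sketch below assumes that nodes are available independently of one another. That independence assumption, the function names, and the sample values a = 0.9, R = 2, n = 4, k = 2 are all choices made for this example; the paper's own expressions are on slides whose content is not preserved in this transcript.

```python
from math import comb


def replication_availability(a: float, R: int) -> float:
    """File is available if at least one of R replicas is on an available node."""
    return 1 - (1 - a) ** R


def erasure_availability(a: float, n: int, k: int) -> float:
    """File is available if at least k of the n encoded blocks are reachable."""
    return sum(comb(n, i) * a**i * (1 - a) ** (n - i) for i in range(k, n + 1))


a = 0.9
# Both schemes below use a redundancy factor of 2 (R = 2 and n / k = 4 / 2).
print(f"replication, R=2:   {replication_availability(a, 2):.4f}")
print(f"erasure code (4,2): {erasure_availability(a, 4, 2):.4f}")
```

The erasure-coded case is a binomial tail: it sums the probability of every outcome in which k or more of the n nodes are up.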

32 Model (2/5) 32

33 Model (3/5) 33

34 Model (4/5) 34

35 Model (5/5) 35

36 Estimating f and a 36

37 Quantitative Results (1/2) 37

38 Quantitative Results (2/2) 38

39 Quantitative Comparison 39

40 Outline • Introduction • Background • Analysis • Evaluation • Conclusion

41 Conclusion • This paper presented a general theoretic framework that can ◦ determine the information that must be communicated to repair failures in encoded systems. ◦ identify a tradeoff between storage and repair bandwidth. • One potential application area for the proposed regenerating codes is distributed archival storage or backup. ◦ regenerating codes can potentially offer desirable tradeoffs in terms of redundancy, reliability, and repair bandwidth.

