Published by Barbara Mitchell. Modified over 9 years ago.
1
Cooperative Recovery of Distributed Storage Systems from Multiple Losses with Network Coding
Yuchong Hu, Institute of Network Coding
20/10/2010
2
Background – Distributed Storage Systems
Definition: data is stored on nodes distributed across a wide area.
Characteristics:
- Resilience to local disasters (e.g., flooding caused by Hurricane Katrina).
- Scalable and economical (Google File System (2005): 79112 PCs).
- Low node availability (cheap disks fail, malicious attacks, partial power outages).
Figure labels: cheap storage node, data object, data collector.
3
Background – Distributed Storage Systems
Challenge: How can data reliability be ensured when distributed storage systems have low node availability?
4
Two common techniques:
- Redundancy: ensure data reliability by tolerating data loss.
- Recovery: ensure data reliability by repairing data loss.
5
Figure (a) Redundancy. Labels: new node, old node, data object/file.
6
Figure (b) Recovery. Labels: new node, old node, data object/file.
7
Redundancy schemes:
- Replication
- (n,k) MDS erasure codes
(n,k) MDS property: any k of the n storage nodes can be used for file reconstruction.
8
Example: the (3,2) MDS code used in RAID 5. The data object is split into A and B and encoded into three fragments: A, B, A+B.
9
Example: the (4,2) MDS code used in RAID 6. The data object is split into A and B and encoded into four fragments: A, B, A+B, A+2B.
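The (4,2) example can be checked numerically. The sketch below is only an illustration: it assumes arithmetic modulo a small prime (P = 257) instead of the GF(2^8) arithmetic used by real RAID 6 implementations, and the names G, encode and decode are hypothetical, not taken from the slides.

```python
from itertools import combinations

P = 257  # assumed prime modulus; real RAID 6 uses GF(2^8) instead

# Coefficients (x, y) of each stored fragment x*A + y*B: A, B, A+B, A+2B.
G = [(1, 0), (0, 1), (1, 1), (1, 2)]

def encode(a, b):
    """Return the four fragments stored on the four nodes."""
    return [(x * a + y * b) % P for x, y in G]

def decode(i, frag_i, j, frag_j):
    """Recover (A, B) from the fragments of any two distinct nodes i and j."""
    (x1, y1), (x2, y2) = G[i], G[j]
    det = (x1 * y2 - x2 * y1) % P   # nonzero for every pair of rows in G
    inv = pow(det, -1, P)           # modular inverse (Python 3.8+)
    a = ((frag_i * y2 - frag_j * y1) * inv) % P  # Cramer's rule
    b = ((x1 * frag_j - x2 * frag_i) * inv) % P
    return a, b

fragments = encode(42, 99)
recovered = [decode(i, fragments[i], j, fragments[j])
             for i, j in combinations(range(4), 2)]
print(recovered)  # six copies of (42, 99), one per pair of nodes
```

Every one of the six node pairs yields the original (A, B), which is the (n,k) MDS property for n=4, k=2.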
10
Replication vs. erasure codes, for a data object split into A and B:
- Replication stores A, A, B, B.
- The (4,2) MDS erasure code stores A, B, A+B, A+2B.
11
The MDS code outperforms replication: with the same four fragments of storage, the (4,2) MDS code tolerates the loss of any two nodes, whereas replication loses data if both copies of A (or both copies of B) fail.
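A short sketch makes the comparison concrete by counting, for each layout, how many pairs of surviving nodes can still rebuild (A, B). It assumes ordinary integer arithmetic, and the names REPLICATION, MDS_4_2 and recoverable_pairs are hypothetical.

```python
from itertools import combinations

# Each node stores x*A + y*B, written as the coefficient pair (x, y).
REPLICATION = [(1, 0), (1, 0), (0, 1), (0, 1)]   # A, A, B, B
MDS_4_2 = [(1, 0), (0, 1), (1, 1), (1, 2)]       # A, B, A+B, A+2B

def recoverable_pairs(layout):
    """Count the pairs of surviving nodes from which A and B can be solved."""
    count = 0
    for (x1, y1), (x2, y2) in combinations(layout, 2):
        if x1 * y2 - x2 * y1 != 0:   # nonzero determinant: the pair is decodable
            count += 1
    return count

print(recoverable_pairs(REPLICATION), recoverable_pairs(MDS_4_2))  # 4 6
```

Replication survives only 4 of the 6 possible two-node losses, while the MDS layout survives all 6 at the same storage cost.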
12
Background – Distributed Storage Systems
Issues when coding is combined with distributed storage:
- Distribution communication
- Repair communication
- Reconstruction communication
Figure: a data object split into A and B and stored as A, B, A+B, A+2B.
13
Background – Multiple Losses Recovery
Recovery: continuously repair lost redundancy.
1. Single-loss recovery: repair one loss at a time.
   Regenerating Codes: A. G. Dimakis, P. B. Godfrey, Y. Wu, M. Wainwright and K. Ramchandran, "Network Coding for Distributed Storage Systems", IEEE Transactions on Information Theory (to appear). Preliminary versions appeared in Infocom 07 and Allerton 07.
2. Multi-loss recovery: repair multiple losses at a time.
Figure: the source S, which stores the original file, distributes data to old nodes 1…d; new nodes 1, 2 and 3 are then repaired one by one (continuous single-loss recovery).
14
Single-loss recovery with traditional erasure codes
Data object M: 4 packets A, B, C, D.
Distribution: the source node stores two packets on each of four nodes: (A, B), (C, D), (A+C, 2B+D), (2A+C, B+D). Any 2 nodes can rebuild M.
One node fails. The new node downloads A+C, 2B+D, C and D from two old nodes and rebuilds the lost packets.
Result: 4 packets transmitted for recovery.
15
Single-loss recovery with Regenerating Codes ("Network Coding for Distributed Storage Systems")
Data object M: 4 packets A, B, C, D.
Distribution: the source node stores two packets on each of four nodes: (A, B), (C, D), (A+C, 2B+D), (2A+C, B+D). Any 2 nodes can rebuild M.
One node fails. Each of the three old nodes sends one coded packet to the new node: C+D, A+2B+C+D and 2A+B+C+D. Afterwards any 2 nodes can still rebuild M.
Result: only 3 packets transmitted for recovery. Regenerating Codes (RC) outperform traditional erasure codes (EC)!
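The 3-packet repair can be replayed on coefficient vectors. This is a sketch under stated assumptions: the survivors are taken to be (C, D), (A+C, 2B+D) and (2A+C, B+D), each sending the sum of its two packets, which matches the three packets listed on the slide. The choice that the new node stores (A+2B, 2A+B) is an assumption made here for illustration, and arithmetic is over the rationals rather than a finite field.

```python
from fractions import Fraction
from itertools import combinations

def rank(rows):
    """Rank of a list of coefficient vectors (Gauss-Jordan over the rationals)."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                f = m[i][col] / m[rk][col]
                m[i] = [x - f * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk

def add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))

# Packets as coefficient vectors over (A, B, C, D).
survivors = [
    [(0, 0, 1, 0), (0, 0, 0, 1)],   # (C, D)
    [(1, 0, 1, 0), (0, 2, 0, 1)],   # (A+C, 2B+D)
    [(2, 0, 1, 0), (0, 1, 0, 1)],   # (2A+C, B+D)
]

# Each survivor sends ONE packet: the sum of its two stored packets.
sent = [add(p, q) for p, q in survivors]

# Assumption: the new node cancels C+D and stores (A+2B, 2A+B).
new_node = [sub(sent[1], sent[0]), sub(sent[2], sent[0])]

nodes = survivors + [new_node]
still_mds = all(rank(a + b) == 4 for a, b in combinations(nodes, 2))
print(sent, new_node, still_mds)
```

The new node does not hold the lost packets A and B themselves, yet every pair of nodes still spans all four packets, so the (4,2) MDS property is preserved with 3 transmitted packets instead of 4.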
16
Background – Multiple Losses Recovery
Recovery: continuously repair lost redundancy.
1. Single-loss recovery: repair one loss at a time.
2. Multi-loss recovery: repair multiple losses at a time.
Multiple losses often occur in real systems:
- Lazy repair mechanism: some systems, such as Total Recall, trigger a recovery only when the total number of losses reaches a given threshold.
- Malicious attack: a large number of maliciously controlled agents leave the network simultaneously.
- Partial power outage: multiple nodes disconnect suddenly because power is cut off over part of an area.
17
Motivation – Cooperative Recovery
Goal: How can r failures be repaired simultaneously at low bandwidth cost in a distributed storage system while keeping the (n,k) MDS property?
Intuition: Regenerating codes can be used to repair multiple failures one by one. What if all the new nodes repair data cooperatively?
Figure: (a) one-by-one repair from source S; (b) cooperative repair from source S, with more links used for repairing.
Motivation: Each new node in a cooperative repair can use more links, so fewer linearly dependent packets may need to be transmitted, which could reduce the bandwidth cost.
18
Motivation – Multi-loss recovery based on one-by-one repair
Data object M: 4 packets A, B, C, D, stored as (A, B), (C, D), (A+C, 2B+D), (2A+C, B+D). Any 2 nodes can rebuild M.
Two nodes fail. Each new node is repaired separately and downloads 4 packets from the old nodes.
Result: 4 packets transmitted for each new node, 8 packets transmitted for recovery in total.
19
Motivation – Multi-loss recovery based on cooperative repair
Data object M: 4 packets A, B, C, D, stored as (A, B), (C, D), (A+C, 2B+D), (2A+C, B+D). Any 2 nodes can rebuild M.
Two nodes fail. The old nodes send 4 packets to the new nodes, and the new nodes then exchange 2 packets with each other.
Result: 4 + 2 = 6 packets transmitted for recovery. Cooperative repair outperforms RC!
20
Contributions
Paper published: Yuchong Hu, Yinlong Xu, Xiaozhao Wang, Cheng Zhan, Pei Li. “Cooperative Recovery of Distributed Storage Systems from Multi-Losses with Network Coding.” IEEE Journal on Selected Areas in Communications (JSAC), vol. 28, no. 2, February 2010.
- We present a Multi-loss Cooperative Recovery mechanism, “MCR”, and analyze the lower bound of repair bandwidth (packets transmitted for recovery) based on MCR.
- We construct an MCR recovery scheme and a random linear coding algorithm that match the lower bound, so the proposed scheme is optimal in bandwidth and the lower bound is tight.
- By numerical comparisons, we show that the MCR mechanism is more efficient than Regenerating Codes in terms of bandwidth and storage cost for multi-loss recovery.
21
Outline
- MCR Mechanism: What is the MCR mechanism?
- Lower Bound of Repair Bandwidth: What is the lower bound of repair bandwidth based on MCR?
- Repair Scheme: How to design a repair scheme matching this lower bound?
- Numerical Comparisons: Comparison with MSR/MBR
22
MCR Mechanism
MCR: mutually cooperative recovery
I. Initially, the source encodes the original file of size M with an (n,k) MDS code and distributes n encoded fragments (each of size M/k) to n idle nodes.
II. Over time, r nodes fail and n-r nodes survive. The system selects r new idle nodes. First, each old node sends encoded data to all the new nodes. Second, each new node sends encoded data to the other new nodes. Last, each new node encodes all the received data into a fragment of size M/k and stores it. Each repair link carries data of size β.
III. Finally, the n-r old nodes and r new nodes are still (n,k) MDS encoded.
Figure: example with n=4, k=2, r=2. Each node stores M/2, any 2 nodes can rebuild M, and each new node receives 2β from the old nodes and 3β in total.
24
Repair Bandwidth for MCR
Notation:
- M: the original file
- n: total number of nodes storing M
- k: number of nodes needed for file reconstruction
- r: number of failed nodes
- β: data transmitted on each repair link in MCR
The value of repair bandwidth:
- Between old nodes and new ones: (n-r)rβ
- Between new nodes: (r-1)rβ
- Repair bandwidth = (n-1)rβ
The lower bound of repair bandwidth = (n-1)r · β_min
Figure: example with n=4, k=2, r=2 (steps I, II, III); each node stores M/2 and each repair link carries β. Legend: old node, failed node, new node.
25
Repair Bandwidth for MCR
Theorem 1: Consider an original file of size M stored in a distributed manner with an (n,k) MDS code. If the system uses MCR to repair r node failures, then there exists a random linear coding scheme such that, after the recovery, the n available nodes are still (n,k) MDS encoded if β ≥ M/[k(n-k)].
Proof of Theorem 1 (outline): Formulate the distributed storage network as a flow graph G(n,k,r,β). In the example, each node stores M/2 and each of Data Collectors 1 to 6 rebuilds M from any 2 nodes. The capacity of a cut in G(n,k,r,β) is represented as a function of β: F(β).
26
Proof of Theorem 1 (continued): The flow graph G(n,k,r,β), with Data Collectors 1 to 6 as sinks, is a multicast network.
Figure: a multicast network example with nodes s, t, u, w, x, y, z, in which packets a and b are combined into a⊕b.
27
Proof of Theorem 1 (continued): It is known [1,2] that a linear network code exists in a multicast network if the max-flow is at least the size of the data object. Requiring the capacity of the min-cut of G(n,k,r,β) to be at least M gives F(β) ≥ M, and therefore β ≥ M/[k(n-k)].
[1] R. Ahlswede, N. Cai, S.-Y. R. Li, and R. W. Yeung. Network information flow. IEEE Trans. on Information Theory, 46(4):1204–1216.
[2] S.-Y. R. Li, R. W. Yeung, and N. Cai. Linear network coding. IEEE Trans. on Information Theory, 49:371–381, February 2003.
28
The lower bound of repair bandwidth: since F(β) ≥ M requires β ≥ M/[k(n-k)],
- Between old nodes and new ones: (n-r)rβ
- Between new nodes: (r-1)rβ
- Repair bandwidth = (n-1)rβ
The lower bound of repair bandwidth = (n-1)r · M/[k(n-k)]
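The bound can be evaluated directly. The sketch below is a small calculator for the formulas on this slide; the function names beta_min and mcr_repair_bandwidth are hypothetical, and the file size is measured in packets for the running example (n=4, k=2, r=2, M = 4 packets).

```python
from fractions import Fraction

def beta_min(M, n, k):
    """Smallest per-link traffic allowed by Theorem 1: M / [k(n-k)]."""
    return Fraction(M, k * (n - k))

def mcr_repair_bandwidth(n, r, beta):
    """Total MCR repair traffic: old-to-new links plus new-to-new links."""
    old_to_new = (n - r) * r * beta   # each of n-r old nodes sends beta to each of r new nodes
    new_to_new = (r - 1) * r * beta   # each new node sends beta to the other r-1 new nodes
    return old_to_new + new_to_new

M, n, k, r = 4, 4, 2, 2               # running example, M counted in packets
beta = beta_min(M, n, k)
total = mcr_repair_bandwidth(n, r, beta)
print(beta, total)                    # 1 6
```

With one packet per link, the total is 6 packets, which equals (n-1)rβ and matches the 6 packets counted in the cooperative repair example.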
30
Recovery Scheme
MCR recovery scheme (based on random linear coding over a field F)
To ensure β = M/[k(n − k)], the file is represented as k(n-k) packets and each repair link carries one packet.
Example: n=4, k=2, r=2, so M is k(n-k) = 4 packets A, B, C, D.
Distribution: the source node stores (A, B), (C, D), (A+C, B+D), (A+2C, B+2D). Any 2 nodes can rebuild M.
Two nodes fail. Repairing with MCR:
- Old nodes to new nodes: 2A+B, C+2D, A+3B, 3C+D.
- Between new nodes: 2A+B+C+2D, A+3B+3C+D.
- Packets stored by the new nodes after re-encoding: 2A+B+C+2D, A+3B+4C+3D, 3A+4B+C+2D, A+3B+3C+D.
MCR is done! Any 2 nodes can rebuild M.
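The three MCR steps can be simulated with random coefficients. The sketch below is an illustration under stated assumptions, not the authors' implementation: the field is taken to be the integers modulo the prime 65537, the survivors are assumed to be (A, B) and (C, D), and the helper names rank_mod_p, combine, mcr_repair and is_mds are hypothetical. Because coefficients are random, the repair is retried until the (4,2) MDS check passes.

```python
import random
from itertools import combinations

P = 65537  # assumed prime field size; a larger field makes failures rarer

def rank_mod_p(rows, p=P):
    """Rank of coefficient vectors over the integers mod p (Gauss-Jordan)."""
    m = [list(r) for r in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] % p), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        inv = pow(m[rk][col], -1, p)
        m[rk] = [(x * inv) % p for x in m[rk]]
        for i in range(len(m)):
            if i != rk and m[i][col] % p:
                f = m[i][col]
                m[i] = [(x - f * y) % p for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk

def combine(packets, rng, p=P):
    """One random linear combination of the given packets."""
    coeffs = [rng.randrange(1, p) for _ in packets]
    width = len(packets[0])
    return [sum(c * v[j] for c, v in zip(coeffs, packets)) % p for j in range(width)]

def mcr_repair(survivors, r, rng):
    """Return the packets stored by r new nodes after the three MCR steps."""
    # Step 1: every old node sends one coded packet to every new node.
    received = [[combine(node, rng) for node in survivors] for _ in range(r)]
    # Step 2: every new node sends one coded packet to each other new node.
    exchanged = [[combine(received[j], rng) for j in range(r) if j != i]
                 for i in range(r)]
    # Step 3: every new node re-encodes all received packets into its fragment.
    alpha = len(survivors[0])  # packets stored per node
    return [[combine(received[i] + exchanged[i], rng) for _ in range(alpha)]
            for i in range(r)]

def is_mds(nodes, k, file_packets):
    """True if every set of k nodes spans the whole file."""
    return all(rank_mod_p([v for node in group for v in node]) == file_packets
               for group in combinations(nodes, k))

rng = random.Random(2010)
survivors = [[[1, 0, 0, 0], [0, 1, 0, 0]],   # (A, B)
             [[0, 0, 1, 0], [0, 0, 0, 1]]]   # (C, D)
for attempt in range(1, 21):                 # retry on an unlucky draw
    new_nodes = mcr_repair(survivors, r=2, rng=rng)
    if is_mds(survivors + new_nodes, k=2, file_packets=4):
        break
print(attempt, is_mds(survivors + new_nodes, k=2, file_packets=4))
```

Each new node receives 3 packets (2 from old nodes, 1 from the other new node), so 6 packets are transmitted in total, in line with the bound (n-1)rβ for β equal to one packet.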
31
Validity of MCR
Theorem 2: Consider an original file of size M stored in a distributed manner with an (n,k) strong-MDS code. If the system uses the MCR recovery scheme to repair r node failures, then the probability that the n available nodes are still (n,k) strong-MDS encoded after the recovery can be driven arbitrarily close to 100% by increasing the size of the field F.
Theorem 2 establishes the correctness of the MCR recovery scheme. So the lower bound is matched and the proposed scheme is optimal.
32
(n,k) Strong-MDS codes
Definition: Given any h1, h2, ..., hn with h1 + … + hn = k(n-k) and 0 ≤ hi ≤ n-k, if n storage nodes X1, …, Xn can select h1, …, hn packets, respectively, such that the selected packets can be used to rebuild the file, then X1, …, Xn are (n,k) Strong-MDS coded.
Example: n=4, k=2, r=2. The source node stores X1 = (A, B), X2 = (C, D), X3 = (A+C, B+D), X4 = (A+2C, B+2D). Given (h1, h2, h3, h4) = (1,0,2,1), the packets B, A+C, B+D and A+2C can be selected to rebuild M.
It is easy to verify that for any (h1, h2, h3, h4) the original file can be rebuilt, that is, these four storage nodes are (4,2) Strong-MDS coded.
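The "easy to verify" claim can be checked by brute force. The sketch below enumerates every admissible (h1, h2, h3, h4) and searches for a packet selection of full rank; it assumes rational arithmetic instead of the finite field F, and the names rank, NODES and can_rebuild are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def rank(rows):
    """Rank of a list of coefficient vectors (Gauss-Jordan over the rationals)."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                f = m[i][col] / m[rk][col]
                m[i] = [x - f * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk

n, k = 4, 2
# Packets as coefficient vectors over (A, B, C, D).
NODES = [
    [(1, 0, 0, 0), (0, 1, 0, 0)],   # X1 = (A, B)
    [(0, 0, 1, 0), (0, 0, 0, 1)],   # X2 = (C, D)
    [(1, 0, 1, 0), (0, 1, 0, 1)],   # X3 = (A+C, B+D)
    [(1, 0, 2, 0), (0, 1, 0, 2)],   # X4 = (A+2C, B+2D)
]

def can_rebuild(h):
    """True if node i can contribute h[i] packets so that the file is rebuilt."""
    choices = [combinations(node, hi) for node, hi in zip(NODES, h)]
    return any(rank([p for part in pick for p in part]) == k * (n - k)
               for pick in product(*choices))

# All (h1..h4) with 0 <= hi <= n-k and h1+...+h4 = k(n-k).
profiles = [h for h in product(range(n - k + 1), repeat=n)
            if sum(h) == k * (n - k)]
strong_mds = all(can_rebuild(h) for h in profiles)
print(len(profiles), strong_mds)  # 19 True
```

All 19 admissible profiles, including the slide's (1,0,2,1), admit a rebuilding selection, so the four nodes are (4,2) Strong-MDS coded under these assumptions.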
34
Numerical Comparisons
File size: M bytes
IEC: ideal erasure code, the optimal theoretical erasure code.
MSR/MBR: two kinds of Regenerating Codes.
Table 1
Table 2
35
Extended work by Kenneth W. Shum
Kenneth W. Shum, "Cooperative Regenerating Codes for Distributed Storage Systems", submitted to IEEE International Conference on Communications (ICC) 2011.
36
Open problems
1. It is assumed that all the repair links consume the same bandwidth (symmetric). If the repair link bandwidth costs can be asymmetric, what is the corresponding optimal scheme?
   - We can construct several different kinds of scenarios.
2. It is assumed that the number of storage nodes is fixed at n. In real systems, additional spare nodes are often added, for example when more storage devices are bought.
   - We can consider scalability in distributed storage.
3. How can a deterministic coding scheme be designed to replace the random linear coding scheme (in order to decrease the computational complexity)?
   - “Deterministic” means “centralized coordination”, so there would be a tradeoff between computational complexity and centralized coordination.
37
Open problems
Issues when coding is combined with distributed storage:
- Distribution communication
- Repair communication
- Reconstruction communication
Figure: a data object split into A and B and stored as A, B, A+B, A+2B.
38
My Ongoing Work in INC
I am now working with two professors, John and Patrick. We focus on building a real storage system based on network coding, and I am implementing Regenerating Codes. One of the goals is to verify some theoretical results and ideas at the system level. Meanwhile, we are looking for more practical problems in distributed storage.
39
Thank you!