NCCloud: A Network-Coding-Based Storage System in a Cloud-of-Clouds

1 NCCloud: A Network-Coding-Based Storage System in a Cloud-of-Clouds
Henry C. H. Chen, Yuchong Hu, Patrick P. C. Lee, Yang Tang. IEEE Transactions on Computers, 15 August 2013

2 Outline: Introduction; Repair in Multiple Cloud Storage; FMSR Codes; NCCloud; Conclusion

3 Introduction Cloud storage provides an on-demand remote backup solution. However, relying on a single cloud storage provider raises problems such as a single point of failure.

4 Introduction A common solution is to stripe data across different cloud providers. The diversity of multiple clouds improves fault tolerance.

5 Introduction-Data Failure
This paper focuses on unexpected permanent cloud failures. When a cloud fails permanently, a repair is activated to maintain data redundancy and fault tolerance. A repair operation retrieves data from the surviving clouds and reconstructs the lost data in a new cloud.

6 Introduction-Data Failure
During repair, each surviving node encodes its stored data chunks and sends the encoded chunks to a new node, which regenerates the lost data.

7 Introduction-Cost Problem
Today’s cloud storage providers charge users for outbound data. When repairing failures, moving an enormous amount of data (repair traffic) can introduce significant monetary costs.

8 Introduction-Repair Traffic Problem
To minimize repair traffic, regenerating codes [16] have been proposed. They store data redundantly in a distributed storage system and require less repair traffic than traditional erasure codes at the same fault-tolerance level. [16] Network Coding for Distributed Storage Systems

9 Introduction-Regenerating Codes
However, most existing regenerating codes require storage nodes to be equipped with computation capabilities so that they can perform encoding operations during repair.

10 Introduction-Regenerating Codes
To make regenerating codes portable to any cloud storage service, this paper assumes only a thin-cloud interface, where storage nodes need to support only read/write operations.

11 Introduction-NCCloud
In this paper, we present the design and implementation of NCCloud, a proxy-based storage system for fault-tolerant storage over multiple cloud storage providers.

12 Introduction-FMSR NCCloud is built on top of the functional minimum-storage regenerating (FMSR) codes. The FMSR code implementation maintains double-fault tolerance and the same storage cost as RAID-6, but uses less repair traffic when recovering from a single-cloud failure.

13 Introduction-FMSR FMSR codes are non-systematic
The encoded chunks are formed by linear combinations of the original data chunks. The original data chunks are not kept, unlike in systematic coding schemes.

14 Outline: Introduction; Repair in Multiple Cloud Storage; FMSR Codes; NCCloud; Conclusion

15 Repair in Multiple Cloud Storage
Transient failure is short-term, such that the failed cloud will return to normal after some time and no outsourced data is lost.

16 Repair in Multiple Cloud Storage
Permanent failure is long-term, in the sense that the outsourced data on a failed cloud will become permanently unavailable. Examples: data center outages in disasters, data loss and corruption, malicious attacks.

17 Outline: Introduction; Repair in Multiple Cloud Storage; FMSR Codes (Motivation, Implementation); NCCloud; Conclusion

18 Motivation This paper considers a distributed, multiple-cloud storage setting
in which data is striped over multiple clouds, using a proxy-based design.

19 Motivation The proxy reads the essential data pieces from other surviving clouds, reconstructs new data pieces, and writes these new pieces to a new cloud.

20 Fault-tolerant Maximum Distance Separable property (n, k)-MDS code
Divide a file into equal-size native chunks. Linearly combine the native chunks to form code chunks. Distribute the chunks over n (larger than k) nodes. The original file can be reconstructed from any k of the n nodes, so the failures of any n − k nodes can be tolerated.

21 Fault-tolerant FMSR codes can reconstruct the data of a failed node from the surviving nodes by downloading less data, without reconstructing the whole file.

22 Different Coding Schemes
Storage size 2M, repair traffic M. Storage size 2M, repair traffic 0.75M. Storage size 2M, repair traffic 0.75M.

23 Double-fault Tolerant FMSR Codes
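Divide a file of size M into 2(n − 2) native chunks and generate 2n code chunks. Each node stores two code chunks, each of size $\frac{M}{2(n-2)}$. To repair a failed node, the repair traffic is $\frac{M(n-1)}{2(n-2)}$. For RAID-6 codes, the total storage size is $\frac{Mn}{n-2}$ and the repair traffic is M. Up to 50% of the repair traffic is saved (the limit as n grows large).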
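The formulas on this slide can be checked with a short calculation. The sketch below is only an illustration: the function names `fmsr_costs`, `raid6_costs` and `repair_saving` are made up for this example, and sizes are expressed as fractions of the file size M.

```python
from fractions import Fraction

def fmsr_costs(n, file_size=1):
    """Total storage and single-node repair traffic for double-fault-tolerant FMSR."""
    chunk = Fraction(file_size, 2 * (n - 2))  # size of one code chunk, M / (2(n-2))
    storage = 2 * n * chunk                   # two code chunks on each of n nodes
    repair = (n - 1) * chunk                  # one chunk from each surviving node
    return storage, repair

def raid6_costs(n, file_size=1):
    """Total storage and repair traffic for RAID-6, as stated on the slide."""
    storage = Fraction(file_size * n, n - 2)  # Mn / (n-2)
    repair = Fraction(file_size)              # repair traffic is M
    return storage, repair

def repair_saving(n):
    """Fraction of RAID-6 repair traffic that FMSR avoids."""
    _, fmsr_repair = fmsr_costs(n)
    _, raid6_repair = raid6_costs(n)
    return 1 - fmsr_repair / raid6_repair

if __name__ == "__main__":
    for n in (4, 6, 12, 100):
        print(n, fmsr_costs(n), raid6_costs(n), float(repair_saving(n)))
```

For n = 4 this reproduces the 2M storage and 0.75M repair traffic from the comparison slide, and the saving moves toward 50% as n increases.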

24 Outline: Introduction; Repair in Multiple Cloud Storage; FMSR Codes (Motivation, Implementation); NCCloud; Conclusion

25 FMSR Codes Implementation
FMSR codes do not require lost chunks to be exactly reconstructed. The regenerated chunks need not be identical to those in the failed node, as long as the MDS property holds.

26 FMSR Codes Implementation
This paper proposes a two-phase checking scheme to ensure that the code chunks on all nodes always satisfy the MDS property.

27 FMSR Codes Implementation
The implementation assumes a thin-cloud interface and covers three operations: file upload, file download, and repair.

28 File Upload Native chunks : Code chunks :
Encoding matrix of coefficients: size $n(n-k) \times k(n-k)$, with entries in the Galois field GF($p^n$)

29 File Upload Galois field GF($p^n$). Encoding coefficient vector (ECV).

30 File Download Download the k(n−k) code chunks from any k of the n storage nodes. The ECVs of the k(n−k) code chunks form a k(n−k)×k(n−k) square matrix. Obtain the original k(n−k) native chunks by multiplying the inverse of the square matrix with the code chunks.
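The upload and download steps can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the NCCloud implementation: it uses the prime field GF(257) to keep the arithmetic simple, it builds the encoding matrix as a Vandermonde matrix (chosen here only because any k(n−k) of its rows are guaranteed invertible), and the names `upload`, `download`, `mat_inverse` and `mat_mul_chunks` are hypothetical.

```python
P = 257  # assumption: a small prime field stands in for the Galois field on the slides

def mat_inverse(mat):
    """Invert a square matrix over GF(P) by Gauss-Jordan elimination."""
    size = len(mat)
    aug = [list(row) + [int(i == j) for j in range(size)] for i, row in enumerate(mat)]
    for col in range(size):
        # Find a row with a non-zero entry in this column and move it into place.
        pivot = next(r for r in range(col, size) if aug[r][col] % P)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)  # modular inverse (Python 3.8+)
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(size):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(a - factor * b) % P for a, b in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]

def mat_mul_chunks(mat, chunks):
    """Multiply a coefficient matrix by a list of equal-length chunks, symbol by symbol."""
    length = len(chunks[0])
    return [[sum(c * chunk[pos] for c, chunk in zip(row, chunks)) % P
             for pos in range(length)] for row in mat]

def upload(native, n, k):
    """Encode k(n-k) native chunks into n(n-k) code chunks, n-k per node."""
    rows, cols, per_node = n * (n - k), k * (n - k), n - k
    # Assumption: Vandermonde rows [1, x, x^2, ...] with distinct x serve as ECVs.
    em = [[pow(x, j, P) for j in range(cols)] for x in range(1, rows + 1)]
    code = mat_mul_chunks(em, native)
    return [(em[i * per_node:(i + 1) * per_node], code[i * per_node:(i + 1) * per_node])
            for i in range(n)]

def download(nodes, chosen):
    """Rebuild the native chunks from the ECVs and code chunks of any k nodes."""
    ecvs = [ecv for i in chosen for ecv in nodes[i][0]]
    code = [chunk for i in chosen for chunk in nodes[i][1]]
    return mat_mul_chunks(mat_inverse(ecvs), code)

if __name__ == "__main__":
    native = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]  # k(n-k) = 4 chunks
    nodes = upload(native, n=4, k=2)
    print(download(nodes, chosen=(1, 3)) == native)
```

Each node stores its ECVs next to its code chunks, so the download side needs nothing beyond read access, in line with the thin-cloud assumption.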

31 Iterative Repair The MDS property must hold even after iterative repairs.
This paper proposes two-phase checking, which verifies both the MDS property and the rMDS (repair MDS) property.

32 Satisfy MDS, but not rMDS

33 Iterative Repair
Step 1. Download the encoding matrix from a surviving node.
Step 2. Select one ECV from each of the n−1 surviving nodes.
Step 3. Generate a repair matrix.
Step 4. Compute the ECVs for the new code chunks and produce a new encoding matrix.

34 Iterative Repair
Step 5. Given EM’, verify that both properties are satisfied. Verify MDS by enumerating all $\binom{n}{k}$ subsets of k nodes. Verify rMDS by checking $n(n-k)^{n-1}\binom{n}{k}$ cases. The corresponding encoding matrices must have full rank.
Step 6. Download the actual chunk data and regenerate the new chunk data.
(Figure labels: the new ECVs from Step 4; code chunks from surviving nodes.)
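Steps 2 to 5 can be sketched as a retry loop over random repair matrices. The code below is a simplified illustration, not the paper's algorithm: it works in the prime field GF(257), checks only the MDS property (the rMDS check is left out), and the names `rank`, `is_mds` and `repair_ecvs` are hypothetical.

```python
import itertools
import random

P = 257  # assumption: a small prime field stands in for the Galois field on the slides

def rank(rows):
    """Rank of a matrix over GF(P), computed by Gaussian elimination."""
    rows = [[v % P for v in row] for row in rows]
    found = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        inv = pow(rows[found][col], -1, P)  # modular inverse (Python 3.8+)
        rows[found] = [v * inv % P for v in rows[found]]
        for r in range(len(rows)):
            if r != found and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(a - f * b) % P for a, b in zip(rows[r], rows[found])]
        found += 1
    return found

def is_mds(node_ecvs, k):
    """node_ecvs[i] lists the ECVs on node i; every choice of k nodes must give full rank."""
    width = len(node_ecvs[0][0])
    for subset in itertools.combinations(range(len(node_ecvs)), k):
        stacked = [ecv for i in subset for ecv in node_ecvs[i]]
        if rank(stacked) < width:
            return False
    return True

def repair_ecvs(node_ecvs, failed, k, rng, max_tries=1000):
    """Return the ECVs after repairing node `failed`, plus the repair matrix used."""
    n = len(node_ecvs)
    survivors = [i for i in range(n) if i != failed]
    for _ in range(max_tries):
        picked = [rng.choice(node_ecvs[i]) for i in survivors]                   # Step 2
        repair = [[rng.randrange(1, P) for _ in picked] for _ in range(n - k)]  # Step 3
        new_ecvs = [[sum(c * e[j] for c, e in zip(row, picked)) % P
                     for j in range(len(picked[0]))] for row in repair]         # Step 4
        candidate = node_ecvs[:failed] + [new_ecvs] + node_ecvs[failed + 1:]
        if is_mds(candidate, k):                                                 # Step 5, MDS only
            return candidate, repair
    raise RuntimeError("no valid repair matrix found")

if __name__ == "__main__":
    # Assumption: a Vandermonde encoding matrix for n = 4, k = 2 as the starting point.
    em = [[pow(x, j, P) for j in range(4)] for x in range(1, 9)]
    nodes = [em[2 * i:2 * i + 2] for i in range(4)]
    repaired, _ = repair_ecvs(nodes, failed=0, k=2, rng=random.Random(0))
    print(is_mds(repaired, 2))
```

Because the check runs on ECVs only, a rejected repair matrix costs no chunk transfer; the actual chunk data is downloaded in Step 6, after a matrix has been accepted.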

35 rMDS Sustaining

36 Time of Two-phase Checking

37 Double-fault Tolerant Codes
Markov Model

38 MTTDL (Mean Time To Data Loss), Compared to RAID-6

39 Outline: Introduction; Repair in Multiple Cloud Storage; FMSR Codes; NCCloud; Conclusion

40 NCCloud A proxy that bridges user applications and multiple clouds.
Its design is built on three layers: the file system layer, the coding layer, and the storage layer.

41 NCCloud It is mainly implemented in Python, while the coding schemes are implemented in C for better efficiency.

42 Goal of NCCloud Compare the costs and response times of using RAID-6 and FMSR codes. Show the cost advantage of FMSR over RAID-6 while maintaining acceptable response time.

43 Goal of NCCloud
Normal operations: RAID-6 and FMSR incur similar storage costs. Repair operation: FMSR saves a significant amount of transfer costs over RAID-6.

44 Cost Saving-Price

45 Cost Saving
Normal operations (1.25PB of data stored): FMSR: $86,851 monthly storage cost; RAID-6: $86,851 monthly storage cost. Repair operation: RAID-6: 1PB of data, $56,832; FMSR: PB of data, $33,894. Saving of $22,938.
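The repair saving on this slide follows directly from the two repair costs. The short check below only redoes that subtraction and expresses it as a percentage; the variable names are made up for this example.

```python
raid6_repair_cost = 56_832  # USD, RAID-6 repair cost from the slide
fmsr_repair_cost = 33_894   # USD, FMSR repair cost from the slide

saving = raid6_repair_cost - fmsr_repair_cost
saving_pct = 100 * saving / raid6_repair_cost

print(f"saving: ${saving:,} ({saving_pct:.1f}% of the RAID-6 repair cost)")
```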

46 Response Time-Local Cloud

47 Response Time-Local Cloud

48 Response Time-Commercial Cloud

49 Outline: Introduction; Repair in Multiple Cloud Storage; FMSR Codes; NCCloud; Conclusion

50 Conclusion This paper presents NCCloud, a proxy-based multiple-cloud storage system that addresses the reliability of today’s cloud backup storage. NCCloud not only provides fault tolerance in storage, but also allows cost-effective repair. The FMSR code implementation eliminates the encoding requirement of storage nodes during repair.

