Network Coding for Distributed Storage Systems IEEE TRANSACTIONS ON INFORMATION THEORY, SEPTEMBER 2010 Alexandros G. Dimakis Brighten Godfrey Yunnan Wu.

Network Coding for Distributed Storage Systems
IEEE Transactions on Information Theory, September 2010
Alexandros G. Dimakis, Brighten Godfrey, Yunnan Wu, Martin J. Wainwright, Kannan Ramchandran

Outline
- Introduction
- Background
- Analysis
- Evaluation
- Conclusion

Introduction
- Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes.
- Storing erasure-coded data in distributed storage systems:
  - the encoded data are spread across nodes.
  - less redundancy is required than with replication.
  - stored data must be replaced periodically.

Introduction
- Key issues in distributed storage systems:
  - repair bandwidth
  - storage space
- How can encoded data be generated in a distributed way while transferring as little data as possible?

MDS Codes
- A common practice for repairing a single node failure in an erasure-coded system:
  1. a new node reconstructs the whole encoded data object.
  2. it then generates just one encoded block.
- Maximum Distance Separable (MDS) code.
  - (n, k)-MDS property
  - the original file can be recovered from any k of the n encoded blocks.

MDS Codes
[Figure: a file of size M is divided into k fragments of size M/k, MDS-encoded, and stored at n nodes.]
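The (n, k)-MDS property can be illustrated with a small Reed-Solomon-style sketch. This is not code from the slides or the paper: the prime field GF(257), the evaluation points 1..n, and the function names `encode` and `decode` are assumptions chosen only to keep the example short. The k data symbols are treated as polynomial coefficients, each node stores one evaluation, and any k evaluations recover the data.

```python
P = 257  # small prime field, assumed here purely for illustration


def encode(data, n):
    """Treat the k data symbols as polynomial coefficients; node x stores p(x), x = 1..n."""
    return [(x, sum(c * pow(x, j, P) for j, c in enumerate(data)) % P)
            for x in range(1, n + 1)]


def decode(shares, k):
    """Recover the k coefficients from any k (x, y) shares by Lagrange interpolation."""
    shares = shares[:k]
    coeffs = [0] * k
    for i, (xi, yi) in enumerate(shares):
        basis = [1]   # i-th Lagrange basis polynomial, lowest degree first
        denom = 1
        for j, (xj, _) in enumerate(shares):
            if j == i:
                continue
            # Multiply the basis polynomial by (x - xj).
            basis = [(a - xj * b) % P for a, b in zip([0] + basis, basis + [0])]
            denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P  # modular inverse needs Python 3.8+
        for deg, b in enumerate(basis):
            coeffs[deg] = (coeffs[deg] + scale * b) % P
    return coeffs
```

With n = 4 and k = 2, every pair of stored shares decodes to the original two symbols, which is exactly the "any k" guarantee. Note that this naive repair still reads k full shares to rebuild one, the cost that regenerating codes aim to reduce.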

Introduction
- Redundancy must be continually refreshed as nodes fail in distributed storage systems.
  - this causes large data transfers across the network.

Introduction
- Erasure codes can be repaired without communicating the whole data object.
- (4, 2)-MSR example when a node fails:
  - the surviving nodes generate smaller parity packets from their data.
  - they forward them to the newcomer.
  - the newcomer mixes the packets to generate two new packets.

Introduction
- This paper identifies an optimal tradeoff curve between storage and repair bandwidth.
  - smaller storage space => less redundancy => more repair bandwidth
- This paper calls codes that lie on this optimal tradeoff curve regenerating codes.

Introduction
- Minimum-Storage Regenerating (MSR) codes.
  - can be efficiently repaired.
- Minimum-Bandwidth Regenerating (MBR) codes.
  - each storage node stores slightly more than M/k.
  - in exchange, the repair bandwidth can be reduced.

Outline
- Introduction
- Background
- Analysis
- Evaluation
- Conclusion

Erasure Codes
- Classical coding theory focuses on the tradeoff between redundancy and error tolerance.
- In terms of the redundancy-reliability tradeoff, Maximum Distance Separable (MDS) codes are optimal.
  - the best-known example is Reed-Solomon codes.

Network Coding
- Network coding allows
  - the intermediate nodes to generate output data by encoding previously received input data.
  - information to be "mixed" at intermediate nodes.
- This paper investigates the application of network coding to the repair problem in distributed storage.
  - tradeoff between storage and repair network bandwidth

Distributed Storage Systems
- Erasure codes could reduce bandwidth use by an order of magnitude compared with replication.
- Hybrid strategy:
  - one special storage node maintains one full replica.
  - the other nodes store multiple erasure-encoded fragments.
  - the replica node produces each new encoded fragment, so only M / k bytes are transferred.
  - the scheme has a problem when the replica itself is lost.

Outline
- Introduction
- Background
- Analysis
- Evaluation
- Conclusion

Information Flow Graph

Storage-Bandwidth Tradeoff
- The normal redundancy we want to maintain requires n active storage nodes
  - each node stores α bits
  - a newcomer downloads β bits each from any d surviving nodes
  - total repair bandwidth is γ = dβ
- For each set of parameters (n, k, d, α, γ), there is a family of information flow graphs, each of which corresponds to a particular evolution of node failures and repairs.

Storage-Bandwidth Tradeoff
- Denote this family of directed acyclic graphs by G(n, k, d, α, γ).
  - the point (n, k, d, α, γ) = (4, 2, 3, 1 Mb, 1.5 Mb) is feasible.

Storage-Bandwidth Tradeoff
- Theorem 1: For any α ≥ α*(n, k, d, γ), the points (n, k, d, α, γ) are feasible.
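The transcript does not preserve the expression for the threshold α*. The sketch below uses the piecewise form as recalled from Theorem 1 of the paper, so it should be checked against the paper before being relied on. The helper names `f`, `g` and `alpha_star` are illustrative, and the file size M = 2 Mb in the usage note is an assumption chosen so that α = M/k = 1 Mb matches the (4, 2, 3, 1 Mb, 1.5 Mb) example.

```python
from fractions import Fraction


def f(i, M, k, d):
    """Breakpoints of the tradeoff curve along the repair-bandwidth axis."""
    return Fraction(2 * M * d, (2 * k - i - 1) * i + 2 * k * (d - k + 1))


def g(i, d, k):
    """Slope factor used on the i-th segment of the curve."""
    return Fraction((2 * d - 2 * k + i + 1) * i, 2 * d)


def alpha_star(M, k, d, gamma):
    """Minimum per-node storage for repair bandwidth gamma, or None if gamma is too small."""
    gamma = Fraction(gamma)
    if gamma >= f(0, M, k, d):
        return Fraction(M, k)  # minimum-storage regime
    for i in range(1, k):
        if f(i, M, k, d) <= gamma < f(i - 1, M, k, d):
            return (M - g(i, d, k) * gamma) / (k - i)
    return None  # gamma < f(k-1): no amount of storage makes repair feasible
```

Under these assumptions, `alpha_star(2, 2, 3, Fraction(3, 2))` evaluates to 1, so storing 1 Mb per node with 1.5 Mb of repair traffic sits exactly on the curve. Exact fractions are used so that the segment boundaries are compared without rounding error.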

Theorem Proof (1/4)

Theorem Proof (2/4)

Theorem Proof (3/4)

Theorem Proof (4/4)

Storage-Bandwidth Tradeoff
- Code repair can be achieved if and only if the underlying information flow graph has sufficiently large min-cuts.
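The min-cut condition can be made concrete with a short numeric check. The slide does not give the bound itself; the sketch assumes the form recalled from the paper, in which the worst-case min-cut to a data collector equals the sum over i = 0..min(d, k)-1 of min((d - i)β, α), and repair is feasible when that sum is at least the file size M. The function names are illustrative.

```python
from fractions import Fraction


def min_cut_bound(k, d, alpha, gamma):
    """Worst-case min-cut between the source and a data collector (assumed form)."""
    beta = Fraction(gamma) / d  # bits downloaded from each of the d helper nodes
    return sum(min((d - i) * beta, Fraction(alpha)) for i in range(min(d, k)))


def is_feasible(M, k, d, alpha, gamma):
    """A point is feasible when the worst-case min-cut can carry the whole file."""
    return min_cut_bound(k, d, alpha, gamma) >= M
```

For example, with M = 2, k = 2, d = 3 (M is an assumed file size), the point α = 1, γ = 3/2 passes the check, while lowering γ to 7/5 at the same α fails it.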

Storage-Bandwidth Tradeoff
- Optimal tradeoff curve between storage α and repair bandwidth γ
  - points marked on the plots: (γ = 1, α = 0.2) and (γ = 1, α = 0.1)

Special Cases (1/2)
- Minimum-Storage Regenerating (MSR) Codes

Special Cases (2/2)
- Minimum-Bandwidth Regenerating (MBR) Codes
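The formulas on these two slides did not survive in the transcript. For reference, the two extreme points of the tradeoff curve, as recalled from the paper rather than taken from the slides, can be written as follows, where M is the file size and d is the number of nodes contacted during repair:

```latex
% MSR point: minimize storage first, then repair bandwidth
(\alpha_{\mathrm{MSR}},\ \gamma_{\mathrm{MSR}})
  = \left( \frac{M}{k},\ \frac{M d}{k\,(d - k + 1)} \right)

% MBR point: minimize repair bandwidth first, then storage
(\alpha_{\mathrm{MBR}},\ \gamma_{\mathrm{MBR}})
  = \left( \frac{2 M d}{2 k d - k^{2} + k},\ \frac{2 M d}{2 k d - k^{2} + k} \right)
```

As a sanity check under the assumed values M = 2 Mb, k = 2, d = 3, the MSR point is (1 Mb, 1.5 Mb), which agrees with the feasible example quoted earlier, and the MBR point is (1.2 Mb, 1.2 Mb): each node stores slightly more than M/k and repair traffic drops.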

Outline
- Introduction
- Background
- Analysis
- Evaluation
  - Node Dynamics and Objectives
  - Model
  - Quantitative Results
- Conclusion

Node Dynamics and Objectives (1/2)
- A permanent failure
  - the permanent departure of a node from the system
  - a disk failure resulting in loss of the data stored on the node
- A transient failure
  - node reboot
  - temporary network disconnection

Node Dynamics and Objectives (2/2)
- A file is available
  - if it can be reconstructed from the data stored on currently available nodes.
- A file is durable
  - if, after permanent node failures, it may still become available at some point in the future.

Model (1/5)
- The model has two key parameters, f and a.
  - a fraction f of the nodes storing file data fail permanently per unit time.
  - at any given time, a node storing data is available with some probability a.
- From these, the expected availability and maintenance bandwidth of various redundancy schemes for maintaining a file of M bytes can be computed.
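The availability side of this model can be sketched numerically. The paper's exact expressions are not in the transcript, so the code assumes that nodes are available independently with probability a, that an (n, k) MDS-coded file is available when at least k of its n nodes are up, and that a replicated file is available when at least one replica is up. The function names are illustrative.

```python
from math import comb


def erasure_availability(n, k, a):
    """P(at least k of n independently available nodes are up)."""
    return sum(comb(n, i) * a**i * (1 - a)**(n - i) for i in range(k, n + 1))


def replication_availability(r, a):
    """P(at least one of r independently available replicas is up)."""
    return 1 - (1 - a)**r
```

Setting k = 1 in `erasure_availability` reproduces the replication formula, which is a quick consistency check on the binomial sum. Maintenance bandwidth is not modeled here because it also depends on the repair scheme.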

Model (2/5)

Model (3/5)

Model (4/5)

Model (5/5)

Estimating f and a

Quantitative Results (1/2)

Quantitative Results (2/2)

Quantitative Comparison

Outline
- Introduction
- Background
- Analysis
- Evaluation
- Conclusion

Conclusion
- This paper presented a general theoretical framework that
  - determines the information that must be communicated to repair failures in encoded systems.
  - identifies a tradeoff between storage and repair bandwidth.
- One potential application area for the proposed regenerating codes is distributed archival storage or backup.
  - regenerating codes can potentially offer desirable tradeoffs in terms of redundancy, reliability, and repair bandwidth.