Erasure Coding vs. Replication: A Quantitative Comparison


Erasure Coding vs. Replication: A Quantitative Comparison Presented by Mr. P. H. Chan

Background: The authors are Hakim Weatherspoon and John D. Kubiatowicz of the CS Division at UC Berkeley. Around November 2000 they launched a project called “OceanStore”, a distributed, peer-to-peer storage system. This paper compares erasure coding with replication as applied to OceanStore.

List of sections: Background; Introduction; System architecture of OceanStore (very brief); Availability; System Model; Comparisons (bandwidth, storage, disk seeks, and MTTF); Discussion

Introduction: For a peer-to-peer system, one crucial problem is reliability. Erasure coding and replication are two commonly used methods for improving the reliability of such systems. Combining one of these fault-resilient schemes with a repair algorithm increases the system's mean time to failure (MTTF).

Introduction: Generally speaking, erasure coding is considered better than replication. But what is improved, and by how much? Is erasure coding worth using? This paper takes a quantitative approach to evaluating the performance gain of erasure coding over replication, based on OceanStore.

System Architecture: Data is divided into blocks. Replication or erasure coding encodes the blocks into “fragments”, which are distributed to the workstations in the system. Fragments belonging to the same group of blocks are never placed on the same workstation. A central management server repeatedly retrieves the fragments belonging to each data block.

System Architecture: If a workstation breaks down, some fragments go missing. The management server regenerates each missing fragment and places it on another workstation. (Assumption: a dead machine is immediately replaced by a new, blank workstation.) The time between successive examinations of the same block group is called an “epoch”.
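
The epoch-based repair cycle can be illustrated with a deliberately simplified model. This is a sketch under stated assumptions, not the model used in the paper: each fragment is assumed to be lost independently with a hypothetical probability p during one epoch, a block is assumed lost if fewer than m of its f fragments survive the epoch, and otherwise the block is assumed to be repaired back to f fragments. The function names and the value of p are illustrative only.

```python
from math import comb


def loss_probability_per_epoch(f: int, m: int, p: float) -> float:
    """Probability that fewer than m of f fragments survive one epoch.

    Assumption: each fragment is lost independently with probability p.
    The block is lost when more than f - m fragments are lost.
    """
    return sum(
        comb(f, k) * p**k * (1 - p) ** (f - k)
        for k in range(f - m + 1, f + 1)
    )


def mttf_block_years(f: int, m: int, p: float, epoch_years: float) -> float:
    """Expected time until a block is lost, in years.

    Assumption: full repair at every epoch boundary, so the number of
    epochs until loss is geometric with mean 1 / (loss probability).
    """
    return epoch_years / loss_probability_per_epoch(f, m, p)


if __name__ == "__main__":
    epoch = 4 / 12   # a four-month epoch, expressed in years
    p = 0.1          # hypothetical per-epoch fragment loss probability
    print("2 replicas      :", mttf_block_years(2, 1, p, epoch), "years")
    print("32-of-64 erasure:", mttf_block_years(64, 32, p, epoch), "years")
```

Under these assumptions, two replicas are the special case f = 2, m = 1, and a rate-½ erasure code with 64 fragments is f = 64, m = 32. The absolute numbers depend entirely on the assumed p and should not be read as the paper's results.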

Availability: The probability that a block is available in the system.

Availability: With N = 1 million machines and M = 100k of them unavailable (10%), two replicas provide 0.99 availability. Erasure coding at rate ½ (rate = original data size / erasure-coded data size) gives 0.999999998 availability. Erasure coding improves availability.
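
The availability figures can be explored with a short calculation. The sketch below assumes that a block's f fragments sit on distinct machines chosen uniformly at random from N, that M machines are unavailable, and that the block is available when at least m fragments are reachable. It uses M = 100,000 (10% of machines down), the value at which two replicas yield the 0.99 quoted above. The fragment count of 32 for the rate-½ code is an assumption, since the slide does not state it, and the function name is illustrative.

```python
from math import comb


def availability(N: int, M: int, f: int, m: int) -> float:
    """Probability that at least m of a block's f fragments are reachable.

    Assumptions: fragments are on f distinct machines drawn uniformly
    from N machines, and exactly M of the N machines are unavailable.
    The count of unavailable fragments is then hypergeometric.
    """
    # Placements in which more than f - m fragments are unreachable.
    bad = sum(
        comb(M, i) * comb(N - M, f - i)
        for i in range(f - m + 1, f + 1)
    )
    return 1 - bad / comb(N, f)


if __name__ == "__main__":
    N, M = 10**6, 10**5
    print("two replicas       :", availability(N, M, f=2, m=1))
    print("rate 1/2, 16 of 32 :", availability(N, M, f=32, m=16))
```

Replication with two copies is the case f = 2, m = 1. Computing the unavailable mass with exact integers and subtracting once avoids accumulating rounding error when the result is very close to 1.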

System Model: The maximum number of blocks in the system; the MTTF of the system and the MTTF of a block.

System Model: The storage requirement; the bandwidth requirement.

System Model: The number of disk seeks.

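System Model: Comparing replication with erasure coding, we find that the replication-to-erasure-coding ratios of disk seeks, storage, and bandwidth requirements are all equal to R*r, where R is the number of replicas and r is the rate of the erasure code. For storage, for example, replication keeps R copies of the data while a rate-r code keeps 1/r times the data, so the ratio is R / (1/r) = R*r.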

Comparisons: With each user writing data to the system at a rate of 35 MB/hr, b = 8 kB, dbsz = 8 kB, N = 2^24 users, e_repl = e_erase = 4 months, and MTTF_system = 1000 years, the number of replicas needed to sustain that MTTF is R = 22, while erasure coding needs rate r = ½ to reach the same MTTF. Thus, R*r = 11.
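
The R*r figure is simple arithmetic on storage overheads, sketched below. The helper names are illustrative; the only inputs taken from the slide are R = 22 and r = ½.

```python
from fractions import Fraction


def replication_overhead(R: int) -> Fraction:
    """Bytes stored per byte of user data when keeping R full replicas."""
    return Fraction(R)


def erasure_overhead(r: Fraction) -> Fraction:
    """Bytes stored per byte of user data for a rate-r erasure code."""
    return 1 / Fraction(r)


def replication_to_erasure_ratio(R: int, r: Fraction) -> Fraction:
    """How many times more storage replication needs; equals R * r."""
    return replication_overhead(R) / erasure_overhead(r)


if __name__ == "__main__":
    R, r = 22, Fraction(1, 2)
    print("replication overhead:", replication_overhead(R))   # 22
    print("erasure overhead    :", erasure_overhead(r))       # 2
    print("ratio R*r           :", replication_to_erasure_ratio(R, r))  # 11
```

With these values, replication stores 22 bytes per byte of user data and the rate-½ code stores 2, so replication needs 11 times the storage for the same system MTTF.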

Comparisons (finding MTTF_block): With R = 2, r = 32/64, and e_repl = e_erase = 4 months, the MTTF_block of the replication scheme is 74 years and that of the erasure-coded scheme is 10^20 years.

Discussion: This paper presents a quantitative approach to calculating the performance gain of using erasure coding. Encoding data with an erasure code requires intensive CPU time. The system MTTF decreases significantly as the number of blocks increases.

Thank you.