6.4 Data and File Replication
Presenter: Jing He
Instructor: Dr. Yanqing Zhang

Outline
– Basic Knowledge
– Most Recent Projects
– Future Work
– References

Why Replicate?
– Performance
– Reliability
– Resource sharing
– Network resource saving

Challenges
– Transparency
  – Parallelism transparency
  – Failure transparency
  – Replication transparency
– Concurrency control
– Failure recovery

Goal
– One-copy serializability: the execution of transactions on replicated objects is equivalent to the execution of the same transactions on non-replicated objects [1][R. Chow et al.].

Architecture
– FSA (file service agent): the client interface.
– RM (replica manager): provides the replication functions.
– A client chooses one or more FSAs to access a data object.
– The FSA acts as a front end to the RMs to provide replication transparency.
– The FSA contacts one or more RMs for the actual reading and updating of data objects.
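
The division of labor between FSA and RMs can be sketched in a few lines of Python. This is only an illustration under assumptions: the class names `FileServiceAgent` and `ReplicaManager`, their methods, and the choice of reading from one RM while writing to all of them are hypothetical and not taken from the slides.

```python
class ReplicaManager:
    """Hypothetical RM: holds one copy of every data object."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def read(self, key):
        return self.store.get(key)

    def write(self, key, value):
        self.store[key] = value


class FileServiceAgent:
    """Hypothetical FSA: the only component the client talks to."""

    def __init__(self, replica_managers):
        self.rms = list(replica_managers)

    def read(self, key):
        # Contact a single RM; the client never learns which one.
        return self.rms[0].read(key)

    def write(self, key, value):
        # Contact every RM; the client sees one logical write.
        for rm in self.rms:
            rm.write(key, value)


rms = [ReplicaManager(f"rm{i}") for i in range(3)]
fsa = FileServiceAgent(rms)
fsa.write("report.txt", "v1")
result = fsa.read("report.txt")
```

The client code touches only `fsa`, which is what replication transparency means here: the number and identity of the RMs can change without the client noticing.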

Read Operations
– Read-one-primary: the FSA reads only from the primary RM, which enforces consistency.
– Read-one: the FSA may read from any RM, which increases concurrency.
– Read-quorum: the FSA must read from a quorum of RMs to determine which copy is current.

Write Operations
– Write-one-primary: write only to the primary RM; the primary RM then updates all other RMs.
– Write-all: update all RMs.
– Write-all-available: write to all functioning RMs. A faulty RM must be synchronized before it is brought back online.

Write Operations (cont.)
– Write-quorum: update a predefined quorum of RMs.
– Write-gossip: update any RM; the update is propagated lazily to the other RMs.

Read-one-primary, Write-one-primary
– The other RMs are backups of the primary RM
– No concurrency
– Easily serialized
– Simple to implement
– Achieves one-copy serializability
– The primary RM is a performance bottleneck

Read-one, Write-all
– Provides concurrency
– A concurrency control protocol is needed to ensure consistency (serialization)
– Achieves one-copy serializability
– Difficult to implement (a single failed RM blocks every update)

Read-one, Write-all-available
– A variation of read-one, write-all
– May not guarantee one-copy serializability
– Many conflicts among transactions are possible
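
A minimal Python sketch of write-all-available shows where the stale copy comes from and why a recovered RM has to be synchronized before it serves reads again. The `Replica` class, the `up` flag, and the helper names are assumptions made for this example; a real system would also need failure detection and concurrency control, which are omitted.

```python
class Replica:
    """Hypothetical RM with an up/down flag."""

    def __init__(self):
        self.data = {}
        self.up = True


def write_all_available(replicas, key, value):
    """Apply the write to every functioning replica; return how many took it."""
    live = [r for r in replicas if r.up]
    if not live:
        raise RuntimeError("no replica available")
    for r in live:
        r.data[key] = value
    return len(live)


def bring_online(replica, healthy):
    """Synchronize a recovered replica from a healthy one before it serves again."""
    replica.data = dict(healthy.data)
    replica.up = True


replicas = [Replica() for _ in range(3)]
write_all_available(replicas, "x", 1)
replicas[2].up = False                      # the third RM fails
written = write_all_available(replicas, "x", 2)
stale_value = replicas[2].data["x"]         # it missed the second write
bring_online(replicas[2], replicas[0])
fresh_value = replicas[2].data["x"]
```

If the failed RM were allowed to answer a read-one request before `bring_online` ran, the client would see the old value, which is one way the scheme can lose one-copy serializability.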

Read-quorum, Write-quorum
– A version number is attached to each replicated object.
– On a read, the copy with the highest version number is the latest.
– A write operation advances the version number by 1.
– To avoid write-write conflicts: 2 × write quorum > number of object copies.
– To avoid read-write conflicts: write quorum + read quorum > number of object copies.
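
The two quorum conditions and the version-number rule can be tried out with a small Python sketch. It assumes a single process holding all copies in a list and a caller that names which RMs it reached (`members`); `QuorumStore` and `quorums_are_valid` are hypothetical names, not part of any library.

```python
def quorums_are_valid(n, r, w):
    """Check 2W > N (write-write) and R + W > N (read-write)."""
    return 2 * w > n and r + w > n


class QuorumStore:
    """Hypothetical store with n copies, each a (version, value) pair."""

    def __init__(self, n, r, w):
        if not quorums_are_valid(n, r, w):
            raise ValueError("quorum sizes do not guarantee overlap")
        self.r, self.w = r, w
        self.copies = [(0, None)] * n

    def read(self, members):
        """Return the (version, value) pair with the highest version seen."""
        if len(members) < self.r:
            raise ValueError("read quorum not reached")
        return max((self.copies[i] for i in members), key=lambda c: c[0])

    def write(self, members, value):
        """Install value with the next version number on every contacted copy."""
        if len(members) < self.w:
            raise ValueError("write quorum not reached")
        # Any write quorum overlaps the previous one, so it sees the latest version.
        latest = max(self.copies[i][0] for i in members)
        for i in members:
            self.copies[i] = (latest + 1, value)
        return latest + 1


store = QuorumStore(n=5, r=2, w=4)
store.write([0, 1, 2, 3], "a")         # copies 0-3 now hold version 1
store.write([1, 2, 3, 4], "b")         # copy 0 is left behind at version 1
version, value = store.read([0, 4])    # quorum contains one stale, one fresh copy
```

With N = 5, the pair R = 2, W = 3 would be rejected because 2 × 3 > 5 holds but 2 + 3 > 5 does not, so a read quorum could miss the latest write entirely.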

Gossip Update
– Suitable when updates are less frequent than reads, so updates can be propagated lazily to the replicas.
– The FSA directs both read and update operations to any RM.
– The FSA shields replication details from clients.
– Increased performance
– Typically read-one, write-gossip
– Uses timestamps

Basic Gossip Update
– Read: if TS_fsa <= TS_rm, the RM has recent data and returns it; otherwise the FSA waits for gossip or tries another RM.
– Update: if TS_fsa > TS_rm, the RM performs the update, advances TS_rm, and sends gossip. Otherwise the handling depends on the application: perform the update or reject it.
– Gossip: an RM applies a gossip message if it carries new updates.
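
The three rules above can be sketched in Python with scalar timestamps. Several things here are assumptions for illustration only: the class name `GossipRM`, the use of one value per RM, the choice to reject a stale update (the slide leaves that to the application), and the caller supplying `ts_fsa` directly.

```python
class GossipRM:
    """Hypothetical RM holding one value and the timestamp of its last update."""

    def __init__(self):
        self.ts = 0
        self.value = None

    def read(self, ts_fsa):
        # The RM is recent enough only if TS_fsa <= TS_rm.
        if ts_fsa <= self.ts:
            return self.value, self.ts
        return None  # caller waits for gossip or tries another RM

    def update(self, ts_fsa, value):
        # Accept only updates that are newer than what the RM holds.
        if ts_fsa > self.ts:
            self.ts, self.value = ts_fsa, value
            return (self.ts, self.value)  # gossip message to send to other RMs
        return None  # stale update; this sketch rejects it

    def receive_gossip(self, message):
        ts, value = message
        if ts > self.ts:  # apply only gossip that carries something new
            self.ts, self.value = ts, value


a, b = GossipRM(), GossipRM()
msg = a.update(1, "v1")        # TS_fsa = 1 > TS_rm = 0, so the update is accepted
early = b.read(1)              # b has not heard the gossip yet
b.receive_gossip(msg)
late = b.read(1)               # now b is recent enough
rejected = a.update(1, "v0")   # not newer than TS_rm, so it is refused
```

The `early` read returning nothing is the price of lazy propagation: the FSA either waits or retries at another RM, as the read rule states.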

Causal Order Gossip Protocol
– Used for read-modify updates
– Assumes a fixed RM configuration
– Uses vector timestamps
– Uses a buffer to keep updates in order
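
How vector timestamps and a buffer combine to preserve causal order can be illustrated with the standard causal-delivery rule. This is a sketch of that general rule, not necessarily the exact protocol on the slide: an update from RM j stamped V is applied only when V[j] is exactly one ahead of the local entry for j and no other entry of V is ahead of the local vector. The class `CausalRM` and its fields are hypothetical.

```python
class CausalRM:
    """Hypothetical RM in a fixed group of n_rms replica managers."""

    def __init__(self, n_rms):
        self.clock = [0] * n_rms   # vector timestamp of applied updates
        self.log = []              # updates in the order they were applied
        self.buffer = []           # updates that arrived too early

    def _deliverable(self, sender, stamp):
        # Next update from the sender, and nothing it depends on is missing.
        return all(
            s == c + 1 if k == sender else s <= c
            for k, (s, c) in enumerate(zip(stamp, self.clock))
        )

    def receive(self, sender, stamp, update):
        self.buffer.append((sender, stamp, update))
        progress = True
        while progress:            # applying one update may unblock others
            progress = False
            for item in list(self.buffer):
                src, st, upd = item
                if self._deliverable(src, st):
                    self.buffer.remove(item)
                    self.log.append(upd)
                    self.clock[src] = st[src]
                    progress = True


rm2 = CausalRM(3)
# u2 was issued at RM 1 after it had applied u1 from RM 0, so u2 depends on u1.
rm2.receive(sender=1, stamp=[1, 1, 0], update="u2")   # arrives first, is buffered
held = list(rm2.log)
rm2.receive(sender=0, stamp=[1, 0, 0], update="u1")   # unblocks u2
```

The fixed RM configuration matters because the vector has one entry per RM; adding or removing an RM would change the length of every timestamp.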

Disadvantages of File Replication
– The contents of a file need to be known before the replication operation takes place.
– Existing systems cannot work in limited-bandwidth networks.
– DFS replication does not work well when there is a large number of changes to replicate.

Current Projects
– Data Grid file replication [2][C. Yang, 2008]
  – Creates copies in convenient locations
  – Replicas are adjusted to appropriate locations using Bayesian networks (BN)
– File replication in P2P systems
  – Plover: makes replicas among physically close nodes and balances load between replica nodes [3][H. Shen, 2009]
  – EAD: an efficient and adaptive decentralized file replication algorithm [4,5][H. Shen, 2009]

Future Work
– Improve the efficiency and effectiveness of file replication schemes
– Integrate file replication and consistency maintenance

References
[1] R. Chow and T. Johnson, Distributed Operating Systems & Algorithms, 1997.
[2] C. Yang, C. Huang, and T. Hsiao, A Data Grid File Replication Maintenance Strategy Using Bayesian Networks, Eighth International Conference on Intelligent Systems Design and Applications, 2008.
[3] H. Shen and Y. Zhu, A Proactive Low-Overhead File Replication Scheme for Structured P2P Content Delivery Networks, Journal of Parallel and Distributed Computing, 2009.
[4] H. Shen, IRM: Integrated File Replication and Consistency Maintenance in P2P Systems, IEEE Transactions on Parallel and Distributed Systems, 2009.
[5] H. Shen, An Efficient and Adaptive Decentralized File Replication Algorithm in P2P File Sharing Systems, IEEE Transactions on Parallel and Distributed Systems, 2009.