Replication and Availability in Distributed Systems


CS455 Introduction to Distributed Systems
Department of Computer Science, Colorado State University
Sagar Reddy Bijjam, Srinivas Reddy Kontham

Why is this Problem Important?
- We are producing more data than ever: 90% of all data ever produced was generated in the last two years.
- We need concurrent access to this data.
- We need fast data access.
- We need high system availability.

Problem Characterization
Failures are inevitable, whether in hardware or in software.
Two primary reasons for replication:
- Increasing reliability
  - If a replica crashes, the system can continue working by switching to another replica.
- Improving performance
  - Important for distributed systems spread over large geographical areas.
  - Divide the work over a number of servers.
  - Place data in the proximity of clients.

Trade-off Space for Solutions in this Area
The CAP principle:
- Strong consistency: all clients see the same, most recent data, even when updates are concurrent.
- High availability: any consumer of the data can always reach some replica.
- Partition resilience: the system can survive network partitions.
CAP: strong consistency, high availability, partition resilience. Pick at most 2!

Dominant Approaches to the Problem (1/2)
Pessimistic algorithms
- Guarantee strong consistency at the cost of availability
- Based on quorum consensus
- Example: Gifford's voting protocol
Optimistic algorithms
- Favor availability at the expense of consistency
- A good technique if the probability of conflicts is small
- Example: Coda replication
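The quorum idea behind pessimistic replication can be sketched in a few lines of Python. This is a minimal illustration in the spirit of weighted voting, not the slides' own code: the names `Replica`, `quorum_is_valid`, `collect`, `read`, and `write` are hypothetical, replicas live in one process, and failures are simulated with an `up` flag. The two conditions checked are the usual ones: a read quorum r and a write quorum w must overlap (r + w > total votes), and any two write quorums must overlap (2w > total votes).

```python
from dataclasses import dataclass

@dataclass
class Replica:
    # Hypothetical in-memory replica; `votes` is its weight in the quorum.
    name: str
    votes: int = 1
    version: int = 0
    value: object = None
    up: bool = True

def quorum_is_valid(total_votes, r, w):
    # Reads must overlap writes, and writes must overlap each other.
    return r + w > total_votes and 2 * w > total_votes

def collect(replicas, needed):
    # Gather reachable replicas until their votes reach `needed`.
    chosen, votes = [], 0
    for rep in replicas:
        if rep.up:
            chosen.append(rep)
            votes += rep.votes
            if votes >= needed:
                return chosen
    return None  # not enough reachable votes

def read(replicas, r):
    quorum = collect(replicas, r)
    if quorum is None:
        raise RuntimeError("read quorum unavailable")
    # The overlap rule guarantees the quorum holds the newest version.
    newest = max(quorum, key=lambda rep: rep.version)
    return newest.value, newest.version

def write(replicas, w, value):
    quorum = collect(replicas, w)
    if quorum is None:
        raise RuntimeError("write quorum unavailable")
    new_version = max(rep.version for rep in quorum) + 1
    for rep in quorum:
        rep.value, rep.version = value, new_version
    return new_version
```

With three one-vote replicas and r = w = 2, a write reaches two replicas, and a later read still returns the written value after one of them fails. Once two replicas are down, both operations are refused, which is the availability cost the slide refers to.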

Dominant Approaches to the Problem (2/2)
Highly available pessimistic algorithms
- Increased availability with strong consistency
- Example: dynamic voting protocol
- Network partitions and other failures can hurt the fault tolerance of static voting
- Dynamic voting can boost fault tolerance by adapting
  - the number of votes assigned to various nodes
  - the set of nodes that can form read/write quorums
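One way dynamic voting can adapt the set of nodes that form a quorum is sketched below. It is a deliberately simplified illustration, not the protocol as the slides present it: each copy remembers its version and how many copies took part in the latest update it saw, and a partition may update only if it holds a strict majority of those participants. The names `Copy`, `can_update`, and `update` are hypothetical, and tie-breaking and recovery of rejoining copies are left out.

```python
from dataclasses import dataclass

@dataclass
class Copy:
    # Hypothetical replica record for this sketch.
    name: str
    version: int = 0
    cardinality: int = 0  # copies that took part in the latest update seen

def can_update(partition):
    # Find the copies in this partition that hold the newest version.
    latest = max(c.version for c in partition)
    current = [c for c in partition if c.version == latest]
    # They must be a strict majority of the last update's participants.
    return 2 * len(current) > current[0].cardinality

def update(partition):
    if not can_update(partition):
        return False
    new_version = max(c.version for c in partition) + 1
    for c in partition:
        c.version = new_version
        # The quorum base shrinks to the copies that are reachable now.
        c.cardinality = len(partition)
    return True
```

Starting from five copies, a partition of three can update, and afterwards two of those three can still update because they are a majority of the last three participants. Static majority voting over all five copies would refuse that second update. The other partition, which holds only stale copies, is refused, so consistency is preserved.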

Insights Gleaned
- Replication is mainly used to improve availability and performance.
- But there are several trade-offs in choosing a replication strategy.
- Optimistic algorithms are best suited to situations where availability is the main criterion.
- Pessimistic algorithms are at their best where availability is of less concern and strong consistency is required.
- So choose the strategy that best fits the situation.

Problem Space in the Future
- Which files should be replicated?
- How many replicas should be created?
- Where should the replicas be placed?
- Which replica should be deleted if there is not enough space in data storage?

Future Trade-Off Space
More uniform distribution of files among nodes
- Reduces bottlenecks at nodes that are accessed more often
- NP-hard (a variant of the knapsack problem)
- Use the best algorithm/heuristic combination for the task
Demand-based replication
- Popular files have a higher level of replication
- Files are distributed closer to the nodes making the most requests
- When space runs out, delete the file with the least demand
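The demand-based policy can be made concrete with a small Python sketch. Everything here is an assumption for illustration: the `DemandBasedStore` class, the `target_replicas` function, the proportional rule, and the default bounds are hypothetical and do not come from the slides. The sketch counts requests per file, derives a replica count from each file's share of total requests, and evicts the least-demanded file when a node runs out of space.

```python
import math
from collections import Counter

def target_replicas(requests, total_requests, min_replicas=1, max_replicas=5):
    # Assumed rule: replica count grows with the file's share of all requests.
    if total_requests == 0:
        return min_replicas
    share = requests / total_requests
    return max(min_replicas, min(max_replicas, math.ceil(share * max_replicas)))

class DemandBasedStore:
    # Hypothetical storage node with a fixed capacity (sizes are positive).
    def __init__(self, capacity):
        self.capacity = capacity
        self.files = {}          # file name -> size
        self.demand = Counter()  # file name -> number of requests seen

    def record_access(self, name):
        self.demand[name] += 1

    def used(self):
        return sum(self.files.values())

    def add_replica(self, name, size):
        # Returns the names of files evicted to make room.
        if name in self.files:
            return []
        if size > self.capacity:
            raise ValueError("file larger than node capacity")
        evicted = []
        while self.used() + size > self.capacity:
            # Delete the stored file with the least demand.
            victim = min(self.files, key=lambda f: self.demand[f])
            del self.files[victim]
            evicted.append(victim)
        self.files[name] = size
        return evicted
```

A real system would also weigh where requests come from when placing replicas. This sketch only covers how many replicas a file gets and which replica to delete.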

THANK YOU!