Replication and Availability in Distributed Systems


1 Replication and Availability in Distributed Systems
CS455 Introduction to Distributed Systems
Department of Computer Science, Colorado State University
Sagar Reddy Bijjam, Srinivas Reddy Kontham

2 Why is this Problem Important?
We are producing more data than ever: 90% of all data ever produced was generated in the last two years.
We need concurrent access to this data.
We need quick data access.
We need high system availability.

3 Problem Characterization
Failures are inevitable, whether in hardware or in software.
Two primary reasons for replication:
Increasing reliability:
– If a replica crashes, the system can continue working by switching to other replicas.
Improving performance:
– Important for distributed systems that span large geographical areas.
– Divide the work over a number of servers.
– Place data in the proximity of clients.

4 Trade-off space for solutions in this area
The CAP principle:
Strong consistency: every read reflects the most recent write, so all replicas behave like a single copy even under concurrent updates.
High availability: any consumer of the data can always reach some replica.
Partition resilience: the system can survive network partitions.
CAP: of strong consistency, high availability, and partition resilience, pick at most 2!
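
The trade-off above can be illustrated with a toy read path. This is only a sketch: the `Replica` class, the `read_from_backup` function, and the `prefer` labels are names assumed for illustration, not part of the original slides.

```python
class Replica:
    """A single copy of one data item (hypothetical, for illustration)."""

    def __init__(self):
        self.value = "v1"


def read_from_backup(backup, partitioned, prefer):
    """Read from a backup replica that may be cut off from the primary.

    prefer="consistency":  refuse to answer while partitioned.
    prefer="availability": always answer, possibly with stale data.
    """
    if partitioned and prefer == "consistency":
        return None  # unavailable, but never wrong
    return backup.value  # available, but may be out of date


primary, backup = Replica(), Replica()
primary.value = "v2"  # an update that cannot reach the backup during the partition

print(read_from_backup(backup, True, "consistency"))   # None
print(read_from_backup(backup, True, "availability"))  # v1 (stale)
```

Once the partition exists, the backup can either stay silent or answer with the old value; it cannot do both, which is the "pick at most 2" constraint in miniature.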

5 Dominant Approaches to the Problem(1/2)
Pessimistic algorithms
– Guarantee strong consistency at the cost of availability
– Based on quorum consensus
– Example: Gifford's voting protocol
Optimistic algorithms
– Favor availability at the expense of consistency
– A good technique if the probability of conflicts is small
– Example: Coda replication
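
As a minimal sketch of the pessimistic, quorum-based approach, the following assumes one vote per replica and in-memory replicas; the names `Replica`, `valid_quorums`, `quorum_write`, and `quorum_read` are hypothetical, and real voting protocols add weighted votes, locking, and failure handling that are omitted here.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    version: int = 0
    value: str = ""


def valid_quorums(n, r, w):
    """Quorum constraints for n replicas with one vote each."""
    # r + w > n: every read quorum overlaps every write quorum.
    # 2 * w > n: any two write quorums overlap, so writes are ordered.
    return r + w > n and 2 * w > n


def quorum_write(replicas, reachable, w, value):
    """Write to the reachable replicas only if they form a write quorum."""
    if len(reachable) < w:
        return False  # pessimistic: refuse rather than risk divergence
    new_version = max(replicas[i].version for i in reachable) + 1
    for i in reachable:
        replicas[i].version = new_version
        replicas[i].value = value
    return True


def quorum_read(replicas, reachable, r):
    """Return the newest value seen by a read quorum, or None if unavailable."""
    if len(reachable) < r:
        return None
    newest = max((replicas[i] for i in reachable), key=lambda rep: rep.version)
    return newest.value


replicas = [Replica() for _ in range(5)]
print(valid_quorums(5, 2, 4))                       # True
print(quorum_write(replicas, [0, 1, 2, 3], 4, "a"))  # True
print(quorum_read(replicas, [3, 4], 2))              # a
print(quorum_write(replicas, [0, 1, 2], 4, "b"))     # False
```

The last call shows the availability cost: with only three replicas reachable, the write is rejected even though a majority of the nodes is up.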

6 Dominant Approaches to the Problem(2/2)
Highly available pessimistic algorithm
– Increased availability with strong consistency
Dynamic voting protocol
– Network partitions and other failures can hurt the fault tolerance of static voting
– Dynamic voting can boost fault tolerance by adapting:
  – the number of votes assigned to various nodes
  – the set of nodes that can form read/write quorums
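
One simplified way to picture the second adaptation (changing which nodes can form a quorum) is to require a majority of the copies that took part in the latest update, rather than of the fixed total. The sketch below assumes that rule, a non-empty partition, and no tie-breaking or commit protocol; `Copy` and `try_update` are hypothetical names, and vote reassignment is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    version: int = 0      # number of updates this copy has seen
    cardinality: int = 0  # how many copies took part in that update


def try_update(copies, partition):
    """Attempt an update inside `partition`, a non-empty set of copy indices."""
    members = [copies[i] for i in partition]
    latest = max(c.version for c in members)
    current = [c for c in members if c.version == latest]
    # Quorum: a strict majority of the copies that took part in the latest
    # update, not of the fixed total number of copies.
    if 2 * len(current) <= current[0].cardinality:
        return False
    for c in members:
        c.version = latest + 1
        c.cardinality = len(members)
    return True


copies = [Copy(version=0, cardinality=5) for _ in range(5)]
print(try_update(copies, {0, 1, 2}))  # True: 3 of the 5 current copies
print(try_update(copies, {0, 1}))     # True: 2 of the 3 copies from the last update
print(try_update(copies, {3, 4}))     # False: stale copies, 2 of 5
```

Under static majority voting over five copies, the second update would be refused because only two nodes are reachable; shrinking the quorum base to the three most recent participants is what keeps the system available.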

7 Insights Gleaned
Replication is mainly used to improve availability and performance.
However, several trade-offs exist in choosing a replication strategy.
Optimistic algorithms are best suited to cases where availability is the main criterion.
Pessimistic algorithms are at their best where strong consistency matters more than availability.
So choose the strategy that best fits the situation.

8 Problem Space in the Future
Which files should be replicated?
How many replicas should be created?
Where should the replicas be placed?
Which replica should be deleted if there is not enough space in data storage?

9 Future Trade-Off Space
More uniform distribution of files among nodes
– Reduces bottlenecks at nodes that are accessed more often
– NP-hard (a variant of the knapsack problem)
– Use the best algorithm/heuristic combination for the task
Demand-based replication
– Popular files have a higher level of replication
– Files are placed closer to the nodes making the most requests
– When space runs out, delete the file with the least demand
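
The ideas above can be sketched with three small helpers. These are illustrative assumptions rather than the presenters' method: `replica_counts` splits a replica budget roughly in proportion to demand, `place_greedy` uses a common greedy heuristic (heaviest file first, onto the least-loaded node) in place of an exact solution to the NP-hard placement problem, and `evict_least_demand` picks the deletion victim. Demand is assumed to be a positive request count per file.

```python
import heapq


def replica_counts(demand, total_replicas):
    """Split a replica budget across files in proportion to demand (min 1 each).

    Rounding means the counts may not add up exactly to the budget.
    """
    total = sum(demand.values())
    return {
        name: max(1, round(total_replicas * hits / total))
        for name, hits in demand.items()
    }


def place_greedy(load, nodes):
    """Assign each file to the currently least-loaded node, heaviest file first."""
    heap = [(0, node) for node in nodes]
    heapq.heapify(heap)
    placement = {}
    for name, weight in sorted(load.items(), key=lambda kv: -kv[1]):
        node_load, node = heapq.heappop(heap)
        placement[name] = node
        heapq.heappush(heap, (node_load + weight, node))
    return placement


def evict_least_demand(demand):
    """Pick the file to delete when storage runs out."""
    return min(demand, key=demand.get)


demand = {"a": 70, "b": 20, "c": 10}
print(replica_counts(demand, 10))            # {'a': 7, 'b': 2, 'c': 1}
print(place_greedy(demand, ["n1", "n2"]))    # {'a': 'n1', 'b': 'n2', 'c': 'n2'}
print(evict_least_demand(demand))            # c
```

The greedy placement does not guarantee an optimal balance, but it is cheap and keeps the hottest file from sharing a node with the others in this example.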

10 THANK YOU!

