Consistent Data Access From Data Centers

General Architecture of Data Centers
- Network of servers containing data and performing computations
- Data replicated across servers
- Local area networks within a data center
- Wide area network connecting geo-replicated data centers
- Process failures and network failures (including network partitions) are not unlikely
- Scalability is another requirement

From the ACM Queue paper
Geo-replicated storage provides copies of the same data at multiple, geographically distinct locations. Facebook, for example, geo-replicates its data (profiles, friends lists, likes, etc.) to data centers on the east and west coasts of the United States, and in Europe. In each data center, a tier of separate Web servers accepts browser requests and then handles those requests by reading and writing data from the storage system, as shown in figure 1.

Geo-Replication

Geo-replication Benefits
Geo-replication brings two key benefits to Web services:
- Fault tolerance through redundancy: if one data center fails, others can continue to provide the service.
- Low latency through proximity: clients can be directed to and served by a nearby data center to avoid the speed-of-light delays associated with cross-country or round-the-globe communication.

Low Latency
Front-end Web servers read or write data from the storage system potentially many times to answer a single request; therefore, low latency in the storage system is critical for enabling fast page-load times, which are linked to user engagement with a service and, thus, revenue. An always-available and partition-tolerant system can provide low latency on the order of milliseconds by serving all operations in the local data center. A strongly consistent system must contact remote data centers for reads and/or writes, which takes hundreds of milliseconds.
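To make the latency gap concrete, here is a rough back-of-the-envelope sketch. The 2/3-of-c propagation speed in fiber and the route distances are assumptions chosen for illustration, not figures from the slides:

```python
# Rough lower bounds on round-trip time (RTT) imposed by light propagation in fiber.
# Assumed figures: signals in fiber travel at roughly 2/3 the speed of light in
# vacuum (~200 km per millisecond), and the route distances below are illustrative.
KM_PER_MS_IN_FIBER = 200

def min_rtt_ms(one_way_km: float) -> float:
    """Minimum round-trip time in milliseconds for a given one-way fiber distance."""
    return 2 * one_way_km / KM_PER_MS_IN_FIBER

for route, km in [("cross-US", 4_000),
                  ("US to Europe", 6_000),
                  ("halfway around the globe", 20_000)]:
    print(f"{route}: at least {min_rtt_ms(km):.0f} ms per round trip")
```

Real paths are longer and add routing and queuing delay, so observed cross-data-center round trips are typically worse. The point is that an operation that must wait on a remote data center cannot finish in a few milliseconds, while one served entirely from the local data center can.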

ALPS Systems
Systems that provide these four properties: always Available, low Latency, Partition tolerance, and Scalability. Scalability implies that adding storage servers to each data center produces a proportional increase in storage capacity and throughput. Scalability is critical for modern systems because data has grown far too large to be stored or served by a single machine.

ALPS: Always Available
- All operations issued to the data store complete successfully
- No operation can block indefinitely or fail due to unavailability of data

ALPS: Low Latency
- Client operations complete "quickly"
- Suggested average performance is a few milliseconds, with worst-case performance (i.e., 99.9th percentile) of tens or hundreds of milliseconds

ALPS: Partition Tolerance
- The data store continues to operate under network partitions, e.g., one separating data centers in Asia from those in the United States

ALPS: High Scalability
- The data store scales out linearly
- Adding N resources to the system increases aggregate throughput and storage capacity by O(N)
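The slides do not say how this scale-out is achieved. One common approach in stores of this kind (Dynamo and Cassandra, for example) is to partition the key space across servers with consistent hashing, so that adding a server absorbs a roughly proportional share of the keys and load. A minimal sketch, with all names hypothetical:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hashing sketch: keys are spread across servers, and
    adding a server takes over only a proportional slice of the key space."""

    def __init__(self, servers, vnodes=100):
        self._ring = []  # sorted list of (hash, server) points on the ring
        for s in servers:
            self.add_server(s, vnodes)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_server(self, server: str, vnodes: int = 100) -> None:
        # Each server owns many virtual points, which keeps the load balanced.
        for i in range(vnodes):
            self._ring.append((self._hash(f"{server}#{i}"), server))
        self._ring.sort()

    def server_for(self, key: str) -> str:
        # A key belongs to the first ring point clockwise from its hash.
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing([f"server-{i}" for i in range(4)])
print(ring.server_for("alice:profile"))   # each key maps to exactly one server
ring.add_server("server-4")               # scale out: the new server takes ~1/5 of the keys
```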

Consistency in ALPS Systems
The consistency offered by existing ALPS systems, such as Amazon's Dynamo, LinkedIn's Project Voldemort, and Facebook/Apache's Cassandra, is eventual consistency: writes to one data center will eventually appear at other data centers, and if all data centers have received the same set of writes, they will have the same values for all data. Eventual consistency does not say anything about the ordering of operations. This means that different data centers can reflect arbitrarily different sets of operations.
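A minimal illustration of this guarantee and its weakness. Per-key last-writer-wins merging is assumed here because it is a common choice in such systems, not because the slide prescribes it: both data centers converge once they hold the same set of writes, but a reader can observe them in an order that never existed at the originating data center.

```python
class EventuallyConsistentDC:
    """Each write carries a (timestamp, value). A data center applies writes in
    whatever order they arrive and keeps, per key, the highest-timestamped value
    (last-writer-wins). Illustrative sketch, not a specific system's protocol."""

    def __init__(self):
        self.data = {}   # key -> (timestamp, value)

    def apply(self, key, ts, value):
        if key not in self.data or ts > self.data[key][0]:
            self.data[key] = (ts, value)

    def values(self):
        return {k: v for k, (ts, v) in self.data.items()}

west, east = EventuallyConsistentDC(), EventuallyConsistentDC()
writes = [("post", 1, "I've lost my wedding ring."),
          ("comment", 2, "Whew, found it upstairs.")]

for w in writes:              # West Coast applies the writes in the order they were issued
    west.apply(*w)
for w in reversed(writes):    # East Coast receives the same writes out of order
    east.apply(*w)

# In between, an East Coast reader saw the comment without the post it refers to,
# but once both data centers hold the same set of writes the values agree:
print(west.values() == east.values())   # True
```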

Anomalies in Eventually Consistent Systems
Out-of-order arrival of writes leads to many potential anomalies in eventually consistent systems. Here are a few examples for a social network:
- Figure 2 shows that in the West Coast data center, Alice posts, she comments, and then Bob comments. In the East Coast data center, however, Alice's comment has not appeared, making Bob look less than kind.
- Figure 3 shows that in the West Coast data center, a grad student carefully deletes incriminating photos before accepting an advisor as a friend. Unfortunately, in the East Coast data center, the friend acceptance appears before the photo deletions, allowing the advisor to see the photos [3].

Anomaly 1
Figure 2 shows that in the West Coast data center, Alice posts, she comments, and then Bob comments. In the East Coast data center, however, Alice's comment has not appeared, making Bob look less than kind.

Anomaly 2

Anomalies in Eventually Consistent Systems (cont'd)
- Figure 4 shows that in the West Coast data center, Alice uploads photos, creates an album, and then adds the photos to the album, but in the East Coast data center, the operations appear out of order and her photos do not end up in the album.
- Finally, in figure 5, Cindy and Dave have $1,000 in their joint bank account. Concurrently, Dave withdraws $1,000 from the East Coast data center and Cindy withdraws $1,000 from the West Coast data center. Once both withdrawals propagate to each data center, their account is in a consistent state (-$1,000), but it is too late to prevent the mischievous couple from making off with their ill-gotten gains.

Anomaly 3

Anomaly 4

Better Consistency for ALPS Systems
Our research systems, COPS [11] and Eiger [12], have pushed on the properties that ALPS systems can provide. In particular, they provide causal consistency instead of eventual consistency, which prevents the first three anomalies. (The fourth anomaly, in figure 5, is unfortunately unavoidable in a system that accepts writes in every location and guarantees low latency.) In addition, they provide limited forms of read-only and write-only transactions that allow programmers to consistently read or write data spread across many different machines in a data center.

Causal Consistency
Causal consistency ensures that operations appear in the order the user intuitively expects. More precisely, it enforces a partial order over operations that agrees with the notion of potential causality. If operation A happens before operation B, then any data center that sees operation B must see operation A first.

Rules Defining Potential Causality
Three rules define potential causality [9]:
1. Thread of execution. If a and b are two operations in a single thread of execution, then a -> b if operation a happens before operation b.
2. Reads-from. If a is a write operation and b is a read operation that returns the value written by a, then a -> b.
3. Transitivity. For operations a, b, and c, if a -> b and b -> c, then a -> c.
Thus, the causal relationship between operations is the transitive closure of the first two rules. Causal consistency ensures that operations appear in an order that agrees with these rules.
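A minimal sketch of how these rules can be tracked on the client side. The names and structure are illustrative assumptions, not code from COPS or Eiger: each session remembers every operation it has observed, and each write carries that set as its causal dependencies.

```python
import itertools

_next_op_id = itertools.count(1)

class Session:
    """Sketch of client-side tracking of potential causality.
    `observed` holds the ids of every operation this session has seen, so a new
    write causally depends on all of them: rule 1 (thread of execution) adds the
    session's own earlier operations, rule 2 (reads-from) adds the writes it reads,
    and carrying each write's dependency set along with it gives rule 3 (transitivity)."""

    def __init__(self, store):
        self.store = store      # shared key -> (op_id, value, deps) map, standing in for the local data center
        self.observed = set()   # op ids this session has already seen

    def write(self, key, value):
        op_id = next(_next_op_id)
        deps = frozenset(self.observed)      # everything seen so far happens-before this write
        self.store[key] = (op_id, value, deps)
        self.observed.add(op_id)             # rule 1: later ops in this thread depend on it
        return op_id

    def read(self, key):
        op_id, value, deps = self.store[key]
        self.observed |= deps | {op_id}      # rule 2: reads-from (deps carried along gives rule 3)
        return value

store = {}
alice, bob = Session(store), Session(store)
alice.write("post", "I've lost my wedding ring.")
alice.write("comment1", "Whew, found it upstairs.")
bob.read("post"); bob.read("comment1")
bob.write("comment2", "I'm glad to hear that.")
print(store["comment2"][2])   # frozenset({1, 2}): Bob's comment depends on both of Alice's writes
```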

Regularity 1: No Missing Comments
In the West Coast data center, Alice posts, and then she and Bob comment:
  Op1 Alice: I've lost my wedding ring.
  Op2 Alice: Whew, found it upstairs.
  Op3 [Bob reads Alice's post and comment.]
  Op4 Bob: I'm glad to hear that.
Op1 -> Op2 by the thread-of-execution rule; Op2 -> Op3 by the reads-from rule; Op3 -> Op4 by the thread-of-execution rule. Only write operations are propagated and applied at other data centers, so the full causal ordering that is enforced is Op1 -> Op2 -> Op4.

Regularity 1 (cont'd)
Now, in the East Coast data center, operations can appear only in an order that agrees with causality. Thus:
  Op1 Alice: I've lost my wedding ring.
then
  Op2 Alice: Whew, found it upstairs.
  Op4 Bob: I'm glad to hear that.
but never the anomaly that makes Bob look unkind.
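One simplified way for a remote data center to enforce this ordering is to replicate each write together with its causal dependencies and delay applying it until those dependencies are locally visible. This is a sketch of the general dependency-check idea, not the exact COPS or Eiger protocol:

```python
class RemoteDataCenter:
    """Simplified sketch of causal replication at a receiving data center:
    apply a replicated write only after every write it causally depends on
    has been applied locally; otherwise buffer it."""

    def __init__(self):
        self.applied = set()    # op ids already applied here
        self.data = {}          # key -> value visible to local readers
        self.pending = []       # replicated writes waiting on missing dependencies

    def receive(self, op_id, key, value, deps):
        self.pending.append((op_id, key, value, deps))
        self._apply_ready()

    def _apply_ready(self):
        progress = True
        while progress:
            progress = False
            for write in list(self.pending):
                op_id, key, value, deps = write
                if deps <= self.applied:      # all causal predecessors are visible locally
                    self.data[key] = value
                    self.applied.add(op_id)
                    self.pending.remove(write)
                    progress = True

# Writes arrive out of order: Bob's comment (op 3, depends on ops 1 and 2) is buffered.
east = RemoteDataCenter()
east.receive(3, "comment2", "I'm glad to hear that.", {1, 2})
print("comment2" in east.data)   # False: readers never see Bob's comment without Alice's
east.receive(1, "post", "I've lost my wedding ring.", set())
east.receive(2, "comment1", "Whew, found it upstairs.", {1})
print("comment2" in east.data)   # True: all three writes are now visible, in causal order
```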

Regularity 2: No Leaked Photos
In the West Coast data center, a grad student carefully deletes incriminating photos before accepting an advisor as a friend:
  Op1 [Student deletes incriminating photos.]
  Op2 [Student accepts advisor as a friend.]
Op1 -> Op2 by the thread-of-execution rule. Now, in the East Coast data center, operations can appear only in an order that agrees with causality, which is:
  [Student deletes incriminating photos.]
then
  [Student accepts advisor as a friend.]
but never the anomaly that leaks photos to the student's advisor.

Regularity 3: Normal Photo Album
In the West Coast data center, Alice uploads photos and then adds them to her Summer 2013 album:
  Op1 [Alice uploads photos.]
  Op2 [Alice creates an album.]
  Op3 [Alice adds photos to the album.]
Op1 -> Op2 -> Op3 by the thread-of-execution rule. Now, in the East Coast data center, the operations can appear only in an order that agrees with causality: Op1, then Op2, then Op3, but never in a different order that results in an empty album or complicates what a programmer must think about.

Causal Consistency and Anomaly 4
Anomaly 4 represents the primary limitation of causal consistency: it cannot enforce global invariants. Anomaly 4 has an implicit global invariant (that bank accounts cannot go below $0) that is violated. This invariant cannot be enforced in an ALPS system. Availability dictates that operations must complete, and low latency ensures they complete faster than the time it takes to communicate between data centers. Thus, the operations must return before the data centers can communicate and discover the concurrent withdrawals.
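A tiny, purely illustrative sketch of why: each data center must decide using only the state it can see locally, so both withdrawals pass the local balance check, and by the time the writes replicate the account has already gone negative.

```python
class DataCenter:
    def __init__(self, balance):
        self.balance = balance          # local copy of the joint account

    def withdraw(self, amount):
        # Availability + low latency: decide using only local state, right now.
        if self.balance >= amount:
            self.balance -= amount
            return True
        return False

    def apply_remote(self, amount):
        # The other data center's withdrawal arrives only after the local decision.
        self.balance -= amount

west, east = DataCenter(1000), DataCenter(1000)
print(west.withdraw(1000), east.withdraw(1000))   # True True: both local checks pass
west.apply_remote(1000); east.apply_remote(1000)  # replication catches up afterwards
print(west.balance, east.balance)                 # -1000 -1000: the invariant is violated
```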

Global Invariants
True global invariants are quite rare, however. E-commerce sites, where it seems inventory cannot go below 0, have back-order systems in place to deal with exactly that scenario. Even some banks do not enforce global $0 invariants, as shown by a recent concurrent withdrawal attack on ATMs that extracted $40 million from only 12 account numbers [14].

END