Continuous Consistency and Availability Haifeng Yu CPS 212 Fall 2002.

2 Consistency in Replication
• Reasons for replication: better performance and availability
• Replication comes with a consistency cost:
  - Replication transforms client-server communication into server-server communication
  - This can decrease performance and decrease availability

3 Strong Consistency and Optimistic Consistency
• Traditionally, two choices for consistency level:
  - Strong consistency: strictly “in sync”
  - Optimistic consistency: no guarantee at all
• Associated tradeoffs with each model
(Figure: availability / performance / scalability versus consistency, with points labeled Optimistic Consistency and Strong Consistency)

4 Problems with Binary Choice
• Strong consistency incurs prohibitive overheads for many WAN apps
  - Replication may even decrease performance, availability and scalability relative to a single server!
• Optimistic consistency provides no consistency guarantee at all
  - Resulting in upset users: unbounded reservation conflicts
  - Potentially rendering the app unusable: traffic data that is more than 1 hour stale is probably of little use
• Applications cannot tune their consistency level based on their environment
  - Need to adapt to client, service and network characteristics

5 Continuous Consistency
• Consistency is continuous rather than binary for many WAN apps
  - These apps can benefit from exploiting the consistency spectrum between strong and optimistic consistency
(Figure: availability / performance / scalability versus consistency, with Continuous Consistency covering the range between Optimistic Consistency and Strong Consistency)

6 Quantifying Consistency
• Many ways:
  - Staleness (TTL in web caching)
  - Limit the number of locally buffered writes
(Figure labels: “Invalidate”; “buffered updates”; “To Other Replicas”)
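
The two bounds named on this slide can be sketched in a few lines. This is a minimal illustration, not the protocol from the lecture: `CacheEntry`, `BufferingReplica`, and the `push` callback are hypothetical names, and `push` is assumed to return True only when the other replicas acknowledged the batch.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CacheEntry:
    """Staleness bound: a cached value is usable for at most ttl seconds."""
    value: str
    fetched_at: float
    ttl: float

    def is_fresh(self, now: float) -> bool:
        return now - self.fetched_at <= self.ttl


@dataclass
class BufferingReplica:
    """Write bound: hold at most max_buffered unpropagated writes."""
    max_buffered: int
    # Hypothetical callback: sends a batch to the other replicas and
    # returns True if they acknowledged it.
    push: Callable[[List[str]], bool]
    buffer: List[str] = field(default_factory=list)

    def write(self, update: str) -> bool:
        if len(self.buffer) >= self.max_buffered:
            # Bound reached: flush before accepting anything more.
            if not self.push(list(self.buffer)):
                return False  # peers unreachable, reject to keep the bound
            self.buffer.clear()
        self.buffer.append(update)
        return True
```

With `max_buffered=0` the replica must reach its peers on every write, which behaves like strong consistency; a very large bound approaches optimistic consistency.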

7 Applications?
• Applications:
  - Web caching
  - Airline reservation
  - Distributed games
  - Shared editor
• Non-applications:
  - Some scientific computing problems
  - Banking system
  - Any application that has binary output
• The application’s nature determines whether continuous consistency is applicable

8 Trading Consistency for Performance
• Airline reservation: running at Berkeley, Utah, Duke
(Figure with points labeled Strong Consistency and Optimistic Consistency; source: [Yu’02, TOCS])

9 The Cost of Increased Performance
• Increased performance comes with a cost
  - Adaptively trade consistency for performance based on client, network, and service conditions

10 Model vs. Protocol
• A continuous consistency model is a spec
• A protocol is anything that can enforce the spec
  - Corollary: a strong consistency protocol is a protocol for any model
• There are many protocols for a specific model; some are good, others are not

11 Designing a Continuous Consistency Model
• A model is a spec, thus quantifying consistency (in a bad way) is trivial
• Only the application knows its own definition of consistency
  - Airline reservation vs. distributed games
• What is a “good” continuous consistency model?
  - Can be used by diverse apps
  - Practical

12 Distributed Consensus and Leader Election
• What does “continuous consistency” mean?
  - Allow at most k decision values
  - Allow at most k leaders
• Helps overcome some impossibilities
  - A unique decision value requires a ½ majority
  - k decision values allow any partition with 1/(k + 1) of the nodes to decide
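
The 1/(k + 1) threshold can be checked with a short counting argument. The sketch below assumes n nodes split into disjoint partitions and reads the threshold as “strictly more than n/(k + 1) nodes”; the symbols n, m and P_i are introduced here for illustration and do not appear on the slide.

```latex
\begin{align*}
&\text{Suppose } m \text{ disjoint partitions } P_1,\dots,P_m
  \text{ each decide, with } |P_i| > \frac{n}{k+1}. \\
&\text{Then } n \;\ge\; \sum_{i=1}^{m} |P_i| \;>\; \frac{m\,n}{k+1}
  \;\Longrightarrow\; m < k+1 \;\Longrightarrow\; m \le k. \\
&\text{For } k = 1 \text{ this reduces to the usual majority rule: } |P_i| > \frac{n}{2}.
\end{align*}
```

So at most k partitions can decide, which gives at most k decision values (or k leaders) if each deciding partition picks one.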

13 Group Membership Service
• Def: keep track of which nodes belong to which group
• Traditionally, group membership maintains only a single group
  - Primary-partition membership services
  - Corresponds to strong consistency
• Recently, partitionable membership services
  - Still an active area of research
  - Corresponds to optimistic consistency
• Continuous consistency: allow at most k groups
  - Again, helps overcome the ½ majority limitation

14 Continuous Consistency Summary
• WAN replication needs dynamically tunable consistency
• Tradeoff between consistency and performance
• How to design a continuous consistency model
• Continuous consistency in other contexts
• Next: Availability

15 What is Availability?
• No well-accepted availability metric for Internet services
• The “uptime” metric can be misleading for Internet services
  - A server may be inaccessible because of a network partition
• Available: “present or ready for immediate use” (from Webster’s Collegiate Dictionary)
  - What does “immediate” mean? Time-out
• Availability = (accepted accesses) / (submitted accesses)
  - Implicit time-out in the definition
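
As a small illustration of the ratio above with its implicit time-out made explicit, the sketch below counts an access as accepted only if a response arrived within the time-out. The function name, the `latencies` representation (None meaning no response at all), and the choice of time-out value are assumptions for this example.

```python
from typing import Optional, Sequence


def availability(latencies: Sequence[Optional[float]], timeout: float) -> float:
    """Return (accepted accesses) / (submitted accesses).

    latencies holds one entry per submitted access: the response time in
    seconds, or None if the access was rejected or never answered.
    """
    if not latencies:
        raise ValueError("no accesses submitted")
    accepted = sum(1 for t in latencies if t is not None and t <= timeout)
    return accepted / len(latencies)
```

For instance, with latencies `[0.1, 0.5, None, 3.0]` and a 1-second time-out, two of the four accesses count as accepted, so the availability is 0.5.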

16 Perform-ability
• User satisfaction is not binary
  - What if a partial result is returned before the time-out?
  - What if the result is sent back after an hour, or a day?
  - Availability is related to performance
• Performability = reward function (quality and timeliness of result)
• Determining the reward function is hard!

17 Availability of an Internet Service
• We use user-observed availability in our study:
  - Availability = (accepted accesses) / (submitted accesses)
(Figure: a client accessing a server; failure between client and server marked × 2% [Chandra et al., USITS’01]; rejection due to server failure marked × 0.1% [MS press release, Jan’01])

18 Effects of Replication
• Consistency may force a replica to reject an otherwise acceptable request
(Figure: a client accessing a replica. Network failure rate: < 2%. Replica rejection rate: > 0.1%, since a replica also rejects when the communication needed to maintain consistency has failed)

19 Limitations of Strong Consistency
(Figure: replicas and clients, split into two sides)

            One side         Other side
Option 1    accept reads     accept reads
            reject writes    reject writes
Option 2    accept reads     reject reads
            accept writes    reject writes

20 Effects of Continuous Consistency
(Allow each replica to buffer 5 writes)

                One side                  Other side
Option 1        accept reads              accept reads
                reject writes             reject writes
New Option 1    accept reads              accept reads
                accept first 10 writes    accept first 5 writes

21 Effects of Continuous Consistency
(Allow each replica to buffer 5 writes)

                One side         Other side
Option 2        accept reads     reject reads
                accept writes    reject writes
New Option 2    accept reads     accept first few reads
                accept writes    accept first 5 writes

22 Consistency Impact is Inherent
(Figure: availability versus inconsistency, with a curve labeled Hard Bound and end points labeled “0% Consistency, 100% Availability” and “100% Consistency”)
• A hard bound always exists
• We always know the two end points, but may not know the exact shape of the curve

23 Effects of Consistency Protocol
• Achieved availability also depends on the protocol
  - Design better protocols
  - Job of system designers
(Figure: availability versus inconsistency, with curves labeled Upper Bound, Protocol A and Protocol B)

24 Availability Optimizations
• Techniques should not be tied to a model
• Focus on two techniques:
  - Retiring replicas
  - Aggressive write propagation

25 Limitations of Strong Consistency
(Figure: replicas and clients, split into two sides)

            One side         Other side
Option 1    accept reads     accept reads
            reject writes    reject writes
Option 2    accept reads     reject reads
            accept writes    reject writes

26 Retiring Replicas
• Obviously, such a decision may not be optimal unless we have future knowledge
  - Importance of prediction
• Even with future knowledge, it is hard
• In option 2, all replicas must reach an agreement
  - Leader election
  - We are experiencing partitions
  - One option: voting
  - What if we don’t have a majority?

27 Aggressive Write Propagation
• Applicable to continuous consistency
• Continuous consistency gives us “buffers” that can be utilized in case of network partition
• Keep the buffer empty:
  - We cannot predict the occurrence of network partitions
  - Propagate writes more aggressively
  - Cut down the amount of inconsistency accumulated in times of good connectivity

28 Effects of Aggressive Propagation
• Baseline: propagate writes only when necessary (lazily)
• Aggressive: propagate when necessary and also every 3 seconds
(Figure: results for 8 replicas with measured faultload; from [Yu’01, SOSP])
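
The difference between the lazy baseline and aggressive propagation can be seen on a toy timeline. The sketch below is a single-replica simulation with made-up inputs, not the measured faultload from [Yu’01, SOSP]; the `simulate` function, its parameters, and the one-partition model are assumptions for illustration.

```python
from typing import Optional, Sequence, Tuple


def simulate(write_times: Sequence[int], partition: Tuple[int, int],
             bound: int, flush_period: Optional[int] = None) -> int:
    """Return how many writes a replica accepts.

    write_times: sorted arrival times (seconds), one write each.
    partition: (start, end); peers are unreachable for start <= t < end.
    bound: maximum number of unpropagated writes the replica may buffer.
    flush_period: None for lazy propagation; otherwise the replica also
        propagates every flush_period seconds while connected.
    """
    start, end = partition

    def connected(t: int) -> bool:
        return not (start <= t < end)

    buffered = 0
    accepted = 0
    next_flush = flush_period
    for t in write_times:
        # Apply periodic flushes that came due before this write.
        while next_flush is not None and next_flush <= t:
            if connected(next_flush):
                buffered = 0
            next_flush += flush_period
        if buffered >= bound:
            if not connected(t):
                continue  # bound reached and peers unreachable: reject
            buffered = 0  # lazy propagation, done only when necessary
        buffered += 1
        accepted += 1
    return accepted
```

With one write per second for t = 1..20, a partition during [10, 16) and a bound of 5 buffered writes, the lazy replica enters the partition with a nearly full buffer and accepts 15 of 20 writes, while flushing every 3 seconds leaves more headroom and accepts 18 of 20.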

29 More Aggressive Propagation
• Aggressive write propagation does not work in all cases
• Availability optimizations can incur more communication
  - Best availability is achieved when we use a strong consistency protocol
• Speaks of availability / performance tradeoffs

30 Availability of Other Systems
• Consensus and leader election
  - Blocks without a majority
• Group membership
  - Blocks without a majority
• Relaxing consistency enables them to make progress
  - Open question: but will these systems still be useful?

31 Availability Summary
• Availability definition
• Inherent impact of consistency on availability
• Availability also depends on consistency protocols
• Availability optimizations:
  - Replica retirement
  - Aggressive write propagation

32 Why can we easily approach the upper bound?
• Simple protocols in our study can approach the upper bound closely
  - Remember that reaching the upper bound in general needs future knowledge
• Related to the characteristics of the faultloads we measured and simulated
  - Most partitions are singleton partitions
  - Most transitions are: fully-connected → singleton partition → fully-connected
• These characteristics are consistent with the Internet’s hierarchical architecture

33 Dual Effects of Replication Scale on Availability
• Consistency may force a replica to reject a request
• Adding more replicas:
  - Network failure rate decreases
  - Replica rejection rate increases
• Availability = (1 - Network Failure Rate) × (1 - Rejection Rate)
  - A replication scale that is too large or too small can hurt availability

34 Optimal Replication Scale
• Optimal replication scale: adding more replicas can hurt!
  - The increase in “replica rejection rate” outweighs the decrease in “network failure rate”
• The optimal replication scale depends on:
  - Consistency level
  - Network failure rate among replicas
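
The product formula from the previous slide makes the tradeoff easy to explore numerically. The sketch below is illustrative only: `best_scale` and the `example_rates` table are hypothetical, and the rates in the table are made-up numbers chosen to show the shape of the tradeoff, not measurements.

```python
from typing import Dict, Tuple


def availability(network_failure_rate: float, rejection_rate: float) -> float:
    """Availability = (1 - network failure rate) * (1 - rejection rate)."""
    return (1.0 - network_failure_rate) * (1.0 - rejection_rate)


def best_scale(rates_by_scale: Dict[int, Tuple[float, float]]) -> int:
    """Return the replica count whose rates give the highest availability.

    rates_by_scale maps a replica count to a pair
    (network failure rate, replica rejection rate).
    """
    return max(rates_by_scale, key=lambda n: availability(*rates_by_scale[n]))


# Made-up illustrative rates, NOT measurements:
# more replicas lower the network failure rate but raise the rejection rate.
example_rates = {
    1: (0.020, 0.000),
    2: (0.010, 0.002),
    4: (0.005, 0.010),
    8: (0.003, 0.030),
}
```

With these made-up numbers `best_scale(example_rates)` returns 2: going from one replica to two helps, but at four and eight replicas the growth in rejections outweighs the drop in network failures.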
