1 The Chubby Lock Service for Loosely-coupled Distributed Systems
Mike Burrows @ OSDI '06
Mosharaf Chowdhury

2 Problem
(Un)Locking-as-a-service
– Leader election
– Synchronization
– …
– …
Uses Paxos to solve the distributed consensus problem in an asynchronous environment
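Leader election on top of a lock service can be sketched as follows: every candidate tries to take the same exclusive lock, and whoever succeeds writes its identity into the lock file. This is a minimal in-process sketch, not Chubby's API; `LockFile`, `try_acquire`, and `elect` are hypothetical names introduced for illustration, and a local mutex stands in for the replicated service.

```python
import threading


class LockFile:
    """Hypothetical stand-in for a lock-service file: one exclusive lock plus small contents."""

    def __init__(self):
        self._mutex = threading.Lock()  # protects holder; stands in for the real service
        self.holder = None
        self.contents = None

    def try_acquire(self, client_id):
        """Non-blocking attempt to take the exclusive lock."""
        with self._mutex:
            if self.holder is None:
                self.holder = client_id
                return True
            return False

    def release(self, client_id):
        """Release the lock if (and only if) client_id holds it."""
        with self._mutex:
            if self.holder == client_id:
                self.holder = None


def elect(lock_file, candidates):
    """Each candidate tries the lock; the winner advertises itself in the file."""
    for candidate in candidates:
        if lock_file.try_acquire(candidate):
            lock_file.contents = candidate
    return lock_file.contents
```

Other processes would then read the file contents to find the current leader rather than running their own consensus round.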

3 Overview
Goals/Non-goals
Primary
– Availability
– Reliability
– Usability & deployability
Secondary
– Performance
Non-goals
– Storage capacity
Use Cases
Planned usage by
– GFS,
– BigTable, and
– MegaStore
Also heavily used as
– Internal name service
– MapReduce rendezvous point

4 System Structure
[Diagram: proxy server, replicas (R), and master (M)]
Reads are satisfied by the master alone
Writes are acknowledged after updating a majority
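The read and write rules on this slide can be illustrated with a toy model. It is a sketch under stated assumptions: the cell size of five, the `Replica` and `Cell` classes, and the rule that a write is refused outright when no majority is reachable are all illustrative choices, and the real service runs Paxos rather than this simple reachability check.

```python
def majority(n):
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


class Replica:
    """Hypothetical replica: a local key-value store that may be unreachable."""

    def __init__(self):
        self.up = True
        self.store = {}


class Cell:
    """Toy cell; replicas[0] plays the master (assumed fixed here, no election)."""

    def __init__(self, n_replicas=5):  # cell size is an assumption for the example
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.master = self.replicas[0]

    def write(self, key, value):
        """Acknowledge the write only if a majority of replicas can be updated."""
        reachable = [r for r in self.replicas if r.up]
        if len(reachable) < majority(len(self.replicas)):
            return False  # no quorum: nothing is applied
        for replica in reachable:
            replica.store[key] = value
        return True

    def read(self, key):
        """Reads are served from the master's copy alone."""
        return self.master.store.get(key)
```

With five replicas the cell keeps accepting writes with two replicas down and refuses them once a third is lost.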

5 How to Use it?
UNIX file-system-like interface
– API modeled in a similar way
Read/write locks on each file/directory
– Advisory locks
– Coarse-grained
Event notification
– After corresponding action
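A small sketch may help make "advisory" and "after corresponding action" concrete. The `Node` class, its method names, and the event string below are hypothetical and are not Chubby's API. The two points it illustrates are that locks only conflict with other locks (a write is not rejected because the writer holds no lock) and that subscribers are notified once the change has taken place.

```python
class Node:
    """Hypothetical file-like node with an advisory reader/writer lock and events."""

    def __init__(self, path):
        self.path = path
        self.contents = b""
        self.writer = None      # client holding the lock in exclusive (write) mode
        self.readers = set()    # clients holding the lock in shared (read) mode
        self.subscribers = []   # callbacks invoked after a change

    def acquire(self, client, exclusive):
        """Non-blocking acquire; returns True on success."""
        if exclusive:
            if self.writer is None and not self.readers:
                self.writer = client
                return True
            return False
        if self.writer is None:
            self.readers.add(client)
            return True
        return False

    def release(self, client):
        if self.writer == client:
            self.writer = None
        self.readers.discard(client)

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def set_contents(self, client, data):
        # Advisory locking: the write goes through even if `client` holds no lock.
        self.contents = data
        # Notify only after the modification has been applied.
        for callback in self.subscribers:
            callback(self.path, "contents-modified")
```

Cooperating clients are expected to take the lock before touching the file; nothing in the node enforces it.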

6 Cogs
KeepAlives
– Piggybacks event notifications, cache invalidations etc.
Leases/TimeOuts
– Master and client-side local leases
– Failover handling
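The interplay of KeepAlives and leases can be sketched with an injected clock. Everything here is illustrative: `Session`, `ClientCache`, the reply dictionary, and the 12-second lease used in the usage example are assumptions made for the sketch, and failover handling is not modeled.

```python
class Session:
    """Hypothetical master-side session state for one client."""

    def __init__(self, lease_seconds, now):
        self.lease_seconds = lease_seconds
        self.expires_at = now + lease_seconds
        self.pending = []  # events and cache invalidations awaiting the next reply

    def queue(self, item):
        self.pending.append(item)

    def keep_alive(self, now):
        """Extend the lease and piggyback queued items on the reply."""
        if now >= self.expires_at:
            return None  # lease already expired; the session is gone
        self.expires_at = now + self.lease_seconds
        piggybacked, self.pending = self.pending, []
        return {"lease_expires_at": self.expires_at, "piggybacked": piggybacked}


class ClientCache:
    """Hypothetical client-side cache that honors piggybacked invalidations."""

    def __init__(self):
        self.entries = {}

    def handle_reply(self, reply):
        for kind, path in reply["piggybacked"]:
            if kind == "invalidate":
                self.entries.pop(path, None)
```

A usage example under the same assumptions:

```python
session = Session(lease_seconds=12, now=0)
cache = ClientCache()
cache.entries["/a"] = b"1"
session.queue(("invalidate", "/a"))
reply = session.keep_alive(now=5)   # lease now runs until t=17
cache.handle_reply(reply)           # "/a" is dropped from the cache
```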

7 Developers are … Human
Our developers sometimes do not plan for high availability in the way one would wish.
A lock-based interface is more familiar to our programmers.
Our developers are confused by non-intuitive caching semantics, so we prefer consistent caching.
However, mistakes, misunderstandings, and the differing expectations of our developers lead to effects that are similar to attacks.
Despite attempts at education, our developers regularly write loops that retry indefinitely when a file is not present, or poll a file by opening it and closing it repeatedly when one might expect they would open the file just once.
Developers also fail to appreciate the difference between a service being up, and that service being available to their applications.

8 Hit or Miss?
Scalable
Scales to 90,000 clients
– Can be scaled further using proxies and partitioning
Available
61 outages in total
– 52 under 30s (almost no impact)
– 1 outage due to overload
Reliable
Data loss on 6 occasions
– 4 due to software error (fixed)
– 2 due to operators

10 More Numbers
Naming-related events
– 60% of file opens
– 46% of stored files
About 10 clients use each cached file, out of 230k cached-file entries
– 14% of the cache consists of negative caches for names
KeepAlives account for 93% of traffic
– Uses UDP instead of TCP

