EECS 498 Introduction to Distributed Systems Fall 2017


EECS 498 Introduction to Distributed Systems Fall 2017 Harsha V. Madhyastha

Primary Backup Replication
[Diagram: view service, client, primary, and three backups]
September 27, 2017 EECS 498 – Lecture 7

View service
- Monitors the primary and backups to detect when to change the view
- Can change the view only after the primary has ACKed the current view
- Primary ACKs only after syncing with backups
- Clients cache the view for scalability
- To address split brain, the primary must check with the backup before serving a client
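
The ACK rule above can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference implementation: the class name `ViewService`, the methods `ack` and `report_failure`, and the `spare` parameter are all hypothetical, and the sketch assumes a single backup per view.

```python
class ViewService:
    """Toy view service: tracks the current view and whether the primary ACKed it."""

    def __init__(self, primary, backup):
        self.view_num = 1
        self.primary = primary
        self.backup = backup
        self.acked = False  # has the primary ACKed view_num?

    def ack(self, server, view_num):
        # Only the current primary's ACK of the current view counts.
        # (The primary is expected to ACK only after syncing with its backup.)
        if server == self.primary and view_num == self.view_num:
            self.acked = True

    def report_failure(self, failed, spare=None):
        """Try to move to a new view; return True if the view changed."""
        if not self.acked:
            return False  # primary has not ACKed this view: no change allowed
        if failed == self.primary:
            if self.backup is None:
                return False  # no synced backup to promote
            self.primary, self.backup = self.backup, spare
        elif failed == self.backup:
            self.backup = spare
        else:
            return False  # failed server is not part of the view
        self.view_num += 1
        self.acked = False  # the new view must be ACKed afresh
        return True
```

Blocking the view change until the ACK arrives is what keeps a backup that has not yet been synced from being promoted.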

Replicating Bank Database
- One copy in SF (primary), one in NY (backup)
[Diagram: both copies hold $1,000; requests “Deposit $100” and “Pay 1% interest”]

Primary-Backup Sync
[Diagram: clients C1 and C2 issue “Deposit $100” and “Pay 1% interest”; starting from $1,000, primary P ends at $1,111 and backup B ends at $1,110]

Ordering of Updates
- All updates must be applied in the same order at all replicas
- External view: Total ordering of writes
- Primary effectively serializes all writes
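
A short worked example shows why the order matters, using the bank operations from the earlier slides. The function names are hypothetical and balances are kept in whole dollars for simplicity.

```python
def deposit(balance):
    """“Deposit $100”."""
    return balance + 100


def pay_interest(balance):
    """“Pay 1% interest” (integer arithmetic on whole dollars)."""
    return balance * 101 // 100


def apply_all(balance, ops):
    """Apply the operations to the balance in the given order."""
    for op in ops:
        balance = op(balance)
    return balance


# Replica that sees the deposit first: (1000 + 100) * 1.01
deposit_first = apply_all(1000, [deposit, pay_interest])

# Replica that sees the interest payment first: 1000 * 1.01 + 100
interest_first = apply_all(1000, [pay_interest, deposit])
```

The two replicas diverge ($1,111 versus $1,110) even though both applied the same set of updates, which is why the primary has to impose a single order.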

Serving Reads
- Can backups serve reads?
  - Assume no split brain
- What if primary’s state is ahead of backup?
  - Updates to primary not yet externally visible
  - Effect of read equivalent to if primary fails at this point
- What if backup’s state is ahead of primary?
  - Different backups may not be in sync
  - Primary may get replaced before it applies update

Reads: Primary vs. Backup
[Diagram: “Deposit $100” in flight among primary P and backups B1 and B2; balances $1000 and $1100 shown; client C2 issues a read]

Desired Properties
- All writes are totally ordered
- Once read returns particular value, all later reads should return that value or value of later write
- Once a write completes, all later reads should return value of that write or value of later write

Reads relative to Writes
[Diagram: client C1 issues “Pay 1% interest” and “Deposit $100” to primary P and backup B; balances $1100 and $1111 shown; client C2 issues a read]

Linearizability
- Total ordering of writes
- Read returns last completed write
- Single copy semantics
  - Externally visible effects of writes and reads are equivalent to if there existed a single copy
  - Users oblivious to replication

Consistency Spectrum
- Consistency: What are the properties of externally visible effects?
[Diagram: spectrum of consistency models (labels: Read-after-write, Eventual, Causal, Sequential, Linearizability), with axes “Consistency” and “Ease of programming”]

Why weaken consistency?
- Shouldn’t we always strive for single copy semantics?
- Comes at the expense of lower performance
- Latency vs. consistency tradeoff

Consistency Spectrum
[Diagram: spectrum of consistency models (labels: Read-after-write, Eventual, Causal, Sequential, Linearizability), with axes “Consistency”, “Ease of programming”, and “Latency”]

Causal Consistency
- Order of causally related writes must be preserved in values returned to reads
  - If W1 → W2, then if a read sees effect of W2, it must see effect of W1
- Example: Facebook News Feed
  - Okay to not see all completed posts
  - But, if you see a comment, you must see the post on which the comment is made
- Main utility: Lazy sync between replicas
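
One way to picture the rule is a replica that buffers a write until every write it depends on has been applied. The sketch below is an assumption-laden illustration: the class `CausalReplica`, its methods, and the explicit `deps` list are hypothetical, and real systems track dependencies differently.

```python
class CausalReplica:
    """Toy replica that makes a write visible only after its dependencies."""

    def __init__(self):
        self.applied = {}   # write id -> value, visible to reads
        self.pending = []   # writes still waiting on a dependency

    def receive(self, write_id, value, deps=()):
        """Accept a lazily propagated write that depends on the ids in deps."""
        self.pending.append((write_id, value, tuple(deps)))
        self._drain()

    def _drain(self):
        # Keep applying pending writes until no more become eligible.
        progress = True
        while progress:
            progress = False
            for write in list(self.pending):
                write_id, value, deps = write
                if all(d in self.applied for d in deps):
                    self.applied[write_id] = value
                    self.pending.remove(write)
                    progress = True

    def visible(self):
        """Write ids currently visible to reads, in the order applied."""
        return list(self.applied)
```

If the comment arrives before its post, the replica shows neither; once the post arrives, both become visible with the post first.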

Linearizability with Locks
[Diagram: client, lock service, and Replicas 1, 2, and 3]
- Problems? Client failures!

Lease
- Lock with timeout
- If lease holder fails, not a problem because lease will expire
- How to pick lease timeout value?
  - Short timeout → Client needs to renew lease
  - Long timeout → Unnecessarily block operations
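
A lease can be sketched as a lock whose grant carries an expiry time. The names below (`LeaseService`, `acquire`, the injected `clock` callable) are hypothetical, and the sketch treats a repeat `acquire` by the current holder as a renewal.

```python
class LeaseService:
    """Toy lease: a lock that expires `timeout` seconds after it is granted."""

    def __init__(self, timeout, clock):
        self.timeout = timeout
        self.clock = clock        # callable returning the current time in seconds
        self.holder = None
        self.expires_at = 0.0

    def acquire(self, client):
        """Grant or renew the lease; return True if `client` now holds it."""
        now = self.clock()
        free = self.holder is None or now >= self.expires_at
        if free or self.holder == client:
            self.holder = client
            self.expires_at = now + self.timeout
            return True
        return False              # someone else holds an unexpired lease
```

With a short `timeout` the holder must call `acquire` often to stay ahead of expiry; with a long one, other clients stay blocked for the full timeout after the holder fails.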

Discrepancy in Lease Validity
[Diagram: client, lease service, and Replicas 1, 2, and 3]
- Scenario in which lease server and client differ about lease validity?

Discrepancy in Lease Validity
- Message that grants lease may have high delay
- Clocks at lease holder and lease service may be skewed relative to each other
- How to account for potential discrepancy?

Discrepancy in Lease Validity
[Diagram: client, lease service, and Replicas 1, 2, and 3]
- Replica must check with lease service to confirm lease validity
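
The check can be sketched as follows, assuming the lease service's own clock is the only authority on validity. All names here (`LeaseService`, `Replica`, `grant`, `is_valid`, `write`) are hypothetical.

```python
class LeaseService:
    """Toy lease service; validity is judged by this service's clock only."""

    def __init__(self, clock):
        self.clock = clock        # callable returning the service's current time
        self.expiry = {}          # client -> expiry time on the service's clock

    def grant(self, client, timeout):
        self.expiry[client] = self.clock() + timeout

    def is_valid(self, client):
        return self.clock() < self.expiry.get(client, float("-inf"))


class Replica:
    """Toy replica that confirms the lease before serving a write."""

    def __init__(self, lease_service):
        self.lease_service = lease_service
        self.value = None

    def write(self, client, value):
        # The client may believe its lease is still valid (delayed grant
        # message, skewed clock); the replica trusts only the lease service.
        if not self.lease_service.is_valid(client):
            return False
        self.value = value
        return True
```

The cost of this design is an extra round trip to the lease service on the operations that need the check.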

Case study: GFS
- Google File System
- Distributed storage system tailored to Google’s workload
- Workload characteristics and setting:
  - Multi-GB files
  - Files are mostly appended to
  - Failures are extremely common

High-level Design
- Files are split into 64 MB chunks
- Every chunk is replicated on three randomly selected machines
- A central chunkmaster server picks, and keeps track of, where every replica of every chunk is stored
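
The chunking and placement rules can be sketched with a little arithmetic. The function names are hypothetical, and uniform sampling without replacement is an assumption about what "randomly selected" means here.

```python
import random

CHUNK_SIZE = 64 * 1024 * 1024   # 64 MB chunks
NUM_REPLICAS = 3                # every chunk is stored on three machines


def chunk_index(offset):
    """Index of the chunk that holds the given byte offset of a file."""
    return offset // CHUNK_SIZE


def chunks_for_range(offset, length):
    """Indices of all chunks touched by `length` bytes starting at `offset`."""
    if length <= 0:
        return []
    first = chunk_index(offset)
    last = chunk_index(offset + length - 1)
    return list(range(first, last + 1))


def place_chunk(machines, rng=random):
    """Pick three distinct machines for a new chunk, uniformly at random."""
    return rng.sample(machines, NUM_REPLICAS)
```

For example, a 2-byte access that starts at the last byte of chunk 0 touches chunks 0 and 1, so a client would need the replica locations of both.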

GFS Overview

Replication in GFS
[Diagram: client, chunkmaster, primary, and two backups]

Replication in GFS
[Diagram: client, chunkmaster, primary, and two backups]
- Challenge introduced due to large writes: High latency when writing to distant primary
- How to optimize write performance?

Data flow vs. Control flow

Handling Server Failures
- Chunkmaster grants 60 sec lease to primary
  - Utility of lease?
  - What if lease expires in the midst of write?
- New version number upon lease renewal
  - Replicas locally log version number
  - Helps detect stale replicas
- Store checksums to detect corrupted data
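
The version-number and checksum mechanisms can be sketched together. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, CRC32 is chosen only because it is in the Python standard library, and one checksum covers a whole chunk payload.

```python
import zlib


class ChunkReplica:
    """Toy chunk server: logs chunk versions and checksums stored data."""

    def __init__(self):
        self.versions = {}   # chunk id -> last version number logged locally
        self.data = {}       # chunk id -> (payload, checksum)

    def log_version(self, chunk_id, version):
        self.versions[chunk_id] = version

    def store(self, chunk_id, payload):
        self.data[chunk_id] = (payload, zlib.crc32(payload))

    def read(self, chunk_id):
        payload, checksum = self.data[chunk_id]
        if zlib.crc32(payload) != checksum:
            raise IOError("chunk %s is corrupted" % chunk_id)
        return payload


class Chunkmaster:
    """Toy master: bumps a chunk's version each time its lease is renewed."""

    def __init__(self):
        self.versions = {}   # chunk id -> current version number

    def renew_lease(self, chunk_id, reachable_replicas):
        version = self.versions.get(chunk_id, 0) + 1
        self.versions[chunk_id] = version
        for replica in reachable_replicas:
            replica.log_version(chunk_id, version)
        return version

    def is_stale(self, chunk_id, replica):
        # A replica that missed a renewal has logged an older version.
        return replica.versions.get(chunk_id, 0) < self.versions.get(chunk_id, 0)
```

A replica that was unreachable during a renewal keeps its old version number, so the chunkmaster can tell it may have missed writes once it comes back.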

Handling Master Failures
- Replicate chunkmaster
  - Any update to state logged to local disk and propagated to replicas
- Shadow masters only serve reads
  - Potentially out of date
- What if all replicas of master wiped out?
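
Logging every update and propagating it to replicas means a replacement master can rebuild its state by replaying the log. The sketch below assumes a JSON-encoded log held in memory as a stand-in for the local disk; the `Master` class and its methods are hypothetical.

```python
import json


class Master:
    """Toy chunkmaster whose state is rebuilt by replaying an operation log."""

    def __init__(self):
        self.state = {}   # e.g. chunk id -> list of machines holding replicas
        self.log = []     # in-memory stand-in for the on-disk operation log

    def update(self, key, value, replicas=()):
        record = json.dumps({"key": key, "value": value})
        self.log.append(record)          # log locally before applying
        for replica in replicas:
            replica.log.append(record)   # propagate the record to replicas
        self.state[key] = value

    @classmethod
    def recover(cls, log):
        """Build a master by replaying log records in order."""
        master = cls()
        for record in log:
            entry = json.loads(record)
            master.state[entry["key"]] = entry["value"]
            master.log.append(record)
        return master
```

In this sketch a shadow holds the log records but applies them only on `recover`, which mirrors why a read served by a shadow master can be out of date.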

GFS Performance Benchmark