Spanner
Lixin Shi, 6.033 Quiz 2 Review
(Some slides from Spanner's OSDI presentation)


Key Points to Remember
Major claims:
– Globally distributed data with configurable control
– Consistent commit timestamps
Consistency model: external consistency
TrueTime API
Set of supported read/write operations:
– Paxos write
– Relaxed reads with or without a timestamp

Globally Distributed Data
– Cross-datacenter distribution
– User configurable

Transaction Model
– Two-phase locking with start/commit
– Transactional writes and lock-free reads
– Globally sortable timestamp with each commit
– Bounded error between timestamp and wall-clock time
(figure: a transaction T1 runs from start to commit on a time axis and is assigned a commit timestamp s1)
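To make the locking discipline concrete, here is a minimal strict two-phase-locking sketch in Python. It is illustrative only (no deadlock handling, and not Spanner's code); LockTable, Transaction, and pick_timestamp are hypothetical names. The point it shows: locks are only acquired while the transaction runs and are released only at commit, after the timestamp is chosen.

    import threading

    class LockTable:
        """Shared lock table for a toy strict two-phase-locking scheme."""
        def __init__(self):
            self._locks = {}                 # key -> threading.Lock
            self._guard = threading.Lock()   # protects the dict itself

        def lock_for(self, key):
            with self._guard:
                return self._locks.setdefault(key, threading.Lock())

    class Transaction:
        def __init__(self, table):
            self.table = table
            self.held = []                   # locks taken in the growing phase

        def access(self, key):
            lock = self.table.lock_for(key)
            lock.acquire()                   # growing phase: only ever acquire
            self.held.append(lock)

        def commit(self, pick_timestamp):
            s = pick_timestamp()             # timestamp chosen while locks are held
            for lock in self.held:           # shrinking phase: release at commit
                lock.release()
            self.held.clear()
            return s                         # no lock is acquired after a release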

External Consistency
– If a transaction T1 commits before another transaction T2 starts, then T1's commit timestamp is smaller than T2's.
– A real-world approximation of global wall-clock time consistency.
(figures: two timelines; whenever T1 commits before T2 starts, s1 < s2 must hold, and a schedule violating this is marked invalid)

TrueTime API
– "Global wall-clock time" with bounded uncertainty.
– TT.now() returns an interval [earliest, latest] of width 2ε that is guaranteed to contain the absolute current time.
(figure: the TT.now() interval drawn on a time axis, from earliest to latest, 2ε wide)
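A toy stand-in for the shape of the API, assuming a fixed ε purely for illustration (the real TrueTime derives its bound from GPS and atomic-clock time masters; TrueTime and TTInterval here are made-up names, not Google's implementation):

    import time
    from dataclasses import dataclass

    @dataclass
    class TTInterval:
        earliest: float   # absolute time is guaranteed to be >= earliest
        latest: float     # ...and <= latest

    class TrueTime:
        def __init__(self, epsilon: float = 0.004):   # assumed ~4 ms uncertainty
            self.epsilon = epsilon

        def now(self) -> TTInterval:
            t = time.time()   # local clock, assumed within epsilon of true time
            return TTInterval(t - self.epsilon, t + self.epsilon)

    tt = TrueTime()
    iv = tt.now()
    assert abs((iv.latest - iv.earliest) - 2 * tt.epsilon) < 1e-9   # width is 2*epsilon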

Commit Wait
– While holding locks, pick s = TT.now().latest.
– Commit wait: keep holding the locks until TT.now().earliest > s, then release them.
– What this gives you: absolute start time < s < absolute commit time.
(figure: timeline of a transaction T from lock acquisition to lock release; picking s and the commit wait each cost about ε on average)
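Continuing the toy TrueTime sketch above, a minimal commit-wait loop (the polling loop and the commit_wait name are assumptions, not Spanner's code):

    def commit_wait(tt: TrueTime) -> float:
        s = tt.now().latest               # commit timestamp: upper bound of "now"
        while tt.now().earliest <= s:     # waits out the uncertainty, ~2*epsilon
            time.sleep(tt.epsilon / 4)    # a real system computes the exact sleep
        # Safe to release locks now: any transaction that starts after this
        # point will pick a timestamp strictly greater than s.
        return s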

From TrueTime to Consistency
– Recall what commit wait gives you for one transaction: s1 must lie between T1's absolute start time and absolute commit time.
– If there are two transactions and T1 commits before T2 starts, the range of possible values for s1 ends before the range for s2 begins, so s1 < s2.
(figures: one timeline showing the range s1 must fall in, and one showing two non-overlapping ranges with s1 < s2)

Supported Read/Write Operations
Read-write transaction:
– Paxos write
– Commit wait
Read transaction with timestamp s:
– Lock-free
– Every replica tracks t_safe
– Read from any replica with t_safe >= s
Other variants of read:
– Standalone read: read at TT.now().latest
– Read with a bounded timestamp
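A sketch of the lock-free read path under these rules; Replica, the versions layout, and the error handling are illustrative, not Spanner's data structures:

    class Replica:
        """Serves snapshot reads at any timestamp s <= t_safe, without locks."""
        def __init__(self):
            self.t_safe = 0.0
            self.versions = {}   # key -> list of (timestamp, value), ascending

        def read(self, key, s):
            if s > self.t_safe:
                raise RuntimeError("replica not caught up to s; try another")
            # MVCC-style lookup: newest version no later than s
            older = [(ts, v) for ts, v in self.versions.get(key, []) if ts <= s]
            return older[-1][1] if older else None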

GFS
Lixin Shi, 6.033 Quiz 2 Review
(Some slides from GFS's SOSP presentation)

Design Assumptions/Facts
– Hardware fails very often.
– Large files (>= 100 MB) are typical.
– Reads: large streaming reads plus small random reads.
– Writes: record appends are the prevalent form of writing; concurrent appends must be handled.
– Design choice: high bandwidth matters far more than low latency.

Architecture
(figure-only slide: the GFS architecture diagram of clients, a single master, and many chunkservers)

Read Algorithm

Read Algorithm (cont.)
(figure-only slides; the read flow they illustrate is sketched below)
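Since both read-algorithm slides are figure-only, here is a sketch of the flow the GFS paper describes: the client maps a byte offset to a chunk index, asks the master for the chunk handle and replica locations (metadata only, and cacheable), then reads the bytes directly from a chunkserver. All names below are illustrative.

    CHUNK_SIZE = 64 * 2**20   # GFS uses fixed 64 MB chunks

    def pick_closest(replicas):
        return replicas[0]    # stand-in for "choose the nearest replica"

    def gfs_read(master, filename, offset, length):
        chunk_index = offset // CHUNK_SIZE                 # which chunk holds offset
        handle, replicas = master.lookup(filename, chunk_index)   # metadata only
        server = pick_closest(replicas)                    # data bypasses the master
        return server.read(handle, offset % CHUNK_SIZE, length)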

Write Algorithm

Write Algorithm (cont.)
(figure-only slides; the write flow they illustrate is sketched below)
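Likewise, a sketch of the write flow from the paper: the master names the lease-holding primary, the client pushes data to every replica, and the primary serializes the mutation and forwards that order to the secondaries. Method names are assumptions.

    def gfs_write(master, filename, chunk_index, data):
        # Master tells the client which replica holds the lease (the primary).
        handle, primary, secondaries = master.get_lease_holder(filename, chunk_index)
        # 1. Data flow: push the data to all replicas, which buffer it.
        #    (The real system pipelines this along a chain of chunkservers.)
        for replica in [primary] + secondaries:
            replica.push_data(handle, data)
        # 2. Control flow: ask the primary to commit. It assigns the mutation
        #    a serial order, applies it, forwards the order to the secondaries,
        #    and replies only after all of them acknowledge.
        return primary.commit(handle, secondaries)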

Atomic Record Appends
Atomic! A few changes from the write algorithm (per the GFS paper):
– The client specifies only the data, not the offset; the primary chooses the offset and returns it to the client.
– If the record would not fit in the chunk's remaining space, the primary pads the chunk and the client retries on the next chunk.
– The record is appended atomically at least once; replicas may contain padding or duplicate records.
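A sketch of the primary-side decision that makes the append atomic; the names and the chunk_used bookkeeping are illustrative:

    def record_append(primary, handle, record, chunk_used, chunk_size=64 * 2**20):
        if chunk_used + len(record) > chunk_size:
            primary.pad_to_end(handle)    # fill the rest of the chunk with padding
            return None                   # client retries on the next chunk
        offset = chunk_used               # the primary, not the client, picks it
        primary.write_at(handle, offset, record)
        return offset                     # offset is returned to the client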