EECS 498 Introduction to Distributed Systems Fall 2017

Slides:



Advertisements
Similar presentations
Wyatt Lloyd * Michael J. Freedman * Michael Kaminsky David G. Andersen * Princeton, Intel Labs, CMU Dont Settle for Eventual : Scalable Causal Consistency.
Advertisements

Time-based Transactional Memory with Scalable Time Bases Torvald Riegel, Christof Fetzer, Pascal Felber Presented By: Michael Gendelman.
Megastore: Providing Scalable, Highly Available Storage for Interactive Services. Presented by: Hanan Hamdan Supervised by: Dr. Amer Badarneh 1.
Consistency Guarantees and Snapshot isolation Marcos Aguilera, Mahesh Balakrishnan, Rama Kotla, Vijayan Prabhakaran, Doug Terry MSR Silicon Valley.
Spanner: Google’s Globally-Distributed Database James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, JJ Furman, Sanjay Ghemawat,
Spanner: Google’s Globally-Distributed Database By - James C
SPANNER: GOOGLE’S GLOBALLYDISTRIBUTED DATABASE James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, JJ Furman, Sanjay Ghemawat,
Presented By Alon Adler – Based on OSDI ’12 (USENIX Association)
CMU SCS Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications C. Faloutsos – A. Pavlo Lecture#26: Database Systems.
Computer Science Lecture 14, page 1 CS677: Distributed OS Consistency and Replication Today: –Introduction –Consistency models Data-centric consistency.
CS 582 / CMPE 481 Distributed Systems
Session - 14 CONCURRENCY CONTROL CONCURRENCY TECHNIQUES Matakuliah: M0184 / Pengolahan Data Distribusi Tahun: 2005 Versi:
GentleRain: Cheap and Scalable Causal Consistency with Physical Clocks Jiaqing Du | Calin Iorgulescu | Amitabha Roy | Willy Zwaenepoel École polytechnique.
Spanner Lixin Shi Quiz2 Review (Some slides from Spanner’s OSDI presentation)
Spanner: Google’s Globally-Distributed Database James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, JJ Furman,Sanjay Ghemawat,
CSC 536 Lecture 10. Outline Case study Google Spanner Consensus, revisited Raft Consensus Algorithm.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Data Versioning Lecturer.
Computer Science Lecture 13, page 1 CS677: Distributed OS Last Class: Canonical Problems Distributed synchronization and mutual exclusion Distributed Transactions.
Spanner: the basics Jeff Chase CPS 512 Fall 2015.
Database Isolation Levels. Reading Database Isolation Levels, lecture notes by Dr. A. Fekete, resentation/AustralianComputer.
CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Google Spanner Steve Ko Computer Sciences and Engineering University at Buffalo.
Distributed Systems Lecture 5 Time and synchronization 1.
Transactions Examples
CS6320 – Performance L. Grewe.
MDCC: Multi-data Center Consistency
Spanner: Google’s Globally Distributed Database
a journey from the simple to the optimal
Spanner Storage insights
Distributed Computing
CSE 486/586 Distributed Systems Consistency --- 1
Operational & Analytical Database
Lecturer : Dr. Pavle Mogin
CPS 512 midterm exam #1, 10/7/2016 Your name please: ___________________ NetID:___________ /60 /40 /10.
Spanner: Google’s Globally-Distributed Database
Lecture 5 Time and synchronization
Distributed Transactions and Spanner
The SNOW Theorem and Latency-Optimal Read-Only Transactions
COS 418: Advanced Computer Systems Lecture 5 Michael Freedman
EECS 498 Introduction to Distributed Systems Fall 2017
MVCC and Distributed Txns (Spanner)
Replication and Consistency
Concurrency Control II (OCC, MVCC) and Distributed Transactions
EECS 498 Introduction to Distributed Systems Fall 2017
EECS 498 Introduction to Distributed Systems Fall 2017
EECS 498 Introduction to Distributed Systems Fall 2017
Spanner: Google’s Globally-Distributed Database
EECS 498 Introduction to Distributed Systems Fall 2017
EECS 498 Introduction to Distributed Systems Fall 2017
Replication and Consistency
EECS 498 Introduction to Distributed Systems Fall 2017
EECS 498 Introduction to Distributed Systems Fall 2017
Concurrency Control II (OCC, MVCC)
EECS 498 Introduction to Distributed Systems Fall 2017
EECS 498 Introduction to Distributed Systems Fall 2017
CSE 486/586 Distributed Systems Consistency --- 1
EECS 498 Introduction to Distributed Systems Fall 2017
EECS 498 Introduction to Distributed Systems Fall 2017
Leader Election Using NewSQL Database Systems
Scalable Causal Consistency
Lecture 21: Replication Control
Atomic Commit and Concurrency Control
EECS 498 Introduction to Distributed Systems Fall 2017
Concurrency Control II and Distributed Transactions
COS 418: Distributed Systems Lecture 16 Wyatt Lloyd
Lecture 21: Replication Control
CSE 486/586 Distributed Systems Consistency --- 2
Implementing Consistency -- Paxos
Concurrency control (OCC and MVCC)
CSE 486/586 Distributed Systems Time and Synchronization
Replication and Consistency
Presentation transcript:

EECS 498 Introduction to Distributed Systems Fall 2017 Harsha V. Madhyastha

November 29, 2017 EECS 498 – Lecture 20

November 29, 2017 EECS 498 – Lecture 20

Context for Spanner Distributed database that supports transactions BigTable: Database spread across servers within a datacenter Spanner: Database spread across data centers Employee ID Date of Joining Age Title Salary November 29, 2017 EECS 498 – Lecture 20

Support for Transactions Serializability Externally visible effects equivalent to some serial order of execution How is this different from linearizability? Serializability does not guarantee real time order Spanner offers strict serializability Serializability + Linearizability Previously considered impractical at global scale November 29, 2017 EECS 498 – Lecture 20

Example: Social Network Transaction T1: Remove supervisor from friend’s list Make a negative post about supervisor Transaction T2: Read friend list Read posts made by friends November 29, 2017 EECS 498 – Lecture 20

Executing Transactions Transaction latency = 4 rounds of wide-area communication 2 rounds for each phase of 2PL November 29, 2017 EECS 498 – Lecture 20

Executing Transactions Leader-based Paxos  each phase of 2PL in 1 round-trip Still too slow for read-only transactions November 29, 2017 EECS 498 – Lecture 20

Read-Only Txns with 2PL TC P1 P2 P3 Lock Commit November 29, 2017 EECS 498 – Lecture 20

Lock-free Read-Only Txns How to ensure T1’s read from P1 returns data from before T2? TC1 Read P1 P2 Lock Commit TC2 November 29, 2017 EECS 498 – Lecture 20

Snapshot Reads Every transaction has a commit timestamp When a row is modified by transaction, create new version Transaction’s timestamp is version number Read-only transaction reads latest version of data committed before its timestamp What property must be true for linearizability? Commit timestamps should monotonically increase November 29, 2017 EECS 498 – Lecture 20

Executing Transactions How to ensure commit timestamps monotonically increase? November 29, 2017 EECS 498 – Lecture 20

Synchronizing Clocks Accurate sources of time: GPS Atomic clocks Why not put these into every server? Spanner deploys a few of these in every DC All servers periodically sync with timemasters November 29, 2017 EECS 498 – Lecture 20

TrueTime Architecture GPS timemaster GPS timemaster GPS timemaster GPS timemaster Atomic-clock timemaster GPS timemaster Client Datacenter 1 Datacenter 2 … Datacenter n Current time = now ± ε November 29, 2017 EECS 498 – Lecture 20

TrueTime Implementation Between syncs, assume worst case clock drift ε +6ms 200 μs/sec reference uncertainty time 0sec 30sec 60sec 90sec November 29, 2017 EECS 498 – Lecture 20

TrueTime TT.now() returns a range: [earliest, latest] Current time is … … not less than earliest … not greater than latest High-level principle: Known unknowns better than unknown unknowns November 29, 2017 EECS 498 – Lecture 20

Timestamps and TrueTime Acquired locks Release locks T Pick commit timestamp s s Wait until TT.now().earliest > s = TT.now().latest average ε average ε November 29, 2017 EECS 498 – Lecture 20

Leader Leases Must ensure at most one leader in any replica group at any point in time In Spanner, leader gets a lease on how long it gets to be the leader How can group members come to consensus on leader leases to ensure leader disjointness? Assume clocks are perfectly synchronized, but network has arbitrary delays How to leverage TrueTime? November 29, 2017 EECS 498 – Lecture 20

Paxos-based Leader Leases When a replica responds to a proposer at time t, it grants lease from [t, t + 10s] t = TT.now().earliest() Proposer assumes leader lease until minimum of replica leases Lease valid until TT.now().latest > min lease November 29, 2017 EECS 498 – Lecture 20

Schema-Change Transactions Not scalable to acquire locks on all shards How can we leverage TrueTime? Schedule transaction for time in future Employee ID Date of Joining Age Title Salary Gender November 29, 2017 EECS 498 – Lecture 20

Cloud Spanner Spanner recently introduced to Google Cloud Claims to offer high availability Spanner’s communication over Google’s private network Well engineered over last decade Partitions are very uncommon Spanner unavailability << VM unavailability Don’t need app logic to deal with Spanner outage November 29, 2017 EECS 498 – Lecture 20

Causes of Spanner Outages November 29, 2017 EECS 498 – Lecture 20