CMU SCS, Carnegie Mellon Univ., Dept. of Computer Science. 15-415/615 - DB Applications. C. Faloutsos – A. Pavlo. Lecture #26: Database Systems.


Lecture #26: Database Systems

System Votes

Two-Phase Commit

[Figure] An application server sends a commit request to the coordinator (Node 1). Phase 1 (Prepare): the coordinator asks the participants (Node 2, Node 3) to prepare, and each replies OK. Phase 2 (Commit): the coordinator tells the participants to commit.
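The two phases above can be sketched as a toy in-process protocol. This is only an illustration of the message flow; the Coordinator/Participant roles follow the slide, but the class names and vote flags are invented, and durability and failure handling are not modeled.

```python
# Toy sketch of two-phase commit with in-process "nodes".
class Participant:
    def __init__(self, name, will_commit=True):
        self.name = name
        self.will_commit = will_commit
        self.state = "INIT"

    def prepare(self):
        # Phase 1: vote OK only if the txn can be made durable locally.
        self.state = "PREPARED" if self.will_commit else "ABORTED"
        return self.will_commit

    def finish(self, commit):
        # Phase 2: obey the coordinator's global decision.
        self.state = "COMMITTED" if commit else "ABORTED"


def two_phase_commit(participants):
    # Coordinator: commit only if every participant votes OK.
    votes = [p.prepare() for p in participants]
    decision = all(votes)
    for p in participants:
        p.finish(decision)
    return decision
```

A single "no" vote in phase 1 forces a global abort, which is why 2PC blocks whenever any participant is unreachable.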

Paxos

Paxos

Consensus protocol where a coordinator proposes an outcome (e.g., commit or abort) and then the participants vote on whether that outcome should succeed. Does not block as long as a majority of participants are available, and has provably minimal message delays in the best case.
– First correct protocol that was provably resilient in the face of asynchronous networks.

Paxos

[Figure] An application server sends a commit request to the proposer (Node 1). The proposer sends Propose to the acceptors (Nodes 2–4), the acceptors reply Agree, the proposer sends Commit, and the acceptors reply Accept.

Paxos

[Figure] The same exchange, but one acceptor fails (marked X). The protocol still succeeds because a majority of the acceptors agree and accept.

Paxos

[Figure] Message flow between a proposer and the acceptors: the proposer sends Propose(n) and the acceptors reply Agree(n). A competing Propose(n+1) arrives before the first proposal finishes, so the acceptors answer the proposer's Commit(n) with Reject(n, n+1). The higher-numbered proposal then completes: Commit(n+1), Agree(n+1), Accept(n+1).
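The ballot numbering in the figure above can be sketched as minimal single-decree Paxos. This assumes in-process acceptors and a lossless network; the names (Acceptor, run_round) and the reply tuples are invented for illustration.

```python
# Minimal single-decree Paxos sketch following the slide's
# Propose/Agree/Commit/Accept vocabulary. Restarts, message loss,
# and multiple decrees (Multi-Paxos) are not modeled.
class Acceptor:
    def __init__(self):
        self.promised = -1          # highest ballot promised so far
        self.accepted = None        # (ballot, value) or None

    def propose(self, n):           # phase 1: Propose(n) -> Agree or Reject
        if n > self.promised:
            self.promised = n
            return ("agree", self.accepted)
        return ("reject", self.promised)

    def accept(self, n, value):     # phase 2: Commit(n, v) -> Accept?
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def run_round(acceptors, n, value):
    # A proposal succeeds once a majority agrees and then accepts.
    replies = [a.propose(n) for a in acceptors]
    agrees = [r for r in replies if r[0] == "agree"]
    if len(agrees) <= len(acceptors) // 2:
        return None                 # rejected; retry with a higher ballot
    # If any acceptor already accepted a value, the proposer must
    # re-propose that value instead of its own.
    prior = [acc for _, acc in agrees if acc is not None]
    if prior:
        value = max(prior)[1]
    oks = sum(a.accept(n, value) for a in acceptors)
    return value if oks > len(acceptors) // 2 else None
```

The key safety property shown here: once a value is chosen by a majority, any later ballot learns and re-proposes that value rather than overwriting it.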

2PC vs. Paxos

2PC is a degenerate case of Paxos:
– Single coordinator.
– Only works if everybody is up.
Use leases to determine who is allowed to propose new updates, to avoid continuous rejection.

Google Spanner

Google's geo-replicated DBMS (>2011). Schematized, semi-relational data model.
Concurrency control:
– 2PL + T/O (pessimistic).
– Externally consistent global write transactions with synchronous replication.
– Lock-free read-only transactions.

Google Spanner

CREATE TABLE users {
  uid INT NOT NULL,
  email VARCHAR,
  PRIMARY KEY (uid)
};
CREATE TABLE albums {
  uid INT NOT NULL,
  aid INT NOT NULL,
  name VARCHAR,
  PRIMARY KEY (uid, aid)
} INTERLEAVE IN PARENT users ON DELETE CASCADE;

Interleaved row order on disk:
users(1001)
  albums(1001, 9990)
  albums(1001, 9991)
users(1002)
  albums(1002, 6631)
  albums(1002, 6634)

Google Spanner

Ensures ordering through globally unique timestamps generated from atomic clocks and GPS devices (the TrueTime API).
The database is broken up into tablets:
– Use Paxos to elect a leader in each tablet group.
– Use 2PC for txns that span tablets.
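Spanner's timestamp ordering relies on "commit wait": a transaction's timestamp is taken at the top of the clock's uncertainty interval, and the leader waits out the uncertainty before releasing locks. The TT.now() interval follows the Spanner paper, but EPSILON and the helper names below are invented for this sketch.

```python
# Toy sketch of Spanner-style TrueTime commit wait using the local
# clock plus a fabricated uncertainty bound EPSILON.
import time

EPSILON = 0.007  # assumed clock uncertainty (the paper reports ~7 ms)

def tt_now():
    # TT.now() returns an interval guaranteed to contain true time.
    t = time.time()
    return (t - EPSILON, t + EPSILON)   # (earliest, latest)

def commit_wait(commit_ts):
    # Wait until commit_ts is guaranteed to be in the past everywhere,
    # i.e. until TT.now().earliest > commit_ts.
    while tt_now()[0] <= commit_ts:
        time.sleep(EPSILON / 2)

# Assign the txn a timestamp at the top of the uncertainty interval,
# then wait out the uncertainty before making the commit visible.
start = time.time()
commit_ts = tt_now()[1]
commit_wait(commit_ts)
elapsed = time.time() - start
```

The wait costs roughly twice the uncertainty bound per write transaction, which is why Spanner invests in atomic clocks and GPS to keep that bound small.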

Google Spanner

[Figure] Two application servers issue concurrent transactions over the network (T1: Set A=2, B=9; T2: Set A=0, B=7) against replicated values (A=1 on Node 1, B=8 on Node 2), coordinated with Paxos or 2PC.

Google F1

OCC engine built on top of Spanner:
– A read phase followed by a write phase.
– In the read phase, F1 returns the last-modified timestamp with each row. No locks.
– The timestamp for a row is stored in a hidden lock column. The client library returns these timestamps to the F1 server.
– If the timestamps differ from the current timestamps at the time of commit, the transaction is aborted.
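The timestamp check described above can be sketched as follows. The Store class and its fields are illustrative, not F1's actual API; the point is only the validate-at-commit pattern.

```python
# Sketch of F1-style optimistic concurrency control: each row carries a
# hidden last-modified timestamp, and a commit validates that none of
# the timestamps observed during the read phase have changed.
class Store:
    def __init__(self):
        self.rows = {}      # key -> (value, last_modified_ts)
        self.clock = 0      # monotonic commit counter

    def read(self, key):
        # Read phase: return the value plus its hidden timestamp.
        return self.rows[key]

    def commit(self, read_set, writes):
        # Write phase: abort if any row read was modified since.
        for key, ts in read_set.items():
            if self.rows[key][1] != ts:
                return False
        self.clock += 1
        for key, value in writes.items():
            self.rows[key] = (value, self.clock)
        return True
```

No locks are held during the read phase; conflicts only surface as aborts at commit time, so the scheme favors low-contention workloads.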

Redis

Remote Dictionary Server (2009).
Key-value data store:
– Values can be strings, hashes, lists, sets, and sorted sets.
– Single-threaded execution engine.
In-memory storage:
– Snapshots + WAL for persistence.

Redis

[Figure] Example keys and values by type:
STRING: page:index.html → …; view_count → …
SET: users_logged_in → { 1, 2, 3, 4, 5 }
LIST: latest_post_ids → { 111, 112, 119, … }
HASH: user:999:session → { time => …, username => tupac }
SORTED SET: current_user_scores → { odb ~ 11, tupac ~ 12, biggie ~ 19, eazye ~ 20 }

Redis

Asynchronous master-slave replication:
– Master sends oplog to downstream replicas.
– Does not wait for acknowledgements.
Newer versions can require that a minimum number of replicas be connected before accepting writes, but still do not check whether those replicas actually received the writes.
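The "newer versions" guard mentioned above corresponds to Redis's min-replicas settings (available since Redis 2.8). A sketch of the relevant redis.conf lines, with illustrative threshold values:

```
# Reject writes on the master unless at least 2 replicas are connected
# and have sent a replication ack within the last 10 seconds.
# This bounds the window of lost writes but does not confirm that the
# replicas actually applied any particular write.
min-replicas-to-write 2
min-replicas-max-lag 10
```

This is a best-effort availability check, not synchronous replication: a write acknowledged to the client can still be lost if the master fails before the replicas receive it.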

Redis

Supports some notion of transactions:
– Operations are batched together and executed serially on the server side.
– Allows for compare-and-swap.
– Does not support rollback!
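The compare-and-swap pattern here is Redis's WATCH/MULTI/EXEC. Below is a toy in-memory imitation of those semantics, not the real Redis client API; MiniRedis and its per-key version counters are invented to show how EXEC fails when a watched key changed.

```python
# Toy imitation of Redis WATCH/MULTI/EXEC: queued ops run serially,
# and EXEC aborts (returns None) if any watched key was modified
# after it was watched. There is no rollback, matching Redis.
class MiniRedis:
    def __init__(self):
        self.data = {}
        self.version = {}            # key -> write counter

    def set(self, key, value):
        self.data[key] = value
        self.version[key] = self.version.get(key, 0) + 1

    def watch(self, key):
        # WATCH: remember the key's current version.
        return self.version.get(key, 0)

    def exec_multi(self, watched, ops):
        # EXEC: fail if any watched key changed, else run the batch.
        for key, ver in watched.items():
            if self.version.get(key, 0) != ver:
                return None
        for op in ops:
            op(self)
        return True
```

The caller's pattern mirrors real Redis usage: watch, read, queue the update, and retry the whole sequence if EXEC reports failure.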

MongoDB

Document data model:
– Think JSON, XML, Python dicts.
– Not Microsoft Word documents.
Different terminology:
– Document → Tuple
– Collection → Table/Relation

MongoDB

JSON-only query API.
Single-document atomicity.
No server-side joins:
– "Pre-join" collections by embedding related documents inside of each other.
No cost-based query planner / optimizer.

MongoDB

A customer has orders and each order has order items.
[Figure] Relational schema: R1 = Customers(custId, name, …), R2 = Orders(orderId, custId, …), R3 = Order Items(itemId, orderId, …).

MongoDB

A customer has orders and each order has order items. As a single embedded document:

{
  "custId": 1234,
  "custName": "Christos",
  "orders": [
    { "orderId": 10001,
      "orderItems": [
        { "itemId": "XXXX", "price": … },
        { "itemId": "YYYY", "price": … } ] },
    { "orderId": 10050,
      "orderItems": [
        { "itemId": "ZZZZ", "price": … } ] }
  ]
}

MongoDB

Heterogeneous distributed components:
– Centralized query router.
Master-slave replication.
Auto-sharding:
– Define 'partitioning' attributes for each collection (hash or range).
– When a shard gets too big, the DBMS automatically splits the shard and rebalances.
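The split-and-rebalance behavior can be sketched with a toy range-sharding scheme. The size threshold, class names, and median-split rule below are invented for illustration; real MongoDB splits chunks by configured chunk size and migrates them via the balancer.

```python
# Toy range sharding: shards cover disjoint key ranges, a router maps
# keys to shards via the range bounds, and an oversized shard is split
# at its median key.
import bisect

MAX_SHARD_SIZE = 4   # assumed threshold, not MongoDB's

class Shards:
    def __init__(self):
        self.bounds = []     # lower bound of every shard after the first
        self.shards = [[]]   # shards[i] holds its keys in sorted order

    def route(self, key):
        # The query router's job: find the shard owning this key.
        return bisect.bisect_right(self.bounds, key)

    def insert(self, key):
        shard = self.shards[self.route(key)]
        bisect.insort(shard, key)
        if len(shard) > MAX_SHARD_SIZE:
            self.split(self.shards.index(shard))

    def split(self, i):
        # Split the oversized shard at its median key.
        shard = self.shards[i]
        mid = len(shard) // 2
        self.bounds.insert(i, shard[mid])
        self.shards[i:i+1] = [shard[:mid], shard[mid:]]
```

Range sharding keeps nearby keys together (good for range scans) at the risk of hot spots; hash sharding trades that locality for more uniform load.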

MongoDB

Originally used the mmap storage manager:
– No buffer pool.
– Let the OS decide when to flush pages.
– Single lock per database.
Version 3 (2015) now supports pluggable storage managers:
– WiredTiger from BerkeleyDB alumni.
– Fine-grained locking.
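The mmap approach can be illustrated directly with Python's mmap module: the process writes through a memory mapping of the data file and the OS page cache, not a DBMS buffer pool, decides what stays in memory and when dirty pages reach disk. The file name and layout here are invented.

```python
# Sketch of mmap-based storage in the style of early MongoDB:
# map the data file into memory and update it in place.
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "collection.dat")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)         # pre-allocate one page

with open(path, "r+b") as f:
    mem = mmap.mmap(f.fileno(), 0)  # map the whole file; no buffer pool
    mem[0:5] = b"hello"             # write through the mapping
    mem.flush()                     # force dirty pages to disk
    mem.close()

with open(path, "rb") as f:
    first = f.read(5)               # the update is durable in the file
```

The simplicity is the appeal and the drawback: the DBMS gives up control over eviction and write ordering, which is one reason MongoDB moved to pluggable storage engines.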