What Should the Design of Cloud-Based (Transactional) Database Systems Look Like? Daniel Abadi, Yale University, March 17th, 2011.

Presentation transcript:


Does the Cloud Force Us to Build New Database Systems?
Traditional vendors argue ‘no’
But there is increased desire for certain things:
– Horizontal scalability to leverage cloud elasticity
    Traditional solutions use high-end SANs, but this is not a cloudy concept
    Shared-nothing is the most effective way to achieve horizontal scalability in the cloud
– Virtualization can result in wild fluctuations of node performance
    Do not want to operate at the speed of the slowest node
– Virtual machines in the cloud have orders of magnitude higher mortality rates
    Greedy owners like to murder the poor VMs
    Fault tolerance must be treated as a first-class citizen

The Problem With Traditional Database Solutions
Available shared-nothing solutions do not achieve high transactional throughput
– Distributed concurrency control and commit protocols are expensive
    Need more research to reduce this overhead
Database systems generally optimize everything in advance
– Adaptive execution and optimization frameworks must be high on the research agenda
Database systems treat faults as a rare event
– Machine failures cause transactions to abort
– Recovery from a REDO log is slow

Common Solutions
Drop A or C of ACID
– Relaxing consistency makes replication easy, facilitates fault tolerance
– Relaxing atomicity reduces (or eliminates) need for distributed concurrency control
Examples: SimpleDB, BigTable (HBase), Cassandra, PNUTS, SQL Azure, sharded MySQL, etc.
– Often called NoSQL systems
(Dropping ‘C’ also helps with CAP, but this is only part of the story)

Whither ACID in the Cloud?
People still want ACID
– Engineers at Google, Facebook, Amazon, Twitter, etc. are a very loud minority
– NoSQL should not be the only option in the cloud
Needed research:
– Building an ACID-compliant, horizontally scalable, fault-tolerant database for the cloud

One Potential Idea
Get replication to work right out of the box
– Today’s systems generally act, then replicate
    Complicates semantics of sending read queries to replicas
    Need confirmation from replica before commit (increased latency) if you want durability and high availability
    In-progress transactions must be aborted upon a master failure
– Want a system that replicates, then acts
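
The "replicate, then act" ordering can be sketched in a few lines. This is a minimal single-process illustration, not the design from the talk: the `Replica` class, the `submit` helper, and the (key, delta) request format are all hypothetical names chosen for the example. The point it shows is that every replica receives the request before any replica executes it, so all replicas execute the same log.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical replica that applies logged requests strictly in log order."""
    state: dict = field(default_factory=dict)
    log: list = field(default_factory=list)
    applied: int = 0  # number of log entries executed so far

    def append(self, request):
        # Step 1 (replicate): record the request; nothing is executed yet.
        self.log.append(request)

    def catch_up(self):
        # Step 2 (act): execute every logged request not yet applied.
        while self.applied < len(self.log):
            key, delta = self.log[self.applied]
            self.state[key] = self.state.get(key, 0) + delta
            self.applied += 1


def submit(replicas, request):
    """Replicate the request to every replica before any replica acts on it."""
    for replica in replicas:
        replica.append(request)


replicas = [Replica(), Replica()]
submit(replicas, ("acct_a", 100))   # assumed request format: (key, delta)
submit(replicas, ("acct_a", -30))
for replica in replicas:
    replica.catch_up()
```

Because both replicas hold the same log before executing, a read sent to either one after `catch_up` sees the same state, which is the property the act-then-replicate approach makes hard to guarantee.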

Therefore …
Instead of weakening ACID, strengthen it!
– Guaranteeing equivalence to SOME serial order makes active replication difficult
    Running the same set of xacts on two different replicas might cause replicas to diverge
Disallow any nondeterministic behavior
Disallow aborts caused by DBMS
– Disallow deadlock
– Distributed commit much easier if you don’t have to worry about aborts
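
A small worked example of why "equivalent to SOME serial order" is not enough for active replication. The two transactions `t1` and `t2` and the starting value are made up for illustration. Each replica below runs a perfectly valid serial schedule, yet the replicas end in different states unless they agree on one order in advance.

```python
def t1(db):
    """Hypothetical transaction: add 10 to x."""
    db["x"] = db["x"] + 10


def t2(db):
    """Hypothetical transaction: double x."""
    db["x"] = db["x"] * 2


def run(order):
    """Execute the transactions serially, in the given order, on a fresh database."""
    db = {"x": 1}
    for txn in order:
        txn(db)
    return db


# Both schedules are serial, so both satisfy ordinary serializability.
replica_a = run([t1, t2])  # (1 + 10) * 2
replica_b = run([t2, t1])  # (1 * 2) + 10

# With a single agreed-upon order, replicas cannot diverge.
agreed_order = [t1, t2]
replica_c = run(agreed_order)
replica_d = run(agreed_order)
```

Here `replica_a` ends with x = 22 and `replica_b` with x = 12, while `replica_c` and `replica_d` are identical. Fixing the order up front is one way to remove this source of nondeterminism.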

Consequences of Determinism
Replicas produce the same output, given the same input
– Facilitates active replication
Only initial input needs to be logged; state at failure can be reconstructed from this input log (or from a replica)
Active distributed xacts not aborted upon node failure
– Greatly reduces (or eliminates) cost of distributed commit
    Don’t have to worry about nodes failing during commit protocol
    Don’t have to worry about effects of transaction making it to disk before promising to commit transaction
    Just need one message from any node that potentially can deterministically abort the xact
    This message can be sent in the middle of the xact, as soon as it knows it will commit
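
Recovery from an input log can be sketched as follows. This is an illustrative toy, assuming that transaction logic depends only on the current state and the logged inputs. The `transfer` function, the account names, and the log format are hypothetical. Aborts that come from the transaction logic itself (here, insufficient funds) are deterministic, so replay reproduces them exactly.

```python
def transfer(state, src, dst, amount):
    """Deterministic transaction: the outcome depends only on state and inputs."""
    if state.get(src, 0) < amount:
        return False  # deterministic abort: insufficient funds
    state[src] -= amount
    state[dst] = state.get(dst, 0) + amount
    return True


def replay(input_log, initial_state):
    """Rebuild state by re-executing the logged inputs in their original order."""
    state = dict(initial_state)
    outcomes = [transfer(state, *request) for request in input_log]
    return state, outcomes


initial_state = {"a": 50, "b": 0}
# Only the inputs are logged: (source, destination, amount).
input_log = [("a", "b", 30), ("a", "b", 30), ("b", "a", 10)]

# Normal execution, then a simulated crash followed by replay of the same log.
live_state, live_outcomes = replay(input_log, initial_state)
recovered_state, recovered_outcomes = replay(input_log, initial_state)
```

The recovered state and the commit/abort outcomes match the live run, with no log of the transactions' effects required.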

If This Works Then …
Node failure does not cause transaction failure
– Fault tolerance will be extremely high
Can run distributed transactions without an expensive commit protocol
– Shared-nothing becomes much more attractive
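
A rough sketch of distributed commit without a multi-round protocol, under the assumptions stated in the slides: node failure never forces an abort, so only a node whose transaction logic can deterministically abort needs to announce its decision. The `Partition` class and `transfer` function are hypothetical and run in a single process. They simulate the message pattern only and are not a real network protocol.

```python
class Partition:
    """Hypothetical shared-nothing partition holding part of the data."""

    def __init__(self, data):
        self.data = dict(data)
        self.messages_received = 0

    def receive_decision(self, commit, key, amount):
        # One one-way message carries the decision; no vote or acknowledgement round.
        self.messages_received += 1
        if commit:
            self.data[key] = self.data.get(key, 0) + amount


def transfer(source, src_key, dest, dst_key, amount):
    """Only the source partition can abort (insufficient funds), so only it decides."""
    commit = source.data.get(src_key, 0) >= amount
    if commit:
        source.data[src_key] -= amount
    # The decision can be sent as soon as the source has checked its condition.
    dest.receive_decision(commit, dst_key, amount)
    return commit


p1 = Partition({"a": 50})
p2 = Partition({"b": 0})
ok = transfer(p1, "a", p2, "b", 30)        # enough funds: commits
rejected = transfer(p1, "a", p2, "b", 30)  # only 20 left: deterministic abort
```

Each transaction costs the destination partition exactly one incoming message, in contrast to the prepare and commit rounds of two-phase commit.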