Highly Available ACID Memory Vijayshankar Raman

Introduction
- Why ACID memory?
  - Non-database apps: want updates to critical data to be atomic and persistent
    - synchronization is useful when multiple threads access critical data
  - Databases: concurrency control and recovery logic runs through most of the database code
    - extremely complicated and hard to get right
    - bugs lead to data loss: disastrous!

Project goal
- Take recovery logic out of apps
  - Build a simple user-level library that provides recoverable, transactional memory
    - all the logic in one place => easy to debug and maintain
    - easy to make use of hardware advances
  - Use replication and persistent memory for recovery, instead of writing logs
    - (+) simpler to implement
    - (+) simpler for applications to use ??

Questions to answer
- Program simplicity vs. performance
  - how much do we lose by replicating instead of logging?
- On a cluster, can we use replication directly for availability?
  - traditionally, availability is handled on top of the recovery system

Outline
- Introduction
- Acid Memory API
- Single Node design & implementation
- Evaluation
- High Availability: multiple node design and implementation
- Evaluation
- Conclusion

Acid Memory API
- Transaction manager interface
    TransactionManager(database name, acid memory area)
- Transaction interface
    beginTransaction()
    getLock(memory region1, READ/WRITE)
    getLock(memory region2, READ/WRITE)
    ...
    commit/abort()    (all locks released)
  - memory region = virtual address prefix
- Combines concurrency control with recovery: recovery is done on write-locked regions
- Supports fine-granularity locking => cannot use VM for recovery
- Applications can modify data directly
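A minimal sketch of how "memory region = virtual address prefix" and READ/WRITE locking might fit together. The slides give only the call names, so everything here (`regionOf`, `ToyLockTable`, the choice of dropping low-order address bits to form the prefix) is a hypothetical illustration, not the library's actual code. The toy table does not track which transaction holds a lock, and `releaseAll()` simply clears everything.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical names; only getLock/commit/abort appear in the slides.
enum class LockMode { READ, WRITE };

// Assumed encoding: a region is the address with its low `lowBits` bits
// dropped, so all addresses sharing that prefix fall in one lockable region.
inline std::uintptr_t regionOf(std::uintptr_t addr, unsigned lowBits) {
    return addr >> lowBits;
}

class ToyLockTable {
public:
    // Returns true if the lock is granted.
    // READ locks are shared; a WRITE lock is exclusive.
    bool tryLock(std::uintptr_t region, LockMode mode) {
        auto it = held_.find(region);
        if (it == held_.end()) {
            held_[region] = Entry{mode, 1};
            return true;
        }
        if (mode == LockMode::READ && it->second.mode == LockMode::READ) {
            ++it->second.holders;
            return true;
        }
        return false;  // conflicts with an existing lock
    }

    // Models commit/abort: all locks are released together.
    void releaseAll() { held_.clear(); }

private:
    struct Entry {
        LockMode mode;
        int holders;
    };
    std::map<std::uintptr_t, Entry> held_;
};
```

With `lowBits = 10`, every address inside the same 1 KB-aligned block maps to the same region, which is one way to get locking finer than a VM page.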

Implementation
- Assume non-volatile memory (NVRAM, battery backup)
- Assume a persistent file cache
- Acid memory area is mmap'd from a file
- Persistence => writes are permanent
- getLock(WRITE): copy the region onto the mirror area
- Transaction abort / system crash
  - undo changes on all write-locked regions using the copy in the mirror area
- The only overhead of recovery is a memcpy on each write lock

[Figure labels: disk file, master copy, mirror, acid memory area, mmap]
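The mirror-based undo described above can be sketched as follows. This is an illustration under assumptions: plain byte buffers stand in for the mmap'd acid memory area and the mirror area, locking itself is omitted, and the class and method names (`MirrorTransaction`, `writeLock`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical sketch: `area` plays the role of the acid memory area and
// `mirror` the mirror area; both are assumed to be the same size.
class MirrorTransaction {
public:
    MirrorTransaction(unsigned char* area, unsigned char* mirror)
        : area_(area), mirror_(mirror) {}

    // On a write lock, save the region's before-image into the mirror.
    // This memcpy is the only recovery overhead on the normal path.
    void writeLock(std::size_t offset, std::size_t length) {
        std::memcpy(mirror_ + offset, area_ + offset, length);
        locked_.push_back({offset, length});
    }

    // Commit: in-place writes are already permanent; forget the before-images.
    void commit() { locked_.clear(); }

    // Abort (or crash recovery): restore every write-locked region from the mirror.
    void abort() {
        for (const Region& r : locked_) {
            std::memcpy(area_ + r.offset, mirror_ + r.offset, r.length);
        }
        locked_.clear();
    }

private:
    struct Region {
        std::size_t offset;
        std::size_t length;
    };
    unsigned char* area_;
    unsigned char* mirror_;
    std::vector<Region> locked_;
};
```

Between `writeLock` and `commit` the application writes straight into the area. For crash recovery, the list of write-locked regions would itself have to survive the crash, which this in-memory sketch does not model.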

Evaluation
- Overhead of acid memory
  - read lock: 35 usec (lock manager overhead)
  - write lock: 35 usec + 5.5 usec/KB (memcpy cost)
  - much less than methods that write a log to disk
- Ease of programming
  - the application only needs to acquire locks to become recoverable
  - it can manipulate the data directly; it does not have to call a special function on every update
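The overhead figures above amount to a simple linear cost model. A small sketch, with hypothetical function names and assuming the cost is exactly linear in the region size:

```cpp
#include <cassert>

// Constants taken from the slide; the linear model is an assumption.
constexpr double kLockOverheadUsec = 35.0;  // lock manager overhead per lock
constexpr double kCopyUsecPerKB = 5.5;      // memcpy cost per KB mirrored

// A read lock pays only the lock manager overhead.
constexpr double readLockCostUsec() { return kLockOverheadUsec; }

// A write lock also pays for copying the region to the mirror area.
constexpr double writeLockCostUsec(double regionKB) {
    return kLockOverheadUsec + kCopyUsecPerKB * regionKB;
}

// Write-locking a 1 KB region: 35 + 5.5 = 40.5 usec.
static_assert(writeLockCostUsec(1.0) == 40.5, "1 KB write lock");
```

Under this model the fixed 35 usec dominates for small regions, so the copy cost only matters once regions reach several KB.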

Example: suppose I want to transfer $1M from A's account to B's

With ACID memory
    /* a points to A's account */
    /* b points to B's account */
    trans = new Transaction(transMgr);
    trans->getLock(a, WRITE);
    trans->getLock(b, WRITE);
    *a = *a - 1000000;   /* update in place through the pointer */
    *b = *b + 1000000;
    trans->commit();

Using logging
    BeginTransaction();
    getLock(A's account, WRITE);
    getLock(B's account, WRITE);
    read(A's account, a);   /* a and b are local copies here */
    read(B's account, b);
    a = a - 1000000;
    b = b + 1000000;
    Update(A's account, a);
    Update(B's account, b);
    commit();
    (Update() creates the needed logs)

Performance comparison: acid memory vs. logging
- Consider a transaction updating integers in a 1KB data structure
- Logging each individual update is somewhat faster, up to a point
- Acid memory gives acceptable performance with very easy programmability

[Figure: time (in microseconds) vs. number of integer writes. Acid memory: write-lock the data structure. Logging: write-lock the structure and update each integer separately.]

Outline
- Introduction
- Acid Memory API
- Single Node design & implementation
- Evaluation
- High Availability: multiple node design and implementation
- Evaluation
- Conclusion

Replication for availability
- Traditionally, availability has been handled in a separate layer, above recovery
- Can we handle both recovery and availability via the same mechanism?

[Figure labels: transaction processing monitor, DBMS, replicate]

Architecture
- Transactions are run by a transaction handler
- All lock requests must go to the owner
- Data in all replicas must be kept in sync
- Balance load by partitioning data
  - different owner for each partition
- Failure model
  - fail-stop: nodes never send incorrect messages to others
  - failed nodes never recover data after a crash
  - the network never fails

[Figure labels: client, transaction handler, owner (data, lock manager), replicas (data)]

- Reads: the client gets data from a random replica
- Writes: must update all replicas
  - on commit, the transaction sends the new data to the owner
  - the owner propagates the update atomically to all replicas
    - 3-phase non-blocking commit protocol
    - always ensure that there is someone to take over the propagation if you crash
- If the owner crashes, fail over to a replica

[Figure labels: client, transaction handler, owner (data, lock manager), replica (data)]
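A toy, single-process sketch of the read and write paths above. It is an illustration under stated assumptions: the owner is modeled as one of the replicas, a single integer stands in for a partition's data, and the 3-phase non-blocking commit protocol is not modeled (the update is applied in one step). All names (`Replica`, `Partition`, `commitWrite`, `crash`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Replica {
    int value = 0;      // stands in for the partition's data
    bool alive = true;  // fail-stop: a crashed node simply stops
};

class Partition {
public:
    explicit Partition(std::size_t n) : replicas_(n), owner_(0) {}

    // Write path: the new data reaches the owner at commit, and the owner
    // updates every live replica (atomic propagation is assumed, not modeled).
    void commitWrite(int newValue) {
        for (Replica& r : replicas_) {
            if (r.alive) r.value = newValue;
        }
    }

    // Read path: any live replica can serve the read. `pick` models the
    // client's random choice; dead replicas are skipped.
    int read(std::size_t pick) const {
        for (std::size_t i = 0; i < replicas_.size(); ++i) {
            const Replica& r = replicas_[(pick + i) % replicas_.size()];
            if (r.alive) return r.value;
        }
        return -1;  // no live replica left
    }

    // Crash a node; if it was the owner, fail over to the first live replica.
    void crash(std::size_t idx) {
        replicas_[idx].alive = false;
        if (idx != owner_) return;
        for (std::size_t i = 0; i < replicas_.size(); ++i) {
            if (replicas_[i].alive) {
                owner_ = i;
                break;
            }
        }
    }

    std::size_t owner() const { return owner_; }

private:
    std::vector<Replica> replicas_;
    std::size_t owner_;
};
```

Because every live replica holds the latest committed data, fail-over only needs to pick a new owner; no log replay is involved.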

Evaluation
- (+) Very fast recovery (usecs)
- (+) Fast transactions without non-volatile memory
- (-) Writes are slower
  - 4n messages at commit if there are n replicas
  - still, this is faster than logging to disk
- (-) Homogeneous software: susceptible to bugs

Conclusions
- Acid memory is easier to use
- Performance relative to logging is not too bad
- Replication gives fast recovery

Future Work
- Using cache for replication
- When/how much to replicate?

Additional Slides

Evaluation, w.r.t. logging-based approach
- Ease of implementation
  - very little to code, mostly the lock manager
  - whereas a traditional DBMS needs:
    - a specialized buffer manager
    - a log manager
    - a complex recovery mechanism

How to make file cache persistent
- Rio (Chen et al., 1996)
- Place the file cache in non-volatile memory
- Protect it against OS crashes using VM protection
- Flush pages in the file cache to disk files on reboot