NVWAL: Exploiting NVRAM in Write-Ahead Logging

Slides:



Advertisements
Similar presentations
Crash Recovery John Ortiz. Lecture 22Crash Recovery2 Review: The ACID properties  Atomicity: All actions in the transaction happen, or none happens 
Advertisements

Better I/O Through Byte-Addressable, Persistent Memory
Scalable Multi-Cache Simulation Using GPUs Michael Moeng Sangyeun Cho Rami Melhem University of Pittsburgh.
Journaling of Journal Is (Almost) Free Kai Shen Stan Park* Meng Zhu University of Rochester * Currently affiliated with HP Labs FAST
Rethink the Sync Ed Nightingale Kaushik Veeraraghavan Peter Chen Jason Flinn University of Michigan.
Loose-Ordering Consistency for Persistent Memory
Steven Pelley, Peter M. Chen, Thomas F. Wenisch University of Michigan
Chapter 11: File System Implementation
Presented By Srinivas Sundaravaradan. MACH µ-Kernel system based on message passing Over 5000 cycles to transfer a short message Buffering IPC L3 Similar.
File Management Systems
G Robert Grimm New York University SGI’s XFS or Cool Pet Tricks with B+ Trees.
User Level Interprocess Communication for Shared Memory Multiprocessor by Bershad, B.N. Anderson, A.E., Lazowska, E.D., and Levy, H.M.
1.1 CAS CS 460/660 Introduction to Database Systems File Organization Slides from UC Berkeley.
Physical Storage Organization. Advanced DatabasesPhysical Storage Organization2 Outline Where and How data are stored? –physical level –logical level.
Frangipani: A Scalable Distributed File System C. A. Thekkath, T. Mann, and E. K. Lee Systems Research Center Digital Equipment Corporation.
Joel Coburn. , Trevor Bunker
Highly Available ACID Memory Vijayshankar Raman. Introduction §Why ACID memory? l non-database apps: want updates to critical data to be atomic and persistent.
Performance Tradeoffs for Static Allocation of Zero-Copy Buffers Pål Halvorsen, Espen Jorde, Karl-André Skevik, Vera Goebel, and Thomas Plagemann Institute.
AN IMPLEMENTATION OF A LOG-STRUCTURED FILE SYSTEM FOR UNIX Margo Seltzer, Harvard U. Keith Bostic, U. C. Berkeley Marshall Kirk McKusick, U. C. Berkeley.
I/O Stack Optimization for Smartphones
Lecture 11: DMBS Internals
Physical Storage Organization. Advanced DatabasesPhysical Storage Organization2 Outline Where and How are data stored? –physical level –logical level.
Chapter 10 Storage and File Structure Yonsei University 2 nd Semester, 2013 Sanghyun Park.
Rensselaer Polytechnic Institute CSCI-4210 – Operating Systems CSCI-6140 – Computer Operating Systems David Goldschmidt, Ph.D.
IT253: Computer Organization
Advanced Topics in Databases Fahime Raja Niloofar Razavi Melody Siadaty Summer 2005 Technical Report II : Main memory Databases.
Resolving Journaling of Journal Anomaly in Android I/O: Multi-Version B-tree with Lazy Split Wook-Hee Kim 1, Beomseok Nam 1, Dongil Park 2, Youjip Won.
Main memory DB PDT Ján GENČI. 2 Obsah Motivation DRDBMS MMDBMS DRDBMS versus MMDBMS Commit processing Support in commercial systems.
Chapter VIIII File Systems Review Questions and Problems Jehan-François Pâris
1 File Systems: Consistency Issues. 2 File Systems: Consistency Issues File systems maintains many data structures  Free list/bit vector  Directories.
Physical Storage Organization. Advanced DatabasesPhysical Storage Organization2 Outline Where and How data are stored? –physical level –logical level.
DMBS Internals I. What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the database concurrently.
Using Uncacheable Memory to Improve Unity Linux Performance
ThyNVM Enabling Software-Transparent Crash Consistency In Persistent Memory Systems Jinglei Ren, Jishen Zhao, Samira Khan, Jongmoo Choi, Yongwei Wu, and.
JOURNALING VERSUS SOFT UPDATES: ASYNCHRONOUS META-DATA PROTECTION IN FILE SYSTEMS Margo I. Seltzer, Harvard Gregory R. Ganger, CMU M. Kirk McKusick Keith.
DMBS Internals I February 24 th, What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the.
DMBS Internals I. What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the database concurrently.
1 Lecture 16: Data Storage Wednesday, November 6, 2006.
1 Transactional Flash Vijayan Prabhakaran, Thomas L. Rodeheffer, Lidong Zhou Microsoft Research, Silicon Valley Presented by Sudharsan.
Free Transactions with Rio Vista Landon Cox April 15, 2016.
Jun-Ki Min. Slide Purpose of Database Recovery ◦ To bring the database into the last consistent stat e, which existed prior to the failure. ◦
Persistent Memory (PM)
Hathi: Durable Transactions for Memory using Flash
CS 540 Database Management Systems
Failure-Atomic Slotted Paging for Persistent Memory
Free Transactions with Rio Vista
Module 11: File Structure
High-performance transactions for persistent memories
Lecture 16: Data Storage Wednesday, November 6, 2006.
Using non-volatile memory (NVDIMM-N) as block storage in Windows Server 2016 Tobias Klima Program Manager.
Performance Measures of Disks
Better I/O Through Byte-Addressable, Persistent Memory
Lecture 11: DMBS Internals
NOVA: A High-Performance, Fault-Tolerant File System for Non-Volatile Main Memories Andiry Xu, Lu Zhang, Amirsaman Memaripour, Akshatha Gangadharaiah,
PASTE: A Networking API for Non-Volatile Main Memory
Module 11: Data Storage Structure
Introduction to Database Systems
Rethink the Sync Ed Nightingale Kaushik Veeraraghavan Peter Chen
Free Transactions with Rio Vista
Overview: File system implementation (cont)
PARAMETER-AWARE I/O MANAGEMENT FOR SOLID STATE DISKS
Chapter 13: Data Storage Structures
Rethink the Sync Ed Nightingale Kaushik Veeraraghavan Peter Chen
Chapter VIIII File Systems Review Questions and Problems
File System Implementation
SQL Statement Logging for Making SQLite Truly Lite
Chapter 13: Data Storage Structures
Chapter 13: Data Storage Structures
The Design and Implementation of a Log-Structured File System
Presentation transcript:

NVWAL: Exploiting NVRAM in Write-Ahead Logging ASPLOS’16, NVWAL: Exploiting NVRAM in Write-Ahead Logging Wook-Hee Kim1, Jinwoong Kim1, Woongki Baek1, Beomseok Nam1, Youjip Won2 1 UNIST (Ulsan National Institute of Science and Technology) 2 Hanyang University

Outline Motivation IO Bottleneck in SQLite NVWAL: Write-Ahead-Logging on NVRAM Byte-granularity differential logging User-level NVRAM management for WAL Transaction-aware lazy synchronization Evaluation Conclusion SQLite can be insensitive to NVRAM write latency. ASPLOS '16, Atlanta, GA.

SQLite data ASPLOS '16, Atlanta, GA.

I/O Bottleneck in Mobile device Small I/O (a few bytes) SQLite Database Journaling & Database page granularity I/O File System Block granularity I/O & File system Journaling Slow storage ASPLOS '16, Atlanta, GA.

How? NVRAM Storage Memory NVRAM Fast Persistence Volatile Cheap Byte-addressable How? ASPLOS '16, Atlanta, GA.

NVWAL (NVRAM Write-Ahead Logging) Byte-granularity differential logging User-level NVRAM management for WAL Transaction-aware lazy synchronization NVRAM Heap Manager DB H/W OS ASPLOS '16, Atlanta, GA.

Differential Logging ASPLOS '16, Atlanta, GA.

Write-Ahead Logging DRAM Buffer Volatile Commit! Non-Volatile ^ WAL File Flash memory Database File ASPLOS '16, Atlanta, GA.

Write-Ahead Logging in NVRAM Commit v v DRAM Buffer NVRAM Log Flash memory Database File ASPLOS '16, Atlanta, GA.

Differential Logging Differential Logging reduces WAL Header Wal frame Wal frame header Unused Wal frame Wal frame header Unused Commit! Wal frame Wal frame header Unused DRAM Buffer Wal frame Wal frame header Unused Differential Logging reduces up to 84% of unnecessary I/O. NVRAM Log ASPLOS '16, Atlanta, GA.

User-level Heap management ASPLOS '16, Atlanta, GA.

NVRAM Block Management DRAM Buffer User level Commit Kernel level nv_malloc() overhead Wal frame Wal frame header nv_malloc() overhead Wal frame Wal frame header nv_malloc() overhead Wal frame Wal frame header nv_malloc() overhead Wal frame Wal frame header NVRAM Log ASPLOS '16, Atlanta, GA.

NVRAM User-level management : in-use p : pending f : free Metadata u f DRAM buffer Kernel space Metadata block WAL Header Wal frame header Commit! Wal frame (Page 3) Wal frame header Wal frame (Page 7) User Space NVRAM Heap Unused next ASPLOS '16, Atlanta, GA.

NVRAM User-level management : in-use p : pending f : free Metadata u p f DRAM buffer Kernel space Metadata block WAL Header Unused Wal frame header Commit! Wal frame (Page 3) b = nv_pre_malloc() is called ! Wal frame header Wal frame (Page 7) User Space NVRAM Heap Unused next next ASPLOS '16, Atlanta, GA.

NVRAM User-level management : in-use p : pending f : free Metadata u f DRAM buffer Kernel space Metadata block WAL Header Unused Wal frame header Wal frame (Page 3) Wal frame header Wal frame (Page 5) Wal frame header Commit! Wal frame (Page 3) Wal frame (Page 8) Wal frame header Unused nv_set_flag(b,in-use) is called ! Wal frame header Wal frame (Page 7) User Space NVRAM Heap Unused next next ASPLOS '16, Atlanta, GA.

Recovery in NVWAL (1) – free in-use free WAL Header Unused Next frame Wal frame header Wal frame (Page 1) Wal frame header Wal frame (Page 7) Unused next ASPLOS '16, Atlanta, GA.

Recovery in NVWAL (2) – pending in-use free pending WAL Header Unused Next frame Wal frame header Wal frame (Page 1) Wal frame header Wal frame (Page 7) Recovery process reverts ‘pending’ to ‘free’ Unused next ASPLOS '16, Atlanta, GA.

Recovery in NVWAL (3) – pending in-use free pending WAL Header Unused Wal frame header Wal frame (Page 1) Wal frame header Wal frame (Page 7) Recovery process deletes pointers to pending blocks Unused next ASPLOS '16, Atlanta, GA.

Recovery in NVWAL (4) in-use pending WAL Header Wal frame header Wal frame (Page 3) Unused Wal frame header Wal frame (Page 1) Wal frame header Wal frame (Page 7) Normal WAL recovery Unused next ASPLOS '16, Atlanta, GA.

Persistency guarantee

Persistency guarantee in Flash W write() F fsync() Insert data W F W F ASPLOS '16, Atlanta, GA.

Eager Synchronization in NVRAM memcpy mb memory barrier CL flush cache line flush Insert data pb persist barrier CL flush mb Enforce ordering CL flush mb Enforce ordering CL flush mb Enforce ordering m m pb C pb ASPLOS '16, Atlanta, GA.

Transaction-Aware Persistency Guarantee in NVRAM memcpy mb memory barrier CL flush cache line flush Insert data pb persist barrier Logging phases Commit phases m mb CL flush mb m mb CL flush mb pb C mb CL flush mb pb ASPLOS '16, Atlanta, GA.

Asynchronous Commit in NVRAM pb memcpy persist barrier mb Chk sum checksum memory barrier CL flush cache line flush Insert data m m mb CL flush CL flush mb Chk sum pb C mb CL flush mb pb ASPLOS '16, Atlanta, GA.

Implement NVWAL in SQLite 3.7.11 Evaluation Implement NVWAL in SQLite 3.7.11 Used in Android 4.4 Tuna NVRAM emulation board with ARM Cortex-A9 DDR3-SDRAM DRAM DDR3-SDRAM NVRAM (Xilinx Zynq SOC) Nexus5 2.26 GHz Snapdragon 800 processor DDR memory we assume that specific address range of DRAM is NVRAM NVRAM latency is emulated by nop operations. Performance Analysis Tools Mobibench ASPLOS '16, Atlanta, GA.

Overhead of Ordering Constraints Memory fence Cache line flush Cache line flush + mfence memcpy Lazy synchronization eliminates up to 23% of persistency overhead. ASPLOS '16, Atlanta, GA.

Differential Logging and I/O Up to 84% of I/O reduced Differential logging eliminates up to 84% of the unnecessary I/O. ASPLOS '16, Atlanta, GA.

Transaction Throughput and NVRAM Latency insensitive 37% improvement 6% improvement 28% improvement User-Level Heap improves 6% of performance. Differential logging yields up to 28% higher throughput. Combining all, we can get up to 37% higher performance. ASPLOS '16, Atlanta, GA.

Transaction Throughput of NVWAL on Nexus5 10x faster Combining all of optimization performs at least 10 times faster when NVRAM latency is smaller than 3us. With our optimization, NVWAL shows performance similar to that of WAL on flash memory when the write latency is set to 230 usec. ASPLOS '16, Atlanta, GA.

Conclusion Strict ordering of memory writes causes unnecessary overhead. Transaction-aware lazy synchronization Leveraging byte-addressability of NVRAM Byte-granularity differential logging User-level NVRAM heap manager Via the optimizations, we make application performance insensitive to the NVRAM write latency. 400 nsec → 1900 nsec NVRAM latency results in only 4% performance degradation ASPLOS '16, Atlanta, GA.

Thank You ASPLOS '16, Atlanta, GA.