Persistent Memory Transactions

Slides:



Advertisements
Similar presentations
1 CSIS 7102 Spring 2004 Lecture 9: Recovery (approaches) Dr. King-Ip Lin.
Advertisements

Better I/O Through Byte-Addressable, Persistent Memory
Journaling of Journal Is (Almost) Free Kai Shen Stan Park* Meng Zhu University of Rochester * Currently affiliated with HP Labs FAST
Caching and Virtual Memory. Main Points Cache concept – Hardware vs. software caches When caches work and when they don’t – Spatial/temporal locality.
1 CSIS 7102 Spring 2004 Lecture 8: Recovery (overview) Dr. King-Ip Lin.
Jan. 2014Dr. Yangjun Chen ACS Database recovery techniques (Ch. 21, 3 rd ed. – Ch. 19, 4 th and 5 th ed. – Ch. 23, 6 th ed.)
Spring 2003CSE P5481 Introduction Why memory subsystem design is important CPU speeds increase 55% per year DRAM speeds increase 3% per year rate of increase.
1 COMP 206: Computer Architecture and Implementation Montek Singh Mon, Oct 31, 2005 Topic: Memory Hierarchy Design (HP3 Ch. 5) (Caches, Main Memory and.
Chapter 8 : Transaction Management. u Function and importance of transactions. u Properties of transactions. u Concurrency Control – Meaning of serializability.
Crash recovery All-or-nothing atomicity & logging.
Caching and Demand-Paged Virtual Memory
The SNIA NVM Programming Model
NVM Programming Model. 2 Emerging Persistent Memory Technologies Phase change memory Heat changes memory cells between crystalline and amorphous states.
Highly Available ACID Memory Vijayshankar Raman. Introduction §Why ACID memory? l non-database apps: want updates to critical data to be atomic and persistent.
Caching and Virtual Memory. Main Points Cache concept – Hardware vs. software caches When caches work and when they don’t – Spatial/temporal locality.
Selecting and Implementing An Embedded Database System Presented by Jeff Webb March 2005 Article written by Michael Olson IEEE Software, 2000.
1 Module 6 Log Manager COP Log Manager Log knows everything. It is the temporal database –The online durable data are just their current versions.
Modularizing B+-trees: Three-Level B+-trees Work Fine Shigero Sasaki* and Takuya Araki NEC Corporation * currently with 1st Nexpire Inc.
Chapter 8 – Main Memory (Pgs ). Overview  Everything to do with memory is complicated by the fact that more than 1 program can be in memory.
1 CS 430 Database Theory Winter 2005 Lecture 16: Inside a DBMS.
Resolving Journaling of Journal Anomaly in Android I/O: Multi-Version B-tree with Lazy Split Wook-Hee Kim 1, Beomseok Nam 1, Dongil Park 2, Youjip Won.
1 File Systems: Consistency Issues. 2 File Systems: Consistency Issues File systems maintains many data structures  Free list/bit vector  Directories.
Design of Flash-Based DBMS: An In-Page Logging Approach Sang-Won Lee and Bongki Moon Presented by Chris Homan.
Memory Hierarchy Adaptivity An Architectural Perspective Alex Veidenbaum AMRM Project sponsored by DARPA/ITO.
ThyNVM Enabling Software-Transparent Crash Consistency In Persistent Memory Systems Jinglei Ren, Jishen Zhao, Samira Khan, Jongmoo Choi, Yongwei Wu, and.
Jiahao Chen, Yuhui Deng, Zhan Huang 1 ICA3PP2015: The 15th International Conference on Algorithms and Architectures for Parallel Processing. zhangjiajie,
2015 Storage Developer Conference. © Intel Corporation. All Rights Reserved. RDMA with PMEM Software mechanisms for enabling access to remote persistent.
NVWAL: Exploiting NVRAM in Write-Ahead Logging
Free Transactions with Rio Vista Landon Cox April 15, 2016.
Persistent Memory (PM)

Database recovery techniques
Hathi: Durable Transactions for Memory using Flash
An Analysis of Persistent Memory Use with WHISPER
Mihai Burcea, J. Gregory Steffan, Cristiana Amza
Memory Hierarchy Ideal memory is fast, large, and inexpensive
DURABILITY OF TRANSACTIONS AND CRASH RECOVERY
ARIES: Algorithm for Recovery and Isolation Exploiting Semantics
Failure-Atomic Slotted Paging for Persistent Memory
Free Transactions with Rio Vista
Jonathan Walpole Computer Science Portland State University
An Analysis of Persistent Memory Use with WHISPER
Transactional Recovery and Checkpoints
High-performance transactions for persistent memories
Enforcing the Atomic and Durable Properties
PHyTM: Persistent Hybrid Transactional Memory
Database Applications (15-415) DBMS Internals- Part XIII Lecture 22, November 15, 2016 Mohammad Hammoud.
A Technical Overview of Microsoft® SQL Server™ 2005 High Availability Beta 2 Matthew Stephen IT Pro Evangelist (SQL Server)
Cache Memory Presentation I
Jason F. Cantin, Mikko H. Lipasti, and James E. Smith
Lecture 23: Cache, Memory, Security
Better I/O Through Byte-Addressable, Persistent Memory
Recovery I: The Log and Write-Ahead Logging
Lecture 23: Cache, Memory, Virtual Memory
Lecture 22: Cache Hierarchies, Memory
Database Applications (15-415) DBMS Internals- Part XIII Lecture 25, April 15, 2018 Mohammad Hammoud.
Free Transactions with Rio Vista
Printed on Monday, December 31, 2018 at 2:03 PM.
CSE 351: The Hardware/Software Interface
Morgan Kaufmann Publishers Memory Hierarchy: Cache Basics
NVSTREAM:NVRAM-based Transport for Streaming Objects
Lecture 22: Cache Hierarchies, Memory
Lecture 20: Intro to Transactions & Logging II
Database Applications (15-415) DBMS Internals- Part XIII Lecture 24, April 14, 2016 Mohammad Hammoud.
Lecture 23: Transactional Memory
SQL Statement Logging for Making SQLite Truly Lite
Dynamic Performance Tuning of Word-Based Software Transactional Memory
Janus Optimizing Memory and Storage Support for Non-Volatile Memory Systems Sihang Liu Korakit Seemakhupt, Gennady Pekhimenko, Aasheesh Kolli, and Samira.
Testing Persistent Memory Applications
Lecture 29: File Systems (cont'd)
Presentation transcript:

Persistent Memory Transactions Virendra J. Marathe, Achin Mishra, Amee Trivedi, Yihe Huang, Faisal Zaghloul, Sanidhya Kashyap, Margo Seltzer, Tim Harris, Steve Byan, Bill Bridge1, and Dave Dice Oracle Labs 1Oracle Persistent Memory: What is it? Looks like DRAM Byte-addressable DIMMs across memory bus Cacheable Low latency: 1000X lower than NAND flash (3D XPoint); projected to eventually eclipse DRAM performance (e.g. STT-MRAM, Carbon-NanoTubes, etc.) Acts like Storage: State persists across restarts Dense: projections better than DRAM (~10X, 3D XPoint) Cost expected to be between DRAM and NAND flash Persistent Memory Access Primitives Load/Store instructions (identical to load/store instructions for DRAM) Persisting stores clflush: Legacy cache line flush clflush-opt: New lightweight instruction to do the flushing more asynchronously clwb: Cache-line writeback; retains does not evict cache line sfence/mfence/etc.: Instruction with fence semantics will guarantee all prior writebacks and flushes have taken effect ntstore: Non-temporal store to memory controller buffers pcommit: Deprecated instruction for stronger persistence Writing programs that can recover from failures is hard; we need a better programming model: Transactions to the rescue!! Transactions for Persistent Memory Several runtime systems proposed Mnemosyne, NV-Heaps, NVM-Direct, NVML’s libpmemobj, SoftWRaP,etc. Implementations siding with either redo logging or undo logging for transactional writes Arguments largely based on intuition, or inadequate evaluation No comprehensive evaluation of performance trade offs over wide swath of parameters Workload’s access patterns Persistence barrier latencies Cache locality of the runtimes Transaction runtime bookkeeping overheads Our work does this broad performance analysis Lesson I: Mods can be contagious Persistence Domains: What does it take to persist updates? Persistent Memory Transactions: API and Runtimes Persistence Domain: Portion of the memory hierarchy that is effectively persistent DIMM-only (PDOM-0) DIMM + memory controller buffers (PDOM-1) Entire memory hierarchy (PDOM-2) Transactional API macros: TXN_BEGIN, TXN_COMMIT, TXN_READ, TXN_WRITE, and other accessors Transactional Writes and Persist Barriers Transaction T Undo log Redo log Copy-on-write old A new A ptr(A) Transactional Writes: Trade Offs Write A new B ptr(B) Write B old B Txn Writes Pros Cons Undolog Reads are uninstrumented loads Too many persist barriers Redolog Constant (4) pbarriers Need to instrument reads Redolog lookups for read-after-write scenarios Copy-On-Write (COW) Fixed lookup overhead for read-after-write scenarios Increased memory management overhead or memory footprint Adverse effects on cache locality A async writeback, flush Old New Operations Persistence Domains PDOM-0 PDOM-1 PDOM-2 Write store clwb/clflushopt Store Ordering Persists sfence pcommit nop pbarrier N writes Commit O(N) pbarriers + 4 pbarriers for Commit 4 pbarriers for Commit 4 pbarriers for Commit (a) (b) (c) Evaluation Intel Software Emulation Platform Emulated broad range of load and persist barrier latencies (0—1000 nsecs) Reporting 300 nsec load latency and 500 (PDOM-0), 100 (PDOM-1), and 0 (PDOM-2) nsec persist barrier latencies to approximate the 3 persistence domains Microbenchmarking Simple 2-D array of 64-bit integers read/updated using transaction Test runs configured to cover broad range of access granularities, read/write mixes, and transaction lengths Array: Transaction either reads or increments all integers in a slot, via a single coarse grain read-increment-update per slot. Each transaction touches 20 slots. Array-RAW increments each individual integer in a slot, thereby increasing read-after-write instances. Persistent Memcached Comprehensive re-write of Memcached for “instant warmup” COW has significant programmability challenges Concurrent updates to a single object not supported COW requires its own allocator; Memcached has its own Decided not to port Memcached to COW’s interface Transactions are complex Get performs 25 reads and 5 writes Put performs 80-90 reads and 24 writes (most writes are read-after-write instances) Fits with the Array-RAW benchmark with 25-30% read-after-writes Performance behavior is similar to Array-RAW Undo logging scales worse than redo logging for 50% reads for longer persist barrier latencies (PDOM-0 and PDOM-1) The bar charts explain: Put transactions take longer, which exacerbates contention on Memcached’s locks (a) Throughput – 90% reads (b) Throughput – 50% reads (c) Get/Put Latency – 90% reads (d) Get/Put Latency – 50% reads 250k 200k 150k 100k 50k 0k 250 200 150 100 50 500 400 300 200 100 Get Put 80k 60k 40k 20k 0k Operations/sec Undo/PDOM-0 Redo/PDOM-0 Undo/PDOM-1 Transactions latency (usecs) Undo/PDOM-2 Redo/PDOM-2 Undo/P0 Redo/P0 Undo/P1 Redo/P1 Undo/P2 Redo/P2 Undo/P0 Redo/P0 Undo/P1 Redo/P1 Undo/P2 Redo/P2 2 4 6 8 2 4 6 8 SQLite A lightweight database; supports ACID transactions, uses rollback/write-ahead log in the Commit() function Our version: Minor mod to Commit() to directly write dirty pages to database file (using our transactions) This is a write-only transaction; big write set with coarse writes In-memory is best case performance; transactions help come close to 10-15% Undo log incurs checksum overheads #cores Conclusions No single transaction runtime is a clear winner A complex interplay between multiple factors Workload read/write access patterns Persistence Domains Cache Locality Transaction runtime bookkeeping overheads COW appears to be a non-starter, but choice between undo and redo logging based runtimes is non-trivial Recent work (Dude-TM, Kamino-Tx) has pushed the performance envelope further by leveraging DRAM as a cache Would be interesting to add those works in the mix here #cores Persistent Key-Value Store Contains central growable persistent hash table Used transactions for consistent updates Base version, required for COW, instruments all accesses to persistent objects Optimized version avoids some transactional instrumentation Transactions are short; no significant difference between undo and redo logging COW performs worst because of extra level of indirection and greater memory churn (a) Throughput – 90% reads (b) Throughput – 50% reads In-memory Rollback WAL Undo Redo 1000K 800K 600K 400K 200K 0K 1000K 800K 600K 400K 200K 0K Operations / sec DRAM Redo-opt Redo Undo-opt Undo COW Stock PDOM-0 PDOM-1 PDOM-2