An Analysis of Persistent Memory Use with WHISPER

Slides:



Advertisements
Similar presentations
Software & Services Group PinPlay: A Framework for Deterministic Replay and Reproducible Analysis of Parallel Programs Harish Patil, Cristiano Pereira,
Advertisements

Better I/O Through Byte-Addressable, Persistent Memory
Monitoring Data Structures Using Hardware Transactional Memory Shakeel Butt 1, Vinod Ganapathy 1, Arati Baliga 2 and Mihai Christodorescu 3 1 Rutgers University,
A File is Not a File: Understanding the I/O Behavior of Apple Desktop Applications Tyler Harter, Chris Dragga, Michael Vaughn, Andrea C. Arpaci-Dusseau,
Memory Management (II)
Sinfonia: A New Paradigm for Building Scalable Distributed Systems Marcos K. Aguilera, Arif Merchant, Mehul Shah, Alistair Veitch, Christonos Karamanolis.
Supporting Nested Transactional Memory in LogTM Authors Michelle J Moravan Mark Hill Jayaram Bobba Ben Liblit Kevin Moore Michael Swift Luke Yen David.
Ext3 Journaling File System “absolute consistency of the filesystem in every respect after a reboot, with no loss of existing functionality” chadd williams.
Unbounded Transactional Memory Paper by Ananian et al. of MIT CSAIL Presented by Daniel.
The SNIA NVM Programming Model
LogTM: Log-Based Transactional Memory Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, & David A. Wood Presented by Colleen Lewis.
CEPH: A SCALABLE, HIGH-PERFORMANCE DISTRIBUTED FILE SYSTEM S. A. Weil, S. A. Brandt, E. L. Miller D. D. E. Long, C. Maltzahn U. C. Santa Cruz OSDI 2006.
1 File Systems: Consistency Issues. 2 File Systems: Consistency Issues File systems maintains many data structures  Free list/bit vector  Directories.
Full and Para Virtualization
Preface 1Performance Tuning Methodology: A Review Course Structure 1-2 Lesson Objective 1-3 Concepts 1-4 Determining the Worst Bottleneck 1-5 Understanding.
Exploiting Fine-Grained Data Parallelism with Chip Multiprocessors and Fast Barriers Jack Sampson*, Rubén González†, Jean-Francois Collard¤, Norman P.
2015 Storage Developer Conference. © Intel Corporation. All Rights Reserved. RDMA with PMEM Software mechanisms for enabling access to remote persistent.
Free Transactions with Rio Vista Landon Cox April 15, 2016.
File System Consistency
Persistent Memory (PM)
NVMOVE: Helping Programmers Move to Byte-based Persistence

Hathi: Durable Transactions for Memory using Flash
Hands-Off Persistence System (HOPS)
Failure-Atomic Slotted Paging for Persistent Memory
Free Transactions with Rio Vista
On-Line Transaction Processing
An Analysis of Persistent Memory Use with WHISPER
CSE 451: Operating Systems
Memory Protection: Kernel and User Address Spaces
Chapter 1: A Tour of Computer Systems
High-performance transactions for persistent memories
Lecture 16: Data Storage Wednesday, November 6, 2006.
CS703 - Advanced Operating Systems
Multiscalar Processors
Using non-volatile memory (NVDIMM-N) as block storage in Windows Server 2016 Tobias Klima Program Manager.
Day 12 Threads.
Operational & Analytical Database
Isotope: Transactional Isolation for Block Storage
HPE Persistent Memory Microsoft Ignite 2017
Journaling File Systems
Microsoft Build /12/2018 5:05 AM Using non-volatile memory (NVDIMM-N) as byte-addressable storage in Windows Server 2016 Tobias Klima Program Manager.
Better I/O Through Byte-Addressable, Persistent Memory
Lecture 11: DMBS Internals
NOVA: A High-Performance, Fault-Tolerant File System for Non-Volatile Main Memories Andiry Xu, Lu Zhang, Amirsaman Memaripour, Akshatha Gangadharaiah,
Persistency for Synchronization-Free Regions
Introduction to Operating Systems
Temporally Silent Stores (Alternatively: Louder Silent Stores)
Persistent Memory Transactions
Lecture 14 Virtual Memory and the Alpha Memory Hierarchy
Predictive Performance
Reducing Memory Reference Energy with Opportunistic Virtual Caching
Chapter 3: Windows7 Part 3.
Microsoft Virtual Academy
Free Transactions with Rio Vista
CSE 451: Operating Systems Spring 2011 Journaling File Systems
Virtual Memory CSCI 380: Operating Systems Lecture #7 -- Review and Lab Suggestions William Killian.
NVSTREAM:NVRAM-based Transport for Streaming Objects
CSE 542: Operating Systems
Lecture 20: Intro to Transactions & Logging II
CSE 451 Fall 2003 Section 11/20/2003.
CSE 451: Operating Systems Spring 2008 Module 14
Relaxed Consistency Finale
PMTest: A Fast and Flexible Testing Framework for Persistent Memory Programs Sihang Liu1, Yizhou Wei1, Jishen Zhao2, Aasheesh Kolli3,4, and Samira Khan1.
Foundations and Definitions
Problems with Locks Andrew Whitaker CSE451.
SQL Statement Logging for Making SQLite Truly Lite
Janus Optimizing Memory and Storage Support for Non-Volatile Memory Systems Sihang Liu Korakit Seemakhupt, Gennady Pekhimenko, Aasheesh Kolli, and Samira.
Testing Persistent Memory Applications
CSE 451: Operating Systems Spring 2010 Module 14
Presentation transcript:

An Analysis of Persistent Memory Use with WHISPER Sanketh Nalli, Swapnil Haria Michael M. Swift, Mark D. Hill Haris Volos*, Kimberly Keeton* University of Wisconsin-Madison & *Hewlett-Packard Labs (HPL) Good Morning, I am Sanketh & Today we’ll see how applications use persistent memory. This is joint work with Swapnil, my advisors Mike and Mark and HP Enterprise.

WHISPER Facilitate better system support for Persistent Memory Wisconsin-HP Labs Suite for Persistence Discovered behaviors: 4% accesses to PM, 96% accesses to DRAM 5-50 epochs/tx, contributed by memory allocation & logging Re-referencing PM cachelines: Rare across threads, common within a thread WHISPER: research.cs.wisc.edu/multifacet/whisper

Outline WHISPER: Wisconsin-HP Labs Suite for Persistence Analysis Results

Persistent Memory is coming soon Persistent Memory = NVM attached to CPU on memory bus Offers low latency reads and persistent writes Allows user-level, byte-addressable loads and stores Font size : 24 Color consistency No space before colon

What guarantees do applications need ? Durability = Content is available to user after failure Consistency = Content is recoverable and usable after failure Data PM CACHE Data Pointer Pointer Pointer 1 . Data update followed by pointer update in cache 2. Pointer is evicted from cache to PM 3. Data lost on failure, dangling pointer persists

Achieving consistency Data PM Data Data Data flush Pointer flush Pointer 2 . Flush data update to PM 3 . Store pointer update in cache 4 . Flush pointer update to PM 1 . Store data update in cache Ordering = Useful building block of consistency mechanisms Epoch = Set of writes to PM guaranteed to be durable before ANY writes to PM in following epochs become durable Ordering primitives: sfence, mfence of x86-64

PM systems for consistency Native Application-specific optimizations Persistent Heaps Atomic allocations, type safety PM-aware Filesystems POSIX interface Application TX TX load/ store NVML Mnemosyne read/write load/store VFS ext4-DAX PMFS Persistent Memory (PM)

What’s the problem ? Lack of standard workloads slows research Micro-benchmarks may not be representative Partial understanding of how applications use PM

WHISPER Benchmark Type Brief description (*Adapted to PM) N-store* Database H-store like DB. Undo logs for consistency Echo* KV store Scalable, multi-version key-value store Memcached* Mnemosyne Distributed key-value store Vacation* Online travel reservation system Redis NVML REmote Dictionary Service C-tree Microbenchmarks for simulations Hashmap NFS PMFS Linux server/client for remote file access Exim Mail server;stores mails in per-user file MySQL Widely used RDBMS for OLTP

Outline ✔ WHISPER: Wisconsin-HP Labs Suite for Persistence Analysis Results ✔

Identify writes to PM PIN for userspace, mmiotrace for the kernel PIN/mmiotrace PM Application PM Runtime PM Writes Identify Instrument Execute Trace Analyze Stats PIN for userspace, mmiotrace for the kernel On average, 101 lines in applications that update PM 67 line in the kernel that update PM

Instrument writes to PM PIN/mmiotrace PM Application PM Runtime PM Writes Identify Instrument Execute Trace Analyze Stats C macros capture all modes of updating PM and, PM transaction start/end, cacheline flushes, fences Example: Update and persist size of filesystem journal log.size = size; flush_buffer(log.size); asm(“sfence”); PM_SET(log.size, size); PM_FLUSH(log.size, 8); PM_FENCE();

Execute and Analyze Python analyzer and dependency-checker PIN/mmiotrace PM Application PM Runtime PM Writes Identify Instrument Execute Trace Analyze Stats Python analyzer and dependency-checker Analyze trace for several statistics Number of epochs/tx Epoch dependencies Epoch sizes

Outline ✔ ✔ WHISPER: Wisconsin-HP Labs Suite for Persistence Analysis Results ✔ ✔

Suggestion: Do not impede volatile accesses How many accesses to PM ? Suggestion: Do not impede volatile accesses

How many epochs/transaction ? Durability after every epoch is impedes execution Assumption: 3 epochs/TX = log + data + commit Reality: 5 to 50 epochs/TX Highest rate of epochs: Native & TM libraries Suggestion: Enforce durability only at the end of a transaction

How large are epochs typically ? # of 64B cachelines Determines amount of state buffered per epoch Small epochs are abundant 75% update single cacheline Large epochs in PMFS Suggestion: Consider optimizing for small epochs

What contributes to epochs ? Log entries Undo log: Alternating epochs of log and data Redo log: 1 Log epoch + 1 data epoch Persistent memory allocation 1 to 5 epochs Suggestion: Use redo logs and reduce epochs from memory allocator

What are epoch dependencies ? 1 Self-dependency: B  D Cross-dependency: 2  C Why do they matter ? Dependency can stall execution Measured dependencies in 50 us window A B 2 C 3 D Thread 1 Thread 2

How common are dependencies ? Suggestion: Design multi-versioned caches OR avoid updating same cacheline across epochs

research.cs.wisc.edu/multifacet/whisper/ Summary WHISPER: Wisconsin-HP Labs Suite for Persistence 4% accesses to PM, 96% accesses to DRAM 5-50 epochs/TX, primarily small in size Memory allocation, logging introduce extra epochs Cross-dependencies rare, self-dependencies common More results in ASPLOS’17 paper and code at: research.cs.wisc.edu/multifacet/whisper/

Extra

A Simple Transaction using Epochs transaction_begin: log[pobj.init] ← True log[pobj.data] ← 42 write_back(log) wait_for_write_back() pobj.init ← True pobj.data ← 42 write_back(pobj) transaction_end; Epoch 1 Log entries stored & persisted. TM_BEGIN(); pobj.data = 42; pobj.init = True; TM_END(); Epoch 2 Variables stored & persisted.

Runtimes cause write amplification PMFS Mnemosyne Logs every PM write NVML Clears log Auxiliary structures < 5% writes to PM Non-temporal writes Mnemosyne logs PMFS user-data PM is vaporware, will come soon but not from intel Don’t do whisper 2.0, look at something else as part of your grad school Contribution. Benchmarks, allocators are not that exciting. Intel is doing Rudimentary stuff. Language level persistence, compiler level persistence Is an open problem. Compare diff tx libs and say why are they different Or why are they same. Increasing server throuhput is more important than User perceived latency. Msr uses battery backed dram. Companies don’t want To scale out. They don’t want to buy more servers but pack them with more memory, To save cost and fill as much memory in a tiny space. Look at persistent memory manager. With big memory servers, mapping is a problem. Windows has DAX. Micron has shown interest, not apple. Facebook was more concerned about tail latency. Os kernel team google