Loose-Ordering Consistency for Persistent Memory

Slides:



Advertisements
Similar presentations
Chapter 16: Recovery System
Advertisements

Crash Recovery John Ortiz. Lecture 22Crash Recovery2 Review: The ACID properties  Atomicity: All actions in the transaction happen, or none happens 
Coherence Ordering for Ring-based Chip Multiprocessors Mike Marty and Mark D. Hill University of Wisconsin-Madison.
Orchestrated Scheduling and Prefetching for GPGPUs Adwait Jog, Onur Kayiran, Asit Mishra, Mahmut Kandemir, Onur Mutlu, Ravi Iyer, Chita Das.
Principles of Transaction Management. Outline Transaction concepts & protocols Performance impact of concurrency control Performance tuning.
Better I/O Through Byte-Addressable, Persistent Memory
Is SC + ILP = RC? Presented by Vamshi Kadaru Chris Gniady, Babak Falsafi, and T. N. VijayKumar - Purdue University Spring 2005: CS 7968 Parallel Computer.
Crash Recovery, Part 1 If you are going to be in the logging business, one of the things that you have to do is to learn about heavy equipment. Robert.
1 Crash Recovery Chapter Review: The ACID properties  A  A tomicity: All actions of the Xact happen, or none happen.  C  C onsistency: If each.
© Dennis Shasha, Philippe Bonnet 2001 Log Tuning.
1 CSIS 7102 Spring 2004 Lecture 8: Recovery (overview) Dr. King-Ip Lin.
Jinze Liu. Have studied C.C. mechanisms used in practice - 2 PL - Multiple granularity - Tree (index) protocols - Validation.
CSCI 3140 Module 8 – Database Recovery Theodore Chiasson Dalhousie University.
Chapter 19 Database Recovery Techniques
CMPT 401 Summer 2007 Dr. Alexandra Fedorova Lecture X: Transactions.
Thread-Level Transactional Memory Decoupling Interface and Implementation UW Computer Architecture Affiliates Conference Kevin Moore October 21, 2004.
Steven Pelley, Peter M. Chen, Thomas F. Wenisch University of Michigan
Jan. 2014Dr. Yangjun Chen ACS Database recovery techniques (Ch. 21, 3 rd ed. – Ch. 19, 4 th and 5 th ed. – Ch. 23, 6 th ed.)
CMPT Dr. Alexandra Fedorova Lecture X: Transactions.
More on transactions…. Dealing with concurrency (OR: how to handle the pressure!) Locking Timestamp ordering Multiversion protocols Optimistic protocols.
Recovery Fall 2006McFadyen Concepts Failures are either: catastrophic to recover one restores the database using a past copy, followed by redoing.
1 Tashkent: Uniting Durability & Ordering in Replicated Databases Sameh Elnikety, EPFL Steven Dropsho, EPFL Fernando Pedone, USI.
1 Anna Östlin Pagh and Rasmus Pagh IT University of Copenhagen Advanced Database Technology March 25, 2004 SYSTEM FAILURES Lecture based on [GUW ,
Techniques for Efficient Processing in Runahead Execution Engines Onur Mutlu Hyesoon Kim Yale N. Patt.
©Silberschatz, Korth and Sudarshan17.1Database System Concepts Chapter 17: Recovery System Failure Classification Storage Structure Recovery and Atomicity.
1 Coordinated Control of Multiple Prefetchers in Multi-Core Systems Eiman Ebrahimi * Onur Mutlu ‡ Chang Joo Lee * Yale N. Patt * * HPS Research Group The.
Transactions and concurrency control
Distributed DBMSPage © 1998 M. Tamer Özsu & Patrick Valduriez Outline Introduction Background Distributed DBMS Architecture Distributed Database.
A Lightweight Transactional Design in Flash-based SSDs to Support Flexible Transactions Youyou Lu 1, Jiwu Shu 1, Jia Guo 1, Shuai Li 1, Onur Mutlu 2 LightTx:
Distributed Deadlocks and Transaction Recovery.
JOURNALING VERSUS SOFT UPDATES: ASYNCHRONOUS META-DATA PROTECTION IN FILE SYSTEMS Margo I. Seltzer, Harvard Gregory R. Ganger, CMU M. Kirk McKusick Keith.
Journaling vs Soft Updates: Asynchronous Metadata Protection in File Systems Margo I. Seltzer, Harvard, Gregory R. Ganger CMU, M. Kirk McKusick, Keith.
Optimized Transaction Time Versioning Inside a Database Engine Intern: Feifei Li, Boston University Mentor: David Lomet, MSR.
@ Carnegie Mellon Databases Inspector Joins Shimin Chen Phillip B. Gibbons Todd C. Mowry Anastassia Ailamaki 2 Carnegie Mellon University Intel Research.
Lecture 12 Recoverability and failure. 2 Optimistic Techniques Based on assumption that conflict is rare and more efficient to let transactions proceed.
ReSlice: Selective Re-execution of Long-retired Misspeculated Instructions Using Forward Slicing Smruti R. Sarangi, Wei Liu, Josep Torrellas, Yuanyuan.
Resolving Journaling of Journal Anomaly in Android I/O: Multi-Version B-tree with Lazy Split Wook-Hee Kim 1, Beomseok Nam 1, Dongil Park 2, Youjip Won.
1 File Systems: Consistency Issues. 2 File Systems: Consistency Issues File systems maintains many data structures  Free list/bit vector  Directories.
Log-structured Memory for DRAM-based Storage Stephen Rumble, John Ousterhout Center for Future Architectures Research Storage3.2: Architectures.
Chapter 16 Recovery Yonsei University 1 st Semester, 2015 Sanghyun Park.
Introduction to Database Systems1. 2 Basic Definitions Mini-world Some part of the real world about which data is stored in a database. Data Known facts.
Virtual Hierarchies to Support Server Consolidation Mike Marty Mark Hill University of Wisconsin-Madison ISCA 2007.
Chapter 10 Recovery System. ACID Properties  Atomicity. Either all operations of the transaction are properly reflected in the database or none are.
Carnegie Mellon Carnegie Mellon Univ. Dept. of Computer Science Database Applications C. Faloutsos Recovery.
Outline for Today Journaling vs. Soft Updates Administrative.
A Lightweight Transactional Design in Flash-based SSDs to Support Flexible Transactions Youyou Lu 1, Jiwu Shu 1, Jia Guo 1, Shuai Li 1, Onur Mutlu 2 LightTx:
ThyNVM Enabling Software-Transparent Crash Consistency In Persistent Memory Systems Jinglei Ren, Jishen Zhao, Samira Khan, Jongmoo Choi, Yongwei Wu, and.
JOURNALING VERSUS SOFT UPDATES: ASYNCHRONOUS META-DATA PROTECTION IN FILE SYSTEMS Margo I. Seltzer, Harvard Gregory R. Ganger, CMU M. Kirk McKusick Keith.
Motivation for Recovery Atomicity: –Transactions may abort (“Rollback”). Durability: –What if DBMS stops running? (Causes?) crash! v Desired Behavior after.
Dynamic Verification of Sequential Consistency Albert Meixner Daniel J. Sorin Dept. of Computer Dept. of Electrical and Science Computer Engineering Duke.
Quantifying and Controlling Impact of Interference at Shared Caches and Main Memory Lavanya Subramanian, Vivek Seshadri, Arnab Ghosh, Samira Khan, Onur.
Transactional Flash V. Prabhakaran, T. L. Rodeheffer, L. Zhou (MSR, Silicon Valley), OSDI 2008 Shimin Chen Big Data Reading Group.
2015 Storage Developer Conference. © Intel Corporation. All Rights Reserved. RDMA with PMEM Software mechanisms for enabling access to remote persistent.
What Should a DBMS Do? Store large amounts of data Process queries efficiently Allow multiple users to access the database concurrently and safely. Provide.
NVWAL: Exploiting NVRAM in Write-Ahead Logging
Free Transactions with Rio Vista Landon Cox April 15, 2016.
Persistent Memory (PM)

DURABILITY OF TRANSACTIONS AND CRASH RECOVERY
Failure-Atomic Slotted Paging for Persistent Memory
Free Transactions with Rio Vista
High-performance transactions for persistent memories
PHyTM: Persistent Hybrid Transactional Memory
Better I/O Through Byte-Addressable, Persistent Memory
Persistency for Synchronization-Free Regions
HyperLoop: Group-Based NIC Offloading to Accelerate Replicated Transactions in Multi-tenant Storage Systems Daehyeok Kim Amirsaman Memaripour, Anirudh.
Free Transactions with Rio Vista
Printed on Monday, December 31, 2018 at 2:03 PM.
Outline Introduction Background Distributed DBMS Architecture
Presentation transcript:

Loose-Ordering Consistency for Persistent Memory Youyou Lu1, Jiwu Shu1, Long Sun1, Onur Mutlu2 Hello, everyone. I am Youyou Lu from Tsinghua University. Today, I will present our paper “Loose-Ordering Consistency for Persistent Memory”. This is a joint work with Jiwu Shu, Long Sun from Tsinghua University and Onur Mutlu from Carnegie Mellon University. 1Tsinghua University 2Carnegie Mellon University

Summary Problem: Strict write ordering required for storage consistency dramatically degrades performance in persistent memory Our Goal: To keep the performance overhead low while maintaining the storage consistency Key Idea: To Loosen the persistence ordering with hardware support Eager commit: A commit protocol that eliminates the use of commit record, by reorganizing the memory log structure Speculative persistence: Allows out-of-order persistence to persistent memory, but ensures in-order commit in programs, leveraging the tracking of transaction dependencies and the support of multi-versioning in the CPU cache Results: Reduces average performance overhead of persistence ordering from 67% to 35% In this paper, we work on the performance aspect of persistent memory while maintaining storage consistency. The problem is that strict write ordering required for storage consistency dramatically degrades performance in persistent memory. Our goal in this paper is to keep the performance overhead low while maintaining the storage consistency. Our key idea is to loosen the persistence ordering with hardware support. We call our approach Loose-Ordering Consistency, the LOC. We use two techniques in our approach. The first is Eager Commit. Eager Commit is a new commit protocol. It eliminates the use of commit record. We achieve this by reorganizing the memory log structure. The second is Speculative Persistence. Speculative Persistence allows out-of-order persistence to persistent memory, but ensures in-order commit in programs. We achieve this by leveraging the tracking of transaction dependencies and the support of multi-versioning in the CPU cache. Our evaluation results show that our approach reduces average performance overhead of persistence ordering from 67% to 35%.

Outline Introduction and Background Existing Approaches Our Approach: Loose-Ordering Consistency Eager Commit Speculative Persistence Evaluation Conclusion In this talk, I will first give the introduction and the background. And then, I will discuss the existing approaches. After that, I will present our approach – Loose-Ordering Consistency, including the Eager Commit and Speculative Persistence techniques. Finally, I will give the evaluation results and the conclusion.

Outline Introduction and Background Existing Approaches Our Approach: Loose-Ordering Consistency Eager Commit Speculative Persistence Evaluation Conclusion Let’s first come to the introduction and background.

Persistent Memory Persistent Memory Storage Consistency Memory-level storage: Use non-volatile memory in main memory level to provide data persistence Storage Consistency Atomicity and Durability: Recoverable from unexpected failures Boundary of volatility and persistence moved from Storage/Memory to Memory/Cache LLC L2 L1 LLC L2 L1 Our paper talks about persistent memory. Persistent memory is also called memory-level storage. It uses non-volatile memory in main memory level to provide data persistence. The left figure shows the architecture of traditional storage systems, and the right figure shows the architecture of persistent memory. In persistent memory, the memory provides data durability as well as the disk storage. In storage systems, storage consistency is an important design issue. It provides atomicity and durability of data writes, so that the storage system is recoverable even after unexpected failures. Persistent memory needs storage consistency, too. But the difference in ensuring storage consistency between disk-based storage systems and persistent memory is that: the boundary of volatility and persistence has moved from the interface of Storage and Memory to the interface of Memory and the CPU Cache. Let me show a simple example. In disk-based storage systems, … In persistent memory, … Memory (DRAM) Memory (NVM) Disk Storage Disk Storage

Storage Consistency – Write-Ahead Logging(WAL) F I J’ J M’ O’ P’ J’ M’ M N O O’ P’ Step 1. Log Write Step 2. Commit Record Write Step 3. In-place Write Step 4. Log Truncation Intra-tx Ordering Let’s use an example to look at the logical view of the Write-Ahead Logging (WAL). WAL is a heavily used protocol for storage consistency. In the example, we want to update the green nodes in the tree and inserts a new node. 1) The first step of WAL is to write them to the log, (named log write). 2) After the data blocks have been written to the log area completely, a commit record is written to indicate the completeness of a tx. The commit record is used during recovery to determine the tx completeness. 3) Only after the commit record is written, those updated data blocks can be written to their home locations, as shown in Step 3. 4) Finally, the log can be truncated after the updated data blocks have been written completely to their home locations. In the four steps, a transaction can send an acknowledge message to programs only after the commit record write. At that time, the durability and atomicity of a transaction are ensured. The atomicity is achieved using both the old version in their home locations and the new version in the log. The durability is achieved using the durable copy in the log. And thus, Step 3 and 4 are asynchronously executed, and not in the critical path. Instead, a tx can run only after its previous transaction completes. This waiting is the Inter-Tx Ordering in the figure. Similarly, we call the waiting between the log write and the commit record write the Intra-Tx Ordering. The two orderings are required for storage consistency, but lie in the critical path and hurt performance. We aim to reduce or relax the two orderings in our paper. Program Ack Inter-tx Ordering Ordering is required for storage consistency.

High Overhead for Ordering in PM Persistence ordering Force writes from volatile CPU cache to Persistent Memory L1 L2 Volatile LLC Memory (NVM) Persistent High overhead for persistence ordering The boundary between volatility and persistence lies between the H/W controlled cache and the persistent memory Costly software flushes (clflush) and waits (fence) Existing systems reorder writes at multiple levels, especially in the CPU and cache hierarchy The ordering we discussed specifically is referred to Persistence Ordering. Persistent ordering is used to write sequence keeping for writes from volatile media to persistent media. In persistent memory, to keep persistence ordering, it forces writes from the volatile CPU cache to the persistent memory. As shown in the figure, the CPU cache is volatile, and the memory is persistent. Persistence ordering is ensured by flushing from CPU cache to memory. Different from disk-based storage systems, persistence ordering causes even higher overhead in persistent memory. There are two reasons. The first is that The boundary between volatility and persistence lies between the H/W controlled cache and the persistent memory. And thus, the software explicitly flushes data blocks in the programs and waits for the completion. The second is that Existing systems reorder writes at multiple levels, especially in the CPU and cache hierarchy. It has to iterate each cache level.

Outline Introduction and Background Existing Approaches Our Approach: Loose-Ordering Consistency Eager Commit Speculative Persistence Evaluation Conclusion To address this problem, before introducing our LOC approach, I will first discuss existing approaches.

Existing Approaches Making the CPU cache non-volatile Reduce the time gap between volatility and persistence by employing a non-volatile cache Is complementary to our LOC approach Allowing asynchronous commit of transactions Allow the execution of a later transaction without waiting for the persistence of previous transactions Allow execution reordering, but no persistence reordering 3 2 1 Existing approaches can be divided into two categories in general. The first category is making the CPU cache non-volatile. This category of approaches aim to reduce the time gap between volatility and persistence by employing a non-volatile cache. The examples shows the execution of four transactions (T1 to T4). The figure shows one approach that uses non-volatile last level cache (LLC). Each transaction is committed only when their data blocks are written to the non-volatile cache. Fortunately, this category of approaches is complementary to our LOC approach. The second category of approaches is Allowing asynchronous commit of transactions. Approaches of this category Allow the execution of a later transaction without waiting for the persistence of previous transactions. The hardware keeps the ordering, and the software queries asynchronously. As shown in the right figure, T1 is written to the cache. But when the program needs to write T2, T1 should be flushed to PM, because T1 has overlapping data block write (block A) with T2. After T2 is written to the CPU cache, T3 can also be written to the CPU cache without forcing T2 to PM. At this time, the program begin to execute T4, while the actual committed tx is T1. In this way, the commit is asynchronous. However, the persistence ordering is still kept. Even though the execution order can be changed, the persistence order can not. T1: A, B, C, D T2: A, F T3: B, C, E T4: D, E, F, G LLC L2 L1 3 1 2 4 LLC L2 L1 3 1 2 4 1 LLC Memory (NVM) Memory (NVM)

Our Solution: Key Ideas 4 3 2 1 LLC L2 L1 Loose-Ordering Consistency (LOC) Allow persistence reordering 3 1 2 4 3 1 Memory (NVM) Eager Commit Remove the intra-tx ordering Delay the completeness check till recovery phase Reorganize the memory log structure Speculative Persistence Relax the inter-tx ordering Speculatively persist transactions but make the commit order visible to programs in the program order Use cache versioning and Tx dependency tracking Our approach is to allow persistence reordering. And our approach is called Loose-Ordering Consistency, simply the LOC. As shown in the figure, LOC allows txs to be written to the CPU cache without forcing them to the PM. Also, data blocks in txs can be written to PM in any order. In order to achieve this, we have two techniques: Eager Commit and Speculative Persistence. Eager Commit aims to remove the intra-tx ordering. It delays the completeness check till recovery phase. This is achieved by reorganizing the memory log structure. Speculative Persistence aims to relax the inter-tx ordering. It speculatively persists transactions but make the commit order visible to programs in the program order. This is achieved by using cache versioning and Tx dependency tracking. The two techniques are to be discussed next.

Outline Introduction and Background Existing Approaches Our Approach: Loose-Ordering Consistency Eager Commit Speculative Persistence Evaluation Conclusion Now, let us come to part of our approach – Loose-Ordering Consistency. I will discuss the two techniques Eager Commit and Speculative Persistence of LOC.

LOC Key Idea 1 – Eager Commit Step 1. Log Write Step 2. Commit Record Write Step 3. In-place Write Step 4. Log Truncation Intra-tx Ordering Program Ack Inter-tx Ordering Goal: Remove the intra-tx ordering Eager Commit: A new commit protocol without commit records Before discussing the Eager Commit, let us first review the four steps in a transaction. In the four steps, the first two steps are in critical path, and there are two orderings, intra-tx and inter-tx ordering. The goal of Eager Commit is to remove the intra-tx ordering. Eager Commit is a new commit protocol without commit records.

Eager Commit Commit Protocol Eager Commit Commit record: Check the completeness of log writes Eager Commit Reorganize the memory log structure for delayed check Remove the commit record and the intra-tx ordering Use count-based commit protocol: <TxID, TxCnt> In order to remove the use of commit records, we need to understand why traditional WAL uses commit records. In commit protocols, a commit record is written after the writes have been completely written to the log area. On recovery, the existence of a commit record is used to determine whether the transaction has been committed. If there is a commit record, the tx is committed. Otherwise, it is not committed. Therefore, during normal runs, the commit protocol makes sure all log writes are complete and then writes a commit record. Eager Commit aims to remove the commit records. This can be achieved by reorganizing the memory log structure and thus delaying the completeness check on recovery. Figure: in the memory log area, we divide the memory into block groups. Each block group consists of eight blocks, in which seven are used for data blocks and the other is used for metadata block. The metadata block keeps the metadata information for the seven data blocks. In the metadata, we use TxID and TxCnt for a count-based commit protocol.

Eager Commit No commit record. Intra-tx ordering eliminated. Count-based commit protocol During normal run, Tag each block with TxID Set only one TxCnt to the total # of blocks in the tx, and others to ‘0’ During recovery, Recorded TxCnt: Find the non-zero TxCnt for each tx TxID Counted TxCnt: Count the tot. # of blocks in the tx If the two TxCnts match (Recorded = Counted), committed; otherwise, not-committed This slide explains the count-based commit protocol. In the count-based commit protocol, during normal run, each data block is tagged with a TxID in the metadata. For each tx, only one block has TxCnt set to total # of blocks in the tx, and others are set to‘0’. The non-zero TxCnt can be used in the recovery phase for completeness check. During recovery, it first searches the non-zero TxCnt in each tx. The TxCnt is the recorded TxCnt. And then, it count the total number of blocks in a tx. This TxCnt is the counted TxCnt. If the two values match, which means recorded TxCnt equals Counted TxCnt, the tx is committed. Otherwise, the tx is not committed. In this way, we avoid the use of commit records, and thus the intra-tx ordering. No commit record. Intra-tx ordering eliminated.

LOC Key Idea 2 – Speculative Persistence Step 1. Log Write Step 2. Commit Record Write Step 3. In-place Write Step 4. Log Truncation Intra-tx Ordering Program Ack Inter-tx Ordering Goal: relax the inter-tx ordering Speculative Persistence Out-of-order persistence: To relax the inter-tx ordering to allow persistence reordering In-order commit: To make the tx commits visible to programs (program ack) in the program order Let’s come to the second technique: Speculative Persistence. As discussed just now, the Eager Commit technique has removed the use of commit records and thus the intra-tx ordering. The second technique, Speculative Persistence, work on the inter-tx ordering as well as the program acknowledgement. The goal of Speculative Persistence is to relax the inter-tx ordering. Speculative persistence allows out-of-order persistence. This relaxes the inter-tx ordering to allow persistence reordering. But it ensures in-order commit. In-order commit means to make the tx commits visible to programs (program ack) in the program order.

Speculative Persistence T1: (A, B, C, D) -> T2: (A, F) -> T3: (B, C, E) -> T4: (D, E, F, G) Strict Ordering volatile CPU cache A A A B B B C C C D D D A A A F F F B B B C C C E E E D D D E E E F F F G G G persistent memory A B C D A F B C E D E F G Loose Ordering volatile CPU cache A A B B C C D D A A A F F B B B C C C E E D D D E E E F F F G G G I will use an example to show how Speculative Persistence works. In the example, we compare the strict ordering, the legacy way, and the loose ordering, which uses the Speculative Persistence technique. In each of the two figures, the top rectangles are volatile CPU cache blocks, and the bottom rectangles are blocks in persistent memory. The purple line is the program ack. Before the line, all transactions are committed. After the line, all transactions are not committed. In Strict Ordering,… One by one In Loose Ordering, transactions runs without forcing blocks to persistent memory. Speculative Persistence allows persistence of data blocks out-of-order. For example, T3 can persist data blocks before T2. … The program ack moves if the tx is committed. Speculative Persistence commits in order. For example, when E is persisted, T3 is committed. But since T2 is not, the program ack does not move. Only after F is persisted, the program ack moves two txs forward. Also, data blocks can be merged if two transactions have overlapping writes. This is the write coalescing. Thus, Speculative Persistence relaxes the inter-tx ordering, and enables write coalescing. persistent memory A B C D E F G Inter-tx ordering relaxed. Write coalescing enabled.

Speculative Persistence Speculative Persistence enables write coalescing for overlapping writes between transactions. But there are two problems raised by write coalescing of overlapping writes: How to recover a committed Tx which has overlapping writes with a succeeding aborted Tx? Overlapping data blocks have been overwritten Multiple Versions in the CPU Cache How to determine the commit status using the count-based commit protocol of a Tx that has overlapping writes with succeeding Txs? Recorded TxCnt != Counted TxCnt Commit Dependencies between Transactions Tx Dependency Pair: <Tp, Tq, n> While Speculative Persistence enables write coalescing for overlapping writes between transactions, there are two problems raised by write coalescing of overlapping writes. The first is How to recover a committed Tx which has overlapping writes with a succeeding aborted Tx? A committed transaction may have data blocks overwritten by a later aborted tx. The second is How to determine the commit status using the count-based commit protocol of a Tx that has overlapping writes with succeeding Txs? In this case, the counted TxCnt is smaller than the recorded TxCnt. The tx can not be determined to be committed using the count-based commit protocol. To the first question, Speculative Persistence supports multiple versions in the CPU cache. The committed transaction can have its overwritten data blocks recovered. To the second question, Speculative Persistence keeps commit dependencies between txs using Tx Dependency Pairs <Tp, Tq, n>. The pair means Tq overwirtes n pages in Tp. Thus, n is added to the counted TxCnt to compare the counted and recorded TxCnt. Details of the two can be found in the paper. See the paper for more details.

Recovery Recovery is made by scanning the memory log. More details in the paper. As for recovery, it is made by scanning the memory log. We discuss how to handle transaction recovery in our paper. Due to lack of time, I do not discuss it in this talk, but I would be happy to answer any questions.

Outline Introduction and Background Existing Approaches Our Approach: Loose-Ordering Consistency Eager Commit Speculative Persistence Evaluation Conclusion Next, I will show some evaluation results.

Experimental Setup GEM5 simulator Simulator configuration Workloads Timing Simple CPU: 1GHz Ruby memory system Simulator configuration L1: 32KB, 2-way, 64B block size, latency=1cycle L2: 256KB, 8-way, 64B block size, latency=8cycles LLC: 1MB, 16-way, 64B block size, latency=21cycles Memory: 8 banks, latency=168cycles Workloads B+ Tree, Hash, RBTree, SPS, SDG, SQLite In the evaluation, we use the full system simulator, the GEM5 simulator. We configure the simulator use parameters shown in the slide. And we evaluate different workloads, including commonly used data structures for storage, such as B+ tree, database workload, SQLite, and others.

Overall Performance We first show the overall performance. All the performance has been normalized to the performance without persistence ordering. The gap between the bar to the top line (value ‘1’) is the performance overhead of persistent ordering. So, in the y-xis of this figure, lower bars means higher performance overhead of persistence ordering. In the figure, H-WAL is WAL implemented in the hardware, and the LOC-WAL is the H-WAL using our approach: Loose-Ordering Consistency. By comparing the H-WAL, we can see LOC significantly improves performance of WAL. It reduces average performance overhead of persistence ordering from 67% to 35%. Also in the figure, Kiln is a newly proposed approach to use non-volatile LLC in MIRCO 2013. We incorporate LOC with Kiln and call it LOC-Kiln. The results show LOC and Kiln can be combined favorably. We conclude that LOC effectively mitigates performance degradation from persistence ordering. LOC significantly improves performance of WAL: Reduces average performance overhead of persistence ordering from 67% to 35%. LOC and Kiln can be combined favorably. LOC effectively mitigates performance degradation from persistence ordering.

Effect of Eager Commit We study the effect of the Eager Commit technique by comparing the H-WAL and EC-WAL. EC-WAL is the protocol that uses only Eager Commit in H-WAL. It shows Eager Commit outperforms H-WAL by 6.4% on average due to the elimination of intra-tx ordering. Eager Commit outperforms H-WAL by 6.4% on average due to the elimination of intra-tx ordering.

Effect of Speculative Persistence We also study the effect of Speculative Persistence. From the figure, we can see the performance increases as the speculative degree increases. Speculative Degree is the maximum number of transactions that are allowed to be persisted at one time. Speculative Persistence improves the normalized transaction throughput from 0.353 (SD=1) to 0.689 (SD=32) with a 95.5% improvement. The larger the speculation degrees, the larger the performance benefits. Speculative Persistence improves the normalized transaction throughput from 0.353 (SD=1) to 0.689 (SD=32) with a 95.5% improvement.

Outline Introduction and Background Existing Approaches Our Approach: Loose-Ordering Consistency Eager Commit Speculative Persistence Evaluation Conclusion Finally, let me conclude the talk.

Conclusion Problem: Strict write ordering required for storage consistency dramatically degrades performance in persistent memory Our Goal: To keep the performance overhead low while maintaining the storage consistency Key Idea: To Loosen the persistence ordering with hardware support Eager commit: A commit protocol that eliminates the use of commit record, by reorganizing the memory log structure Speculative persistence: Allows out-of-order persistence to persistent memory, but ensures in-order commit in programs, leveraging the tracking of transaction dependencies and the support of multi-versioning in the CPU cache Results: Reduces average performance overhead of persistence ordering from 67% to 35%

Loose-Ordering Consistency for Persistent Memory Youyou Lu1, Jiwu Shu1, Long Sun1, Onur Mutlu2 1Tsinghua University 2Carnegie Mellon University