1
OLTP on NVM: YMMV @andy_pavlo
6
Prison Life
GOOD: Washing Dishes, Not Fighting, Repentant
EVIL: Cafeteria Thievery, Shankings
Making Pruno
When you’re in prison, you have to make a choice about how you want to live your life in there. On one hand, you can try to lead a good life. That means helping out with washing dishes, not getting in fights or battles with other inmates, and generally having a conciliatory demeanor. You want the parole board to think that you are truly repentant about your crimes so that you can get out sooner. The problem with this approach is that the other inmates will see this and then you’ll probably get beat up.
On the other end of the spectrum, you can continue to live a hard life in prison. This means doing things like stealing forks and other contraband from the cafeteria and then using them to make shanks to stab other roommates to take their cigarettes. You can then also use the food that you steal from the cafeteria to start brewing pruno, or prison wine, in your cell. The best pruno is usually made from “fruit” and ketchup. Oh, and just as a word of advice, you want to avoid making pruno from potatoes. A lot of prisoners think that they’re going to make pruno vodka with them, but you just end up with a bout of botulism. The advantage of leading this lifestyle is that you are obviously not going to get beat up by other prisoners because they’re going to be afraid of you, but it also means that the parole board is going to be hard on you and you’re going to end up serving your full sentence.
The trick that I learned is that you actually want to be in the middle. You want a little bit from both lifestyles. Maybe you still want to steal stuff from the cafeteria, but you want to just take stuff that will help you make pruno, and then you can share your pruno with other prisoners. That will keep you in good graces with the parole board but also keep you from getting beaten with a piece of soap inside of a tube sock while taking a shower.
7
NVM OLTP
DRAM: Lightweight CC, Logical Logging, Snapshots
SSD/HDD: Heavyweight CC, ARIES Logging
Making Pruno
What’s really remarkable about this philosophy is that it’s the same decision that we face when deciding what kind of DBMS architecture to use for running OLTP workloads on non-volatile memory devices. On one end of the spectrum we have DRAM-oriented systems that get really good performance using a lightweight concurrency control scheme and logical logging. But they have a longer recovery time when the system crashes, since now you need to load in the last snapshot that you took from disk and replay the log to get you back to the state of the database that you had before the crash.
Then at the other end, you have the disk-oriented systems backed by a solid-state drive or a spinning disk drive. With these systems, you have to use a heavyweight concurrency control scheme because any txn at any time could try to access a record that’s not in the buffer pool and the DBMS has to go out to disk to get it. They also employ a heavyweight recovery scheme, something like ARIES, which has to record a lot of information about changes to the DB out to disk. All of this takes a lot of time, so while you are waiting for a disk-oriented DBMS to process transactions you could still make a small batch of pruno in your server room.
But with NVM, it’s not as black and white. The devices are going to be much, much faster than an SSD, so you probably don’t want to use the heavyweight mechanisms found in a disk-oriented system, but they’re not quite as fast as DRAM, so we can’t adopt all of the components of a DRAM system.
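The recovery path described for DRAM-oriented systems (load the last snapshot, then replay the logical log) can be pictured with a toy key-value store. This is only a sketch under stated assumptions: `TinyStore` and its methods are hypothetical names invented for illustration, the "durable" snapshot and log are plain Python objects, and nothing here reflects how H-Store or any other system actually implements recovery.

```python
import copy


class TinyStore:
    """Toy in-memory key-value store with a logical (command) log.

    Hypothetical illustration only; not taken from any real DBMS.
    """

    def __init__(self):
        self.data = {}      # volatile state, lost on a crash
        self.snapshot = {}  # last checkpoint (durable in a real system)
        self.log = []       # commands recorded since the last snapshot

    def put(self, key, value):
        # Logical logging: record the command itself, not page images.
        self.log.append(("put", key, value))
        self.data[key] = value

    def take_snapshot(self):
        self.snapshot = copy.deepcopy(self.data)
        self.log.clear()    # earlier commands are covered by the snapshot

    def crash(self):
        self.data = {}      # simulate losing the volatile state

    def recover(self):
        # Step 1: load the last snapshot. Step 2: replay the log in order.
        self.data = copy.deepcopy(self.snapshot)
        for op, key, value in self.log:
            if op == "put":
                self.data[key] = value


store = TinyStore()
store.put("a", 1)
store.take_snapshot()
store.put("a", 2)
store.put("b", 3)
store.crash()
store.recover()
print(store.data)  # {'a': 2, 'b': 3}
```

The longer the log since the last snapshot, the more commands `recover` has to replay, which is the recovery-time cost the notes attribute to this design.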
8
Overview: Understand the performance characteristics of NVM to develop an optimal DBMS architecture for OLTP workloads. And that’s what our current research as part of the Big Data ISTC is all about. We’re trying to understand the performance characteristics of next-generation NVM devices so that we can design a new DBMS architecture that is likely to borrow ideas from both the main memory-oriented DBMSs and the traditional, disk-oriented DBMSs. Much of this work is very preliminary. We have only been running on Intel’s NVM SDV for about a month now and we’re still porting our software to work on it, but I want to give you a glimpse of what we’ve done so far and our current thoughts on where we are heading with the new system.
9
Intel NVM Emulator: Instrumented motherboard that slows down access to the memory controller. Two execution interfaces: NUMA (NVM-only) and PMFS (DRAM+NVM).
10
NUMA Interface – NVM-Only
Virtual CPU where all memory access uses the NVM portion of DRAM. No change to application code.
11
PMFS Interface – DRAM+NVM
Special filesystem designed for byte-addressable NVM. Avoids overhead of traditional filesystems.
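One way to picture byte-addressable access through a filesystem is to memory-map a file and update a few bytes in place, with no block-sized read or write calls in the application code. The sketch below is an assumption-laden illustration: it uses an ordinary temporary file as a stand-in for a file on a PMFS mount, and the path, record size, and record count are made up for the example.

```python
import mmap
import os
import tempfile

# Hypothetical location; on an NVM system this file would sit on a PMFS mount.
path = os.path.join(tempfile.mkdtemp(), "table.bin")
RECORD_SIZE = 64    # assumed fixed-size records, for illustration only
NUM_RECORDS = 16

# Pre-size the file so that it can be mapped.
with open(path, "wb") as f:
    f.truncate(RECORD_SIZE * NUM_RECORDS)

# Map the whole file and update record 3 in place, byte by byte.
with open(path, "r+b") as f:
    region = mmap.mmap(f.fileno(), 0)
    offset = 3 * RECORD_SIZE
    region[offset:offset + 5] = b"hello"
    region.flush()  # push the change to the backing file
    region.close()

# Read the bytes back through the regular file interface.
with open(path, "rb") as f:
    f.seek(3 * RECORD_SIZE)
    readback = f.read(5)

print(readback)  # b'hello'
```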
12
DBMS Architectures Disk-oriented. Main memory-oriented.
13
Disk-oriented DBMS: Pessimistic assumption that the data a txn needs is not in memory. Based on the design assumptions made in the 1970s. Examples: Ingres (Berkeley), System R (IBM).
14
[Diagram labels: Application, DRAM, PMFS, WAL]
15
Memory-oriented DBMS: Assume that all data fits in memory. Avoid the overhead of concurrency control + recovery. Examples: SmallBase (AT&T), Hekaton (Microsoft), H-Store/VoltDB (Me & others…).
16
[Diagram labels: Application, NUMA, CMD Log, PMFS]
17
Experimental Evaluation
Compare the DBMS architectures on the two NVM interfaces. Yahoo! Cloud Serving Benchmark (YCSB): 10 million records (~10GB), database 8x the size of memory, variable skew.
What I want to share with you is two sets of experiments that we’ve done to evaluate the performance of this new version of H-Store. We’re going to compare the performance of H-Store with the MMAP storage manager against an installation of MySQL that we’ve tuned for OLTP workloads. We’re going to use the YCSB benchmark with 10 million records. Each record is about 1KB, so that comes out to be about 10GB. For H-Store, we’re going to allow the system to allocate enough memory from PMFS to store the entire database. For MySQL, we’re going to set the buffer pool size such that only an eighth of the database fits in DRAM. This ensures that both systems do enough reading and writing to PMFS.
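The sizing in this setup is simple arithmetic, sketched below. The record size is taken as exactly 1,000 bytes, which is an assumption; the notes only say "about 1KB", so the results are approximate.

```python
NUM_RECORDS = 10_000_000
RECORD_BYTES = 1_000       # assumption: "about 1KB" per record
DB_TO_MEMORY_RATIO = 8     # database is 8x the size of the MySQL buffer pool

db_bytes = NUM_RECORDS * RECORD_BYTES
buffer_pool_bytes = db_bytes // DB_TO_MEMORY_RATIO

print(db_bytes / 1e9)           # database size in GB: 10.0
print(buffer_pool_bytes / 1e9)  # buffer pool size in GB: 1.25
```

Under these assumptions, MySQL can hold roughly 1.25GB of the 10GB database in DRAM at a time, so most of the data has to be fetched from PMFS.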
18
Evaluated Systems
NVM-Only: H-Store (v2014), MySQL (v5.5)
NVM+DRAM: H-Store + Anti-Caching (v2014), MySQL (v5.5)
19
NUMA Interface (NVM-Only): Read-Only Workload, 2x Latency Relative to DRAM
[Figure: YCSB throughput (txn/sec) of H-Store and MySQL vs. skew amount (high→low)]
20
PMFS Interface (NVM+DRAM): Read-Only Workload, 2x Latency Relative to DRAM
[Figure: YCSB throughput (txn/sec) of Anti-Caching and MySQL vs. skew amount (high→low)]
21
NUMA Interface (NVM-Only): Write-Heavy Workload, 2x Latency Relative to DRAM
[Figure: YCSB throughput (txn/sec) of H-Store and MySQL vs. skew amount (high→low)]
22
PMFS Interface (NVM+DRAM): Write-Heavy Workload, 2x Latency Relative to DRAM
[Figure: YCSB throughput (txn/sec) of Anti-Caching and MySQL vs. skew amount (high→low)]
23
Discussion
NVM latency did not make a big difference in performance.
Logging is a major bottleneck in DBMS performance on NVM. It also wears out the device quickly.
MySQL wastes NVM space.
24
N-STORE nstore.cs.cmu.edu
25
N-Store First DBMS for NVM-only operating environment.
OLTP/OLAP hybrid. Column-store that supports fast in-place updates.
The first possible architecture that we are considering is to keep DRAM in the equation and build a hybrid system that can support high-performance OLTP transactions and longer-running analytical operations in the same DBMS. To do this, we would use DRAM as a place to store hot data in a row-oriented format. Over time the system will migrate tuples into column-oriented storage in the NVM. This process will be completely transparent to the application. This is sort of the same idea proposed by SAP HANA, but they keep everything in in-memory data structures. One interesting aspect of this approach is that we could explore the development of new types of indexes that can store keys in different ways based on what storage layer the record is residing in. For example, the keys that correspond to in-memory records would be stored in a regular B-tree, while the keys for data that is out in the NVM could be stored in a different data structure that is more amenable to compression but slower to update.
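The hot-row and cold-column split described in these notes can be sketched as a small two-tier store. Everything here is hypothetical: `HybridStore` and its methods are invented names, a Python dict stands in for the B-tree over in-memory rows, and sorted per-column lists stand in for the NVM column store. It illustrates the idea only and is not N-Store's design.

```python
import bisect


class HybridStore:
    """Toy two-tier store: hot tuples as rows, cold tuples as columns."""

    def __init__(self, columns):
        self.columns = columns
        self.hot = {}        # key -> row tuple (stand-in for DRAM row store)
        self.cold_keys = []  # sorted keys of migrated tuples
        self.cold_cols = {c: [] for c in columns}  # one list per column

    def _cold_pos(self, key):
        # Binary search in the sorted cold key list.
        pos = bisect.bisect_left(self.cold_keys, key)
        found = pos < len(self.cold_keys) and self.cold_keys[pos] == key
        return pos, found

    def put(self, key, row):
        # Writes always land in the hot tier; drop any stale cold copy.
        pos, found = self._cold_pos(key)
        if found:
            del self.cold_keys[pos]
            for col in self.columns:
                del self.cold_cols[col][pos]
        self.hot[key] = tuple(row)

    def migrate(self, key):
        # Move one tuple from row format into the column lists.
        row = self.hot.pop(key)
        pos, _ = self._cold_pos(key)
        self.cold_keys.insert(pos, key)
        for col, value in zip(self.columns, row):
            self.cold_cols[col].insert(pos, value)

    def get(self, key):
        # The caller does not need to know which tier holds the tuple.
        if key in self.hot:
            return self.hot[key]
        pos, found = self._cold_pos(key)
        if found:
            return tuple(self.cold_cols[c][pos] for c in self.columns)
        return None


store = HybridStore(["name", "balance"])
store.put(1, ("alice", 10))
store.put(2, ("bob", 20))
store.migrate(1)
print(store.get(1))                # ('alice', 10), served from the cold tier
print(store.cold_cols["balance"])  # [10]
```

Because `get` checks the hot tier first and falls back to the cold tier, migration stays invisible to the caller, which mirrors the transparency the notes call for.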
26
Justin DeBrabant, Joy Arulraj, Rajesh Sankaran, Subramanya Dulloor, Andy Pavlo, Mike Stonebraker, Stan Zdonik, Jeff Parkhurst
27
END @ANDY_PAVLO