Published by Oscar Jennings. Modified over 8 years ago.
1
Transactional Flash
V. Prabhakaran, T. L. Rodeheffer, L. Zhou (MSR, Silicon Valley), OSDI 2008
Shimin Chen, Big Data Reading Group
2
Introduction
SSDs expose the same block-level API as disks
  A lost opportunity
Goal: new abstractions that better match the nature of the new medium as well as the needs of file systems and databases
3
Idea: Transactional Flash (TxFlash)
An SSD with new features
  Addressing: a linear array of pages
  Supports read and write operations
  Supports a simple transactional construct
    Each transaction consists of a series of write operations
    Atomicity
    Isolation
    Durability
4
Why is this useful?
The transaction abstraction is required in many places: file system journals, etc.
Each application implements its own
  Complexity
  Redundant work
  Reliability of each implementation
It would be great if the storage layer provided a transactional API
5
Previous Work: disk-based
Copy-on-write + logging
  Fragmentation leads to poor read performance
  Checkpointing and cleaning
  Cleaning cost
SSDs mitigate these problems
  SSDs already do copy-on-write for flash-related reasons
  Random read accesses are fast
6
Outline
Introduction
The Case for TxFlash
Commit Protocols
Implementation
Evaluation
Conclusion
7
TxFlash Architecture & APIs
WriteAtomic(p1…pn)
  p1…pn belong to one transaction
  Followed by write(p1)…write(pn)
  Provides atomicity, isolation, durability
Abort
  Aborts an in-progress transaction
In-progress transactions
  Conflicting writes must not be issued
Core of TxFlash
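The interface on this slide can be pictured with a toy in-memory model. This is only a sketch under stated assumptions: the class name `TxFlashSketch`, the transaction handle returned by `write_atomic`, and the choice to raise an error on a conflicting write are all hypothetical and are not taken from the paper, which only says that conflicting writes should not be issued.

```python
class TxFlashSketch:
    """Toy in-memory model of a WriteAtomic/Abort interface (hypothetical)."""

    def __init__(self, num_pages):
        self.pages = [b""] * num_pages   # committed page contents
        self.pending = {}                # txid -> {page: data, or None if not yet written}
        self.next_txid = 0

    def write_atomic(self, page_numbers):
        """Announce that the following writes to these pages form one transaction."""
        busy = {p for tx in self.pending.values() for p in tx}
        if busy & set(page_numbers):
            # Assumption: the model rejects conflicts instead of trusting the caller.
            raise ValueError("page already belongs to an in-progress transaction")
        txid = self.next_txid
        self.next_txid += 1
        self.pending[txid] = {p: None for p in page_numbers}
        return txid

    def write(self, txid, page, data):
        """Buffer one page write; publish all pages once the last one arrives."""
        tx = self.pending[txid]
        if page not in tx:
            raise KeyError("page was not declared in write_atomic")
        tx[page] = data
        if all(v is not None for v in tx.values()):
            for p, d in tx.items():
                self.pages[p] = d
            del self.pending[txid]

    def abort(self, txid):
        """Drop an in-progress transaction; committed contents stay untouched."""
        self.pending.pop(txid, None)

    def read(self, page):
        return self.pages[page]


dev = TxFlashSketch(num_pages=4)
tx = dev.write_atomic([0, 1])
dev.write(tx, 0, b"a")
half_done = dev.read(0)      # page 0 is not visible until page 1 is also written
dev.write(tx, 1, b"b")
```

The model shows only the visibility rule (all pages or none); durability and the on-flash commit protocol are the subject of the later slides.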
8
Simple Interface
WriteAtomic: multi-page writes
  Useful for file systems
Not full-fledged transactions: no reads inside a transaction
  Reduces complexity
Backward compatible
9
Flash is good for this purpose
Copy-on-write: already supported by the FTL
Fast random reads
High concurrency
  Multiple flash chips inside
New device
  A new interface is more likely to be adopted
10
Outline
Introduction
The Case for TxFlash
Commit Protocols
Implementation
Evaluation
Conclusion
11
Traditional Commit
First write to a log:
  Intention record: (data, page# & version#, transaction ID)
  …
  Intention record
  Commit record
A transaction is committed if and only if its commit record exists
Intention records are then applied to the original data
Once the modifications are done, the records can be garbage collected
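The commit rule on this slide can be sketched in a few lines. The record types `Intention` and `Commit`, the list-based log, and the `replay` helper are illustrative assumptions, not structures from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intention:
    txid: int
    page: int
    version: int
    data: bytes


@dataclass(frozen=True)
class Commit:
    txid: int


def committed_txids(log):
    """A transaction is committed exactly when its commit record is in the log."""
    return {r.txid for r in log if isinstance(r, Commit)}


def replay(log, pages):
    """Apply intentions of committed transactions to pages (page -> (version, data))."""
    done = committed_txids(log)
    for r in log:
        if isinstance(r, Intention) and r.txid in done:
            current = pages.get(r.page)
            if current is None or current[0] < r.version:
                pages[r.page] = (r.version, r.data)
    return pages


log = [
    Intention(txid=1, page=0, version=1, data=b"x"),
    Intention(txid=1, page=1, version=1, data=b"y"),
    Commit(txid=1),
    Intention(txid=2, page=0, version=2, data=b"z"),  # no commit record: ignored
]
state = replay(log, {})
```

Transaction 2 has an intention record but no commit record, so its write to page 0 is never applied.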
12
Traditional Commit on SSDs
Optimizations:
  All writes can be issued in parallel
  Do not update the original data, just update the remap table
Problem: the commit record
  Extra latency: it is written after the other writes
  Garbage collection is complicated: must know whether all the updates have completed
13
New Proposal (1): Simple Cyclic Commit
No commit record
Intention records of the same transaction use next links to form a cycle
  (data, page# & version#, next page# & version#)
A transaction is committed if and only if all of its intention records are written
Flash page (4KB) + metadata (128B) are co-located
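A minimal sketch of the cycle idea follows. The helper names `make_intentions` and `is_committed` and the dictionary layout are hypothetical. The check also assumes that no record of the transaction has been garbage collected; the real protocol has to cope with that case, which this sketch ignores.

```python
def make_intentions(writes):
    """writes: list of (page, version, data). Link the records into a cycle."""
    n = len(writes)
    records = []
    for i, (page, version, data) in enumerate(writes):
        next_page, next_version, _ = writes[(i + 1) % n]
        records.append({
            "page": page,
            "version": version,
            "data": data,
            "next": (next_page, next_version),
        })
    return records


def is_committed(stored, start):
    """stored: dict (page, version) -> record. True if the cycle through start is complete."""
    cur = start
    for _ in range(len(stored)):
        rec = stored.get(cur)
        if rec is None:
            return False          # a record on the cycle never reached flash
        cur = rec["next"]
        if cur == start:
            return True           # followed the next links all the way around
    return False


records = make_intentions([("A", 1, b"a"), ("B", 1, b"b"), ("C", 1, b"c")])
all_written = {(r["page"], r["version"]): r for r in records}
partly_written = {(r["page"], r["version"]): r for r in records[:2]}  # C1 is missing
```

With all three records present the next links lead back to the starting record, so the transaction counts as committed; with C1 missing the walk hits a gap and it does not.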
14
Problem
15
Solution: Any uncommitted intention on the stable storage must be erased before any new writes are issued to the same or a referenced page
16
Operations
Initialization:
  Set version# to 0 and the next link to self
Transaction
Garbage collection:
  For any uncommitted intention
  For a committed page if a newer version is committed
Recovery:
  Scan all pages, then look for cycles
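The initialization and recovery steps above can be sketched as follows. The function names and the flat mapping from (page, version) to the next link are assumptions made for illustration, and the sketch assumes that older records have not yet been garbage collected, so every committed cycle is still fully present.

```python
def initialize(num_pages):
    """Version 0 with a next link to itself: a one-record cycle, hence committed."""
    return {(p, 0): (p, 0) for p in range(num_pages)}


def cycle_complete(store, start):
    """Follow next links from start; True only if they lead back to start."""
    cur = store.get(start)
    steps = 0
    while cur is not None and cur != start and steps < len(store):
        cur = store.get(cur)
        steps += 1
    return cur == start


def recover(store):
    """Scan every record and keep, per page, the highest version on a complete cycle."""
    latest = {}
    for (page, version) in store:
        if cycle_complete(store, (page, version)) and version > latest.get(page, -1):
            latest[page] = version
    return latest


store = initialize(2)
store[(0, 1)] = (1, 1)       # transaction 1 writes pages 0 and 1
store[(1, 1)] = (0, 1)
store[(0, 2)] = (1, 2)       # transaction 2 wrote page 0, but (1, 2) never landed
recovered = recover(store)
```

After the scan, both pages recover to version 1: version 2 of page 0 points at a record that does not exist, so its cycle is incomplete.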
17
New Proposal (2): Back Pointer Cyclic Commit
Another way to deal with the ambiguity
Intention record: (data, page# & version#, next link, link to last committed version)
18
A3 is a straddler of A2
Some complexity in garbage collection and recovery because of this
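The straddler relation can be written down compactly. The sketch assumes the following reading of the term: an intention with version k straddles version j of the same page when its back pointer names a last committed version i with i < j < k. The record layout and function names are hypothetical.

```python
def straddles(k_version, k_back, j_version):
    """True if a record (version k, back pointer to version i) straddles version j."""
    return k_back < j_version < k_version


def has_straddler(records, page, j_version):
    """records: list of dicts with 'page', 'version', and 'back' (last committed version)."""
    return any(
        r["page"] == page and straddles(r["version"], r["back"], j_version)
        for r in records
    )


# A1 is committed; A2 was written with A1 as its last committed version;
# A3 was also written with a back pointer to A1, skipping over A2.
records = [
    {"page": "A", "version": 1, "back": 0},
    {"page": "A", "version": 2, "back": 1},
    {"page": "A", "version": 3, "back": 1},
]
a2_straddled = has_straddler(records, "A", 2)
```

Because A3 still names A1 as the last committed version, A2 cannot have been committed when A3 was written, which is how the back pointer resolves the ambiguity without erasing A2 first.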
19
Protocol Comparison
20
Outline
Introduction
The Case for TxFlash
Commit Protocols
Implementation
Evaluation
Conclusion
21
Implementation
Simulator
  DiskSim trace-driven SSD simulator (USENIX'08)
  Modifications for TxFlash
  Supports transactions of up to 4MB
Pseudo-device driver for recording traces
TxExt3:
  Employs TxFlash for the Ext3 file system
  Transaction: Ext3 journal commit
22
Experimental Setup
TxFlash device:
  32GB: 8 x 4GB flash packages
  4 I/O operations within every flash package
  15% of space reserved for garbage collection
Workloads on top of Ext3:
  IOzone: micro benchmark (no sync writes)
  Linux-build (no sync writes)
  Maildir (sync writes)
  TPC-B: simulates 10,000 credit-debit-like operations on the TxExt3 file system (sync writes)
Synthetic workloads
23
Cyclic commit vs. Traditional commit
24
Unlike database logging, transaction sizes are large: there are no sync writes, and data is included
25
Simple cyclic commit has a high cost if there are aborts
27
TxFlash vs. SSD
Remove WriteAtomic from the traces
Use the SSD simulator
The SSD does not provide any transaction guarantees (so it should have better performance)
28
Space comparison:
TxFlash needs 25% more main memory than the SSD
  4+1 MB per 4GB flash
  40 MB for the 32GB TxFlash device
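The numbers on this slide are consistent with each other, as a quick check shows. The check assumes that "4+1 MB" means 4 MB needed by the plain SSD plus 1 MB added by TxFlash for every 4GB flash package; that reading is an interpretation of the slide, and the variable names are illustrative.

```python
DEVICE_GB = 32
PACKAGE_GB = 4
SSD_MB_PER_PACKAGE = 4         # assumed: baseline SSD memory per 4GB package
TXFLASH_EXTRA_MB = 1           # assumed: extra memory TxFlash adds per package

packages = DEVICE_GB // PACKAGE_GB
txflash_total_mb = packages * (SSD_MB_PER_PACKAGE + TXFLASH_EXTRA_MB)
overhead = TXFLASH_EXTRA_MB / SSD_MB_PER_PACKAGE

print(packages, txflash_total_mb, overhead)
```

Eight packages at 5 MB each give the 40 MB quoted for the 32GB device, and 1 MB on top of 4 MB is the 25% overhead.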
29
End-to-end performance
TxFlash: run the pseudo-device driver on a real SSD
  The performance is close to that of TxFlash
Ext3: use the SSD as the journal
SSD cache is disabled in both cases
31
Summary
TxFlash: adds a transaction interface to the SSD
Cyclic commit protocols
Nice solution for file system journaling