Semantically-Smart Disk Systems
  Muthian Sivathanu, Vijayan Prabhakaran, Florentina Popovici, Tim Denehy, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau
  University of Wisconsin, Madison
The Storage Protocol Stack
A Tale of Two Layers
  Two layers: the file system and RAID
  Origins: the hardware/software boundary
    S/W: the file system
    H/W: the disk
  Each increasingly complex
  Each a world unto itself
  Separated by a narrow interface (SCSI)
  [Figure: File System layered above RAID]
Your Innovations In Storage: Where In The Stack Should They Go?
Option #1: In The File System
  Put innovations in the file system!
  Pros:
    Lots of semantic information (directories, files, inodes)
    Well-defined interfaces (vnode/VFS)
  Cons:
    No low-level information (head position, RAID scheme)
    Hard to deploy: many FS’s out there
    File systems change slowly: FFS [1984] -> ext2 [2002]
  [Figure: File System layered above RAID]
Option #2: In The RAID
  Put innovations in the RAID system!
  Pros:
    Lots of low-level knowledge
    Easy to deploy, sell
    Lots of “smarts” (CPUs + memory)
  Cons:
    “A stream of block reads + writes, full of bits & bytes, signifying nothing” (no semantic knowledge)
  [Figure: File System layered above RAID]
The Root Of The Problem: Narrow Interfaces
  Narrow interfaces stifle info flow
    SCSI to file system: linear array of blocks
    SCSI to disk system: stream of block reads and writes
  Result: innovation is limited
    No “FS-like” innovations in RAID
    Anything where info from BOTH subsystems is required can’t be done
  [Figure: File System and RAID separated by SCSI]
So What’s A Storage-System Innovator To Do?
The Ideal Solution?
  Exploit raw processing and memory of RAID
    Built-in low-level knowledge
    Easy to deploy
  Keep interface the same
    Assume traditional file system/RAID boundary
  Acquire semantic information of file system
    Learn about files, directories, inodes, and other file system data structures
Semantically-Smart Disk System (SDS)
  Disk system that understands file system
    Data structures
    Operations
  Operates underneath unmodified FS
    Must discover layout + on-disk structures
    Must “reverse engineer” block stream
  Exploits knowledge and “smarts” to implement new class of services
  [Figure: File System above SDS; SDS labeled with $ and CPU]
Outline
  Motivation
  Semantic Knowledge: Acquisition
  Semantic Knowledge: Exploitation
  Case studies
  Conclusions
Static Knowledge: File System Layout
  Challenge: how to discover layout information?
  White-box approach: embed knowledge in SDS
  Trend: FS layout does not change frequently
  [Figure: on-disk layout. Superblk, then Group 1 (I-Bitmap, D-Bitmap, Inodes, Data) and Group 2 (I-Bitmap, D-Bitmap, Inodes, Data)]
Have Knowledge, Will Innovate
  Knowing structures is not enough (sometimes)
    Data block overloading (data, pointer, directory)
    High-level operations not known (create, delete)
  Requires new on-line techniques
    Direct classification
    Indirect classification
    Block association
    Operation inferencing
Direct Classification
  Given address, determine type directly
  Direct classification via bounds check
    Given disk address, can check bounds to determine type (superblock, bitmaps, inodes, general data block)
  [Figure: on-disk layout. Super, then per group: I-Bit, D-Bit, Inode, Data]
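The bounds check can be sketched in a few lines. This is only an illustration under assumed numbers: the `Layout` fields, the group size, and the region sizes are hypothetical and do not describe any real file system; a white-box SDS would embed the actual layout of the file system it sits under.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical static layout; all sizes are illustrative assumptions."""
    superblock_blocks: int = 1     # blocks used by the superblock
    group_size: int = 8192         # blocks per group
    inode_bitmap_blocks: int = 1   # I-Bit region per group
    data_bitmap_blocks: int = 1    # D-Bit region per group
    inode_table_blocks: int = 256  # Inode region per group


def classify_direct(addr: int, layout: Layout = Layout()) -> str:
    """Return the static type of a block using only its disk address."""
    if addr < 0:
        raise ValueError("block address must be non-negative")
    if addr < layout.superblock_blocks:
        return "superblock"
    # Position of the block relative to the start of its group.
    offset = (addr - layout.superblock_blocks) % layout.group_size
    if offset < layout.inode_bitmap_blocks:
        return "inode-bitmap"
    offset -= layout.inode_bitmap_blocks
    if offset < layout.data_bitmap_blocks:
        return "data-bitmap"
    offset -= layout.data_bitmap_blocks
    if offset < layout.inode_table_blocks:
        return "inode"
    # Everything else lies in the data region, whose contents are overloaded.
    return "data"
```

Note that a result of "data" says only which region the block is in, not whether it holds file data, a directory, or indirect pointers.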
Beyond Simple Meta-Data
  Want to cache other meta-data blocks
    Directory blocks, indirect-pointer blocks
  Problem: data blocks are an overloaded type
    Block in “data” region could be: file data, dir, pointer
  Direct classification necessary but not sufficient
  Indirect classification via inode snooping
  [Figure: on-disk layout. Super, then per group: I-Bit, D-Bit, Inode, Data]
Indirect Classification: Directories
  [Figure: on-disk layout (Super, then per group: I-Bit, D-Bit, Inode, Data); an inode with Type = Directory, Size, and pointers Ptr1, Ptr2, Ptr3; the SDS records Ptr1, Ptr2, Ptr3 in its directory hash table]
Indirect Classification In Action
  [Figure: on-disk layout (Super, then per group: I-Bit, D-Bit, Inode, Data); for each data block, the SDS checks the directory hash table (Ptr1, Ptr2, Ptr3); if present, the block is cached]
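A minimal sketch of inode snooping for directories, assuming inodes have already been parsed out of the written inode block. The `Inode` record and the `DirectorySnooper` class are hypothetical names used for illustration only; a Python set stands in for the directory hash table.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class Inode:
    """Hypothetical in-memory view of one on-disk inode."""
    is_directory: bool
    pointers: List[int]  # direct block pointers; 0 marks an unused slot


class DirectorySnooper:
    """Tracks which data-region blocks belong to directories."""

    def __init__(self) -> None:
        # Plays the role of the directory hash table.
        self.directory_blocks: Set[int] = set()

    def on_inode_block_write(self, inodes: Iterable[Inode]) -> None:
        """Snoop a written inode block: one hash update per pointer."""
        for inode in inodes:
            if inode.is_directory:
                self.directory_blocks.update(p for p in inode.pointers if p != 0)

    def should_cache(self, addr: int) -> bool:
        """One lookup per data block: cache it if it is a directory block."""
        return addr in self.directory_blocks
```

This sketch only ever adds entries; removing pointers when a directory shrinks or is deleted is left out.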
Indirect Classification: Issues
  Space overhead
    Small overhead per directory and indirect block
  Time overhead
    1 hash update per pointer, 1 lookup per data block
  Tricky: sometimes data block is seen “early”
    Haven’t yet seen pointer associated with data block
    Solution #1: OK to misclassify some blocks for some amount of time
    Solution #2: Must defer classification until pointer has been observed
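Solution #2 can be sketched as a holding area for data blocks whose pointer has not yet been observed. The class and method names below are hypothetical, and the unbounded `pending` dictionary is a simplification; a real disk system would have to bound it.

```python
from typing import Dict, Iterable, Set


class DeferredClassifier:
    """Defers classification of data blocks seen before their pointer."""

    def __init__(self) -> None:
        self.directory_blocks: Set[int] = set()  # pointers observed so far
        self.pending: Dict[int, bytes] = {}      # blocks seen "early"
        self.cached: Dict[int, bytes] = {}       # blocks known to be directories

    def on_data_write(self, addr: int, payload: bytes) -> None:
        """Cache known directory blocks; hold everything else as pending."""
        if addr in self.directory_blocks:
            self.cached[addr] = payload
        else:
            self.pending[addr] = payload

    def on_directory_pointers(self, pointers: Iterable[int]) -> None:
        """A directory inode was observed: classify any early blocks it names."""
        for addr in pointers:
            self.directory_blocks.add(addr)
            if addr in self.pending:
                self.cached[addr] = self.pending.pop(addr)
```

Solution #1 corresponds to dropping the `pending` table and accepting that a directory block written before its inode is treated as ordinary data for a while.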
Getting Rid Of The Dead
  If file blocks are deleted, remove them from cache
    No need to keep dead blocks around
  Problem: how to determine if a file is deleted?
    Need to look for signs of deletion
  Three different places to look:
    Inode bitmaps
    Directory that contains file
    Inode itself
  Operation inferencing via block differencing
Operation Inferencing: Detecting Deletes (Inode Bitmap)
  [Figure: on-disk layout (Super, then per group: I-Bit, D-Bit, Inode, Data); the SDS reads the old version of the I-Bitmap and diffs it against the newly written version; result: deleted files]
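Block differencing on the inode bitmap can be sketched as follows. The function name and the bit ordering (least-significant bit first within each byte) are assumptions made for illustration; the real ordering depends on the file system.

```python
from typing import List


def deleted_inodes(old: bytes, new: bytes, first_inode: int = 0) -> List[int]:
    """Return inode numbers whose bitmap bit went from 1 (allocated) to 0 (free).

    `old` is the cached previous version of the inode bitmap block and
    `new` is the version being written.
    """
    if len(old) != len(new):
        raise ValueError("bitmap versions must have the same length")
    deleted = []
    for byte_index, (o, n) in enumerate(zip(old, new)):
        cleared = o & ~n & 0xFF  # bits set before and clear now
        for bit in range(8):
            if cleared & (1 << bit):
                deleted.append(first_inode + byte_index * 8 + bit)
    return deleted
```

For example, diffing `bytes([0b00001111])` against `bytes([0b00000101])` reports inodes 1 and 3 as deleted.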
Operation Inferencing: Overheads
  Space overhead
    Block cache of inodes, indirect pointers, bitmaps, etc. (could be substantial)
  Time overhead
    CPU: difference operation is like an extra copy
    Disk: may require block read (if small/no cache)
  [In paper: Quantified time and space overheads]
  Main point: there is a CPU and memory cost
Using Semantic Knowledge
  Fast RAID Reconstruction
    Only copy “live” data to hot-spare
  Track-aligned Extents [Schindler et al. ‘02]
    Placing files in disk-cognizant manner
  Journaling [Hagmann ‘87]
    Limit study: can implement complex “FS”-like functionality within a semantically-smart disk
  File-aware Caching
    Caching meta-data under BSD FFS
    Caching journal under Linux ext3
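As a rough illustration of the first case study, the sketch below copies only live blocks from a surviving mirror to a hot spare, using a data bitmap to decide liveness. Everything here is an assumption for illustration: the mirrored setup, the dictionary standing in for a disk, the LSB-first bit order, and both function names. A real rebuild would also have to copy the metadata regions themselves.

```python
from typing import Dict, Iterator


def live_blocks(bitmap: bytes) -> Iterator[int]:
    """Yield block numbers whose bit is set in the data bitmap (LSB first)."""
    for byte_index, byte in enumerate(bitmap):
        for bit in range(8):
            if byte & (1 << bit):
                yield byte_index * 8 + bit


def rebuild_to_spare(surviving: Dict[int, bytes], bitmap: bytes) -> Dict[int, bytes]:
    """Copy only live blocks from the surviving mirror to a fresh hot spare."""
    spare: Dict[int, bytes] = {}
    for addr in live_blocks(bitmap):
        spare[addr] = surviving[addr]
    return spare
```

The saving is proportional to the free space: dead blocks are never read from the survivor or written to the spare.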
Conclusions
  Innovation in traditional storage stack is limited
    File system: high but not low-level info
    Storage system: low but not high-level info
  Semantically-smart disks: best of both worlds?
    Takes advantage of “smart” disk systems
    Exploit low-level information…
    …with high-level knowledge of file system
  A remaining challenge
    Overcoming the “file system obfuscation” problem
“To know that we know what we know, and that we do not know what we do not know, that is true knowledge.” Confucius
“A man cannot inquire about what he knows, because he knows it, and in that case he is in no need of inquiry, nor again can he inquire about what he does not know, since he does not know what he is to inquire.” Socrates