EXT2C: Increasing Disk Reliability. Brian Pellin, Chloe Schulze. CS736 Presentation, May 3rd, 2005
Introduction Problem: Disks can fail silently, corrupting data (Partial) Solution: Checksum the data to verify correctness before returning data to the user (This does not recover lost data, but at least the user knows the data is bad.)
Approach Implement checksumming within EXT2 New read/write functions Wrap new functionality around existing functions New error code
Conclusion Summary Implemented working checksumming on top of EXT2 Achieves added safety at the cost of additional overheads
Outline Fault Model EXT2C Performance Conclusions
Fault Model Fail Stop: all or nothing; no longer adequate for today’s disk failures. Partial Failure: latent sector errors; misdirection (right data written to wrong location); phantom writes (disk returns okay, but data was not written); malicious writes.
Relating Fault Model to Implementation Partial Failure suggests: Detection Notification Verification of data Backup or replication to avoid data loss
EXT2C EXT2 base file system Modifications Checksum file data New read/write functions to implement the checksumming New error code to notify user of data corruption
Checksums One checksum is computed per block of a file. One checksum file per inode, named for the file’s inode number. Each checksum is 20 bytes long (fixed length), computed by the hash function SHA-1.
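The per-block layout on this slide can be illustrated with a short user-space Python sketch. It is not the EXT2C kernel code: the 4096-byte block size and the helper names `block_checksums` and `checksum_file_name` are assumptions made for illustration. Only the use of SHA-1, the fixed 20-byte checksum length, and naming the checksum file after the inode number come from the slides.

```python
import hashlib

BLOCK_SIZE = 4096    # assumed block size; the slides do not state one
CHECKSUM_LEN = 20    # a SHA-1 digest is always 20 bytes


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Return the concatenated SHA-1 digests, one per block of `data`."""
    digests = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        digests.append(hashlib.sha1(block).digest())
    return b"".join(digests)


def checksum_file_name(inode_number: int) -> str:
    """Hypothetical helper: the checksum file is named for the inode number."""
    return str(inode_number)


contents = b"a" * 10000              # spans three 4096-byte blocks
sums = block_checksums(contents)
print(checksum_file_name(5), len(sums) // CHECKSUM_LEN)   # prints "5 3"
```

Because every checksum has the same length, the checksum for block `i` sits at byte offset `i * 20` in the checksum file, so no index structure is needed to find it.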
Checksum Creation (figure) Each block (Block no. 1, …) of foo.c (inode no. 5) is hashed with SHA-1, and the resulting checksums are stored in the checksum file named 5.
Ext2C_file_read
  Normal file read
  Open checksum file
  Calculate blocks read
  For each block being read:
    Read in block
    Compute checksum
    Read in old checksum
    Compare; if not equal, return error
  Close checksum file
  Return result
Read Operation (example: read 4000 bytes)
1. Read data from file
2. Read overlapping data block
3. Read corresponding section of checksum file
4. Hash data and compare with stored checksum (match or failure)
5. Repeat for other blocks overlapped by read
Figure labels: File Data Blocks, Checksum File, Hash
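The read steps above can be sketched in user-space Python. This is an illustration under stated assumptions, not the kernel implementation: in-memory `bytes` objects stand in for the file and its checksum file, the block size of 4096 bytes is assumed, and `ChecksumMismatch`, `overlapped_blocks`, and `verified_read` are hypothetical names (the exception stands in for the new error code).

```python
import hashlib

BLOCK_SIZE = 4096    # assumed; not stated in the slides
CHECKSUM_LEN = 20    # SHA-1 digest length


class ChecksumMismatch(Exception):
    """Hypothetical stand-in for the new EXT2C error code."""


def overlapped_blocks(offset: int, length: int) -> range:
    """Block indices touched by a read of `length` bytes at `offset`."""
    if length <= 0:
        return range(0)
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    return range(first, last + 1)


def verified_read(data: bytes, checksums: bytes, offset: int, length: int) -> bytes:
    """Return the requested bytes after verifying every overlapped block."""
    for index in overlapped_blocks(offset, length):
        block = data[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE]
        stored = checksums[index * CHECKSUM_LEN:(index + 1) * CHECKSUM_LEN]
        if hashlib.sha1(block).digest() != stored:
            raise ChecksumMismatch(f"block {index} failed verification")
    return data[offset:offset + length]


data = bytes(range(256)) * 40        # 10240 bytes, three blocks
checksums = b"".join(
    hashlib.sha1(data[i:i + BLOCK_SIZE]).digest()
    for i in range(0, len(data), BLOCK_SIZE)
)
print(list(overlapped_blocks(3000, 4000)))   # prints [0, 1]
```

Note that a 4000-byte read starting at offset 3000 is smaller than one block but still touches two blocks, so two whole blocks are hashed to verify it.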
Ext2c_file_write
  Normal file write
  Calculate blocks changed
  Open checksum file
  For each changed block:
    Read in block
    Compute checksum
    Write checksum to checksum file
  Close checksum file
  Return result of normal file write
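The write path can be sketched the same way. This is a minimal user-space illustration, assuming a 4096-byte block size and using `bytearray` objects in place of the file and its checksum file; `checksummed_write` is a hypothetical name, not the function in the EXT2C source.

```python
import hashlib

BLOCK_SIZE = 4096    # assumed; not stated in the slides
CHECKSUM_LEN = 20    # SHA-1 digest length


def checksummed_write(data: bytearray, checksums: bytearray,
                      offset: int, payload: bytes) -> int:
    """Write `payload` at `offset`, then refresh checksums of changed blocks."""
    if not payload:
        return 0
    old_len = len(data)
    end = offset + len(payload)
    if end > old_len:
        data.extend(b"\0" * (end - old_len))     # grow the file, zero-filled
    data[offset:end] = payload                   # the "normal file write"

    # Changed blocks: start at the old end of file if the write left a gap.
    first = min(offset, old_len) // BLOCK_SIZE
    last = (end - 1) // BLOCK_SIZE

    # Make room for one fixed-length checksum per block of the file.
    needed = ((len(data) + BLOCK_SIZE - 1) // BLOCK_SIZE) * CHECKSUM_LEN
    if len(checksums) < needed:
        checksums.extend(b"\0" * (needed - len(checksums)))

    for index in range(first, last + 1):
        block = bytes(data[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE])
        digest = hashlib.sha1(block).digest()
        checksums[index * CHECKSUM_LEN:(index + 1) * CHECKSUM_LEN] = digest
    return len(payload)
```

Even a one-byte write requires re-reading and re-hashing the whole block that contains it, which is one source of the write overhead measured later in the talk.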
Targeted Problems Detect: silently corrupted data. Partially detect: phantom writes, misdirection, malicious writes.
Correctness Able to run PostMark and additional benchmarks without encountering any errors Injected errors are detected and our error code is returned
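The error-injection check can be mimicked in user space. The sketch below is only an illustration of the idea, not the test harness used for EXT2C: it assumes a 4096-byte block size, and `checksum_blocks` and `find_corrupt_blocks` are hypothetical helper names. It flips a single bit in one block and confirms that only that block fails verification.

```python
import hashlib

BLOCK_SIZE = 4096    # assumed; not stated in the slides


def checksum_blocks(data: bytes) -> list:
    """Return one SHA-1 digest per block of `data`."""
    return [
        hashlib.sha1(data[i:i + BLOCK_SIZE]).digest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def find_corrupt_blocks(data: bytes, digests: list) -> list:
    """Return the indices of blocks whose current hash differs from `digests`."""
    return [
        index
        for index, current in enumerate(checksum_blocks(data))
        if current != digests[index]
    ]


original = bytes(3 * BLOCK_SIZE)             # three zero-filled blocks
digests = checksum_blocks(original)

corrupted = bytearray(original)
corrupted[BLOCK_SIZE + 10] ^= 0x01           # inject a single-bit error in block 1

print(find_corrupt_blocks(bytes(corrupted), digests))   # prints [1]
```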
EXT2 vs. EXT2C Test Outline Microbenchmarks Measure cold cache small reads/writes Warm Cache small reads Benchmarks capturing larger scale behavior PostMark Large sequential reads
Cold cache read/write Comparison We time the differences between ext2 and ext2c on: Single block reads Single block writes 10 block reads 10 block writes
Warm cache comparison What overhead does ext2c add when data is cached in memory?
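One component of the warm-cache overhead that can be estimated outside the kernel is the SHA-1 computation itself. The sketch below times hashing of a single block in user-space Python; it is a rough illustration, not the microbenchmark used in the talk. The 4096-byte block size, the iteration count, and the name `sha1_seconds_per_block` are assumptions, and the numbers it prints depend on the machine and say nothing about in-kernel costs such as opening and reading the checksum file.

```python
import hashlib
import time

BLOCK_SIZE = 4096    # assumed; not stated in the slides


def sha1_seconds_per_block(iterations: int = 10000,
                           block_size: int = BLOCK_SIZE) -> float:
    """Average wall-clock time to SHA-1 hash one in-memory block."""
    block = bytes(block_size)                # data is already "cached" in memory
    start = time.perf_counter()
    for _ in range(iterations):
        hashlib.sha1(block).digest()
    elapsed = time.perf_counter() - start
    return elapsed / iterations


per_block = sha1_seconds_per_block()
print(f"SHA-1 over one {BLOCK_SIZE}-byte block: {per_block * 1e6:.2f} microseconds")
```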
PostMark Benchmark crafted to simulate realistic small file workloads Intersperses read/write/append operations Measures throughput (transactions per second)
PostMark Results (transactions per second) Table comparing EXT2 and EXT2C; rows: Total Transactions, Create, Read, Append, Delete. (Only fragments of the values survive: Create 500, Delete 628.)
Large Sequential Reads Goal: checksumming costs should be amortized over long operations
Conclusions Benefit: notification of data corruption; bad or wrong data is no longer mistaken for good data. Cost: overhead of checksum computation and extra I/O. Throughput is halved on small-file workloads; sequential I/O amortizes some of the overhead.
Further Work Optimizations: open/close the checksum file when the data file is opened and closed, rather than on every read and write; batch checksum creation at file system creation time; place checksum data blocks near file blocks to reduce seeking.