Improving Disk Throughput in Data-Intensive Servers
Enrique V. Carrera and Ricardo Bianchini
Department of Computer Science, Rutgers University
Introduction
- Disk drives are often bottlenecks
- Several optimizations have been proposed:
  - Disk arrays
  - Fewer disk reads via smarter buffer cache management
  - Optimized disk writes using logs
  - Optimized disk scheduling
- Disk throughput is still a problem for data-intensive servers
Modern Disk Drives
- Substantial processing and memory capacity
- Disk controller cache:
  - Independent segments, each serving one sequential stream
  - If #streams > #segments, the LRU segment is replaced
  - On an access, blocks are read ahead to fill the segment
- Disk arrays:
  - The array controller may also cache data
  - Striping affects read-ahead
Key Problem
- Controller caches are not designed for servers; they assume:
  - Sequential access to a small number of large files
  - Read-ahead of consecutive disk blocks
  - The segment as the unit of allocation and replacement
- Data-intensive servers instead exhibit:
  - Small files
  - A large number of concurrent accesses
- As a result, a large number of blocks miss in the controller cache
This Work
- Goal: management techniques for disk controller caches that are efficient for servers
- Techniques:
  - File-Oriented Read-ahead (FOR)
  - Host-guided Device Caching (HDC)
- Both exploit the processing and memory capacity of modern drives
Architecture
File-Oriented Read-ahead
- The disk controller has no notion of file layout
- Read-ahead can therefore be useless for small files:
  - Disk utilization is not amortized
  - Useless blocks pollute the controller cache
- FOR only reads ahead blocks of the same file
File-Oriented Read-ahead
- FOR needs to know the layout of files on disk
- The controller keeps a bitmap of disk blocks:
  - A 1 means the block is the logical continuation of the previous block
  - Initialized at boot, updated on metadata writes
- Number of blocks to read ahead = number of consecutive 1's, up to the maximum read-ahead size
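The read-ahead length rule on this slide can be sketched as follows; this is a minimal illustration, and the bitmap representation, function name, and the 32-block maximum are assumptions, not details from the talk:

```python
MAX_READ_AHEAD = 32  # assumed maximum read-ahead size, in blocks

def read_ahead_length(bitmap, block):
    """Count how many blocks after `block` are marked in the bitmap
    as logical continuations of their predecessor (bit == 1),
    stopping at the first 0 or at the maximum read-ahead size."""
    n = 0
    while (n < MAX_READ_AHEAD
           and block + 1 + n < len(bitmap)
           and bitmap[block + 1 + n] == 1):
        n += 1
    return n
```

For example, with `bitmap = [0, 1, 1, 1, 0, 0]` a request for block 0 would read ahead 3 blocks, since blocks 1-3 are contiguous pieces of the same file and block 4 starts a different one.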
File-Oriented Read-ahead
- FOR could underutilize segments, so allocation and replacement are based on individual blocks
- Replacement policy: MRU
- FOR benefits:
  - Lower disk utilization
  - Higher controller cache hit rates
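A block-based cache with MRU replacement, as the slide describes, might look like the sketch below (class and method names are illustrative). MRU suits this setting because a just-consumed block of a sequential stream is unlikely to be re-read soon, so evicting it protects older, still-useful blocks:

```python
class MRUBlockCache:
    """Block-based controller cache with MRU replacement:
    on a miss with a full cache, evict the most recently
    used block. A simplified sketch of the policy only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = []  # ordered least- to most-recently used

    def access(self, blk):
        if blk in self.blocks:       # hit: move to MRU position
            self.blocks.remove(blk)
            self.blocks.append(blk)
            return True
        if len(self.blocks) >= self.capacity:
            self.blocks.pop()        # miss + full: evict MRU block
        self.blocks.append(blk)
        return False
```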
Host-guided Device Caching
- Data-intensive servers rely on disk arrays, so the aggregate controller cache space is non-trivial
- Current disk controller caches act only as speed-matching and read-ahead buffers
- They would be more useful if each cache could be managed directly by the host processor
Host-guided Device Caching
- In our evaluation, disk controllers permanently cache the data with the most misses in the buffer cache
- Each controller caches data stored on its own disk
- Assumes a block-based cache organization
- Requires support for three simple commands:
  - pin_blk()
  - unpin_blk()
  - flush_hdc()
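The three commands could be modeled on the controller roughly as below. The method names mirror the slide's pin_blk/unpin_blk/flush_hdc; the signatures, the capacity check, and the return values are assumptions for illustration:

```python
class HostGuidedCache:
    """Sketch of a block-based controller cache that honors
    host pinning requests, per the three HDC commands."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pinned = set()   # blocks pinned by the host

    def pin_blk(self, blk):
        # Pin the block only if cache space remains; the host
        # learns from the result whether the pin took effect.
        if len(self.pinned) < self.capacity:
            self.pinned.add(blk)
            return True
        return False

    def unpin_blk(self, blk):
        self.pinned.discard(blk)

    def flush_hdc(self):
        # Drop all host-pinned blocks at once.
        self.pinned.clear()
```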
Host-guided Device Caching
- Execution is divided into periods to determine:
  - How many blocks to cache
  - Which blocks those are
  - When to cache them
- HDC benefits:
  - Higher cache hit rates
  - Lower disk utilization
- Tradeoff: cache space is shared between HDC data and read-aheads
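Since the slides say HDC caches the data with the most buffer-cache misses, the per-period decision of which blocks to pin could be sketched as a simple top-k selection over the period's miss trace. The function name, interface, and budget parameter are illustrative assumptions:

```python
from collections import Counter

def blocks_to_pin(period_misses, budget):
    """At the end of a period, choose the blocks that missed
    most often in the host's buffer cache, up to a per-disk
    block budget. `period_misses` is the list of block numbers
    that missed during the period."""
    counts = Counter(period_misses)
    return [blk for blk, _ in counts.most_common(budget)]
```

The host would then issue pin_blk() for the selected blocks and unpin_blk() (or flush_hdc()) for blocks that fell out of the set.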
Methodology
- Simulation of 8 IBM Ultrastar 36Z15 drives attached to a non-caching Ultra160 SCSI card
- Logical disk blocks are striped across the array
- Contention for buses, memories, and other components is simulated in detail
- Synthetic and real traces (Web, proxy, and file servers)
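The striping of logical blocks across the array can be sketched as a standard RAID-0-style mapping; the function name and parameters are illustrative, not taken from the simulator:

```python
def stripe(logical_block, num_disks, stripe_unit_blocks):
    """Map a logical block number to (disk, block-on-disk)
    under round-robin striping with a fixed stripe unit."""
    stripe_index = logical_block // stripe_unit_blocks
    offset = logical_block % stripe_unit_blocks
    disk = stripe_index % num_disks
    block_on_disk = (stripe_index // num_disks) * stripe_unit_blocks + offset
    return disk, block_on_disk
```

This mapping is why the striping unit size interacts with read-ahead: with a small unit, consecutive logical blocks quickly cross disk boundaries, so each controller sees shorter sequential runs.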
Real Workloads
- Web: I/O time as a function of striping unit size (HDC size: 2MB)
Real Workloads
- Web: I/O time as a function of HDC memory size (striping unit: 16KB)
Real Workloads Summary
- Consistent and significant performance gains
- The combination of FOR and HDC achieves the best overall performance
Related Work
- Techniques external to disk controllers
- The controller cache differs from other caches:
  - Lack of temporal locality
  - Orders of magnitude smaller than main memory
  - Read-ahead restricted to sequential blocks
- Explicit grouping:
  - Groupings need to be found and maintained
  - Segment replacements may eliminate the benefits
Related Work
- Controller read-ahead and caching techniques:
  - None considered file-system information, host-guided caching, or block-based organizations
- Other disk controller optimizations:
  - Request scheduling
  - Utilizing free bandwidth
  - Data replication
- FOR and HDC are orthogonal to these
Conclusions
- Current controller cache management is inappropriate for servers
- FOR and HDC achieve significant and consistent increases in server throughput
- Real workloads show improvements of 47%, 33%, and 21% (Web, proxy, and file server, respectively)
Extensions
- Strategies for servers that use raw I/O
- A better approach than the bitmap
- Array controllers that cache data and hide the individual disks
- Impact of other buffer cache replacement policies and sizes
More Information
Synthetic Workloads
- I/O time as a function of file size
Synthetic Workloads
- I/O time as a function of the number of simultaneous streams
Synthetic Workloads
- I/O time as a function of access frequency
Synthetic Workloads Summary
- Disabling read-ahead hurts performance for files > 16KB
- Simply replacing segments with blocks has no effect
- FOR gains increase as file size decreases and the number of simultaneous streams increases
- HDC gains increase as requests shift toward a small number of blocks
- FOR gains decrease as the percentage of writes increases
Synthetic Workloads
- I/O time as a function of the percentage of writes