Conquest-2: Improving Energy Efficiency and Performance Through a Disk/RAM Hybrid File System
An-I Andy Wang
Florida State University
(NSF CCR , CNS )
2 Conquest-2 Team Members
FSU: An-I Andy Wang (PI), Charles Weddle, Cory Fox, Jin Qian, Dragan Lojpur, Mark Carpenter, Ryan Fishel
UCLA: Peter Reiher (Co-PI), Erik Kline
Harvey Mudd College: Geoff Kuenning
Former members: Mathew Oldham, Noriel Lu, RuGang Xu
3 Motivation
Computers are becoming cheaper
Energy is not
Energy consumption by storage devices
  8% for laptops
  24% for Web servers
  77% for proxy servers
  27% of the operating costs for data centers
Motivation – Conquest – Conquest-2 – Power-Aware RAID – Conclusion
4 Disk Energy Consumption
Laptops: 8% (20 min of battery life)
Proxy server: higher energy cost, higher cooling cost, lower density of servers, more space cost
System                             Disk % of system power  5-yr cost of disk power
Mobile Intel® Pentium® III laptop  8%                      $5
Pentium® 4 web server              24%                     $120
Pentium® 4 web proxy server        77%                     $1,300
5 Performance vs. Energy Benefits
Performance
  More relevant during peak loads
Energy savings
  Realized instantaneously
6 Roadmap
Conquest
Existing energy-saving approaches
Emergence of memory-rich storage era
Conquest-2
7 Conquest
A disk/persistent-RAM hybrid file system
Deliver all file system services from memory, with the exception of high-capacity storage
Two separate and specialized data paths
Benefits:
  Simplicity
  Performance
8 Hardware Evolution
[Figure: accesses per second (log scale, KHz to GHz) for CPU (50%/yr), memory (50%/yr), and disk (15%/yr); gap annotations "1 sec : 6 days" and "1 sec : 3 months"]
9 Storage Media Alternatives
[Figure: accesses/sec (log) vs. $/MB (log); labels: persistent RAM, Magnetic RAM?, (write once), flash memory, disk, tape, battery-backed DRAM]
[Caceres et al., 1993; Hillyer et al., 1996; Qualstar 1998; Tanisys 1999; Micron Semiconductor Products 2000; Quantum 2000]
10 Price Trend of Persistent RAM
[Figure: $/MB (log) vs. year for paper/film, 3.5” HDD, 2.5” HDD, 1” HDD, and persistent RAM]
Booming of digital photography
4 to 10 GB of persistent RAM
[Grochowski 2000]
11 User Access Patterns
Small files
  Take little space (10%)
  Represent most accesses (90%)
Large files
  Take most space
  Mostly sequential accesses
  Except database applications
[Iram 1993; Douceur et al., 1999; Roselli et al., 2000]
12 Files Stored in Persistent RAM
Small files (< 1 MB)
  No seek time or rotational delays
  Fast byte-level accesses
Metadata
  Fast synchronous update
  No dual representations
Executables and shared libraries
  In-place execution
13 Large-File-Only Disk Storage
Allocate in big chunks
No fragmentation management
No tricks for small files
  Storing data in metadata
  Wrapping a balanced tree onto disk cylinders
[Devlinux.com 2000]
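The split between the RAM path and the disk path described on the last two slides can be sketched as a simple placement rule. This is only an illustration, not the Conquest implementation: the function name, its parameters, and the "ram"/"disk" labels are hypothetical, and the 1 MB cutoff is the small-file size quoted above.

```python
SMALL_FILE_THRESHOLD = 1 << 20  # 1 MB, the small-file cutoff quoted on the slide


def storage_medium(size_bytes, is_metadata=False, is_executable=False):
    """Pick a storage medium for a file (hypothetical helper, not Conquest code).

    Metadata, executables/shared libraries, and small files go to
    persistent RAM; everything else goes to the large-file-only disk path.
    """
    if is_metadata or is_executable or size_bytes < SMALL_FILE_THRESHOLD:
        return "ram"
    return "disk"


small_source_file = storage_medium(4096)
large_video = storage_medium(700 << 20)
large_binary = storage_medium(5 << 20, is_executable=True)
```

Under these assumptions only large data files ever reach the disk, which is what allows the disk layout to drop its small-file optimizations.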
14 Conquest is comparable to ramfs
At least 24% faster than the LRU disk cache
ISP workload ( s, web-based transactions)
PostMark Benchmark
  40 to 250 MB working set with 2 GB physical RAM
[Katcher 1997; Sweeney et al., 1996; Card et al., 1999; Namesys 2002]
15 When working set > RAM, Conquest is 1.4 to 2 times faster than ext2fs, reiserfs, and SGI XFS
PostMark Benchmark
  10,000 files, 3.5 GB working set with 2 GB physical RAM
16 Conquest-2
Conquest has already improved performance
Can we extend Conquest to improve performance and reduce energy consumption at the same time?
17 Conquest-Based Numbers
A UCLA Web server
  Single disk
  File size threshold of 32 KB
  Spin down whenever the disk idle time > 10 s
Conquest: 84% energy savings
LRU: 64% energy savings
Greater benefits for multiple disks
18 Existing Approaches
Provide degraded service
  Reduced disk rotation speed
Speculative methods
  Predicting idle periods for shutting down a disk
Not suitable for servers
  High loads
  Uniform data striping among disks
19 Just Use Laptop Drives?
Cannot simply replace server drives with laptop ones
                             Typical server drive  Typical laptop drive
Power consumption (active)   13 W                  2 W
Average latency              4 ms                  7 ms
Sustained transfer rate      30 – 60 MB/s          35 MB/s
Spin-up time                 10 s                  1.6 s
Cost ($/GB)                  $1/GB                 $4/GB
20 Persistent RAM Storage?
RAM performance/energy savings and disk capacity?
                             Typical server drive  Typical RAM
Power consumption (active)   12.5 W                735 mW
Average latency              4.2 ms                1.4 – 2.8 ns
Sustained transfer rate      32 – 58 MB/s          240 MB/s
Spin-up time                 10 s                  72 – 200 ns
Cost ($/GB)                  $1.2/GB               $154/GB
21 Why not Conventional Caching?
High overhead to access data stored in RAM storage
90% cache hit rate ≠ 90% disk idle time
  10% of cache misses can keep a drive spinning all the time
  e.g., multimedia workloads
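A small worked example of why hit rate and idle time differ. It is a sketch under stated assumptions: the workload (one request per second for 1000 s), the function name, and the two miss patterns are hypothetical; the 10 s idle threshold is the spin-down rule quoted earlier in the deck; spin-up latency is ignored.

```python
def spun_down_seconds(disk_access_times, end_time, idle_threshold=10.0):
    """Seconds the disk spends spun down if it stops after more than
    idle_threshold seconds without an access (spin-up latency ignored)."""
    total = 0.0
    # Pair each disk access with the next one; the last access pairs with end_time.
    boundaries = list(disk_access_times) + [end_time]
    for start, stop in zip(boundaries, boundaries[1:]):
        gap = stop - start
        if gap > idle_threshold:
            total += gap - idle_threshold
    return total


# Both patterns have 100 misses out of 1000 requests, i.e. a 90% hit rate.
spread_misses = [t for t in range(1000) if t % 10 == 0]  # one miss every 10 s
bursty_misses = list(range(100))                         # all misses in the first 100 s

spread_idle = spun_down_seconds(spread_misses, end_time=1000)
bursty_idle = spun_down_seconds(bursty_misses, end_time=1000)
```

With evenly spread misses the gap between disk accesses never exceeds the threshold, so the disk never spins down (`spread_idle` is 0). With the same hit rate but bursty misses, the disk can be off for 891 of the 1000 seconds.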
22 What if you have multiple disks?
23 And access patterns are skewed
[Figure: access patterns]
24 Better Off Caching Cold Disks
Spin down cold disks
[Figure: access patterns]
25 Conquest-2 Approach
Strategic use of memory storage
  Improve performance
Energy-aware memory manager
  Power down unused banks
Power-aware RAIDs (PARAIDs)
  “Gear-shift” individual drives according to performance demands
26 New Roles of Memory
Shaping the frequency, timing, and predictability of disk accesses
Low frequency of disk access
  Better performance
  Energy savings
Predictability
  Hide the latency to spin a disk up
27 File Access Characterizations
File        Frequency  Arrival times  Predictability  Size   Location
.tar.gz     Low        Bulk           Low             Large  Disk
.mpg        Low        Scattered      High            Large  Disk
.c          High       Scattered      Low             Small  RAM
locate.db   High       Bulk           Low             Large  RAM
28 Energy-Aware Memory Management
[Figure: indices and data regrouped into frequently used (index, data) and infrequently used (index, data)]
Conceptually simple, but difficult in practice
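The idea of grouping frequently and infrequently used data can be illustrated with a toy model of memory banks. This is a sketch, not the Conquest-2 memory manager: the function name, the 16-page layout, and the 4-pages-per-bank size are hypothetical.

```python
def banks_without_hot_pages(page_is_hot, pages_per_bank):
    """Return the indices of banks whose pages are all infrequently used,
    i.e. the banks that could be powered down in this toy model."""
    idle_banks = []
    for start in range(0, len(page_is_hot), pages_per_bank):
        bank = page_is_hot[start:start + pages_per_bank]
        if not any(bank):
            idle_banks.append(start // pages_per_bank)
    return idle_banks


# Hypothetical layout: 16 pages, 4 pages per bank, 4 frequently used pages.
interleaved = [i % 4 == 0 for i in range(16)]  # one hot page lands in every bank
grouped = [i < 4 for i in range(16)]           # all hot pages packed into bank 0

idle_when_interleaved = banks_without_hot_pages(interleaved, pages_per_bank=4)
idle_when_grouped = banks_without_hot_pages(grouped, pages_per_bank=4)
```

The same four hot pages leave no bank idle when interleaved, but leave banks 1 to 3 idle when grouped. The difficulty noted on the slide is getting an allocator to produce the grouped layout.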
29 Linux Memory Manager (1)
Page allocator maintains individual pages
[Figure: page allocator]
30 Linux Memory Manager (2)
Zone allocator allocates memory in power-of-two sizes
[Figure: page allocator, zone allocator]
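A minimal sketch of what power-of-two sizing implies for a request. This is not the Linux kernel's code; the function name and the example request size are hypothetical.

```python
def round_up_to_power_of_two(pages):
    """Smallest power of two that is >= pages (pages must be at least 1)."""
    if pages < 1:
        raise ValueError("pages must be at least 1")
    return 1 << (pages - 1).bit_length()


requested = 5                                   # hypothetical request, in pages
granted = round_up_to_power_of_two(requested)   # size actually handed out
unused = granted - requested                    # internal fragmentation, in pages
```

A 5-page request is served with 8 pages, leaving 3 unused. Rounding of this kind is a source of internal fragmentation.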
31 Linux Memory Manager (3)
Slab allocator groups allocations by sizes to reduce internal memory fragmentation
[Figure: page allocator, zone allocator, slab allocator]
32 Linux Memory Manager (4)
Difficult to collocate information according to energy constraints
[Figure: page allocator, slab allocator, zone allocator]
33 Conventional RAID
[Figure: axes labeled load, time, and drives]
34 Power-Aware RAID
[Figure: two panels with axes labeled load, time, and drives]
35 Challenges
Energy
  Not enough opportunities to spin down RAIDs
Performance
  Essential for peak loads
Reliability
  Server-class drives are not designed for frequent power switching
36 Power-Aware RAID Observations
RAIDs are configured for peak performance
  Uniform striping keeps all drives spinning for light loads
Over-provision of storage capacity
  Unused storage can be traded for energy savings
Cyclic fluctuation of loads
  Infrequent on-off power transitions can be effective
37 Cyclic Fluctuation of Loads
[Figure: load vs. time; labels: utilization threshold, gear 1, gear 2]
38 Skewed Striping for Energy Saving
Use over-provisioned spare storage
Can use fewer drives for light loads
[Figure: RAID-5 layout across disk 1 to disk 4; labels: gear 1, gear 2, gear 3, soft-state block replication]
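One way to picture skewed striping is as a lookup that changes with the gear. The sketch below is an assumption-laden illustration, not the PARAID layout: it ignores parity, uses simple modulo striping, and the function name and replica placement rule are hypothetical.

```python
def locate_block(block, total_disks, active_disks):
    """Return (disk, is_replica) for a block under a given gear.

    Hypothetical rule: a block's home is block % total_disks. If the home
    disk is powered off in the current gear, the block is read from a
    soft-state replica kept in spare space on one of the active disks.
    """
    home = block % total_disks
    if home < active_disks:
        return home, False
    return home % active_disks, True


# Four disks in total; the top gear uses all four, the low gear only two.
in_top_gear = locate_block(6, total_disks=4, active_disks=4)
in_low_gear = locate_block(6, total_disks=4, active_disks=2)
```

In the top gear block 6 is read from its home disk. In the low gear that disk is off, so the read is served by a replica on an active disk. The replica is soft state that can be rebuilt from the home copy.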
39 Preserving Peak Performance
Based on RAID-5
All drives on for peak loads
  Full parallelism
Fewer drives on for light loads
  Lower latency for small files
  Degraded throughput for large files
40 Reliability
Drives have a limited number of power cycles
Form bi-modal distribution of busy/idle drives
[Figure: Disk 1 to Disk 6 across Gear 1 to Gear 3; labels: busy disks, power cycled disks, idle disks, role exchange]
41 Reliability
Drives have a limited number of power cycles
Form bi-modal distribution of busy/idle drives
Rotate drives with more power cycles
[Figure: % of power cycles (0% to 100%) vs. power cycles (0 to 20,000) for gear 1, gear 2, and gear 3]
42 Reliability
Drives have a limited number of power cycles
Form bi-modal distribution of busy/idle drives
Rotate drives with more power cycles
Ration number of power cycles
Distributed parity (RAID-5)
  Tolerate single-disk failures
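Rationing power cycles can be sketched as a pro-rated budget. The class below is hypothetical and not taken from PARAID; the 20,000-cycle rating and the 5-year lifetime are illustrative values chosen for the example.

```python
class PowerCycleBudget:
    """Pro-rate a drive's rated power cycles over its intended lifetime."""

    def __init__(self, rated_cycles, lifetime_days):
        self.daily_allowance = rated_cycles / lifetime_days
        self.used = 0

    def may_cycle(self, days_elapsed):
        """Allow another power cycle only if usage stays within the
        allowance accumulated over days_elapsed days."""
        return self.used + 1 <= self.daily_allowance * days_elapsed

    def record_cycle(self):
        self.used += 1


# Illustrative numbers: 20,000 rated cycles spread over 5 years.
budget = PowerCycleBudget(rated_cycles=20_000, lifetime_days=5 * 365)

cycles_on_first_day = 0
while budget.may_cycle(days_elapsed=1):
    budget.record_cycle()
    cycles_on_first_day += 1
```

With these numbers the allowance is just under 11 cycles per day, so the loop grants 10 cycles on the first day and refuses the 11th. A gear-shifting policy would then have to stay in its current gear until more allowance accrues.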
43 Other Issues
Update propagations
Gear-shifting policies
[Figure: disk utilization vs. time; labels: gear 1 utilization threshold, gear 2 utilization threshold, gear shift, downshift; 300s, 60s, 10s moving averages]
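A gear-shifting policy built on utilization thresholds and moving averages might look like the sketch below. The slide gives only the ingredients (thresholds and 300 s, 60 s, and 10 s moving averages), so the specific rule here is an assumption: shift up when the 10 s average exceeds a threshold, and shift down only when all three averages are low. The class, function, and threshold values are hypothetical, and utilization is assumed to be sampled once per second.

```python
from collections import deque


class MovingAverage:
    """Average of the most recent `window` samples."""

    def __init__(self, window):
        self.samples = deque(maxlen=window)

    def add(self, value):
        self.samples.append(value)

    def value(self):
        return sum(self.samples) / len(self.samples) if self.samples else 0.0


def next_gear(gear, max_gear, averages, up_threshold, down_threshold):
    """Assumed rule: react quickly to rising load, slowly to falling load."""
    if gear < max_gear and averages[10].value() > up_threshold:
        return gear + 1
    if gear > 1 and all(a.value() < down_threshold for a in averages.values()):
        return gear - 1
    return gear


averages = {window: MovingAverage(window) for window in (300, 60, 10)}


def observe(utilization, seconds):
    """Feed one utilization sample per second into every moving average."""
    for _ in range(seconds):
        for avg in averages.values():
            avg.add(utilization)


observe(0.9, 10)                                   # short burst of high load
gear = next_gear(1, 2, averages, 0.8, 0.3)
observe(0.1, 10)                                   # brief lull
gear_after_short_lull = next_gear(gear, 2, averages, 0.8, 0.3)
observe(0.1, 300)                                  # sustained light load
gear_after_long_lull = next_gear(gear_after_short_lull, 2, averages, 0.8, 0.3)
```

The burst triggers an upshift, the brief lull does not trigger a downshift because the longer averages are still high, and only sustained light load does. That asymmetry keeps the number of power cycles low. The sketch ignores that the same load produces higher utilization in a lower gear.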
44 Gear-Shifting Policies
[Figure: load vs. time, two panels: ideal and in practice]
45 Empirical Measurements
Servers are not measurement friendly
Time consuming
Cannot easily apply the trick of skipping idle times
46 Workload Selection
Need to match with the hardware setup
[Figure: energy consumption vs. load; labels: linear scaling, everything on or off, geared switching, no choices, time]
47 Experiment Set 1
Workload
  FSU CS Department Web Server trace
  A single day trace
Hardware
  Dell 2600 with 5 drives
PARAID
  2 gears (3-disk RAID-0 and 5-disk RAID-0)
  No energy-aware memory management
48 Web Trace Replay
[Figure: plot series labeled 512x, 1024x, and 1920x]
Speed-up              Power savings (+stdev)
1920x (290 req/sec)   15% (+2.2%)
1024x (144 req/sec)   25% (+1.3%)
512x (72 req/sec)     34% (+1.1%)
49 Experiment Set 2
Workload
  Cello99 server I/O trace from HP
  A 50-hr trace
Hardware
  Dell 2600 with 5 drives
PARAID
  2 gears (3-disk RAID-5 and 5-disk RAID-5)
  No energy-aware memory management
50 Cello99 50hr Trace
[Figure: plot series labeled 32x, 64x, and 128x]
Speed-up              Power savings (+stdev)
128x (1024 req/sec)   7.8% (+0.21%)
64x (548 req/sec)     12% (+2.8%)
32x (274 req/sec)     13% (+0.26%)
51 Experiment Set 3
Workload
  PostMark benchmark (ISP workload)
Hardware
  Dell 2600 with 5 drives
PARAID
  2 gears (3-disk RAID-5 and 5-disk RAID-5)
  No energy-aware memory management
52 PostMark
53 Conquest-2 Current Status
PARAID
  Implementing reliability mechanisms
Energy-aware memory manager
  Integrating the memory and the disk components
Empirical measurements
  Exploring different server loads
54 Conclusion
Energy efficiency and performance can be achieved simultaneously
PARAID-0 with “2 gears” has already shown a 15% reduction in power with < 1% performance loss
55 Questions
Google keywords
  Conquest file system
  Power-Aware RAID
  Andy Wang FSU
56 Gear-Shifting Details