Conquest-2: Improving Energy Efficiency and Performance Through a Disk/RAM Hybrid File System
An-I Andy Wang, Florida State University (NSF CCR-0098363, CNS)

2 Conquest-2 Team Members
FSU: An-I Andy Wang (PI), Charles Weddle, Cory Fox, Jin Qian, Dragan Lojpur, Mark Carpenter, Ryan Fishel
UCLA: Peter Reiher (Co-PI), Erik Kline
Harvey Mudd College: Geoff Kuenning
Former members: Mathew Oldham, Noriel Lu, RuGang Xu

3 Motivation
Computers are becoming cheaper; energy is not.
Energy consumption by storage devices:
- 8% for laptops
- 24% for web servers
- 77% for proxy servers
- 27% of the operating costs for data centers

4 Disk Energy Consumption
Laptops: 8% of system power corresponds to about 20 minutes of battery life.
Proxy servers: higher energy cost leads to higher cooling cost, lower server density, and more space cost.

System                              Disk % of system power   5-yr cost of disk power
Mobile Intel® Pentium® III laptop   8%                       $5
Pentium® 4 web server               24%                      $120
Pentium® 4 web proxy server         77%                      $1,300

5 Performance vs. Energy Benefits
- Performance: more relevant during peak loads
- Energy savings: realized instantaneously

6 Roadmap
- Conquest
- Existing energy-saving approaches
- Emergence of the memory-rich storage era
- Conquest-2

7 Conquest
A disk/persistent-RAM hybrid file system:
- Delivers all file system services from memory, with the exception of high-capacity storage
- Two separate and specialized data paths
Benefits: simplicity and performance

8 Hardware Evolution
(Figure: accesses per second, log scale. CPU and memory speeds improve about 50% per year, disks only about 15% per year; the resulting speed gaps are on the order of 1 sec : 3 months and 1 sec : 6 days.)

9 Storage Media Alternatives [Caceres et al., 1993; Hillyer et al., 1996; Qualstar 1998; Tanisys 1999; Micron Semiconductor Products 2000; Quantum 2000]
(Figure: accesses/sec vs. $/MB, both log scale, comparing persistent RAM candidates such as battery-backed DRAM, magnetic RAM, and flash memory (write once) against disk and tape.)

10 Price Trend of Persistent RAM [Grochowski 2000]
(Figure: $/MB over time, log scale, for paper/film, 3.5” HDD, 2.5” HDD, 1” HDD, and persistent RAM. The booming of digital photography is driving persistent RAM prices down, making 4 to 10 GB of persistent RAM affordable.)

11 User Access Patterns [Irlam 1993; Douceur et al., 1999; Roselli et al., 2000]
Small files:
- Take little space (10%)
- Represent most accesses (90%)
Large files:
- Take most space
- Mostly sequential accesses, except in database applications

12 Files Stored in Persistent RAM
Small files (< 1 MB):
- No seek time or rotational delays
- Fast byte-level accesses
Metadata:
- Fast synchronous update
- No dual representations
Executables and shared libraries:
- In-place execution
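The split above can be illustrated with a minimal placement sketch in C (hypothetical names and structures, assuming the 1 MB small-file threshold from this slide; this is not Conquest's actual code):

    /* Sketch of a Conquest-style placement policy: small files, metadata,
     * and executables live in persistent RAM; everything else goes to the
     * large-file-only disk store. Names and structures are hypothetical. */
    #include <stdbool.h>
    #include <stddef.h>

    #define SMALL_FILE_THRESHOLD (1 << 20)   /* 1 MB threshold from the slide */

    enum storage_medium { MEDIUM_PRAM, MEDIUM_DISK };

    struct file_attrs {
        size_t size;          /* current or expected file size in bytes */
        bool   is_metadata;   /* directories, inodes, and other metadata */
        bool   is_executable; /* executables and shared libraries */
    };

    static enum storage_medium choose_medium(const struct file_attrs *f)
    {
        if (f->is_metadata || f->is_executable)
            return MEDIUM_PRAM;   /* fast synchronous updates, in-place execution */
        if (f->size < SMALL_FILE_THRESHOLD)
            return MEDIUM_PRAM;   /* no seek time or rotational delay */
        return MEDIUM_DISK;       /* large files go to the disk store */
    }

A complete system also has to handle files that grow past the threshold (migration from RAM to disk); the sketch only shows the initial decision.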

13 Large-File-Only Disk Storage [Devlinux.com 2000]
Allocate in big chunks:
- No fragmentation management
- No tricks for small files, such as storing data in metadata or wrapping a balanced tree onto disk cylinders
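A minimal sketch of what "allocate in big chunks" can mean in practice (assumed 4 MB chunks and a flat bitmap; not Conquest's allocator):

    /* Large-chunk (extent-style) allocation for a large-file-only disk store.
     * Space is handed out in big fixed-size chunks, so there is no small-file
     * fragmentation to manage. Sizes and structures are assumptions. */
    #include <stdint.h>
    #include <string.h>

    #define CHUNK_SIZE (1u << 22)          /* 4 MB per chunk (assumed) */
    #define NUM_CHUNKS (1u << 16)          /* 64K chunks of capacity */

    static uint8_t chunk_used[NUM_CHUNKS]; /* 1 = allocated, 0 = free */

    /* Reserve enough contiguous chunks for 'nbytes'; returns the first chunk
     * index, or -1 if no sufficiently long free run exists. */
    static long alloc_chunks(uint64_t nbytes)
    {
        uint32_t need = (uint32_t)((nbytes + CHUNK_SIZE - 1) / CHUNK_SIZE);
        uint32_t run = 0;

        for (uint32_t i = 0; i < NUM_CHUNKS; i++) {
            run = chunk_used[i] ? 0 : run + 1;
            if (run == need) {
                uint32_t start = i - need + 1;
                memset(&chunk_used[start], 1, need);
                return (long)start;
            }
        }
        return -1;
    }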

14 Conquest is comparable to ramfs, and at least 24% faster than the LRU disk cache [Katcher 1997; Sweeney et al., 1996; Card et al., 1999; Namesys 2002]
- PostMark benchmark, ISP workload (e-mails, web-based transactions)
- 40 to 250 MB working set with 2 GB of physical RAM

15 When the working set exceeds RAM, Conquest is 1.4 to 2 times faster than ext2fs, reiserfs, and SGI XFS
- PostMark benchmark: 10,000 files, 3.5 GB working set with 2 GB of physical RAM

16 Conquest-2
Conquest improved performance. Can we extend Conquest to improve performance and reduce energy consumption at the same time?

17 Conquest-Based Numbers
A UCLA web server:
- Single disk, file-size threshold of 32 KB
- Spin down whenever the disk idle time exceeds 10 s
Results:
- Conquest: 84% energy savings
- LRU: 64% energy savings
Greater benefits for multiple disks.
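A minimal sketch of the 10-second idle spin-down policy (hypothetical interface, not the code behind these numbers):

    /* Spin a disk down once it has been idle longer than a threshold. */
    #include <stdbool.h>
    #include <time.h>

    #define IDLE_THRESHOLD_SEC 10          /* the 10 s threshold from the slide */

    struct disk_state {
        time_t last_access;                /* time of the most recent request */
        bool   spinning;
    };

    /* Call on every disk request. */
    static void on_disk_access(struct disk_state *d)
    {
        d->last_access = time(NULL);
        d->spinning = true;                /* a request implies a spin-up */
    }

    /* Call periodically, e.g., once per second. */
    static void maybe_spin_down(struct disk_state *d)
    {
        if (d->spinning && time(NULL) - d->last_access > IDLE_THRESHOLD_SEC) {
            d->spinning = false;           /* issue the real spin-down command here */
        }
    }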

18 Existing Approaches
- Provide degraded service: reduced disk rotation speed
- Speculative methods: predicting idle periods for shutting down a disk
Not suitable for servers:
- High loads
- Uniform data striping among disks

19 Just Use Laptop Drives?
Cannot simply replace server drives with laptop ones. Typical server drive vs. typical laptop drive:
- Active power consumption: 13 W vs. 2 W
- Average latency: 4 ms vs. 7 ms
- Sustained transfer rate: 30-60 MB/s vs. 35 MB/s
- Spin-up time: 10 s vs. 1.6 s
- Cost: $1/GB vs. $4/GB

20 Persistent RAM Storage?
Can we get RAM performance and energy savings together with disk capacity? Typical server drive vs. typical RAM:
- Active power consumption: 12.5 W vs. 735 mW
- Average latency: 4.2 ms vs. 1.4-2.8 ns
- Sustained transfer rate: 32-58 MB/s vs. 240 MB/s
- Spin-up time: 10 s vs. 72-200 ns
- Cost: $1.2/GB vs. $154/GB

21 Why Not Conventional Caching?
- High overhead to access data stored in RAM storage
- 90% cache hit rate ≠ 90% disk idle time: the remaining 10% of misses can keep a drive spinning all the time (e.g., multimedia workloads)
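To make the second point concrete with hypothetical numbers: at 100 requests per second and a 90% hit rate, misses still reach the disk about 10 times per second, roughly one every 0.1 s on average, far shorter than a typical 10 s spin-down threshold, so the drive never stays idle long enough to be spun down.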

22 What if you have multiple disks?

23 And access patterns are skewed
(Figure: per-disk access patterns, showing skewed load across disks.)

24 Better Off Caching Cold Disks
- Spin down cold disks
(Figure: skewed access patterns; the cold disks are cached and spun down.)

25 Conquest-2 Approach
- Strategic use of memory storage: improve performance
- Energy-aware memory manager: power down unused banks
- Power-aware RAIDs (PARAIDs): “gear-shift” individual drives according to performance demands

26 New Roles of Memory
Shaping the frequency, timing, and predictability of disk accesses:
- Low frequency of disk access: better performance and energy savings
- Predictability: hide the latency to spin a disk up

27 File Access Characterizations

File type   Frequency   Arrival times   Predictability   Size    Location
.tar.gz     Low         Bulk            Low              Large   Disk
.mpg        Low         Scattered       High             Large   Disk
.c          High        Scattered       Low              Small   RAM
locate.db   High        Bulk            Low              Large   RAM

28 Energy-Aware Memory Management
(Figure: separating frequently used (index, data) pairs from infrequently used ones, so that banks holding only cold data can be powered down.)
Conceptually simple, but difficult in practice.
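One way to picture the goal (a conceptual sketch with assumed bank sizes and names, not the Conquest-2 memory manager): pack frequently used allocations into low-numbered banks and infrequently used ones into high-numbered banks, so whole banks can go idle.

    /* Bank-aware placement sketch: hot data packs into the lowest-numbered
     * banks, cold data into the highest-numbered ones, leaving middle banks
     * empty and eligible for power-down. All names and sizes are assumed. */
    #include <stdbool.h>
    #include <stddef.h>

    #define NUM_BANKS 8
    #define BANK_SIZE (64u * 1024 * 1024)   /* 64 MB banks (assumed) */

    struct bank {
        size_t used;                        /* bytes allocated in this bank */
    };

    static struct bank banks[NUM_BANKS];

    /* Return the bank index to place an allocation in, or -1 if full. */
    static int pick_bank(size_t size, bool frequently_used)
    {
        if (frequently_used) {
            for (int b = 0; b < NUM_BANKS; b++)
                if (banks[b].used + size <= BANK_SIZE)
                    return b;
        } else {
            for (int b = NUM_BANKS - 1; b >= 0; b--)
                if (banks[b].used + size <= BANK_SIZE)
                    return b;
        }
        return -1;
    }

The hard part, as the next slides show, is that the stock Linux allocation layers give callers no way to express this placement constraint.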

29 Linux Memory Manager (1)
The page allocator maintains individual pages.

30 Linux Memory Manager (2)
The zone allocator allocates memory in power-of-two sizes.

31 Linux Memory Manager (3)
The slab allocator groups allocations by size to reduce internal memory fragmentation.
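As a rough illustration of these three layers (a kernel-module style sketch using the standard interfaces with recent signatures; this is not Conquest-2 code):

    /* The three Linux allocation layers described on slides 29-31. */
    #include <linux/gfp.h>    /* page-level allocation: alloc_pages, __get_free_pages */
    #include <linux/slab.h>   /* slab allocator: kmalloc, kmem_cache_* */

    static int demo_allocations(void)
    {
        /* Page allocator: 2^2 = 4 contiguous physical pages. */
        struct page *pages = alloc_pages(GFP_KERNEL, 2);

        /* Power-of-two allocation returning a directly usable kernel address. */
        unsigned long buf = __get_free_pages(GFP_KERNEL, 0);

        /* Slab allocator: a cache of fixed-size objects, which limits
         * internal fragmentation. */
        struct kmem_cache *cache = kmem_cache_create("demo_obj", 256, 0, 0, NULL);
        void *obj = cache ? kmem_cache_alloc(cache, GFP_KERNEL) : NULL;

        /* Note: none of these interfaces lets the caller ask for memory on a
         * particular bank that could later be powered down, which is the
         * difficulty summarized on the next slide. */

        if (obj)
            kmem_cache_free(cache, obj);
        if (cache)
            kmem_cache_destroy(cache);
        if (buf)
            free_pages(buf, 0);
        if (pages)
            __free_pages(pages, 2);
        return 0;
    }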

32 Linux Memory Manager (4)
It is difficult to collocate information according to energy constraints.

33 Conventional RAID
(Figure: load over time; all drives stay active regardless of load.)

34 Power-Aware RAID
(Figure: load over time; the number of active drives tracks the load.)

35 Challenges
- Energy: not enough opportunities to spin down RAIDs
- Performance: essential for peak loads
- Reliability: server-class drives are not designed for frequent power switching

36 Power-Aware RAID Observations
- RAIDs are configured for peak performance: uniform striping keeps all drives spinning even for light loads
- Storage capacity is over-provisioned: unused storage can be traded for energy savings
- Loads fluctuate cyclically: infrequent on-off power transitions can be effective

37 Cyclic Fluctuation of Loads
(Figure: load over time crossing a utilization threshold; the array shifts between gear 1 and gear 2.)

38 Skewed Striping for Energy Saving
- Use over-provisioned spare storage for soft-state block replication
- Can use fewer drives for light loads
(Figure: disks 1-4; gear 3 uses the full RAID-5 layout, while gears 1 and 2 serve requests from fewer disks via soft-state block replication.)
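A minimal sketch of how reads could be redirected in a lower gear (hypothetical layout and structures, not the PARAID implementation):

    /* Gear-aware block lookup over four disks. In the top gear a block is
     * read from its normal striped location; in a lower gear, blocks whose
     * home disk is powered down are served from soft-state replicas spread
     * over the disks that remain active. */
    #include <stdint.h>

    #define NUM_DISKS 4

    struct paraid_state {
        int active_disks;                  /* disks spinning in the current gear */
    };

    /* Disk holding the block under uniform striping across all disks. */
    static int home_disk(uint64_t blk)
    {
        return (int)(blk % NUM_DISKS);
    }

    /* Disk to read the block from in the current gear. */
    static int disk_for_read(const struct paraid_state *s, uint64_t blk)
    {
        int d = home_disk(blk);
        if (d < s->active_disks)
            return d;                          /* original copy is on an active disk */
        return (int)(blk % s->active_disks);   /* otherwise use the replica */
    }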

39 Preserving Peak Performance
Based on RAID-5:
- All drives on for peak loads: full parallelism
- Fewer drives on for light loads: lower latency for small files, degraded throughput for large files

40 Reliability
Drives have a limited number of power cycles:
- Form a bi-modal distribution of busy and idle drives
(Figure: six disks across gears 1-3; busy, idle, and power-cycled disks exchange roles.)

41 Reliability
Drives have a limited number of power cycles:
- Form a bi-modal distribution of busy and idle drives
- Rotate drives with more power cycles
(Figure: percentage of power cycles consumed per drive, on a 0 to 20,000 power-cycle scale, for gears 1-3.)

42 Reliability
Drives have a limited number of power cycles:
- Form a bi-modal distribution of busy and idle drives
- Rotate drives with more power cycles
- Ration the number of power cycles
- Distributed parity (RAID-5): tolerate single-disk failures
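A small sketch of one way to ration power cycles when choosing which drive to toggle next (hypothetical bookkeeping, not the PARAID reliability code):

    /* Prefer to power-cycle the drive that has consumed the fewest of its
     * rated cycles, so wear is spread evenly across the array. */
    #include <limits.h>

    #define NUM_DISKS 5

    static int power_cycles[NUM_DISKS];   /* cycles consumed so far, per disk */

    /* Among 'n' candidate disks, pick the least-cycled one to toggle next. */
    static int pick_disk_to_cycle(const int *candidates, int n)
    {
        int best = -1;
        int best_cycles = INT_MAX;

        for (int i = 0; i < n; i++) {
            int d = candidates[i];
            if (power_cycles[d] < best_cycles) {
                best_cycles = power_cycles[d];
                best = d;
            }
        }
        return best;
    }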

43 Other Issues
- Update propagations
- Gear-shifting policies
(Figure: disk utilization over time against gear 1 and gear 2 utilization thresholds; downshifts are based on 300 s, 60 s, and 10 s moving averages.)

44 Gear-Shifting Policies
(Figure: ideal vs. in-practice gear shifting as load varies over time.)
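A minimal sketch of a downshift check driven by the 300 s, 60 s, and 10 s moving averages shown on slide 43 (the threshold value and window handling are assumptions, not the PARAID policy code):

    /* Only downshift when all three utilization averages sit below the lower
     * gear's threshold, so a brief lull in load does not trigger a shift. */
    #include <stdbool.h>

    struct util_averages {
        double avg_300s;                   /* long-term average utilization, 0..1 */
        double avg_60s;                    /* medium-term */
        double avg_10s;                    /* short-term */
    };

    #define DOWNSHIFT_THRESHOLD 0.6        /* assumed lower-gear utilization limit */

    static bool should_downshift(const struct util_averages *u)
    {
        return u->avg_300s < DOWNSHIFT_THRESHOLD &&
               u->avg_60s  < DOWNSHIFT_THRESHOLD &&
               u->avg_10s  < DOWNSHIFT_THRESHOLD;
    }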

45 Empirical Measurements
Servers are not measurement friendly:
- Time consuming
- Cannot easily apply the trick of skipping idle times

46 Workload Selection
The workload needs to match the hardware setup.
(Figure: energy consumption as a function of load, contrasting linear scaling, everything on or off, geared switching, and regions with no choices.)

47 Experiment Set 1
- Workload: FSU CS Department web server trace (a single-day trace)
- Hardware: Dell 2600 with 5 drives
- PARAID: 2 gears (3-disk RAID-0 and 5-disk RAID-0)
- No energy-aware memory management

48 Web Trace Replay

Speed-up              Power savings (± stdev)
1920x (290 req/sec)   15% (± 2.2%)
1024x (144 req/sec)   25% (± 1.3%)
512x (72 req/sec)     34% (± 1.1%)

49 Experiment Set 2
- Workload: Cello99 server I/O trace from HP (a 50-hour trace)
- Hardware: Dell 2600 with 5 drives
- PARAID: 2 gears (3-disk RAID-5 and 5-disk RAID-5)
- No energy-aware memory management

50 Cello99 50-Hour Trace

Speed-up              Power savings (± stdev)
128x (1024 req/sec)   7.8% (± 0.21%)
64x (548 req/sec)     12% (± 2.8%)
32x (274 req/sec)     13% (± 0.26%)

51 Experiment Set 3
- Workload: PostMark benchmark (ISP workload)
- Hardware: Dell 2600 with 5 drives
- PARAID: 2 gears (3-disk RAID-5 and 5-disk RAID-5)
- No energy-aware memory management

52 PostMark

53 Conquest-2 Current Status
- PARAID: implementing reliability mechanisms
- Energy-aware memory manager: integrating the memory and disk components
- Empirical measurements: exploring different server loads

54 Conclusion
- Energy efficiency and performance can be achieved simultaneously
- PARAID-0 with “2 gears” has already shown a 15% reduction in power with less than 1% performance loss

55 Questions
Google keywords: Conquest file system, Power-Aware RAID, Andy Wang FSU

56 Gear-Shifting Details