Cooperative Disk Management with ECOSystem
Emily Tennant
Mentor: Carla Ellis
Duke University

Outline
Milly Watt Overview
Cooperative file operations
Building blocks for efficient hard disk management
Read and write requests
Design Issues
Testing

Milly Watt Project
Energy consumption in the use of computing devices is an important problem – emphasis on power management
– Mobile computing: Extend battery lifetime
– Traditional platforms: Reduce heat production and fan noise
– Conserve energy resources (lessen environmental impact)
Energy should be managed as a “first class” system resource
– Partnership between applications and system in setting energy policy
What can be done within the OS to achieve energy-related goals without requiring applications to change in any way?

ECOSystem
Energy Centric Operating System
Goals:
– Manage allocation of energy to achieve a desired battery lifetime without requiring application modifications
– Allocate energy proportionally among active applications as it becomes a scarce resource
Unified Currentcy Model
– Energy accounting and allocation are expressed in a common currentcy.
– Unifies hardware resource management
– One unit of currentcy represents the right to consume a certain amount of energy (0.01 mJ) within a fixed amount of time.

ECOSystem Mechanisms
Currentcy Allocation
– Battery lifetime achieved by limiting battery discharge rate
– Per-epoch recalculation of discharge rate and allocation of currentcy
– Energy allocated proportionally among tasks – “allowance”
Currentcy Accounting
– Pay as you go for resource use – resource containers
– No more currentcy, no more service.
– Charging policies specific to device type

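The allocation and accounting mechanisms above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the one-second epoch, the battery figures, and the task shares are hypothetical; only the 0.01 mJ unit comes from the slides.

```python
UNITS_PER_MJ = 100  # one unit of currentcy = 0.01 mJ (from the Currentcy Model slide)

def epoch_budget(remaining_mj, target_lifetime_s, epoch_s):
    """Currentcy units that may be spent this epoch to reach the target lifetime."""
    discharge_rate_mw = remaining_mj / target_lifetime_s  # mJ per second == mW
    return discharge_rate_mw * epoch_s * UNITS_PER_MJ

def allocate(budget_units, shares):
    """Split the epoch budget among tasks in proportion to their shares."""
    total = sum(shares.values())
    return {task: budget_units * share / total for task, share in shares.items()}

def charge(balance, cost):
    """Pay as you go: return (new_balance, served); no currentcy, no service."""
    if cost > balance:
        return balance, False
    return balance - cost, True

# Hypothetical numbers: 36 kJ left in the battery, 2 hours of desired lifetime.
budget = epoch_budget(remaining_mj=36_000_000, target_lifetime_s=7200, epoch_s=1.0)
allowances = allocate(budget, {"player": 3, "editor": 1})
```

Recomputing `epoch_budget` every epoch is what lets the discharge rate track the energy actually left in the battery.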
The Next Step
The next step for ECOSystem
– Improve energy efficiency
– Take advantage of energy-aware applications
Vision
– Create a set of API extensions permitting applications to pass application-specific information to the OS.
– Use this information to manage disk accesses more efficiently.
Specific Goals
– Simple interface
– Minimal changes – backward compatibility
– Energy savings in terms of currentcy

Disk Management
Maximize time spent in standby
Minimize transitions from standby to active state
Traditional systems: disks managed individually
Goal: use Currentcy Model to explicitly batch requests for disk access more efficiently
Terms: read, write, update

Hard drive states
State                  Cost       Time Out (sec)
Access                 1.65 mJ    N/A
Idle 1                 [?] mW     0.5
Idle 2                 650 mW     2
Idle 3                 400 mW     27.5
Standby (disk down)    0 mW       N/A
Spin up                6000 mJ    N/A
Spin down              6000 mJ    N/A
One Watt = one Joule per second

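A short worked calculation shows why batching matters. It reads the hard drive state table as Idle 3 = 400 mW and spin up = spin down = 6000 mJ. The function names are hypothetical, and the savings figure assumes that each unbatched request would otherwise pay for its own spin-down/spin-up cycle.

```python
SPIN_UP_MJ = 6000.0    # from the hard drive state table
SPIN_DOWN_MJ = 6000.0  # from the hard drive state table
IDLE3_MW = 400.0       # lowest-power idle state, read from the table
STANDBY_MW = 0.0

def break_even_seconds(idle_mw=IDLE3_MW):
    """Idle time beyond which spinning down and back up costs less than idling."""
    return (SPIN_UP_MJ + SPIN_DOWN_MJ) / (idle_mw - STANDBY_MW)

def batching_savings_mj(n_requests):
    """Energy saved when n isolated requests share a single spin cycle (assumed model)."""
    return (n_requests - 1) * (SPIN_UP_MJ + SPIN_DOWN_MJ)

print(break_even_seconds())     # 30.0 seconds at Idle 3
print(batching_savings_mj(3))   # 24000.0 mJ
```

Since one Watt is one Joule per second, dividing mJ by mW gives seconds directly.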
Related Work
Recent submission to OSDI conference:
– Andreas Weissel, Björn Beutel, Frank Bellosa. Cooperative I/O: A Novel I/O Semantics for Energy-Aware Applications.
Goal: “demonstrate benefits of application involvement in operating system power management.”
Coop-I/O – an approach to reduce power consumption of devices (disk)
Involves hardware, operating system, I/O interface for energy-aware applications

Main Elements – Coop-I/O
New cooperative file operations
– read_coop(), write_coop(), open_coop()
– New parameters: time-out and cancel flag
– If disk is inactive, delay disk request for length of time-out parameter
– Possible to abort accesses after time-out period
Energy-efficient update mechanism
– Update of cached disk blocks is batched to maximize the time the hard disk can spend in standby mode.
OS controls hard disk modes
– Disk drive is switched into low-power mode according to an adaptive algorithm.
– “device-dependent time-out with early shutdown (DDT/ES)”

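The time-out and cancel-flag semantics listed above can be illustrated with a small decision function. This is a hypothetical helper written for this summary, not the Coop-I/O implementation, and the slides do not give the actual signatures of read_coop(), write_coop(), or open_coop().

```python
def coop_decision(disk_active, waited_s, timeout_s, cancel):
    """Return 'proceed', 'wait', or 'abort' for one deferrable disk request."""
    if disk_active:
        return "proceed"   # disk is already spinning, so there is nothing to gain by waiting
    if waited_s < timeout_s:
        return "wait"      # defer while the disk is inactive and the time-out has not expired
    # Time-out expired: abort only if the application allowed it via the cancel flag.
    return "abort" if cancel else "proceed"
```

A request that proceeds after its time-out forces a spin-up, which is why longer time-outs give the OS more room to batch.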
Cooperative I/O in ECOSystem
New cooperative file operations
– New system calls (readb, writeb) with added parameter(s).
– Deferrable disk requests
– Abortable file operations?
Energy-efficient update mechanism
– Updates deferred to create bursty disk access
– Energy-efficient update strategies integrated into write process
OS control of disk drive
– Motivation of adaptive algorithm for powering down disk drive?

Bidding
How do we couch the cooperative, deferrable file operations of Coop-I/O in terms of ECOSystem’s currentcy?
“Bidding” process
– Inflated entry price delays disk spinup to ensure that multiple processes have generated disk requests.
– Each process “bids” the amount of currentcy it is willing to contribute towards the entry price.
[Figure: total bid vs. time, plotted against the entry price; bids from P1, P2, and P3; point marked “Disk spins up”.]

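The bidding process can be sketched as a running total compared against the entry price. The function name, the bid amounts, and the arrival times are hypothetical. Treating a total that equals the entry price as sufficient is also an assumption.

```python
def spin_up_time(bids, entry_price):
    """Return the time at which accumulated bids reach the entry price, or None.

    bids: iterable of (time, process, amount) tuples.
    """
    total = 0.0
    for time, _process, amount in sorted(bids):
        total += amount
        if total >= entry_price:   # assumption: meeting the price is enough
            return time
    return None

# Three processes bid at different times toward an entry price of 1000.
bids = [(1.0, "P1", 400.0), (3.0, "P2", 300.0), (6.0, "P3", 500.0)]
print(spin_up_time(bids, 1000.0))  # 6.0: the disk waits until P3 has bid
```

With a lower entry price the disk would spin up earlier and batch fewer requests, so the price acts as the batching knob.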
Priorities
Motivating Question: How can we implement the bidding process in a useful and intuitive interface (similar to Coop-I/O)?
Priority (and currentcy bid)
Time willing to wait
– Currentcy is dynamic – difficult for application to assign directly.
– Create static priorities.
– Map priorities to a currentcy amount for bid.
What type of priorities could be created?
– Integer sets (1-10, 1-100)
– Real-numbered intervals
– Dynamic priorities

Mapping Priority to Bid
Takes place within OS – application knows nothing about currentcy
Involves resource container
May require resource container alteration
– Bid, priority, time-out, etc.
[Figure: resource container diagram. Labels: Available_currentcy, Ticket, Bid, Priority; priority; percentage of entry price; percentage of available currentcy.]

A Simple Mapping Model
Assume numeric priority.
Priority corresponds directly to percentage of available currentcy that is allocated to bid.
BID = available_currentcy * (priority/10)
Example: priority = 5, available currentcy = 1000 mJ, so BID = 500 mJ
Remainder of available currentcy can be used (CPU, NIC) while process waits to access disk (asynchronous access).
Overhead issues

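The mapping formula translates directly into code. The function name and the range check are assumptions added for illustration; the formula and the example values come from the slide.

```python
def priority_to_bid(available_currentcy, priority, max_priority=10):
    """BID = available_currentcy * (priority / 10), for priorities in 0..max_priority."""
    if not 0 <= priority <= max_priority:
        raise ValueError("priority out of range")
    return available_currentcy * priority / max_priority

bid = priority_to_bid(1000, 5)   # 500.0, matching the slide's example (values in mJ)
remainder = 1000 - bid           # still usable for CPU or NIC while waiting for the disk
```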
Sample Read Disk Accesses
1. System call is generated: readb(……...., [priority]);
2. Verify that data is uncached and disk is inactive.
3. Map priority to bid.
4. “Issue bid” – store bid amount in resource container.
5. Process enters waitqueue.
6. Update daemon checks Σbids (reads and writes) against the entry price.
7. When Σbids ≥ entry price, disk spins up.
8. Disk access enters runqueue.
9. Resource container is debited for cost of access.
10. Data is read into buffer cache.

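The read steps above can be modeled as a toy simulation. Everything here is a sketch under assumptions: the class and function names are hypothetical, only reads are modeled (the slide's update daemon also counts write bids), the bid uses the priority/10 mapping, and the debit is the per-access cost from the hard drive state table.

```python
from dataclasses import dataclass, field

ACCESS_COST = 1.65  # mJ per access, from the hard drive state table

@dataclass
class Container:            # stand-in for a resource container
    available: float
    bid: float = 0.0

@dataclass
class Disk:
    entry_price: float
    active: bool = False
    cache: set = field(default_factory=set)
    waitqueue: list = field(default_factory=list)  # (container, block) pairs

def _access(disk, rc, block):
    rc.available -= ACCESS_COST   # step 9: debit the resource container
    disk.cache.add(block)         # step 10: data lands in the buffer cache

def readb(disk, rc, block, priority):
    if block in disk.cache:       # step 2: cached data needs no disk access
        return "cached"
    if disk.active:               # step 2: an active disk serves the request at once
        _access(disk, rc, block)
        return "read"
    rc.bid = rc.available * priority / 10   # steps 3-4: map priority to bid, store it
    disk.waitqueue.append((rc, block))      # step 5: wait for the disk
    return "waiting"

def update_daemon(disk):
    total = sum(rc.bid for rc, _ in disk.waitqueue)   # step 6
    if disk.waitqueue and total >= disk.entry_price:  # step 7 (>= is an assumption)
        disk.active = True
        for rc, block in disk.waitqueue:              # step 8
            _access(disk, rc, block)
            rc.bid = 0.0
        disk.waitqueue.clear()
    return disk.active
```

For example, with an entry price of 800 and two processes holding 1000 mJ each, a priority-5 read alone (bid 500) leaves the disk down, and a second priority-4 read (bid 400) pushes the total to 900 and triggers the spin-up.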
Write Disk Access
More complex!
What does “deferrable write” mean?
Do we defer writing to the buffer cache or flushing the buffers to the disk?
ECOSystem: Unified write/update policy
Writes
– Data is written to buffer immediately.
– Process bids toward disk spinup (and buffer flushing) - only requires resource container.
Updates
– Bids vs. entry price delays disk spinup.
– Disk spinup activates buffer flushing.
– Decreasing entry price guarantees updates.

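The last point, that a decreasing entry price guarantees updates, can be made concrete with a small sketch. The slides do not say how the price decreases, so the linear decay to zero over `max_delay_s`, the one-second step, and all names here are assumptions.

```python
def entry_price(initial_price, waited_s, max_delay_s):
    """Entry price after waiting waited_s seconds (assumed linear decay to zero)."""
    remaining = max(0.0, 1.0 - waited_s / max_delay_s)
    return initial_price * remaining

def flush_time(initial_price, total_bid, max_delay_s, step_s=1.0):
    """First time step at which the pending bids cover the decayed entry price."""
    t = 0.0
    while total_bid < entry_price(initial_price, t, max_delay_s):
        t += step_s
    return t

print(flush_time(1000.0, 250.0, 30.0))  # 23.0: bids cover the price once it drops below 250
print(flush_time(1000.0, 0.0, 30.0))    # 30.0: even a zero bid is flushed at the deadline
```

Because the price reaches zero after `max_delay_s`, the loop always terminates, which is the sense in which dirty buffers are guaranteed to reach the disk.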
Sample Write Disk Accesses
writeb(…,…,…,[priority]);
Is buffer already cached? If not, bid to read buffer in.
Write to buffer cache.
Is disk active? If not, bid.
Flush buffer.

Design Issues
Reads: [figure: P1, P2]
Simplest case (first implementation):
- “first-reader”
- accumulate only P1’s bid
- charge only P1 for disk access
Intuitive Goal:
- “sum”
- accumulate bids from P1 and P2
- charge P1 and P2
Writes: [figure: P1, P2, P1]
Simplest case:
- “last writer”
- save only P2’s bid
- charge only P2
Intuitive Goal:
- “sum”
- accumulate bids from P1 and P2
- charge P1 and P2
[Figure labels: pid P1, pid P2]

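The three charging policies can be compared side by side. The function name and numbers are hypothetical. The slides only say that “sum” charges both P1 and P2, so splitting the cost in proportion to each bid (evenly when all bids are zero) is an assumption.

```python
def settle(bids, cost, policy):
    """Return (bid counted toward the entry price, {pid: charge}) for one buffer.

    bids: list of (pid, bid) in arrival order; policy: 'first', 'last', or 'sum'.
    """
    if policy == "first":            # "first-reader": only the first process counts
        pid, bid = bids[0]
        return bid, {pid: cost}
    if policy == "last":             # "last writer": only the most recent process counts
        pid, bid = bids[-1]
        return bid, {pid: cost}
    if policy == "sum":              # accumulate all bids, charge everyone
        total = sum(bid for _, bid in bids)
        charges = {}
        for pid, bid in bids:
            share = bid / total if total else 1 / len(bids)
            charges[pid] = charges.get(pid, 0.0) + cost * share
        return total, charges
    raise ValueError(f"unknown policy: {policy}")

bids = [("P1", 300.0), ("P2", 100.0)]
print(settle(bids, 8.0, "first"))  # (300.0, {'P1': 8.0})
print(settle(bids, 8.0, "last"))   # (100.0, {'P2': 8.0})
print(settle(bids, 8.0, "sum"))    # (400.0, {'P1': 6.0, 'P2': 2.0})
```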
Testing
Synthetic Benchmarks
– Simple read/write programs.
– Create scenarios where energy savings are most obvious.
– Simulate different workloads: multiple simultaneous cooperative tasks; a mix of cooperative and non-cooperative tasks.
“Real” Application
– Audio/video player, image viewer.
– Test performance in non-optimized, “real-life” situations.
Compare results on:
– ECOSystem unthrottled (Coop-I/O)
– Current ECOSystem implementation
– ECOSystem with cooperative implementation

Summary
New system calls for basic file operations.
Priority can be specified by application.
Priority determines amount of currentcy in bid.
Total bids for disk access must exceed entry price.
Interaction between decreasing entry price and bids for disk access works toward efficiently batching disk accesses while guaranteeing that non-abortable accesses occur.