AIMS 2009, June 30 - July 2, 2009
Towards Energy Efficient Change Management in a Cloud Computing Environment: A Pro-Active Approach
H. AbdelSalam, K. Maly, R. Mukkamala, M. Zubair (Department of Computer Science, Old Dominion University)
D. Kaminsky (IBM, Raleigh, North Carolina)
Outline
– Cloud Computing
– Change Management
– Power Management
  – Pro-active approach
  – Minimize total power consumption
  – Constraints: SLAs and prior change management commitments
  – Compute possible time slots for change management tasks
Cloud Computing
– A cloud can be defined as a pool of computer resources that can host a variety of different workloads, including batch-style back-end jobs and interactive user applications.
– A cloud computing platform dynamically provisions, configures, reconfigures, and deprovisions servers as needed.
– Servers in the cloud can be physical machines or virtual machines.
– Customers enter into Service Level Agreements (SLAs) to buy computing services from the cloud manager.
Change Management
– Managing large IT environments such as computing clouds is expensive and labor intensive.
– Servers go through several software and hardware upgrades.
– IT organizations handle change management through human group interactions and coordination.
Pro-active Approach
– In earlier work we proposed and implemented an infrastructure-aware autonomic manager for change management:
  – a scheduler that computes possible open time slots in which changes can be applied without violating any SLA reservations.
– Here we propose a pro-active, energy-aware technique for change management in a cloud computing environment.
Figure: Cloud computing architecture
Job distribution
– Applications in a cloud computing environment:
  – compute-intensive, non-interactive applications
  – user-interactive applications: Web applications and Web services are typical examples.
Non-interactive applications
– Dedicate one or more servers to each of these applications.
– The number of dedicated servers depends on the underlying SLA and the availability of servers in the cloud.
– Servers should run at their top speed (frequency) so the application can finish as soon as possible.
Job distribution
– Assume that, based on its SLA, Job X requires a response time of s seconds for u users.
– From the historical data for Job X, we estimate the average processing required for a user query to be l instructions.
– Assume that Job X is to be run on a server that runs at frequency f and on average requires CPI clock ticks (CPU cycles) to execute an instruction.
– The server can execute q = (s*f)/(l*CPI) user queries within s seconds.
– If q < u, then the remaining (u - q) user requests should be routed to another server.
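The routing rule on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the numeric values, and the assumption that every server is identical (same f and CPI) are hypothetical and not taken from the talk.

```python
import math


def queries_per_server(s: float, f: float, l: float, cpi: float) -> int:
    """Whole user queries one server can finish within s seconds.

    Implements q = (s * f) / (l * CPI) from the slide, rounded down,
    with f in cycles per second and l in instructions per query.
    """
    return math.floor((s * f) / (l * cpi))


def servers_needed(u: int, q: int) -> int:
    """Servers required for u users when each server handles q queries.

    Assumption (not from the slides): all servers are identical, so the
    overflow (u - q) is routed to further servers of the same capacity.
    """
    if q <= 0:
        raise ValueError("a server must be able to handle at least one query")
    return math.ceil(u / q)


# Hypothetical numbers: 2 s response time, 2 GHz server,
# 5 million instructions per query, CPI of 1.5, 1200 users.
q = queries_per_server(s=2.0, f=2.0e9, l=5.0e6, cpi=1.5)
n = servers_needed(u=1200, q=q)
print(q, n)
```

With these made-up values one server covers 533 queries, so the remaining 667 requests spill over to two more servers.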
System model
– Estimate the computing power (MIPS) needed to achieve the required response time.
– The client provides a histogram that shows the frequency of each expected query.
– Replace the minimum average response time constraint in the SLA with the minimum number of instructions that the application is allowed to execute every second.
Figure: Distribution of jobs onto servers
System model
– Conversion of response time to MIPS
  – If a user query has an average response time of t1 seconds when it runs alone on a server configuration with x MIPS (million instructions per second; this can be benchmarked for each server configuration), then
  – to have an average response time of t2 seconds, the query must be run such that it can execute a minimum of (t1*x)/t2 million instructions per second.
– Power management of a server
  – Minimum frequency Fmin
  – Maximum frequency Fmax
  – Discrete values in between
– Power-frequency relation
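A small sketch of the response-time-to-MIPS conversion, followed by picking the lowest discrete frequency level that satisfies it. The table of (frequency, MIPS) levels is hypothetical benchmark data invented for illustration; the slides only say that such benchmarks can be made per server configuration.

```python
from typing import Optional, Sequence, Tuple


def required_mips(t1: float, x: float, t2: float) -> float:
    """MIPS needed for a target response time t2.

    A query that takes t1 seconds at x MIPS executes t1 * x million
    instructions, so finishing in t2 seconds needs (t1 * x) / t2 MIPS.
    """
    if t2 <= 0:
        raise ValueError("target response time must be positive")
    return t1 * x / t2


def lowest_sufficient_frequency(
    levels: Sequence[Tuple[float, float]], needed_mips: float
) -> Optional[float]:
    """Return the frequency of the slowest level whose MIPS meets the need.

    levels holds (frequency_ghz, benchmarked_mips) pairs for the discrete
    steps between Fmin and Fmax. Returns None if even Fmax is too slow.
    """
    for freq, mips in sorted(levels, key=lambda level: level[1]):
        if mips >= needed_mips:
            return freq
    return None


# Hypothetical benchmark: 0.5 s at 4000 MIPS, target of 2 s.
need = required_mips(t1=0.5, x=4000.0, t2=2.0)
# Hypothetical discrete levels as (GHz, MIPS).
levels = [(1.0, 800.0), (1.5, 1200.0), (2.0, 1600.0)]
print(need, lowest_sufficient_frequency(levels, need))
```

Here the relaxed target needs 1000 MIPS, so the 1.5 GHz level is the slowest one that still meets it.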
Mathematical analysis
– Given k servers that should run at frequencies f1, f2, ..., fk respectively, such that the total compute load is: [equation]
– The total energy consumption is given by: [equation]
Mathematical analysis
– The number of servers k that should run to optimize power consumption is (assuming a continuous frequency spectrum): [equation]
– Each server should run at frequency: [equation]
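To make the optimization concrete, here is a minimal sketch under an explicitly assumed power model. It assumes each server draws P(f) = A + B*f^3 and that the load L is the sum of the server frequencies; this cubic model is a common textbook choice and is an assumption here, since the slides do not preserve the authors' formula. Under that assumption, spreading the load evenly gives total power k*A + B*L^3/k^2, which is minimized at k = L*(2B/A)^(1/3). All names and numbers are hypothetical.

```python
import math


def total_power(k: int, load: float, a: float, b: float) -> float:
    """Total power of k servers sharing the load equally.

    Assumed model (not from the slides): P(f) = a + b * f**3 per server,
    with each server running at f = load / k.
    """
    f = load / k
    return k * (a + b * f ** 3)


def optimal_servers(load: float, a: float, b: float,
                    n_servers: int, f_max: float) -> int:
    """Integer server count that minimizes total_power under the model.

    The continuous optimum is load * (2b/a)**(1/3). Because the cost is
    convex in k, it is enough to compare the two neighbouring integers,
    clamped to the feasible range [ceil(load / f_max), n_servers].
    """
    k_min = max(1, math.ceil(load / f_max))
    if k_min > n_servers:
        raise ValueError("load cannot be served even at maximum frequency")
    k_cont = load * (2.0 * b / a) ** (1.0 / 3.0)
    candidates = {
        min(max(c, k_min), n_servers)
        for c in (math.floor(k_cont), math.ceil(k_cont))
    }
    return min(candidates, key=lambda k: total_power(k, load, a, b))


# Hypothetical constants: a = 100 W, b = 10 W/GHz^3, load = 20 GHz.
k = optimal_servers(load=20.0, a=100.0, b=10.0, n_servers=50, f_max=3.0)
print(k, 20.0 / k, total_power(k, 20.0, 100.0, 10.0))
```

A real deployment would also have to snap the per-server frequency to one of the discrete levels between Fmin and Fmax, which this sketch leaves out.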
Sample cloud load
Figure: Actual and approximated load due to several SLAs
Servers available for change management
– In each time segment t, the number of idle servers in the cloud equals the difference between the total number of cloud servers and k_t, the number of servers running in that segment.
– An idle server is a candidate for change management.
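The idle-server count per segment, and a simple search for windows in which a change could run, might look like the following. The window search is a hypothetical illustration of how open time slots could be derived from the idle counts; it is not the authors' scheduler, and the sample numbers are invented.

```python
from typing import List, Sequence


def idle_servers(total: int, active_per_segment: Sequence[int]) -> List[int]:
    """Idle servers in each time segment: total servers minus k_t."""
    return [total - k_t for k_t in active_per_segment]


def change_windows(idle: Sequence[int], servers_needed: int,
                   segments_needed: int) -> List[int]:
    """Start indices of windows suitable for a change task.

    A window qualifies when at least servers_needed servers are idle in
    each of segments_needed consecutive segments (hypothetical criterion).
    """
    starts = []
    for start in range(len(idle) - segments_needed + 1):
        window = idle[start:start + segments_needed]
        if all(count >= servers_needed for count in window):
            starts.append(start)
    return starts


# Hypothetical cloud of 10 servers and active counts k_t for 6 segments.
idle = idle_servers(total=10, active_per_segment=[9, 7, 4, 3, 5, 8])
print(idle)
print(change_windows(idle, servers_needed=5, segments_needed=2))
```

In this example a change that needs five servers for two consecutive segments can start in segment 2 or segment 3.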
Figure: Servers available for changes as a function of time
Scenario comparison
– Total energy consumption during one period (one day) using the pro-active approach is [...] Watt-hours, for an average power of 1554 Watts.
– Total and average energy consumption when using 5% over-provisioning at various frequencies: [table]
Conclusion
– Pro-active management is the computation of when servers will be idle so they can be scheduled for change maintenance.
– Pro-active power management leads to considerable savings in total energy consumed, ranging from 5% to 75% in specific examples.
– Can be modified to include compute-intensive jobs.
– Can be modified to include hardware failure rates.