Thermal Management of Heterogeneous Data Centers

1 Thermal Management of Heterogeneous Data Centers
David Bendit, System Administrator
Mars Space Flight Facility, Arizona State University

2 Introduction
Mars Space Flight Facility
  Located in the Moeur Building (west of the MU)
  Medium-sized server room
    Roughly 12' x 25'
    Over 100 physical servers
  Small cluster
    26 physical machines, each running 4 Xen instances
    Total of 104 nodes optimally

3 Motivation
Chiller had a bad habit of dying over weekends
  Come in Monday morning to a 90°F+ server room
IMPACT research has studied the HPCI Cluster
  HPCI is more homogeneous, dedicated to cluster processing
  Proper cooling setup
  More standard layout
MSFF server room is a sub-optimal setup, a case that has not been researched as much

4 Problem Statement
What qualities are important in selecting a thermal-aware job scheduling algorithm for heterogeneous data centers?
Based on those qualities, which algorithms fare the best?
How does the heterogeneity of the data center affect the performance of the algorithms chosen?

5 Approach
Physical
  Temperature sensors throughout the server room
    Wireless, single-hop sensor network
  Logging of cluster jobs and as many physical server temps as feasible
Virtual
  Simulation of various scheduling strategies using FloVENT
    Will take into account server room layout and air flow

6 Difficulties
Physical
  Originally supposed to work with another team to create the temperature sensor network
    My contact in that team ended up dropping the class
  Logging cluster job traces
    Limited experience with new scheduling software at MSFF
  Logging individual server temperatures
    Heterogeneity of hardware means that not all of it supports reporting temperatures over SNMP or other means

7 Difficulties, cont.
Virtual
  Server room floor plans
    We’ve placed things wherever they fit, with no pre-planning beyond general ideas
    Needed to create these, and left out important details (namely, air vents and the like)
  Simulation software
    License is expired and needs to be renewed before any simulations can be run

8 Difficulties, cont.
General
  MSFF is NASA funded
    As a US Government facility, we’re not able to let foreign nationals into the server room
  Frequent changes to the server room itself
    As data storage requirements, etc. change, we change equipment, so things can change week-to-week
  Results may not be implementable
    Because of very specific user requirements, useful changes may not be implemented anyway

9 Moving Forward
Project will continue past the length of this course
  We will correlate cluster activity with machine and ambient temperatures
End goal
  Reduce overall strain on the chiller
My (rather pessimistic) prediction
  Because of the large number of other computing resources in the server room, and the always-on nature of our setup, the cluster has negligible impact on overall temperature
  Need more wide-ranging changes for real temperature reduction
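
One way the planned correlation of cluster activity with temperature might be computed is a plain Pearson coefficient over time-aligned samples. The sketch below is only an illustration, not the method used in this project: the function name `pearson`, the variable names, and the sample readings are all hypothetical, and real data would first need cluster job logs and temperature logs aligned to the same timestamps.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 samples")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up samples for illustration only (not measurements from MSFF):
# number of busy cluster nodes and ambient temperature in °F per interval.
busy_nodes = [0, 26, 52, 78, 104]
ambient_f = [71.0, 71.2, 71.1, 71.4, 71.3]

r = pearson(busy_nodes, ambient_f)
print(f"correlation between cluster load and ambient temperature: {r:.2f}")
```

A coefficient near 0 on real logs would be consistent with the pessimistic prediction that the cluster contributes little to overall room temperature; a value near 1 would suggest that thermal-aware scheduling could matter.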

10 Questions?
