CIT 668: System Architecture Data Centers I
Topics Data Center: A facility for housing a large amount of computer or communications equipment. Racks Power PUE Cooling Image from http://en.wikipedia.org/wiki/Image:Datacenter-telecom.jpg
Google DC in The Dalles, OR http://en.wikipedia.org/wiki/Celilo_Converter_Station http://www.nytimes.com/2006/06/14/technology/14search.html?ex=1307937600&en=d96a72b3c5f91c47&ei=5090 Located near 3.1GW hydroelectric power station using Columbia River dams for power
Google Data Center in The Dalles, OR
Inside a Data Center http://www.thehotaisle.com/2008/04/18/why-the-hot-aisle/
Inside a Container Data Center
Data Center is composed of: A physically safe and secure space Racks that hold computer, network, and storage devices Electric power sufficient to operate the installed devices Cooling to keep the devices within their operating temperature ranges Network connectivity throughout the data center and to places beyond
Data Center Components Figure 4.1 from The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
Data Center Tiers See http://uptimeinstitute.org/ for more details about tiers.
Racks
Racks: The Skeleton of the DC 19” rack standard EIA-310D Other standard numbers. NEBS 21” racks Telecom equipment. 2-post or 4-post Air circulation (fans) Cable management Doors or open Image from http://en.wikipedia.org/wiki/Image:Wikimedia-servers-Sept04.jpg
Rack Units
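Rack height is counted in rack units (U), where 1U is 1.75 inches of vertical mounting space. The sketch below shows that arithmetic; the function names and the reserved-space parameter are illustrative assumptions, not part of the slides.

```python
RACK_UNIT_INCHES = 1.75  # height of one rack unit (1U)

def rack_units_to_inches(units):
    """Vertical mounting space, in inches, for a given number of rack units."""
    return units * RACK_UNIT_INCHES

def servers_per_rack(rack_units, server_units, reserved_units=0):
    """How many servers of a given height fit in a rack.

    reserved_units is a hypothetical allowance for PDUs, patch panels,
    switches, and similar non-server equipment.
    """
    usable = max(rack_units - reserved_units, 0)
    return usable // server_units

print(rack_units_to_inches(42))                   # mounting height of a 42U rack
print(servers_per_rack(42, 4, reserved_units=2))  # 4U servers in a 42U rack
```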
Rack Sizes http://www.gtweb.net/rackframe.html
Rack Purposes Organize equipment: Increase density with vertical stacking. Cooling: Internal airflow in rack cools servers. Data center airflow determined by arrangement of racks. Wiring organization: Cable guides keep cables within racks. http://commons.wikimedia.org/wiki/File:Perforated-ventilation-tile.jpg http://www.skytelecom.us/
Rack Power Infrastructure Different power sockets can be on different circuits. Individual outlet control (power cycle). Current monitoring and alarms. Network managed (web or SNMP). http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=linux&db=bks&fname=/SGI_EndUser/SGI9100_OG/ch02.html
Rack-Mount Servers Dell PowerEdge R415 (1U) and R910 (4U)
Blade Servers http://www.microsys.org.in/Blade%20Server.html http://commons.wikimedia.org/wiki/File:IBM_HS20_blade_server.jpg
Buying a Rack Buy the right size: Space for servers, power, patch panels, etc. Be sure it fits your servers: Appropriate mounting rails. Shelves for non-rack servers. Environment options: Locking front and back doors. Sufficient power and cooling. Power/environment monitors. Console if needed. http://www.americantechsupply.com/cat6patchpanels.htm http://mail.disrv.com/Servers_8_0/Servers_8.0/PE850/printer_friendly.htm http://www.lasystems.be/Belkin/F1DC102PDUSR/F1DC102PDUSR/product/220987.html?osCsid=f4d4ec3a4abafe0d7b54b24745ad765a
Space Aisles: Wide enough to move equipment. Separate hot and cold aisles. Hot spots: Result from poor air flow. Servers can overheat even when the average room temperature is low. Work space: A place for SAs to work on servers. Desk space, tools, etc. Capacity: Room to grow. http://www.kvaprotection.com/products/racks-enclosures-accessories/ http://www2.electronicproducts.com/Choices_for_data_center_cooling_architectures-article-farcapc-sep2007-html.aspx
Power
Data Center Power Distribution http://www.42u.com/power/data-center-power.htm
UPS (Uninterruptible Power Supply) Provides emergency power when utility fails Most use batteries to store power Conditions power, removing voltage spikes http://www.loftproductions.com/webhosting/popups/popups.html
Standby UPS Power will be briefly interrupted during switch. Computers may lock up or reboot during the interruption. No power conditioning. Short battery life. Very inexpensive. http://myuninterruptiblepowersupply.com/toplogy.htm
Online UPS AC -> DC -> AC conversion design True uninterrupted power without switching Extremely good power conditioning Longer battery life Higher price http://myuninterruptiblepowersupply.com/toplogy.htm
Power Distribution Unit (PDU) Takes high voltage feed and divides into many 110/120 V circuits that feed servers. Similar to breaker panel in a house. http://www.keyitec.com/PPCbrochure.pdf
Estimating Per-Rack Power
The Power Problem 4-year power cost = server purchase price. Upgrades may have to wait for electricity. Power is a major data center cost $5.8 billion for server power in 2005. $3.5 billion for server cooling in 2005. $20.5 billion for purchasing hardware in 2005.
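The claim that four years of power costs about as much as the server can be checked with simple arithmetic. The sketch below is a rough estimate under stated assumptions: the 300 W draw, the $0.10/kWh price, and the PUE of 2.0 are hypothetical inputs, not figures from the slides.

```python
HOURS_PER_YEAR = 24 * 365

def energy_cost(server_watts, years, price_per_kwh, pue=1.0):
    """Electricity cost of running one server continuously.

    pue scales the server's own draw up to include facility overhead
    (cooling, power distribution); pue=1.0 counts the server alone.
    """
    kwh = server_watts / 1000 * HOURS_PER_YEAR * years * pue
    return kwh * price_per_kwh

# Hypothetical inputs: 300 W server, 4 years, $0.10 per kWh, PUE of 2.0.
cost = energy_cost(300, years=4, price_per_kwh=0.10, pue=2.0)
print(f"4-year power cost: ${cost:,.2f}")
```

With these inputs the result is roughly $2,100, which is in the range of a commodity server's purchase price.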
Measuring Power Efficiency PUE is the ratio of total building power to IT power; it measures the efficiency of the datacenter building infrastructure. SPUE is the ratio of total server input power to its useful power, where useful power is the power consumed by CPU, DRAM, disk, motherboard, etc.; useful power excludes losses due to power supplies, fans, etc. Computation efficiency depends on software and workload and measures useful work done per watt. Equation 5.1 from The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
Power Usage Effectiveness (PUE)
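The three factors described on this slide multiply together. The expression below is a reconstruction of how the cited book combines them, written from the definitions above; check it against Equation 5.1 in the book before relying on the exact wording of each term.

```latex
\text{Efficiency}
  = \frac{\text{Computation}}{\text{Total Energy}}
  = \left(\frac{1}{\text{PUE}}\right)
    \times \left(\frac{1}{\text{SPUE}}\right)
    \times \left(\frac{\text{Computation}}{\text{Total Energy to Electronic Components}}\right)
```

The first factor covers building overhead, the second covers losses inside the server, and the third is the computation efficiency of the electronics themselves.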
Power Usage Effectiveness (PUE) PUE = Data center power / Computer power. PUE = 2 indicates that for each watt used to power IT equipment, one more watt is used for HVAC, power distribution, etc. PUE decreases towards 1 as the DC becomes more efficient. PUE variation: Industry average > 2, Microsoft = 1.22, Google = 1.19. http://searchdatacenter.techtarget.com/sDefinition/0,,sid80_gci1307933,00.html http://www.treehugger.com/files/2008/10/microsoft-to-google-my-pue-is-getting-better-than-your-pue.php
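A minimal sketch of the PUE ratio defined above. The function names and the sample meter readings are illustrative assumptions.

```python
def pue(total_facility_kw, it_equipment_kw):
    """PUE = total data center power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw

def overhead_per_it_watt(pue_value):
    """Watts spent on cooling, power distribution, etc. per watt of IT load."""
    return pue_value - 1.0

# Hypothetical readings: 2000 kW at the utility meter, 1000 kW at the racks.
ratio = pue(2000, 1000)
print(ratio, overhead_per_it_watt(ratio))
```

A facility at PUE 1.19 spends about 0.19 W of overhead per IT watt, compared with 1 W at PUE 2.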
Data Center Energy Usage Figure 5.2 from The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
Sources of Efficiency Losses UPS: 88-94% efficiency, less if lightly loaded. PDU voltage transformation: 0.5% loss or less. Cables from PDU to racks: 1-3% loss depending on distance and cable type. Computer Room Air Conditioning (CRAC): Delivery of cool air over long distances uses fan power and increases air temperature.
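Losses in the power path compound, so the fraction of utility power that reaches the racks is the product of the stage efficiencies. The sketch below uses the ranges from this slide; treating the PDU as 99.5% efficient and the cables as 97-99% efficient is an interpretation of the stated loss figures, and the function name is illustrative.

```python
def delivered_fraction(*stage_efficiencies):
    """Fraction of input power left after a chain of lossy stages."""
    fraction = 1.0
    for efficiency in stage_efficiencies:
        if not 0 < efficiency <= 1:
            raise ValueError("each efficiency must be in (0, 1]")
        fraction *= efficiency
    return fraction

# Stages: UPS, PDU transformation, cabling from PDU to racks.
worst_case = delivered_fraction(0.88, 0.995, 0.97)
best_case = delivered_fraction(0.94, 0.995, 0.99)
print(f"power reaching racks: {worst_case:.1%} to {best_case:.1%}")
```

Cooling overhead (CRAC fans and chillers) is a separate load and is not part of this chain.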
Cooling
Cooling a Data Center Keep temperatures within 18-27 °C. Cooling equipment is rated in BTUs: 1 Watt = 3.412 BTUH (1 kW = 3412 BTUH), where BTUH = British Thermal Units per hour. Keep humidity between 30-55%: High = condensation, Low = static shock. Avoid hot/cold spots: Can produce condensation.
Computer Room Air Conditioning (CRAC) Large scale, highly reliable air conditioning units from companies like Liebert. Cooling capacity measured in tons. http://protmp.com/services/special/
Waterworks for Data Center http://www.theregister.co.uk/2009/04/10/google_data_center_video/
Estimating Heat Load Equations from UNIX System Administration Handbook, 4th edition
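Nearly all electrical power drawn by IT equipment ends up as heat, so a first estimate of heat load is a unit conversion from watts. The sketch below covers only the equipment term and leaves out other heat sources such as lighting and people; the 10 kW load is a hypothetical input. It uses 1 W = 3.412 BTU/h and 1 ton of cooling = 12,000 BTU/h.

```python
BTUH_PER_WATT = 3.412   # 1 watt = 3.412 BTU per hour
BTUH_PER_TON = 12_000   # 1 ton of cooling = 12,000 BTU per hour

def heat_load_btuh(it_watts):
    """Heat produced by IT equipment, in BTU per hour."""
    return it_watts * BTUH_PER_WATT

def cooling_tons(btuh):
    """Cooling capacity, in tons, needed to remove a given heat load."""
    return btuh / BTUH_PER_TON

# Hypothetical load: racks drawing a total of 10 kW.
load = heat_load_btuh(10_000)
print(f"{load:,.0f} BTU/h, about {cooling_tons(load):.2f} tons of cooling")
```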
Hot-Cold Aisle Architecture Server air intake from cold aisles Server air exhaust into hot aisles Improve efficiency by reducing mixture of hot/cold Figure 4.2 from The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
Free Cooling Cooling towers dissipate heat by evaporating water, reducing or eliminating need to run chillers Google Belgium DC uses 100% free cooling
Improving Cooling Efficiency Air flow handling: Hot air exhausted by servers does not mix with cold air, and the path to the cooling coil is very short, so little energy is spent moving air. Elevated cold aisle temperatures: Cold aisle of containers kept at 27 °C rather than 18-20 °C. Use of free cooling: In moderate climates, cooling towers can eliminate the majority of chiller runtime.
Server Power Usage Efficiency (SPUE)
Sources of Server Inefficiency Primary sources of inefficiency: Power Supply Unit (PSU): 70-75% efficiency. Voltage Regulator Modules (VRMs): Can lose more than 30% of power in conversion losses. Cooling fans: Software can reduce fan RPM when not needed. SPUE ratios of 1.6-1.8 are common today. http://www.tomshardware.com/reviews/strong-showing,987-26.html http://www.cpes.vt.edu/public/showcase/multiphase_VRM.php
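To see how these losses combine into an SPUE value, the sketch below uses a deliberately simplified model: all useful power passes through the VRMs, the fans draw from the PSU output, and the PSU sits in front of both. That topology, the 85% VRM efficiency, and the 10 W fan draw are assumptions made for illustration.

```python
def spue(useful_watts, psu_efficiency, vrm_efficiency, fan_watts=0.0):
    """SPUE = total server input power / useful power.

    Simplified model (an assumption): useful power is delivered through
    the VRMs, and the PSU supplies both the VRM input and the fans.
    """
    vrm_input = useful_watts / vrm_efficiency
    psu_input = (vrm_input + fan_watts) / psu_efficiency
    return psu_input / useful_watts

# Hypothetical server: 200 W useful, 75% PSU, 85% VRM, 10 W of fans.
print(f"SPUE = {spue(200, 0.75, 0.85, fan_watts=10):.2f}")
```

With these inputs the result is about 1.64, inside the 1.6-1.8 range quoted on the slide.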
Power Supply Unit Efficiency 80 PLUS initiative to promote PSU efficiency 80+% efficiency at 20%, 50%, 100% of rated load Can be less than 80% efficient at idle power load First 80 PLUS PSU shipped in 2005
Server Useful Power Consumption
Device                                   Power Usage
Intel Xeon W5590 3.33 GHz Quad Core      130 W
Intel Xeon E5430 2.66 GHz Quad Core      80 W
Intel Xeon E5502 2.13 GHz Dual Core
7200 RPM Hard Drive                      7 W
10,000 RPM Hard Drive                    14 W
15,000 RPM Hard Drive                    20 W
DDR2 DIMM                                1.65 W
Video Card                               20-120 W
Values from newegg.com and "Toward Energy-Efficient Computing", CACM Vol 53 No 03 March 2010. See also http://www.80plus.org/ The best method to determine power usage is to measure it: https://www.wattsupmeters.com/
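The per-device figures above can be summed into a rough useful-power estimate for one server. The wattages below are taken from this slide; the configuration (two E5430 CPUs, four 7200 RPM drives, eight DIMMs) is a hypothetical example, and the total ignores the motherboard and any losses in the PSU, VRMs, and fans.

```python
# Per-device power in watts, from the table on this slide.
DEVICE_WATTS = {
    "xeon_e5430": 80.0,
    "hdd_7200rpm": 7.0,
    "ddr2_dimm": 1.65,
}

def useful_power(config):
    """Sum component power for a {device_name: count} configuration."""
    return sum(DEVICE_WATTS[name] * count for name, count in config.items())

# Hypothetical server configuration.
server = {"xeon_e5430": 2, "hdd_7200rpm": 4, "ddr2_dimm": 8}
print(f"estimated useful power: {useful_power(server):.1f} W")
```

As the slide notes, measuring a running server is more reliable than summing rated values.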
Computation Efficiency
Server Utilization Typically 10-50% “The Case for Energy-Proportional Computing,” Luiz André Barroso, Urs Hölzle, IEEE Computer, December 2007 It is surprisingly hard to achieve high levels of utilization of typical servers (and your home PC is even worse) Figure 1. Average CPU utilization of more than 5,000 servers during a six-month period. Servers are rarely completely idle and seldom operate near their maximum utilization, instead operating most of the time at between 10 and 50 percent of their maximum
Server Power Usage Range: 50-100% “The Case for Energy-Proportional Computing,” Luiz André Barroso, Urs Hölzle, IEEE Computer, December 2007 Energy efficiency = Utilization/Power Figure 2. Server power usage and energy efficiency at varying utilization levels, from idle to peak performance. Even an energy-efficient server still consumes about half its full power when doing virtually no work.
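The relationship Energy efficiency = Utilization/Power can be explored numerically. The sketch below assumes power rises linearly from 50% of peak at idle to 100% at full load; the linear shape is an assumption for illustration, and the real curve in the cited figure may differ.

```python
def relative_power(utilization, idle_fraction=0.5):
    """Power as a fraction of peak, assuming a linear rise from idle."""
    return idle_fraction + (1.0 - idle_fraction) * utilization

def energy_efficiency(utilization, idle_fraction=0.5):
    """Utilization divided by relative power (1.0 at full load)."""
    return utilization / relative_power(utilization, idle_fraction)

for u in (0.1, 0.3, 0.5, 1.0):
    print(f"utilization {u:.0%}: power {relative_power(u):.0%}, "
          f"efficiency {energy_efficiency(u):.0%}")
```

Under this model a server at 30% utilization draws 65% of peak power, so its efficiency is under half of what it achieves at full load. That is why the typical 10-50% utilization range is costly.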
Server Utilization vs. Latency
Improving Power Efficiency USAH 4/e, p. 1101
Improving Power Efficiency Application consolidation: Reduce the number of applications by eliminating old applications in favor of new ones that can serve the purpose of multiple old ones. Allows elimination of old app servers. Server consolidation: Use a single DB for multiple applications. Move light services like NTP onto shared boxes. Use SAN storage: Local disks are typically highly underused. Use a SAN so servers share a single storage pool.
Improving Power Efficiency Virtualization: Host services on VMs instead of on physical servers. Host multiple virtual servers on a single physical server. Only-as-needed servers: Power down servers when not in use. Works best with cloud computing. Granular capacity planning: Measure computing needs carefully. Buy the minimal CPU, RAM, and disk configuration based on your capacity measurements and forecasts.
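A back-of-the-envelope way to size a virtualization or consolidation effort is to compare the total measured load with a target utilization per host. The sketch below considers CPU only and assumes identical hosts; it ignores memory, I/O, and peak-load headroom. All inputs are hypothetical.

```python
import math

def hosts_needed(server_count, avg_utilization, target_utilization):
    """Physical hosts needed to carry the combined CPU load of
    server_count lightly used servers at a chosen target utilization."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target utilization must be in (0, 1]")
    total_load = server_count * avg_utilization
    return max(1, math.ceil(total_load / target_utilization))

# Hypothetical: 20 servers averaging 10% CPU, consolidated to a 60% target.
print(hosts_needed(20, 0.10, 0.60))
```

Real capacity planning should start from measurements, as the granular capacity planning bullet says, rather than from averages alone.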
Key Points Data center components Physically secure space Racks, the DC skeleton Power, including UPS and PDU Cooling Networking Power efficiency (server cost = 4 years power on avg) PUE = Data center power / IT equipment power Most power in traditional DC goes to cooling, UPS SPUE = Server PUE; inefficiencies from PSU, VRM, fans Heat load estimation Air flow control (hot/cold aisle architecture or containers) Higher cold air temperatures (27C vs. 20C) Free cooling (cooling towers)
References
Luiz André Barroso and Urs Hölzle, The Case for Energy-Proportional Computing, IEEE Computer, Vol 40, Issue 12, December 2007.
Luiz André Barroso and Urs Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, 1st edition, Morgan and Claypool Publishers.
Xiaobo Fan, Wolf-Dietrich Weber, and Luiz André Barroso, Power provisioning for a warehouse-sized computer, ISCA '07: Proceedings of the 34th annual international symposium on Computer architecture.
Albert Greenberg, James R. Hamilton, Navendu Jain, Srikanth Kandula, Changhoon Kim, Parantap Lahiri, David A. Maltz, Parveen Patel, and Sudipta Sengupta, VL2: a scalable and flexible data center network, Proceedings of the ACM SIGCOMM 2009 conference on Data communication (SIGCOMM '09), ACM, New York, NY, USA, 51-62, 2009.
Thomas A. Limoncelli, Christina J. Hogan, and Strata R. Chalup, The Practice of System and Network Administration, Second Edition, Addison-Wesley Professional, 2007.
Evi Nemeth, Garth Snyder, Trent R. Hein, and Ben Whaley, UNIX and Linux System Administration Handbook, 4th edition, Prentice Hall, 2010.