Clustered Systems Introduction
Phil Hughes, CEO, Clustered Systems Company, Inc.
phil@clusteredsystems.com
415 613 9264
Clustered Management
Phil Hughes, CEO
Robert Lipp, COO/CTO
- Business partners since 1998
- Invented and developed revolutionary pumped refrigerant cooling architecture (3 patents)
- Invented and developed distributed switching architecture for a 2.5 Tb/s SONET/SDH switch
- Raised over $110M in venture funding
- Collectively hold 20 patents
Cooling System Development
- Prototype system with 2-phase contact cooling technology
- Licensed technology to Liebert, Inc. for the XDS, which "wins" Chill-Off II
- Built 100kW rack installed at SLAC
  - $3M DoE grant
  - Intel 130W CPUs
SLAC System Detail (diagram callouts): Ethernet switches; 480VDC-to-380VDC converter; 100kW chassis; PCIe switches, 80 ports
Chassis Construction (diagram callouts): PDU; cold plates; dual-server blade
Cooling System
- 2 rows of 16 cold plates for motherboard cooling: >1,200W/plate, 2.4kW/blade
- 4 plates for activeplane cooling: 500W per plate
- Interoperable with Liebert XD refrigerant-based cooling system
Plate Thermal Testing
Blade and Cold Plate
- 22" x 6" active area
- >1.2kW cooling capacity per plate
- 2 per blade, 32 per chassis (see the arithmetic sketch below)
- S2600JF motherboard
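A minimal back-of-envelope sketch (Python) of the plate-level cooling budget implied by this slide. It only multiplies out the plate counts above, treating ">1.2 kW" as a lower bound per plate; it is not a figure from the deck itself.

```python
# Plate-level cooling budget implied by the slide's own numbers.
# ">1.2 kW" is treated as a conservative lower bound per cold plate.

PLATE_CAPACITY_KW = 1.2      # lower bound, per cold plate
PLATES_PER_BLADE = 2
PLATES_PER_CHASSIS = 32      # 2 rows of 16

blade_budget_kw = PLATE_CAPACITY_KW * PLATES_PER_BLADE        # ~2.4 kW per blade
chassis_budget_kw = PLATE_CAPACITY_KW * PLATES_PER_CHASSIS    # ~38 kW, motherboard plates only

print(f"Per blade:   >= {blade_budget_kw:.1f} kW")
print(f"Per chassis: >= {chassis_budget_kw:.1f} kW (motherboard plates only)")
```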
Chassis Rear (diagram callouts): exit manifold; coolant inlet manifold; Switch 1; Switch 2; switch cold plates
PCIe Switch Architecture
- Blade server rows: 20 rows of CPU blades (Row 1 ... Row 20), 16 CPU blades per row plus an optional switch blade (counts tallied below)
- Switch blade: 8696 PCIe switch, COM module CPU, dual 10Gb Ethernet I/O module, dual 10Gb Ethernet ports (optional), x4 and x8 PCIe links
- External PCIe switches: 8 or 16 external 20-port PCIe switch units
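A rough topology tally (Python), using only the counts from the diagram labels above; it is a sketch of scale, not a configuration spec from the deck.

```python
# Topology scale implied by the architecture slide's labels.
ROWS = 20
CPU_BLADES_PER_ROW = 16
EXTERNAL_SWITCH_OPTIONS = (8, 16)    # "8 or 16 external 20-port PCIe switch units"
PORTS_PER_EXTERNAL_SWITCH = 20

cpu_blades = ROWS * CPU_BLADES_PER_ROW    # 320 CPU blades
switch_blades = ROWS                       # one optional switch blade per row

print(f"CPU blades: {cpu_blades}, optional switch blades: {switch_blades}")
for n in EXTERNAL_SWITCH_OPTIONS:
    print(f"{n} external units -> {n * PORTS_PER_EXTERNAL_SWITCH} external PCIe ports")
```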
PCIe Switch Board
- 16 x PCIe 2.0 x4 links to servers
- 8 x PCIe x4 links to external switches
- 16 x 1GbE from servers, 2 external ports
(bandwidth and oversubscription sketched below)
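A minimal bandwidth sketch for this board, assuming roughly 500 MB/s of usable bandwidth per PCIe 2.0 lane per direction (5 GT/s with 8b/10b encoding) and that the external-switch links run at the same Gen 2 rate; the slide only labels the server-facing links as 2.0, so the uplink rate is an assumption.

```python
# Aggregate PCIe bandwidth and oversubscription for the switch board.
# Assumes ~500 MB/s (~4 Gb/s) usable per PCIe 2.0 lane, per direction.

GBPS_PER_LANE = 4.0                # ~4 Gb/s per Gen 2 lane, per direction

server_lanes   = 16 * 4            # 16 x PCIe 2.0 x4 to servers
external_lanes = 8 * 4             # 8 x PCIe x4 to external switches (rate assumed Gen 2)

server_bw   = server_lanes * GBPS_PER_LANE     # ~256 Gb/s per direction
external_bw = external_lanes * GBPS_PER_LANE   # ~128 Gb/s per direction

print(f"Toward servers:           ~{server_bw:.0f} Gb/s per direction")
print(f"Toward external switches: ~{external_bw:.0f} Gb/s per direction")
print(f"Oversubscription:          {server_bw / external_bw:.0f}:1")
```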
Recommended Cooling System
- Refrigerant lines from the rack to a Liebert XDP; cooling water from a dry/adiabatic cooler
- May also use CRAH return water
- Due to low CPU-to-refrigerant thermal resistance, 30°C water provides sufficient cooling (see the sketch below)
- One Liebert XDP can cool 2-3 racks
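An illustrative Python sketch of why warm water can be enough. The case-to-water thermal resistance below is an assumed round number, not a figure from the deck; the 130 W CPU power comes from the SLAC slide and the 30°C water from this one.

```python
# Illustration of "low thermal resistance lets 30 C water cool the CPUs".
# R_CASE_TO_WATER is an assumed value for the whole case-to-water path.

CPU_POWER_W = 130          # Intel 130 W CPUs (SLAC system)
WATER_TEMP_C = 30          # facility water temperature on this slide
R_CASE_TO_WATER = 0.25     # degC per watt -- assumed for illustration only

case_temp_c = WATER_TEMP_C + CPU_POWER_W * R_CASE_TO_WATER
print(f"Estimated CPU case temperature: {case_temp_c:.1f} C")
# ~62.5 C under these assumptions, comfortably below typical server-CPU case limits.
```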
Competition
- Water Closeteers: circulate water in the rack; all have server-to-rack liquid connectors; expensive (if reliability is required); 20-30kW/rack; fan assist
  - Players: IBM, Bull, Fujitsu, Asetek, CoolIT, Cray, SGI, Cool Flo, Hitachi, Eurotech, etc.
- Dunkers: servers are immersed in a dielectric fluid; 20-30kW/rack
  - Players: Green Revolution, Iceotope, LiquidCool
2 Phase Contact Cooling offers:
- HPC:
  - Very high power density, which eases communication
  - 0.3 PFLOP/rack now (100kW); 0.75 PFLOP/rack (200kW) feasible with today's GPUs
  - "Exascale enabler" (John Gustafson)
- Data Center:
  - Very high power density, which allows data centers to be much smaller: FB Prineville 75 W/sq ft vs. CSys rack 4,000 W/sq ft (footprint math sketched below)
  - "Game Changer" (Jack Pouchet, VP Exascale, Emerson)
- PUE 1.07 (mechanical & electrical) measured at SLAC
- No air movement, so systems can be placed virtually anywhere
- Systems are totally silent (no more OSHA issues)
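A quick footprint comparison (Python) using only the two power densities quoted above; the 1 MW IT load is an arbitrary example size, not a figure from the deck.

```python
# Footprint needed for 1 MW of IT load at the two densities on the slide.

IT_LOAD_W = 1_000_000
AIR_DENSITY_W_SQFT = 75       # FB Prineville figure quoted on the slide
CSYS_DENSITY_W_SQFT = 4_000   # Clustered Systems rack figure quoted on the slide

air_area = IT_LOAD_W / AIR_DENSITY_W_SQFT      # ~13,300 sq ft
csys_area = IT_LOAD_W / CSYS_DENSITY_W_SQFT    # 250 sq ft

print(f"1 MW at {AIR_DENSITY_W_SQFT} W/sq ft:   {air_area:,.0f} sq ft")
print(f"1 MW at {CSYS_DENSITY_W_SQFT} W/sq ft: {csys_area:,.0f} sq ft")
print(f"Density ratio: ~{CSYS_DENSITY_W_SQFT / AIR_DENSITY_W_SQFT:.0f}x")
```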
Advantage Summary
Clustered                          | Air
Cuts DC CAPEX up to 50%            | High CAPEX
Cuts OPEX up to 30%                | 100% OPEX
PUE ~1.07                          | Best PUE ~1.15 (incl svr fans)
3 month lead times                 | 2 year lead times (high risk)
Pay as you go                      | Costs 90% up front
3-6 year depreciation              | Up to 39 year depreciation
Low maintenance                    | High maintenance (fans, filters)
Small or existing building         | Giant buildings
Simple                             | Complex
Easily handles GPUs                | GPUs need heroic efforts
Disruptive technology              | Known, many practitioners
Narrow industry support            | 90% of products are air only
Change will happen, but it will have to be driven from the top down.
(The PUE gap is quantified in the sketch below.)
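A sketch (Python) of what the PUE gap in the table means in energy terms. The 1 MW IT load and the $0.10/kWh tariff are illustrative assumptions, not figures from the deck.

```python
# Annual cooling/distribution overhead at PUE 1.07 vs PUE 1.15.

IT_LOAD_KW = 1_000            # example IT load (assumed)
PUE_CLUSTERED = 1.07
PUE_AIR = 1.15
TARIFF_PER_KWH = 0.10         # $/kWh, assumed
HOURS_PER_YEAR = 8_760

overhead_clustered_kw = IT_LOAD_KW * (PUE_CLUSTERED - 1)   # 70 kW
overhead_air_kw = IT_LOAD_KW * (PUE_AIR - 1)               # 150 kW

delta_kwh = (overhead_air_kw - overhead_clustered_kw) * HOURS_PER_YEAR
print(f"Overhead: {overhead_air_kw:.0f} kW (air) vs {overhead_clustered_kw:.0f} kW (Clustered)")
print(f"Annual difference: {delta_kwh:,.0f} kWh (~${delta_kwh * TARIFF_PER_KWH:,.0f}/yr at $0.10/kWh)")
```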
By the Numbers

CAPEX ($K/MW)          Free Cooling   Air w CRAH   Air w rear door   Clustered
W/sq ft (data room)              93          150               430       2,500
Built area (sq ft)           10,714        8,000             2,784         400
Nr of servers                 2,857        2,857             2,857       3,200
DC construction                   -       $2,000              $696         $32
Electrical system                 -       $1,000              $910        $537
Mechanical                        -         $791            $1,050        $342
Cabinets                          -         $250              $390      $1,000
Other                             -       $1,051              $792        $344
Total CAPEX                  $7,500       $5,091            $3,839      $2,255
Per server ($K)               $2.63        $1.78             $1.34       $0.70

OPEX ($K/MW)           Free Cooling   Air w CRAH   Air w rear door   Clustered
Amortization                   $300         $422              $372        $295
Power cost                   $1,008       $1,175            $1,105        $947
Maintenance                  $1,084       $1,376            $1,250        $985
Total OPEX                   $2,392       $2,974            $2,727      $2,227
Per server ($K)               $0.84        $1.04             $0.95       $0.70

(Per-server figures are checked in the sketch below.)
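A small sanity check (Python) of the "Per server" rows: total $K/MW divided by the number of servers per MW, using only the table's own numbers.

```python
# Derive the per-server CAPEX/OPEX figures from the table's totals and server counts.

servers     = {"Free Cooling": 2857, "Air w CRAH": 2857, "Air w rear door": 2857, "Clustered": 3200}
total_capex = {"Free Cooling": 7500, "Air w CRAH": 5091, "Air w rear door": 3839, "Clustered": 2255}
total_opex  = {"Free Cooling": 2392, "Air w CRAH": 2974, "Air w rear door": 2727, "Clustered": 2227}

for name, n_servers in servers.items():
    capex_per_server = total_capex[name] / n_servers
    opex_per_server = total_opex[name] / n_servers
    print(f"{name:16s} CAPEX ${capex_per_server:.2f}K/server, OPEX ${opex_per_server:.2f}K/server")
```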
Summary
- Installed cost significantly less than traditional data centers (includes building and infrastructure)
- Enables increased investment in hardware
- Low TCO: cooling costs, maintenance, amortization, etc.