Diamond Computing (Nick Rees et al.): Storage and Computing Requirements

Diamond Computing Nick Rees et al.

Storage and Computing Requirements

Storage – Initial requirements
200 TBytes usable disk space.
20 MBytes/sec/TByte scalable aggregated throughput.
–4 GBytes/sec aggregated throughput for 200 TBytes.
100 MBytes/sec transfer rate for individual 1 Gbit clients.
400 MBytes/sec for individual 10 Gbit clients.
Highly resilient.
Support for Linux clients (RHEL4U6 and RHEL5 or later).
POSIX with extended attributes and ACLs.
File access control based on groups (> 256 groups/user).
Ethernet clients.
Extendable by at least 200 TB for future growth.
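A minimal sketch of the sizing rule above (20 MBytes/sec of aggregate throughput per usable TByte); the 200 TB figure is from this slide, the 400 TB case illustrates the extension requirement:

    # Throughput sizing rule from the requirements slide: 20 MB/s per usable TB.
    MB_PER_S_PER_TB = 20

    def required_throughput_gb_s(usable_tb):
        """Aggregate throughput (GBytes/sec) needed for a given usable capacity (TBytes)."""
        return usable_tb * MB_PER_S_PER_TB / 1000.0

    print(required_throughput_gb_s(200))   # 4.0 GB/s for the initial 200 TB
    print(required_throughput_gb_s(400))   # 8.0 GB/s if extended by a further 200 TB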

Storage – solution
Based on the DataDirect Networks S2A9900 system.
Up to 5.7 GB/s throughput.
Fault-tolerant architecture.
Runs the Lustre file system:
–Open source file system recently acquired by Sun.
–Most popular file system in the top 100 supercomputers.

Storage – solution
10 disk trays.
60 hot-swap disks/tray.
2 redundant trays.

Storage – solution
60 tiers of 10 disks.
2 parity disks/tier.
2 controllers – both controllers can access all data.
FPGA based – no loss of performance when running in degraded mode.
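A rough capacity sketch for the tier layout above (60 tiers of 10 disks, 2 parity disks per tier). The per-disk size is an assumption for illustration only, not a figure from the slides:

    TIERS = 60
    DISKS_PER_TIER = 10
    PARITY_PER_TIER = 2
    ASSUMED_DISK_TB = 1.0   # illustrative drive size, not stated on the slides

    data_disks = TIERS * (DISKS_PER_TIER - PARITY_PER_TIER)   # 480 data disks
    usable_fraction = 1 - PARITY_PER_TIER / DISKS_PER_TIER    # 0.8 of raw capacity
    print(data_disks, usable_fraction, data_disks * ASSUMED_DISK_TB)

With the assumed 1 TB drives, a fully populated system would come out around 480 TBytes usable, which is in line with the "nearly 500 TBytes" expansion figure quoted later.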

Computing – Initial requirements
1500 SPECfp_rate2006 total.
x86-based processors with EM64T extensions.
2 GBytes RAM per CPU core.
Gigabit Ethernet.
Remote management.
PCI-E x16 slot for GPU accelerator.
Price includes 3 years of power cost (50% of total).

Computing – solution
15 Viglen HX2225i 1U twin-motherboard systems.
Each 1U system comprises:
–Supermicro X7DWT motherboard
–Intel 5400 (Seaburg) chipset
–Dual Intel Xeon E5420 (quad core, 2.5 GHz) processors
–16 GB DDR2-667 ECC registered FBDIMM memory, installed as 4x4 GB FBDIMMs per motherboard
–160 GB SATA hard disk
–Single x16 PCI-Express slot (for GPU extension)
Total 20 systems, 16 cores/system, 320 cores.
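A quick sketch checking this configuration against the 2 GBytes-per-core requirement on the previous slide (figures are from this slide):

    CORES_PER_CPU = 4      # Xeon E5420 is quad core
    CPUS = 2               # dual-socket motherboard
    RAM_GB = 16            # 4 x 4 GB FBDIMMs

    cores = CORES_PER_CPU * CPUS
    print(RAM_GB / cores)  # 2.0 GB per core, matching the requirement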

Computing – solution
1U twin-motherboard systems. Each 1U motherboard is:
–Supermicro X7DWT
–Intel 5400 (Seaburg) chipset
–Dual Intel Xeon E5420 (quad core, 2.5 GHz) processors
–16 GB DDR2-667 ECC RAM, installed as 4x4 GB FBDIMMs
–160 GB SATA hard disk
–1x16 PCI-Express slot (for GPU extension)
Total 20 systems, 16 cores/system, 320 cores.

Network - current layout

Network – new layout
Two core switches, one in each computer room.
Each beamline connected to both switches – so either 2 Gbit or 20 Gbit peak bandwidth available.
–Halved if one core switch fails.
Cluster switch connected with 2 x 10 Gbit to each core switch.
Storage system connected directly into the core.
Formally looking at a goal of >99.95% reliability on network infrastructure during beam time.
–Corresponds to 3 hours outage/year.
–Doesn't include reliability of associated computer services.
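A sketch of what the >99.95% target allows. The annual beam-time total is an assumption for illustration; around 6,000 hours would make the 0.05% outage budget come out at the 3 hours quoted above:

    AVAILABILITY = 0.9995
    BEAM_HOURS_PER_YEAR = 6000   # assumed annual beam time, illustrative only

    allowed_outage_hours = (1 - AVAILABILITY) * BEAM_HOURS_PER_YEAR
    print(allowed_outage_hours)  # 3.0 hours of network outage per year of beam time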

Network - new layout

The interesting specifications
Each disk tray:
–Consumes up to 1750 W
–Weighs up to 105 kg
–Is 4U high (10 trays in one rack)
Each compute node:
–Consumes up to 600 W
–Weighs 18 kg
–Is 1U high (38 nodes in each rack)
So:
–10 disk trays in a rack are 17.5 kW and weigh 1050 kg
–38 computers in a rack are 22.8 kW and weigh 684 kg
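The per-rack totals follow directly from the per-unit figures; a minimal sketch:

    def rack_totals(units, watts_each, kg_each):
        """Total power (kW) and weight (kg) for a rack of identical units."""
        return units * watts_each / 1000.0, units * kg_each

    print(rack_totals(10, 1750, 105))  # (17.5, 1050) for a rack of disk trays
    print(rack_totals(38, 600, 18))    # (22.8, 684) for a rack of compute nodes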

Where do we put the systems?
We need a computer room which provides:
–~20 kW/rack
–>1 ton rack load
–24 hrs/day operation, even during power and water maintenance.
Or there is the CERN alternative:
–they currently fill their racks with six 1U servers and then they are "full".

New computer room
Room has up to 350 kW of redundant power (A and B) from two separate sub-stations.
Power from A is UPS and generator backed.
Room is provided with up to 350 kW of cooling water.
Primary cooling is from site chilled water, with a 220 kW standby chiller in case of problems.
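A sketch of the headroom these figures imply, using the ~20 kW/rack design load and the 8-rack initial computer suite described on later slides:

    ROOM_POWER_KW = 350
    RACK_LOAD_KW = 20
    INITIAL_RACKS = 8

    initial_load_kw = INITIAL_RACKS * RACK_LOAD_KW
    print(initial_load_kw)                 # 160 kW initial design load
    print(ROOM_POWER_KW // RACK_LOAD_KW)   # room for ~17 such racks on power alone

This is consistent with the "doubling in power and cooling capacity" noted on the Future expansion slide.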

Computer room – racks
10 racks initially – 2 network, 8 computer.
Cooling by in-rack cooling units to cope with the high power density.
Low-power network racks also have cooling units to provide N+1 resiliency.
Computers exhaust into a shared hot-air plenum, from which air can only escape through one of the cooling units.
Each rack has four 11 kW 3-phase power strips (two from each sub-station).
Status of the system is continuously monitored and available through a browser and SNMP.

Racks – cooling (diagram): contained cold aisle with double doors, 19” server/equipment racks, network racks and 20 kW in-row air-to-water heat exchangers; marked dimensions 4,375 mm, 4,000 mm, 1,200 mm and 800 mm.

Racks - layout

Current situation (photo): water pipes, cable tray.

Racks (photos, front and rear views): 20 kW air handling unit, 2 x 11 kW power strips.

Some calculations
Heat capacity of air = 1.3 x 10^-3 J/cm^3/K = 1.3 kJ/m^3/K
Volume of 1 rack = 0.6 x 0.8 x 2 m^3 = 0.96 m^3
Typical delta-T for computers = 20 C
So, air change time for a 20 kW rack = 0.96*1.3*20/20 = 1.25 s
Want 8 server racks = 160 kW.
Volume of cold air in racks = 1.2 x 2 x 4.5 = 10.8 m^3
Temperature rise rate of air = 160/(10.8*1.3) = 11.4 deg/sec
The room volume = 16 x 6.5 x 2.7 = 280 m^3.
Temperature rise rate of air = 160/(280*1.3) = 0.44 deg/sec
Need a safety system in case the water stops flowing!
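The same arithmetic as a runnable sketch, taking the volumetric heat capacity of air as 1.3 kJ/m^3/K as above:

    C_AIR = 1.3                         # volumetric heat capacity of air, kJ/(m^3 K)
    DELTA_T = 20                        # typical temperature rise across a server, C

    rack_volume = 0.6 * 0.8 * 2.0       # 0.96 m^3 of air in one rack
    air_change_time = rack_volume * C_AIR * DELTA_T / 20    # ~1.25 s for a 20 kW rack

    cold_aisle_volume = 1.2 * 2.0 * 4.5 # 10.8 m^3 of cold air for 8 racks (160 kW)
    aisle_rise_rate = 160 / (cold_aisle_volume * C_AIR)     # ~11.4 C/s if cooling stops

    room_volume = 16 * 6.5 * 2.7        # ~280 m^3
    room_rise_rate = 160 / (room_volume * C_AIR)            # ~0.44 C/s

    print(air_change_time, aisle_rise_rate, room_rise_rate)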

The Services Crisis
Server compute performance has been increasing by a factor of three every two years.
Computer energy efficiency only doubles every two years.
Computer spend is staying relatively constant.
–Computational requirements appear to scale with the compute performance.
So, while power consumption per computational unit has dropped dramatically, the power consumption per rack has risen dramatically.

Bottom line
Reprinted with permission of The Uptime Institute Inc. from a white paper titled "The Invisible Crisis in the Data Center: The Economic Meltdown of Moore's Law".
$1,000 also buys you 190 W over 3 years at $0.20/kWh.
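The power-cost figure can be checked directly; a minimal sketch at the quoted tariff:

    TARIFF = 0.20        # $/kWh, as quoted
    HOURS = 3 * 8760     # three years of continuous operation

    watts_per_1000_dollars = 1000 / (TARIFF * HOURS) * 1000
    print(round(watts_per_1000_dollars))   # ~190 W of continuous draw costs ~$1,000 over 3 years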

Rack loads
Floor loads are approaching 1.5 tons/m^2.
Architects had said there were no problems.
When I looked, the floor spec was 350 kg/m^2.

Floor strengthening

Conclusions
Computing is now all about providing services.
We are fighting hard against the heat capacity of air – it's down to very basic physics.
Costs are dominated by services – £1,000k gives:
–£45k computing
–£300k storage
–£105k network
–£200k racks
–£100k mechanical services (water pipes)
–£100k electrical services
–£100k floor strengthening
–£50k (largely wasted) architectural services
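A trivial sketch confirming the breakdown above sums to the full £1,000k and how little of it is the computing itself:

    costs_k = {
        "computing": 45, "storage": 300, "network": 105, "racks": 200,
        "mechanical services": 100, "electrical services": 100,
        "floor strengthening": 100, "architectural services": 50,
    }
    print(sum(costs_k.values()))                          # 1000 (thousand pounds)
    print(costs_k["computing"] / sum(costs_k.values()))   # computing is only 4.5% of the total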

Introduction
Over the past year we have spent a lot of time thinking about how to provide the computing resources for Phase 2.
Started as a project to just look at tomography, but broadened to all of Phase 2 computing.
Taken input from beamlines, the data acquisition group, other synchrotrons and management.
Had proposals for capital projects in the system since June.
Visited SLS, CERN and ESRF in November.
Finally had Gerd's sign-off in late February 2008.

Other requirements
Some beamline load is relatively constant (e.g. MX), but others are heavily peaked (tomography).
Total demand can rise rapidly with new detectors (e.g. Pilatus).
We need to maintain consistency across beamlines to keep our overheads low.
We need a balanced system with no major bottlenecks.
We need power and cooling infrastructure – beamlines are not designed for the spiralling computing demands.

Typical data analysis (pipeline diagram): detector, image processing, align, image generation, remove instrumental signature, display, archive and off-site copy.

Physical system implementation (diagram): Phase 1 and Phase 2 beamlines with their beamline switches, a central switch, a cluster switch, the cluster and the disks; labelled link speeds range from 0.5, 1 and 2 Gbit/s at the beamlines through 10 Gbit/s and 40 x 1 Gbit/s links up to 40, 60 and 80 Gbit/s aggregates in the core.

Other bottlenecks
The disk bottleneck is not the only possibility; it is just the major one we encountered in Phase 1.
Other bottlenecks include:
–Latencies as well as throughputs
–Network
–Backplane
–Memory
–CPU
A balanced system is required.
CPU is the easiest and cheapest to address, but often doesn't solve the problem.
Software design changes are often the most effective and cheapest.

Other considerations
This is just the raw power cost.
–Our tenders included the 3-year power and cooling cost as part of the tendered price.
Rack power density is doubling every two years.
Rack cooling load is doubling every two years.
The capital cost of providing the room services is significant.
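An illustrative projection of the "doubling every two years" trend, starting from the ~20 kW/rack design figure used for this room (the horizon is arbitrary):

    START_KW = 20
    DOUBLING_YEARS = 2

    def projected_rack_kw(years_ahead):
        """Rack power density after a number of years at a two-year doubling time."""
        return START_KW * 2 ** (years_ahead / DOUBLING_YEARS)

    for years in (2, 4, 6):
        print(years, round(projected_rack_kw(years)))   # 40, 80 and 160 kW/rack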

The snowball effect
Initial task:
–Provide some computers for tomography for a number of beamlines.
Result:
–Need to provide network to that room.
  Need to ensure the network won't bottleneck.
  Ensure it keeps working, even during maintenance.
–Need to provide disk storage that won't bottleneck.
–Need to provide a room to house the computers.
  Need to provide cooling for that room.
  Need to provide power for that room.
  Ensure it keeps working, even during maintenance.
–Need to build in room for expansion.
–We've done so much, why don't we provide enough for all Phase 2 beamlines?
–…and, by the way, buy a few compute nodes as well.

Result
We are building a new high-density computer room with resilient power and cooling.
We will upgrade the network to increase performance and provide resiliency.
We have purchased a storage system about 200 TBytes in size, with 6 GBytes/sec performance.
–Can be expanded to nearly 500 TBytes.
We have bought a 240-core compute cluster for the next MX beamlines and tomography.
–Each dual quad-core motherboard has a PCI-Express x16 slot for NVidia expansion.

Philosophy
Beamlines share highly resilient, high-performance, central computing facilities.
We ensure the network is always available and the computing and storage is always there.
Beamlines have local workstations, control computers, media stations, printers, etc.
Design relies on a balance of network, storage and computing performance.
–Computing cost £45k
–Network cost £105k
–Storage cost £300k
–Racks, services, and building costs £500k

Layout of beamline computing resources (diagram): science router, beamline firewall/filter, cluster firewall/filter, beamline switch, cluster switch, machine switch, Diamond House, compute cluster, file servers and RAID storage cluster in the computer room, detector and detector server with local detector storage, media station, consoles, CA gateway, and beamline user USB and eSATA drives; link types are 1 Gbit/s Ethernet, 10 Gbit/s Ethernet and 4 Gbit/s Fibre Channel.

Computer room
Rooms in Zone 10.
Selected because the space is not suitable for offices: the outside windows are being blocked off by the I12 build.
Possibility of extending it counter-clockwise, if necessary.

Computer room - location

I12

Computer room – location (diagram): I12, standby chiller, chilled water distribution – provides cooling directly to the racks.

New Computer Room
A new computer room is being constructed in Zone 10 above I12.
Initial computer suite is 8 racks:
–Each rack has ~20 kW of cooling.
–Each rack has ~22 kW of redundant power.
–Backup chiller is available if site cooling is unavailable.
Initial computing fills about 2 racks:
–200 TBytes of disk with 6 GBytes/sec performance (Phase 1 disks are bottlenecked at ~50 MBytes/sec).
–240 cores of computing.
Network is also upgraded:
–Core network is two independent stars.
We expect the redundancy to be used more to enable maintenance to take place without interrupting services than because of actual equipment failure.
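A one-line sketch of the step up from the Phase 1 disk performance quoted on this slide:

    PHASE1_MB_S = 50      # Phase 1 disk bottleneck
    NEW_GB_S = 6          # new storage system

    print(NEW_GB_S * 1000 / PHASE1_MB_S)   # ~120x the Phase 1 disk throughput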

Current situation (photo): water pipes, cable tray.

Future expansion
Current systems fill 2.5 racks out of 8.
Room services allow a doubling in power and cooling capacity.
Room area allows more than a doubling of the number of racks.
Disk trays are only populated with 25 disks out of a capacity of 60.
–Note that adding disks does not increase the available bandwidth.
Compute nodes can have NVidia accelerators added.
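A sketch of the capacity headroom implied by the tray population above, using the ~200 TByte figure from the Result slide:

    CURRENT_TB = 200
    DISKS_NOW = 25
    DISK_SLOTS = 60

    print(CURRENT_TB * DISK_SLOTS / DISKS_NOW)   # 480 TB when fully populated, i.e. "nearly 500 TBytes"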

Programme
End March 2008:
–Design finalised (PB).
–Tenders ready for racks, cooling, network, storage and computing.
End April 2008:
–Prime contractor selected and on board.
–Electrical enabling works complete.
May–July 2008:
–Build work.
–Suppliers for racks, cooling, etc. selected and orders placed.
Late July 2008:
–Build work completed.
–In-room fit-out starts (will actually start late August).
Late September 2008:
–Systems available for use (will be November at least).

Conclusions
We are working on an ambitious upgrade to Diamond's computing resources.
We have a tight timescale.
We believe we can meet it, but we will be needing more staff (and are currently hiring…).

Any Questions?