Nikhef/SARA Tier-1 data center infrastructure
Tier-1 facts / Expanding the Nikhef center
Wim Heubers / Nikhef Amsterdam NL
LCG Tier-1 Amsterdam Science Park
Nikhef - National institute for subatomic physics
- LHC (ATLAS, LHCb, ALICE), astroparticle physics
- data center: 500 m2, 800 kW incl. cooling
- grid services (disk storage, clusters)
- internet exchange AMS-IX
SARA - Computing and Networking Services
- colo services, consulting
- data center: 1500 m2, 2 MW incl. cooling; national supercomputer, national cluster, NetherLight, etc.
- grid services (tape and disk storage, clusters)
LCG Tier-1 Amsterdam Science Park
More infrastructure:
- SURFnet - national research network; provides connectivity to the LCG OPN
- Big Grid - the Dutch e-science grid; provides resources for the LCG Tier-1 and other domains, 2008-2011
- Amsterdam Internet Exchange AMS-IX - major and neutral internet exchange; six housing locations, including SARA and Nikhef
Nikhef-SARA LCG Tier-1
Nikhef and SARA share …
- campus, building, on-site security, restaurant
- LCG OPN connections
- tier-1 operations (!)
Nikhef and SARA do NOT share …
- power and cooling infrastructure
- sysadmin
- tier-1 resources (grid services, clusters, storage)
SARA does and Nikhef does not …
- provide hierarchical storage (tape, dCache)
- generic grid services
Nikhef does and SARA does not …
- middleware development (VOMS, LCAS, etc.)
- scaling and validation test beds
- tier-3 services
Computing
Disk Storage
Tape Storage
LCG HEP resources - Tier-1 installations:
- Computing: SARA 60%, Nikhef 40%
- Disk storage: SARA 60%, Nikhef 40%
- Tape storage: SARA 100%
Note: 'Big Grid' budget until 2011.
Expanding the Nikhef data center
Data center layout - Nikhef Amsterdam: grid, internet exchange (colo)
Amsterdam Internet Exchange AMS-IX
- neutral and independent
- started 15 years ago at Science Park
- now: distributed housing at 6 locations in Amsterdam
- large exchange: 300 connected parties
- Nikhef housing: 200 racks, 100 customers
- Nikhef provides: UPS power, cooling, security, access assistance during office hours
Amsterdam Internet Exchange AMS-IX: zero-down-time
Nikhef - power demands: controlled linear increase
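As an aside, a toy sketch of the kind of capacity planning such a linear trend enables (Python; the readings, growth rate and 400 kW ceiling below are hypothetical illustrations, not Nikhef figures):

```python
# Toy capacity-planning sketch (not Nikhef data): fit a straight line to
# hypothetical monthly power readings and estimate when an assumed facility
# limit would be reached - the point of tracking a "controlled linear increase".

def linear_fit(xs, ys):
    """Least-squares slope and intercept for y ~ slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
            sum((x - mean_x) ** 2 for x in xs)
    return slope, mean_y - slope * mean_x

if __name__ == "__main__":
    months = list(range(12))                            # one year of samples
    load_kw = [210 + 9 * m + (m % 3) for m in months]   # hypothetical readings
    slope, intercept = linear_fit(months, load_kw)
    limit_kw = 400.0                                    # assumed capacity ceiling
    print(f"growth       : {slope:.1f} kW/month")
    print(f"limit reached: around month {(limit_kw - intercept) / slope:.0f}")
```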
Expanding the data center
We need more …
- floor space, power, cooling
- security, fire suppression, alarm procedures
- monitoring of critical infrastructure
But it has to be …
- realized within the existing (institute) building
- without affecting AMS-IX operations (zero-down-time)
What happened …
- many discussions with management: reliable infrastructure is very expensive!
- gained experience: visited commercial data centers and conferences like 'Datacenter Dynamics'
- hired external technical expertise and project management
- incident due to an overloaded circuit breaker: monitoring and capacity planning are essential
- put effort into temporary measures
Temporary measures (1): backup generators
Temporary measures (2)
This week: add extra cooling for 50 kW of grid resources, just in time for the CCRC'08 May run
Planning … (to be finished April 2009, I hope)
- install new cooling equipment (on the roof)
- integrate a 2nd UPS and generator into the infrastructure (remember 'zero-down-time')
- install new fire suppression and climate handling systems
- convert the library into a new data room on the 2nd floor
- move grid clusters and storage from the 1st to the 2nd floor
- extend AMS-IX housing on the 1st floor
- make the grid resources visible …
From library to grid …
Monitoring power to the racks
Main power distribution: connected to the facility control system (alarm -> standby service)
- current (amps) per phase in the power distribution units
- power drop per phase on the distribution rails
Power usage in racks: connected to 'our' IT control system
- current (amps) and power usage (kWh) per phase in the racks
- needed for capacity planning and for billing energy costs to users
Note: monitoring of the grid clusters and storage is done separately (Ganglia)
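A minimal sketch of the bookkeeping behind capacity planning and billing, assuming 230 V line-to-neutral supply, an estimated power factor and hypothetical readings and tariffs (none of this is taken from the actual Nikhef control system):

```python
# Minimal sketch (not the actual Nikhef tooling): derive per-rack power and
# energy figures from per-phase current readings, the kind of numbers the
# rack power distribution units report to the IT control system.
# Assumptions: 230 V line-to-neutral (European three-phase), a guessed power
# factor, and hypothetical readings and energy price.

PHASE_VOLTAGE = 230.0   # volts, line-to-neutral (assumption)
POWER_FACTOR = 0.9      # assumed for mixed IT load

def rack_power_kw(phase_currents_a):
    """Real power of one rack in kW from its per-phase currents (A)."""
    return sum(i * PHASE_VOLTAGE * POWER_FACTOR for i in phase_currents_a) / 1000.0

def energy_kwh(power_kw, hours):
    """Energy consumed at a constant power over a period, in kWh."""
    return power_kw * hours

if __name__ == "__main__":
    # Hypothetical rack: 6.2 A, 5.8 A and 6.0 A on the three phases.
    p = rack_power_kw([6.2, 5.8, 6.0])
    e = energy_kwh(p, 24 * 30)                # one month, assuming steady load
    print(f"rack load : {p:.2f} kW")
    print(f"monthly   : {e:.0f} kWh")
    print(f"bill      : EUR {e * 0.12:.2f}")  # 0.12 EUR/kWh is a placeholder rate
```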
Power monitoring: amps and kWh
Cooling
AMS-IX housing (can't change too much):
- designed 10 years ago for 1.8 kW average per rack
- yes, this is still the average today! [telco equipment]
- but we have annoying 'hot spots' on the floor
- too many obstacles under the raised floor and above the ceiling
Grid housing (new floor!):
- maximum 50 racks and 300 kW total power
- raised floor, but limited space above the racks
- proposed solution: cold-corridor principle
Save energy …
- free cooling, optimize the cold air flow
- increase room temperature and cold water temperature (10-16 °C)
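Illustrative arithmetic behind the cold-corridor proposal: the new grid floor averages 6 kW per rack versus 1.8 kW on the AMS-IX floor, so each rack needs roughly three times the cold air. The sketch below assumes standard air properties and a 10 K temperature rise across the rack; the numbers are estimates, not design values:

```python
# Illustrative arithmetic only (not from the slides): compare the power
# density of the existing AMS-IX floor with the planned grid floor, and
# estimate the cold-air flow a 6 kW rack needs.
# Assumptions: air density 1.2 kg/m^3, specific heat 1005 J/(kg*K),
# 10 K temperature rise of the air across the rack.

AIR_DENSITY = 1.2         # kg/m^3 (assumption, room conditions)
AIR_SPECIFIC_HEAT = 1005  # J/(kg*K)

def airflow_m3_per_h(power_kw, delta_t_k):
    """Cooling air volume per hour needed to remove power_kw at a delta_t_k rise."""
    m3_per_s = power_kw * 1000.0 / (AIR_DENSITY * AIR_SPECIFIC_HEAT * delta_t_k)
    return m3_per_s * 3600.0

if __name__ == "__main__":
    amsix_per_rack = 1.8          # kW, figure from the slide
    grid_per_rack = 300.0 / 50    # kW, 300 kW over 50 racks
    print(f"AMS-IX floor : {amsix_per_rack:.1f} kW/rack")
    print(f"grid floor   : {grid_per_rack:.1f} kW/rack")
    print(f"airflow for one grid rack at dT = 10 K : "
          f"{airflow_m3_per_h(grid_per_rack, 10.0):.0f} m^3/h")
```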
Fire suppression
Now: only smoke detection
Choice between:
- leave it as it is
- suppression with inert gas (argon)
- suppression with chemical gas (Novec 1230)
Suggestions?
Extending an existing facility
- it is expensive in time and money
- you don't get what you really want
- piping and fitting through concrete floors
- zero-down-time: stressful
Remarks and conclusions
- from idea to realization: it takes two years
- you have to position yourself between IT and infrastructure
- sustainability: can a data center be green? (cooling)
- Grid: how to guarantee optimal usage of the resources?
- if you can start all over again: do it!
Questions?