HEPiX Report
Helge Meinhard, David Gutierrez, Jérôme Belleman / CERN-IT
Technical Forum/Computing Seminar, 16 November 2012

Outline
– Meeting organisation; site reports; computing; miscellaneous (Helge Meinhard)
– Security and networking; storage (David Gutierrez)
– IT infrastructure; grids, clouds and virtualisation (Jérôme Belleman)

HEPiX
– Global organisation of service managers and support staff providing computing facilities for HEP
– Covering infrastructure and all platforms of interest (Unix/Linux, Windows, Grid, …)
– Aim: present recent work and future plans, share experience, advise managers
– Meetings ~2 per year (spring in Europe, autumn typically in North America)

HEPiX Autumn 2012 (1)
Held 15 – 19 October at the Institute of High Energy Physics (IHEP) of the Chinese Academy of Sciences, Beijing, People's Republic of China
– 1300 staff, 400 students, 120 M$ budget
– BEPC II accelerator, BES III detector; member of Belle II, CMS, ATLAS; neutrino experiments
– Particle astrophysics, theory, synchrotron lab
– Tier-2 centre in LCG for ATLAS and CMS
Excellent local organisation
– Gang Chen and his team made the meeting run very smoothly
– Network including Wi-Fi, video conferencing (Vidyo – 4 remote presentations), … all working like a charm
– Beijing: growing and changing at an incredible speed; cars have almost entirely replaced bicycles…
Sponsored by Huawei and Western Digital

HEPiX Autumn 2012 (2)
Format: pre-defined tracks with conveners and invited speakers per track
– Rich, interesting and packed agenda; contrary to last time, Silverman's law applied once more – the agenda was full, but not overcrowded
– Judging by the number of submitted abstracts, good balance between tracks: IT infrastructure (12 talks), network and security (11), computing (8), grids/clouds/virtualisation (7), storage and file systems (7), miscellaneous (4)… plus one BoF session (on batch systems) and 11 site reports
Full details and slides:
Trip report by Alan Silverman available, too

HEPiX Autumn 2012 (3)
67 registered participants, of which 9/10 from CERN
– CERN: Belleman, Cass, Fedorko, Grzywaszewski, Gutierrez, Lopienski, Meinhard, Salter, (Silverman,) Traylen
– 20 from Asia, 39 from Europe, 6 from USA, 2 from Australia
– Plus some more colleagues from IHEP
Representing 27 institutes, 2 sponsors
– 9 from Asia, 15 from Europe, 2 from USA, 1 from Australia
– 2 worldwide sponsor companies
Compare with Prague (spring 2012): 97 participants, of which 12/13 from CERN; Vancouver (autumn 2011): 98 participants, of which 10/11 from CERN

HEPiX Autumn 2012 (4)
60 talks, of which 13 from CERN
– Compare with Prague: 74 talks, of which 22 from CERN
– Compare with Vancouver: 55 talks, of which 15 from CERN
Next meetings:
– Spring 2013: CNAF, Bologna, Italy, 15 – 19 April; topics may include batch systems, energy efficiency, network monitoring, Windows 8
– Autumn 2013: U Michigan, Ann Arbor, US, 28 October – 01 November
– Spring 2014: interest by LAPP Annecy (to be confirmed)

Site reports (1): Hardware
Only few details this time
CPU servers: same trends – …-core dual-CPU servers, … GB/core. Typical chassis: 2U Twin2; some A-brand blades (one failing blade has taken an entire chassis down)
Disk storage
– External disk enclosures gaining popularity; 4U trays with 48…60 drives becoming popular
– No positive indication that SAS nearline is taking off
– A-brand's extension disk tray has got firmware…
– IBM SONAS

Site reports (2): Hardware (cont'd)
Tapes
– An increasing number of sites mentioned T10kC in production
– LTO popular, many sites investigating (or moving to) LTO-5; some migration from LTO to T10kC
HPC
– InfiniBand still popular; two large clusters at GSI, one for computing, one for a parallel file system (Lustre)
– 10 GbE ramping up
Odds and ends: suppliers going bust

Site reports (3): Software
Storage
– CVMFS now a standard service – only minor issues
– Increasing interest in NFS interfaces for dCache and DPM
– Lustre mentioned often – works well with controlled use cases and new, homogeneous hardware, but issues with some use cases and older hardware
– Enstore/dCache: small-file aggregation in production at FNAL (see the sketch below)
OS
– GSI moving from a flat to a hierarchical Windows domain (domain controllers on VMs); LAL has completed the move of its Windows domain into the IN2P3 "forest"
Mail/calendaring services
– Exchange 2003 and/or Lotus to Exchange 2010 (FNAL: 3'000 accounts total)
– DESY considering alternative solutions to replace Exchange 2003: Open-Xchange, Zimbra, Zarafa
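Small-file aggregation, as FNAL runs it in Enstore/dCache, packs many small files into larger containers before they go to tape, so the drives stream instead of stopping and repositioning for every file. The following is a minimal Python sketch of the idea only, not FNAL's actual implementation; the size threshold and target container size are invented parameters.

```python
import os
import tarfile

SMALL_FILE_LIMIT = 100 * 1024**2   # hypothetical: files below 100 MB are "small"
TARGET_AGGREGATE = 5 * 1024**3     # hypothetical: aim for ~5 GB per container

def aggregate_small_files(src_dir, out_dir):
    """Pack small files into tar containers of roughly TARGET_AGGREGATE bytes."""
    batch, batch_size, seq = [], 0, 0
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        size = os.path.getsize(path)
        if size >= SMALL_FILE_LIMIT:
            continue  # large files go to tape individually
        batch.append(path)
        batch_size += size
        if batch_size >= TARGET_AGGREGATE:
            seq = flush(batch, out_dir, seq)
            batch, batch_size = [], 0
    if batch:
        flush(batch, out_dir, seq)

def flush(batch, out_dir, seq):
    """Write one container; the tape system then stores it as a single file."""
    container = os.path.join(out_dir, "aggregate-%06d.tar" % seq)
    with tarfile.open(container, "w") as tar:
        for path in batch:
            tar.add(path, arcname=os.path.basename(path))
    return seq + 1
```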

Site reports (4): Software (cont'd)
Virtualisation
– Most sites experimenting with KVM
– Some use of VMware (and complaints about its cost…) and of Hyper-V
– Australia: migrating from KVM to Citrix
– Most sites run critical services on VMs
Clouds
– OpenStack
– OpenNebula
Miscellaneous: DokuWiki, Redmine, git
Configuration management
– Puppet seems to be the clear winner, still on the rise (see the sketch below)
– Chef, Quattor used as well
– Declining interest in cfengine (3)
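What all of these tools share is the declarative, idempotent model: the administrator describes the desired state, and an agent repeatedly converges each machine towards it rather than running one-shot scripts. A toy Python sketch of that convergence loop follows; it is purely illustrative (Puppet itself uses its own manifest language and a resource abstraction layer), and the paths and contents are made up.

```python
import os

# Desired state, declared as data rather than as imperative steps
DESIRED = [
    {"type": "directory", "path": "/var/lib/myservice", "mode": 0o750},
    {"type": "file", "path": "/etc/myservice.conf",
     "content": "port = 8080\n", "mode": 0o644},
]

def apply(resource):
    """Converge one resource; do nothing if it already matches the desired state."""
    path = resource["path"]
    if resource["type"] == "directory":
        if not os.path.isdir(path):
            os.makedirs(path)
        os.chmod(path, resource["mode"])
    elif resource["type"] == "file":
        current = open(path).read() if os.path.exists(path) else None
        if current != resource["content"]:
            with open(path, "w") as f:
                f.write(resource["content"])
        os.chmod(path, resource["mode"])

for res in DESIRED:
    apply(res)  # safe to run repeatedly: each run is idempotent
```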

Site reports (5): Software (cont'd), Infrastructure
Monitoring
– Some sites migrating from Nagios to Icinga, one site considering Zenoss (a minimal check-plugin sketch follows below)
– Ganglia used frequently for performance monitoring
– perfSONAR being deployed everywhere
Infrastructure
– A number of upgrade projects (IHEP from 800 kW to 1'800 kW)
– GSI: Cube prototype working fine even at 32 °C outside
– RAL: switch gear in the power supply line being replaced, higher risk until end of November
– LAL: major chiller failure
– FNAL: during a hot summer, had to throttle down major services
– DESY: during power supply maintenance, batteries on full load – some exploded, acid on the floor… resulting in an extended power outage
– DESY: 20 kUSD network line card destroyed by concrete dust from drilling
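One reason Nagios-to-Icinga migrations are comparatively painless is that both consume the same plugin convention: a check is any executable that prints one status line (optionally with performance data after a '|') and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). A minimal sketch of such a plugin in Python; the load-average check and its thresholds are illustrative choices, not taken from the talks.

```python
#!/usr/bin/env python
"""Minimal Nagios/Icinga-style check: 1-minute load average vs. thresholds."""
import os
import sys

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
WARN_AT, CRIT_AT = 8.0, 16.0  # illustrative thresholds

def main():
    try:
        load1, _, _ = os.getloadavg()
    except OSError:
        print("UNKNOWN - cannot read load average")
        return UNKNOWN
    # Status line plus optional performance data after the '|'
    perfdata = "load1=%.2f;%s;%s" % (load1, WARN_AT, CRIT_AT)
    if load1 >= CRIT_AT:
        print("CRITICAL - load %.2f | %s" % (load1, perfdata))
        return CRITICAL
    if load1 >= WARN_AT:
        print("WARNING - load %.2f | %s" % (load1, perfdata))
        return WARNING
    print("OK - load %.2f | %s" % (load1, perfdata))
    return OK

if __name__ == "__main__":
    sys.exit(main())
```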

Computing: Batch systems (1)
8 talks, plus a BoF session
Site reports: Torque/Maui for small to medium-size installations; PBS Pro; Grid Engine; Slurm (mentioned 3 times)
FNAL: HTCondor since 2002 for part of their facilities (CDF)
– Many features added at FNAL's request
– Main scalability concern is the condor_schedd: single-threaded, supports up to 30 k simultaneous jobs now, goal is 150 k
CERN: LSF: large installation, heterogeneous user base, 400 k jobs per day
– Issues: slow response to queries and submissions, slow dispatching, fair-share scheduling, complex setup, poorly dynamic, limited scalability (a sketch of the fair-share idea follows below)
– Targeting 12'000 physical nodes, 300'000 job slots
– Currently looking at Slurm, GE, Condor, LSF 8
– Recent work on monitoring and accounting
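Fair-share scheduling, one of the pain points CERN reports with LSF, ranks users by comparing their allocated share of the cluster against a half-life-decayed record of their past usage; Slurm's classic algorithm, for instance, computes a fair-share factor of 2^(-usage/share). A toy Python sketch of that general idea, with invented users and numbers:

```python
HALF_LIFE = 7 * 24 * 3600.0  # one-week half-life for historical usage

def decayed(usage, age_seconds):
    """Old consumption counts exponentially less over time."""
    return usage * 0.5 ** (age_seconds / HALF_LIFE)

def fairshare_factor(usage_fraction, share_fraction):
    """Slurm-style classic fair-share factor 2^(-U/S): 1.0 with no usage,
    0.5 when usage exactly matches the allocated share, ~0 for heavy over-use."""
    return 2.0 ** (-usage_fraction / share_fraction)

# Hypothetical users: (allocated share, raw usage fraction, age of that usage)
users = {
    "alice": (0.50, 0.10, 0),
    "bob":   (0.30, 0.45, 3 * 24 * 3600),  # bob's usage is three days old
    "carol": (0.20, 0.20, 0),
}

# Jobs of users with the highest factor would be dispatched first
ranked = sorted(users.items(),
                key=lambda kv: -fairshare_factor(decayed(kv[1][1], kv[1][2]),
                                                 kv[1][0]))
for name, (share, usage, age) in ranked:
    f = fairshare_factor(decayed(usage, age), share)
    print("%-6s fairshare=%.3f" % (name, f))
```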

Computing: Batch systems (2)
KIT: 1000 nodes, split into two PBS instances due to PBS limitations
– Tested Torque/Maui, GE; selected Univa GE
– Migration started in July, to finish in December
– GE: learning curve… but stable, flexible, with good support
IN2P3-CC: migration to Oracle GE completed in December 2011
– A lot of interfacing done by IN2P3
– Shadow master abandoned due to instabilities
– Difficult to get job information; no native grid support
– Oracle support not brilliant; difficult to get in contact with developers; no road map for GE; only serious bugs got fixed
– Getting in touch with Univa

Computing: Batch systems (3)
DESY Zeuthen: added Certificate Security Protocol (CSP) support to UGE
NDGF: Slurm experience very positive, easier and more stable than its predecessors (Torque/Maui)
– Defaults often not adequate, tuning needed
INFN Bari: testing Slurm (a sketch of such a scale test follows below)
– Tested a long list of functionalities
– Scheduling powerful, but can be improved by using the MOAB or LSF scheduler
– No RPM; no way to transfer output back to the submission host
– Rather steep learning curve
– Tests with 6'000 cores and 100'000 jobs all successful, very moderate load on the master
– Grid integration (CREAM CE) progressing
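A scale test like INFN Bari's 100'000-job run can be scripted entirely against Slurm's standard command-line tools. The sketch below illustrates the pattern (submit with sbatch, poll with squeue) and is not the actual harness used at Bari; the job count and partition name are placeholders.

```python
#!/usr/bin/env python
"""Minimal Slurm scale-test sketch: submit N trivial jobs, poll until done."""
import getpass
import subprocess
import time

N_JOBS = 1000        # illustrative; the Bari tests went up to 100'000 jobs
PARTITION = "test"   # hypothetical partition name

def submit(i):
    """Submit one trivial job; sbatch replies 'Submitted batch job <id>'."""
    out = subprocess.check_output(
        ["sbatch", "--partition", PARTITION,
         "--job-name", "scale-%06d" % i,
         "--wrap", "sleep 60"]).decode()
    return int(out.split()[-1])

job_ids = [submit(i) for i in range(N_JOBS)]
print("submitted %d jobs" % len(job_ids))

# Poll the queue until all of this user's jobs have drained
while True:
    out = subprocess.check_output(
        ["squeue", "--noheader", "--user", getpass.getuser(),
         "--format", "%i"]).decode()
    remaining = len(out.split())
    if remaining == 0:
        break
    print("%d jobs still queued or running" % remaining)
    time.sleep(30)
print("all jobs completed")
```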

Miscellaneous
CERN mobile Web site