HEPiX Spring 2009 Highlights
Stolen for use at HEPSYSMAN by Pete Gronbech, 30th June 2009
Michel Jouvin, LAL, Orsay
jouvin@lal.in2p3.fr, http://grif.fr
June 10, 2009, GDB, CERN
HEPiX at a Glance
- 15+ year-old informal forum of people involved in "fabric management" in HEP
  - No mandate from any other body
  - More sysadmins than IT managers
  - Open to anybody interested
  - http://www.hepix.org : archive of all meetings
- Main activity is a 1-week meeting twice a year
  - 60-100 attendees per meeting, "stable" core
- Focus: exchange of experience and technology review
  - Mix of large and small sites
  - Better understanding of each other; each benefits from the others
  - Most of the sites are involved in grid computing
- "On-demand" working groups
  - Currently: distributed file systems
10/6/2009 HEPiX Spring 2009 Highlights
Main Topics
- Main topics are always the same, but the main focus changes at each meeting
  - HEPiX's value is the periodic report on work in progress
  - Umeå: focus on virtualization
- Main tracks
  - Site reports: very valuable part; updates on changes give a "picture" of what's happening around the world
  - Scientific Linux status and future directions
  - Data centres: cooling, power consumption, ... Less active, projects in building phase
  - Storage: convened by the File Systems WG; LUSTRE growing; evolving as a storage-focused forum to share experience
  - Virtualization
  - Benchmarking: still active, new activities related to virtualization
  - Security and networking
Virtualization
- A track at each meeting for at least 2 years...
- Initial focus was mainly on service consolidation
- Coverage extended to virtualized environments for applications
  - Virtualized WNs
  - Integration with batch schedulers
  - Image management and resource allocation (OpenNebula)
  - Grids and clouds (e.g. StratusLab)
  - CERNVM: very minimal and generic VM approach
- Position of sites on VO-maintained images
  - "First" discussion rather than a decision: sites not comfortable, but no outright "no"
  - Feasibility of a limited number of images per VO? Part of the SW area?
  - GDB is the place for further discussions
- Benchmarking: CPU, I/O, Xen vs. KVM...
File Systems WG
- Set up 2 years ago with a mandate to review distributed file system technologies
  - Initial mandate from the former IHEPCCC
- Continued since on a voluntary basis, with 2 objectives
  - Benchmarking activities with more realistic and diverse use cases: LUSTRE outperforming (2x) all other solutions in every case so far
  - Share experience and expertise with new technologies, mainly LUSTRE in fact, as it is now used in several places
- New members joined: CERN (LUSTRE evaluation), FNAL, Lisbon
- Producing a report at each HEPiX meeting
  - New topic this meeting: a potential AFS + LUSTRE combination
Miscellaneous
- iSCSI evaluation at CERN: an alternative to FC?
- SLURM, an alternative to Torque (MAUI?); command-compatible with Torque
- First benchmarks of Nehalem-based machines: 50% improvement in power efficiency compared to Harpertown
- CERN R&D on network anomaly detection (CINBAD)
PDG Notes
- Troy Dawson gave an update on what SL6 will be like! Also Fermi STS, based on Fedora (good for desktops/laptops)
  http://indico.cern.ch/contributionDisplay.py?contribId=16&sessionId=15&confId=45282
- CERN: moving from CVS to SVN (long overlap); the new computer centre will be green (heat to be used for heating other buildings); continuous rounds of procurement, aiming for 50000 HEP-SPEC06 (~500 systems) in Oct 09; Skype now tolerated if correctly configured
  http://indico.cern.ch/contributionDisplay.py?contribId=28&sessionId=19&confId=45282
- Umeå: file servers have mirrored system disks on USB sticks, one internal, one external
PDG Notes 2
- Many sites reported problems with A/C cooking computers, with surprisingly few failures
- A few sites, e.g. LAL and Oxford, have installed SL5 WNs; LAL reported problems with PBSPro
- TRIUMF uses Bacula for backups
- One observation that Intel servers lasted longer than AMD over a 4-year period
- CERN security talk good
  http://indico.cern.ch/contributionDisplay.py?contribId=40&sessionId=5&confId=45282
PDG Notes 3
- Benchmarking from CERN
  http://indico.cern.ch/contributionDisplay.py?contribId=29&sessionId=23&confId=45282

  Platform  Processor            HEP-SPEC06
  Baseline  2 x L5420 2.5 GHz     67.75
  1         2 x E5530 2.4 GHz    100.17
  2         2 x E5540 2.53 GHz   103.22
  3         2 x 2376HE 2.3 GHz    70.47

  Platform  OS    Compiler   HEP-SPEC06
  1         SLC5  gcc 4.1.2  100.17
  1         SLC4  gcc 3.4.6   84.88
  2         SLC5  gcc 4.1.2  103.22
  2         SLC5  gcc 4.3    106.45
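As a quick sanity check on the first table, the relative throughput gains over the Harpertown baseline can be computed directly from the slide's numbers (a sketch; note this is raw HEP-SPEC06 per node, not the per-watt efficiency figure quoted in the Miscellaneous slide, since the table gives no power draw):

```python
# HEP-SPEC06 scores copied from the CERN benchmarking table above.
scores = {
    "Baseline: 2 x L5420 2.5 GHz": 67.75,
    "1: 2 x E5530 2.4 GHz": 100.17,
    "2: 2 x E5540 2.53 GHz": 103.22,
    "3: 2 x 2376HE 2.3 GHz": 70.47,
}

baseline = scores["Baseline: 2 x L5420 2.5 GHz"]
for name, hs06 in scores.items():
    gain = (hs06 / baseline - 1) * 100  # percent change vs. the L5420 baseline
    print(f"{name}: {hs06:.2f} HS06 ({gain:+.1f}% vs baseline)")
```

The two Nehalem platforms come out roughly 48-52% ahead of the baseline in raw throughput; any power-efficiency comparison would additionally depend on measured wattage.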
PDG Notes 4
- 10G card tests (require some work)
  http://indico.cern.ch/contributionDisplay.py?contribId=24&sessionId=9&confId=45282
- CERN LUSTRE evaluation talk good
  http://indico.cern.ch/contributionDisplay.py?contribId=14&sessionId=10&confId=45282
- Benchmarking of VMs: talks from INFN and Victoria showed excellent CPU performance, but the interesting finding is slow I/O
  http://indico.cern.ch/contributionDisplay.py?contribId=5&sessionId=7&confId=45282
Conclusions
- HEPiX is a very "useful" forum, open to any interested site
  - Complementary to GDB: focused on fabric management rather than grid services
  - No formal membership: just register for the meeting...
- Next meeting in Berkeley, October 26-30th
  - Look for the announcement soon at http://www.hepix.org
  - Ask to be registered on the HEPiX mailing list (low volume...)
  - New track on monitoring tools and practices?
- Material produced is mainly the presentations given during the workshops
  - Look at the agendas if interested in a presentation; start at http://www.hepix.org