HEPiX Autumn Meeting 2014, University of Nebraska, Lincoln. Arne Wiebalck, Liviu Valsan, Borja Aparicio: HEPiX Autumn 2014 Summary



HEPiX: a global organization of service managers and support staff providing computing facilities for the HEP community. Participating sites include BNL, CERN, DESY, FNAL, IN2P3, INFN, NIKHEF, RAL, TRIUMF and others. Meetings are held twice per year (Spring: Europe; Autumn: U.S./Asia). Reports cover status, recent work, work in progress and future plans; usually no showing-off, but an honest exchange of experiences.

Outline: Autumn Meeting & HEPiX News; Site Reports; End User Services & OS; Grids, Clouds, and Virtualization; Storage and File Systems; Computing and Batch; IT Facilities; Networking and Security; Basic IT Services. (Presented by Arne, Liviu and Borja.)

HEPiX Autumn 2014: Oct 13 – 17, 2014 at the University of Nebraska Lincoln. Well organized, rich program; Eduroam, Indico (intervention, incident, power cut). 93 registered participants, many first-timers again; 6/8 US-CMS Tier-2 sites and 2/5 US-ATLAS Tier-2 sites attended; 45 sites represented; 60 contributions. One talk had 96 slides (in 25 minutes!), … words per slide …

Lincoln, Nebraska: about 22 hours door to door …


HEPiX News: Tony Wong (BNL) is the new HEPiX co-chair (3-year term). Next meetings: Spring 2015: Oxford (UK), March 23 – 27; Autumn 2015: BNL (US), Oct 12 – 16; Spring 2016: DESY Zeuthen (DE), Berlin/Potsdam (TBC).

HEPiX Working Groups: IPv6: deployment/readiness following the Tier structure; experiments pushing for services at T1/T2. Benchmarking: awaiting SPEC CPUv6; suggestion of a "fast" benchmark (running in minutes).

Site Reports: 15 site reports: the T0, 7 T1s, 7 T2s. The (move to) HTCondor is still very visible: a talk from the HTCondor team; INFN (on LSF now) will start an evaluation. KIT's "Dropbox": bwSync&Share, 8,000 users, based on PowerFolder. Ganeti in use at multiple sites: a VM cluster management tool from Google; overall positive experience.

Site Reports (cont.): Ceph is still gaining momentum: many PoCs (RAL: 1 PB, BNL: 3 PB); vivid mail exchange; a BoF session in Oxford? Energy efficiency: no working group, but many activities (refurbishments); "energy accounting" discussions. INFN is still investigating micro-server options (Moonshot and other Avoton-based solutions); experiments seem fine with the performance/power ratio. During a "dark data" cleanup, NDGF deleted all ALICE tape data due to a misunderstanding of what "NDGF data" means (ALICE::NDGF vs. ALICE::NDGF_tape); 200 TB of data are now being backfilled …


CERN Site Report: "What about CERN?" "Are there ever power cuts at CERN?"

End User Services & OS: six talks in total, three from CERN (Thomas: CC7; Borja: issue tracking and VCS; Michail: FTS3). Scientific Linux / CentOS: the FNAL SL team continues to provide Scientific Linux; no competition with other rebuilds; rebuilding from git.centos.org is difficult (as it is not supported). So, after the initial discussions at the Annecy meeting, the community seems to part ways …

Virtualization: six talks in total, five from CERN (Laurence: the experiments' cloud computing adoption; Andrea: WLCG monitoring; Helge: volunteer computing; Arne: cloud report, VM I/O performance). RAL is starting batch virtualization ("burst batch into the cloud"): a successful PoC integrating the vacuum model with HTCondor. GSI: MS Windows on KVM: Windows domain restructuring, all on VMs, all on KVM; partly in production (CA, TS), partly in testing (DC, Exchange); no support issues.


Storage and Filesystems: ten talks in total, five from CERN: Luca: EOS across 1000 km, and CERNbox + EOS: cloud storage for science; Andrea: DPM performance tuning hints for HTTP/WebDAV and Xrootd; Ruben: experience in running relational databases on clustered storage; Liviu: SSD benchmarking at CERN.
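The SSD benchmarking talk's actual methodology is not detailed in this summary; as a hedged illustration of the genre, a minimal random-read latency micro-benchmark could look like the sketch below. File size, block size and read count are arbitrary choices for illustration, not CERN's test parameters, and reads served from the page cache make the result optimistic (a real test would use O_DIRECT or a raw device).

```python
import os
import random
import tempfile
import time

def random_read_benchmark(path, file_size=4 * 1024 * 1024, block=4096, reads=1000):
    """Time `reads` random `block`-sized reads from `path`; return mean latency in microseconds."""
    fd = os.open(path, os.O_RDONLY)
    try:
        offsets = [random.randrange(0, file_size - block) for _ in range(reads)]
        start = time.perf_counter()
        for off in offsets:
            os.pread(fd, block, off)          # positioned read, no seek syscall
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed / reads * 1e6

# Create a scratch file to benchmark against.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(4 * 1024 * 1024))
    name = f.name
print(f"mean read latency: {random_read_benchmark(name):.1f} us")
os.remove(name)
```

Varying the block size and queue depth (here: one outstanding read) is what separates a latency test from a throughput test.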

OpenZFS on Linux: OpenZFS offers a large set of features and is independent of the Linux kernel. LLNL: three Lustre filesystems, ~100 PB, with an OpenZFS backend; moving to commodity JBODs; work ongoing to improve Linux boot time with a large number of drives.

Ceph-Based Storage Systems for RACF: a deployment of the same scale as at CERN; lots of performance and stability tests; object storage, block storage and file system (CephFS); on several platforms (including HP Moonshot); different networking solutions.

Using XRootD to Minimize Hadoop Replication: Hadoop replication via XRootD allowed reducing local Hadoop replication to 1. In case of corrupt local blocks: request the blocks via XRootD, cache them locally, and repair the broken blocks locally in Hadoop.
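The repair flow described above can be sketched as a toy simulation. The class and method names below are illustrative, not the actual implementation from the talk, and a real deployment would fetch blocks through an XRootD client over the federation rather than a dict lookup:

```python
import hashlib

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class LocalBlockStore:
    """Stands in for a local HDFS with replication factor 1 (hypothetical API)."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)                       # block_id -> data
        self.expected = {b: checksum(d) for b, d in blocks.items()}

    def verify_and_repair(self, block_id, fetch_remote):
        data = self.blocks.get(block_id)
        if data is not None and checksum(data) == self.expected[block_id]:
            return data                                  # local copy is healthy
        # Corrupt or missing: request the block via the federation (XRootD),
        # cache it, and repair the local replica in place.
        data = fetch_remote(block_id)
        assert checksum(data) == self.expected[block_id], "remote copy bad too"
        self.blocks[block_id] = data                     # repaired locally
        return data

# Toy "remote site" reachable via the federation.
remote = {"blk-1": b"event data", "blk-2": b"more events"}
store = LocalBlockStore(remote)
store.blocks["blk-2"] = b"bit-rotted"                    # simulate local corruption
assert store.verify_and_repair("blk-2", remote.get) == b"more events"
```

The point of the scheme is that the federation acts as the extra replica, so local disk is not spent on a second copy.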

Computing and Batch Systems: six talks in total, one from CERN: two presentations on benchmarking, four on batch systems.

Benchmarking activities: Intel Xeon E v3 (Haswell) showing good performance; Intel Avoton: very good HS06/Watt ratio; ARM 32-bit: HS06/Watt in between Xeon and Avoton.
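Performance-per-Watt rankings like the one above reduce to a simple ratio. The numbers below are placeholders chosen purely to show the arithmetic and reproduce the reported ordering; they are NOT measured HS06 scores or power figures from the talks:

```python
# Hypothetical (HS06 score, Watts) pairs -- placeholder values only.
platforms = {
    "Xeon (Haswell)": (400.0, 200.0),
    "Avoton":         (60.0, 25.0),
    "ARM 32-bit":     (32.0, 15.0),
}

def hs06_per_watt(score, watts):
    """Efficiency metric used in the comparison: benchmark score per Watt."""
    return score / watts

ranked = sorted(platforms, key=lambda p: hs06_per_watt(*platforms[p]), reverse=True)
for name in ranked:
    print(f"{name}: {hs06_per_watt(*platforms[name]):.2f} HS06/W")
```

With these placeholder figures the ordering matches the slide: Avoton best, ARM 32-bit in between, Xeon last, even though the Xeon has by far the highest absolute score.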

Fast Benchmark: some requirements are clear: open source, easy to run, small. Other requirements are not so clear: how fast? reproducible? reliable? single-core or multi-core?

Fast Benchmark Proposals: a Geant4-based benchmark: Linux x86-64 & ARM, realistic detector geometry, footprint 1/4 to 1/3 of a real experiment, CPU-bound, no I/O. The LHCb fast benchmark: a small Python script, single-threaded.
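The LHCb proposal is a small single-threaded Python script; as an illustration of the genre (not the actual LHCb code), a seeded, CPU-bound kernel that scores in events per second might look like this:

```python
import math
import random
import time

def fast_benchmark(events=50_000, seed=42):
    """Tiny CPU-bound kernel; the fixed seed makes the workload reproducible."""
    rng = random.Random(seed)
    start = time.perf_counter()
    acc = 0.0
    for _ in range(events):
        # Stand-in for per-event physics: transcendental math, no I/O.
        x = 1.0 - rng.random()                 # in (0, 1], safe for log()
        y = rng.random()
        acc += math.sqrt(-2.0 * math.log(x)) * math.cos(2.0 * math.pi * y)
    elapsed = time.perf_counter() - start
    return events / elapsed, acc               # (score, checksum of the work done)

score, _ = fast_benchmark()
print(f"score: {score:.0f} events/s")
```

Returning a checksum of the computed values alongside the score guards against a clever interpreter or compiler optimizing the workload away, which matters for any benchmark meant to be comparable across machines.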

Next-generation HEP-SPEC06: the next SPEC CPU benchmark (CPUv6) is in beta and should be released before the end of the year. It will probably not run with the default SLC6 compiler; GCC on CentOS 7 should be fine, and a config file will be provided by GridKa.

Batch Systems: all four talks about HTCondor: two talks from developers; Jérôme's talk: HTCondor at CERN; the Open Science Grid is adopting HTCondor.

IT Facilities and Business Continuity: three talks, two from CERN: first experience with the Wigner Data Centre; joint procurement of IT equipment and services. UPS monitoring with Sensaphone: multi-level / SMS alerting; gradual shutdown of servers in case of a power cut or cooling failure; wireless temperature sensors used to build a 3D heat map.

NERSC: new Computational Research and Theory (CRT) building; year-round free air and water cooling; PUE < 1.1; 42 MW to the building, 12.5 MW provisioned.


Networking and Security: four networking talks, two on security, one from CERN (Stefan: Situational Awareness: Computer Security). IPv6 deployment: the HEPiX IPv6 Working Group on WLCG dual-stack services deployment and testing; Open Science Grid: are client and server dual-stack? (the server is, but not the client?). Infiniband-based networking evaluation at Brookhaven National Laboratory (USA). ESnet: extension to Europe (US Department of Energy): "Scientific progress will be completely unconstrained by the physical location of instruments, people, computational resources or data."
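The "is the client/server dual-stack?" question raised in the OSG talk can be probed at its simplest with a name lookup; a hedged sketch follows (real readiness testing, as done by the working group, checks actual service connectivity over each protocol, not just DNS):

```python
import socket

def address_families(host):
    """Report whether `host` resolves to IPv4 and/or IPv6 addresses."""
    found = {"ipv4": False, "ipv6": False}
    try:
        for family, *_ in socket.getaddrinfo(host, None):
            if family == socket.AF_INET:
                found["ipv4"] = True
            elif family == socket.AF_INET6:
                found["ipv6"] = True
    except socket.gaierror:
        pass                      # unresolvable host: neither family
    return found

print(address_families("localhost"))
```

A dual-stack service would report both families True; a v4-only client environment fails the test from the other side, which is exactly the asymmetry the talk flagged.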

Basic IT Services 1/2: seven talks, three from CERN: Ben: Configuration Services at CERN: Update; Rubén: Database on Demand: insight into how to build your DBaaS; Aris: the Ermis service for DNS load balancer configuration. Monitoring with Nagios: NERSC (US Department of Energy) monitoring clusters of thousands of compute nodes.

Basic IT Services 2/2: CFEngine at the ATLAS Great Lakes Tier-2 (AGLT2): change management via SVN → push to production. Puppet at USCMS-T1 (Fermilab): modules + data-in-Hiera approach; Puppet Dashboard instead of The Foreman; change management via Git branches → push to production; continuous integration? not yet, but Beaker is the main candidate; secrets? "hiera-eyaml", not a good solution. Puppet at BNL (RHIC and ATLAS Computing Facility): emphasis on change management and cultural management; test environments plus a self-approve delay; looking for automatic testing.