Communicating Machine Features to Batch Jobs
GDB, April 6th 2011 (originally to WLCG MB, March 8th 2011)
June 13th 2012

Reminder

• The HEPiX Virtualisation Working Group proposed the creation of three files in /etc/machinefeatures, populated with machine information during virtual machine instantiation:
  – hs06: the HS06 rating of a single core
  – shutdowntime: timestamp for when the node is expected to be shut down
  – shutdowncommand: the full path to a command that can be invoked to arrange early termination of the machine
• Passing similar information to jobs on real machines is considered useful.
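As an illustration, a payload running inside such a VM could consume these files along the following lines. This is a minimal sketch, not part of the proposal: the file names are those listed above, while the value formats (plain text, with shutdowntime as a Unix timestamp) are assumptions.

import os
import time

MACHINEFEATURES = "/etc/machinefeatures"

def read_feature(name):
    # Return the stripped contents of a machine-features file, or None
    # if the file is absent or unreadable.
    try:
        with open(os.path.join(MACHINEFEATURES, name)) as f:
            return f.read().strip()
    except OSError:
        return None

hs06 = read_feature("hs06")            # HS06 rating of a single core
shutdowntime = read_feature("shutdowntime")

if shutdowntime:
    seconds_left = int(shutdowntime) - int(time.time())
    print(f"Node shuts down in {seconds_left} s (HS06/core: {hs06})")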

Proposal for real machines

• Physical machines, just like virtual machines, should have in /etc/machinefeatures the files:
  – hs06: the HS06 rating for a single core
  – shutdowntime: timestamp for when the node is expected to be rebooted (may be empty)
• Additionally, the batch system should provide each job with an environment variable, $JOBFEATURES, pointing to a directory with the files:
  – jobstart: timestamp for the start of job execution
  – cpu_limit: the allowed number of CPU seconds that the job may use. The value must be directly comparable against system-reported CPU consumption.
  – wall_limit: the allowed number of wall-clock seconds. The job will be terminated at time jobstart + wall_limit whether or not cpu_limit has been reached.

With thanks to Steve Traylen and Ulrich Schwickerath
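To make the intended use concrete, the sketch below shows how a job could use $JOBFEATURES to decide whether enough wall-clock budget remains for another unit of work. The variable and file names are those of the proposal; the plain-integer file format and the 10-minute margin are illustrative assumptions only.

import os
import time

def read_job_feature(name):
    # Read one value from the directory pointed to by $JOBFEATURES;
    # return None if the variable or file is missing.
    jobfeatures = os.environ.get("JOBFEATURES")
    if jobfeatures is None:
        return None
    try:
        with open(os.path.join(jobfeatures, name)) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None

jobstart = read_job_feature("jobstart")
wall_limit = read_job_feature("wall_limit")

if jobstart is not None and wall_limit is not None:
    # The job will be killed at jobstart + wall_limit, so stop taking
    # new work once the remaining budget drops below a safety margin.
    remaining = jobstart + wall_limit - int(time.time())
    if remaining < 600:  # hypothetical margin: one work unit takes ~10 min
        print("Less than 10 minutes of wall time left; draining gracefully")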

Implementation Status

• LSF
  – Initial prototype available in the SVN repository; initial tests suggest the implementation works. No firm date for deployment as yet, but possibly before the end of March.
    » Developed by Ulrich Schwickerath; $JOBFEATURES points to /var/tmp/jobfeatures/
    » mem_limit is also provided
• PBS
  – NIKHEF was asked to deliver an implementation; we are waiting to see whether this request will be accepted.
    » Accepted, and work has started to deliver the scripts.
• Other workload management systems
  – Which? Who?
  – GridEngine: IN2P3
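For orientation, the following is a rough, batch-system-agnostic sketch of what such a prolog could do; it is not the actual LSF or PBS prototype. The directory path follows the slide ($JOBFEATURES pointing to /var/tmp/jobfeatures/), while the limit values, which in a real implementation would come from the batch system's job record, are placeholders.

#!/usr/bin/env python3
import os
import time

JOBFEATURES = "/var/tmp/jobfeatures"

def write_feature(name, value):
    # Write one job-features value as a single line of plain text.
    with open(os.path.join(JOBFEATURES, name), "w") as f:
        f.write(str(value) + "\n")

os.makedirs(JOBFEATURES, exist_ok=True)
write_feature("jobstart", int(time.time()))  # start of job execution
write_feature("cpu_limit", 172800)           # placeholder: 48 h of CPU time
write_feature("wall_limit", 259200)          # placeholder: 72 h wall clock
# The batch system would then export JOBFEATURES=/var/tmp/jobfeatures
# into the job's environment.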

MB Conclusion – I

• The LSF implementation may have some CERN specifics. Other sites interested in this implementation are encouraged to contact the developer, Ulrich Schwickerath.
• NIKHEF clarified that it is not responsible for PBS within WLCG, but that, on a best-effort basis, a project plan will be submitted to a national project in the Netherlands; if accepted, NIKHEF will provide a PBS implementation.
• Compared to the data in the Information System, this proposal makes the data dynamically available at run time for the specific machine a job is running on.
• Whilst the proposal brings some value, making it compulsory for all batch systems was questioned. Tony Cass clarified that the principle had been agreed at the MB meeting on 28 September 2010 and that the goal of this presentation was to present a deployment plan.
• A WLCG discussion would be needed on the distribution mechanism once packages for the different batch systems are available.

MB Conclusion – II

• The MB agreed that the next step is for the experiments to test with the available implementations, LSF and PBS (if/when available). Based on their experience, a decision could then be made on extending the implementation to other batch systems. A technical discussion on this topic will be scheduled for the April GDB.

Michel's aim for today: have this tested…