Developments in Batch and the Grid

HEPiX Autumn Meeting 2005 (13th October 2005)
Tim Bell, CERN/IT/FIO

Plugins Community Support

Problem:
- Grid batch plugins are very basic
- Mappings are incomplete or inaccurate
- Testing is difficult for Grid developers

Solution:
- Develop a community support structure
- LRMS administrators support the code
- LCG CVS/Savannah are available to assist

HEPiX Batch Web Pages

- Started using the input from the HEPiX Spring meeting
- Hosted at Caspur: http://hepix.caspur.it/afs/hepix.org/project/batch/index.html
- Pages:
  - Site-specific information: batch system, contacts, overview presentation
  - Product information: the batch providers' web sites
  - Grid developments and requirements: what is ongoing? what is needed?
- Please check that the information is complete and correct, and send changes to me
- Check for any new grid requirements in the batch area

HEPiX Batch Systems by Site
(table of sites and their batch systems; not reproduced in this transcript)

GLUE 1.2 Improvements

- Slots rather than CPUs
- Per-VO views: response times, free slots (see the sketch below)
- Queue state: Open / Draining / Closed
- Sub-cluster concept introduced, but with no link to the batch section of the schema
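
To make the per-VO idea concrete, the following is a minimal Python sketch of the kind of information a site could publish for each VO under GLUE 1.2: slots rather than CPUs, free slots, an estimated response time and a queue state. The class and field names are simplified stand-ins, not the literal GLUE attribute names.

```python
# Illustrative only: a simplified model of a GLUE 1.2 per-VO view.
# Field names are stand-ins, not the literal GLUE schema attributes.
from dataclasses import dataclass

@dataclass
class VOView:
    vo: str
    total_slots: int               # slots, not physical CPUs
    free_slots: int
    waiting_jobs: int
    estimated_response_time: int   # seconds
    queue_state: str               # "Open", "Draining" or "Closed"

views = [
    VOView("atlas", total_slots=800, free_slots=120, waiting_jobs=30,
           estimated_response_time=600, queue_state="Open"),
    VOView("lhcb", total_slots=400, free_slots=0, waiting_jobs=250,
           estimated_response_time=7200, queue_state="Draining"),
]

# A broker could rank candidate sites per VO, e.g. preferring open queues
# with free slots and a short estimated response time.
best = min((v for v in views if v.queue_state == "Open"),
           key=lambda v: (v.free_slots == 0, v.estimated_response_time))
print(best.vo)  # -> atlas
```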

GLUE – Improved ERT (Estimated Response Time)

- The old calculation was based on the number of waiting jobs and the wall clock time limit of the grid queues
- It did not take into account group priorities, or the fact that most jobs finish early
- As a result, big sites became unattractive very quickly
- The new calculation is based on the waiting time of jobs in the queue, averaged over the jobs of your VO (see the sketch below)
- If there are free slots and no waiting jobs for the VO, the ERT is immediate
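
The sketch below shows the averaging rule just described, assuming a simple job record with a VO, a state and a submit time; it is not the actual implementation, only an illustration of the idea.

```python
# A minimal sketch (not the production code) of the new ERT rule:
# average the time the VO's waiting jobs have already spent in the queue,
# and report an immediate ERT when the VO has free slots and nothing waiting.
import time

def estimated_response_time(jobs, vo, free_slots, now=None):
    """jobs: iterable of dicts with 'vo', 'state' and 'submit_time' (epoch seconds)."""
    now = now if now is not None else time.time()
    waiting = [j for j in jobs if j["vo"] == vo and j["state"] == "waiting"]
    if free_slots > 0 and not waiting:
        return 0.0                     # free slots, no waiting jobs: immediate
    if not waiting:
        return None                    # no data for this VO; caller picks a default
    return sum(now - j["submit_time"] for j in waiting) / len(waiting)
```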

GLUE – CERN ERT – Results #1
(plot not reproduced in this transcript)

GLUE – CERN ERT – Results #2
(plot not reproduced in this transcript)

GLUE – ERT Implementation

- Implemented by Jeff Templon (NIKHEF) with input from Laurence Field (CERN)
- Common front-end program driven by a text listing of jobs, VOs, submit times, start times, etc. (an assumed example of such a listing is sketched below)
- Back ends developed for LSF and PBS: a PBS RPM is available from Jeff, and the LSF RPM is under test at CERN
- Volunteers for other batch systems?
- Sites need to implement this soon so that NIKHEF does not get sent all the grid jobs
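
The slide does not give the exact input format of the front-end program, so the column layout below is only an assumed example; it simply turns such a listing into job records that the ERT sketch above could consume.

```python
# Hypothetical input format: one job per line with
#   job_id  vo  state  submit_time  start_time   ("-" if not started yet)
def parse_job_listing(lines):
    jobs = []
    for line in lines:
        job_id, vo, state, submit_time, start_time = line.split()
        jobs.append({"id": job_id, "vo": vo, "state": state,
                     "submit_time": float(submit_time),
                     "start_time": None if start_time == "-" else float(start_time)})
    return jobs

sample = ["1001 atlas waiting 1129190000 -",
          "1002 atlas running 1129189000 1129189500"]
print(parse_job_listing(sample)[0]["state"])  # -> waiting
```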

GLUE V2 Requirements

- Work on GLUE V2 starts in November; requirements are to be submitted
- Slot information per sub-cluster (e.g. how many free slots on machines with more than 4 GB RAM running SLC3), as sketched below
- CPU units based on SPEC benchmarks
- Contact Laurence.Field@cern.ch
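
As an illustration of the sub-cluster requirement, the small sketch below answers the example question from the slide; the WorkerNode structure and its fields are hypothetical, not part of any published schema.

```python
# Hypothetical worker-node inventory used to answer queries such as
# "how many free slots on machines with more than 4 GB RAM running SLC3?"
from dataclasses import dataclass

@dataclass
class WorkerNode:
    os: str
    ram_mb: int
    free_slots: int

def free_slots_matching(nodes, min_ram_mb, os_name):
    return sum(n.free_slots for n in nodes
               if n.ram_mb > min_ram_mb and n.os == os_name)

nodes = [WorkerNode("SLC3", 2048, 2),
         WorkerNode("SLC3", 8192, 4),
         WorkerNode("SLC4", 8192, 6)]
print(free_slots_matching(nodes, 4096, "SLC3"))  # -> 4
```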

Grid Job Submission

- The Resource Broker / Compute Element / BLAHP chain is losing information: the local batch job receives no parameters from the user's JDL
- Requirements have been submitted to pass through (see the sketch below):
  - job name, CPU time, wall clock time
  - total RAM, swap space, temporary disk space
  - specific operating system, processor speed
- The target is to be able to move away from per-VO job queues
- Send requirements to Maarten.Litmaath@cern.ch
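
The sketch below shows what a less lossy chain could do in principle: translate a few of the user's JDL requirements into local batch submission options instead of dropping them. The JDL field names and the LSF option mapping are illustrative assumptions; actual option semantics and units depend on the site's batch configuration.

```python
# Hypothetical mapping from JDL-style requirements to an LSF submission
# command line. Field names and units are assumptions for illustration only.
def jdl_to_bsub(jdl):
    args = ["bsub"]
    if "JobName" in jdl:
        args += ["-J", jdl["JobName"]]
    if "WallClockTime" in jdl:      # assumed to be in minutes
        args += ["-W", str(jdl["WallClockTime"])]
    if "CpuTime" in jdl:            # assumed to be in minutes
        args += ["-c", str(jdl["CpuTime"])]
    if "TotalRAM" in jdl:           # units depend on the site's LSF setup
        args += ["-M", str(jdl["TotalRAM"])]
    args.append(jdl["Executable"])
    return args

print(" ".join(jdl_to_bsub({"Executable": "run.sh", "JobName": "reco",
                            "WallClockTime": 1440, "TotalRAM": 2000000})))
# -> bsub -J reco -W 1440 -M 2000000 run.sh
```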

Summary

- There has been improvement since Spring
- We need to keep emphasising the issues and participating in the solutions
- We need to take ownership where we can contribute