Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th 2006 1 BOSS: the CMS interface for job summission, monitoring and bookkeeping W. Bacchi, P.

Slides:



Advertisements
Similar presentations
1 14 Feb 2007 CMS Italia – Napoli A. Fanfani Univ. Bologna A. Fanfani University of Bologna MC Production System & DM catalogue.
Advertisements

CMS Grid Batch Analysis Framework
EU 2nd Year Review – Jan – Title – n° 1 WP1 Speaker name (Speaker function and WP ) Presentation address e.g.
Workload management Owen Maroney, Imperial College London (with a little help from David Colling)
INFSO-RI Enabling Grids for E-sciencE Workload Management System and Job Description Language.
FP7-INFRA Enabling Grids for E-sciencE EGEE Induction Grid training for users, Institute of Physics Belgrade, Serbia Sep. 19, 2008.
CERN LCG Overview & Scaling challenges David Smith For LCG Deployment Group CERN HEPiX 2003, Vancouver.
P-GRADE and WS-PGRADE portals supporting desktop grids and clouds Peter Kacsuk MTA SZTAKI
WP 1 Grid Workload Management Massimo Sgaravatto INFN Padova.
INFSO-RI Enabling Grids for E-sciencE EGEE Middleware The Resource Broker EGEE project members.
Grid and CDB Janusz Martyniak, Imperial College London MICE CM37 Analysis, Software and Reconstruction.
1 Grid services based architectures Growing consensus that Grid services is the right concept for building the computing grids; Recent ARDA work has provoked.
Workload Management Massimo Sgaravatto INFN Padova.
Evaluation of the Globus GRAM Service Massimo Sgaravatto INFN Padova.
Zach Miller Condor Project Computer Sciences Department University of Wisconsin-Madison Flexible Data Placement Mechanisms in Condor.
The SAM-Grid Fabric Services Gabriele Garzoglio (for the SAM-Grid team) Computing Division Fermilab.
Workload Management WP Status and next steps Massimo Sgaravatto INFN Padova.
Use of R-GMA in BOSS Henry Nebrensky (Brunel University) VRVS 26 April 2004 Some slides stolen from various talks at EDG 2 nd Review (
Tech talk 20th June Andrey Grid architecture at PHENIX Job monitoring and related stuff in multi cluster environment.
David Adams ATLAS DIAL Distributed Interactive Analysis of Large datasets David Adams BNL July 15, 2003 LCG Analysis RTAG CERN.
K. Harrison CERN, 20th April 2004 AJDL interface and LCG submission - Overview of AJDL - Using AJDL from Python - LCG submission.
INFSO-RI Enabling Grids for E-sciencE Logging and Bookkeeping and Job Provenance Services Ludek Matyska (CESNET) on behalf of the.
Grid Workload Management & Condor Massimo Sgaravatto INFN Padova.
DataGrid WP1 Massimo Sgaravatto INFN Padova. WP1 (Grid Workload Management) Objective of the first DataGrid workpackage is (according to the project "Technical.
INFSO-RI Enabling Grids for E-sciencE Workload Management System Mike Mineter
Grid Workload Management Massimo Sgaravatto INFN Padova.
Stuart Wakefield Imperial College London Evolution of BOSS, a tool for job submission and tracking W. Bacchi, G. Codispoti, C. Grandi, INFN Bologna D.
Status of the LHCb MC production system Andrei Tsaregorodtsev, CPPM, Marseille DataGRID France workshop, Marseille, 24 September 2002.
November SC06 Tampa F.Fanzago CRAB a user-friendly tool for CMS distributed analysis Federica Fanzago INFN-PADOVA for CRAB team.
Ganga A quick tutorial Asterios Katsifodimos Trainer, University of Cyprus Nicosia, Feb 16, 2009.
Tool Integration with Data and Computation Grid GWE - “Grid Wizard Enterprise”
CERN IT Department CH-1211 Genève 23 Switzerland t Monitoring: Tracking your tasks with Task Monitoring PAT eLearning – Module 11 Edward.
Andrey Meeting 7 October 2003 General scheme: jobs are planned to go where data are and to less loaded clusters SUNY.
13 May 2004EB/TB Middleware meeting Use of R-GMA in BOSS for CMS Peter Hobson & Henry Nebrensky Brunel University, UK Some slides stolen from various talks.
Claudio Grandi INFN Bologna CHEP'03 Conference, San Diego March 27th 2003 BOSS: a tool for batch job monitoring and book-keeping Claudio Grandi (INFN Bologna)
Getting started DIRAC Project. Outline  DIRAC information system  Documentation sources  DIRAC users and groups  Registration with DIRAC  Getting.
SAM Sensors & Tests Judit Novak CERN IT/GD SAM Review I. 21. May 2007, CERN.
Transformation System report Luisa Arrabito 1, Federico Stagni 2 1) LUPM CNRS/IN2P3, France 2) CERN 5 th DIRAC User Workshop 27 th – 29 th May 2015, Ferrara.
Use of the gLite-WMS in CMS for production and analysis Giuseppe Codispoti On behalf of the CMS Offline and Computing.
Testing and integrating the WLCG/EGEE middleware in the LHC computing Simone Campana, Alessandro Di Girolamo, Elisa Lanciotti, Nicolò Magini, Patricia.
Tier3 monitoring. Initial issues. Danila Oleynik. Artem Petrosyan. JINR.
Tool Integration with Data and Computation Grid “Grid Wizard 2”
K. Harrison CERN, 22nd September 2004 GANGA: ADA USER INTERFACE - Ganga release status - Job-Options Editor - Python support for AJDL - Job Builder - Python.
INFSO-RI Enabling Grids for E-sciencE CRAB: a tool for CMS distributed analysis in grid environment Federica Fanzago INFN PADOVA.
INFSO-RI Enabling Grids for E-sciencE /10/20054th EGEE Conference - Pisa1 gLite Configuration and Deployment Models JRA1 Integration.
Summary from WP 1 Parallel Section Massimo Sgaravatto INFN Padova.
David Adams ATLAS ATLAS-ARDA strategy and priorities David Adams BNL October 21, 2004 ARDA Workshop.
STAR Scheduling status Gabriele Carcassi 9 September 2002.
INFSO-RI Enabling Grids for E-sciencE Using of GANGA interface for Athena applications A. Zalite / PNPI.
Grid Workload Management (WP 1) Massimo Sgaravatto INFN Padova.
EGEE 3 rd conference - Athens – 20/04/2005 CREAM JDL vs JSDL Massimo Sgaravatto INFN - Padova.
Enabling Grids for E-sciencE CMS/ARDA activity within the CMS distributed system Julia Andreeva, CERN On behalf of ARDA group CHEP06.
D.Spiga, L.Servoli, L.Faina INFN & University of Perugia CRAB WorkFlow : CRAB: CMS Remote Analysis Builder A CMS specific tool written in python and developed.
INFSO-RI Enabling Grids for E-sciencE Ganga 4 Technical Overview Jakub T. Moscicki, CERN.
Joe Foster 1 Two questions about datasets: –How do you find datasets with the processes, cuts, conditions you need for your analysis? –How do.
INFSO-RI Enabling Grids for E-sciencE Padova site report Massimo Sgaravatto On behalf of the JRA1 IT-CZ Padova group.
The EPIKH Project (Exchange Programme to advance e-Infrastructure Know-How) gLite Grid Introduction Salma Saber Electronic.
Practical using C++ WMProxy API advanced job submission
CMS High Level Trigger Configuration Management
BOSS: the CMS interface for job summission, monitoring and bookkeeping
BOSS: the CMS interface for job summission, monitoring and bookkeeping
Introduction to Grid Technology
CRAB and local batch submission
BOSS: the CMS interface for job summission, monitoring and bookkeeping
Short update on the latest gLite status
Scalability Tests With CMS, Boss and R-GMA
Wide Area Workload Management Work Package DATAGRID project
Status and plans for bookkeeping system and production tools
Production Manager Tools (New Architecture)
Presentation transcript:

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th BOSS: the CMS interface for job summission, monitoring and bookkeeping W. Bacchi, P. Capiluppi, G. Codispoti, C. Grandi INFN - Bologna, Italy D.Colling, B.McEvoy, S.Wakefield, Y.J. Zhang Imperial College London, UK

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th BOSS: Batch Object Submission System A tool for batch job submission, real time monitoring and bookkeeping – interface to local and grid schedulers – retrieval of user defined info from process STDOUT – store job-specific logging and bookkeeping info in a relational DB – provide real time monitor Optimized for use in distributed environment – Glite ready

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th Interfacing batch schedulers BOSS allows transparent use of any local or distributed scheduler (LSF, PBS, Condor, gLite,...) – allow to perform standard operations: submit, query, kill, output retrieval – plug in system: Script interface: administrator site can modify existing scripts or add its own – Sample plug-in’s for many schedulers provided Particular effort developing glite scripts

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th CMS requirements PB of data produced by the online farm and MC simulations – Data stored in a distributed environment – Access to data in the site where they are stored – Multiple processing over the same dataset Chains of processing to be done over datasets (e.g.: sim-digi- reco, analysis processes etc.) complex jobs to handle – A lot of homogeneous processes to be run simultaneously over several datasets

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th BOSS job concept A BOSS job is a single elaboration unit – Can be made of multiple processing steps (user executables) – allows complex workflows: executables chaining – Chaining tool may be external Multiple identical jobs can be grouped in Tasks: – Logical grouping of jobs – compact description of multiple homogeneous jobs using iterators – Multiple iterators allowed – XML description

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th User defined information A program type can be defined for a given elaboration: – Schema for the information to be monitored A new table is created in the BOSS database with a defined structure – Algorithms to retrieve the information from the job The program filters are stored in the database Defined user filters: – Perform stage in/out actions – retrieve wanted info from STDOUT One or more program types can be specified for a program Jobs standardization (analysis, MC,...) => Applications may define their own program type

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th BOSS wrappers A Wrapping system is needed to manage the execution of several processes on the WN BOSS Wrapping system is made of: – a wrapper of the user job (jobExecutor): access to local runtime infos starts chaining of programs starts real time monitor – a chainer: allow complex programs workflow Linear chaining provided by default External tools may be also used – a program wrapper (programExecutor) access to local runtime info's for the single program starts pre-runtime-post filters, allows access to specific info's

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th Job wrapper Job Journal JobMonitor (real-time updater) JobChaining JobExecutor (wrapper) Program stdin user exec runtime-filter pre-filter post-filter stdout stderr programExecutor stdin user exec runtime-filter pre-filter post-filter stdout stderr programExecutor Program stdin user exec runtime-filter pre-filter post-filter stdout stderr programExecutor stdin user exec runtime-filter pre-filter post-filter stdout stderr programExecutor Program stdin user exec runtime-filter pre-filter post-filter stdout stderr programExecutor stdin user exec runtime-filter pre-filter post-filter stdout stderr programExecutor

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th Logging and Monitoring Logging provides long term storage of information –allowed using a relational DB –allowed personal db implementation in SQLite on local disk – logging database updated from journal file retrieved at end of job and, optionally at runtime from information in RT server Monitoring provides real time access to logging info – using an intermediate Real Time Server – RT-clients registered to the BOSS client as plug-in’s – may use different servers and technologies : R-GMA, Clarens, MonaLisa etc… – allow different RT mechanisms for each job

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th Real-time system Two RT-clients – the real-time updater that runs on the execution host inserts or updates information about the running job – a plug-in used by the boss client on the user interface fetches information about selected jobs and deletes it afterwards One server – the real-time DB server stores temporarily job information while the job is running shared by many users simple structure – identification of BOSS-client and of user – identification of destination table/variable – value of parameter and time-stamp final L&B doesn’t rely on it

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th User Interface BOSS CLIENT LOCAL OR GRID SCHEDULER REAL-TIME BOSS DB SERVER Worker Node BOSS JOB WRAPPER USER PROCESS BOSS REAL-TIME UPDATER Submit or control job Get job running status Pop job monitoring info Job control and logging File I/O control Set job logging info (possibly via proxy) Retrieve output files BOSS JOURNAL BOSS DB Components

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th BOSS in CMS computing Used in CMS MC production for 4 years Prototype CMS distributed analysis system (GROSS) based on BOSS and later new analysis system using BOSS BOSS v4 with new architecture and many new features BOSS Logging & bookkeeping monitoring Production / analysis tool

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th Glite Bulk submission BOSS modified to profit from glite bulk submission – Chains grouped for submission to allow creation of an unique input sandbox with common files – Actual implementation of a bulk submission delegated to the scheduler submit script Submission scripts implemented to efficiently use jdl job types: – Normal, for single submission – Parametric, most compact jdl, iteration over the boss job id: still to investigate if limitations can arise from the possibility to iterate over an unique parameter – Collection, more general bulk submission possibilities: a single file keeps all the job jdl's, shared input sandbox allowed Tested version 1.4.1, planned as soon as it will be ready on the pre-production system (PPS)

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th Summarizing Transparent use of any batch system Provide persistent storage of the logging infos Logging specific info Real Time Monitoring Glite optimization Sandbox packing for efficient use in distributed systems XML task description Command Line Interface Integrability in an experiment framework through API (C++ & Python)

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th First version released with a full set of basic functionalities – MySQL and SQLite back-ends for local DB –MySQL real-time DB – full working RT monitoring –XML task description at declaration time, nested iterators allowed –Full task description in the database –Glite Bulk Submission –Basic executable linear chaining, default solution –plug-in system for chainer implemented but we need to better understand how to handle external chainers (mainly to configure them allowing the use of the program wrapper) –MonaLisa monitoring allowed via APMon –plug-in for many schedulers: local submission, lsf, edg, glite; we are experiencing also some effort for condorG with v3.6 via end user support to allow use within OSG Status

Giuseppe Codispoti INFN - Bologna Egee User ForumMarch 2th Future plans Allowing the use of chainer plugins, mainly external programs (e.g. SHREEK) Implement more backend possibilities (i.e. ORACLE) Implement more RT monitoring solutions (i.e. R-GMA, Clarens, web services, but also mysql query encryption to avoid firewall problems) Finalize API, increase query possibilities Use external standard libraries (mainly from BOOST) Look at writing wrapper in scripting language i.e Perl/Python