1 BOSS: a tool for batch job monitoring and book-keeping
Claudio Grandi (INFN Bologna)
CHEP'03 Conference, San Diego, March 27th 2003

2 BOSS: “Batch Object Submission System”
Is a tool for job monitoring and book-keeping
Allows users to deal with job-specific information
Is not a job scheduler, but can be interfaced with most schedulers:
– LSF (CERN, INFN)
– PBS (Bristol, Caltech, UFL, Imperial College, INFN)
– FBSNG (Fermilab)
– Condor (INFN, U.Wisconsin)
Has been designed to work on computing farms
Is compatible with use on a WAN, but is not (yet) robust against network failures

3 Basic BOSS components
boss executable: the BOSS interface to the user
MySQL database: where BOSS stores job information
jobExecutor executable: the BOSS wrapper around the user job
dbUpdator executable: the process that writes to the database while the job is running
Local scheduler: may be a “Grid” scheduler

4 Basic flow
BOSS:
– Accepts job submission from users
– Stores info about the job in a DB
– Builds a wrapper around the job (jobExecutor)
– Sends the wrapper to the local scheduler
The wrapper sends info about the job to the DB
[Diagram labels: boss submit, boss query, boss kill; BOSS; BOSS DB; Local Scheduler; farm node; Wrapper]
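The submission steps listed on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for the BOSS MySQL database, the `RecordingScheduler` class stands in for the local scheduler, and the wrapper is represented by a placeholder command list. Apart from the `JOB` table and its `ID` and `EXEC` columns, which appear in the SQL query on the Queries slide, every name here is hypothetical and is not the BOSS implementation.

```python
import sqlite3


def make_db():
    """In-memory SQLite database standing in for the BOSS MySQL database."""
    db = sqlite3.connect(":memory:")
    # SCHED_ID and STATUS are hypothetical columns added for this sketch.
    db.execute(
        "CREATE TABLE JOB (ID INTEGER PRIMARY KEY AUTOINCREMENT, "
        "EXEC TEXT, SCHED_ID TEXT, STATUS TEXT)"
    )
    return db


class RecordingScheduler:
    """Hypothetical stand-in for a local scheduler: it only records submissions."""

    def __init__(self):
        self.submitted = []

    def submit(self, wrapped_command):
        self.submitted.append(wrapped_command)
        return f"sched-{len(self.submitted)}"


def submit(db, user_command, scheduler):
    """Store the job, wrap it, hand the wrapper to the scheduler, return the job id."""
    cur = db.execute(
        "INSERT INTO JOB (EXEC, STATUS) VALUES (?, ?)",
        (" ".join(user_command), "registered"),
    )
    job_id = cur.lastrowid
    # Placeholder wrapper command line; the real jobExecutor interface is not shown on the slides.
    wrapped = ["wrapper", str(job_id)] + list(user_command)
    sched_id = scheduler.submit(wrapped)
    db.execute(
        "UPDATE JOB SET SCHED_ID = ?, STATUS = ? WHERE ID = ?",
        (sched_id, "submitted", job_id),
    )
    db.commit()
    return job_id
```

The point of the sketch is the ordering: the job is recorded in the database before it reaches the scheduler, so the wrapper already has a job id to report against when it starts running.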

5 User defined information
User registers a job type:
– Schema for the information to be monitored
– A new table is created in the BOSS database with a defined structure
– Algorithms to retrieve the information from the job
– The user programs (filters) are stored in the database as blobs
User submits jobs:
– One or more job types can be specified for the job
– A new entry is created for the job in the database tables
– The filters are extracted from the database and made available to the running job
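A minimal sketch of the registration and submission steps described on this slide, assuming an in-memory SQLite database in place of MySQL. The `JOBTYPE` table, the function names and the list of allowed column types are assumptions made for the example; only the idea (one table per job type, filters stored as blobs and fetched at submission) comes from the slide.

```python
import sqlite3

ALLOWED_TYPES = {"INTEGER", "REAL", "TEXT"}  # column types accepted by this sketch


def open_db():
    """In-memory SQLite database; JOBTYPE is a hypothetical registry table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE JOBTYPE (NAME TEXT PRIMARY KEY, FILTER BLOB)")
    return db


def register_job_type(db, name, schema, filter_source):
    """Create a table for the job type and store its filter program as a blob."""
    for ident in [name] + [col for col, _ in schema]:
        if not ident.isidentifier():
            raise ValueError(f"invalid identifier: {ident!r}")
    for _, sql_type in schema:
        if sql_type not in ALLOWED_TYPES:
            raise ValueError(f"unsupported type: {sql_type!r}")
    columns = ", ".join(f"{col} {sql_type}" for col, sql_type in schema)
    # Identifiers were validated above, so interpolating them here is safe.
    db.execute(f"CREATE TABLE {name} (JOBID INTEGER PRIMARY KEY, {columns})")
    db.execute(
        "INSERT INTO JOBTYPE (NAME, FILTER) VALUES (?, ?)",
        (name, filter_source.encode("utf-8")),
    )


def add_job(db, job_id, job_types):
    """Create one row per job type for the job and return the stored filters."""
    filters = {}
    for name in job_types:
        row = db.execute(
            "SELECT FILTER FROM JOBTYPE WHERE NAME = ?", (name,)
        ).fetchone()
        if row is None:
            raise KeyError(f"job type not registered: {name}")
        db.execute(f"INSERT INTO {name} (JOBID) VALUES (?)", (job_id,))
        filters[name] = row[0]
    return filters
```

For example, `register_job_type(db, "test", [("COUNTER", "INTEGER")], filter_text)` followed by `add_job(db, 1234, ["test"])` creates an empty row for job 1234 in the `test` table and returns the filter bytes to ship with the job.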

6 The job interface to BOSS
The job interfaces to BOSS are its standard input, output and error streams
The user defined algorithms are filters that read stdin/out/err and write key=value pairs
The keys are the user-defined schema variables
User job:
    #!/usr/bin/perl
    # Print "counter 1" to "counter 3", one line per second
    $i = 0;
    while($i<3){
      sleep(1); $i++;
      print "counter $i\n";
    }
Job stdout:
    counter 1
    counter 2
    counter 3
Filter:
    #!/usr/bin/perl
    # Read the job output line by line and emit COUNTER=<n> for every counter line
    while(<STDIN>){
      if($_=~/.*counter\s+(\d+).*/){
        print "COUNTER=$1\n";
      }
    }
Filter output:
    COUNTER=1
    COUNTER=2
    COUNTER=3
Journal entries, in the order extracted from the slide:
    1234 test counter 1
    1234 JOB T_START xxx
    1234 JOB …… ……
    1234 test counter 2
    1234 test counter 3
    1234 JOB …… ……
    1234 JOB T_STOP yyy
[Diagram labels: stdout; BOSS jobExecutor; journal; BOSS dbUpdator; BOSS DB table “test” with columns JOBID and COUNTER (extracted values: 12345 0 and 123)]
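For readers less familiar with Perl, the filter on this slide can be restated in Python roughly as below, together with a hypothetical `apply_pairs` helper showing how key=value pairs might update a job's row. This is a sketch of the idea only; the helper and the dictionary standing in for a database row are assumptions, not part of BOSS.

```python
import re

# Same pattern as the Perl filter on the slide.
COUNTER_RE = re.compile(r".*counter\s+(\d+).*")


def counter_filter(lines):
    """Yield a COUNTER=<n> pair for every input line that reports a counter value."""
    for line in lines:
        match = COUNTER_RE.search(line)
        if match:
            yield f"COUNTER={match.group(1)}"


def apply_pairs(row, pairs):
    """Hypothetical helper: apply key=value pairs to a dict standing in for a table row."""
    for pair in pairs:
        key, _, value = pair.partition("=")
        row[key] = value
    return row


job_stdout = ["counter 1\n", "counter 2\n", "counter 3\n"]
pairs = list(counter_filter(job_stdout))
row = apply_pairs({"JOBID": 1234, "COUNTER": "0"}, pairs)
```

Each pair overwrites the previous value, so the row always holds the latest counter seen, which matches the real-time monitoring use described in the talk.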

7 Runtime data flow
[Diagram labels: STDIN, STDOUT, STDERR, LOG, USER OUT, ERR; jobExecutor; three RunTime Filters, each connected through tee and pipe; Journal; dbUpdator; BOSS DB]
[Legend: Standard input or output; Standard error; Other I/O streams; User supplied or returned to the user; Temporary processes and files; BOSS processes and files]
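The tee-and-pipe arrangement in this diagram can be approximated in a few lines of Python. The sketch below is an assumption-laden illustration, not the jobExecutor code: it handles only standard output, keeps the journal in memory, and uses a one-line Python child process (`DEMO_JOB`) as a stand-in for the user job. Every line read from the job is copied to the output returned to the user and also passed to a filter.

```python
import re
import subprocess
import sys


def key_value_filter(line):
    """Return a list of key=value pairs found in one line of job output."""
    match = re.search(r"counter\s+(\d+)", line)
    return [f"COUNTER={match.group(1)}"] if match else []


def run_job(command, line_filter):
    """Run the job, duplicating each stdout line to the user copy and to the filter."""
    user_output, journal = [], []
    with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            user_output.append(line)            # copy returned to the user
            journal.extend(line_filter(line))   # copy seen by the filter
    # Leaving the "with" block waits for the child, so returncode is set here.
    return proc.returncode, user_output, journal


# Stand-in for the user job: prints "counter 1" to "counter 3".
DEMO_JOB = [sys.executable, "-c", "for i in range(1, 4): print('counter', i)"]
```

Calling `run_job(DEMO_JOB, key_value_filter)` returns the exit code, the untouched job output and the filtered pairs; the job itself never needs to know that it is being monitored.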

8 Queries
Standard queries: get job status and user defined quantities
    % boss q -all -specific -type test
    ID S_USR  EXECUTABLE ST EXE_HOST   START TIME     STOP TIME      comment  counter
    1  grandi test.pl 15 E  pccms10.bo 14:30:00 06/06 14:30:16 06/06 ...STOP  15
    2  grandi test.pl 15 R  pccms10.bo 14:30:02 06/06 -------------- START... 13
Advanced queries: use SQL to query job info (standard + user defined)
Output suitable for parsing by a script:
    % boss SQL -query "select JOB.ID,EXEC,counter from JOB,test WHERE JOB.ID=test.JOBID"
    3,4,23,9
    ID  EXEC                   counter
    1   test.pl                15
    2   test.pl                13
Information line (first line of the output): number of fields, width of 1st field, …, width of nth field
Header line (second line of the output): field names
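A script could parse the SQL query output along the following lines. The sketch assumes that the information line is a comma-separated list (number of fields followed by one width per field) and that the header and data lines are laid out as consecutive fixed-width fields with exactly those widths and no extra separators. The slide does not spell out the padding rules, so that layout is an assumption, as is the function name.

```python
def parse_boss_sql_output(text):
    """Parse fixed-width query output into a list of dicts keyed by header names."""
    lines = text.splitlines()
    info = [int(item) for item in lines[0].split(",")]
    n_fields, widths = info[0], info[1:]
    if len(widths) != n_fields:
        raise ValueError("information line does not match the number of fields")

    def split_fixed(line):
        # Cut the line into consecutive slices of the announced widths.
        fields, pos = [], 0
        for width in widths:
            fields.append(line[pos:pos + width].strip())
            pos += width
        return fields

    header = split_fixed(lines[1])
    return [dict(zip(header, split_fixed(line))) for line in lines[2:] if line.strip()]


# Sample built to match the information line 3,4,23,9 shown on the slide.
SAMPLE = "\n".join([
    "3,4,23,9",
    "ID".ljust(4) + "EXEC".ljust(23) + "counter".ljust(9),
    "1".ljust(4) + "test.pl".ljust(23) + "15".ljust(9),
    "2".ljust(4) + "test.pl".ljust(23) + "13".ljust(9),
])

rows = parse_boss_sql_output(SAMPLE)
```

Announcing the widths up front means a field value may contain spaces without confusing the parser, which a whitespace-split approach could not guarantee.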

9 Interface to the scheduler
User registers a scheduler:
– Scripts for job submission, deletion and query
– The scripts are stored in the database as blobs
– The fork scheduler is already registered
User submits/deletes/queries jobs:
– The scheduler can be specified for the submission
– The boss executable fetches the scripts from the database and uses them as the interface to the scheduler
Job submission via ClassAd file is supported:
– BOSS manages the keys it understands and passes the others to the submission script
– User-defined keys are possible!
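The key handling described for ClassAd submission amounts to partitioning the job description into keys BOSS consumes and keys forwarded to the scheduler's submission script. A minimal sketch follows, assuming the ClassAd has already been parsed into a dictionary. The key names in `BOSS_KEYS` are purely hypothetical, since the slide does not list the keys BOSS understands.

```python
# Hypothetical set of keys handled by BOSS itself; the real list is not given on the slide.
BOSS_KEYS = {"executable", "stdout", "stderr"}


def split_keys(job_description, boss_keys=BOSS_KEYS):
    """Split a parsed job description into (handled by BOSS, passed to the submit script)."""
    for_boss, for_scheduler = {}, {}
    for key, value in job_description.items():
        target = for_boss if key.lower() in boss_keys else for_scheduler
        target[key] = value
    return for_boss, for_scheduler


handled, passed_on = split_keys({"executable": "test.pl", "queue": "short"})
```

Forwarding unknown keys instead of rejecting them is what lets user-defined keys reach a site-specific submission script without any change to BOSS.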

10 BOSS as a grid-tool
[Diagram labels: boss submit, boss query, boss kill, boss registerScheduler; BOSS DB; Local BOSS gateway; GRID Scheduler; gatekeeper; farm node]
Tested on the European DataGrid testbed
– Interface scripts included in the BOSS distribution
– See talk by P.Capiluppi
dbUpdator uses native MySQL calls
Proof of concept using R-GMA (from EDG-WP3) as BOSS transport layer (H.Nebrensky, Brunel Univ.)

11 BOSS and R-GMA
[Diagram labels: User Interface; boss executable; BOSS DB; R-GMA Receiver servlets; R-GMA Registry; Computing Element; Worker Node; jobExecutor starts user job; R-GMA enabled dbUpdator; BOSS journal; User output; R-GMA Producer servlets; EDG WP1 + GRAM; Input/Output Sandbox; Firewall; subscribe; lookup]

13 Current use of BOSS
CMS 2002 productions:
– about 500,000 jobs running in about 20 regional centers
– complete book-keeping of every single job
CMS/EDG stress test (Nov.-Dec. 2002):
– about 10,000 jobs submitted by 4 user interfaces on the European DataGrid testbed
– allowed validation of jobs for which the output sandbox was lost due to EDG internals
R-GMA demo at EDG review (Feb. 2002):
– proof of concept

14 BOSS data analysis
boss2root (by D.Bonacorsi)
– Produces ROOT trees from BOSS MySQL tables
– Used to analyze the data of the CMS/EDG stress test:
  - complete classification of problems
  - graphical representation of results

15 Summary
BOSS is a tool that allows real-time monitoring and book-keeping of batch jobs
User-defined information is archived for different job types
Has been used by CMS for 2002 official productions
Has been used during the CMS/EDG stress test in a grid environment
Is a general tool: nothing CMS or even HEP specific
Web site: http://www.bo.infn.it/cms/computing/BOSS/


