STAR Scheduler Gabriele Carcassi STAR Collaboration

What is the STAR scheduler?
Resource broker
–receives job requests from the user and decides how to assign them to the available resources
Wrapper on evolving technologies
–as GRID middleware fit for STAR needs becomes available, it is integrated into the scheduler
Flexible architecture

Scheduler benefits
Enables the Distributed Disk framework
–Data files are distributed on the local disks of the farm nodes
–A job requiring a given file is dispatched to the node where the file can be found
Interfacing with the STAR file catalog
–Users specify the job input through a metadata/catalog query (e.g. Gold-Gold at 200 GeV, FullField, minbias, ...)
–The file catalog implementation is modular

Scheduler benefits
User interface: job description and job specification
–Well-defined user interface and job model
–The abstract description allows us to embed in the scheduler the logic on how to use the resources
–Allows us to experiment with and migrate to other tools with minimal impact on the user (for job submission)
–Makes it easier for other groups collaborating with us to understand our needs
Extensible architecture

Technologies used
The scheduler is written in Java
The job description language is an XML file
The current implementation uses
–LSF for job submission
–the STAR catalog as the file catalog
Experimenting with Condor-G for GRID submission

How does it work?
[Diagram: the job description (test.xml), containing the command root4star -q -b rootMacros/numberOfEventsList.C\(\"$FILELIST\"\), goes through query/wildcard resolution; the resolved input files (/star/data09/reco/productionCentral/FullFie...) are split into the file lists sched..._0.list, sched..._1.list and sched..._2.list]

How does it work?
[Diagram: from the job description (test.xml) the scheduler produces the output files sched..._0.csh, sched..._1.csh and sched..._2.csh, one csh script per process; each script begins with "#!/bin/csh", a "Script generated at ..." timestamp comment and the submission command (bsub -q star_cas_dd -o /star/u/carca...)]

Distributed disk
Motives
–Scalability: NFS requires more work to scale
–Performance: reading/writing on local disk is faster
–Availability: every computer has a local disk, not every computer has distributed disk
Current model
–Files are distributed by hand (Data Carousel) according to user needs
–The file catalog is updated during distribution
–The scheduler queries the file catalog and divides the job according to the distribution
Future model
–Dynamic distribution

File catalog integration
Enables distributed disk
–Without it, users would have to know on which machines the files are distributed
Allows users to specify their input according to the metadata
For requests with a small number of files, the scheduler can choose which files are more available

File catalog integration
Implemented through an interface (pure abstract class), as sketched below
–The query itself is an opaque string passed directly to the file catalog
–Other tags tell the scheduler how to extract the desired group: single copy or all copies of the same files, prefer files on NFS or on local disk, number of files required
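As an illustration only, such an input request might look like the following sketch; the attribute names (singleCopy, preferStorage, nFiles) and the query keys are assumptions made for this example, not the scheduler's documented syntax.

<!-- Illustrative sketch, not the official syntax: the string after "?"
     is passed opaquely to the file catalog; the attributes tell the
     scheduler whether to take one copy or all copies of a file, whether
     to prefer NFS or local disk, and how many files are required. -->
<input URL="catalog:star.bnl.gov?collision=AuAu200,magscale=FullField,trgsetup=minbias"
       singleCopy="true" preferStorage="local" nFiles="100"/>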

User Interface
Job description
–an XML file and its tags, used to describe to the scheduler which command is to be dispatched and on which input files
Job specification
–a set of simple rules that define how the user job is supposed to behave

The Job description
An XML file with the description of our request; in this example the command to be executed is root4star -q -b rootMacros/numberOfEventsList.C\(\"$FILELIST\"\) (see the sketch below)
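A minimal sketch of what the job description file might contain, assuming element names such as job, command, stdout and input; paths, attributes and query keys are placeholders chosen for this example rather than text from the slide.

<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical job description sketch; element and attribute names
     are assumed for illustration. -->
<job maxFilesPerProcess="100">
  <!-- the user command; $FILELIST is provided by the scheduler -->
  <command>root4star -q -b rootMacros/numberOfEventsList.C\(\"$FILELIST\"\)</command>
  <!-- where the standard output of each process should go -->
  <stdout URL="file:/star/u/myuser/scheduler/out/$JOBID.out"/>
  <!-- input selected through an opaque file-catalog query -->
  <input URL="catalog:star.bnl.gov?collision=AuAu200,magscale=FullField,trgsetup=minbias"
         preferStorage="local" nFiles="100"/>
</job>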

Job specification
The scheduler prepares some environment variables to communicate to the job its decisions about job splitting
–$FILELIST, $INPUTFILECOUNT and $INPUTFILExx contain information about the input files assigned to the job
–$SCRATCH is a local directory available to the job to put its output in for later retrieval

Job specification
The other main requirement is that the output of the different processes must not clash with one another
–One can use $JOBID to create filenames that are unique for each process, as in the sketch below
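For example, the command in the job description can write into $SCRATCH and tag the filename with $JOBID so that the outputs of different processes never clash; a sketch with the same assumed element names as above (the output element is also hypothetical):

<!-- Sketch with assumed element names: the job writes into the
     per-process scratch area and tags the file with $JOBID; the
     hypothetical <output> element asks the scheduler to bring the
     file back for later retrieval. -->
<command>
  root4star -q -b rootMacros/numberOfEventsList.C\(\"$FILELIST\"\) > $SCRATCH/events_$JOBID.log
</command>
<output fromScratch="events_$JOBID.log" toURL="file:/star/u/myuser/results/"/>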

STAR Scheduling architecture
[Architecture diagram showing the components: UI (UJDL, Perl interface), the Scheduler / Resource broker with its abstract components (Job Initializer, Policy, Dispatcher, File catalog interface, Queue manager, Monitoring), and the concrete back-ends (LSF, File Catalog / MySQL, Ganglia, MDS)]

Job Initializer
Parses the XML job request
Checks that the request is valid
–Checks for elements outside the specification (typically errors)
–Checks for consistency (existence of input files on disk, ...)
–Checks for requirements (requiring the output file, ...)
Creates the Java objects representing the request (JobRequest)

Job Initializer
Current implementation
–Strict parser: any keyword outside the specification stops the process
–Checks for the existence of the stdin file and of the stdout directory
–Forces the stdout to prevent side effects (such as LSF accidentally sending the output by mail)

Policy
The core of resource brokering:
–From one request, creates a series of processes to fulfill that request
–Processes are created according to the farm administrator's decisions
–The policy may query the file catalog, the queues or other middleware to make an optimal decision (e.g. MDS, Ganglia, ...)

Policy
We anticipate that a lot of the work will be in finding an optimal policy
The policy is easily changeable, to allow the administrator to change the behavior of the system

Policy
Current policy
–Resolves the queries and the wildcards to form a single file list
–Divides the list into several sub-lists, according to where the input files are located and to the maximum number of files set per process
–Creates one process for every file list

Dispatcher
From the abstract process description, creates everything that is needed to dispatch the jobs
–Talks to the underlying queue system
–Takes care of creating the script that will be executed: csh based (widely supported)
–Creates the environment variables and the file list

Dispatcher
Current implementation
–creates the file list and the script in the directory the job was submitted from
–creates environment variables containing the job id, the file list and all the files in the list, and assigns a scratch directory
–creates a command line for LSF
–submits the job to LSF

Conclusion
The tool is available and working
–In production since September 2002 and slowly gaining acceptance (difficult to get people to try it, but once they try it they like it)
Allows the use of local disks
The architecture is open to allow changes
–Different policies
–Different catalog implementations (MAGDA, RLS, GDMP, ... ?)
–Different dispatcher implementations (Condor, Condor-G / Globus, ...)
We are preparing an implementation that uses Condor-G and allows us to dispatch jobs to the GRID