STAR Scheduling status
Gabriele Carcassi
9 September 2002
Objectives
- Have something ready for September
- Stabilize the user interface used to submit jobs, based on the user's perspective
- Provide an architecture that allows easy change
- Provide a way for the administrator to change the behavior of the system
STAR Scheduling architecture
[Diagram: current architecture for job submission — UI (UJDL), JobInitializer, Policy, Dispatcher; File Catalog (Perl interface, MySQL); queue manager (LSF); scheduler / resource broker (?)]
User interface
- Driven by use cases, not by the tools used to implement it
  - the user basically gives the job and the list of input files, which can also be a catalog query
- The user specifies what he wants to do, not how to do it
  - simpler to use
  - gives the administrator more flexibility in the implementation
User interface
- User job description in XML
  - the scheduler developed at Wayne State uses XML
  - easy to extend: e.g. multiple ways to describe the input
  - parsers already available
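A job description along these lines might look like the fragment below. The element and attribute names are illustrative only, not the actual UJDL schema:

```xml
<!-- Hypothetical job request: a command, its stdin/stdout,
     and the input given as a catalog query -->
<job name="reco-pass1">
  <command>root4star -b -q doEvents.C</command>
  <stdin  URL="file:/star/u/user/macros/input.txt"/>
  <stdout URL="file:/star/u/user/out/"/>
  <input>
    <catalogQuery>production=P02gd,filetype=daq_reco_mudst</catalogQuery>
  </input>
</job>
```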
Job Initializer
- Parses the XML job request
- Checks the request to see if it is valid
  - checks for elements outside the specification (typically errors)
  - checks for consistency (existence of input files on disk, ...)
  - checks for requirements (require the output file, ...)
- Creates the Java objects representing the request (JobRequest)
Job Initializer
Current implementation:
- Strict parser: any keyword outside the specification stops the process
- Checks for the existence of the stdin file and the stdout directory
- Forces stdout to be set, to prevent side effects (such as LSF accidentally sending the output by mail)
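The strict-parser behavior can be sketched as follows. This is a minimal illustration, not the real JobInitializer: the element names and the class name are made up, and the real code builds JobRequest objects instead of just validating.

```java
import java.io.ByteArrayInputStream;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Sketch of strict validation: any element outside the known set
// aborts the whole request instead of being silently ignored.
public class StrictChecker {
    private static final Set KNOWN = new HashSet(Arrays.asList(
        new String[] {"job", "command", "stdin", "stdout", "input"}));

    static void check(Node node) {
        if (node.getNodeType() == Node.ELEMENT_NODE
                && !KNOWN.contains(node.getNodeName())) {
            throw new IllegalArgumentException(
                "Unknown element: " + node.getNodeName());
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            check(children.item(i));
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<job><command>root4star</command><stdout/></job>";
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes()));
        check(doc.getDocumentElement());  // throws on unknown elements
        System.out.println("request valid");
    }
}
```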
Policy
- From one request, creates a series of processes to fulfill that request
- Processes are created according to the farm administrator's decisions
- The policy may query the file catalog, the queues or other middleware to make an optimal decision
Policy
- We anticipate that a lot of the work will be in finding an optimal policy
- The policy is easily changeable, to allow the administrator to change the behavior of the system
Policy
Current policy:
- The query is resolved by simply forwarding it to the catalog
- The job is divided into several processes, according to where the input files are located
- No more than 10 input files per process
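The current splitting rule can be sketched like this. The host and file names are invented for the example, and the real policy class is not shown in these slides:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch of the current policy: group catalog results by the host
// holding each file, then cut each group into processes of at most
// 10 input files.
public class SplitPolicy {
    static List split(Map filesByHost, int maxFiles) {
        List processes = new ArrayList();
        for (Iterator it = filesByHost.values().iterator(); it.hasNext();) {
            List files = (List) it.next();
            for (int i = 0; i < files.size(); i += maxFiles) {
                processes.add(files.subList(i,
                    Math.min(i + maxFiles, files.size())));
            }
        }
        return processes;
    }

    public static void main(String[] args) {
        Map byHost = new TreeMap();
        List a = new ArrayList();
        for (int i = 0; i < 23; i++) a.add("rcas6001:/data/file" + i);
        byHost.put("rcas6001", a);   // 23 files -> 3 processes
        List b = new ArrayList();
        for (int i = 0; i < 4; i++) b.add("rcas6002:/data/file" + i);
        byHost.put("rcas6002", b);   // 4 files -> 1 process

        List processes = split(byHost, 10);
        System.out.println(processes.size() + " processes");
    }
}
```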
File Catalog integration
- In the job description a user can specify one or more queries
- Depending on how these queries are resolved, the farm can be more or less efficient
- The mechanism that executes the query is separate from the query description
  - easy to change the catalog implementation
File Catalog integration
Current implementation:
- Very simple, to allow a fast implementation
- Forwards the query as-is to the Perl script interface of the STAR catalog
  - main advantage: same syntax for the user
- No "smart" selection is made
  - no effort is made to select those files that would optimize the use of the farm
Dispatcher
- Talks to the underlying queue system
- Takes care of creating the script that will be executed
- Creates the environment variables and the file list
Dispatcher
Current implementation:
- creates the file list and the script in the directory from which the job was submitted
- creates environment variables containing the job id, the file list location, and all the files in the list
- creates a command line for LSF
- submits the job to LSF
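The dispatcher's steps above can be sketched as follows. The environment variable names, the queue name, and the bsub options are assumptions for illustration, not the actual values used by the scheduler:

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the dispatcher: build a csh script exporting the job id
// and the input file list, plus the LSF command line that submits it.
public class Dispatcher {
    static String buildScript(String jobId, List files) {
        StringBuffer sb = new StringBuffer("#!/bin/csh\n");
        sb.append("setenv JOBID " + jobId + "\n");
        StringBuffer list = new StringBuffer();
        for (int i = 0; i < files.size(); i++) {
            if (i > 0) list.append(":");
            list.append(files.get(i));
        }
        sb.append("setenv INPUTFILELIST " + list + "\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        List files = Arrays.asList(
            new String[] {"/data/f1.root", "/data/f2.root"});
        System.out.print(buildScript("sched001", files));
        // Command line handed to LSF (queue name is made up):
        System.out.println("bsub -q star_cas -o sched001.out sched001.csh");
    }
}
```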
Other functionalities
Log:
- The logging services provided in Java 1.4 are used to create a detailed log
- Each entry has a level (FINEST, FINER, FINE, INFO, WARNING, SEVERE), and the log can be configured to produce output only from a given level up
- We will use FINEST during beta, INFO during the first months of production, and WARNING after that
- Logging happens behind the back of the user, providing full information about usage
  - essential to trace bugs and problems associated with the policy
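The level-based filtering described above works as in this minimal java.util.logging sketch (the logger name is made up); raising the threshold from FINEST to WARNING suppresses the detailed entries without touching the code that emits them:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Minimal sketch of the Java 1.4 logging setup: the threshold can be
// raised from FINEST (beta) to WARNING (production) in one place.
public class LogDemo {
    public static void main(String[] args) {
        Logger log = Logger.getLogger("star.scheduler");
        log.setLevel(Level.WARNING);  // production setting

        // FINE entries are filtered out, WARNING entries get through:
        System.out.println(log.isLoggable(Level.FINE));
        System.out.println(log.isLoggable(Level.WARNING));
    }
}
```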
Conclusion
- The tool is available and working
  - beta quality: works reliably, some small features might be needed, QA testing still required
- Allows the use of local disks
- The architecture is open to allow changes
  - catalog implementation (MAGDA, RLS, GDMP, ... ?)
  - dispatcher implementation (Condor, Condor-G / Globus, ...)