Part Five: Globus Job Management A: GRAM B: Globus Job Commands C: Laboratory: globusrun.

Similar presentations
Globus Workshop at CoreGrid Summer School 2006 Dipl.-Inf. Hamza Mehammed Leibniz Computing Centre.

A3.1 Assignment 3 Simple Job Submission Using GT 4 GRAM.
Grid Resource Allocation Management (GRAM) GRAM enables the user to access the grid in order to run, terminate, and monitor jobs remotely. The job request.
NorduGrid Grid Manager developed at NorduGrid project.
CERN LCG Overview & Scaling challenges David Smith For LCG Deployment Group CERN HEPiX 2003, Vancouver.
CSF4, SGE and Gfarm Integration Zhaohui Ding Jilin University.
Part 7: CondorG A: Condor-G B: Laboratory: CondorG.
CENG 546 Dr. Esma Yıldırım.  A fundamental enabling technology for the "Grid," letting people share computing power, databases, and other tools securely.
13/05/2004Janusz Martyniak Imperial College London 1 Using Ganga to Submit BaBar Jobs Development Status.
A Computation Management Agent for Multi-Institutional Grids
USING THE GLOBUS TOOLKIT This summary by: Asad Samar / CALTECH/CMS Ben Segal / CERN-IT FULL INFO AT:
GlueX Computing GlueX Collaboration Meeting – Glasgow Edward Brash – University of Regina August 4 th, 2003.
GRID workload management system and CMS fall production Massimo Sgaravatto INFN Padova.
Workload Management Workpackage Massimo Sgaravatto INFN Padova.
6a.1 Globus Toolkit Execution Management. Data Management Security Common Runtime Execution Management Information Services Web Services Components Non-WS.
Sun Grid Engine Grid Computing Assignment – Fall 2005 James Ruff Senior Department of Mathematics and Computer Science Western Carolina University.
Assignment 3 Using GRAM to Submit a Job to the Grid James Ruff Senior Western Carolina University Department of Mathematics and Computer Science.
Workload Management Massimo Sgaravatto INFN Padova.
First steps implementing a High Throughput workload management system Massimo Sgaravatto INFN Padova
Grids and Globus at BNL Presented by John Scott Leita.
First ideas for a Resource Management Architecture for Productions Massimo Sgaravatto INFN Padova.
Globus Computing Infrastructure Software Globus Toolkit 11-2.
Evaluation of the Globus GRAM Service Massimo Sgaravatto INFN Padova.
Resource Management Reading: “A Resource Management Architecture for Metacomputing Systems”
Overview of TeraGrid Resources and Usage Selim Kalayci Florida International University 07/14/2009 Note: Slides are compiled from various TeraGrid Documentations.
Part Three: Data Management 3: Data Management A: Data Management — The Problem B: Moving Data on the Grid FTP, SCP GridFTP, UberFTP globus-URL-copy.
Grids and Portals for VLAB Marlon Pierce Community Grids Lab Indiana University.
Computational grids and grids projects DSS,
COMP3019 Coursework: Introduction to GridSAM Steve Crouch School of Electronics and Computer Science.
CHEP 2003 Stefan Stonjek Physics with SAM-Grid Stefan Stonjek University of Oxford CHEP, March 2003, San Diego.
GRAM5 - A sustainable, scalable, reliable GRAM service Stuart Martin - UC/ANL.
Part 6: (Local) Condor A: What is Condor? B: Using (Local) Condor C: Laboratory: Condor.
Rochester Institute of Technology Job Submission Andrew Pangborn & Myles Maxfield 10/19/2015 Service Oriented Cyberinfrastructure Lab,
Part 8: DAGMan A: Grid Workflow Management B: DAGMan C: Laboratory: DAGMan.
Grid Workload Management Massimo Sgaravatto INFN Padova.
Evaluation of Agent Teamwork High Performance Distributed Computing Middleware. Solomon Lane Agent Teamwork Research Assistant October 2006 – March 2007.
Grid Compute Resources and Job Management. Local Resource Managers (LRM): Compute resources have a local resource manager (LRM) that controls: Who.
June 21-25, 2004, Lecture 2: Basic Grid Skills. Presenter Name, Presenter Institution, Presenter address. Grid Summer Workshop.
Grid NERSC demo Shreyas Cholia Open Software and Programming NERSC User Group Meeting September 19, 2007.
Grid Compute Resources and Job Management New Mexico Grid School – April 9, 2009 Marco Mambelli – University of Chicago
© 2007 UC Regents Track 1: Cluster and Grid Computing NBCR Summer Institute Session 1.1: Introduction to Cluster and Grid Computing July 31, 2007 Wilfred.
Report from USA Massimo Sgaravatto INFN Padova. Introduction Workload management system for productions Monte Carlo productions, data reconstructions.
Condor-G A Quick Introduction Alan De Smet Condor Project University of Wisconsin - Madison.
July 11-15, 2005, Lecture 3: Grid Job Management. Grid Compute Resources and Job Management.
Review of Condor,SGE,LSF,PBS
APST Internals Sathish Vadhiyar. apstd daemon should be started on the local resource Opens a port to listen for apst client requests Runs on the host.
High-Performance Grid Computing and Research Networking Presented by David Villegas Instructor: S. Masoud Sadjadi
Campus grids: e-Infrastructure within a University Mike Mineter National e-Science Centre 14 February 2006.
Basic Grid Projects - Globus Sathish Vadhiyar Sources/Credits: Project web pages, publications available at Globus site. Some of the figures were also.
Creating and running an application.
Grid Compute Resources and Job Management. 2 How do we access the grid ?  Command line with tools that you'll use  Specialised applications Ex: Write.
File Systems cs550 Operating Systems David Monismith.
Job Submission with Globus, Condor, and Condor-G Selim Kalayci Florida International University 07/21/2009 Note: Slides are compiled from various TeraGrid.
Globus Grid Tutorial Part 2: Running Programs Across Multiple Resources.
Grid Compute Resources and Job Management. 2 Job and compute resource management This module is about running jobs on remote compute resources.
LSF Universus By Robert Stober Systems Engineer Platform Computing, Inc.
Grid Compute Resources and Job Management. 2 Grid middleware - “glues” all pieces together Offers services that couple users with remote resources through.
Jaime Frey Computer Sciences Department University of Wisconsin-Madison What’s New in Condor-G.
Status of Globus activities Massimo Sgaravatto INFN Padova for the INFN Globus group
Manchester Computing Supercomputing, Visualization & eScience Seamless Access to Multiple Datasets Mike AS Jones ● Demo Run-through.
HTCondor’s Grid Universe Jaime Frey Center for High Throughput Computing Department of Computer Sciences University of Wisconsin-Madison.
First evaluation of the Globus GRAM service Massimo Sgaravatto INFN Padova.
A System for Monitoring and Management of Computational Grids Warren Smith Computer Sciences Corporation NASA Ames Research Center.
Workload Management Workpackage
Duncan MacMichael & Galen Deal CSS 534 – Autumn 2016
Peter Kacsuk – Sipos Gergely MTA SZTAKI
Creating and running applications on the NGS
Globus Job Management A: GRAM B: Globus Job Commands C: Laboratory: globusrun.
Part Three: Data Management
Presentation transcript:

Part Five: Globus Job Management

A: GRAM B: Globus Job Commands C: Laboratory: globusrun

A: GRAM

GRAM: What is it?
Given a job specification, GRAM can:
- Create an environment for the job
- Stage files to/from that environment
- Submit the job to a local scheduler
- Monitor the job
- Send job state change notifications
- Stream the job's stdout/err during execution

GRAM: Some Terminology
We speak loosely most of the time, but strictly:
- Globus Job Management Service: starts up and monitors jobs, and stages data in and out
- GRAM: the protocol used to communicate with the job management service
We often say "GRAM" as shorthand for either of these.

GRAM: How Does it Work?
[Diagram: the Client speaks GRAM to the head node, a.k.a. the "gatekeeper." The Gatekeeper authenticates & authorizes the request and starts a Job Manager, which submits the job to the Local Resource Manager and monitors it while it runs as a process on the compute resource. Results flow back to the client.]

GRAM: What is a "Local Resource Manager"?
It's usually a batch system that lets you run jobs across a cluster of computers. Examples:
- Condor
- PBS
- LSF
- Sun Grid Engine
Most systems also let you access "fork":
- It's the default
- It runs the job directly on the gatekeeper: a bad idea in general, but okay for testing

GRAM: RSL
The client describes the job with the Resource Specification Language (RSL):

    & (executable = a.out)
      (directory = /home/nobody)
      (arguments = arg1 "arg 2")

You don't usually need to write RSL directly unless you have special needs.
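For a feel for richer requests, here is a sketch of a fuller RSL string. The extra attributes shown (stdout, stderr, count, environment) are commonly supported by GRAM job managers, but exact support varies by installation, and the paths are invented:

    & (executable = /home/nobody/a.out)
      (directory = /home/nobody)
      (arguments = arg1 "arg 2")
      (stdout = /home/nobody/job.out)
      (stderr = /home/nobody/job.err)
      (count = 4)
      (environment = (MY_SETTING "some value"))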

GRAM: Security
- GRAM uses GSI for security
- Submitting a job requires a full proxy
- The remote system & your job get a limited proxy
- The job will run, because you had a full proxy when you submitted
- But your job cannot submit other jobs of its own
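In practice, the full proxy comes from GSI's command-line tools before you submit anything. A minimal session sketch:

    # Create a full proxy; prompts for your pass phrase.
    # The proxy is short-lived (typically 12 hours by default).
    grid-proxy-init
    # Inspect it: subject, type (full vs. limited), and time left.
    grid-proxy-info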

Making your job batch ready
- It must be able to run in the background: no interactive input, no windows, no GUI, etc.
- It can still use STDIN, STDOUT, and STDERR (the keyboard and the screen), but files are used for these instead of the actual devices
- Organize your data files
- It must be safe to run multiple times, including after an earlier run that ended incomplete
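As a concrete illustration, a batch-ready job often ends up wrapped in a small shell script like this one; the directory layout and file names are invented:

    #!/bin/sh
    # Work in a known directory, and start reruns from a clean slate.
    cd "$HOME/run01" || exit 1
    rm -f job.out job.err
    # All I/O goes through files; nothing reads the keyboard
    # or writes to a window.
    ./a.out < job.in > job.out 2> job.err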

GRAM: Basic Usage

    globus-job-run hostX /bin/hostname

This runs /bin/hostname on hostX. It expects /bin/hostname to already be there.

    globusrun -o -r hostX '&(executable=/bin/echo)(arguments=Hello Grid)'

The quoted string is the RSL; we could specify lots of things in it, but we didn't. Both of these ran under the fork job manager, not an "interesting" batch system.
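The two tools overlap quite a bit: globus-job-run builds the RSL for you, so the same echo job could also be written as (hostX is a placeholder, as above):

    globus-job-run hostX /bin/echo Hello Grid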

GRAM: Running on a Batch System
Append the job manager name to the hostname:

    globus-job-run hostX/jobmanager-condor /bin/hostname

- You will do this for most real work
- The batch system can handle many more jobs
- Batch systems are reliable and track your jobs
- Fork is not reliable, and your job may be lost
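Other job managers follow the same pattern. The exact names depend on what the site's administrators configured, but forms like these are typical:

    globus-job-run hostX/jobmanager-pbs /bin/hostname
    globus-job-run hostX/jobmanager-lsf /bin/hostname
    # fork is the default, so these two are equivalent:
    globus-job-run hostX /bin/hostname
    globus-job-run hostX/jobmanager-fork /bin/hostname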

B: Globus Job Commands

Globus Job Commands
- globus-job-run 'resource-contact' command: run a job and wait for it to finish
- globus-job-submit 'resource-contact' command: submit a batch job and print its job contact
- globus-job-status 'job-contact': check on a submitted job
- globus-job-get-output 'job-contact': retrieve a submitted job's output
- globus-job-clean 'job-contact': clean up after a submitted job
- globusrun: the lower-level tool; takes full RSL
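A typical batch session strings these together. The job contact URL below is purely illustrative; globus-job-submit prints the real one when it accepts the job:

    # Submit a batch job; note the job contact it prints,
    # e.g. https://hostX:40001/12345/1098765432/
    globus-job-submit hostX/jobmanager-condor /bin/hostname
    # Then use that job contact to track and finish the job:
    globus-job-status 'https://hostX:40001/12345/1098765432/'
    globus-job-get-output 'https://hostX:40001/12345/1098765432/'
    globus-job-clean 'https://hostX:40001/12345/1098765432/'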

Lab 5: globusrun

In this lab, you'll:
- Set up your environment for job submission
- Submit simple jobs with globus-job-run and globus-job-submit
- Use globusrun & RSL
- Stage data with globusrun & RSL (see the sketch below)
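As a preview of the staging step: with pre-WS GRAM, globusrun's -s flag starts a local GASS server and exposes its URL through the RSL substitution $(GLOBUSRUN_GASS_URL), so the remote side can pull files from your machine. A sketch, with invented paths (check globusrun -help on your installation for the exact flags):

    globusrun -s -o -r hostX \
      '&(executable=$(GLOBUSRUN_GASS_URL)/home/me/a.out)(arguments=arg1)'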

Credits
NSF disclaimer.
Portions of this presentation were adapted from the following sources:
- Jaime Frey, Condor Group, UW-Madison