Condor Birdbath Web Service interface to Condor


Clovis Chapman, Charaka Goonatilake, Wolfgang Emmerich, Matthew Farrellee, Todd Tannenbaum, Miron Livny, Mark Calleja and Martin Dove
c.chapman@cs.ucl.ac.uk

Condor Overview
- High-throughput computing resource manager and job scheduler.
- Allows idle computing resources to be exploited, etc. …
- UCL Condor pool: ~940 Windows Terminal Servers spread over 30 student clusters (3 years running)
- CamGrid: ~140 Linux machines spread over 8 pools brought together through flocking
[Figure: jobs (command-line), submission machine (schedules jobs), central manager, execution machines; pools shown: UCL (940 machines), CamGrid]

Motivations
- Facilitate the development of third-party tools and applications capable of interacting with Condor remotely.
  - E.g. build a higher-level, application-specific scheduler that submits jobs to one or more Condor pools according to application semantics.
  - These can be built using a wide range of languages/SOAP packages. Birdbath has been tested on: Java (Apache Axis, XSUL), Python (ZSI), C# (.Net), C/C++ (gSOAP).
  - Can also use Web Service based workflow modelling languages such as BPEL (Business Process Execution Language).
- Build Web Service support within the Condor architecture itself,
  - i.e. rather than rely on external "wrappers" (e.g. Globus Toolkit).
- Expose a wide range of high-throughput computing services based on Condor capabilities (e.g. workload/queue management, scheduling, checkpointing, job migration, etc.).
  - Rather than wrap up a Condor pool, provide different services.
Notes: Mention embedding grid considerations within the Condor architecture. Mustn't affect the features that made it popular.

The big picture
[Figure: BPEL, Condor-G and Globus/gridSAM clients; SOAP links to Schedds; flocking; Site A: CamGrid, Site B: UCL, Site C: RI, Site D: Bath]

Why SOAP? Alternatives
- Building wrappers around command line tools: Globus Toolkit 3
  - Good when interacting with different types of resource managers.
  - Otherwise Birdbath has a richer, more scalable and streamlined interface which is more suited to Condor.
- Condor's GAHP protocol: ASCII stream-based protocol, built within Condor itself.
  - Simple Condor-specific protocol for remote submission.
  - No type safety / interface declaration / session management.
- DRMAA API: built-in support
  - API for the submission and control of jobs to one or more Distributed Resource Management Systems, defined by the GGF.
  - Allows access to Condor through local procedure calls.
  - C binding only / weak fault tolerance.

Condor Architecture overview
[Figure: Central manager (Negotiator, Collector); Submission machine(s) (Schedd, Shadow); Execution machine(s) (Startd, Starter); Job ClassAd; Resource ClassAd]

Condor Architecture overview: ClassAds

Job ClassAd:
  Universe = Vanilla
  Requirements = (OpSys == "WINNT50") \
                 && (Arch == "INTEL") \
                 && (Memory > 256)
  Executable = cpusoak.exe
  Input = input.txt
  Output = output.txt
  TransferFiles = IF_NEEDED

Resource ClassAd:
  Arch = "INTEL"
  OpSys = "WINNT50"
  State = "Unclaimed"
  Activity = "Idle"
  LoadAvg = 0.000260
  Memory = 256
  Cpus = 1
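The matching of a job's Requirements against a resource ClassAd can be sketched in Java. This is a minimal illustration, not Condor's ClassAd evaluator: the class name, the map-based ad and the hand-translated predicate are assumptions made for the example; only the attribute names and values come from the slide.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: real ClassAds are evaluated by Condor's own
// expression engine. Here the job's Requirements expression from the slide
// is hand-translated into a Java predicate over a resource ad.
class ClassAdMatchSketch {

    // Resource ClassAd from the slide, held as a plain attribute map.
    static Map<String, Object> slideResourceAd() {
        Map<String, Object> ad = new HashMap<>();
        ad.put("Arch", "INTEL");
        ad.put("OpSys", "WINNT50");
        ad.put("State", "Unclaimed");
        ad.put("Activity", "Idle");
        ad.put("LoadAvg", 0.000260);
        ad.put("Memory", 256);
        ad.put("Cpus", 1);
        return ad;
    }

    // (OpSys == "WINNT50") && (Arch == "INTEL") && (Memory > 256)
    // Simplification: strings are compared with exact equality.
    static boolean jobRequirements(Map<String, Object> resource) {
        Object memory = resource.get("Memory");
        return "WINNT50".equals(resource.get("OpSys"))
                && "INTEL".equals(resource.get("Arch"))
                && memory instanceof Integer
                && (Integer) memory > 256;
    }

    public static void main(String[] args) {
        Map<String, Object> resource = slideResourceAd();
        System.out.println("slide machine matches: " + jobRequirements(resource));
        resource.put("Memory", 512); // hypothetical larger machine
        System.out.println("512 MB machine matches: " + jobRequirements(resource));
    }
}
```

With the values exactly as on the slide, the strict comparison Memory > 256 rejects a machine advertising Memory = 256, so the first check is false and the second is true.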

Condor SOAP extensions
Remote client operations:
- Obtain pool characteristics (resource ClassAds)
- Submit job (send job ClassAd)
- Monitor job (obtain job ClassAds)
- Get/send file
- Cancel/hold/release job
[Figure: Remote client; Central manager (Collector, Negotiator); Submission machine(s) (Schedd, Shadow); Execution machines]

Clients for the Condor Web Services
- All client libraries can be generated automatically from the interface description provided by us (WSDL).
- The WSDL interface defines which operations can be called and their parameters (e.g. queryAds(Constraint)).
- Alternatively use a Web Service based orchestration language like BPEL (Business Process Execution Language).
[Figure: Custom client, SOAP library, holdJob() call, Schedd on the submission machine(s)]

Quick application example: Collector Query
- Using the Axis API.
- Generate stubs from the WSDL file using WSDL2Java.
- E.g. obtain information about all resources that have over 512 MB RAM:

  locator = new CondorLocator();
  collector = locator.getCollector(new URL("http://server.cs.ucl.ac.uk:8080"));
  ads = collector.QueryStartdAds("Memory>512");

Quick application example: Job Submission
Transaction-based process:
  New Transaction()
  New Job ID()
  Send Files({input files, binaries})
  Send Job Description(ClassAd)
  Commit Transaction()
- File transfer is in segments (specified by users), using a simple protocol.
- Two-phase commit for increased robustness.
- Can also submit DAGMan jobs.
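The five steps above can be sketched in Java against a stand-in scheduler. Everything here is hypothetical: InMemorySchedd and its method names are invented for illustration and are not the Birdbath WSDL operations. The sketch shows staging, segmented file transfer and commit/abort only; it does not implement the two-phase commit protocol itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: InMemorySchedd is a hypothetical stand-in for the
// remote Schedd so the flow can run without a Condor pool.
class SubmissionSketch {

    static class InMemorySchedd {
        final Map<Integer, String> queue = new LinkedHashMap<>(); // committed jobs
        final Map<String, byte[]> spool = new LinkedHashMap<>();  // committed files
        private Map<Integer, String> stagedJobs;                  // visible only after commit
        private Map<String, byte[]> stagedFiles;
        private int nextJobId = 1;

        void beginTransaction() {
            stagedJobs = new LinkedHashMap<>();
            stagedFiles = new LinkedHashMap<>();
        }

        int newJobId() {
            return nextJobId++;
        }

        // Appends one segment to the staged copy of the named file.
        void sendFileSegment(String name, byte[] segment) {
            byte[] soFar = stagedFiles.getOrDefault(name, new byte[0]);
            byte[] joined = Arrays.copyOf(soFar, soFar.length + segment.length);
            System.arraycopy(segment, 0, joined, soFar.length, segment.length);
            stagedFiles.put(name, joined);
        }

        void sendJobDescription(int jobId, String classAd) {
            stagedJobs.put(jobId, classAd);
        }

        void commitTransaction() {
            queue.putAll(stagedJobs);
            spool.putAll(stagedFiles);
            abortTransaction(); // clear the staging area
        }

        void abortTransaction() {
            stagedJobs = null;
            stagedFiles = null;
        }
    }

    // Runs the steps from the slide; segmentSize is the user-chosen segment size.
    static int submit(InMemorySchedd schedd, String fileName, byte[] data,
                      String classAd, int segmentSize) {
        if (segmentSize <= 0) {
            throw new IllegalArgumentException("segmentSize must be positive");
        }
        schedd.beginTransaction();
        try {
            int jobId = schedd.newJobId();
            for (int offset = 0; offset < data.length; offset += segmentSize) {
                int end = Math.min(data.length, offset + segmentSize);
                schedd.sendFileSegment(fileName, Arrays.copyOfRange(data, offset, end));
            }
            schedd.sendJobDescription(jobId, classAd);
            schedd.commitTransaction();
            return jobId;
        } catch (RuntimeException e) {
            schedd.abortTransaction(); // nothing staged becomes visible
            throw e;
        }
    }

    public static void main(String[] args) {
        InMemorySchedd schedd = new InMemorySchedd();
        byte[] input = "0123456789".getBytes(StandardCharsets.US_ASCII);
        int jobId = submit(schedd, "input.txt", input, "Executable = cpusoak.exe", 4);
        System.out.println("job " + jobId + " queued: " + schedd.queue.containsKey(jobId));
        System.out.println("spooled bytes: " + schedd.spool.get("input.txt").length);
    }
}
```

Keeping files and the job description in a staging area until the commit means a client that fails midway leaves no half-submitted job in the queue.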

Available in Condor 6.7.5 dev releases
Simply add the following to your configuration file:

  ENABLE_SOAP = TRUE
  ENABLE_WEB_SERVER = TRUE
  WEB_ROOT_DIR = $(RELEASE_DIR)/lib/webservice

OMII Collaboration: GridSAM plugin
- Web Service based job submission service
- Uses JSDL (Job Submission Description Language) for standardized job submission to a wide range of resource managers
[Figure: Client submits JSDL over SOAP across the network to the GridSAM Web Service; Job Manager with SGE, Condor and LSF connectors]

Example JSDL:

  <?xml version="1.0" encoding="UTF-8"?>
  <JobDefinition xmlns="http://…">
    <JobDescription>
      <Application>
        <Executable>/bin/echo</Executable>
        <Argument>hello world</Argument>
      </Application>
    </JobDescription>
  </JobDefinition>
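As a rough idea of what a connector has to do with such a document, the Java sketch below parses the slide's JSDL and emits two Condor-style job description lines. It is an assumption-laden illustration, not GridSAM's actual mapping: the class and method names are invented, and the namespace URI is a placeholder because the slide elides the real one.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Illustrative sketch only: shows the general JSDL -> job description idea,
// not the translation GridSAM really performs.
class JsdlSketch {

    // The slide's JSDL; the xmlns value is a placeholder, not the real URI.
    static final String SLIDE_JSDL =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
          + "<JobDefinition xmlns=\"urn:example:placeholder\">"
          + "<JobDescription><Application>"
          + "<Executable>/bin/echo</Executable>"
          + "<Argument>hello world</Argument>"
          + "</Application></JobDescription>"
          + "</JobDefinition>";

    // Extracts the executable and argument and renders them as
    // "Executable = ..." / "Arguments = ..." lines.
    static String toSubmitDescription(String jsdl) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(jsdl.getBytes(StandardCharsets.UTF_8)));
            String executable =
                    doc.getElementsByTagName("Executable").item(0).getTextContent();
            String argument =
                    doc.getElementsByTagName("Argument").item(0).getTextContent();
            return "Executable = " + executable + "\n"
                 + "Arguments = " + argument + "\n";
        } catch (Exception e) {
            // Covers malformed XML and missing elements alike.
            throw new IllegalArgumentException("not a usable JSDL document", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(toSubmitDescription(SLIDE_JSDL));
    }
}
```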

Condor WS plug-in for GridSAM
Advantages:
- Independent deployment of GridSAM
- Submission to multiple schedulers: load balancing
- Increased robustness and fault handling
- Access to a wider range of Condor functionality: Condor-G, etc.
[Figure: Client, network, GridSAM Web Service, Job Manager with SGE, WS-Condor and LSF connectors; the WS-Condor connector reaches the Schedd of the UCL Condor pool over the network]
[Figure, next build: the same layout with Schedds at the UCL pool, CamGrid and RI]
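One simple way to picture "submission to multiple schedulers" is a least-loaded choice among Schedds. The Java sketch below is hypothetical: the slides do not say which balancing policy the plug-in uses, and the queue lengths are made-up numbers that a real client might instead derive from job ClassAd queries.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: a least-loaded policy chosen for the example,
// not the policy used by the Condor WS plug-in.
class ScheddPickerSketch {

    // Returns the Schedd with the fewest queued jobs; ties go to the first listed.
    static String leastLoaded(Map<String, Integer> queuedJobsBySchedd) {
        if (queuedJobsBySchedd.isEmpty()) {
            throw new IllegalArgumentException("no Schedds to choose from");
        }
        String best = null;
        int bestLoad = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> entry : queuedJobsBySchedd.entrySet()) {
            if (best == null || entry.getValue() < bestLoad) {
                best = entry.getKey();
                bestLoad = entry.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up queue lengths for the three sites named on the slide.
        Map<String, Integer> loads = new LinkedHashMap<>();
        loads.put("UCL", 40);
        loads.put("CamGrid", 12);
        loads.put("RI", 12);

        String target = leastLoaded(loads);
        loads.merge(target, 1, Integer::sum); // account for the job just sent
        System.out.println("submit to " + target + ", queue lengths now " + loads);
    }
}
```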

BPEL orchestration of scientific workflows

Still to be done
- Security (SSL for now)
  - Certificate-based security
  - Using mappings to local accounts
- Take advantage of additional capabilities of WSRF and WS-Notification (scheduler instances, etc.)
- Exposing more components (e.g. shadow for remote checkpointing and retrieval of partial output files)

Download
- Condor SOAP interface, available from version 6.7.5: http://www.cs.wisc.edu/condor/
- Condor-WS GridSAM, soon to be provided with GridSAM: http://www.lesc.ic.ac.uk/gridsam