Developer APIs to Condor + A Tutorial on Condor’s Web Service Interface Todd Tannenbaum, UW-Madison Matthew Farrellee, Red Hat.

Slides:



Advertisements
Similar presentations
Computer Sciences Department University of Wisconsin-Madison Developer APIs to Condor + A Tutorial.
Advertisements

Three types of remote process invocation
WS-JDML: A Web Service Interface for Job Submission and Monitoring Stephen M C Gough William Lee London e-Science Centre Department of Computing, Imperial.
COM vs. CORBA.
RPC Robert Grimm New York University Remote Procedure Calls.
Web Services Darshan R. Kapadia Gregor von Laszewski 1http://grid.rit.edu.
Using EC2 with HTCondor Todd L Miller 1. › Introduction › Submitting an EC2 job (user tutorial) › New features and other improvements › John Hover talking.
Setting up of condor scheduler on computing cluster Raman Sehgal NPD-BARC.
Condor and GridShell How to Execute 1 Million Jobs on the Teragrid Jeffrey P. Gardner - PSC Edward Walker - TACC Miron Livney - U. Wisconsin Todd Tannenbaum.
1 Workshop 20: Teaching a Hands-on Undergraduate Grid Computing Course SIGCSE The 41st ACM Technical Symposium on Computer Science Education Friday.
GGF Toronto Spitfire A Relational DB Service for the Grid Peter Z. Kunszt European DataGrid Data Management CERN Database Group.
Zach Miller Condor Project Computer Sciences Department University of Wisconsin-Madison Flexible Data Placement Mechanisms in Condor.
Resource Management Reading: “A Resource Management Architecture for Metacomputing Systems”
Zach Miller Computer Sciences Department University of Wisconsin-Madison What’s New in Condor.
Todd Tannenbaum Computer Sciences Department University of Wisconsin-Madison What’s New in Condor.
T Network Application Frameworks and XML Web Services and WSDL Sasu Tarkoma Based on slides by Pekka Nikander.
Rsv-control Marco Mambelli – Site Coordination meeting October 1, 2009.
Todd Tannenbaum Computer Sciences Department University of Wisconsin-Madison Condor.
The Glidein Service Gideon Juve What are glideins? A technique for creating temporary, user- controlled Condor pools using resources from.
COM vs. CORBA Computer Science at Azusa Pacific University September 19, 2015 Azusa Pacific University, Azusa, CA 91702, Tel: (800) Department.
Elisabetta Ronchieri - How To Use The UI command line - 10/29/01 - n° 1 How To Use The UI command line Elisabetta Ronchieri by WP1 elisabetta.ronchieri.
A Guide to Secure Web Services with GJXML Hey I downloade d an IEPD! Cool, how do you write a web service? I use.NET Moo! I use Java.
Condor Tugba Taskaya-Temizel 6 March What is Condor Technology? Condor is a high-throughput distributed batch computing system that provides facilities.
Progress Report Barnett Chiu Glidein Code Updates and Tests (1) Major modifications to condor_glidein code are as follows: 1. Command Options:
1 HawkEye A Monitoring and Management Tool for Distributed Systems Todd Tannenbaum Department of Computer Sciences University of.
Web Server Administration Web Services XML SOAP. Overview What are web services and what do they do? What is XML? What is SOAP? How are they all connected?
Peter Keller Computer Sciences Department University of Wisconsin-Madison Quill Tutorial Condor Week.
Grid Computing I CONDOR.
Compiled Matlab on Condor: a recipe 30 th October 2007 Clare Giacomantonio.
COMP3019 Coursework: Introduction to GridSAM Steve Crouch School of Electronics and Computer Science.
.Net and Web Services Security CS795. Web Services A web application Does not have a user interface (as a traditional web application); instead, it exposes.
Condor Birdbath Web Service interface to Condor
1 Overview of the Application Hosting Environment Stefan Zasada University College London.
Presentation: SOAP/WS in a distributed object framework, Application Servers & AXIS SOAP.
Part 6: (Local) Condor A: What is Condor? B: Using (Local) Condor C: Laboratory: Condor.
Intermediate Condor Rob Quick Open Science Grid HTC - Indiana University.
Condor Project Computer Sciences Department University of Wisconsin-Madison Condor-G Operations.
Grid job submission using HTCondor Andrew Lahiff.
Stuart Wakefield Imperial College London Evolution of BOSS, a tool for job submission and tracking W. Bacchi, G. Codispoti, C. Grandi, INFN Bologna D.
Presentation: SOAP/WS in a distributed object framework, Application Servers & AXIS SOAP.
Todd Tannenbaum Computer Sciences Department University of Wisconsin-Madison Condor RoadMap.
The Roadmap to New Releases Derek Wright Computer Sciences Department University of Wisconsin-Madison
1 Getting popular Figure 1: Condor downloads by platform Figure 2: Known # of Condor hosts.
Todd Tannenbaum Computer Sciences Department University of Wisconsin-Madison Quill / Quill++ Tutorial.
1 Condor BirdBath SOAP Interface to Condor Charaka Goonatilake Department of Computer Science University College London
© Geodise Project, University of Southampton, Geodise Middleware & Optimisation Graeme Pound, Hakki Eres, Gang Xue & Matthew Fairman Summer 2003.
Grid Security: Authentication Most Grids rely on a Public Key Infrastructure system for issuing credentials. Users are issued long term public and private.
July 11-15, 2005Lecture3: Grid Job Management1 Grid Compute Resources and Job Management.
Review of Condor,SGE,LSF,PBS
Condor Week 2004 The use of Condor at the CDF Analysis Farm Presented by Sfiligoi Igor on behalf of the CAF group.
Greg Thain Computer Sciences Department University of Wisconsin-Madison Configuring Quill Condor Week.
© Geodise Project, University of Southampton, Geodise Middleware Graeme Pound, Gang Xue & Matthew Fairman Summer 2003.
Intro to Web Services Dr. John P. Abraham UTPA. What are Web Services? Applications execute across multiple computers on a network.  The machine on which.
.NET Mobile Application Development XML Web Services.
Matthew Farrellee Computer Sciences Department University of Wisconsin-Madison Condor and Web Services.
JSS Job Submission Service Massimo Sgaravatto INFN Padova.
Condor Project Computer Sciences Department University of Wisconsin-Madison Condor Job Router.
Grid Workload Management (WP 1) Massimo Sgaravatto INFN Padova.
WP3 Implementing R-GMA grid services in GT3 Abdeslem Djaoui & WP3 Grid Services Task Force 7 th EU Datagrid meeting 26/09/2003
Copyright 2007, Information Builders. Slide 1 iWay Web Services and WebFOCUS Consumption Michael Florkowski Information Builders.
HTCondor’s Grid Universe Jaime Frey Center for High Throughput Computing Department of Computer Sciences University of Wisconsin-Madison.
Core and Framework DIRAC Workshop October Marseille.
Condor Project Computer Sciences Department University of Wisconsin-Madison Extend/alter Condor via developer APIs/plugins CERN Feb Todd Tannenbaum.
Quick Architecture Overview INFN HTCondor Workshop Oct 2016
Operating a glideinWMS frontend by Igor Sfiligoi (UCSD)
WEB SERVICES.
T Network Application Frameworks and XML Web Services and WSDL Sasu Tarkoma Based on slides by Pekka Nikander.
Basic Grid Projects – Condor (Part I)
Condor-G Making Condor Grid Enabled
Condor-G: An Update.
Presentation transcript:

Developer APIs to Condor + A Tutorial on Condor’s Web Service Interface Todd Tannenbaum, UW-Madison Matthew Farrellee, Red Hat

2 Interfacing Applications w/ Condor › Suppose you have an application which needs a lot of compute cycles › You want this application to utilize a pool of machines › How can this be done?

3 Some Condor APIs › Command Line tools  condor_submit, condor_q, etc › DRMAA › Condor GAHP › JSDL › RDBMS › Condor Perl Module › Event Log › SOAP

4 Command Line Tools › Don’t underestimate them! › Your program can create a submit file on disk and simply invoke condor_submit: system(“echo universe=VANILLA > /tmp/condor.sub”); system(“echo executable=myprog >> /tmp/condor.sub”);... system(“echo queue >> /tmp/condor.sub”); system(“condor_submit /tmp/condor.sub”);

5 Command Line Tools › Your program can create a submit file and give it to condor_submit through stdin: PERL:fopen(SUBMIT, “|condor_submit”); print SUBMIT “universe=VANILLA\n”;... C/C++:int s = popen(“condor_submit”, “r+”); write(s, “universe=VANILLA\n”, 17/*len*/);...

6 Command Line Tools › Using the +Attribute with condor_submit: universe = VANILLA executable = /bin/hostname output = job.out log = job.log +webuser = “ zmiller ” queue

7 Command Line Tools › Use -constraint and –format with condor_q: % condor_q -constraint 'webuser=="zmiller"' -- Submitter: bio.cs.wisc.edu : : bio.cs.wisc.edu ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD zmiller 10/11 06: :00:00 I hostname % condor_q -constraint 'webuser=="zmiller"' -format "%i\t" ClusterId -format "%s\n" Cmd /bin/hostname

8 Command Line Tools › condor_wait will watch a job log file and wait for a certain (or all) jobs to complete: system(“condor_wait job.log”); › can specify a timeout

9 Command Line Tools › condor_q and condor_status –xml option › So it is relatively simple to build on top of Condor’s command line tools alone, and can be accessed from many different languages (C, PERL, python, PHP, etc). › However…

10 DRMAA › DRMAA is a OGF standardized job- submission API › Has C (and now Java) bindings › Is not Condor-specific › SourceForge Project

11 DRMAA › Unfortunately, the DRMAA 1.x API does not support some very important features, such as:  Fault tolerance  Transactions

12 Condor GAHP › The Condor GAHP is a relatively low-level protocol based on simple ASCII messages through stdin and stdout › Supports a rich feature set including two-phase commits, transactions, and optional asynchronous notification of events Good stuff, Todd!

13 GAHP, cont Example: R: $GahpVersion: Nov NCSA\ CoG\ Gahpd $ S: GRAM_PING 100 vulture.cs.wisc.edu/fork R: E S: RESULTS R: E S: COMMANDS R: S COMMANDS GRAM_JOB_CANCEL GRAM_JOB_REQUEST GRAM_JOB_SIGNAL GRAM_JOB_STATUS GRAM_PING INITIALIZE_FROM_FILE QUIT RESULTS VERSION S: VERSION R: S $GahpVersion: Nov NCSA\ CoG\ Gahpd $ S: INITIALIZE_FROM_FILE /tmp/grid_proxy_ txt R: S S: GRAM_PING 100 vulture.cs.wisc.edu/fork R: S S: RESULTS R: S 0 S: RESULTS R: S 1 R: S: QUIT R: S

14 JSDL and Condor › GridSAM: open source web service for job submission and monitoring › Condor plugin for GridSAM enables JSDL submissions to Condor.  This plugin uses Condor’s Web Service API

15 RDMS: Quill › Condor operational data mirrored into an RDBMS › Job, machine, historical info, … › Read-only › Load balancing benefits QuillSchedd Job Queue log RDBMS Startd … Master Queue + History Tables

16 Condor Perl Module › Perl module to parse the “job log file” › Can use instead of polling w/ condor_q › Call-back event model › (Note: job log can be written in XML)

Event Logging Condor can generate an “Event Log”:  This is a log of significant events about the life of the jobs on a system.  Competing objectives:  Limit the amount of space used by the logging, so that event logging doesn’t become a problem itself  Never “drop” events  C++ reader library

18 Web Service Interface › Simple Object Access Protocol  Mechanism for doing RPC using XML (typically over HTTP or HTTPS)  A World Wide Web Consortium (W3C) standard › SOAP Toolkit: Transform a WSDL to a client library

19 Benefits of a Condor SOAP API › Can be accessed with standard web service tools › Condor accessible from platforms where its command-line tools are not supported › Talk to Condor with your favorite language and SOAP toolkit

20 Condor SOAP API functionality › Get basic daemon info (version, platform) › Submit jobs › Retrieve job output › Remove/hold/release jobs › Query machine status › Advertise resources › Query job status

21 Getting machine status via SOAP Your program SOAP library queryStartdAds() condor_collector Machine List SOAP over HTTP

22 Lets get some details…

23 The API › Core API, described with WSDL, is designed to be as flexible as possible  File transfer is done in chunks  Transactions are explicit › Wrapper libraries aim to make common tasks as simple as possible  Currently in Java and C#  Expose an object-oriented interface

24 Things we will cover › Condor setup › Necessary tools › Job Submission › Job Querying › Job Retrieval › Authentication with SSL and X.509

25 Condor setup › Start with a working condor_config › The SOAP interface is off by default  Turn it on by adding ENABLE_SOAP=TRUE › Access to the SOAP interface is denied by default  Set ALLOW_SOAP and DENY_SOAP, they work like ALLOW_READ/WRITE/…  Example: ALLOW_SOAP=*/*.cs.wisc.edu

26 Necessary tools › You need a SOAP toolkit  Apache Axis (Java) -  Microsoft.Net -  gSOAP (C/C++) -  ZSI (Python) -  SOAP::Lite (Perl) - › You need Condor’s WSDL files  Find them in lib/webservice/ in your Condor release › Put the two together to generate a client library  $ java org.apache.axis.wsdl.WSDL2Java condorSchedd.wsdl › Compile that client library  $ javac condor/*.java All our examples are in Java using Apache Axis

27 Client wrapper libraries › The core API has some complex spots › A wrapper library is available in Java and C#  Makes the API a bit easier to use (e.g. simpler file transfer & job ad submission)  Makes the API more OO, no need to remember and pass around transaction ids › We are going to use the Java wrapper library for our examples  You can download it from

28 Submitting a job › The CLI way… universe = vanilla executable = /bin/cp arguments = cp.sub cp.worked should_transfer_files = yes transfer_input_files = cp.sub when_to_transfer_output = on_exit queue 1 $ condor_submit cp.sub cp.sub: Explicit bits clusterid = X procid = Y owner = matt requirements = Z Implicit bits

29 Repeat to submit multiple jobs in a single cluster Repeat to submit multiple clusters The SOAP way… 1.Begin transaction 2.Create cluster 3.Create job 4.Send files 5.Describe job 6.Commit transaction Submitting a job

Begin transaction 2. Create cluster 3. Create job 4&5. Send files & describe job 6. Commit transaction Schedd schedd = new Schedd(“ Transaction xact = schedd.createTransaction(); xact.begin(30); int cluster = xact.createCluster(); int job = xact.createJob(cluster); File[] files = { new File(“cp.sub”) }; xact.submit(cluster, job, “owner”, UniverseType.VANILLA, “/bin/cp”, “cp.sub cp.worked”, “requirements”, null, files); xact.commit(); Submission from Java

31 Schedd’s location Max time between calls (seconds) Job owner, e.g. “matt” Requirements, e.g. “OpSys==\“Linux\”” Extra attributes, e.g. Out=“stdout.txt” or Err=“stderr.txt” Schedd schedd = new Schedd(“ Transaction xact = schedd.createTransaction(); xact.begin(30); int cluster = xact.createCluster(); int job = xact.createJob(cluster); File[] files = { new File("cp.sub") }; xact.submit(cluster, job, “owner”, UniverseType.VANILLA, “/bin/cp”, “cp.sub cp.worked”, “requirements”, null, files); xact.commit(); Submission from Java

32 Querying jobs › The CLI way… $ condor_q -- Submitter: localhost : : localhost ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD 1.0 matt 10/27 14: :46:42 C sleep … 42 jobs; 1 idle, 1 running, 1 held, 1 unexpanded

33 Also, getJobAds given a constraint, e.g. “Owner==\“matt\”” String[] statusName = { “”, “Idle”, “Running”, “Removed”, “Completed”, “Held” }; int cluster = 1; int job = 0; Schedd schedd = new Schedd(“ ClassAd ad = new ClassAd(schedd.getJobAd(cluster, job)); int status = Integer.valueOf(ad.get(“JobStatus”)); System.out.println(“Job is “ + statusName[status]); Querying jobs › The SOAP way from Java…

34 Retrieving a job › The CLI way.. › Well, if you are submitting to a local Schedd, the Schedd will have all of a job’s output written back for you › If you are doing remote submission you need condor_transfer_data, which takes a constraint and transfers all files in spool directories of matching jobs

35 Discover available files Remote file Local file Retrieving a job › The SOAP way in Java… int cluster = 1; int job = 0; Schedd schedd = new Schedd(“ Transaction xact = schedd.createTransaction(); xact.begin(30); FileInfo[] files = xact.listSpool(cluster, job); for (FileInfo file : files) { xact.getFile(cluster, job, file.getName(), file.getSize(), new File(file.getName())); } xact.commit();

36 Authentication for SOAP › Authentication is done via mutual SSL authentication  Both the client and server have certificates and identify themselves › It is not always necessary, e.g. in some controlled environments (a portal) where the submitting component is trusted › A necessity in an open environment -- remember that the submit call takes the job’s owner as a parameter  Imagine what happens if anyone can submit to a Schedd running as root…

37 Live DEMOS! A live demo Todd? You are nuts!!

38 Thank you! Questions?

39 Details on setting up authenticated SOAP over HTTPS

40 Authentication setup › Create and sign some certificates › Use OpenSSL to create a CA  CA.sh -newca › Create a server cert and password-less key  CA.sh -newreq && CA.sh -sign  mv newcert.pem server-cert.pem  openssl rsa -in newreq.pem -out server-key.pem › Create a client cert and key  CA.sh -newreq && CA.sh -sign && mv newcert.pem client-cert.pem && mv newreq.pem client-key.pem

41 Authentication config › Config options…  ENABLE_SOAP_SSL is FALSE by default  _SOAP_SSL_PORT Set this to a different port for each SUBSYS you want to talk to over ssl, the default is a random port Example: SCHEDD_SOAP_SSL_PORT=1980  SOAP_SSL_SERVER_KEYFILE is required and has no default The file containing the server’s certificate AND private key, i.e. “keyfile” after cat server-cert.pem server-key.pem > keyfile

42 Authentication config › Config options continue…  SOAP_SSL_CA_FILE is required The file containing public CA certificates used in signing client certificates, e.g. demoCA/cacert.pem › All options except SOAP_SSL_PORT have an optional SUBSYS_* version  For instance, turn on SSL for everyone except the Collector with ENABLE_SOAP_SSL=TRUE COLLECTOR_ENABLE_SOAP_SSL=FALSE

43 One last bit of config › The certificates we generated have a principal name, which is not standard across many authentication mechanisms › Condor maps authenticated names (here, principal names) to canonical names that are authentication method independent › This is done through mapfiles, given by SEC_CANONICAL_MAPFILE and SEC_USER_MAPFILE › Canonical map: SSL. \1 › User map: (.*) \1 › “SSL” is the authentication method, “.* Address….*” is a pattern to match against authenticated names, and “\1” is the canonical name, in this case the username on the in the principal

44 HTTPS with Java › Setup keys…  keytool -import -keystore truststore -trustcacerts -file demoCA/cacert.pem  openssl pkcs12 -export -inkey client-key.pem -in client- cert.pem -out keystore › All the previous code stays the same, just set some properties  javax.net.ssl.trustStore, javax.net.ssl.keyStore, javax.net.ssl.keyStoreType, javax.net.ssl.keyStorePassword  Example: java -Djavax.net.ssl.trustStore=truststore - Djavax.net.ssl.keyStore=keystore - Djavax.net.ssl.keyStoreType=PKCS12 - Djavax.net.ssl.keyStorePassword=pass Example