Using New Features in Condor 7.2
Condor Project, Computer Sciences Department, University of Wisconsin-Madison
Outline › Startd Hooks › Job Router › Job Router Hooks › Power Management › Dynamic Slot Partitioning › Concurrency Limits › Variable Substitution › Preemption Attributes 2
Startd Job Hooks › Users wanted to take advantage of Condor’s resource management daemon (condor_startd) to run jobs, but they had their own scheduling systems: specialized scheduling needs, and jobs that live in their own database or other storage rather than in a Condor job queue 3
Our solution › Make a system of generic “hooks” that you can plug into: a hook is a point during the life-cycle of a job where the Condor daemons will invoke an external program, letting you hook Condor to your existing job management system without modifying the Condor code 4
How does Condor communicate with hooks? › By passing ASCII ClassAds via standard input and standard output › Some hooks get control data via a command-line argument (argv) › Hooks can be written in any language (scripts, binaries, whatever you want) so long as you can read and write stdin/stdout 5
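To make the protocol concrete, here is a minimal hook skeleton in Python (a hypothetical example, not part of Condor; the attribute parsing is deliberately simplified):

#!/usr/bin/env python
# Hypothetical hook skeleton: Condor hands us ASCII ClassAds as
# "Name = Value" lines on stdin; whatever we print on stdout is
# read back by the daemon that invoked us.
import sys

def read_classad(stream):
    ad = {}
    for line in stream:
        line = line.strip()
        if not line:
            break
        name, _, value = line.partition("=")
        ad[name.strip()] = value.strip()
    return ad

slot_ad = read_classad(sys.stdin)
# Decide what to do based on slot_ad, then emit a (toy) job ad:
print('Cmd = "/bin/sleep"')
print('Arguments = "60"')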
What hooks are available? › Hooks for fetching work (startd): FETCH_WORK, REPLY_FETCH, EVICT_CLAIM › Hooks for running jobs (starter): PREPARE_JOB, UPDATE_JOB_INFO, JOB_EXIT 6
HOOK_FETCH_WORK › Invoked by the startd whenever it wants to try to fetch new work (how often is controlled by the FetchWorkDelay expression) › Stdin: slot ClassAd › Stdout: job ClassAd › If stdout is empty, there’s no work 7
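A sketch of a fetch-work hook in Python (hypothetical; it assumes pending jobs are stored as ClassAd files in a spool directory of your own choosing):

#!/usr/bin/env python
# Hypothetical HOOK_FETCH_WORK: the slot ClassAd arrives on
# stdin; if we have work, print a job ClassAd on stdout,
# otherwise print nothing and the startd sees no work.
import os
import sys

SPOOL = "/var/lib/myqueue"       # assumed location of pending job ads

slot_ad = sys.stdin.read()       # slot ClassAd (unused in this sketch)

pending = sorted(os.listdir(SPOOL))
if pending:
    # Hand the oldest pending job ad to the startd.
    with open(os.path.join(SPOOL, pending[0])) as f:
        sys.stdout.write(f.read())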
HOOK_REPLY_FETCH › Invoked by the startd once it decides what to do with the job ClassAd returned by HOOK_FETCH_WORK › Gives your external system a chance to know what happened › argv[1]: “accept” or “reject” › Stdin: slot and job ClassAds › Stdout: ignored 8
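A matching HOOK_REPLY_FETCH sketch in Python (hypothetical; it just records the verdict so the external system can mark the job claimed or requeue it):

#!/usr/bin/env python
# Hypothetical HOOK_REPLY_FETCH: argv[1] is "accept" or
# "reject"; the slot and job ClassAds arrive on stdin; anything
# we print is ignored.
import sys

verdict = sys.argv[1]
ads = sys.stdin.read()           # slot + job ClassAds (unused here)

with open("/var/lib/myqueue/replies.log", "a") as log:  # assumed path
    log.write(verdict + "\n")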
HOOK_EVICT_CLAIM › Invoked if the startd has to evict a claim that’s running fetched work › Informational only: you can’t stop or delay this train once it’s left the station › Stdin: both slot and job ClassAds › Stdout: ignored 9
HOOK_PREPARE_JOB › Invoked by the condor_starter when it first starts up (only if defined) › Opportunity to prepare the job execution environment: transfer input files, executables, etc. › Stdin: both slot and job ClassAds › Stdout: ignored, but the starter won’t continue until this hook exits › Not specific to fetched work 10
HOOK_UPDATE_JOB_INFO › Periodically invoked by the starter to let you know what’s happening with the job › Stdin: slot and job ClassAds (the job ClassAd is updated with additional attributes computed by the starter: ImageSize, JobState, RemoteUserCpu, etc.) › Stdout: ignored 11
HOOK_JOB_EXIT › Invoked by the starter whenever the job exits for any reason › argv[1] indicates what happened: “exit” (died a natural death), “evict” (booted off prematurely by the startd: PREEMPT == TRUE, condor_off, etc.), “remove” (removed by condor_rm), “hold” (held by condor_hold) 12
HOOK_JOB_EXIT … › “HUH!?! condor_rm? What are you talking about?” The starter hooks can be defined even for regular Condor jobs, local universe, etc. › Stdin: copy of the job ClassAd with extra attributes about what happened: ExitCode, JobDuration, etc. › Stdout: ignored 13
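A HOOK_JOB_EXIT sketch in Python (hypothetical; it records how each job ended so an external system could update its own records):

#!/usr/bin/env python
# Hypothetical HOOK_JOB_EXIT: argv[1] is "exit", "evict",
# "remove", or "hold"; the job ClassAd (with ExitCode,
# JobDuration, etc.) arrives on stdin; stdout is ignored.
import sys

reason = sys.argv[1]
job_ad = sys.stdin.read()

with open("/var/lib/myqueue/exits.log", "a") as log:    # assumed path
    log.write("job finished (%s):\n%s\n" % (reason, job_ad))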
Defining hooks › Each slot can have its own hook “keyword”: the prefix for its config file parameters › Different slots can use different sets of hooks to talk to different external systems › A global keyword is used when the per-slot keyword is not defined › The keyword is inserted by the startd into its copy of the job ClassAd and given to the starter 14
Defining hooks: example
# Most slots fetch work from the database system
STARTD_JOB_HOOK_KEYWORD = DATABASE
# Slot4 fetches and runs work from a web service
SLOT4_JOB_HOOK_KEYWORD = WEB
# The database system needs to both provide work and
# know the reply for each attempted claim
DB_DIR = /usr/local/condor/fetch/db
DATABASE_HOOK_FETCH_WORK = $(DB_DIR)/fetch_work.php
DATABASE_HOOK_REPLY_FETCH = $(DB_DIR)/reply_fetch.php
# The web system only needs to fetch work
WEB_DIR = /usr/local/condor/fetch/web
WEB_HOOK_FETCH_WORK = $(WEB_DIR)/fetch_work.php 15
Semantics of fetched jobs › The condor_startd treats them just like any other kind of job: all the standard resource policy expressions apply (START, SUSPEND, PREEMPT, RANK, etc.), and fetched jobs can coexist in the same pool with jobs pushed by Condor, COD, etc. Fetched work != Backfill 16
Semantics continued › If the startd is unclaimed and fetches a job, a claim is created › If that job completes, the claim is reused and the startd fetches again › It keeps fetching until either the claim is evicted by Condor or the fetch hook returns no more work 17
Limitations of the hooks › If the starter can’t run your fetched job because your ClassAd is bogus, no hook is invoked to tell you about it (we need a HOOK_STARTER_FAILURE) › No hook is invoked when the starter is about to evict you (so you can checkpoint), but you can implement this yourself with a wrapper script and the SoftKillSig attribute, as sketched below 18
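One possible wrapper, sketched in Python (hypothetical; it assumes SoftKillSig is left at its SIGTERM default and that the job checkpoints when forwarded that signal):

#!/usr/bin/env python
# Hypothetical job wrapper: run the real job as a child; when the
# starter sends the soft-kill signal (SIGTERM by default), give
# the job a chance to checkpoint before we exit.
import signal
import subprocess
import sys

job = subprocess.Popen(sys.argv[1:])      # the real job command line

def on_soft_kill(signum, frame):
    job.send_signal(signal.SIGTERM)       # let the job checkpoint
    job.wait()
    sys.exit(1)

signal.signal(signal.SIGTERM, on_soft_kill)
sys.exit(job.wait())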
Job Router › Automated way to let jobs run on a wider array of resources: transform jobs into different forms, reroute jobs to different destinations 19
What is “job routing”? › The JobRouter consults its routing table of sites and transforms the original (vanilla) job into a routed (grid) job; the final status flows back to the original job. For example:
Original (vanilla) job:
Universe = "vanilla"
Executable = "sim"
Arguments = "seed=345"
Output = "stdout.345"
Error = "stderr.345"
ShouldTransferFiles = True
WhenToTransferOutput = "ON_EXIT"
Routed (grid) job:
Universe = "grid"
GridType = "gt2"
GridResource = "cmsgrid01.hep.wisc.edu/jobmanager-condor"
Executable = "sim"
Arguments = "seed=345"
Output = "stdout"
Error = "stderr"
ShouldTransferFiles = True
WhenToTransferOutput = "ON_EXIT" 20
Routing is just site-level matchmaking › With feedback from the job queue: number of jobs currently routed to site X, number of idle jobs routed to site X, rate of recent success/failure at site X › And with the power to modify the job ad: change attribute values (e.g. Universe), insert new attributes (e.g. GridResource), add a “portal” grid proxy if desired 21
Configuring the Routing Table › JOB_ROUTER_ENTRIES: list site ClassAds in the configuration file › JOB_ROUTER_ENTRIES_FILE: read site ClassAds periodically from a file › JOB_ROUTER_ENTRIES_CMD: read them periodically from a script (example: query a collector such as the Open Science Grid Resource Selection Service); see the sketch below 22
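A minimal JOB_ROUTER_ENTRIES_CMD sketch in Python (hypothetical; a real script would query a collector or information service instead of a hard-coded site list):

#!/usr/bin/env python
# Hypothetical JOB_ROUTER_ENTRIES_CMD script: print one
# new-style ClassAd per route on stdout; the JobRouter reruns
# this periodically to refresh its routing table.
sites = [
    ("Grid Site 1", "gt2 gk1.foo.edu/jobmanager-condor"),
    ("Grid Site 2", "gt2 gk2.bar.edu/jobmanager-pbs"),
]

for name, resource in sites:
    print('[ Name = "%s"; GridResource = "%s"; MaxJobs = 200; ]'
          % (name, resource))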
Syntax › List of sites in new ClassAd format:
[ Name = "Grid Site 1"; … ]
[ Name = "Grid Site 2"; … ]
[ Name = "Grid Site 3"; … ]
… 23
Syntax
[ Name = "Site 1";
  GridResource = "gt2 gk.foo.edu";
  MaxIdleJobs = 10;
  MaxJobs = 200;
  FailureRateThreshold = 0.01;
  JobFailureTest = other.RemoteWallClockTime < 1800;
  Requirements = target.WantJobRouter is True;
  delete_WantJobRouter = true;
  set_PeriodicRemove = JobStatus == 5;
] 24
What Types of Input Jobs? › Vanilla universe › Self-contained (everything needed is in the file transfer list) › High throughput (many more jobs than CPUs) 25
Grid Gotchas › Globus gt2: no exit status from the job (reported as 0) › Most grid universe types must explicitly list desired output files 26
JobRouter vs. Glidein › Glidein (Condor overlays the grid): the job never waits in a remote queue, the job runs in its normal universe, private networks are doable but add complexity, and you need something to submit glideins on demand › JobRouter: some jobs wait in the remote queue (MaxIdleJobs) and jobs must be compatible with target grid semantics, but it is simple to set up and fully automatic to run 27
Job Router Hooks › Truly transform jobs, not just reroute them, e.g. stuff a job into a virtual machine (either VM universe or Amazon EC2) › Hooks are invoked like the startd ones 28
HOOK_TRANSLATE › Invoked when a job is matched to a route › Stdin: route name and job ad › Stdout: transformed job ad › Transformed job is submitted to Condor 29
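A HOOK_TRANSLATE sketch in Python (hypothetical; it assumes the route name arrives on the first line of stdin followed by the job ad, and it retargets every job at one fixed gatekeeper):

#!/usr/bin/env python
# Hypothetical HOOK_TRANSLATE: read the route name and original
# job ad from stdin, print the transformed job ad on stdout;
# the JobRouter submits whatever we print.
import sys

lines = sys.stdin.read().splitlines()
route_name = lines[0]                    # assumed: first line is the route
job_ad = [l for l in lines[1:] if l.strip()]

# Retarget the job from the vanilla universe to a grid site.
job_ad = [l for l in job_ad if not l.startswith("Universe")]
job_ad.append('Universe = "grid"')
job_ad.append('GridResource = "gt2 gk.foo.edu/jobmanager-condor"')

print("\n".join(job_ad))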
HOOK_UPDATE_JOB_INFO › Invoked periodically to obtain extra information about routed job › Stdin: routed job ad › Stdout: attributes to update in routed job ad 30
HOOK_JOB_FINALIZE › Invoked when routed job has completed › Stdin: ads of original and routed jobs › Stdout: modified original job ad or nothing (no updates) 31
HOOK_JOB_CLEANUP › Invoked when original job returned to schedd (both success and failure) › Stdin: Original job ad › Use for cleanup of external resources 32
Power Management › Hibernate execute machines when they are not needed › Condor doesn’t handle waking machines up yet › The information needed to wake machines is available in the machine ads 33
Configuring Power Management › HIBERNATE: expression evaluated periodically by all slots to decide when to hibernate (all slots must agree to hibernate) › HIBERNATE_CHECK_INTERVAL: number of seconds between hibernation checks 34
Setting HIBERNATE › HIBERNATE must evaluate to one of these strings:
"NONE", "0"
"S1", "1", "STANDBY", "SLEEP"
"S2", "2"
"S3", "3", "RAM", "MEM"
"S4", "4", "DISK", "HIBERNATE"
"S5", "5", "SHUTDOWN"
› These numbers are ACPI power states 35
Power Management on Linux › On Linux, these methods are tried in order to set the power level: pm-utils tools, /sys/power, /proc/acpi › LINUX_HIBERNATION_METHOD can be set to pick a favored method 36
Sample Configuration
ShouldHibernate = \
  ((KeyboardIdle > $(StartIdleTime)) \
  && $(CPUIdle) \
  && ($(StateTimer) > (2 * $(HOUR))))
HIBERNATE = ifThenElse( \
  $(ShouldHibernate), "RAM", "NONE" )
HIBERNATE_CHECK_INTERVAL = 300
LINUX_HIBERNATION_METHOD = "/proc" 37
Dynamic Slot Partitioning › Divide slots into chunks sized for matched jobs › Readvertise remaining resources › Partitionable resources are cpus, memory, and disk 38
How It Works › When a match is made, a new sub-slot is created for the job and advertised, and the slot is readvertised with its remaining resources › A slot can be partitioned multiple times › The original slot ad never enters the Claimed state, but it may eventually have too few resources to be matched › When the claim on a sub-slot is released, its resources are added back to the original slot 39
Configuration › Resources are still statically partitioned between slots › SLOT_TYPE_<N>_PARTITIONABLE: set to True to enable dynamic partitioning within the indicated slot type 40
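A minimal configuration sketch (assumed values), advertising one partitionable slot that owns the whole machine:

# One slot type owning all of the machine's resources
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=100%, memory=100%, disk=100%
SLOT_TYPE_1_PARTITIONABLE = True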
New Machine Attributes › In original slot machine ad PartitionableSlot = True › In ad for dynamically-created slots DynamicSlot = True › Can reference these in startd policy expressions 41
Job Submit File › Jobs can request how much of the partitionable resources they need:
request_cpus = 3
request_memory = 1024
request_disk =
Dynamic Partitioning Caveats › Cannot preempt the original slot or a group of sub-slots, so jobs with large resource requirements can starve › Partitioning happens once per slot each negotiation cycle, so scheduling of large slots may be slow 43
Concurrency Limits › Limit job execution based on admin-defined consumable resources, e.g. licenses › Can have many different limits › Jobs say what resources they need › Negotiator enforces limits pool-wide 44
Concurrency Example › Negotiator config file:
MATLAB_LIMIT = 5
NFS_LIMIT = 20
› Job submit file:
concurrency_limits = matlab,nfs:3
This requests 1 Matlab token and 3 NFS tokens 45
New Variable Substitution › $$(Foo) in a submit file (existing feature): attribute Foo from the machine ad is substituted › $$([Memory * 0.9]) in a submit file (new feature): the expression is evaluated and the result is substituted 46
More Info For Preemption › New attributes are available for these preemption expressions in the negotiator: PREEMPTION_REQUIREMENTS and PREEMPTION_RANK › Used for controlling preemption due to user priorities 47
Preemption Attributes › Submitter/RemoteUserPrio: user priority of the candidate and running jobs › Submitter/RemoteUserResourcesInUse: number of slots in use by the user of each job › Submitter/RemoteGroupResourcesInUse: number of slots in use by each user’s group › Submitter/RemoteGroupQuota: slot quota for each user’s group 48
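For example, a negotiator policy sketch that preempts only when the running job’s user has a significantly worse priority value than the candidate submitter (the 1.2 factor is an assumed threshold):

PREEMPTION_REQUIREMENTS = RemoteUserPrio > SubmitterUserPrio * 1.2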
Thank You! › Any questions? 49