Presentation on theme: "Todd Tannenbaum Computer Sciences Department University of Wisconsin-Madison Job Delegation and Planning."— Presentation transcript:

1 Job Delegation and Planning in Condor-G. Todd Tannenbaum, Computer Sciences Department, University of Wisconsin-Madison. tannenba@cs.wisc.edu, http://www.cs.wisc.edu/condor. ISGC 2005, Taipei, Taiwan.

2 The Condor Project (Established '85): Distributed High Throughput Computing research performed by a team of ~35 faculty, full-time staff, and students.

3 The Condor Project (Established '85): Distributed High Throughput Computing research performed by a team of ~35 faculty, full-time staff, and students who:
  - face software engineering challenges in a distributed UNIX/Linux/NT environment,
  - are involved in national and international grid collaborations,
  - actively interact with academic and commercial users,
  - maintain and support large distributed production environments,
  - and educate and train students.
Funding: US Govt. (DoD, DoE, NASA, NSF, NIH), AT&T, IBM, INTEL, Microsoft, UW-Madison, …

4 A Multifaceted Project
› Harnessing the power of clusters – dedicated and/or opportunistic (Condor)
› Job management services for Grid applications (Condor-G, Stork)
› Fabric management services for Grid resources (Condor, GlideIns, NeST)
› Distributed I/O technology (Parrot, Kangaroo, NeST)
› Job-flow management (DAGMan, Condor, Hawk)
› Distributed monitoring and management (HawkEye)
› Technology for Distributed Systems (ClassAd, MW)
› Packaging and Integration (NMI, VDT)

5 Some software produced by the Condor Project
› Condor System
› ClassAd Library
› DAGMan
› Fault Tolerant Shell (FTSH)
› Hawkeye
› GCB
› MW
› NeST
› Stork
› Parrot
› VDT
› And others… all as open source

6 Who uses Condor?
› Commercial
  - Oracle, Micron, Hartford Life Insurance, CORE, Xerox, ExxonMobil, Shell, Alterra, Texas Instruments, …
› Research Community
  - Universities, Govt Labs
  - Bundles: NMI, VDT
  - Grid Communities: EGEE/LCG/gLite, Particle Physics Data Grid (PPDG), USCMS, LIGO, iVDGL, NSF Middleware Initiative GRIDS Center, …

7 Condor Pool [Diagram: jobs at Schedds, execute machines running Startds, and a MatchMaker that pairs waiting jobs with available machines.]

8 Condor Pool [Diagram: the same pool (Schedds, Startds, MatchMaker, jobs) as on the previous slide.]

9 Condor-G [Diagram: jobs at a Schedd are delegated through Globus 2, Globus 4, Unicore, and NorduGrid middleware to remote LSF and PBS batch systems, Startds, and other Schedds; the paths are labeled Condor-G and Condor-C.]

10 [Diagram: layers of the system: User/Application/Portal, Condor-G, the Grid middleware (Globus 2, Globus 4, Unicore, …), Condor pools, and the fabric (processing, storage, communication).]

11 Job Delegation
› Transfer of responsibility to schedule and execute a job
  - Stage in executable and data files
  - Transfer policy “instructions”
  - Securely transfer (and refresh?) credentials, obtain local identities
  - Monitor and present job progress (transparency!)
  - Return results
› Multiple delegations can be combined in interesting ways

12 Simple Job Delegation in Condor-G [Diagram: the job passes from Condor-G, through Globus GRAM and the batch system front-end, to an execute machine.]

13 Expanding the Model
› What can we do with new forms of job delegation?
› Some ideas
  - Mirroring
  - Load-balancing
  - Glide-in schedd, startd
  - Multi-hop grid scheduling

14 Mirroring
› What it does
  - Jobs mirrored on two Condor-Gs
  - If the primary Condor-G crashes, the secondary one starts running the jobs
  - On recovery, the primary Condor-G gets job status from the secondary one
› Removes the Condor-G submit point as a single point of failure
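To make the failover idea concrete, here is a minimal Python sketch of the two decisions involved: when the secondary should take over, and how the primary resynchronizes on recovery. The threshold value and the dictionary-based job table are illustrative assumptions, not Condor-G's actual mirroring mechanism.

    import time

    FAILOVER_AFTER = 300  # seconds of primary silence before the secondary takes over (assumed value)

    def choose_active(primary_last_heartbeat, now):
        """Decide which mirror should be running the jobs right now."""
        if now - primary_last_heartbeat <= FAILOVER_AFTER:
            return "primary"
        return "secondary"   # primary presumed crashed; secondary starts running jobs

    def recover_primary(primary_jobs, secondary_jobs):
        """On recovery, the primary adopts the job status the secondary accumulated."""
        primary_jobs.update(secondary_jobs)
        return primary_jobs

    now = time.time()
    print(choose_active(now - 10, now))    # 'primary'   (heartbeat is recent)
    print(choose_active(now - 600, now))   # 'secondary' (primary silent too long)
    print(recover_primary({"job_1": "idle"}, {"job_1": "completed", "job_2": "running"}))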

15 Mirroring Example [Diagram: jobs mirrored between Condor-G 1 and Condor-G 2, with an execute machine.]

16 Mirroring Example [Diagram: the same components as the previous slide (Condor-G 1, Condor-G 2, jobs, execute machine).]

17 Load-Balancing
› What it does
  - A front-end Condor-G distributes all jobs among several back-end Condor-Gs
  - The front-end Condor-G keeps updated job status
› Improves scalability
› Maintains a single submit point for users
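A small sketch of the front-end's two duties, distribution and status aggregation, using plain Python data structures; the round-robin policy and the names are assumptions for illustration only.

    from itertools import cycle

    def distribute(jobs, backends):
        """Front-end: spread submitted jobs round-robin across the back-end Condor-Gs."""
        assignment = {b: [] for b in backends}
        for job, backend in zip(jobs, cycle(backends)):
            assignment[backend].append(job)
        return assignment

    def aggregate_status(per_backend_status):
        """Front-end: merge the job status reported back by every back-end."""
        merged = {}
        for backend, statuses in per_backend_status.items():
            for job, state in statuses.items():
                merged[job] = {"state": state, "delegated_to": backend}
        return merged

    print(distribute(["j1", "j2", "j3", "j4", "j5"], ["backend_1", "backend_2", "backend_3"]))
    print(aggregate_status({"backend_1": {"j1": "running"}, "backend_2": {"j2": "idle"}}))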

18 Load-Balancing Example [Diagram: a front-end Condor-G delegating jobs to three back-end Condor-Gs (Back-end 1, 2, and 3).]

19 Glide-In
› The Schedd and Startd are separate services that do not require any special privileges
  - Thus we can submit them as jobs!
› Glide-In Schedd
  - What it does
    - Drop a Condor-G onto the front-end machine of a remote cluster
    - Delegate jobs to the cluster through the glide-in schedd
  - Can apply cluster-specific policies to jobs
    - Not fork-and-forget…
  - Send a manager to the site, instead of managing across the internet

20 Glide-In Schedd Example [Diagram: Condor-G delegates jobs through the middleware to a glide-in Schedd on the cluster front-end, which feeds the local batch system.]

21 Glide-In Startd Example [Diagram: Condor-G (a Schedd) submits a Startd through the middleware and cluster front-end into the batch system; the job then runs under that glide-in Startd.]

22 Glide-In Startd
› Why?
  - Restores all the benefits that may have been washed away by the middleware
  - End-to-end management solution
    - Preserves job semantic guarantees
    - Preserves policy
  - Enables lazy planning

23 Sample Job Submit file
    universe = grid
    grid_type = gt2
    globusscheduler = cluster1.cs.wisc.edu/jobmanager-lsf
    executable = find_particle
    arguments = ….
    output = ….
    log = …
But we want metascheduling…

24 Represent grid clusters as ClassAds
› ClassAds
  - are a set of uniquely named expressions; each expression is called an attribute and is an attribute name/value pair
  - combine query and data
  - extensible
  - semi-structured: no fixed schema (flexibility in an environment consisting of distributed administrative domains)
  - designed for “MatchMaking”

25 Example of a ClassAd that could represent a compute cluster in a grid:
    Type = "GridSite";
    Name = "FermiComputeCluster";
    Arch = "Intel-Linux";
    Gatekeeper_url = "globus.fnal.gov/lsf";
    Load = [ QueuedJobs = 42; RunningJobs = 200; ];
    Requirements = ( other.Type == "Job" && Load.QueuedJobs < 100 );
    GoodPeople = { "howard", "harry" };
    Rank = member(other.Owner, GoodPeople) * 500;

26 Another Sample - Job Submit
    universe = grid
    grid_type = gt2
    owner = howard
    executable = find_particle.$$(Arch)
    requirements = other.Arch == "Intel-Linux" || other.Arch == "Sparc-Solaris"
    rank = 0 - other.Load.QueuedJobs
    globusscheduler = $$(gatekeeper_url)
    …
Note: We introduced augmentation of the job ClassAd based upon information discovered in its matching resource ClassAd.
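The $$(Attr) syntax above asks Condor-G to fill parts of the job in from the ClassAd of whatever resource it matches. A rough Python sketch of that substitution step; the dictionaries stand in for ClassAds, and this is an illustration of the idea rather than Condor's implementation.

    import re

    def augment(job_ad, matched_ad):
        """Expand $$(Attr) references in the job ad using attributes of the matched resource ad."""
        pattern = re.compile(r"\$\$\((\w+)\)")
        expanded = {}
        for key, value in job_ad.items():
            if isinstance(value, str):
                value = pattern.sub(lambda m: str(matched_ad[m.group(1)]), value)
            expanded[key] = value
        return expanded

    job = {"executable": "find_particle.$$(Arch)", "globusscheduler": "$$(gatekeeper_url)"}
    site = {"Arch": "Intel-Linux", "gatekeeper_url": "globus.fnal.gov/lsf"}
    print(augment(job, site))
    # {'executable': 'find_particle.Intel-Linux', 'globusscheduler': 'globus.fnal.gov/lsf'}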

27 Multi-Hop Grid Scheduling
› Match a job to a Virtual Organization (VO), then to a resource within that VO
› Easier to schedule jobs across multiple VOs and grids
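A toy two-hop matchmaking sketch in Python: first match the job against VO-level ads, then against resource ads within the chosen VO. The attribute names and the very simple equality-based matching are assumptions for illustration, not real ClassAd matchmaking.

    def match(ad, candidates):
        """Return the first candidate whose (simplified) requirements the ad satisfies."""
        for cand in candidates:
            if all(ad.get(k) == v for k, v in cand.get("requires", {}).items()):
                return cand
        return None

    job = {"vo": "CMS", "arch": "Intel-Linux"}
    vos = [
        {"name": "ATLAS-VO", "requires": {"vo": "ATLAS"}, "resources": []},
        {"name": "CMS-VO", "requires": {"vo": "CMS"},
         "resources": [{"name": "FermiCluster", "requires": {"arch": "Intel-Linux"}}]},
    ]

    vo = match(job, vos)                   # hop 1: the job is matched to a VO
    site = match(job, vo["resources"])     # hop 2: the VO matches it to one of its resources
    print(vo["name"], "->", site["name"])  # CMS-VO -> FermiCluster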

28 Multi-Hop Grid Scheduling Example [Diagram: a job goes from an experiment Condor-G and experiment resource broker, to a VO Condor-G and VO resource broker, then through Globus GRAM to a batch scheduler (HEPCMS).]

29 Endless Possibilities
› These new models can be combined with each other or with other new models
› The resulting system can be arbitrarily sophisticated

30 Job Delegation Challenges
› New complexity introduces new issues and exacerbates existing ones
› A few…
  - Transparency
  - Representation
  - Scheduling Control
  - Active Job Control
  - Revocation
  - Error Handling and Debugging

31 Transparency
› Full information about a job should be available to the user
  - Information from the full delegation path
  - No manual tracing across multiple machines
› Users need to know what's happening with their jobs

32 Representation
› Job state is a vector
› How best to show this to the user?
  - Summary: current delegation endpoint, and job state at that endpoint
  - Full information available if desired: a series of nested ClassAds?
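One possible rendering of the "state vector" idea, with nested Python dictionaries standing in for nested ClassAds; the site names and states are made up for the example.

    delegation = {                        # state recorded at the submit point
        "site": "user Condor-G",
        "state": "RUNNING",
        "delegated_to": {                 # state reported by the next hop, and so on
            "site": "VO Condor-G",
            "state": "RUNNING",
            "delegated_to": {
                "site": "globus.fnal.gov/jobmanager-lsf",
                "state": "EXECUTING",
                "delegated_to": None,     # the endpoint of the delegation path
            },
        },
    }

    def summarize(node):
        """Summary view: walk to the delegation endpoint and report its location and state."""
        while node["delegated_to"] is not None:
            node = node["delegated_to"]
        return node["site"], node["state"]

    print(summarize(delegation))   # ('globus.fnal.gov/jobmanager-lsf', 'EXECUTING')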

33 Scheduling Control
› Avoid loops in the delegation path
› Give the user control of scheduling
  - Allow limiting of delegation path length?
  - Allow the user to specify part or all of the delegation path
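A minimal sketch of the two safeguards mentioned above, loop detection and an optional hop limit; the site names and the limit are assumptions for illustration.

    MAX_HOPS = 4   # assumed cap on delegation path length

    def may_delegate(path, next_site, max_hops=MAX_HOPS):
        """Refuse a delegation that would revisit a site already in the path or exceed the hop limit."""
        if next_site in path:
            return False          # would create a loop
        if len(path) >= max_hops:
            return False          # path is already as long as allowed
        return True

    path = ["submit.wisc.edu", "vo-broker.example.org"]
    print(may_delegate(path, "cluster.fnal.gov"))   # True
    print(may_delegate(path, "submit.wisc.edu"))    # False: loop detected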

34 Active Job Control
› The user may request certain actions
  - hold, suspend, vacate, checkpoint
› Actions cannot be completed synchronously for the user
  - Must be forwarded along the delegation path
  - The user checks completion later

35 Active Job Control (cont)
› Endpoint systems may not support some actions
  - If possible, execute them at the furthest point that does support them
› Allow the user to apply an action in the middle of the delegation path
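Putting the last two slides together, a sketch of forwarding a requested action to the furthest hop along the delegation path that supports it; the capability sets below are invented for the example.

    def apply_action(path, action):
        """Queue the action at the furthest delegation hop that supports it; the user polls later."""
        for hop in reversed(path):                 # walk from the endpoint back toward the submitter
            if action in hop["supports"]:
                hop.setdefault("pending", []).append(action)
                return hop["name"]
        return None                                # nothing along the path can honor the request

    path = [
        {"name": "Condor-G", "supports": {"hold", "vacate", "checkpoint"}},
        {"name": "Globus GRAM", "supports": {"hold"}},
        {"name": "LSF front-end", "supports": set()},
    ]
    print(apply_action(path, "checkpoint"))   # Condor-G (the furthest hop that can do it)
    print(apply_action(path, "hold"))         # Globus GRAM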

36 Revocation
› Leases
  - A lease must be renewed periodically for the delegation to remain valid
  - Allows revocation during long-term failures
› What are good values for lease lifetime and update interval?
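A sketch of the lease mechanism in Python. The lifetime and renewal interval are exactly the open questions from the slide, so the numbers below are placeholders, not recommendations.

    import time

    LEASE_LIFETIME = 30 * 60   # placeholder: how long a delegation stays valid without renewal
    RENEW_INTERVAL = 10 * 60   # placeholder: how often the delegator refreshes it

    def renew(lease, now=None):
        """Delegator side: push the lease expiry forward."""
        lease["expires"] = (now or time.time()) + LEASE_LIFETIME
        return lease

    def still_valid(lease, now=None):
        """Delegatee side: keep the job only while the lease holds; otherwise it may be revoked."""
        return (now or time.time()) < lease["expires"]

    lease = renew({"job": "cluster_42.proc_0"})
    print(still_valid(lease))                                         # True right after renewal
    print(still_valid(lease, now=time.time() + 2 * LEASE_LIFETIME))   # False once renewals stop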

37 Error Handling and Debugging
› Many more places for things to go horribly wrong
› Need clear, simple error semantics
› Logs, logs, logs
  - Have them everywhere

38 From earlier
› Transfer of responsibility to schedule and execute a job
  - Transfer policy “instructions”
  - Stage in executable and data files
  - Securely transfer (and refresh?) credentials, obtain local identities
  - Monitor and present job progress (transparency!)
  - Return results

39 Job Failure Policy Expressions
› Condor/Condor-G augmented so users can supply job failure policy expressions in the submit file.
› Can be used to describe a successful run, or what to do in the face of failure:
    on_exit_remove =
    on_exit_hold =
    periodic_remove =
    periodic_hold =

40 Job Failure Policy Examples
› Do not remove from the queue (i.e. reschedule) if the job exits with a signal:
    on_exit_remove = ExitBySignal == False
› Place on hold if the job exits with nonzero status or ran for less than an hour:
    on_exit_hold = ((ExitBySignal == False) && (ExitCode != 0)) || ((ServerStartTime - JobStartDate) < 3600)
› Place on hold if the job has spent more than 50% of its time suspended:
    periodic_hold = CumulativeSuspensionTime > (RemoteWallClockTime / 2.0)

41 Data Placement* (DaP) must be an integral part of the end-to-end solution
    * space management and data transfer

42 Stork
› A scheduler for data placement activities in the Grid
› What Condor is for computational jobs, Stork is for data placement
› Stork comes with a new concept: “Make data placement a first class citizen in the Grid.”

43 [Diagram: a job as a chain of steps. Simple view: stage-in, execute the job, stage-out. Expanded view: allocate space for input & output data, stage-in, execute the job, stage-out, release input space, release output space. The stage-in/stage-out and allocate/release steps are data placement jobs; executing the job is a computational job.]

44 DAGMan: a DAG with DaP. The DAG specification mixes data placement (DaP) and computational (Job) nodes; DAGMan sends the Job nodes to the Condor job queue and the DaP nodes to the Stork job queue:
    DaP A A.submit
    DaP B B.submit
    Job C C.submit
    …..
    Parent A child B
    Parent B child C
    Parent C child D, E
    …..

45 Why Stork?
› Stork understands the characteristics and semantics of data placement jobs.
› It can make smart scheduling decisions for reliable and efficient data placement.

46 Failure Recovery and Efficient Resource Utilization
› Fault tolerance
  - Just submit a bunch of data placement jobs, and then go away…
› Control the number of concurrent transfers from/to any storage system
  - Prevents overloading
› Space allocations and de-allocations
  - Make sure space is available
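A sketch of the "control the number of concurrent transfers" point, using a semaphore to cap concurrency; the limit and the stub transfer function are assumptions for illustration, not Stork's actual scheduler.

    import threading

    MAX_CONCURRENT = 4   # assumed per-storage-system transfer limit

    def run_transfers(transfer_jobs, do_transfer):
        """Run all data placement jobs, but never more than MAX_CONCURRENT at once."""
        gate = threading.Semaphore(MAX_CONCURRENT)

        def worker(job):
            with gate:               # blocks while MAX_CONCURRENT transfers are already active
                do_transfer(job)

        threads = [threading.Thread(target=worker, args=(job,)) for job in transfer_jobs]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    run_transfers([f"file_{i}.dat" for i in range(10)],
                  lambda job: print("transferring", job))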

47 Support for Heterogeneity: protocol translation using the Stork memory buffer.

48 Support for Heterogeneity: protocol translation using the Stork disk cache.
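A sketch of protocol translation through a local disk cache: fetch with the source's protocol, store with the destination's, then release the cache space. The two handlers here are stubs; a real deployment would plug in protocol-specific clients (SRB, NeST, GridFTP, …).

    import os, shutil, tempfile
    from pathlib import Path

    def translate_transfer(fetch, store, src, dest):
        """No protocol in common: pull the file to a local cache with the source's
        protocol, push it with the destination's, then release the cache space."""
        cache = tempfile.mkdtemp(prefix="stork_cache_")
        staged = os.path.join(cache, os.path.basename(src))
        try:
            fetch(src, staged)      # e.g. an SRB download (stubbed below)
            store(staged, dest)     # e.g. a NeST upload (stubbed below)
        finally:
            shutil.rmtree(cache)    # give the cache space back

    translate_transfer(lambda s, d: Path(d).write_text("contents of " + s),
                       lambda s, d: print("stored", Path(s).read_text(), "->", d),
                       "srb://ghidorac.sdsc.edu/x.dat",
                       "nest://turkey.cs.wisc.edu/x.dat")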

49 Flexible Job Representation and Multilevel Policy Support
    [
      Type = "Transfer";
      Src_Url = "srb://ghidorac.sdsc.edu/kosart.condor/x.dat";
      Dest_Url = "nest://turkey.cs.wisc.edu/kosart/x.dat";
      ……
      Max_Retry = 10;
      Restart_in = "2 hours";
    ]

50 Run-time Adaptation
› Dynamic protocol selection
    [
      dap_type = "transfer";
      src_url = "drouter://slic04.sdsc.edu/tmp/test.dat";
      dest_url = "drouter://quest2.ncsa.uiuc.edu/tmp/test.dat";
      alt_protocols = "nest-nest, gsiftp-gsiftp";
    ]
    [
      dap_type = "transfer";
      src_url = "any://slic04.sdsc.edu/tmp/test.dat";
      dest_url = "any://quest2.ncsa.uiuc.edu/tmp/test.dat";
    ]
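The alt_protocols line above lists fallbacks to try if the preferred protocol fails. A sketch of that retry-with-alternatives logic in Python, with stub handlers and an invented failure:

    def drouter(src, dest):
        raise IOError("link down")                # simulate the preferred protocol failing

    def gsiftp(src, dest):
        print("gsiftp copied", src, "->", dest)

    def transfer_with_fallback(src, dest, attempts):
        """Try each protocol in turn until one of them completes the transfer."""
        for name, handler in attempts:
            try:
                handler(src, dest)
                return name
            except Exception as err:
                print(name, "failed:", err, "- trying the next protocol")
        raise RuntimeError("no protocol could complete the transfer")

    print(transfer_with_fallback("slic04.sdsc.edu:/tmp/test.dat",
                                 "quest2.ncsa.uiuc.edu:/tmp/test.dat",
                                 [("drouter", drouter), ("gsiftp", gsiftp)]))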

51 Run-time Adaptation
› Run-time protocol auto-tuning
    [
      link = "slic04.sdsc.edu – quest2.ncsa.uiuc.edu";
      protocol = "gsiftp";
      bs = 1024KB;       // block size
      tcp_bs = 1024KB;   // TCP buffer size
      p = 4;             // parallelism
    ]

52 [Diagram: how the pieces fit together. An Application and Planner drive DAGMan; DAGMan drives Condor-G and Stork; below them sit GRAM, RFT, SRM, StartD, SRB, NeST, GridFTP, and Parrot.]

53 Thank You! › Questions?

