Hunter of Idle Workstations Miron Livny Marvin Solomon University of Wisconsin-Madison URL:

Presentation transcript:

3 Outline
• Condor overview
• Potential uses of Java in Condor
• Current use of Java in Condor: Classified Advertisements

4 What is Condor?
• For all jobs
  – Resource finder
  – Batch queue manager
  – Scheduler
• For jobs linked with the Condor library
  – Checkpoint/Restart
  – Process migration
  – Remote system calls

5 Condor is Real
• In production use at dozens (hundreds?) of sites
• In production use for over a decade
• Basis of commercial products
  – LoadLeveler
  – LCF
• Evolving

6 Condor System Structure
[Diagram labels: Submit Machine, Execution Machine, Central Manager, Collector, Negotiator, Customer Agent (CA), Resource Agent (RA), CN, ads [...A], [...B], [...C]]

7 Customer Agent
• Maintains queue of submitted jobs
• Advertises status
• Selects jobs to run

8 Resource Agent
• Monitors system status
  – Load average
  – Keyboard and mouse idle time
  – Memory, disk space, ...
• Advertises status
• Listens for requests to run jobs
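
As a rough illustration of what a resource agent gathers before advertising, the following Python sketch assembles a machine ad from locally observable state. It is not Condor code: the function name `build_machine_ad` and the dict representation are hypothetical, the attribute names are borrowed from the example ads later in this talk, and keyboard idle time is passed in by the caller because there is no portable standard-library call for it.

```python
import os
import shutil


def build_machine_ad(name, keyboard_idle_secs, path="/"):
    """Hypothetical helper: collect local status into a dict-shaped ad."""
    try:
        load_avg = os.getloadavg()[0]      # 1-minute load average (Unix only)
    except (AttributeError, OSError):
        load_avg = None                    # not available on this platform
    disk_kb = shutil.disk_usage(path).free // 1024
    return {
        "Type": "Machine",
        "Name": name,
        "LoadAvg": load_avg,
        "KeyboardIdle": keyboard_idle_secs,  # seconds, supplied by the caller
        "Disk": disk_kb,                     # k bytes free under `path`
        "State": "Unclaimed",
    }


ad = build_machine_ad("example-host", keyboard_idle_secs=1432)
```

An agent along these lines would rebuild the ad periodically and send the fresh copy to the collector, so that the advertised state tracks the machine's actual state.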

9 Central Manager
• Collector
  – Accepts ads from resource agents and customer agents
• Negotiator
  – Matches customers with resources
• Accountant
  – Records resource usage by customers

10 Condor System Structure
[Diagram labels, as on slide 6: Submit Machine, Execution Machine, Central Manager, Collector, Negotiator, Customer Agent (CA), Resource Agent (RA), CN, ads [...A], [...B], [...C]]

11 Advertising Protocol
[Diagram labels: CA, ads [...A], [...B], [...C], CN, RA, ads [...N], [...M]]

12 Advertising Protocol
[Diagram labels: CA, ads [...A], [...B], [...C], CN, RA, ads [...M], [...N]]

13 Matching Protocol
[Diagram labels: CA, ads [...A], [...B], [...C], CN, RA, ads [...M], [...N]]

14 Claiming Protocol
[Diagram labels: CA, ads [...A], [...C], CN, RA, ad [...S]]

15 Claiming Protocol
[Diagram labels: CA, ads [...A], [...C], CN, RA, ad [...S], Job]

16 Remote System Calls
[Diagram labels: CA, ads [...A], [...C], CN, RA, ad [...S], Job, Shadow]
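
The advertise, match, and claim steps named on slides 11 to 15 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Condor's implementation: the names `Collector`, `negotiate`, and `claim` and the dict-based ads are invented for the example, and matching is reduced to a single memory comparison.

```python
from dataclasses import dataclass, field


@dataclass
class Collector:
    """Hypothetical stand-in for the Collector: it only stores ads."""
    customer_ads: list = field(default_factory=list)
    resource_ads: list = field(default_factory=list)

    def advertise(self, ad: dict) -> None:
        # Agents push their own ads to the collector.
        if ad["Type"] == "Job":
            self.customer_ads.append(ad)
        else:
            self.resource_ads.append(ad)


def negotiate(collector: Collector) -> list:
    """Hypothetical negotiator pass: pair each job with the first free machine that fits."""
    matches = []
    free = list(collector.resource_ads)
    for job in collector.customer_ads:
        for machine in free:
            if machine["Memory"] >= job["Memory"]:  # simplified constraint
                matches.append((job, machine))
                free.remove(machine)
                break
    return matches


def claim(job: dict, machine: dict) -> bool:
    """Claiming is a separate step between the two agents, after the match.

    The resource side re-checks its current state, because the ad that was
    matched may no longer describe the machine.
    """
    if machine["State"] != "Unclaimed":
        return False
    machine["State"] = "Claimed"
    machine["RunningFor"] = job["Owner"]
    return True


collector = Collector()
collector.advertise({"Type": "Job", "Owner": "raman", "Memory": 31})
collector.advertise({"Type": "Machine", "Name": "m1", "Memory": 64, "State": "Unclaimed"})
matches = negotiate(collector)
claimed = [claim(job, machine) for job, machine in matches]
```

Keeping `claim` outside the negotiator mirrors the separation of matching from claiming: the central manager only introduces the two parties, and either side can still refuse.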

17 Condor Meets Java
• Java jobs
• Java for Condor implementation

18 Running Java Jobs
• Run JVM as “vanilla” job
  – Class files are treated as ordinary jobs
  – Requires uniform environment (same CLASSPATH everywhere)
  – No checkpointing
• Re-link JVM as “standard” job
  – Remote system calls for class loader
• Checkpoint/restart of “vanilla” jobs

19 Java-Aware Condor
• Class file as “job”
  – Requires “pre-installed” JVM, class libraries and/or job “package” (code + files)
  – Also useful for remote compilation
• Checkpoint JVM state
• Platform-independent checkpoint

20 Java for Implementing Condor

21 Classified Advertisements
• Simple yet powerful
• Extensible
• Active matching
• Symmetric matching

22 Symmetric Active Matching
• Job requires a workstation
  – X86 architecture
  – Solaris GB memory
• Resource is only available
  – Between 6pm and 6am
  – If the keyboard is idle at least 15 minutes
  – To DOE Contractors

23 The ClassAd Language
• Set of bindings of Attribute Names to Expressions
• Self-describing (no separate schema)
• Combine query and data
• Arbitrarily composed and nested

24 Examples
    [
      Type = "Job";
      Owner = "raman";
      Cmd = "run_sim";
      Args = "-Q ";
      Cwd = "/u/raman";
      Memory = 31;
      Qdate = ;
      ...
      Rank = other.Kflops...
      Constraint = other.Type =...
    ]

    [
      Type = "Machine";
      Name = "xxy.cs....";
      Arch = "iX86";
      OpSys = "Solaris";
      Mips = 104;
      Kflops = 21893;
      State = "Unclaimed";
      LoadAvg = ;
      ...
      Rank = ...;
      Constraint = ...;
    ]

25 Attribute Expressions
• Constants: 104, , "iX86"
• References: attr, self.attr, other.attr, expr.attr
• Operators: +, *, >>, =, &&, ...
• Functions: strcat, substr, floor, member, ...
• Lists: { expr, expr, ... }
• ClassAds: [ name=expr; name=expr; ... ]

26 Example Attributes
• Descriptive attributes
    Type = "Job";
    Owner = "raman";
    Arch = "iX86";
    OpSys = "Solaris";
    Memory = 64;  // megabytes
    Disk = ;  // k bytes

27 Example Attributes
• Current state
    Daytime = 36017;  // secs past midnight
    KeyboardIdle = 1432;  // seconds
    State = "Unclaimed";
    LoadAvg = ;

28 Example Attributes
• Parameters
    ResearchGrp = { "raman", "miron", "solomon", "jbasney" };
    Friends = { "tannenba", "wright" };
    Untrusted = { "rival", "riffraff" };
    WantCheckpoint = 1;

29 Complex Attributes
• Derived data
    Rank =  // machine's rank for job
      10 * member(other.Owner, ResearchGrp)
      + member(other.Owner, Friends);
    Rank =  // job's rank for machine
      Kflops/1E3 + other.Memory/32;
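
To see how these two Rank expressions behave, here is a small Python sketch that evaluates them for a few sample ads. It assumes that `member` yields 1 or 0 (the slide uses it in arithmetic), represents ads as plain dicts, and takes the machine's `Kflops` and `Memory` values from the example attributes elsewhere in this talk; the helper names `machine_rank` and `job_rank` are hypothetical.

```python
def member(value, items):
    """Assumed semantics: 1 if value is in the list, otherwise 0."""
    return 1 if value in items else 0


machine = {
    "ResearchGrp": ["raman", "miron", "solomon", "jbasney"],
    "Friends": ["tannenba", "wright"],
    "Kflops": 21893,
    "Memory": 64,  # megabytes (assumed value for the example)
}


def machine_rank(machine, job):
    # Rank = 10 * member(other.Owner, ResearchGrp) + member(other.Owner, Friends)
    return (10 * member(job["Owner"], machine["ResearchGrp"])
            + member(job["Owner"], machine["Friends"]))


def job_rank(job, machine):
    # Rank = Kflops/1E3 + other.Memory/32
    # Kflops is not defined in the job ad, so it resolves to the machine ad.
    return machine["Kflops"] / 1e3 + machine["Memory"] / 32


ranks = {owner: machine_rank(machine, {"Owner": owner})
         for owner in ("raman", "wright", "rival")}
preference = job_rank({"Owner": "raman"}, machine)
```

With these inputs a research-group member ranks 10, a friend ranks 1, and anyone else ranks 0, so the machine prefers its own group by a wide margin.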

30 Constraints
• Job constraint
    Constraint = other.Type == "Machine"
      && Arch == "iX86"
      && OpSys == "Solaris"
      && Disk >
      && other.Memory >= self.Memory;

31 Constraints
• Machine constraint
    Constraint = ! member(other.Owner, Untrusted)
      && Rank >= 10 ? true
      : Rank > 0 ? (LoadAvg 15*60)
      : DayTime 18*60*60;

32 Matching Algorithm
• To match two ads A and B
  – Set up environment such that in A
      – self evaluates to A
      – other evaluates to B
      – other attributes are searched for first in A and then in B
      – and vice versa (with A and B interchanged)
  – Check if A.Constraint and B.Constraint both evaluate to true
  – Use A.Rank and B.Rank for preferences
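
The matching procedure on this slide can be sketched in Python as follows. This is an illustrative sketch, not the ClassAd library: ads are dicts, Constraint and Rank are ordinary Python callables that receive a lookup function, and the helper names `make_env` and `match` are hypothetical. Missing attributes simply raise `KeyError` here; the real language treats them as UNDEFINED.

```python
def make_env(self_ad, other_ad):
    """Build the lookup used while evaluating expressions inside self_ad."""
    def lookup(name, scope=None):
        if scope == "self":
            return self_ad[name]
        if scope == "other":
            return other_ad[name]
        # Unqualified names: search self first, then other.
        if name in self_ad:
            return self_ad[name]
        return other_ad[name]
    return lookup


def match(a, b):
    """Return (matched, a_rank, b_rank) for two ads."""
    env_a = make_env(a, b)   # inside A: self is A, other is B
    env_b = make_env(b, a)   # and vice versa
    matched = a["Constraint"](env_a) is True and b["Constraint"](env_b) is True
    if not matched:
        return False, None, None
    return True, a["Rank"](env_a), b["Rank"](env_b)


job = {
    "Type": "Job", "Owner": "raman", "Memory": 31,
    "Constraint": lambda get: (get("Type", "other") == "Machine"
                               and get("Arch") == "iX86"
                               and get("Memory", "other") >= get("Memory", "self")),
    "Rank": lambda get: get("Kflops") / 1e3 + get("Memory", "other") / 32,
}
machine = {
    "Type": "Machine", "Arch": "iX86", "Memory": 64, "Kflops": 21893,
    "Untrusted": ["rival", "riffraff"],
    "Constraint": lambda get: get("Owner", "other") not in get("Untrusted"),
    "Rank": lambda get: 1,
}
result = match(job, machine)
```

Both constraints must hold for a match, which is what makes the scheme symmetric: the machine can reject a job just as a job can reject a machine. The ranks are returned separately so that a caller can pick the most preferred candidate among several matches.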

33 Three-valued Logic
• All of the following evaluate to UNDEFINED if other has no "Memory" attribute
    other.Memory > 32
    other.Memory == 32
    other.Memory != 32
    !(other.Memory == 32)
• The following evaluates to TRUE if either attribute exists and satisfies the given condition
    other.Mips >= 10 || other.Kflops >= 1000
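
A small Python sketch of this three-valued behaviour, using a sentinel object for UNDEFINED. The helper names (`attr`, `compare`, `tv_not`, `tv_or`) are hypothetical. The slide only states the result of `||` when one operand is TRUE; the sketch additionally assumes that FALSE `||` UNDEFINED yields UNDEFINED, as in Kleene-style logic.

```python
import operator


class _Undefined:
    def __repr__(self):
        return "UNDEFINED"


UNDEFINED = _Undefined()


def attr(ad, name):
    """Look up an attribute; a missing attribute evaluates to UNDEFINED."""
    return ad.get(name, UNDEFINED)


def compare(op, left, right):
    """Any comparison with an UNDEFINED operand is itself UNDEFINED."""
    if left is UNDEFINED or right is UNDEFINED:
        return UNDEFINED
    return op(left, right)


def tv_not(value):
    return UNDEFINED if value is UNDEFINED else (not value)


def tv_or(left, right):
    if left is True or right is True:
        return True          # one TRUE operand is enough
    if left is UNDEFINED or right is UNDEFINED:
        return UNDEFINED     # assumption: FALSE || UNDEFINED is UNDEFINED
    return False


other = {"Kflops": 21893}    # no Memory attribute, no Mips attribute
results = [
    compare(operator.gt, attr(other, "Memory"), 32),
    compare(operator.eq, attr(other, "Memory"), 32),
    compare(operator.ne, attr(other, "Memory"), 32),
    tv_not(compare(operator.eq, attr(other, "Memory"), 32)),
]
either = tv_or(compare(operator.ge, attr(other, "Mips"), 10),
               compare(operator.ge, attr(other, "Kflops"), 1000))
```

Because a constraint has to evaluate to true for a match, an UNDEFINED result behaves like a non-match without the ad's author having to test for the attribute's existence first.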

34 Summary
• Distributed resource allocation
  – Distributed clients, servers
  – Heterogeneous resources
  – Distributed ownership
• Classified advertisements
  – Semi-structured data model
  – Schema, data, and query in one language
  – Separation of matching from claiming

35 Summary
• ClassAds are currently in use throughout Condor
  – Flexible
  – Robust
• C++ and Java implementations
• Freely available as part of Condor and as stand-alone libraries

36 Future Work
• Get “Java” customers
• Support “Java” customers
  – Vanilla jobs
  – Standard jobs
  – Java-aware Condor execution engine

37 Future Work
• Application of ClassAds to other distributed resource-allocation and discovery problems
• Bulk operations and aggregation
  – Structural regularity
  – Value regularity
• User interfaces
• Tools

38 Information About Condor
• WWW