Todd Tannenbaum Computer Sciences Department University of Wisconsin-Madison What’s New in Condor.

1 Todd Tannenbaum Computer Sciences Department University of Wisconsin-Madison condor-admin@cs.wisc.edu http://www.cs.wisc.edu/condor What’s New in Condor

2 Overview
Quick ‘sound bites’ on new functionality in recent Condor releases
› Condor Development Process
› New Features in Condor version 6.6.x
› New Features in Condor version 6.7.0

3 Condor Development Process
› We maintain two different releases at all times
  - Stable Series: second digit is even, e.g. 6.2.2, 6.4.7, 6.6.3
  - Development Series: second digit is odd, e.g. 6.5.1, 6.7.2
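
The even/odd numbering rule on this slide can be checked mechanically. The following is a minimal sketch in Python, not part of Condor itself; the function name `condor_series` is a hypothetical helper introduced only for illustration.

```python
def condor_series(version: str) -> str:
    """Classify a Condor version string such as '6.6.3'.

    Per the slide: an even second digit means the stable series,
    an odd second digit means the development series.
    """
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least major.minor, got {version!r}")
    second_digit = int(parts[1])
    return "stable" if second_digit % 2 == 0 else "development"


if __name__ == "__main__":
    # The example versions listed on the slide.
    for v in ["6.2.2", "6.4.7", "6.6.3", "6.5.1", "6.7.2"]:
        print(v, condor_series(v))
```

The sketch only inspects the second component of the version string, so it says nothing about compatibility between releases.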

4 Stable Series
› Heavily tested
› Runs on our department production pool of nearly 1,000 CPUs (for a minimum of 3 weeks)
› No new features, only bugfixes and ports.
› A given stable release is always compatible with other releases from the same series
  - 6.6.X is compatible with 6.6.Y
› Recommended for production pools

5 Development Series
› Less heavily tested
› Runs on our small(er) test pool.
› New features and new technology are added frequently
› Versions from the same development series are not guaranteed compatible with each other (although we try hard)

6 New in version 6.6.x
› Version 6.6.0 released in November 2003.
› Current release: version 6.6.3, released in April 2004.

8 The Struggle to Build Condor
› Condor is BIG
  - Condor code consists of primary source plus ‘externals’.
    Externals include Kerberos, zlib, GSI, PVM, gSOAP…
    Patches to externals
  - Current shipped source + externals: ~415MB of source, or ~9 million lines!
  - Building Condor outside of UW-Madison used to be very difficult.
    “LIST OF SHAME”: Build pointed to packages on UW-Madison fileservers.

9 Now Condor Source “Self-Contained”
› Source code for the externals is now bundled w/ Condor itself.
  - Self-contained
  - Allows version control on externals + patches
› Build w/ just “configure; make”!
  - Checks for existence and proper version of all “bootstrap” requirements, such as the compiler
  - Applies our patches to the externals
  - All 9 million lines built and bundled

10 Building Condor
[Image: Building Condor before Version 6.6.0…]
[Image: Building Condor Post Version 6.6.0!]

11 Condor + NMI
› NMI = NSF Middleware Initiative
› Automated build and test infrastructure built on top of Condor
  - Pool of 37 machines of many architectures
  - Scalable
  - Runs every night, builds several Condor source branches, then runs 114 test programs.
  - All results stored in RDBMS, reported on the web.
  - Yes, Condor builds Condor!

12 Ports
› New Ports w/ v6.6.x vs. v6.4.x:
  - Solaris 9
  - RedHat Linux 8.x, 9.x for x86 (+RPMs)
  - RedHat Linux 7.x and SUSE 8.0 for IA64 (clipped)
  - Tru64 5.1 (clipped)
  - AIX 5.2 (clipped)
  - Mac OS X (clipped)

13 Some new components
› Computing On Demand (COD)
› Integration of “Hawkeye” technology
› Condor-G Additions
  - Matchmaking
  - Grid Monitor
  - Grid Shell

14 Computing On Demand (COD)
› Introduce effective timesharing to a distributed system
  - Batch applications often want sustained throughput for a long period of time
  - Interactive applications often want a quick burst of CPU power for a short period of time
  - COD: Allow both to co-exist

15 HawkEye Technology
› Dynamic Resource Monitoring, now ‘built-in’ to Condor.
  - Allows custom dynamic attributes to be added into machine classads.
  - These attributes can be used for queries and scheduling
  - Many plugins available: disk space, memory used, network errors, open files/descriptors, process monitoring, users, …

16 Condor-G
› Condor-G Matchmaking
  - Condor-G can determine which grid site to utilize via ClassAd matchmaking (grid planning, meta scheduling, …)
› Condor-G Grid Monitor
  - Reduces the load on a GT2-based gatekeeper, greatly increasing the number of jobs that can be submitted
› Condor-G GridShell
  - A wrapper for the job
  - Reports exit status, CPU utilization, more

17 Improvements in Condor for Windows
› Ability to run SCHEDULER universe jobs
  - Including DAGMan
› JAVA universe support
› More Win32 flavors, including international versions.
› Added support for on-disk encryption of the job and data files on the execute machine.

18 New Features in DAGMan
› DAGMan previously required that all jobs in a DAG share one log file
› Each job can now have its own log file
› Understands XML formatted logs
› Can draw a graphical representation of your DAG
  - Uses GraphViz, http://www.graphviz.org/
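
As an illustration of per-job log files, here is a minimal sketch in Python that writes a two-job DAG input file and one submit description per job, each naming its own log. The job names (`A`, `B`), the file names, and the helper functions `write_dag` and `submit_description` are hypothetical and chosen only for this example; the `JOB` and `PARENT … CHILD` lines and the `log` submit command follow standard DAGMan and submit-file syntax.

```python
import tempfile
from pathlib import Path


def submit_description(executable: str, log_file: str) -> str:
    """Return a minimal submit description that writes to its own log file."""
    return "\n".join([
        "universe = vanilla",
        f"executable = {executable}",
        f"log = {log_file}",  # one log per job, no longer shared across the DAG
        "queue",
    ]) + "\n"


def write_dag(dag_path, jobs, edges) -> str:
    """Write a DAGMan input file.

    jobs  maps a job name to its submit file name.
    edges is a list of (parent, child) job-name pairs.
    """
    lines = [f"JOB {name} {submit}" for name, submit in jobs.items()]
    lines += [f"PARENT {parent} CHILD {child}" for parent, child in edges]
    text = "\n".join(lines) + "\n"
    Path(dag_path).write_text(text)
    return text


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    # Hypothetical jobs: A must finish before B starts.
    (workdir / "a.sub").write_text(submit_description("a.sh", "a.log"))
    (workdir / "b.sub").write_text(submit_description("b.sh", "b.log"))
    print(write_dag(workdir / "example.dag",
                    {"A": "a.sub", "B": "b.sub"},
                    [("A", "B")]))
```

The resulting `example.dag` would then be handed to `condor_submit_dag` on a machine where Condor is installed; the sketch itself only generates the files.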

20 Central Manager New Features
› Central Manager daemons can now run on any port
    COLLECTOR_HOST = condor.cs.wisc.edu:9019
    NEGOTIATOR_HOST = condor.cs.wisc.edu:9020
  - Useful for firewall situations
  - Allows multiple instances on one machine
› Keeps statistics on missed updates
› Can use TCP instead of UDP, if you must
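
To make the `host:port` form concrete, the sketch below parses the two settings shown on this slide and checks that the daemons were given distinct ports, which is what running multiple instances on one machine requires. It is a plain-Python illustration under the assumption that each setting is a single `NAME = host:port` line; it is not Condor's own configuration parser, and `parse_host_setting` is a hypothetical helper.

```python
def parse_host_setting(line: str):
    """Split 'NAME = host:port' into (name, host, port).

    The port is None when the value carries no ':port' suffix.
    """
    name, _, value = line.partition("=")
    host, _, port = value.strip().partition(":")
    return name.strip(), host, int(port) if port else None


if __name__ == "__main__":
    # The two settings exactly as given on the slide.
    settings = [
        "COLLECTOR_HOST = condor.cs.wisc.edu:9019",
        "NEGOTIATOR_HOST = condor.cs.wisc.edu:9020",
    ]
    parsed = [parse_host_setting(s) for s in settings]
    for name, host, port in parsed:
        print(f"{name}: host={host} port={port}")

    # Daemons sharing one machine need distinct ports.
    ports = [port for _, _, port in parsed]
    print("ports are distinct:", len(ports) == len(set(ports)))
```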

21 Command-line Tools
› ‘condor_update_stats’ tool to display information on any dropped central manager updates
› ‘condor_q -hold’ gives you a list of held jobs and the reason they were put on hold
› ‘condor_config_val -v’ tells you where (file and line number) an attribute is defined
› ‘condor_fetch_log’ will grab a log file from a remote machine:
  - condor_fetch_log c2-15.cs.wisc.edu STARTD
› ‘condor_configure’ will install Condor via simple command-line switches, no questions asked
› ‘condor_vacate_job’ releases a resource by job id, and can be invoked by the job owner.
› ‘condor_wait’ blocks until a job or set of jobs completes
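
For scripting, a few of these tools can be wrapped from Python. The sketch below only builds the argument lists for the invocations named on the slide and runs one of them if the tool is found on `PATH`. The helper names and the `COLLECTOR_HOST` argument are assumptions made for illustration; the output format of the tools is not assumed.

```python
import shutil
import subprocess


def held_jobs_cmd():
    """List held jobs and the reason each was put on hold."""
    return ["condor_q", "-hold"]


def config_source_cmd(attribute: str):
    """Show where (file and line number) a configuration attribute is defined."""
    return ["condor_config_val", "-v", attribute]


def fetch_log_cmd(host: str, subsystem: str):
    """Fetch a daemon log from a remote machine, e.g. the STARTD log."""
    return ["condor_fetch_log", host, subsystem]


def run_if_installed(argv):
    """Run the command and return its stdout, or None if the tool is not on PATH."""
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    for argv in (held_jobs_cmd(),
                 config_source_cmd("COLLECTOR_HOST"),
                 fetch_log_cmd("c2-15.cs.wisc.edu", "STARTD")):
        print(" ".join(argv))
    # Only executes when Condor is actually installed on this machine.
    output = run_if_installed(held_jobs_cmd())
    print(output if output is not None else "condor_q not found on PATH")
```

Passing the arguments as a list rather than a shell string avoids quoting problems with host names and attribute names.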

22 New 6.7.x Development Series
› Release of v6.7.0 in April 2004.
› Can you take the suspense?!?

23 V6.7 Themes
› Scalability
  - Resources, jobs, matchmaking framework
› Accessibility
  - APIs, more Grid middleware, network
› Availability
  - Failover

24 Thank You!
› Later this afternoon is the roadmap for future work.
› Questions?

