1 The Roadmap to New Releases Todd Tannenbaum Department of Computer Sciences University of Wisconsin-Madison

2 Stable vs. Development Series › Much like the Linux kernel, Condor provides two different releases at any time:  Stable series  Development series › Allows Condor to be both a research project and a production-ready system

3 Stable series › Series number in version is even (e.g ) › Releases are heavily tested › Only bug fixes and ports to new platforms are added on a stable series

4 Stable series (cont.) › A given stable release is always compatible with other releases from the same series › Recommended for production pools

5 Development Series › Series number in the version is odd (e.g , 6.3.1) › New features and new technology are added frequently › Versions from the same development series are not always compatible with each other

6 Development Series (cont.) › Releases are not as heavily tested › Generally not recommended for production pools  … unless new features are required  … unless we recommend otherwise :^)
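The even/odd numbering rule from the last few slides can be sketched in a few lines of Python. This is only an illustration: the three-part MAJOR.SERIES.RELEASE layout is assumed from the "6.3.1" and "v6.4.0" examples, and the function names are hypothetical, not part of Condor.

```python
def classify_release(version: str) -> str:
    """Return 'stable' or 'development' for a version such as '6.4.0'."""
    parts = version.lstrip("v").split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected MAJOR.SERIES.RELEASE, got {version!r}")
    series = int(parts[1])
    # Even series number: stable series. Odd series number: development series.
    return "stable" if series % 2 == 0 else "development"


def same_series(a: str, b: str) -> bool:
    """True when two versions share the same major and series numbers."""
    return a.lstrip("v").split(".")[:2] == b.lstrip("v").split(".")[:2]
```

A pool administrator could use a check like `same_series` before mixing releases, since only stable releases from the same series are promised to be compatible with each other.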

7 Where is Condor Today? › Version being released asap – this is the v6.4.0 release candidate. › We expect version released by the end of March.

8 What’s new for Condor v6.4.0?

9 New Ports in › Full support (with checkpointing and remote system calls):  RedHat 7.x (Linux 2.4.x kernel + glibc 2.2.x)

10 New Ports in (cont.) › "Clipped" support (no checkpointing, PVM, or remote system calls, but all other functionality is available)  Windows 2000  Mac OS X

11 Secure Communication › Secure network communication  Strong user authentication Multiple methods supported: Kerberos, X509, NT LanMan, …  Encryption  Integrity › Authorization based on host or user

12 New Job Universes › MPI Universe  Launch MPI jobs linked with MPICH library › Globus Universe  Faster, more reliable, better integrated › Java Universe

13 Java Universe Job
    universe = java
    executable = Main.class
    jar_files = MyLibrary.jar
    input = infile
    output = outfile
    arguments = Main
    queue
Submit the file with condor_submit.

14 Why not use Vanilla Universe for Java jobs? › Java Universe provides more than just inserting “java” at the start of the execute line  Knows which machines have a JVM installed  Knows the location, version, and performance of JVM on each machine  Provides more information about Java job completion than just JVM exit code Program runs in a Java wrapper, allowing Condor to report Java exceptions, etc.

15 Java support, cont.
    condor_status -java
    Name           JavaVendor   Ver  State    Activity  LoadAv  Mem
    aish.cs.wisc.  Sun Microsy       Owner    Idle
    anfrom.cs.wis  Sun Microsy       Owner    Idle
    babe.cs.wisc.  Sun Microsy       Claimed  Busy

16 Condor File Transfer › Condor will transfer job files from the submit machine to the execute machine › Files to send and/or receive specified at submit time › Transfer is atomic  All files are transferred, or transfer fails › Appeared in v6.2 only in Condor for Windows
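The all-or-nothing behaviour can be illustrated with a small Python sketch. It is not Condor's implementation: it assumes a plain local copy, and `transfer_atomically` is a hypothetical name. Files are staged in a scratch directory and only moved into place once every copy has succeeded.

```python
import os
import shutil
import tempfile


def transfer_atomically(files, dest_dir):
    """Copy every file into dest_dir, or none of them if any copy fails."""
    os.makedirs(dest_dir, exist_ok=True)
    # Stage inside dest_dir so the final moves are renames on one filesystem.
    staging = tempfile.mkdtemp(dir=dest_dir, prefix=".staging-")
    try:
        staged = []
        for src in files:
            target = os.path.join(staging, os.path.basename(src))
            shutil.copy2(src, target)  # raises if src is missing or unreadable
            staged.append(target)
        # Every copy succeeded, so the files may now become visible.
        # (This rename loop is a simplification; it is not itself atomic.)
        for path in staged:
            os.replace(path, os.path.join(dest_dir, os.path.basename(path)))
    finally:
        # Remove the scratch directory whether or not the transfer succeeded.
        shutil.rmtree(staging, ignore_errors=True)
```

If any source file cannot be copied, the exception propagates and the destination directory contains none of the requested files.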

17 File Transfer, cont. › Example:
    transfer_input_files = x, y, z …
    transfer_output_files = a, b, c …
    transfer_files = [ ALWAYS | ONEXIT ]
› Note: Condor can automatically figure out output files  Default: Send back any new/changed files
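One way to find "new/changed files" is to compare directory snapshots taken before and after the job runs. The Python sketch below shows the idea only; how Condor actually detects changes is not stated in these slides, and the helper names and the (size, modification time) signature are assumptions.

```python
import os


def snapshot(directory):
    """Map each file name in directory to its (size, mtime_ns) signature."""
    result = {}
    for entry in os.scandir(directory):
        if entry.is_file():
            info = entry.stat()
            result[entry.name] = (info.st_size, info.st_mtime_ns)
    return result


def new_or_changed(before, after):
    """Names in `after` that are missing from `before` or differ from it."""
    return sorted(name for name, sig in after.items() if before.get(name) != sig)
```

Taking `snapshot` of the job's scratch directory at start and at exit, then passing both to `new_or_changed`, yields the list of files to send back.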

18 Remote I/O Socket › Job can request that the condor_starter process on the execute machine create a Remote I/O Socket › Used for online access to files on the submit machine, without Standard Universe.  Use in Vanilla, Java, … › Libraries provided for Java and for C, e.g.:
    Java: FileInputStream -> ChirpInputStream
    C: open() -> chirp_open()

19 [Figure: Remote I/O architecture. Execution site: starter, I/O proxy, job (forked), I/O library, local I/O (Chirp). Submission site: shadow, I/O server, home file system, local system calls. The two sites are linked by secure remote I/O.]

20 Job Policy Expressions › User can supply job policy expressions in the submit file. › Can be used to describe a successful run.
    on_exit_remove = <expression>
    on_exit_hold = <expression>
    periodic_remove = <expression>
    periodic_hold = <expression>

21 Job Policy Examples › Do not remove if exits with a signal:
    on_exit_remove = ExitBySignal == False
› Place on hold if exits with nonzero status or ran for less than an hour:
    on_exit_hold = ((ExitBySignal == False) && (ExitCode != 0)) || ((ServerStartTime - JobStartDate) < 3600)
› Place on hold if job has spent more than 50% of its time suspended:
    periodic_hold = CumulativeSuspensionTime > (RemoteWallClockTime / 2.0)
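To make the logic of these three policies concrete, here is a rough Python rendering that evaluates them against a job ad held in a plain dict. It is a sketch, not ClassAd semantics: the function names are hypothetical, undefined attributes are not modelled, and reading "nonzero status" from an `ExitCode` key is an assumption.

```python
def on_exit_remove(ad):
    # Leave the job in the queue if it was killed by a signal.
    return ad["ExitBySignal"] is False


def on_exit_hold(ad):
    # Hold on a nonzero exit code, or on a run shorter than one hour.
    # ExitCode is only read when the job was not killed by a signal.
    failed = ad["ExitBySignal"] is False and ad["ExitCode"] != 0
    too_short = (ad["ServerStartTime"] - ad["JobStartDate"]) < 3600
    return failed or too_short


def periodic_hold(ad):
    # Hold once more than half of the wall-clock time was spent suspended.
    return ad["CumulativeSuspensionTime"] > ad["RemoteWallClockTime"] / 2.0
```

For example, a job that exits normally with code 0 after three hours is removed from the queue and not held, while the same job exiting with code 1 is held.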

22 Firewall Support › Port Restrictions  In condor_config file can specify:
    LOWPORT = x
    HIGHPORT = y
 All dynamic ports will be between x and y inclusive › Condor + Firewalls/Private Networks:  Who: Se-Chang Son  Time: 9am-12pm Weds  Where: rm 3387
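What "all dynamic ports between x and y inclusive" means for a daemon can be sketched in Python: rather than letting the operating system choose any ephemeral port, the process walks the configured range until a bind succeeds. This is an illustration only, not Condor's code; `bind_in_range` is a hypothetical helper and the loopback default is chosen just to keep the example self-contained.

```python
import socket


def bind_in_range(low, high, host="127.0.0.1"):
    """Bind a TCP socket to the first free port in [low, high], inclusive."""
    for port in range(low, high + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError:
            sock.close()  # port busy or not permitted; try the next one
            continue
        return sock
    raise OSError(f"no free port between {low} and {high}")
```

A firewall administrator then only needs to open the range from LOWPORT to HIGHPORT instead of every ephemeral port.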

23 Condor on Windows › On both NT and Win2k › New universes added: MPI, Java, Scheduler (and Globus in the works!) › DAGMan ported › CondorView ported › Run shadow + DAGMan as the user  Allows submission from directories on shared filesystems

24 And more… › Unix Man pages › Fetch/consolidate log files remotely › ClassAd chaining › Many DAGMan improvements › Bug fixes, etc…

25 What’s Next? Future Directions › Increased focus on standalone tools built with Condor Technology  DAGMan  NeST  PFS  HawkEye  Condor-G …

26 What’s Next? › Big Item: More focus on being a service provider than just an end-user tool:  Developer APIs / libraries  SOAP access to services  XML representations of user logs, ClassAds, accounting info, etc.

27 More what’s next… › Condor on Windows  Increased support from Microsoft Research  Remote I/O  Complete Shared Filesystem support  Condor-G › MPI Scheduling Improvements

28 More what’s next… › New version of ClassAds into Condor  Conditionals !! if/then/else  Aggregates (lists, nested classads)  Built-in functions String operations, pattern matching, time operators, unit conversions  Clean implementations in C++ and Java  ClassAd collections

29 More what’s next… › Re-write of the condor_schedd  Performance enhancements and lowered resource requirements (particularly RAM) › Re-write of the checkpoint server  Add secure communication  NeST technology infusion  Enhanced support for multiple servers  Store meta-data along with checkpoint files

30 Thank you for coming to Paradyn/Condor Week!