The Roadmap to New Releases Derek Wright Computer Sciences Department University of Wisconsin-Madison



Stable vs. Development Series
› Much like the Linux kernel, Condor provides two different releases at any time:
  • Stable series
  • Development series
› This allows Condor to be both a research project and a production-ready system

Stable Series
› Series number in the version is even (e.g., 6.0.*)
› Releases are heavily tested
› Only bug fixes and ports to new platforms are added on a stable series

Stable Series (cont.)
› A given stable release is always compatible with other releases from the same series
› Recommended for production pools

Development Series
› Series number in the version is odd (e.g., 6.1.*)
› New features and new technology are added frequently
› Versions from the same development series are not always compatible with each other
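The even/odd series rule can be expressed mechanically. A minimal sketch (the helper name is hypothetical, not part of Condor):

```python
import re

def condor_series(version_string):
    """Classify a Condor version as 'stable' or 'development'.

    The second component of the version number (the series number)
    is even for stable releases and odd for development releases.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if not match:
        raise ValueError("unrecognized version: %r" % version_string)
    major, series, release = (int(g) for g in match.groups())
    # Even series number -> stable; odd -> development
    return "stable" if series % 2 == 0 else "development"
```

For example, any 6.0.* release is classified as stable, while any 6.1.* release is classified as development.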

Development Series (cont.)
› Releases are not as heavily tested
› Not recommended for production pools

Where is Condor Today?
› The current stable series, 6.0.*, has been out for too long
› The current development series, 6.1.*, is near the end of its life cycle
  • Code freeze on new features is coming soon
  • 6.2.0 should be out later this spring
› This was our first stable/development series: we're still learning...

New Ports in 6.2.0
› Full support (with checkpointing and remote system calls):
  • All current versions of Linux (x86)
    Kernel: 2.2.* and 2.0.*
    C library: glibc-2.[01] and libc-5
  • Solaris 2.7 (Sparc and x86)
  • Irix 6.5

New Ports in 6.2.0 (cont.)
› "Clipped" support (no checkpointing or remote system calls, but all other functionality is available):
  • Windows NT
  • Alpha Linux

What Will Be New in 6.2.0?
› Personal Condor and Grid support
  • Flocking
  • Globus job universe
  • Globus glide-in
› Full, integrated support for Symmetric Multi-Processor (SMP) machines

What's New in 6.2.0? (cont.)
› PVM and MPI support
› DAGMan (for managing inter-job dependencies)
› Jobs can be put "on hold" and released
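DAGMan describes inter-job dependencies in a plain-text DAG input file. A minimal sketch, assuming two hypothetical submit files `stage_one.sub` and `stage_two.sub`:

```
# Two jobs; B is only submitted after A completes successfully
JOB    A  stage_one.sub
JOB    B  stage_two.sub
PARENT A  CHILD B
```

The DAG is then handed to `condor_submit_dag`, which submits each job as its dependencies are satisfied.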

What's New in 6.2.0? (cont.)
› Greatly expanded I/O support for standard jobs
  • Condor can automatically buffer I/O requests from jobs
  • Users get much more information about the kinds of I/O their jobs are performing
  • Users can "remap" files to alternate locations, both regular files and URLs
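These I/O features are controlled from the submit-description file. A hypothetical fragment (the job and file names are made up; exact command names and defaults may vary by Condor version):

```
# Standard-universe job with I/O buffering and a file remap
universe    = standard
executable  = analyze
# Buffer remote I/O in 512 KB chunks instead of issuing each
# read/write as a separate remote system call
buffer_size = 524288
# Redirect opens of the logical name "input.dat" to another path
file_remaps = "input.dat = /scratch/shared/input.dat"
queue
```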

What's New in 6.2.0? (cont.)
› CondorVersion and CondorPlatform strings included in all binaries and libraries
  • Helps identify and avoid problems caused by having the wrong version installed
  • Different parts of the Condor protocol automatically check for version incompatibilities
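Because the version string is embedded in the binary itself, any tool can recover it by scanning the file's bytes. A minimal sketch, assuming the embedded marker has the form `$CondorVersion: ... $` (the function name is hypothetical):

```python
import re

def extract_condor_version(blob):
    """Search raw binary data for an embedded CondorVersion marker.

    Returns the version text between '$CondorVersion:' and the
    closing '$', or None if no marker is present.
    """
    match = re.search(rb"\$CondorVersion: ([^$]*)\$", blob)
    return match.group(1).decode().strip() if match else None
```

This is essentially what version-checking tools do: read the installed binary, pull out the marker, and compare it against the running daemon's version.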

What's New in 6.2.0? (cont.)
› Better accounting
  • The condor_view collector stores historical data, browsable via a web interface
  • The accountant stores usage information per user
› Better control over user priorities
  • "Priority factors"
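The idea behind priority factors is that an administrator-assigned multiplier scales each user's usage-based priority. A simplified sketch (Condor's real accounting is more involved; the function names are hypothetical):

```python
def effective_priority(real_priority, priority_factor):
    """Effective user priority: usage-based priority scaled by an
    administrator-assigned factor. Lower values mean better service."""
    return real_priority * priority_factor

def rank_users(users):
    """Order users best-served-first by effective priority.

    `users` maps a user name to (real_priority, priority_factor).
    """
    return sorted(users, key=lambda u: effective_priority(*users[u]))
```

For example, a user with modest recent usage but a large priority factor can still end up behind a heavier user whose factor is 1.0.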

What's New in 6.2.0? (cont.)
› More powerful administration tools
  • Setting configuration values remotely
  • Querying daemons directly for status
› Lots of performance improvements and bug fixes
› A complete list will be in the online manual

The 6.3 Development Series
› Version 6.3.* will be for lots of easy-to-add, user-visible features
› No fundamentally new technology will be added
› Should be relatively short-lived; the resulting stable release will hopefully be out by the end of the year
› Should be compatible with 6.2.*

What will be added in 6.3.*?
› "Master agents" - helper programs spawned by the condor_master to aid in administration
  • Retrieving remote log, history, and/or configuration files
  • Remote "top", "ps", and other monitoring functions
› Support for a "Java Universe" - starting a JVM under Condor
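From the user's side, a Java universe job would be described much like any other. A hypothetical submit-file sketch for this planned feature (names and exact syntax are assumptions, since the feature is not yet released):

```
# Hypothetical submit file for the planned Java universe:
# Condor locates a JVM on the execute machine and runs the class in it
universe   = java
executable = Hello.class
arguments  = Hello
queue
```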

What will be added in 6.3.*? (cont.)
› Improvements to condor_submit
  • Command-line arguments and environment variables to set default values
  • Ability to submit more jobs to an existing cluster
› The initial checkpoint (your job's executable binary) can be stored on a checkpoint server

What will be added in 6.3.*? (cont.)
› condor_startd will enforce resource limits (like RAM usage)
› A tool to help set up and modify SMP startd configurations
› More logic in the condor_shadow to detect temporary problems with a job's execution, put the job on hold, and notify the user

The 6.5.* Development Series
› 6.5.* will be for adding fundamentally new technology to Condor
› Developed in parallel with 6.3.*
  • Will hopefully become the next stable series (or even 7.0.0?) in the not-too-distant future
› Will be incompatible with previous versions of Condor

New Technology in 6.5.*
› New version of ClassAds (Rajesh's work)
› New version of the condor_starter and condor_shadow
  • The "NT version" will be used for UNIX, too
  • Lots of new features, like transferring files automatically for "vanilla" jobs (no need for a shared filesystem)
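Automatic file transfer for vanilla jobs would surface in the submit-description file. A hedged sketch of what such a job could look like (file names are hypothetical, and exact command names may differ from what ships):

```
# Vanilla-universe job relying on Condor's own file transfer
# instead of a shared filesystem
universe                = vanilla
executable              = sim
transfer_input_files    = config.ini, data.bin
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
queue
```

The point is that the starter/shadow pair moves the listed inputs to the execute machine before the job starts and ships outputs back when it exits, so the execute machine needs no access to the submit machine's disks.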

New Technology in 6.5.* (cont.)
› New technology for remote system calls
› Integrated support for encryption
  • Secure channels, SSL, Kerberos, etc.
› Automatic fail-over for redundant Central Managers

Other Changes for 6.5.*
› Re-write of the condor_schedd
  • Support for scheduling dedicated jobs
  • Performance enhancements and lowered resource requirements (particularly RAM)
› Re-write of the checkpoint server
  • Enhanced support for multiple servers
  • Will store data files along with checkpoint files

Planned Future Ports
› Full support
  • Solaris 2.8 (Sparc and Intel)
  • Alpha Linux
  • Windows NT/2000
› Clipped support
  • PowerPC Linux

Possible Future Ports
› {free,open,net}BSD
› MacOS X
› HPUX 11.0
› AIX 4.2

Thank you for coming to Paradyn/Condor Week!