Todd Tannenbaum Computer Sciences Department University of Wisconsin-Madison Condor RoadMap.

Presentation transcript:

Todd Tannenbaum Computer Sciences Department University of Wisconsin-Madison Condor RoadMap

2 Outline
› The “Big Picture”
› Version 6.7.x
  › Availability
    › Failover
  › Scalability
    › Resources, jobs, matchmaking framework, files
  › Accessibility
    › APIs, more Grid middleware, network

3 Big Picture
What do we want to achieve in a new Condor developer series?
› Technology Transfer
  › Building a bridge between the Condor production software development activity and the academic core research activity
    › BAD-FS, Stork, Diskrouter, Parrot (transparent I/O), Schedd Glidein, VO Schedulers, HA, Management, Improved ClassAds…

4 What do we want to achieve, cont.
› New Ports: Go to where the cycles are!
› The RedHat Dilemma
› Our porting ‘hopper’:
  › AIX 5.1L on the PowerPC architecture
  › Redhat AS server on x86
  › Fedora Core on x86
  › Fedora Core 2 on x86
  › Redhat AS server on AMD64
  › SuSE 8.0 on AMD64
  › Redhat AS server on IA64
  › HPUX bit

5 What do we want to achieve, cont.
› Improve existing ports
  › Move “clipped wing” ports to full ports (w/ checkpoint, process migration)
    › Mac OS X, Windows
  › Better integration into environments
    › Windows: operate better w/ DFS, use MSI
    › Unix: operate w/ AFS

6 What do we want to achieve, cont.
› Address changes in the computing landscape
  › Firewalls, NATs
  › 64-bit operating systems
  › Emphasis on data
  › Movement towards standards such as WS, OGSA, …

7 Version 6.7.x Theme
› Version 6.7.x
  › Scalability
    › Resources, jobs, matchmaking framework, security
  › Availability
    › Failover
  › Accessibility
    › APIs, more Grid middleware, network

8 What happens if my submit machine reboots? Once upon a time, only one answer: job restarts. Checkpoint? No Checkpoint? High Availability in v6.7.x

9 New: Job progress continues if connection is interrupted
› For Vanilla and Java universe jobs, Condor now supports reestablishment of the connection between the submitting and executing machines.
› To take advantage of this feature, put the following line into your job’s submit description file:
  JobLeaseDuration =
  For example:
  JobLeaseDuration = 1200
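
As a rough illustration of the slide above, the following Python sketch renders a submit description file that carries the `JobLeaseDuration` line. The helper name `render_submit`, the executable name `my_job`, and the reading of the value as a number of seconds are assumptions made for this example; only the `JobLeaseDuration = 1200` line itself comes from the slide.

```python
def render_submit(executable: str, universe: str = "vanilla",
                  lease_seconds: int = 1200) -> str:
    """Return submit description text that requests a job lease.

    Hypothetical helper: the slide only shows the JobLeaseDuration line.
    The value is assumed here to be a duration in seconds.
    """
    # The slide names only Vanilla and Java universe jobs for this feature.
    if universe not in ("vanilla", "java"):
        raise ValueError("slide lists only Vanilla and Java universe jobs")
    if lease_seconds <= 0:
        raise ValueError("lease duration must be positive")
    lines = [
        f"universe = {universe}",
        f"executable = {executable}",      # "my_job" below is a placeholder
        f"JobLeaseDuration = {lease_seconds}",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_submit("my_job"), end="")
```

Keeping the lease value as a parameter makes it easy to pick a duration that covers the longest network interruption you expect between the submit and execute machines.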

10 What if the submission point spontaneously explodes? (don’t try this at home)

11 More High Availability Solutions
› Condor can support a submit machine “hot spare”
  › If your submit machine is down for longer than N minutes, a second machine can take over
› Two mechanisms available
  › Job Mirroring
    › Described by Jaime earlier today
  › High Availability Daemon Failover
    › Just tell the condor_master to run ONE instance

12 Daemon Failover
[Diagram: Machine A and Machine B each run a Master and a SchedD. Labels: Obtain Lock, Refresh Lock, Check Lock, Active, hot spare.]
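
The obtain/refresh/check cycle in the daemon failover diagram can be sketched as a lease-style lock. This is a minimal in-memory model, not Condor's implementation: the class `LeaseLock`, its method names, and the 300-second timeout are all hypothetical, and a real deployment would keep the lock somewhere both machines can reach.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeaseLock:
    """Hypothetical lock that goes stale if its holder stops refreshing it."""
    timeout: float                  # seconds without a refresh before takeover
    holder: Optional[str] = None
    refreshed_at: float = 0.0

    def is_stale(self, now: float) -> bool:
        # No holder, or the holder has not refreshed within the timeout.
        return self.holder is None or now - self.refreshed_at > self.timeout

    def try_obtain(self, machine: str, now: float) -> bool:
        """Obtain Lock: succeeds only if the lock is free, stale, or already ours."""
        if self.holder == machine or self.is_stale(now):
            self.holder = machine
            self.refreshed_at = now
            return True
        return False

    def refresh(self, machine: str, now: float) -> bool:
        """Refresh Lock: only the current holder may extend the lease."""
        if self.holder != machine:
            return False
        self.refreshed_at = now
        return True


if __name__ == "__main__":
    lock = LeaseLock(timeout=300)
    print(lock.try_obtain("machine-a", now=0))    # active machine takes the lock
    print(lock.refresh("machine-a", now=100))     # and keeps refreshing it
    print(lock.try_obtain("machine-b", now=200))  # hot spare checks: still held
    print(lock.try_obtain("machine-b", now=500))  # lease went stale: spare takes over
```

Passing `now` explicitly keeps the sketch deterministic; the timeout plays the role of the "N minutes" after which the second machine takes over.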

13 Accessibility
› Support for GCB
  › Condor working w/ NATs, Firewalls
› Distributed Resource Management Application API (DRMAA)
  › GGF Working Group
  › An API specification for the submission and control of jobs to one or more Distributed Resource Management (DRM) systems
  › Condor DRMAA interface to appear in v6.7.0

14 SOAP/Grid Service
[Diagram labels: condor_schedd, Cedar, OGSI: SOAP, HTTPG, Web Service: SOAP, HTTPS.]

15 New “Grid Universe”
› With the new Grid Universe, always specify a ‘gridtype’. So the old “globus” Universe is now declared as:
  universe = grid
  gridtype = gt2
› Other gridtypes?
  › GT3 for OGSA-based Globus Toolkit 3
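
To make the migration concrete, here is a small Python sketch that rewrites an old-style `universe = globus` line into the two-line form shown on the slide. The function name `to_grid_universe` and the sample submit text (including the executable name `my_job`) are hypothetical; the mapping from “globus” to `universe = grid` plus `gridtype = gt2` is the one the slide states.

```python
def to_grid_universe(submit_text: str) -> str:
    """Rewrite 'universe = globus' as 'universe = grid' plus 'gridtype = gt2'.

    Hypothetical helper for illustration; all other lines pass through unchanged.
    """
    out = []
    for line in submit_text.splitlines():
        key, sep, value = line.partition("=")
        is_old_globus = (
            sep == "="
            and key.strip().lower() == "universe"
            and value.strip().lower() == "globus"
        )
        if is_old_globus:
            out.append("universe = grid")
            out.append("gridtype = gt2")
        else:
            out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    old = "universe = globus\nexecutable = my_job\nqueue"
    print(to_grid_universe(old))
```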

16 Condor-G improvements
› Condor-G can submit to either Globus GT2 or GT3 resources, including support for GT3 with web services.
  › Condor-G includes everything required; no need for client to have a GT3 installation.
  › Good migration path to OGSA
› Condor-G to NorduGrid, Unicore, Condor, ORACLE
› Support for credential refresh via the MyProxy Online Credential Management in NMI

17 Why Condor + MyProxy?
› Long-lived tasks or services need credentials
  › Task lifetime is difficult to predict
› Don’t want to delegate long-lived credentials
  › Fear of compromise
› Instead, renew credentials with MyProxy as needed during the task’s lifetime
  › Provides a single point of monitoring and control
  › Renewal policy can be modified at any time
    › For example, disable renewals if compromise is detected or suspected

18 Credential Renewal
[Diagram: Condor-G Scheduler, MyProxy, Resource Manager, and Job, split between Home and Remote. Arrows: Submit Jobs, Enable Renewal, Launch Job, Retrieve Credentials, Refresh Credentials.]
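
The "renew as needed" idea, together with the ability to switch renewals off when a compromise is suspected, can be expressed as a small decision function. This is only a sketch under stated assumptions: `RenewalPolicy`, `should_renew`, and the one-hour threshold are hypothetical names and values, not part of Condor-G or MyProxy.

```python
from dataclasses import dataclass


@dataclass
class RenewalPolicy:
    """Hypothetical policy record; it can be changed at any time."""
    enabled: bool = True               # set to False if compromise is suspected
    min_remaining_seconds: int = 3600  # renew when less than this is left


def should_renew(remaining_seconds: int, policy: RenewalPolicy) -> bool:
    """Renew a short-lived credential only when it is close to expiry
    and the policy still allows renewals."""
    return policy.enabled and remaining_seconds < policy.min_remaining_seconds


if __name__ == "__main__":
    policy = RenewalPolicy()
    print(should_renew(600, policy))    # close to expiry: renew
    print(should_renew(7200, policy))   # plenty of lifetime left: wait
    policy.enabled = False              # compromise suspected
    print(should_renew(600, policy))    # renewals are now refused
```

Because the job only ever holds a short-lived credential, disabling the policy bounds how long a stolen credential stays useful.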

19 More…
› Condor can now transfer job data files larger than 2 GB in size.
  › On all platforms that support 64-bit file offsets
› Real-time spooling of stdout/err/in in any universe, including Vanilla
  › Real-time monitoring of job progress
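
The 2 GB figure lines up with the largest value a signed 32-bit file offset can hold, which is why the slide ties the feature to 64-bit offsets. The arithmetic can be checked with a short Python sketch; the constant and function names are hypothetical and chosen for this example only.

```python
# Largest byte offset representable in a signed 32-bit integer (2 GiB minus 1).
MAX_SIGNED_32BIT_OFFSET = 2**31 - 1


def needs_64bit_offsets(size_bytes: int) -> bool:
    """Return True if a file of this size cannot be addressed
    with a signed 32-bit offset."""
    if size_bytes < 0:
        raise ValueError("file size cannot be negative")
    return size_bytes > MAX_SIGNED_32BIT_OFFSET


if __name__ == "__main__":
    print(MAX_SIGNED_32BIT_OFFSET)
    print(needs_64bit_offsets(3 * 1024**3))  # a 3 GiB data file
```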

20 Thank you!