Condor-G: An Update.

Presentation transcript:

Condor-G: An Update

Outline
- What is Condor-G
- Past
- Present
- Future

What Is Condor-G
- Use Condor to run jobs on the Grid
- Uses Globus Toolkit
  - GRAM (submit a remote job)
  - GASS (transfer job's files)
- Two components
  - Globus Universe
  - GlideIn

Globus Universe
- Run a job on a Grid resource
- Features
  - Job management
  - Fault tolerance
  - Credential management
- Disadvantages
  - No remote syscalls, checkpoint/migration, or dynamic resource selection
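
As a rough illustration of what running a job in the Globus Universe looks like from the user's side, the sketch below builds the text of a submit description file of the kind passed to condor_submit. The gatekeeper host, executable name, and the helper function itself are hypothetical; the `universe = globus` and `globusscheduler` commands are the ones Condor-G used in this period, but treat the exact file as an assumption rather than a tested configuration.

```python
def globus_submit_description(executable, gatekeeper, count=1):
    """Return the text of a Globus Universe submit description (illustrative sketch)."""
    lines = [
        "universe        = globus",
        # Gatekeeper contact string: host plus the jobmanager for the site's
        # local scheduler (LSF in the slides). The host used below is made up.
        f"globusscheduler = {gatekeeper}",
        f"executable      = {executable}",
        "output          = job.$(Process).out",
        "error           = job.$(Process).err",
        "log             = job.log",
        f"queue {count}",
    ]
    return "\n".join(lines) + "\n"


# 600 jobs, matching the count shown in the diagrams.
submit_text = globus_submit_description(
    "my_simulation", "gatekeeper.example.org/jobmanager-lsf", count=600
)
print(submit_text)
```

Each queued job then shows up in the local Schedd like any other Condor job, which is where the job management and fault tolerance features listed above come in.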

How It Works [diagram, built up over five slides]
- Condor-G: Schedd (600 Globus jobs), GridManager
- Grid Resource: JobManager, LSF, User Job

GlideIn
- Create your own personal Condor pool from temporarily-acquired Grid resources
  - Brings the full power of Condor to the Grid
- Run a Condor startd on a Grid resource
- Startd reports back to your machine and runs Vanilla and Standard Universe jobs

How It Works [diagram, built up over seven slides]
- Condor-G: Schedd (600 Condor jobs, glide-ins), GridManager, Collector
- Grid Resource: JobManager, LSF, Startd, User Job
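
The glide-in flow above can be pictured with a toy model: startds launched on the Grid resource report back to the user's Collector, and queued Condor jobs are then placed on whichever startds have reported in. Everything in this sketch (the `Collector` class, `run_pool`, the startd names) is hypothetical and greatly simplified; it ignores Condor's actual matchmaking and only shows the "report back, then run jobs" idea.

```python
from collections import deque


class Collector:
    """Toy stand-in for the pool's collector: remembers which startds reported in."""

    def __init__(self):
        self.idle_startds = deque()

    def advertise(self, startd_name):
        self.idle_startds.append(startd_name)


def run_pool(job_queue, collector):
    """Place queued jobs on idle startds, one job per startd, in FIFO order."""
    placements = []
    while job_queue and collector.idle_startds:
        placements.append((job_queue.popleft(), collector.idle_startds.popleft()))
    return placements


collector = Collector()
# Each glide-in that starts on the Grid resource reports back to the user's collector.
for n in range(3):
    collector.advertise(f"glidein-{n}@grid-resource")

jobs = deque(f"job-{n}" for n in range(5))
placements = run_pool(jobs, collector)
print(placements)   # jobs that found a glide-in startd
print(list(jobs))   # jobs still waiting for more glide-ins
```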

[Diagram, built up over seven slides: Condor-G (600 Condor jobs, glide-ins) and a Globus Grid of PBS, LSF, and Condor resources]

Past
- GridManager daemon
  - Runs Grid jobs using GRAM protocol
  - Stages executable and standard I/O using GASS protocol
- Globus GRAM 1.5
  - We added fault-tolerance to the GRAM protocol
  - Changes included in Globus Toolkit 2.0 release

Present
- Updated Condor-G to Globus Toolkit 2.0
- Enhanced GridManager
- GAHP

Enhanced GridManager
- Put problem jobs on hold
  - No more stuck jobs
- Increase concurrency with GAHP
  - Almost ready

Single-Threaded Execution [diagram, animated over nine slides: GridManager with Job 1 through Job 4 and four Grid Resources]

Multi-Threaded Execution [diagram, animated over two slides: GridManager with Job 1 through Job 4 and four Grid Resources]
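
The contrast between the two execution diagrams can be sketched as follows. `contact_resource` is a hypothetical stand-in for one blocking request to a Grid resource (here just a short sleep); the point is only that the single-threaded loop waits for each request in turn, while the threaded version keeps all four in flight at once.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def contact_resource(job):
    """Hypothetical stand-in for one blocking request to a Grid resource."""
    time.sleep(0.05)  # pretend network round trip
    return f"{job}: submitted"


def run_single_threaded(jobs):
    # Each request must finish before the next one starts.
    return [contact_resource(job) for job in jobs]


def run_multi_threaded(jobs):
    # All requests are in flight at once; map() still returns results in job order.
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        return list(pool.map(contact_resource, jobs))


jobs = ["Job 1", "Job 2", "Job 3", "Job 4"]

start = time.perf_counter()
sequential_results = run_single_threaded(jobs)
sequential_seconds = time.perf_counter() - start

start = time.perf_counter()
parallel_results = run_multi_threaded(jobs)
parallel_seconds = time.perf_counter() - start

print(f"single-threaded: {sequential_seconds:.2f}s")
print(f"multi-threaded:  {parallel_seconds:.2f}s")
```

Both versions return the same results; only the waiting time differs, and the gap grows with the number of jobs and the latency of each resource.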

Globus Application Helper Protocol (GAHP)
- Condor is non-threaded
- Want to use multi-threaded libraries
  - Increased concurrency
- Put libraries in external helper process
- Simple interface over pipes/sockets
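
A minimal sketch of the "external helper process with a simple interface over pipes" idea is shown below. The command names (`SUBMIT`, `QUIT`), the line format, and the `HelperClient` class are invented for illustration and are not the actual GAHP command set; the helper here is a trivial Python script standing in for a process that would link the multi-threaded Grid libraries, and the exchange is synchronous for brevity.

```python
import subprocess
import sys

# Hypothetical helper: reads "<request-id> <command> <arg>" lines on stdin and
# answers "<request-id> <status> <detail>" on stdout.
HELPER_SOURCE = r'''
import sys
for line in sys.stdin:
    parts = line.split()
    if not parts:
        continue
    if parts[0] == "QUIT":
        break
    req_id, command, arg = parts[0], parts[1], parts[2]
    if command == "SUBMIT":
        result = "OK contact-for-" + arg
    else:
        result = "ERROR unknown-command"
    print(req_id, result, flush=True)
'''


class HelperClient:
    """Single-threaded client that talks to the helper over a pair of pipes."""

    def __init__(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", HELPER_SOURCE],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
        )
        self.next_id = 1

    def request(self, command, arg):
        req_id = str(self.next_id)
        self.next_id += 1
        self.proc.stdin.write(f"{req_id} {command} {arg}\n")
        self.proc.stdin.flush()
        reply_id, status, *detail = self.proc.stdout.readline().split()
        if reply_id != req_id:
            raise RuntimeError("reply does not match request")
        return status, detail

    def close(self):
        self.proc.stdin.write("QUIT\n")
        self.proc.stdin.flush()
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()


client = HelperClient()
print(client.request("SUBMIT", "job-1"))
client.close()
```

Because the client only reads and writes text lines, it needs no threads of its own; any concurrency lives inside the helper process.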

Multi-Threaded Execution with GAHP [diagram, animated over eight slides: GridManager with Job 1 through Job 4 and a GAHP Client, a GAHP Server, and four Grid Resources]

Future
- GRAM 1.6
- Condor-G on Windows
- Condor-G Grid service

Globus GRAM 1.6
- Working with Globus team to add additional features to GRAM protocol
  - Credential refresh
  - File staging
  - Scheduler-specific options

Condor-G for Windows
- Condor
  - Windows implementation available
- GRAM and GASS APIs
  - No C implementation for Windows (yet)
  - Java implementation (Java CoG)
- Condor-G
  - Windows version possible by writing GAHP server in Java

Condor-G Grid Service
- Reliable job submission service for higher-level applications
- Open Grid Services Architecture (OGSA)
  - SOAP, WSDL, WS-Inspection
- Implement Grid service interface for Condor-G (and Condor in general)

Thank You
- Condor-G demo on Wednesday
- Questions?
- 3351 CS
- Talk to me
- E-mail condor-admin@cs.wisc.edu