
Adaptive 2011, Rome, Sep 2011
Adapting to the Unknown With a few Simple Rules: The glideinWMS Experience
by Igor Sfiligoi (1), Benjamin Hass (1), Frank Würthwein (1), and Burt Holzman (2)
(1) UCSD  (2) FNAL

The Grid landscape
● Within scientific Grid environments (e.g. OSG, EGI):
  ● Many highly autonomous Grid sites, with nothing shared between them
  ● Many diverse user communities
● How can users efficiently schedule their jobs?

Scheduling problem
● Grid sites expose only partial information
  ● Access to finer details is restricted to site admins
● Each user community wants its independence
  ● No centralized, Grid-wide job scheduling
● As a result:
  ● Even the near future cannot be predicted accurately
  ● Partitioning jobs across sites is mostly guesswork
  ● Adapting to the ever-changing state is a must

Traditional approaches
● Force sites to expose as much info as possible
  ● Sites end up publishing lots of garbage
● Implement retries
  ● Long tail before ALL jobs in a workflow finish
● Start at many sites concurrently, then kill some
  ● Wasteful, and with semantic problems
● Overall: mediocre results and complex code

The glideinWMS
● The glideinWMS approach to the problem:
  ● Use the pilot paradigm
  ● Pressure-based scheduling
  ● Avoid using external information
  ● Range reduction
● The glideinWMS is a Grid job scheduler initially developed at FNAL by the CMS experiment
  ● Based on the CDF glideCAF concept
  ● With contributions from several other institutes
  ● Widely used in OSG, with a large instance at UCSD

The pilot paradigm
● Send pilots to Grid sites (never user jobs)
  ● Pilots are not user specific: one overlay pool per user community
  ● The pilots create a dynamic overlay pool of compute resources
● User jobs are scheduled within this overlay pool
● Scheduling in the overlay pool is easy
  ● Complete info
  ● Full control
● The problem is moved to the pilot submitter
(Diagram: a pilot submitter sends pilots to Site 1 … Site N, which together form the overlay pool)
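To make the overlay-pool idea concrete, here is a minimal Python sketch of it. This is only an illustration of the concept, with invented names (PilotSlot, OverlayPool, register_pilot, match); in the actual glideinWMS the overlay pool is an HTCondor pool and the matching is done by HTCondor itself.

from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

@dataclass
class PilotSlot:
    site: str                                  # Grid site where the pilot landed
    attrs: Dict = field(default_factory=dict)  # everything the pilot reports about its node
    job: Optional[str] = None                  # user job currently running here, if any

class OverlayPool:
    """Central pool the pilots call home to; user jobs are matched against it."""
    def __init__(self) -> None:
        self.slots: List[PilotSlot] = []

    def register_pilot(self, slot: PilotSlot) -> None:
        # A pilot registers itself once it starts on a worker node.
        self.slots.append(slot)

    def match(self, job_id: str, requirements: Callable[[Dict], bool]) -> Optional[PilotSlot]:
        # Scheduling inside the pool is easy: every slot is fully described
        # and fully under the community's control.
        for slot in self.slots:
            if slot.job is None and requirements(slot.attrs):
                slot.job = job_id
                return slot
        return None  # no free matching slot yet; more pilots are needed

# Example: a pilot started at one site, and a job needing >= 2 GB of memory.
pool = OverlayPool()
pool.register_pilot(PilotSlot(site="Site_1", attrs={"memory_mb": 4096}))
print(pool.match("job_42", lambda a: a.get("memory_mb", 0) >= 2048))

Once a pilot has registered, matching a job to it needs only the information the pilot itself reports, which is exactly the "complete info, full control" point above.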

Is pilot scheduling easier?
● User jobs:
  ● Every job is important => users wait for the last one to finish, so each and every one must be handled
  ● A failed job is a problem for the user
  ● Many users => priority handling needed
● Pilot jobs:
  ● All pilots are the same => only the number of pilot jobs counts
  ● A failed pilot job is just wasted CPU time
  ● Single credential => no need to prioritize between them

Pressure-based scheduling
● The glideinWMS pilot scheduling is based on the concept of pilot pressure
  ● Keep a fixed number of pending pilots in the remote queues, site by site
● Furthermore, pilot scheduling is split from pilot submission
  ● Scheduling decisions are made in the VO frontend
  ● Submission is handled by the glidein factory, as sketched below
(Diagram: the VO frontend sends requests (R) to the glidein factory, which submits pilots (P) to Site 1 … Site N)
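A hedged sketch of that split, with invented names (frontend_cycle, factory_cycle, and their arguments), assuming the frontend only decides how many pilots should be pending at each site and the factory only tops the remote queues up to that level:

def frontend_cycle(sites, idle_jobs_per_site, compute_pressure):
    """VO frontend: decide the desired number of pending pilots per site."""
    return {s: compute_pressure(idle_jobs_per_site.get(s, 0)) for s in sites}

def factory_cycle(requests, pending_pilots, submit_pilot):
    """Glidein factory: submit just enough pilots to reach the requested pressure."""
    for site, wanted in requests.items():
        missing = wanted - pending_pilots.get(site, 0)
        for _ in range(max(missing, 0)):
            submit_pilot(site)   # in practice a Grid (e.g. Condor-G) submission to the site

# Toy run with a placeholder pressure function (the real one is on the next slides).
requests = frontend_cycle(
    sites=["Site_1", "Site_2"],
    idle_jobs_per_site={"Site_1": 30, "Site_2": 4},
    compute_pressure=lambda idle: min(idle, 10),
)
factory_cycle(requests, pending_pilots={"Site_1": 5}, submit_pilot=print)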

Determining the pressure
● Calculating the proper pressure is important
  ● Too low => small overlay pool => long job wait times
  ● Too high => no jobs left when a pilot starts => wasted CPU
● The pressure must be recalculated often, and each site s has its own pressure: P_s(t) = f(I_s(t))
● The only input to the pressure calculation is the number of matching pending (i.e. idle) user jobs
  ● The Grid status is incomplete and unreliable, so it is not used
  ● Jobs that can run on multiple Grid sites are counted as the appropriate fraction against each site
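As an illustration of the fractional counting, a small sketch (the function name idle_jobs_per_site and its input format are invented for this example): an idle job that matches N sites contributes 1/N to each of them, so the total pressure input still sums to the number of idle jobs.

from collections import defaultdict

def idle_jobs_per_site(idle_jobs):
    """idle_jobs: one list of matching sites per idle user job."""
    counts = defaultdict(float)
    for matching_sites in idle_jobs:
        if not matching_sites:
            continue                       # job matches no site => no pressure anywhere
        share = 1.0 / len(matching_sites)  # split the job across its candidate sites
        for site in matching_sites:
            counts[site] += share
    return dict(counts)

# Three idle jobs: one can run at two sites, two can run only at Site_1.
print(idle_jobs_per_site([["Site_1", "Site_2"], ["Site_1"], ["Site_1"]]))
# -> {'Site_1': 2.5, 'Site_2': 0.5}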

Simple pressure function
● Experience tells us that Grid jobs have relatively flat start and terminate rates
  ● Typically O(10) per few minutes, at most O(100) per few minutes
● So the pressure can be capped in the O(10) range
  ● With such a small range, tuning matters only when there are few idle jobs
● Using the simple heuristic of dividing by 3, just to have a reasonable edge-case policy:
  f(I_s(t)) = ⌈min(I_s(t)/3, C_s)⌉
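The same function written out in Python. The default cap of 10 is only an assumption matching the "O(10) range" above; in a real deployment C_s would be a per-site configuration value.

import math

def pressure(idle_matching_jobs, cap=10):
    """Desired number of pending pilots at a site: ceil(min(I/3, C))."""
    return math.ceil(min(idle_matching_jobs / 3.0, cap))

assert pressure(0) == 0      # no idle jobs => no pilots requested
assert pressure(4) == 2      # ceil(4/3): a small edge-case buffer
assert pressure(1000) == 10  # capped: Grid start rates are only O(10) per few minutes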

Operational experience (1)
● CMS has 2 years of operational experience with glideinWMS
  ● Serving O(4k) users
  ● Using about O(100) Grid sites located in the Americas, Europe and Asia
(Plots: Grid sites concurrently used; status of the overlay pool)

Operational experience (2)
● CMS has 2 years of operational experience with glideinWMS
● The glideinWMS logic works very efficiently
  ● Quick job startup times
(Plot: status of the overlay pool)

Operational experience (3)
● CMS has 2 years of operational experience with glideinWMS
● The glideinWMS logic works very efficiently
  ● Quick job startup times
  ● Little over-provisioning (~5%)
(Plot: status of the overlay pool)

Related work
● Non-pilot WMS (i.e. direct submission)
  ● gLite WMS and OSG MM
  ● More complex and brittle, since they require accurate and complete info from the Grid sites
● Pilot WMS
  ● PANDA – also pressure based, but with a basically constant pressure over time => high load on the sites
  ● DIRAC and MyCluster – require services at the Grid sites to gather the site state => many Grid sites do not allow this

Summary
● Direct Grid-wide job scheduling is hard
  ● The pilot paradigm simplifies it by making it uniform
● The glideinWMS uses pressure-based logic
  ● Based only on the number of pending user jobs
  ● The pressure function is capped => simple rules
● The CMS experience at UCSD shows it works
  ● And it works well

For more information
● The glideinWMS home page
● Relevant papers:
  ● I. Sfiligoi et al., "The pilot way to grid resources using glideinWMS," 2009 WRI World Congress on Computer Science and Information Engineering (CSIE), vol. 2, 2009, doi:10.1109/CSIE.2009.950
  ● The CMS Collaboration et al., "The CMS experiment at the CERN LHC," JINST, vol. 3, S08004, 2008, doi:10.1088/1748-0221/3/08/S08004

Acknowledgment
● This work is partially sponsored by:
  ● the US Department of Energy under Grant No. DE-FC02-06ER41436, subcontract No. 647F290 (OSG), and
  ● the US National Science Foundation under Grant No. PHY (CMS Maintenance & Operations).

Copyright notice
● This presentation contains graphics copyrighted by Toon-a-day, licensed to Igor Sfiligoi for use in this presentation
● Any other use is strictly prohibited