Faucets: Efficient Utilization of Multiple Clusters
Laxmikant Kale, Jayant DeSouza, Sameer Kumar, Sindhura Bandhakavi, Mani Potnuru
Parallel Programming Laboratory, Department of Computer Science, University of Illinois at Urbana-Champaign
http://charm.cs.uiuc.edu/
2/23/2019 Charm++ Workshop 2002
Outline
- Motivation, and the Faucets solution and the Adaptive Jobs solution
- Faucets
  - Job Submission
  - Job Monitoring
- Adaptive Jobs, and Performance Results
- Adaptive Queuing System, and Simulations and Performance Results
- Future Work
Motivation
- Demand for high-end compute power, but the resources are:
  - Dispersed
    - Which machine will give me back my results quickest?
  - Hard to use
    - Use ssh to log in, ftp files, decide on a queue, create a script, submit
    - Because of the hassle, users just submit the same script to the same machine even if a better alternative exists
    - Monitor a running job
- Low operational efficiency of existing computing systems
Solution 1: Faucets
- Motivation #1: dispersed, hard to use
- Central source of compute power
  - Users
  - Providers of compute resources
- User account not needed on every resource
- Match users and providers
  - Market economy?
  - QoS requirements, contracts, and bidding systems
- GUI or web-based interface
  - Submission
  - Monitoring
Faucets
[Diagram: Job Submission and Job Monitor clients connected through Faucets to several clusters; arrows labeled Job Specs, Bids, File Upload, and Job Id]
- Parallel systems need to maximize their efficiency!
- Efficiency metrics are profit and utilization
http://charm.cs.uiuc.edu/research/faucets
Motivation #2: Inefficient Utilization
[Diagram: a 16-processor system. Job A (10 processors) is allocated; Job B (8 processors) conflicts with it and is queued.]
- Current job schedulers can have low system utilization!
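
The idle capacity in this example can be checked with a little arithmetic. The sketch below assumes a rigid first-come-first-served scheduler in which every job needs a fixed processor count; the function name is illustrative and not part of Faucets.

```python
def rigid_fcfs_utilization(total_pes, job_sizes):
    """Admit fixed-size jobs in arrival order; stop at the first one that does not fit."""
    used = 0
    for size in job_sizes:
        if used + size > total_pes:
            break  # this job and every job behind it must wait
        used += size
    return used / total_pes


# Job A needs 10 processors and Job B needs 8, on a 16-processor system.
utilization = rigid_fcfs_utilization(16, [10, 8])
print(f"{utilization:.1%}")  # 62.5%
```

Under these assumptions, 6 of the 16 processors sit idle for as long as Job A runs, because Job B cannot start on fewer than 8.
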
Motivation #2, contd.
- Chun & Culler paper
  - Compares FirstPrice (market-based scheduling) with PrioFIFO
  - Up to 2.5x improvement as the degree of job parallelism increases
- Both have “head-of-line” blocking
  - Adaptive jobs fix this
Brent Chun and David Culler, "User-centric Performance Analysis of Market-based Cluster Batch Schedulers," CCGrid 2002.
Solution 2: Adaptive Jobs
- Jobs that can shrink or expand the number of processors they are running on at runtime
- Improve system utilization and response time
- Properties
  - Min_pe: the minimum number of processors, related to the memory requirements of the job
  - Max_pe: the maximum number of processors, related to speedup
Adaptive Job Scheduler
- Scheduler can take advantage of this adaptivity
  - Improve system utilization and response time
- Scheduling decisions
  - Shrink existing jobs when a new job arrives
  - Expand jobs to use all processors when a job finishes
- Processor map sent to the job
  - Bit vector specifying which processors a job is allowed to use
  - Example: 00011100 (use processors 3, 4, and 5)
- Handles regular (non-adaptive) jobs
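
The bit-vector processor map described above can be illustrated with a short sketch. It assumes the map is a string of '0' and '1' characters read left to right with processors numbered from 0, which matches the slide's example; the actual wire format is not given in the source, and both function names are hypothetical.

```python
def allowed_processors(proc_map: str) -> list[int]:
    """Return the indices whose bit is set, counting from 0 at the left."""
    return [i for i, bit in enumerate(proc_map) if bit == "1"]


def make_proc_map(total_pes: int, allowed: set[int]) -> str:
    """Build the bit-vector string for a set of allowed processor indices."""
    return "".join("1" if i in allowed else "0" for i in range(total_pes))


print(allowed_processors("00011100"))  # [3, 4, 5]
print(make_proc_map(8, {3, 4, 5}))     # 00011100
```
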
Two Adaptive Jobs
[Diagram: a 16-processor system over time. Job A is allocated; when Job B arrives, A shrinks and B is allocated; when B finishes, A expands. Job A: Min_pe = 1, Max_pe = 10. Job B: Min_pe = 8, Max_pe = 16.]
Faucets: Job Submission
Submission Mechanism
- QoS requirements, contract, bidding
  - Type, number of processors
  - Memory
  - Estimated compute time, or a table of processors vs. compute time
  - Deadline
  - Price
- Authentication, security
- Accounting
- Cluster Bartering
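
To make the QoS items above concrete, here is a minimal sketch of how a job specification with a processors-vs-compute-time table might be represented. Everything in it is an assumption for illustration: the class, its field names, and the sample numbers are hypothetical and do not come from the Faucets software.

```python
from dataclasses import dataclass, field


@dataclass
class JobSpec:
    """Hypothetical QoS record; field names are illustrative, not the Faucets format."""
    min_pe: int
    max_pe: int
    memory_mb: int
    deadline_s: float
    price: float
    # Table mapping a processor count to the estimated compute time in seconds.
    time_table: dict[int, float] = field(default_factory=dict)

    def feasible_allocations(self) -> list[int]:
        """Processor counts from the table that respect the bounds and meet the deadline."""
        return sorted(
            pes for pes, secs in self.time_table.items()
            if self.min_pe <= pes <= self.max_pe and secs <= self.deadline_s
        )


# Made-up numbers, purely for illustration.
spec = JobSpec(min_pe=4, max_pe=16, memory_mb=512, deadline_s=600.0, price=10.0,
               time_table={4: 900.0, 8: 500.0, 16: 300.0})
print(spec.feasible_allocations())  # [8, 16]
```

A cluster that receives such a record could use the table to decide whether it can meet the deadline with the processors it has free before placing a bid.
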
Job Monitoring: Appspector
Using Appspector
- Charm client-server (CCS) interface
- Default server
  - User can write program code to send relevant data
- Default Java client
  - User can write a Java class to display data
Clusters Status View
Adaptive Jobs
Adaptive Job Framework
- Applications written in MPI or Charm++
- Scheduler controls the processor map for each job
- Processor map is used by the job’s load balancer
- Use the Charm++ framework
[Diagram: Scheduler, Proc. Map, and an Adaptive Application stack labeled AMPI, CHARM++, Loadbalancer, Converse]
Charm++
- Charm++: object-based virtualization
  - Program written as a large number of objects, which can migrate
  - Number of objects is typically much larger than the number of processors
- Load balancer can remap objects
  - Measurement-based load balancing
- Charm++ is a data-driven message-passing language
Adaptive Charm++ Programs
- A Charm++ program is automatically adaptive if a shrink-expand-enabled centralized load-balancing strategy is used
- Currently CommLB and RandCentLB are shrink-expand enabled
  - Compile with -module CommLB
  - Run with +balancer CommLB
MPI Jobs
- How do we make MPI jobs adaptive?
- AMPI
  - AMPI maps the MPI processes to user-level threads, which can migrate
  - Each thread is embedded in a Charm++ object, thus allowing load balancing and shrink-expand
- Use the Charm++ framework
Adaptive AMPI Programs
- Build AMPI with an adaptive load-balancing strategy
- Call MPI_MIGRATE() at regular intervals in each MPI process; otherwise the job will not respond to changes in the processor map
Performance Results for Adaptive Jobs
Shrink Expand Overhead
Expand Time (s)   Shrink Time (s)   Processors
0.49              0.56              16, 8
0.46              0.59              32, 16
0.54              0.66              64, 32
0.50              0.61              128, 64
Performance for an MD program with 10 MB of migrated data per processor on NCSA Platinum
Residual Processes
- Shrink
  - Objects are moved from the unallocated processors to the allocated processors
  - Leaves behind a residual process
- More work is being done on the load balancer
  - Many strategies have been implemented
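
The shrink step described above, moving objects off processors the job may no longer use, can be sketched as follows. This is a simplified illustration and not the Charm++ load balancer: it assumes a string bit-vector processor map, treats every object as equal load (the real strategies are measurement based), and does not model the residual process. All names are hypothetical.

```python
def shrink(placement: dict[str, int], proc_map: str) -> dict[str, int]:
    """Move objects off processors that the processor map no longer allows.

    placement maps an object name to its current processor index.
    Each evacuated object goes to the allowed processor holding the fewest objects.
    """
    allowed = [i for i, bit in enumerate(proc_map) if bit == "1"]
    if not allowed:
        raise ValueError("processor map allows no processors")
    load = {pe: 0 for pe in allowed}
    new_placement = {}
    # Objects already on allowed processors stay where they are.
    for obj, pe in placement.items():
        if pe in load:
            new_placement[obj] = pe
            load[pe] += 1
    # Objects on processors being given up are evacuated one by one.
    for obj, pe in placement.items():
        if pe not in load:
            target = min(allowed, key=lambda p: load[p])
            new_placement[obj] = target
            load[target] += 1
    return new_placement


# Six objects on four processors; the job is shrunk to processors 0 and 1.
placement = {"a": 0, "b": 1, "c": 2, "d": 3, "e": 0, "f": 1}
result = shrink(placement, "1100")
print(sorted(result.items()))
```

Because the number of objects is typically much larger than the number of processors, the evacuated objects can be spread over the remaining processors without changing the application code.
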
Effect of Residual Process
Jobs in System   Performance Cost (%)
2                1.98
4                1.43
8                3.24
Performance on a 16-processor system
[Plot: Performance of Job1 and Job2; axes: Utilization (%) and Time (s)]
Adaptive Queuing System
AQS Features
- Multithreaded
- Reliable and robust
- Tested on the cool.cs Linux cluster at PPL
- Supports most features of standard queuing systems
- Has the ability to manage adaptive jobs, currently implemented in Charm++ and MPI
- Handles regular (non-adaptive) jobs
AQS Scheduling Strategy
- A library component that decides which jobs to schedule
- Similar to equipartitioning [N Islam et al]
- On job arrival and job completion:
  - All running jobs and the new one are allocated their minimum number of processors
  - Leftover processors are shared equally, subject to each job's maximum processor usage
  - If it is not possible to allocate the new job its minimum number of processors, it is queued
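
The strategy above can be sketched in a few lines. This is an illustrative reading of the slide, not the AQS code: the function name, the tuple layout, and the round-robin way of sharing leftover processors (including how an odd remainder is split) are assumptions. It also assumes the running jobs' minimums already fit, since they were admitted earlier.

```python
def allocate(total_pes, running, new_job=None):
    """Return (allocation, queued) for the jobs in the system.

    Jobs are (name, min_pe, max_pe) tuples; allocation maps name -> processors.
    queued is True when the new job's minimum cannot be met.
    """
    jobs = list(running)
    queued = False
    if new_job is not None:
        if sum(j[1] for j in jobs) + new_job[1] <= total_pes:
            jobs.append(new_job)
        else:
            queued = True
    # Step 1: every admitted job gets its minimum.
    alloc = {name: min_pe for name, min_pe, _ in jobs}
    max_pe = {name: mx for name, _, mx in jobs}
    leftover = total_pes - sum(alloc.values())
    # Step 2: share the leftover one processor at a time, skipping jobs at their maximum.
    while leftover > 0:
        growable = [n for n in alloc if alloc[n] < max_pe[n]]
        if not growable:
            break
        for name in growable:
            if leftover == 0:
                break
            alloc[name] += 1
            leftover -= 1
    return alloc, queued


# Job A (Min_pe = 1, Max_pe = 10) alone on a 16-processor system.
print(allocate(16, [("A", 1, 10)]))                  # ({'A': 10}, False)
# Job B (Min_pe = 8, Max_pe = 16) arrives: A shrinks and both run.
print(allocate(16, [("A", 1, 10)], ("B", 8, 16)))    # ({'A': 5, 'B': 11}, False)
```

With rigid jobs of 10 and 8 processors the second job would have to wait; here both run and all 16 processors are in use.
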
Simulated Utilization
Simulated MRT
Experimental Utilization
Experimental MRT
Summary and Future Work
- Ease of use: Faucets
- Better utilization: Charm++/AMPI Adaptive Jobs
- Go to http://charm.cs.uiuc.edu/research/faucets to download
- Future
  - Extend the system to other parallel machines
  - Eliminate residual processes
  - Integrate the scheduler with Globus
  - More comprehensive QoS contracts being developed
  - Sophisticated bidding schemes for the Faucets framework
  - Bidding schemes to include memory, deadline, profit, etc.