Faucets: the Charm++ Clusters Solution Tutorial


1 Faucets: the Charm++ Clusters Solution Tutorial
Laxmikant Kale, Jayant DeSouza, Sameer Kumar, Sindhura Bandhakavi, Mani Potnuru
Parallel Programming Laboratory, Department of Computer Science, University of Illinois at Urbana-Champaign
Presenter notes: demo; put up software, maybe CVS; fix components diagram, separate diagram for AQS; simpler scripts to configure, run
11/16/2018 Charm++ Workshop 2002

2 Motivation
Demand for high-end computational power, but:
  Dispersed
    Which machine would give me back my results quickest?
  Hard to use
    Use ssh to log in, ftp files, decide on a queue, create a script, submit
    Because of the hassle, users just submit the same script to the same machine even if a better alternative exists
    Monitor a running job
  Low operational efficiency of existing computing systems
Presenter note: first this, then outline

3 Outline
High-level
  Faucets, adaptive jobs and queuing system (AQS)
  Demo
Usage and Installation
  How to write an adaptive program
  Installing and using the AQS
  Adding your cluster to an existing faucets server
  Installing a faucets server

4 Solution 1: Faucets
Motivation #1: dispersed, hard to use
Central source of compute power
  Users
  Providers of compute resources
  User account not needed on every resource
Match users and providers
  Market economy?
  QoS requirements, contracts and bidding systems
GUI or web-based interface
  Submission monitoring

5 Faucets
Parallel systems need to maximize their efficiency!
Efficiency metrics are profit and utilization
[Figure: Faucets architecture. Labels: Faucets, Cluster (several), Job Submission, Job Specs, Bids, File Upload, Job Id, Job Monitor]

6 Motivation #2: Inefficient Utilization
Current job schedulers can have low system utilization!
[Figure: 16-processor system. Job A: 10 processors. Job B: 8 processors. Event labels: Allocate A, Conflict, B Queued]
Presenter notes: first why inefficient, then how adaptive jobs solve it; also mention external "fragmentation"

7 Motivation #2, contd.
Chun & Culler paper
  Compares FirstPrice (market-based scheduling) with PrioFIFO
  Up to 2.5x improvement as the degree of job parallelism increases
  Both have "head-of-line" blocking
Adaptive jobs fix this
Brent Chun and David Culler, "User-centric Performance Analysis of Market-based Cluster Batch Schedulers," CCGrid 2002.

8 Solution 2: Adaptive Jobs
Jobs that can shrink or expand the number of processors they are running on at runtime
Improve system utilization and response time
Properties
  Min_pe, related to the memory requirements of the job
  Max_pe, related to speedup
Scheduler can take advantage of this adaptivity

9 Adaptive Job Scheduler
Maximize system utilization and minimize response time
Scheduling decisions
  Shrink existing jobs when a new job arrives
  Expand jobs to use all processors when a job finishes
Processor map sent to the job
  Bit vector specifying which processors a job is allowed to use (for example: use 3, 4 and 5!)
Handles regular (non-adaptive) jobs
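
The processor map described on this slide can be illustrated with a small Python sketch. It assumes the map is stored as an integer whose bit i is set when processor i may be used. The function names and the integer encoding are hypothetical and chosen for illustration only; the slides do not show how the scheduler actually encodes or transmits the map.

```python
def make_processor_map(allowed, total_procs):
    """Return a bit vector (as an int) with bit i set for each allowed processor i.

    Hypothetical helper: the encoding is an assumption, not the AQS wire format.
    """
    bits = 0
    for p in allowed:
        if not 0 <= p < total_procs:
            raise ValueError(f"processor {p} is outside 0..{total_procs - 1}")
        bits |= 1 << p
    return bits


def allowed_processors(bits, total_procs):
    """List the processor indices whose bit is set in the map."""
    return [p for p in range(total_procs) if (bits >> p) & 1]


def map_to_string(bits, total_procs):
    """Render the map as a string of 0s and 1s, processor 0 first."""
    return "".join("1" if (bits >> p) & 1 else "0" for p in range(total_procs))


# The slide's example: the job is told to use processors 3, 4 and 5.
pmap = make_processor_map([3, 4, 5], total_procs=8)
print(map_to_string(pmap, 8))
print(allowed_processors(pmap, 8))
```

A shrink or expand request then amounts to sending the job a new map; the job's load balancer moves objects off any processor whose bit has been cleared.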

10 Two Adaptive Jobs
[Figure: 16-processor system running two adaptive jobs. Job A: Min_pe = 1, Max_pe = 10. Job B: Min_pe = 8, Max_pe = 16. Event labels: Allocate A, A Expands, Allocate B, Shrink A, B Finishes]

11 Demonstration

12 Outline
High-level
  Faucets, adaptive jobs and queuing system (AQS)
  Demo
Usage and Installation
  How to write an adaptive program
  Installing and using the AQS
  Adding your cluster to an existing faucets server
  Installing a faucets server
Presenter notes: summarize: motivation, described, demo; next we will get into the details of …

13 Adaptive Jobs

14 Adaptive Job Framework
Use the Charm++ framework
Applications written in MPI or Charm++
Scheduler controls the processor map for each job
Processor map is used by the job's load balancer
[Figure labels: Scheduler, Proc. Map, Adaptive Application, AMPI, CHARM++, Loadbalancer, Converse]

15 Charm++
Charm++: object-based virtualization
  Program written as a large number of objects which can migrate
  Number of objects typically much larger than the number of processors
  Load balancer can remap objects
  Measurement-based load balancing
Charm++ is a data-driven message-passing language

16 Adaptive Charm++ Programs
A Charm++ program is automatically adaptive if a shrink-expand-enabled centralized load-balancing strategy is used
Currently CommLB and RandcentLB are shrink-expand enabled
Compile with +balancer CommLB

17 MPI Jobs
How do we make MPI jobs adaptive? AMPI
AMPI maps the MPI processes to user-level threads, which can migrate
Each thread is embedded in a Charm++ object, thus allowing load balancing and shrink-expand
Use the Charm++ framework

18 Writing Adaptive AMPI Programs
Build AMPI with an adaptive load-balancing strategy
Call MPI_MIGRATE() at regular intervals in each MPI process; otherwise the process will not respond to the processor map

19 Performance Results for Adaptive Jobs

20 Shrink Expand Overhead
Performance for MD program with 10MB migrated data per processor on NCSA Platinum
[Table: Processors, Shrink Time (s), Expand Time (s). The processor counts are missing from the transcript; timing values as extracted: 0.49 0.56 0.46 0.59 0.54 0.66 0.50 0.61]

21 Residual Processes
Shrink
  Objects are moved from the unallocated processors to the allocated processors
  Leaves behind a residual process
More work being done on the load balancer
Many strategies have been implemented
Obvious question: how long does it take to shrink and expand?
New call MPI Migrate
Presenter note: repetition, eliminate

22 Effect of Residual Process
Performance on a 16-processor system

Jobs in system    Performance cost (%)
2                 1.98
4                 1.43
8                 3.24

[Figure: Performance of Job1 and Job2. Axes: Utilization (%), Time (s)]
Presenter note: Now that we are convinced of the adaptive job implementation, how much does system performance improve with adaptive jobs?

23 Adaptive Queuing System

24 Add figure here

25 AQS Features
Multithreaded
Reliable and robust
  Tested on the cool.cs Linux cluster at PPL
Supports most features of standard queuing systems
Has the ability to manage adaptive jobs, currently implemented in Charm++ and MPI
For more details check out

26 Components
Database
Job Scheduler
Compute Cluster
PPL@UIUC.EDU

27 Installing Database
Download latest version of MySQL
Install, then:
  mysql> create database <dbname>;
  mysql> use <dbname>;
  mysql> create table jobInfo (id mediumint primary key NOT NULL DEFAULT '0' auto_increment, …..)
  mysql> grant all on *.* to <user> identified by <passwd>;

28 Installing Scheduler
  cd charm/net-linux/pgms/scheduler;
  make scheduler; make client;
Edit Makefile, put correct path to MySQL
Running scheduler as root
  su
  chown root scheduler; chmod +s scheduler
  ./startScheduler

29 Installing Scheduler, contd.
Edit the startScheduler file:
  Edit Database to match the <dbname> used earlier
  Edit PORT to point to the port of the scheduler
  Edit DATABASE_HOST, DATABASE_USER and DATABASE_PASSWD to point to the database host, user and password
  NODELIST points to the nodelist for the scheduler

30 Configuring The Cluster
Users must have access to the cluster only through the queuing system
Each node runs an rsh daemon
Access to rsh through a restrictive group
The scheduler switches to the rsh group before running the job
Only the head node can rsh to the other nodes
rsh disabled on the compute nodes
All connections through Unix sockets

31 Using the AQS locally
frun runs a job interactively
fsub submits a batch job
fkill kills the job
fjobs lists the running and queued jobs

32 Scheduling Events
When:
  Job arrival
  Job completion
  Job requests a change in the number of processors
  Job suspension
Scheduling Strategy
  A pluggable component that makes decisions on which jobs to schedule
Presenter note (TBD): order of slides, should this be earlier?

33 Scheduling Strategy Studied
Similar to equipartitioning [N Islam et al]
On job arrival and job completion:
  All running jobs and the new one are allocated their minimum number of processors
  Leftover processors are shared equally, subject to each job's maximum processor usage
  If it is not possible to allocate the new job its minimum number of processors, it is queued
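
The three rules above can be sketched in a few lines of Python. This is a minimal illustration and not the AQS implementation: the function name `equipartition`, the (name, min_pe, max_pe) tuple layout, and the choice to share leftover processors one at a time in round-robin order are all assumptions made for the example.

```python
def equipartition(jobs, total_procs):
    """Allocate processors to jobs in the spirit of the strategy on this slide.

    jobs: list of (name, min_pe, max_pe) tuples in arrival order (hypothetical layout).
    Returns (alloc, queued): a dict name -> processor count, and a list of queued names.
    """
    alloc = {}
    queued = []
    free = total_procs

    # Rule 1: every job that fits gets its minimum; a job that does not fit is queued.
    for name, min_pe, _max_pe in jobs:
        if min_pe <= free:
            alloc[name] = min_pe
            free -= min_pe
        else:
            queued.append(name)

    # Rule 2: share leftover processors equally, one at a time, capped at max_pe.
    limits = {name: max_pe for name, _min_pe, max_pe in jobs}
    growable = [name for name in alloc if alloc[name] < limits[name]]
    while free > 0 and growable:
        for name in list(growable):
            if free == 0:
                break
            alloc[name] += 1
            free -= 1
            if alloc[name] == limits[name]:
                growable.remove(name)

    return alloc, queued


# Job parameters taken from the "Two Adaptive Jobs" slide, on a 16-processor system.
print(equipartition([("A", 1, 10)], 16))
print(equipartition([("A", 1, 10), ("B", 8, 16)], 16))
```

Alone, Job A grows to its Max_pe of 10 and leaves processors idle. Once Job B arrives, both jobs are reset to their minimums and the seven leftover processors are split between them. How the real scheduler breaks ties when the leftover count is odd is not stated in the slides.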

34 Scheduler Performance
Simulation results on 64 processors with mean job execution time of 64.5 sec
λ = arrival rate, MRT = mean response time, Utilization = processor utilization, Load factor (lf) = execution time * λ

lf     Traditional Jobs              Adaptive Jobs                 1/λ (s)
       Utilization (%)   MRT (s)     Utilization (%)   MRT (s)
1.08   76                488         92                164         60
1.0    71                396         88                143         64.5
0.13   9                 165         13                68          500

Two rows are incomplete in the transcript; values as extracted:
  lf 0.65, 1/λ 100 s: 46 233 96
  lf 0.32, 1/λ 200 s: 23 185 31

Presenter notes: define lambda, MRT, utilization and load factor; mention Poisson arrival time and exponential service time
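
The load factor definition on this slide (lf = execution time * λ, with λ the arrival rate) can be checked against the 1/λ column with a short Python sketch. The function name `load_factor` and its parameter names are chosen for illustration only.

```python
def load_factor(mean_exec_time_s, mean_interarrival_s):
    """Load factor as defined on the slide: lf = execution time * lambda.

    lambda is the arrival rate, so lambda = 1 / (mean time between arrivals).
    """
    if mean_interarrival_s <= 0:
        raise ValueError("mean inter-arrival time must be positive")
    arrival_rate = 1.0 / mean_interarrival_s
    return mean_exec_time_s * arrival_rate


# Simulation slide: mean job execution time of 64.5 s.
for interarrival in (60, 64.5, 100, 200, 500):
    lf = load_factor(64.5, interarrival)
    print(f"1/lambda = {interarrival:>5} s  ->  lf = {lf:.3f}")
```

For example, 64.5 s / 60 s = 1.075, which the slide reports as 1.08, and 64.5 s / 500 s = 0.129, reported as 0.13.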

35 Experimental Results
Experiments on Linux cluster on 64 processors and mean job execution time of 60 sec

lf     Traditional Jobs              Adaptive Jobs                 1/λ (s)
       Utilization (%)   MRT (s)     Utilization (%)   MRT (s)
1.0    74                303         99                211         60
0.6    49                116         68                76          100
0.3    23                108         29                70          200
0.12   9                 109         17                89          500

Presenter note: MINPE and MAXPE values for adaptive and traditional jobs

36 Adding a Cluster to Faucets

37 PPL@UIUC.EDU
Presenter note: re-explain motivation for running own faucets server

38 Adding new cluster
Prerequisites
  Install Charm++
  Install Adaptive Queuing System
Then
  Download the faucets software
  Compile the cluster daemon (CD)
    cd faucets/cd; make
  Run the cluster daemon (CD)
    cd ..
    java cd.ClusterDaemon <central server> <central server port> -p <ClusterDaemon port> <working dir>

39 Installing a Faucets Server

40 Add figure here

41 Installing a Faucets Server
Install MySQL
  create tables
  grant permissions
Download JDBC driver
Install CS
  download faucets code and unpack
  cd faucets/cs; make
  Edit faucets/cs/db.properties
  cd faucets
  java -cp .:/path/to/mm.mysql bin.jar TheServer

42 Installing Appspector
Installation is a little involved
Each application needs a display module written in Java
Contact us if you want to install

43 Summary and Future Work
Demonstrated the system
Showed you how to use and install the Charm++/AMPI adaptive job system
Go to to download
Future
  Extend the system to other parallel machines
  Eliminate residual processes
  Integrate the scheduler with Globus
  More comprehensive QoS contracts being developed
  Sophisticated bidding schemes for the faucets framework
  Bidding schemes to include memory, deadline, profit, etc.

