A Framework for Elastic Execution of Existing MPI Programs
Aarthi Raveendran, Graduate Student, Department of CSE
Motivation
Emergence of Cloud Computing, including for HPC applications
Key advantages of Cloud Computing: elasticity (dynamically acquire resources) and the pay-as-you-go model
These can be exploited to meet cost and/or time constraints
Existing HPC applications are MPI-based and use a fixed number of nodes
Need to make existing MPI applications elastic
Outline
Research Objective
Framework Design
Runtime Support Modules
Experimental Platform: Amazon Cloud Services
Applications and Experimental Evaluation
Decision Layer Design
Feedback Model
Decision Layer Implementation
Experimental Results for Time and Cost Criteria
Conclusion
Detailed Research Objective
Make MPI applications elastic, exploiting a key advantage of Cloud Computing
Meet user-defined time and/or cost constraints
Avoid a new programming model or significant recoding
Design a framework for:
Decision making: when to expand or contract
Actual support for elasticity: allocation, data redistribution, restart
Framework Components
Framework Design: Approach and Assumptions
Target: iterative HPC applications
Assumption: uniform work is done in every iteration
Monitoring at the start of every few iterations of the time-step loop
Checkpointing
Resource allocation and redistribution
Framework Design: A Simple Illustration of the Idea
Progress is checked based on the current average iteration time
A decision is made to stop and restart if necessary
Reallocation should not be done too frequently
If restarting is not necessary, the application continues running
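The progress check sketched above can be written as a small predicate. This is a minimal sketch with hypothetical names and thresholds (the slides do not give the exact rule): after every few iterations, the projected completion time is compared against the remaining deadline, and reallocation is suppressed if it happened too recently.

```python
def should_reallocate(avg_iter_time, iters_left, deadline_remaining,
                      iters_since_last_change, min_interval=50):
    """Decide whether to stop and restart on a different node count.

    min_interval guards against reallocating too frequently (assumed value).
    """
    if iters_since_last_change < min_interval:
        return False
    # Projected time to finish at the current average iteration time
    projected = avg_iter_time * iters_left
    # Expand if behind schedule; contract if far ahead (0.5 factor is assumed)
    return projected > deadline_remaining or projected < 0.5 * deadline_remaining
```

For example, with 100 iterations left at 1 s each and only 50 s of deadline remaining, the predicate signals a reallocation; with 120 s remaining it does not.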
Framework Design: A Simple Illustration of the Idea (figure)
9
9 Framework Design Execution flow
10
Other Runtime Steps Steps taken to perform scaling to a different number of nodes: Live variables and arrays need to be collected at the master node and redistributed Read only need not be restored – just retrieve Application is restarted with each node reading the local portions of the redistributed data. A Framework for Elastic Execution of Existing MPI Programs 10
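The collect-and-redistribute step can be sketched as follows. This is an illustrative stand-in (the function name and block-decomposition scheme are assumptions, not taken from the slides): the master gathers each node's block of a live array, then re-partitions the full array for the new node count so that each new node reads only its local portion on restart.

```python
def gather_and_redistribute(local_blocks, new_node_count):
    """Gather per-node blocks of a live array and re-partition them.

    local_blocks: one list per old node, in rank order.
    Returns one list per new node (block decomposition, assumed scheme).
    """
    # Gather: concatenate the per-node blocks into the full live array
    full = [x for block in local_blocks for x in block]
    # Redistribute: contiguous blocks, earlier ranks take the remainder
    base, extra = divmod(len(full), new_node_count)
    out, start = [], 0
    for rank in range(new_node_count):
        size = base + (1 if rank < extra else 0)
        out.append(full[start:start + size])
        start += size
    return out
```

Going from 3 nodes to 2, `gather_and_redistribute([[1, 2], [3, 4], [5, 6]], 2)` yields `[[1, 2, 3], [4, 5, 6]]`. In the actual framework this step runs centrally on a single node, as noted later in the checkpointing slides.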
11
Background – Amazon cloud Amazon Elastic compute cloud (EC2) Small instances : 1.7 GB of memory, 1 EC2 Compute Unit, 160 GB of local instance storage, 32-bit platform Large instances : 7.5 GB of memory, 4 EC2 Compute Units, 850 GB of local instance storage, 64-bit platform On demand, reserved, spot instances Amazon Simple Storage Service (S3) Provides key - value store Data stored in files Each file restricted to 5 GB Unlimited number of files A Framework for Elastic Execution of Existing MPI Programs 11
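The 5 GB per-object limit is why checkpointed arrays must be split across multiple keys. A hypothetical helper (names assumed) that computes the chunk keys for an array of a given size:

```python
CHUNK_LIMIT = 5 * 1024**3  # S3 object size limit (5 GB) per the slides

def chunk_keys(array_name, total_bytes, chunk_size=CHUNK_LIMIT):
    """Return the S3 keys under which an array's chunks would be stored."""
    # Number of chunks, rounding up (ceiling division)
    n_chunks = -(-total_bytes // chunk_size)
    return [f"{array_name}/chunk_{i:05d}" for i in range(n_chunks)]
```

A 12 GB array, for instance, needs three chunks under this limit.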
12
Runtime support modules Resource allocator Elastic execution Input taken from the decision layer on the number of resources Allocating de- allocating resources in AWS environment MPI configuration for these instances Setting up of the MPI cluster Configuring for password less login among nodes A Framework for Elastic Execution of Existing MPI Programs 12
13
Runtime support modules Check pointing and redistribution Multiple design options feasible with the support available on AWS Amazon S3 Unmodified Arrays Quick access from EC2 instances Arrays stored in small sized chunks Remote file copy Modified arrays (live arrays) File writes and reads A Framework for Elastic Execution of Existing MPI Programs 13
14
Runtime support modules Check pointing and redistribution Current design Knowledge of division of the original dataset necessary Aggregation and redistribution done centrally on a single node Future work Source to source transformation tool Decentralized array distribution schemes A Framework for Elastic Execution of Existing MPI Programs 14
15
Experiments Framework and approach evaluated using Jacobi Conjugate Gradient (CG ) MPICH 2 used 4, 8 and 16 small instances used for processing the data Observation made with and without scaling the resources - Overheads 5-10%, which is negligible A Framework for Elastic Execution of Existing MPI Programs 15
16
Experiments – Jacobi A Framework for Elastic Execution of Existing MPI Programs 16
17
Experiments – Jacobi Matrix updated at every iteration Updated matrix collected and redistributed at node change Worst case total redistribution overhead – less than 2% Scalable application – performance increases with number of nodes A Framework for Elastic Execution of Existing MPI Programs 17
18
Experiments - CG A Framework for Elastic Execution of Existing MPI Programs 18
19
Experiments - CG Single vector which needs to be redistributed Communication intensive application Not scalable Overheads are still low A Framework for Elastic Execution of Existing MPI Programs 19
20
Decision Layer - Design Main Goal – To meet user demands Constraints – Time and Cost – “Soft” and not “Hard” Measuring iteration time to determine progress Measuring communication overhead to estimate scalability Moving to large – type instances if necessary A Framework for Elastic Execution of Existing MPI Programs 20
21
Feedback Model (I) Dynamic estimation of node count based on inputs : Input time / Cost Per iteration time Current node count Communication time per iteration Overhead costs – restart, redistribution, data read A Framework for Elastic Execution of Existing MPI Programs 21
22
Feedback Model (II) Move to large instances if communication time is greater than 30 % of total time Time Criteria : New node count found based on the current progress If time criteria cannot be met with max nodes also, shift to max nodes to get best results Cost Criteria : Change at the end of billing cycle If cost criteria cannot be met with min nodes also, shift to min nodes. A Framework for Elastic Execution of Existing MPI Programs 22
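The feedback rules above can be sketched in code. The 30% communication threshold and the inputs are from the slides, but the scaling formula and the min/max node bounds are assumptions for illustration: compute time is assumed to scale with the node count while communication time does not, and the estimate is clamped to the available range.

```python
import math

def should_use_large_instances(iter_time, comm_time):
    # Slides: switch to large instances when communication exceeds 30% of total time
    return comm_time > 0.3 * iter_time

def estimate_node_count(cur_nodes, iter_time, comm_time, iters_left,
                        time_left, overhead, min_nodes=4, max_nodes=16):
    """Estimate the node count needed to meet the time criterion.

    Assumed model: per-iteration compute time scales inversely with node
    count; communication time stays constant. min/max bounds are assumed.
    """
    compute_time = iter_time - comm_time
    # Per-iteration time budget after charging restart/redistribution overheads
    budget_per_iter = (time_left - overhead) / iters_left
    if budget_per_iter <= comm_time:
        # Deadline unreachable even with perfect compute scaling:
        # per the slides, shift to max nodes for a best-effort result
        return max_nodes
    needed = cur_nodes * compute_time / (budget_per_iter - comm_time)
    return max(min_nodes, min(max_nodes, math.ceil(needed)))
```

For example, at 4 nodes with 10 s iterations (2 s of it communication), 100 iterations left, and 600 s remaining, the budget is 6 s per iteration and the estimate is 8 nodes; with only 150 s remaining the budget falls below the communication time and the model falls back to the maximum.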
23
Decision layer - Implementation Input : Monitoring Interval, Criteria, Initial Node Count, Input Time / Cost Output : Total Process Time, Total Cost A Framework for Elastic Execution of Existing MPI Programs 23
24
Experiments – Time Criteria Jacobi A Framework for Elastic Execution of Existing MPI Programs 24
25
Experiments – Time Criteria Jacobi A Framework for Elastic Execution of Existing MPI Programs 25 Input Time (In secs)
26
Experiments – Time Criteria CG A Framework for Elastic Execution of Existing MPI Programs 26
27
Experiments – Cost Criteria Jacobi A Framework for Elastic Execution of Existing MPI Programs 27
28
Experiments – Cost Criteria Jacobi A Framework for Elastic Execution of Existing MPI Programs 28 Input Cost (in $)
29
Experiments – Cost Criteria Jacobi A Framework for Elastic Execution of Existing MPI Programs 29
30
Experiments – Cost Criteria CG A Framework for Elastic Execution of Existing MPI Programs 30
31
Experiments – Cost Criteria CG A Framework for Elastic Execution of Existing MPI Programs 31
32
Experiments – Cost Criteria CG A Framework for Elastic Execution of Existing MPI Programs 32
Conclusion
An approach to make MPI applications elastic and adaptable
An automated framework for deciding the number of instances for execution, based on user demands (time / cost)
The framework was tested using two MPI applications, showing low overheads during elastic execution and a best effort to meet user constraints