
A Framework for Elastic Execution of Existing MPI Programs
Aarthi Raveendran, Tekin Bicer, Gagan Agrawal

Motivation
- Emergence of cloud computing, including for HPC applications
- Key advantages of cloud computing:
  - Elasticity (resources can be acquired dynamically)
  - Pay-as-you-go pricing model
  - These can be exploited to meet cost and/or time constraints
- Existing HPC applications are MPI-based and use a fixed number of nodes
- Need to make existing MPI applications elastic

Detailed Research Objective
- Make MPI applications elastic:
  - Exploit the key advantage of cloud computing
  - Meet user-defined time and/or cost constraints
  - Avoid a new programming model or significant recoding
- Design a framework for:
  - Decision making: when to expand or contract
  - Actual support for elasticity: allocation, data redistribution, restart

Outline
- Research objective
- Framework design
- Runtime support modules
- Experimental platform: Amazon cloud services
- Applications and experimental evaluation
- Conclusion

Framework components (architecture diagram)

Framework design – Approach and Assumptions
- Target: iterative HPC applications
- Assumption: a uniform amount of work is done in every iteration
- Monitoring at the start of every few iterations of the time-step loop
- Checkpointing and redistribution when the node count changes
- The required iteration time is calculated from user input, as in the sketch below
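To make the last point concrete, the per-iteration time budget can be derived by dividing the time remaining before the user's deadline evenly across the remaining iterations, which is valid under the uniform-work assumption. The following is a minimal C sketch; all names are illustrative, not the framework's actual API:

```c
/* Minimal sketch: derive the per-iteration time budget from a user-supplied
 * deadline, assuming uniform work per iteration. Names are hypothetical. */
double iteration_budget(double deadline_secs,  /* user time constraint  */
                        double elapsed_secs,   /* time consumed so far  */
                        long iters_done, long iters_total)
{
    long remaining = iters_total - iters_done;
    if (remaining <= 0)
        return 0.0;
    /* Uniform work per iteration: split the remaining time evenly. */
    return (deadline_secs - elapsed_secs) / (double)remaining;
}
```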

Framework design – Modification to Source Code
- Progress is checked against the current average iteration time
- A decision is made to stop and restart (on a different number of nodes) if necessary
- Reallocation should not be done too frequently
- If restarting is not necessary, the application continues running (see the loop sketch below)
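The modification amounts to instrumenting the existing time-step loop. Below is a minimal sketch of what the instrumented loop might look like, assuming progress is checked every few iterations; do_iteration and checkpoint_live_arrays are hypothetical stand-ins for the application's work and the framework's checkpoint hook:

```c
#include <mpi.h>

#define MONITOR_INTERVAL 10           /* check every few iterations (assumed) */

void do_iteration(long it);           /* application-specific work (assumed)  */
void checkpoint_live_arrays(long it); /* hypothetical framework hook          */

/* Returns 1 if the run stopped early for reallocation, 0 if it completed. */
int timestep_loop(long iters_total, double budget_per_iter)
{
    double t0 = MPI_Wtime();
    for (long it = 0; it < iters_total; it++) {
        do_iteration(it);                    /* original application work */
        if (it > 0 && it % MONITOR_INTERVAL == 0) {
            double avg = (MPI_Wtime() - t0) / it;
            /* Falling behind the budget: checkpoint live data and stop so
             * the framework can restart on a different number of nodes. */
            if (avg > budget_per_iter) {
                checkpoint_live_arrays(it);
                return 1;
            }
        }
    }
    return 0;
}
```

Because the check is only a clock read and a division every MONITOR_INTERVAL iterations, the monitoring itself adds little overhead.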

Framework Design – Execution flow (flowchart)

Other Runtime Steps
- Steps taken to scale to a different number of nodes:
  - Live variables and arrays are collected at the master node and redistributed (a sketch follows)
  - Read-only data need not be restored, only retrieved
  - The application is restarted, with each node reading the local portion of the redistributed data
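The collection step can be pictured with standard MPI collectives. The sketch below gathers a 1-D block-distributed array at rank 0 with MPI_Gatherv; it is illustrative only, not the framework's actual code:

```c
#include <mpi.h>
#include <stdlib.h>

/* Gather a block-distributed live array at the master before termination.
 * 'local' holds this rank's block of 'local_n' elements. Illustrative only. */
void collect_at_master(const double *local, int local_n,
                       long global_n, int rank, int nprocs)
{
    int *counts = NULL, *displs = NULL;
    double *full = NULL;
    if (rank == 0) {
        counts = malloc(nprocs * sizeof(int));
        displs = malloc(nprocs * sizeof(int));
        full   = malloc(global_n * sizeof(double));
    }
    /* The master learns each rank's block size under the old distribution. */
    MPI_Gather(&local_n, 1, MPI_INT, counts, 1, MPI_INT, 0, MPI_COMM_WORLD);
    if (rank == 0) {
        displs[0] = 0;
        for (int i = 1; i < nprocs; i++)
            displs[i] = displs[i - 1] + counts[i - 1];
    }
    MPI_Gatherv((void *)local, local_n, MPI_DOUBLE,
                full, counts, displs, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    /* Rank 0 would now write 'full' out (e.g., chunked to S3 or via remote
     * file copy) in pieces matching the new distribution before restart. */
    if (rank == 0) { free(counts); free(displs); free(full); }
}
```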

Runtime support modules: Decision layer
- Interacts with the user and the application program
- Constraints: time or cost
- Monitors the progress and makes the scaling decision (sketched below)
- Current work:
  - Measuring communication overhead and estimating scalability
  - Moving to large-type instances if necessary
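At its core, the monitoring decision is a comparison between the measured average iteration time and the per-iteration budget. A minimal sketch follows; the 20% contraction margin is an assumed tuning knob, not a value from this work:

```c
/* Hypothetical decision rule: expand when behind the deadline, contract
 * when comfortably ahead (to save cost), otherwise keep the allocation. */
typedef enum { KEEP, EXPAND, CONTRACT } decision_t;

decision_t decide(double avg_iter_time, double budget_per_iter)
{
    if (avg_iter_time > budget_per_iter)
        return EXPAND;                    /* behind schedule: add nodes */
    if (avg_iter_time < 0.8 * budget_per_iter)
        return CONTRACT;                  /* well ahead: release nodes  */
    return KEEP;
}
```

The requirement that reallocation not happen too frequently can then be enforced by acting on a decision only after several consecutive monitoring intervals agree.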

Framework design – Modification to Source Code (illustrative figure)

Background – Amazon cloud
- Services used in our framework:
  - Amazon Elastic Compute Cloud (EC2):
    - Virtual machines, called instances
    - Small instances: 1.7 GB of memory, 1 EC2 Compute Unit, 160 GB of local instance storage, 32-bit platform
    - Large instances: 7.5 GB of memory, 4 EC2 Compute Units, 850 GB of local instance storage, 64-bit platform
    - Available as on-demand, reserved, or spot instances

Background – Amazon cloud (continued)
- Amazon Simple Storage Service (S3):
  - Provides a key-value store
  - Data stored in files
  - Each file restricted to 5 GB
  - Unlimited number of files

Runtime support modules: Resource allocator
- Supports elastic execution:
  - Takes the target number of resources as input from the decision layer
  - Allocates and de-allocates resources in the AWS environment
- MPI configuration for these instances:
  - Sets up the MPI cluster
  - Configures passwordless login among the nodes

Runtime support modules: Checkpointing and redistribution
- Multiple design options are feasible with the support available on AWS:
  - Amazon S3, for unmodified arrays:
    - Quick access from EC2 instances
    - Arrays stored in small-sized chunks
  - Remote file copy, for modified (live) arrays:
    - File writes and reads

Runtime support modules: Checkpointing and redistribution (continued)
- Current design:
  - Requires knowledge of how the original dataset is divided
  - Aggregation and redistribution are done centrally on a single node (see the sketch below)
- Future work:
  - A source-to-source transformation tool
  - Decentralized array distribution schemes
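Since the central scheme must know how the dataset is divided, a simple 1-D block distribution is a natural assumption. The sketch below recomputes each rank's portion for a new node count; on restart, each node reads exactly this range of the centrally written array (illustrative only):

```c
/* Recompute 1-D block-distribution offsets for a new node count; the first
 * 'rem' ranks receive one extra element when the division is uneven. */
typedef struct { long offset; long count; } block_t;

block_t block_for_rank(long global_n, int rank, int nprocs)
{
    long base = global_n / nprocs;
    long rem  = global_n % nprocs;
    block_t b;
    b.count  = base + (rank < rem ? 1 : 0);
    b.offset = rank * base + (rank < rem ? rank : rem);
    return b;  /* rank reads elements [offset, offset + count) on restart */
}
```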

Experiments
- Framework and approach evaluated using:
  - Jacobi
  - Conjugate Gradient (CG)
- MPICH2 used
- 4, 8, and 16 small instances used for processing the data
- Runs compared with and without scaling the resources: overheads of 5-10%, which is acceptably low

Experiments – Jacobi (performance charts)

Experiments – Jacobi
- The matrix is updated at every iteration
- The updated matrix is collected and redistributed when the node count changes
- Worst-case total redistribution overhead: less than 2%
- Scalable application: performance increases with the number of nodes

Experiments – CG (performance charts)

Experiments – CG
- Only a single vector needs to be redistributed
- Communication-intensive application
- Does not scale with additional nodes
- Redistribution overheads are still low

Conclusion
- An overall approach for making MPI applications elastic and adaptable
- An automated framework for deciding the number of instances used for execution
- Framework tested with two MPI applications, showing low overheads during elastic execution