Simple Interface for Polite Computing (SIPC)
Travis Finch
St. Edward’s University, Department of Computer Science, School of Natural Sciences, Austin, TX

Abstract

As computing labs begin to rely more on shared commodity workstations to perform parallel computations, load balancing cannot be ignored. Parallel applications are by nature resource intensive, and load-balancing techniques often fail to take into account events external to the application. This can disrupt other users sharing the same computer. High-Performance Computing (HPC) also presents a steep learning curve for novice programmers, which often causes load balancing to be ignored entirely. This paper presents the Simple Interface for Polite Computing (SIPC), a mechanism that allows external load balancing to be easily integrated into programs where polite resource sharing is necessary. While SIPC can be used with any program, the focus here is its integration with embarrassingly parallel applications that follow a dynamic scheduling paradigm.

Background

- Polite computing allows intensive applications to run on a shared workstation: the HPC application does not excessively consume resources in the presence of other users, which allows those users to remain productive and lessens starvation of other processes.
- In his paper "Polite Parallel Computing", Cameron Rivers integrated a simple approach to external load balancing into mpiBLAST. The algorithm allowed an application to become aware of its surroundings and scale back if needed to distribute computing power. The method is effective, but it introduces unnecessary overhead and is difficult for novice HPC programmers to use.
- SIPC was built around three goals:
  - a self-contained library easily utilized by novice programmers
  - lower overhead than Rivers’ implementation
  - a mechanism that allows dynamic scheduling of system load checks
- SIPC was integrated into three embarrassingly parallel message-passing applications that use MPI: mpiBLAST, MPI-POVRay, and MPI-Mandelbrot. All three are parallelized in a straightforward manner and require minimal communication. (A minimal integration sketch appears after the Acknowledgements below.)

References

Rivers, Cameron. "Polite Parallel Computing." Journal of Computing Sciences in Colleges 21 (2006).
Hochstein, Lorin, Jeff Carver, Forrest Shull, Sima Asgari, Victor Basili, Jeffrey K. Hollingsworth, and Marvin V. Zelkowitz. "Parallel Programmer Productivity: A Case Study of Novice Parallel Programmers." Proceedings of the 2005 ACM/IEEE Conference on Supercomputing (2005).
Darling, A., L. Carey, and W. Feng. "The Design, Implementation, and Evaluation of mpiBLAST." 4th International Conference on Linux Clusters (2003).
"MPI-POVRay." 14 Nov.
"The Mandelbrot Set." 30 Mar.

Acknowledgements

Faculty Advisor: Dr. Sharon Weber
This work was conducted on a computing cluster funded by a grant from the Department of Defense.
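The Solution section below notes that SIPC exposes only two calls, one for initialization and one for load checking, placed inside the host application's work loop. As a minimal sketch of what that integration might look like in a dynamically scheduled MPI worker, the fragment below uses the hypothetical names sipc_init and sipc_check_load; the poster does not give SIPC's actual identifiers, so these names, tags, and the loop structure are illustrative assumptions.

```c
/* Sketch: how SIPC's two calls might be dropped into a dynamically
 * scheduled MPI worker.  sipc_init() and sipc_check_load() are hypothetical
 * names for the library's initialization and load-check calls. */
#include <mpi.h>

void sipc_init(void);        /* assumed: set up the load-check timer       */
void sipc_check_load(void);  /* assumed: sleep briefly if the host is busy */

#define TAG_REQUEST 1
#define TAG_DONE    0

void worker_loop(int rank)
{
    int request = rank, task;
    MPI_Status status;

    sipc_init();                     /* one-time setup before the loop */

    for (;;) {
        /* Ask the master (rank 0) for the next unit of work. */
        MPI_Send(&request, 1, MPI_INT, 0, TAG_REQUEST, MPI_COMM_WORLD);
        MPI_Recv(&task, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
        if (status.MPI_TAG == TAG_DONE)
            break;                   /* no more tasks to hand out */

        /* Politeness hook: a safe spot between communication and compute. */
        sipc_check_load();

        /* ... process the task (render a tile, score a sequence, etc.) ... */
    }
}
```

Placing the check between receiving a task and computing it keeps the call outside any communication the master is waiting on; automatically finding such safe spots is one of the future-work items listed below.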
Results & Future Work

- The inclusion of SIPC into a host application proved to have very little overhead: on average it increased execution time by 1%.
- This work represents a first step in developing tools designed to improve the efficiency of code written by beginning HPC programmers.
- More accessible tools like SIPC will make it easier to understand the concepts behind advanced techniques used in parallel programming.
- Future work for SIPC includes:
  - a tool that analyzes code and locates a safe spot to place load-checking procedure calls
  - porting SIPC to other operating systems to provide reliable cross-platform capabilities

Solution

In the HPC community, the speed of program execution is often the only measure of success. Other relevant factors that are ignored include:
- time to develop the solution
- additional lines of code compared to the serial implementation
- the cost per line of code

Studies show that HPC development is significantly more expensive than serial development, and HPC applications that use MPI often contain twice as many lines of code as their serial counterparts. Tools are needed that allow advanced aspects such as external load balancing to be injected into a parallel application with minimal effort from the novice programmer.

SIPC was designed as a self-contained library and requires only two function calls: initialization and load checking. SIPC uses a unique mechanism that allows load-check timing to be adjusted dynamically at run time. It is based on the assumption that if a system is under high load at time t, then at time t + x, where x is a relatively small amount of time, the system probably remains under high load.

Implementation

- A goal of SIPC was to obtain the CPU utilization and the number of users currently logged onto the system without creating a large processing footprint.
- The CPU load is obtained by opening the /proc/loadavg file and retrieving the first value in it with the fopen and fscanf functions.
- The number of users is counted by executing the users shell command and capturing its output stream via the popen and fgets functions.
- For the system to be considered under high load, at least two users must be logged in; this condition check prevents the application from sleeping on a semi-high load when few users are present.
- A final condition check determines whether the system load is high enough to warrant sleeping, and if so the application sleeps for a predetermined amount of time.
- If the target application sleeps, the timing mechanism used to schedule load checks is reset so another check occurs soon; if it does not sleep, the duration between load checks is doubled.
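To make the Implementation section concrete, the sketch below shows one way the load check could be coded in C on Linux: it reads the one-minute load average from /proc/loadavg with fopen and fscanf, counts logged-in users by capturing the output of the users command with popen and fgets, sleeps when both conditions indicate contention, and resets or doubles the check interval as described above. The threshold HIGH_LOAD, the sleep duration, and the function names are illustrative assumptions rather than SIPC's actual source, and the poster's separate semi-high-load and high-load checks are collapsed into a single combined condition for brevity.

```c
/* Sketch of the load-check logic described above; thresholds, sleep time,
 * and names are illustrative assumptions, not SIPC's actual code. */
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define HIGH_LOAD      1.0   /* assumed one-minute load-average threshold  */
#define MIN_USERS      2     /* poster: at least two users must be present */
#define SLEEP_SECONDS  5     /* assumed "predetermined" sleep duration     */
#define MIN_INTERVAL   1     /* seconds between checks after a sleep       */

static unsigned interval   = MIN_INTERVAL;  /* current gap between checks   */
static time_t   next_check = 0;             /* earliest time for next check */

/* Initialization: start at the minimum interval and allow an immediate check. */
void sipc_init(void)
{
    interval   = MIN_INTERVAL;
    next_check = 0;
}

/* Read the one-minute load average (first field of /proc/loadavg). */
static double cpu_load(void)
{
    double load = 0.0;
    FILE *fp = fopen("/proc/loadavg", "r");
    if (fp) {
        if (fscanf(fp, "%lf", &load) != 1)
            load = 0.0;
        fclose(fp);
    }
    return load;
}

/* Count logged-in users by capturing the output of the `users` command. */
static int user_count(void)
{
    char line[1024] = "";
    int count = 0;
    FILE *fp = popen("users", "r");
    if (fp) {
        if (fgets(line, sizeof line, fp)) {
            char *tok = strtok(line, " \t\n");
            while (tok) {
                count++;
                tok = strtok(NULL, " \t\n");
            }
        }
        pclose(fp);
    }
    return count;
}

/* Called from the host application's work loop. */
void sipc_check_load(void)
{
    time_t now = time(NULL);
    if (now < next_check)                  /* not yet time for a check     */
        return;

    if (cpu_load() >= HIGH_LOAD && user_count() >= MIN_USERS) {
        sleep(SLEEP_SECONDS);              /* yield the CPU to other users */
        interval = MIN_INTERVAL;           /* reset: check again soon      */
    } else {
        interval *= 2;                     /* back off: system looks idle  */
    }
    next_check = time(NULL) + interval;
}
```

Doubling the interval when the machine looks idle keeps the library's footprint small on a dedicated node, while resetting it after a sleep matches the poster's assumption that a recently loaded system is likely to remain loaded a short time later.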