Alexey Lastovetsky, Maureen O’Flynn

Slides:



Advertisements
Similar presentations
High Performance Computing Course Notes Parallel I/O.
Advertisements

Experiments and Variables
Hybrid System Verification Synchronous Workshop 2003 A New Verification Algorithm for Planar Differential Inclusions Gordon Pace University of Malta December.
Random Number Generation. Random Number Generators Without random numbers, we cannot do Stochastic Simulation Most computer languages have a subroutine,
Sampling Mathsfest Why Sample? Jan8, 2003 Air Midwest Flight 5481 from Douglas International Airport in North Carolina stalled after take off, crashed.
Matrix Multiplication on Two Interconnected Processors Brett A. Becker and Alexey Lastovetsky Heterogeneous Computing Laboratory School of Computer Science.
Non-Linear Problems General approach. Non-linear Optimization Many objective functions, tend to be non-linear. Design problems for which the objective.
A Parallel Computational Model for Heterogeneous Clusters Jose Luis Bosque, Luis Pastor, IEEE TRASACTION ON PARALLEL AND DISTRIBUTED SYSTEM, VOL. 17, NO.
HMM-BASED PATTERN DETECTION. Outline  Markov Process  Hidden Markov Models Elements Basic Problems Evaluation Optimization Training Implementation 2-D.
Towards Data Partitioning for Parallel Computing on Three Interconnected Clusters Brett A. Becker and Alexey Lastovetsky Heterogeneous Computing Laboratory.
ISPDC 2007, Hagenberg, Austria, 5-8 July On Grid-based Matrix Partitioning for Networks of Heterogeneous Processors Alexey Lastovetsky School of.
Building the communication performance model of heterogeneous clusters based on a switched network Alexey Lastovetsky
High Performance Communication using MPJ Express 1 Presented by Jawad Manzoor National University of Sciences and Technology, Pakistan 29 June 2015.
Nor Asilah Wati Abdul Hamid, Paul Coddington. School of Computer Science, University of Adelaide PDCN FEBRUARY 2007 AVERAGES, DISTRIBUTIONS AND SCALABILITY.
CS352- Link Layer Dept. of Computer Science Rutgers University.
Heterogeneous and Grid Computing2 Communication models u Modeling the performance of communications –Huge area –Two main communities »Network designers.
1 Computer Science, University of Warwick Architecture Classifications A taxonomy of parallel architectures: in 1972, Flynn categorised HPC architectures.
Speaker: Xin Zuo Heterogeneous Computing Laboratory (HCL) School of Computer Science and Informatics University College Dublin Ireland International Parallel.
Predicting performance of applications and infrastructures Tania Lorido 27th May 2011.
Load Balancing in Distributed Computing Systems Using Fuzzy Expert Systems Author Dept. Comput. Eng., Alexandria Inst. of Technol. Content Type Conferences.
GPU Acceleration of Pyrosequencing Noise Removal Dept. of Computer Science and Engineering University of South Carolina Yang Gao, Jason D. Bakos Heterogeneous.
353(0) TECHNOLOGY SOLUTION CeADAR’s standalone platform personalises the forecasting model for monitored.
Robin McDougall Scott Nokleby Mechatronic and Robotic Systems Laboratory 1.
Performance Prediction for Random Write Reductions: A Case Study in Modelling Shared Memory Programs Ruoming Jin Gagan Agrawal Department of Computer and.
Copyright © 2006, UCD Dublin Systems Research Group School of Computer Science and Informatics UCD Dublin, Belfield, Dublin 4, Ireland
Management Science Helps analyze and solve organizational problems. It uses scientific and quantitative methods to set up models that are based on controllable.
OPERATING SYSTEMS CS 3530 Summer 2014 Systems and Models Chapter 03.
Science Fair Title Subtitle Goes Here. Question A question or statement showing what you are trying to find out.
Introduction to Physical Science
1 Distributed BDD-based Model Checking Orna Grumberg Technion, Israel Joint work with Tamir Heyman, Nili Ifergan, and Assaf Schuster CAV00, FMCAD00, CAV01,
Author : Cedric Augonnet, Samuel Thibault, and Raymond Namyst INRIA Bordeaux, LaBRI, University of Bordeaux Workshop on Highly Parallel Processing on a.
Regression Analysis Deterministic model No chance of an error in calculating y for a given x Probabilistic model chance of an error First order linear.
Monte Carlo Linear Algebra Techniques and Their Parallelization Ashok Srinivasan Computer Science Florida State University
Using Static Code Analysis to Improve Performance of GridRPC Applications Oleg Girko, Alexey Lastovetsky School of Computer Science & Informatics University.
1 Hierarchical Parallelization of an H.264/AVC Video Encoder A. Rodriguez, A. Gonzalez, and M.P. Malumbres IEEE PARELEC 2006.
Yinglei Cheng1,2, Ying Li2, Rongchun Zhao2 2010/01/07 黃千峰 A Parallel Image Fusion Algorithm Based on Wavelet Packet.
Bar Graphs Used to compare amounts, or quantities The bars provide a visual display for a quick comparison between different categories.
Application of a Charge Transfer Model to Space Telescope Data Paul Bristow Dec’03
Monte Carlo Linear Algebra Techniques and Their Parallelization Ashok Srinivasan Computer Science Florida State University
| presented by Vasileios Zois CS at USC 09/20/2013 Introducing Scalability into Smart Grid 1.
Generalization Performance of Exchange Monte Carlo Method for Normal Mixture Models Kenji Nagata, Sumio Watanabe Tokyo Institute of Technology.
RT-OPEX: Flexible Scheduling for Cloud-RAN Processing
PI: Professor Yong Zeng Department of Mathematics and Statistics
OPERATING SYSTEMS CS 3502 Fall 2017
Introduction to Parallel Computing: MPI, OpenMP and Hybrid Programming
TrueTime.
Advanced Higher Statistics
Random variables (r.v.) Random variable
UPC Parallel I/O Library
STATISTICS FOR SCIENCE RESEARCH
COMPUTATIONAL MODELS.
When Is Agreement Possible
Presentation at NI Day April 2010 Lillestrøm, Norway
Movable lines class activity.
Parallel and Distributed Simulation Techniques
Steven Whitham Jeremy Woods
Using SCTP to hide latency in MPI programs
S519: Evaluation of Information Systems
CMAQ PARALLEL PERFORMANCE WITH MPI AND OpenMP George Delic, Ph
Analyzing Reliability and Validity in Outcomes Assessment Part 1
Automatic Tuning of Collective Communications in MPI
CLUSTER COMPUTING.
COMP60621 Designing for Parallelism
Javad Ghaderi, Tianxiong Ji and R. Srikant
Fixed, Random and Mixed effects
Section 11.7 Probability.
Chapter 01: Introduction
Multiplexing and Demultiplexing
Real time signal processing
Presentation transcript:

A Performance Model of Many-to-One Collective Communications for Parallel Computing  Alexey Lastovetsky, Maureen O’Flynn UCD School of Computer Science and Informatics Belfield, Dublin 4, Ireland alexey.lastovetsky@ucd.ie, maureen.oflynn@ucd.ie 23 April 2019

Objectives Goal: prediction of the execution time of MPI collective communications on a heterogeneous cluster based on a switched Ethernet Background: performance models of single point-to-point, simultaneous independent point-to-point, one-to-many communications Observation: a significant increase in the execution time of many-to-one communication for medium-sized messages on all platforms and MPI implementations Problem: to model many-to-one communications for medium-sized messages

Performance models for point-to-point and one-to-many communications point-to-point: - execution time - message size - fixed delays - variable delays - transmission rate one-to-many:

Many-to-one collective communications: non-linear and non-deterministic escalations 0.25 0.2 Seconds 0.04 .00001 0 10 20 30 40 50 60 70 80 90 Message size in KB

Parameters of many-to-one model for medium-sized messages M1 MC M2 message size where escalations begin escalations stop occuring message size from which escalations occur with 100% certainty probabilities of escalations

Probability of escalation Discrete constant levels of escalation of values of 40, 200 and 250 times Probability of escalation to level is found

Many-to-one model for small messages

Many-to-one model for large messages

Multi-spectral satellite application A typical real-time satellite imaging application (512x512 bytes) A sequence of raw data images divided into partitions for parallel processing by a cluster M2 Mc M1 n1 n2

Redesigning Application Calculate the number of sub-partitions m of a partition of the medium size M so that: Replace MPI_Gather with sequence of MPI_Gather for smaller messages

Conclusion Results previously undocumented non-linear non-deterministic behaviour for medium messages is analysed many-to-one model is built on the empirical data and point-to-point model parallel application is redesigned in accordance with many-to-one model The work was supported by Science Foundation Ireland.