2
Heterogeneous and Grid Computing2 Communication models u Modeling the performance of communications –Huge area –Two main communities »Network designers »HPC (network users) –Different goals and approaches »Complex detailed models of behavior (simulation, queues, etc. – design analysis) u Performance parameters are not primary »Simple and efficient predictive performance models
3
Heterogeneous and Grid Computing3 Communication models (ctd) u HPC communication performance models –Simple and efficient »Small number of measurable parameters (LogP) u Hardware platforms –Clusters –Local networks –Dedicated global networks –Global networks connected via Internet
4
Heterogeneous and Grid Computing4 Communication models (ctd) u Communication level –Different levels »Stack of protocols and software –Most relevant level »The level of communication middleware used in HPC programs (application programmer’s level) u MPI, PVM –Lower levels »For system programmers (say, MPI implementers)
5
Heterogeneous and Grid Computing5 Communication models (ctd) u Objectives –Predict the execution time of communication operations »For algorithm and program design –Optimization of communication operations »Collective MPI communications
6
Heterogeneous and Grid Computing6 Heterogeneous clusters u Most analytical predictive models used for heterogeneous clusters –Inherently homogeneous »Originally designed for homogeneous clusters »Execution time of communication operation only depends on u Topology u The number of participating processors
7
Heterogeneous and Grid Computing7 Heterogeneous clusters (ctd) u Homogeneous communication models –Very simple (linear) »Differ in formation of constant and variable parts –Typical structure »Point-to-point communications are the basis u A small set of integral parameters having the same value for each pair of processors »Collectives are expressed as combination of p2p’s u Time analytically predicted depending on message size and the number of processors
8
Heterogeneous and Grid Computing8 Heterogeneous clusters (ctd) u Two main issues –Model design –Efficient and accurate estimation of the model u Estimation of homogeneous models –For homogeneous clusters »p2p parameters are found statistically u From measurements of communications between any two processors –For heterogeneous clusters »p2p parameters are found by averaging values for all pairs
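
The averaging step for heterogeneous clusters can be sketched as follows. This is a minimal illustration, not the method of any particular tool: the type hockney_t, the function average_pairs, and the row-major layout of the pairwise estimates are all assumptions made for the example, and Hockney-style parameters are used only because they are the simplest case.

```c
#include <stddef.h>

/* Hypothetical container for the p2p parameters of one ordered pair. */
typedef struct {
    double alpha;  /* constant part (latency) */
    double beta;   /* variable part (time per byte) */
} hockney_t;

/* Average the pairwise estimates over all ordered pairs i != j.
   pair[i * p + j] holds the estimate for communication from i to j
   (assumed layout); diagonal entries are ignored. */
hockney_t average_pairs(const hockney_t *pair, size_t p)
{
    hockney_t avg = {0.0, 0.0};
    size_t count = 0;

    for (size_t i = 0; i < p; i++) {
        for (size_t j = 0; j < p; j++) {
            if (i == j)
                continue;
            avg.alpha += pair[i * p + j].alpha;
            avg.beta  += pair[i * p + j].beta;
            count++;
        }
    }
    if (count > 0) {
        avg.alpha /= (double)count;
        avg.beta  /= (double)count;
    }
    return avg;
}
```

A single averaged pair of values then stands in for every link, which is exactly why such a model cannot distinguish a slow processor from a fast one.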
9
Heterogeneous and Grid Computing9 Homogeneous models u Homogeneous analytical predictive communication models –The Hockney model –LogP –LogGP –PLogP
10
Heterogeneous and Grid Computing10 The Hockney model u Time of p2p communication is α+β×m –α – the latency –β – bandwidth –m – message size u Estimation –Directly from p2p tests for different message sizes with linear regression –Each test »Measures the time of roundtrip u Sending and receiving a message of size m, or u Sending a message of size m and receiving a zero- sized message
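
The regression step can be sketched in a few lines of C. The function names hockney_time and hockney_fit are hypothetical, and the sketch assumes the caller has already converted each roundtrip measurement into a one-way time (for example by halving a symmetric roundtrip); how that conversion is done depends on which of the two test variants is used.

```c
#include <stddef.h>

/* Predicted p2p time under the Hockney model: alpha + beta * m. */
double hockney_time(double alpha, double beta, double m)
{
    return alpha + beta * m;
}

/* Ordinary least-squares fit of t = alpha + beta * m over n samples.
   m[i] is a message size, t[i] the matching one-way time (assumed to be
   derived from roundtrip measurements by the caller).
   Returns 0 on success, -1 if the fit is undefined. */
int hockney_fit(const double *m, const double *t, size_t n,
                double *alpha, double *beta)
{
    double sm = 0.0, st = 0.0, smm = 0.0, smt = 0.0;

    if (n < 2)
        return -1;
    for (size_t i = 0; i < n; i++) {
        sm  += m[i];
        st  += t[i];
        smm += m[i] * m[i];
        smt += m[i] * t[i];
    }
    double denom = (double)n * smm - sm * sm;
    if (denom == 0.0)
        return -1;  /* all message sizes are identical */
    *beta  = ((double)n * smt - sm * st) / denom;
    *alpha = (st - *beta * sm) / (double)n;
    return 0;
}
```

The fitted alpha and beta can then be passed to hockney_time to predict the cost of a message size that was never measured.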
11
Heterogeneous and Grid Computing11 The LogP model The main parameters of the LogP model –L: An upper bound on the latency »The delay, incurred in sending a message from its source processor to its target processor –o: The overhead »The length of time that a processor is engaged in the transmission or reception of each message; during this time the processor cannot perform other operations => Point-to-point communication time L 2xo
12
Heterogeneous and Grid Computing12 The LogP model –g: The gap between messages »The minimum time interval between consecutive message transmissions or consecutive message receptions at a processor u Transmission of at most L/g messages simultaneously –P: The number of processors
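
As a small illustration of how the four parameters are used, the sketch below computes the p2p time L + 2×o and the in-transit message limit. The struct logp_t and both function names are made up for this example; the limit is computed as the ceiling of L/g, done by hand so that no math library is needed.

```c
/* Hypothetical container for the LogP parameters. */
typedef struct {
    double L;  /* upper bound on the latency */
    double o;  /* per-message send or receive overhead */
    double g;  /* minimum gap between consecutive messages */
    int    P;  /* number of processors */
} logp_t;

/* Point-to-point time: send overhead + latency + receive overhead. */
double logp_p2p_time(const logp_t *p)
{
    return p->L + 2.0 * p->o;
}

/* Ceiling of L/g: the number of messages that can be in transit to or
   from one processor. Assumes L >= 0 and g > 0. */
long logp_capacity(const logp_t *p)
{
    long k = (long)(p->L / p->g);  /* truncation equals floor here */
    if ((double)k * p->g < p->L)
        k++;
    return k;
}
```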
13
Heterogeneous and Grid Computing13 The LogGP model u LogGP – extension of LogP for messages of the arbitrary size m –G – the gap per byte for large messages –p2p communication time
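
The LogGP point-to-point time is commonly written as L + 2×o + (m − 1)×G, which reduces to the LogP value for a one-byte message. The sketch below encodes that expression; the struct loggp_t and the function name are hypothetical.

```c
/* Hypothetical container for the LogGP parameters. */
typedef struct {
    double L;  /* latency */
    double o;  /* per-message overhead */
    double g;  /* gap between messages */
    double G;  /* gap per byte for large messages */
    int    P;  /* number of processors */
} loggp_t;

/* Time to deliver one m-byte message (m >= 1). */
double loggp_p2p_time(const loggp_t *p, double m)
{
    return p->L + 2.0 * p->o + (m - 1.0) * p->G;
}
```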
14
Heterogeneous and Grid Computing14 The PLogP model u PLogP – parameterized LogP –o s (m) and o r (m) »Send and receive overheads »Functions of the message size –g(m) – the gap »Function of message size »g(m)≥ o s (m), g(m)≥ o r (m) u p2p time : L+ o s (m)+ o r (m)
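
Because the PLogP parameters are functions of the message size rather than constants, one way to represent them in code is with function pointers. The sketch below is only an illustration: plogp_t, size_fn and the two functions are hypothetical names, and the time expression is the one stated on this slide.

```c
/* A parameter that depends on the message size m. */
typedef double (*size_fn)(double m);

/* Hypothetical container for the PLogP parameters. */
typedef struct {
    double  L;      /* latency */
    size_fn o_send; /* o_s(m): send overhead */
    size_fn o_recv; /* o_r(m): receive overhead */
    size_fn gap;    /* g(m): gap */
    int     P;      /* number of processors */
} plogp_t;

/* p2p time as given on the slide: L + o_s(m) + o_r(m). */
double plogp_p2p_time(const plogp_t *p, double m)
{
    return p->L + p->o_send(m) + p->o_recv(m);
}

/* Check the model constraint g(m) >= o_s(m) and g(m) >= o_r(m).
   Returns 1 if it holds for this m, 0 otherwise. */
int plogp_gap_valid(const plogp_t *p, double m)
{
    double g = p->gap(m);
    return g >= p->o_send(m) && g >= p->o_recv(m);
}
```

In practice the three functions would typically be interpolation tables built from measurements at a set of message sizes, rather than closed-form expressions.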
15
Heterogeneous and Grid Computing15 Estimation of LogP-based models u o s (m) –directly from the execution time of the send operation sending a message of m bytes »Results of a number of experiments are averaged (~tens) u o r (m) –directly from the time of receiving a message of m bytes in the roundtrip » After completion of the send, processor i waits for some time and only then posts a receive operation »The execution time of the receive operation approximates o r (m)
16
Heterogeneous and Grid Computing16 Estimation of LogP-based models (ctd) u g(m) –Directly from the execution time s n (m) of sending without reply a large number n of messages of size m »As, then »n is obtained from saturation process (thousands or more) u L –From the execution time of a roundtrip sending and receiving a messages of size 1 »The time: 2×L+2×(o s (1)+o r (1))
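
The two estimates on this slide amount to simple arithmetic on measured times. The sketch below assumes that the saturated send time grows as n×g(m), so the gap is the time per message, and it solves the roundtrip expression 2×L + 2×(o_s(1) + o_r(1)) for L. Both function names are hypothetical.

```c
/* Gap estimate: time per message when n back-to-back messages of size m
   took s_n in total. Assumes n is large enough to saturate the link. */
double estimate_gap(double s_n, long n)
{
    return s_n / (double)n;
}

/* Latency estimate from a one-byte roundtrip.
   rtt = 2*L + 2*(os1 + or1)  =>  L = rtt/2 - (os1 + or1),
   where os1 = o_s(1) and or1 = o_r(1) were estimated beforehand. */
double estimate_latency(double rtt, double os1, double or1)
{
    return rtt / 2.0 - (os1 + or1);
}
```

For example, a roundtrip of 30 time units with o_s(1) = 2 and o_r(1) = 3 gives L = 15 − 5 = 10.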
17
Heterogeneous and Grid Computing17 Estimation of LogP-based models (ctd) LogP/LogGPPLogP o(o s (1)+o r (1))/2 gg(1) Gg(m)/m PP
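
The table reads as a conversion from measured PLogP values to LogP/LogGP parameters, which can be written down directly. In this sketch the struct and function names are hypothetical, the caller supplies the PLogP values already evaluated at size 1 and at some large size m, and L is left out because the table gives no row for it.

```c
/* LogP/LogGP parameters derived from PLogP measurements (L omitted,
   since the table does not list it). */
typedef struct {
    double o;  /* overhead */
    double g;  /* gap */
    double G;  /* gap per byte */
    int    P;  /* number of processors */
} derived_loggp_t;

/* os1 = o_s(1), or1 = o_r(1), g1 = g(1), g_m = g(m) for a large m. */
derived_loggp_t loggp_from_plogp(double os1, double or1, double g1,
                                 double g_m, double m, int P)
{
    derived_loggp_t r;
    r.o = (os1 + or1) / 2.0;  /* o = (o_s(1) + o_r(1)) / 2 */
    r.g = g1;                 /* g = g(1) */
    r.G = g_m / m;            /* G = g(m) / m */
    r.P = P;                  /* P is unchanged */
    return r;
}
```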
18
Heterogeneous and Grid Computing18 Homogeneous models: collectives u The homogeneous models –Used for analytical prediction of the execution time of different algorithms of collective communications »In applications »In MPI implementations u Optimization of collective operations upon installation of MPI implementation
19
Heterogeneous and Grid Computing19 Homogeneous models: collectives (ctd) u The traditional homogeneous models –Linear »p2p and collective operations are linear functions of message size (except PLogP) –Deterministic »p2p is deterministic => all collectives too u Recent results show that it is not true for many popular platforms –For example, single-switched clusters with MPI stack including TCP/IP layer
20
Heterogeneous and Grid Computing20 Homogeneous models: collectives (ctd) u Many-to-one (flat tree, as in MPI standard)
21
Heterogeneous and Grid Computing21 Homogeneous models: collectives (ctd) u One-to-many (flat tree)
22
Heterogeneous and Grid Computing22 Homogeneous models: collectives (ctd) u Extra parameters of a more accurate model –M 1 =M 1 (n) (gather escalations begin) –M 2 =const (gather escalations stop) –k, the number of levels of escalation –T i, the execution time for i-th escalation level –f i (n, m), the probability of escalation to level i »depending on the number of involved processors, n, and the message size, m (M 1 ≤m≤M 2 ) –S, the scatter leap happens »S=M 2
23
Heterogeneous and Grid Computing23 Homogeneous models: collectives (ctd) u Discrete constant levels of escalation (tens- and hundreds-fold) u Probability of escalation to level is found
24
Heterogeneous and Grid Computing24 Homogeneous models: collectives (ctd) u Application of the more accurate model –Optimization of MPI_Scatter and MPI_Gather »Eliminating the non-determinism and non-linearity of MPI_Gather if (M 1 ≤m≤M 2 ) { find N such that (m/N)<M 1 and (m/(N-1))≥M 1 ; for (i=0; i<; i++) { MPI_Barrier(comm); MPI_Gather(sendbuf + i*(m/N), m/N); } } else MPI_Gather(sendbuf, m);
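
The "find N" step has a closed form. Treating m/N as a real-valued quotient, m/N < M1 means N > m/M1 and m/(N−1) ≥ M1 means N − 1 ≤ m/M1, so N is the integer part of m/M1 plus one. The sketch below, with the hypothetical name gather_chunks, returns that N inside the escalation range and 1 (a single ordinary gather) outside it. It assumes positive integer sizes.

```c
/* Number of smaller gathers needed for a message of size m.
   M1 and M2 bound the range in which escalations occur (M1 > 0).
   Outside the range one ordinary gather is used, so the result is 1. */
long gather_chunks(long m, long M1, long M2)
{
    if (m < M1 || m > M2)
        return 1;
    /* Integer division gives floor(m / M1) for positive operands, so
       N = floor(m / M1) + 1 satisfies (N - 1) * M1 <= m < N * M1. */
    return m / M1 + 1;
}
```

With m = 10 and M1 = 3 this gives N = 4: each chunk carries 2.5 units on average, below M1, while three chunks would carry about 3.33 each, which is not.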
25
Heterogeneous and Grid Computing25 Homogeneous models: collectives (ctd) u Application of the more accurate model (ctd) »Elimination of the non-linearity of MPI_Scatter if (m>S) { find N such that (m/N)<S and (m/(N-1))≥S ; for (i=0; i<; i++) { MPI_Scatter(recvbuf + i*(m/N), m/N); } } else MPI_Scatter(recvbuf, m);
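
The pseudocode uses m/N for every chunk, which silently drops data when m is not a multiple of N. One possible way to handle the remainder, shown here as an assumption rather than as part of the original method, is to give the first m mod N chunks one extra element. The function name chunk_bounds is hypothetical.

```c
/* Offset and length of chunk i (0 <= i < N) when m elements are split
   into N nearly equal chunks. The first (m % N) chunks get one extra
   element, so the lengths always sum to m. */
void chunk_bounds(long m, long N, long i, long *offset, long *len)
{
    long base = m / N;  /* minimum chunk length */
    long rem  = m % N;  /* number of chunks that are one element longer */

    *len    = base + (i < rem ? 1 : 0);
    *offset = i * base + (i < rem ? i : rem);
}
```

Note that the longest chunk is then base + 1 whenever there is a remainder, so N should be chosen such that this value, not just m/N, stays below the threshold S.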
26
Heterogeneous and Grid Computing26 Homogeneous models: collectives (ctd)
27
Heterogeneous and Grid Computing27 Heterogeneous communication models u None of the traditional models reflects heterogeneity of the processors –p2p parameters average real ones –The averages are used in modelling collectives –If some processors significantly differ in performance »The model may become quite inaccurate u More accurate models would have different p2p parameters
28
Heterogeneous and Grid Computing28 Heterogeneous communication model: case study u Cluster of heterogeneous computers –Switched Ethernet network –MPI –The most common platform for heterogeneous parallel computing u Objectives –Efficient prediction of communication cost of parallel algorithms/MPI programs –Effective and efficient building of the model
29
Heterogeneous and Grid Computing29 Heterogeneous communication model: case study ctd) u Heterogeneous point-to-point –processor parameters - fixed delays - variable delays –link parameters - transmission rate u Parameters cannot be found from other p2p –Hockney/LogGP: parameters are insufficient to find variable processing delays and transmission rates –PLogP: parameters are functions of message size u Design of communication experiments –more than 2 linear parameters u Minimization of the number of measurements
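
The slide lists the parameters but not the formula that combines them. One plausible linear form, assumed here purely for illustration, adds the fixed delays of both processors to a per-byte cost made up of both variable delays and the inverse of the link's transmission rate. The type proc_params_t and the function hetero_p2p_time are hypothetical names.

```c
/* Hypothetical per-processor parameters. */
typedef struct {
    double fixed_delay;     /* constant processing delay */
    double variable_delay;  /* processing delay per byte */
} proc_params_t;

/* Assumed model for an m-byte message from src to dst over a link with
   the given transmission rate (bytes per time unit, rate > 0):
     T = C_src + C_dst + m * (t_src + t_dst + 1 / rate)               */
double hetero_p2p_time(proc_params_t src, proc_params_t dst,
                       double rate, double m)
{
    return src.fixed_delay + dst.fixed_delay
         + m * (src.variable_delay + dst.variable_delay + 1.0 / rate);
}
```

Under this form a single pair has more than two unknowns, which illustrates why a plain two-parameter roundtrip fit is not enough to separate the processor contributions from the link contribution.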
30
Heterogeneous and Grid Computing30 Heterogeneous communication model: case study (ctd) u One-to-many (scatter type)
31
Heterogeneous and Grid Computing31 Heterogeneous communication model: case study (ctd) Many-to-one model for small messages
32
Heterogeneous and Grid Computing32 Heterogeneous communication model: case study (ctd) Many-to-one model for large messages
33
Heterogeneous and Grid Computing33 Design of communication experiments u Fixed processing delays ( unknowns) experiments u Variable processing delays ( unknowns) u Transmission rates ( unknowns) experiments
34
Heterogeneous and Grid Computing34 Design of communication experiments (ctd) u Additional experiments u Solution
35
Heterogeneous and Grid Computing35 Design of communication experiments (ctd) u Measurements –a small number of measurements –particular message sizes (0 and m<S) –fast roundtrips (one-to-one, one-to-two) u Calculations – comparisons –simple expressions to get values u Averaging –within solution (n fixed, n variable processing delays, and transmission rates) –within measurements (for accurate measurement of the communication execution time)
36
Heterogeneous and Grid Computing36 Optimization of application u A real-time satellite imaging application u A sequence of raw data images divided into partitions for parallel processing by a cluster M2McM1M2McM1 n 1 n 2
37
Heterogeneous and Grid Computing37 Redesigning application Calculate the number of sub-partitions m of a partition of the medium size M so that: Replace MPI_Gather with sequence of MPI_Gather for smaller messages
38
Heterogeneous and Grid Computing38 Communication models (ctd) u Other heterogeneous platforms –Local network of computers –Global networks »Dedicated communication channels »Internet-connected u Currently used models –Simplified versions of cluster models »p2p communication time is modeled by β × m
39
Heterogeneous and Grid Computing39 Communication models (ctd) u Wide-area links –Dedicated »Serial links between remote computers »Numerous algorithms for optimization of communication operations –Internet »Allow for parallel simultaneous communications between two remote computers without degradation of bandwidth »New area of R&D