Published by Anne Selvey. Modified over 10 years ago.
1
CoMPI: Enhancing MPI-based applications' performance and scalability using run-time compression. Rosa Filgueira, David E. Singh, Alejandro Calderón and Jesús Carretero. University Carlos III of Madrid.
2
Summary Problem description Main objectives CoMPI Study of compression algorithms. Evaluation of CoMPI Results Conclusions
3
Summary Problem description Main objectives CoMPI Study of compression algorithms. Evaluation of CoMPI Results Conclusions
4
Problem description Cluster architectures are a common solution for scientific applications: a collection of computers working together, not always interconnected by a fast network. Scientific applications need: a large number of compute nodes; a huge volume of data transferred among the processes. The communication system becomes a limiting factor of performance: networks with high latency and low bandwidth; network saturation. The programming model used in clusters is MPI.
5
Main objectives (1/2) Reduce the communication transfer time for MPI in order to improve: overall execution time; scalability.
6
Main objectives (2/2) CoMPI: optimization of MPI communications by using compression. Compression in all MPI primitives. Fits any MPI application. Transparent to the user. Run-time compression. Study of compression algorithms. Selection of the best algorithm based on message characteristics.
7
Summary Problem description Main objectives CoMPI –How we have integrated compression into MPI –Set of compression algorithms proposed Study of compression algorithms. Evaluation of CoMPI Results Conclusions
8
MPICH architecture (1/2) MPI communication mechanisms: point-to-point; collective. MPICH layers: Application Programmer Interface (API); Abstract Device Interface (ADI); Channel Interface (CI). ADI layer: controls the data; specifies whether the message is sent or received; message queue management; message passing protocols. Collective routines are implemented by using point-to-point routines. Point-to-point routines are provided by the ADI. Modification of the ADI layer: data compression and decompression; integrated compression library.
9
MPICH architecture (2/2)
10
Compression of MPI Messages (1/2)
11
Compression of MPI Messages (2/2) Header in the exchanged message to inform: –Compression used or not, algorithm and length. All compression algorithms are included in a single Compression Library: –CoMPI can be easily updated. –New compression algorithms can be included. Compression stages: message size evaluation; compression algorithm selection; data compression; header inclusion. Decompression stages: header checking; data decompression.
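The compression and decompression stages listed on this slide can be sketched as follows. This is only an illustration under stated assumptions: the header layout (flag, algorithm id, original length), the `MIN_SIZE` threshold, and the function names are hypothetical, and `zlib` from the Python standard library stands in for the compressors in the CoMPI library.

```python
import struct
import zlib

# Hypothetical header layout (not taken from the slides):
# 1 byte "compressed" flag, 1 byte algorithm id, 4 bytes original length.
HEADER = struct.Struct("!BBI")
ALGO_NONE, ALGO_ZLIB = 0, 1  # zlib is a stand-in for the CoMPI compressors
MIN_SIZE = 2048              # assumed size below which compression is skipped


def pack_message(payload: bytes) -> bytes:
    """Sender side: size evaluation, algorithm selection, compression, header."""
    algo = ALGO_ZLIB if len(payload) >= MIN_SIZE else ALGO_NONE
    body = zlib.compress(payload) if algo == ALGO_ZLIB else payload
    if len(body) >= len(payload):
        # Compression did not shrink the message, so send it uncompressed.
        algo, body = ALGO_NONE, payload
    return HEADER.pack(int(algo != ALGO_NONE), algo, len(payload)) + body


def unpack_message(message: bytes) -> bytes:
    """Receiver side: header checking, then decompression if needed."""
    compressed, algo, length = HEADER.unpack_from(message)
    body = message[HEADER.size:]
    if compressed:
        if algo != ALGO_ZLIB:
            raise ValueError(f"unknown algorithm id {algo}")
        body = zlib.decompress(body)
    if len(body) != length:
        raise ValueError("length in header does not match payload")
    return body
```

Because the receiver reads the flag and algorithm id from the header, the sender is free to skip compression for any individual message without the application noticing.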
12
Set of compression algorithms proposed (1/2) Compressors selected in CoMPI: lossless compressors with the smallest overhead.
13
Set of compression algorithms proposed (2/2)
14
Summary Problem description Main objectives CoMPI Study of compression algorithms. –Conclusion of compression study. Evaluation of CoMPI Results Conclusions
15
Study of compression algorithms(1/7) Goal: to select the most appropriate algorithm for each datatype based on: –Buffer size. –Redundancy level. Whether compression increases the transmission speed depends on: –Number of bits sent. –Time required to compress. –Time required to decompress.
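The three factors on this slide can be combined into a simple break-even check. The sketch below assumes a basic model in which compression, transfer, and decompression happen one after another with no overlap; the function names and the latency parameter are illustrative and do not come from the slides.

```python
def transfer_time(num_bytes, bandwidth_bytes_per_s, latency_s=0.0):
    """Time to move num_bytes over a link with the given bandwidth and latency."""
    return latency_s + num_bytes / bandwidth_bytes_per_s


def compression_pays_off(original_bytes, compressed_bytes,
                         t_compress_s, t_decompress_s,
                         bandwidth_bytes_per_s, latency_s=0.0):
    """True if compress + send + decompress is faster than sending the raw data."""
    plain = transfer_time(original_bytes, bandwidth_bytes_per_s, latency_s)
    packed = (t_compress_s
              + transfer_time(compressed_bytes, bandwidth_bytes_per_s, latency_s)
              + t_decompress_s)
    return packed < plain
```

For example, on a 100 Mbit/s link (12.5e6 bytes/s), halving a 1 MB message saves about 0.04 s of transfer time, so compression only helps if compressing and decompressing together take less than that.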
16
Study of compression algorithms(2/7) Synthetic datasets: integer, floating-point, and double precision. Each dataset contains buffers with different: buffer sizes (100, 500, 900 and 1500 KB); redundancy levels (0%, 25%, 50%, 75% and 100%). For each algorithm, datatype, buffer size and redundancy level we study the complexity and the compression ratio.
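A synthetic integer buffer of this kind could be generated and measured roughly as follows. The slides do not define "redundancy level", so this sketch assumes it is the fraction of values that repeat one constant, with the rest drawn at random; `zlib` again stands in for the compressors actually studied, and the function names are hypothetical.

```python
import random
import zlib
from array import array


def make_int_buffer(n_values, redundancy, seed=0):
    """Build an integer buffer where `redundancy` (0.0 to 1.0) of the values repeat.

    Assumed definition: the repeated values form one contiguous run of a
    constant, followed by pseudo-random 31-bit values.
    """
    rng = random.Random(seed)
    n_repeat = int(n_values * redundancy)
    values = [7] * n_repeat
    values += [rng.getrandbits(31) for _ in range(n_values - n_repeat)]
    return array("i", values).tobytes()


def compression_ratio(buffer):
    """Original size divided by compressed size (higher is better)."""
    return len(buffer) / len(zlib.compress(buffer))
```

Sweeping `redundancy` over 0.0, 0.25, 0.5, 0.75 and 1.0 for each buffer size gives one compression-ratio curve per compressor, which is the shape of the study described on this slide.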
17
Study of compression algorithms(3/7)
18
Study of compression algorithms(4/7) Integer dataset
19
Study of compression algorithms(5/7) Floating-point dataset
20
Study of compression algorithms(6/7) Double precision dataset WITHOUT pattern
21
Study of compression algorithms(7/7) Double precision WITH pattern: Data sequence 50001.0, 50003.0, 50005.0 …
22
Conclusion of compression study Integer and floating-point: 0% redundancy: no compression; 25% to 100% redundancy: LZO. Double precision: without pattern: LZO; with pattern: 0% to 50% redundancy: FPC; 50% to 100% redundancy: LZO.
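The selection rules from this slide can be written as a small lookup function. This is a sketch of the stated conclusions only, not CoMPI's actual selection code: the function name and string labels are hypothetical, it accepts only the redundancy levels that were tested, and because the slide lists 50% in both ranges for patterned doubles, it arbitrarily assigns 50% to LZO.

```python
TESTED_LEVELS = (0, 25, 50, 75, 100)  # redundancy levels used in the study


def select_algorithm(datatype, redundancy_pct, has_pattern=False):
    """Return "none", "LZO" or "FPC" following the conclusions on the slide."""
    if redundancy_pct not in TESTED_LEVELS:
        # The slides give no result between the tested levels.
        raise ValueError(f"untested redundancy level: {redundancy_pct}")
    if datatype in ("integer", "float"):
        return "none" if redundancy_pct == 0 else "LZO"
    if datatype == "double":
        if not has_pattern:
            return "LZO"
        # Assumption: the 50% boundary, listed in both ranges, goes to LZO.
        return "FPC" if redundancy_pct < 50 else "LZO"
    raise ValueError(f"unknown datatype: {datatype}")
```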
23
Summary Problem description Main objectives CoMPI Study of compression algorithms. Evaluation of CoMPI Results Conclusions
24
Evaluation of CoMPI Distribution: MPICHGM-1.2.7.15NOGM-COMP. Cluster: 64 nodes, dual-core AMD, 512 MB of RAM, Fast Ethernet network. Benchmarks (NAS Parallel): IS (integer), LU (double). Applications: BISP3D (float), PSRG (integer), STEM-II (float).
25
Summary Problem description Main objectives CoMPI Study of compression algorithms. Evaluation of CoMPI Results –Real Applications –Benchmarks Conclusions
26
Results (1/5) BISP3D: –Floating-point data. – Improves between x1.2 and x1.4 with LZO.
27
Results (2/5) PSRG: –Integer data. – Improves up to x2 with LZO.
28
Results (3/5) STEM-II: –Floating-point data. – Improves to x1.4 with LZO.
29
Results (4/5) IS : –Integer data. –Improves to x1.2 with LZO. –Rice obtains good results with 32 processes.
30
Results (5/5) LU: –Double precision. –No performance improvement in general. Only with 64 processes, using FPC, do we obtain a speedup of x1.1.
31
Summary Problem description Main objectives CoMPI Study of compression algorithms. Evaluation of CoMPI Results Conclusions –Principal Conclusion. –On going.
32
Principal conclusions (1/2) CoMPI: a new compression library integrated into MPI using the MPICH distribution. CoMPI includes five different compression algorithms and compresses all MPI primitives. Main characteristics: –Transparent for the users. –Fits any application without any change to it. We have evaluated CoMPI using: –Synthetic traces. –Real applications.
33
Principal conclusion (2/2) The results of the evaluations demonstrate that, in most cases, compression: –Reduces the overall execution time. –Enhances the scalability. When compression is not appropriate: –Little performance degradation.
34
On going (1/2) Adaptive compression: select the most appropriate compression algorithm; turn compression on/off at run time for the application. Learning from communication history, taking into account: message characteristics (datatype, redundancy level); platform (network latency and bandwidth); compression algorithm behavior.
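One way to picture the on/off decision learned from communication history is the sketch below. It is purely illustrative: the slides do not describe how the adaptive scheme is implemented, so the class, its exponential smoothing, and the per-byte cost model are all assumptions.

```python
from collections import defaultdict


class CompressionAdvisor:
    """Hypothetical per-datatype history used to turn compression on or off."""

    def __init__(self, bandwidth_bytes_per_s, alpha=0.5):
        self.bandwidth = bandwidth_bytes_per_s
        self.alpha = alpha  # weight given to the newest observation
        # Smoothed compressed/original size ratio; 1.0 means "no gain seen yet".
        self.ratio = defaultdict(lambda: 1.0)
        # Smoothed compress+decompress cost in seconds per original byte.
        self.cost = defaultdict(float)

    def record(self, datatype, original_bytes, compressed_bytes, codec_seconds):
        """Fold one observed message into the history for its datatype."""
        a = self.alpha
        self.ratio[datatype] = ((1 - a) * self.ratio[datatype]
                                + a * compressed_bytes / original_bytes)
        self.cost[datatype] = ((1 - a) * self.cost[datatype]
                               + a * codec_seconds / original_bytes)

    def should_compress(self, datatype):
        """Compare estimated per-byte time with and without compression."""
        plain = 1.0 / self.bandwidth
        packed = self.ratio[datatype] / self.bandwidth + self.cost[datatype]
        return packed < plain
```

With no history the advisor leaves compression off; it switches on only once the recorded messages suggest that the bytes saved on the network outweigh the time spent in the compressor.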
35
On going (2/2)
36
Questions?