Efficiency of Small-Size Task Calculation in Grid Clusters Using Parallel Processing
Olgerts Belmanis, Jānis Kūliņš
RTU ETF, Riga Technical University
Krakow, CGW'07, 15-16 Oct
RTU Cluster
■ Initially the RTU cluster started with five AMD Opteron 146 servers (1 TB).
■ Additionally, eight dual-core AMD Opteron 2210 (M2) servers were installed.
■ Therefore there are now 9 working nodes with 21 CPU units.
■ The total amount of memory is 1.8 TB.
■ The RTU cluster has successfully completed many calculation tasks, including orders from the LHCb virtual organization.
RTU Cluster [photo slides]
Computing Algorithms
■ Serial algorithm:
One task – one WN (working node);
parts of the task are performed serially;
task execution time depends on WN performance only!
■ Parallel algorithm (a minimal sketch follows below):
One task – several WNs;
parts of the task are performed:
► consecutively on separate WNs,
► in parallel on a number of WNs, with the results summarized afterwards.
Task execution time depends on:
► WN performance;
► network performance;
► bandwidth of shared data storage;
► type of coding.
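For illustration, a minimal MPI sketch in C (not taken from the original talk) of the parallel scheme above: every WN computes its own part of the task in parallel, and the partial results are then summarized on one rank. The problem size and the summation workload are placeholder assumptions.

```c
/* Minimal sketch (not from the original slides): one task split across
 * several WNs with MPI; each rank computes a partial sum in parallel and
 * the results are summarized on rank 0, as described above. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each WN processes its own slice of the task in parallel. */
    long n = 1000000;                 /* assumed problem size */
    double partial = 0.0;
    for (long i = rank; i < n; i += size)
        partial += 1.0 / (double)(i + 1);

    /* Results summarizing: combine the partial results on rank 0. */
    double total = 0.0;
    MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("total = %f (computed on %d CPUs)\n", total, size);

    MPI_Finalize();
    return 0;
}
```

Launched with, e.g., `mpirun -np 4 ./task`, each rank lands on a WN, so the execution time now depends on the network and shared storage as well as on WN performance.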
Bottlenecks in a Distributed Computing System [diagram slide]
Interconnections between CPU Nodes

************************************************************
task 0 is on wn03.grid.etf.rtu.lv partner= 2
task 1 is on wn10.grid.etf.rtu.lv partner= 3
task 2 is on wn10.grid.etf.rtu.lv partner= 0
task 3 is on wn10.grid.etf.rtu.lv partner= 1
************************************************************
*** Message size: 1000000 ***
best / avg / worst (MB/sec)
task pair: 0 - 2: 103.31 / 102.29 / 53.64
task pair: 1 - 3: 371.33 / 197.63 / 134.05
OVERALL AVERAGES: 237.32 / 149.96 / 93.84

... the use of multicore servers helps to achieve a higher data transmission rate in MPI applications!
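The slide does not name the benchmark that produced this output; below is a hedged sketch of a pairwise bandwidth test in the same spirit, pairing rank r with rank r + size/2 (matching the 0-2 / 1-3 pairing above) and reporting MB/sec for the 1,000,000-byte message size shown. The repetition count is an assumption.

```c
/* Hedged sketch of a pairwise MPI bandwidth test (the actual tool used
 * on the slide is not named). Each rank in the lower half ping-pongs a
 * fixed-size message with its partner in the upper half. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define MSG_SIZE 1000000   /* message size from the slide, in bytes */
#define REPS     100       /* assumed number of repetitions */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* assume an even rank count */

    int half = size / 2;
    int partner = (rank < half) ? rank + half : rank - half;
    char *buf = malloc(MSG_SIZE);           /* contents irrelevant here */

    double t0 = MPI_Wtime();
    for (int i = 0; i < REPS; i++) {
        if (rank < half) {
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, partner, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, partner, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else {
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, partner, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, partner, 0, MPI_COMM_WORLD);
        }
    }
    double elapsed = MPI_Wtime() - t0;

    if (rank < half)   /* 2 * MSG_SIZE bytes move per round trip */
        printf("task pair: %d - %d: %.2f MB/sec\n", rank, partner,
               2.0 * MSG_SIZE * REPS / elapsed / 1e6);

    free(buf);
    MPI_Finalize();
    return 0;
}
```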
Local Interconnection Rate

CPU number   Low rate, Mb/s   Medium rate, Mb/s   Peak rate, Mb/s
4            95               140                 240
6            18               60                  90
8            3                60                  98

Transmission rate as a function of the number of CPUs ... the number of CPUs used by MPI influences the interconnection rate!
Parallel Application Execution Time [chart slide]
Parallel Speedup Determination
■ During the experiment, multiplication of large matrices was performed (a sketch of such a test follows below).
■ The test creates more than some 10 Mb of traffic between WNs and loads the processors.
■ The main task of the experiment is to find the beginning of the horizontal part of the speedup curve.
■ The experiment on 1 CPU in the RTU cluster takes 420 seconds.
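The slides do not include the test code itself; the following is a sketch, under assumptions (matrix size N, row-block layout, N divisible by the rank count), of how such a large matrix multiplication can be spread over several WNs with MPI.

```c
/* Sketch of the kind of test described (not the actual code from the
 * talk): rank 0 broadcasts B and scatters row blocks of A; every rank
 * multiplies its block, and the C blocks are gathered back. */
#include <mpi.h>
#include <stdlib.h>

#define N 2048   /* assumed matrix dimension */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int rows = N / size;   /* rows of A handled by each WN */
    double *A = NULL, *C = NULL;
    double *B      = malloc((size_t)N * N * sizeof(double));
    double *Ablock = malloc((size_t)rows * N * sizeof(double));
    double *Cblock = malloc((size_t)rows * N * sizeof(double));

    if (rank == 0) {       /* rank 0 owns the full matrices */
        A = malloc((size_t)N * N * sizeof(double));
        C = malloc((size_t)N * N * sizeof(double));
        for (size_t i = 0; i < (size_t)N * N; i++) { A[i] = 1.0; B[i] = 1.0; }
    }

    MPI_Bcast(B, N * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Scatter(A, rows * N, MPI_DOUBLE,
                Ablock, rows * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    for (int i = 0; i < rows; i++)     /* local block multiply */
        for (int j = 0; j < N; j++) {
            double s = 0.0;
            for (int k = 0; k < N; k++)
                s += Ablock[i * N + k] * B[k * N + j];
            Cblock[i * N + j] = s;
        }

    MPI_Gather(Cblock, rows * N, MPI_DOUBLE,
               C, rows * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    free(B); free(Ablock); free(Cblock);
    if (rank == 0) { free(A); free(C); }
    MPI_Finalize();
    return 0;
}
```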
2 × WN ≠ H/2 ... according to Amdahl's law, this speedup corresponds to roughly 20% serial code in the algorithm! (A worked check follows below.)
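For reference, a worked check of that claim (the formula is standard Amdahl's law, not shown on the slide): with a serial fraction s = 0.2, doubling the worker count can never halve the execution time.

```latex
% Amdahl's law: speedup on N processors with serial fraction s
S(N) = \frac{1}{s + \frac{1 - s}{N}}
% with s = 0.2 (20% serial code):
S(2) = \frac{1}{0.2 + 0.8/2} \approx 1.67,
\qquad \lim_{N \to \infty} S(N) = \frac{1}{0.2} = 5
```

So with 20% serial code, two WNs give at most about a 1.67× speedup, and no number of WNs can push it past 5×.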
Possible Solutions
■ Internal connection improvement:
Infiniband, Myrinet, ... connections between WNs;
multicore WN implementation (RTU ETF);
abandonment of the NFS network file system.
■ Data transfer process optimization:
use of multiple flows;
replacing standard TCP with Scalable TCP.
■ Parallel algorithm processing optimization:
minimize transactions between WNs (see the sketch after this list);
reduce the sequential part of the MPI code;
optimization of the number of MPI threads.
■ Optimization of requested resource management.
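As one concrete illustration of minimizing WN transactions and shrinking the sequential part of MPI code (again, not code from the talk): non-blocking MPI calls let a transfer run while the WN keeps computing. The buffer sizes and the ring neighbor pattern are assumptions.

```c
/* Illustration only: non-blocking MPI calls overlap WN-to-WN transfers
 * with computation instead of serializing the run. Ranks are paired in
 * a ring; sizes are placeholders. */
#include <mpi.h>
#include <stdio.h>

#define HALO  1024     /* assumed boundary-data size */
#define INNER 100000   /* assumed interior-data size */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int right = (rank + 1) % size;
    int left  = (rank - 1 + size) % size;
    static double halo_out[HALO], halo_in[HALO], interior[INNER];
    MPI_Request reqs[2];

    /* start the exchange, then compute while the data is in flight */
    MPI_Irecv(halo_in,  HALO, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(halo_out, HALO, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

    for (int i = 0; i < INNER; i++)    /* work that does not need the halo */
        interior[i] = interior[i] * 2.0 + 1.0;

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);   /* transfer done here */

    if (rank == 0)
        printf("overlapped exchange of %d doubles on %d ranks\n", HALO, size);

    MPI_Finalize();
    return 0;
}
```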
Thank you for your attention!