
1 Optimization and evaluation of parallel I/O in BIPS3D parallel irregular application. Performance Modelling, Evaluation, and Optimization of Parallel and Distributed Systems (PMEO-PDS 2007). Rosa Filgueira, David E. Singh, Florin Isaila, Jesús Carretero, Antonio G. Loureiro

2 Summary I. Description of the problem II. Main objectives III. Parallel I/O storage IV. Evaluation V. Optimizing the I/O VI. Conclusions

3 I. Description of the problem (I) BIPS3D is a semiconductor device simulator based on finite element methods. This work optimizes and evaluates parallel I/O for BIPS3D.

4 I. Description of the problem (II) The mesh is divided into several sub-domains (METIS). Each processor performs calculations only on its local data. The results are stored sequentially. This sequential storage is a major bottleneck.

5 II. Main objectives (I) Objectives: • Evaluating the sequential I/O cost. • Implementing parallel I/O techniques. • Developing a method for selecting the most appropriate I/O technique based on the network type, mesh size and data set size. • Introducing a new data clustering technique called Interval Data Grouping (IDG).

6 II. Main objectives (II) • Several I/O configurations have been implemented and evaluated: sequential I/O over NFS; sequential I/O over PVFS; parallel I/O over PVFS (unoptimized); parallel I/O over PVFS with two-phase I/O; parallel I/O over PVFS with List I/O.
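The slides do not describe two-phase I/O itself. As a rough illustration of the general idea (processes first exchange data so that each holds one contiguous file region, then each writes its region in a single operation), the following pure-Python simulation may help. It is a sketch under stated assumptions, not the MPI-IO or PVFS implementation: the function name `two_phase_write`, the dictionary-based data layout and the sample data are all hypothetical.

```python
def two_phase_write(local_data, num_procs, file_size):
    """Simulate two-phase I/O on an in-memory "file" (illustrative only).

    local_data[r] is a dict {file_offset: value} held by process r.
    Assumes every offset in range(file_size) is held by some process.
    """
    chunk = -(-file_size // num_procs)  # ceiling division: size of each file domain

    # Phase 1 (exchange): route every item to the process whose
    # contiguous file domain contains its offset.
    domains = [dict() for _ in range(num_procs)]
    for items in local_data:
        for offset, value in items.items():
            domains[offset // chunk][offset] = value

    # Phase 2 (write): each process writes its domain as one contiguous block.
    out = [None] * file_size
    writes = []  # (start_offset, length) of each contiguous write
    for rank, domain in enumerate(domains):
        if not domain:
            continue
        start = rank * chunk
        end = min(start + chunk, file_size)
        out[start:end] = [domain[o] for o in range(start, end)]
        writes.append((start, end - start))
    return out, writes


# Hypothetical interleaved data on two processes
local_data = [{0: "a", 2: "c", 5: "f"}, {1: "b", 3: "d", 4: "e"}]
contents, writes = two_phase_write(local_data, num_procs=2, file_size=6)
```

In this example the six interleaved items end up as two contiguous writes of three elements each, instead of several small non-contiguous ones.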

7 III. Parallel I/O All processors write their local data to disk. Each processor constructs a view over the file using the distribution provided by METIS. [Figure: a 20-node file, the METIS distribution for partitions 0 and 1, and the resulting views over the file for processor 1 and processor 2.]
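To make the notion of a per-processor view concrete, the sketch below derives, from a partition vector, the file positions each processor owns and coalesces them into (offset, length) runs, which is the kind of description a non-contiguous access needs. It is a plain-Python illustration rather than MPI-IO code; `file_view` and the 10-node partition vector are hypothetical and do not reproduce the mesh in the figure.

```python
from itertools import groupby


def file_view(partition, rank):
    """Return the file positions owned by `rank` as (offset, length) runs.

    partition[i] is the partition assigned to the node stored at file position i.
    """
    positions = [i for i, p in enumerate(partition) if p == rank]
    runs = []
    # Consecutive positions share the same (position - index) difference.
    for _, group in groupby(enumerate(positions), key=lambda t: t[1] - t[0]):
        block = [pos for _, pos in group]
        runs.append((block[0], len(block)))
    return runs


# Hypothetical partition vector for a 10-node file
partition = [0, 1, 0, 0, 1, 1, 1, 0, 1, 0]
view0 = file_view(partition, 0)
view1 = file_view(partition, 1)
```

The more fragmented the partition vector, the more (and shorter) runs each view contains, which is why the grain size of the accesses matters for the I/O cost.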

8 IV. Evaluation (I) We ran tests with: • Different networks (Myrinet and Fast Ethernet). • Different meshes:

            Mesh 1    Mesh 2    Mesh 3    Mesh 4
Nodes        47219     32888     73260    289650
Vertices    305120    210437    416950   2027885

9 IV. Evaluation (II) Using a parameter (Load) we increase the size of the mesh. Note that this parameter also changes the grain size of the accesses. [Figure: node numbering of the original mesh, the mesh with load 2, and the mesh with load 3.]

10 IV. Evaluation: Myrinet [Figure panels: Two phase, List I/O]

11 IV. Evaluation: Fast Ethernet [Figure panel: List I/O]

12 IV. Evaluation: Decision tree [Figure: decision tree; labels shown: Nx=70,000, Nld=50, Nld=90]

13 V. Optimizing the I/O We introduce a novel data grouping technique: Interval Data Grouping (IDG). • The goal of IDG: grouping data for I/O in order to increase locality and reduce the disk write time.

14 V. Optimizing the I/O : Distribution of example mesh [Figure: an example mesh with nodes 0 to 7, marked as local or shared, showing the BIPS3D data distribution and the METIS assignation for Processor 0 and Processor 1.]

15 V. Optimizing the I/O : Distribution of example mesh (II) The IDG algorithm has two stages: • Node classification: analyze the mesh structure and the METIS distribution to classify each mesh node as shared or local. • Disk access scheduler: local nodes are written by the processor they belong to; for shared nodes, the most appropriate processor is chosen by looking at the previous and subsequent nodes.
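The two stages can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slides only say that the writer of a shared node is chosen by looking at its previous and subsequent node, so the concrete tie-breaking rule used here (prefer the writer of the previous node, then an owner of the next node) is an assumption. The names `classify_nodes` and `schedule_writes` and the 8-node membership table are hypothetical.

```python
def classify_nodes(membership):
    """Stage 1: a node is shared if it belongs to more than one processor."""
    return {n: ("shared" if len(p) > 1 else "local") for n, p in membership.items()}


def schedule_writes(order, membership):
    """Stage 2: pick exactly one writer per node, following file order.

    Local nodes are written by their only owner. For a shared node this
    sketch prefers the writer of the previous node, then an owner of the
    next node, so runs of consecutive nodes stay on one processor.
    (Assumed heuristic; the slides do not give the exact rule.)
    """
    writer = {}
    for i, node in enumerate(order):
        owners = membership[node]
        if len(owners) == 1:
            writer[node] = next(iter(owners))
            continue
        prev = writer.get(order[i - 1]) if i > 0 else None
        nxt = membership[order[i + 1]] if i + 1 < len(order) else set()
        if prev in owners:
            writer[node] = prev
        elif owners & nxt:
            writer[node] = min(owners & nxt)
        else:
            writer[node] = min(owners)
    return writer


# Hypothetical mesh: node -> set of processors that hold a copy of it
membership = {0: {0}, 1: {0, 1}, 2: {0, 1}, 3: {0},
              4: {0, 1}, 5: {1}, 6: {1}, 7: {0, 1}}
kinds = classify_nodes(membership)
writer = schedule_writes(list(range(8)), membership)
```

With this input, processor 0 writes the interval of nodes 0 to 4 and processor 1 writes nodes 5 to 7, so each processor issues a single contiguous write.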

16 V. Optimizing the I/O : Distribution of example mesh (III) [Figure: the same example mesh, comparing the IDG distribution with the BIPS3D data distribution and the METIS assignation for Processor 0 and Processor 1; nodes are marked as local or shared.]

17 V. Optimizing the I/O : evaluation (I) We combined IDG with List I/O for different meshes and different loads, and compared the IDG performance with other strategies: • METIS: the original node distribution. • Random: each shared node is assigned to a partition randomly. • First Position: each shared node is assigned to the first partition among all those it belongs to.
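For comparison, the Random and First Position baselines can be sketched in a few lines. The code is illustrative only: "first partition" is assumed to mean the lowest-numbered partition, the number of contiguous intervals is used as a simple proxy for access locality (the slides report measured performance, not this count), and the function names and the membership table are hypothetical.

```python
import random


def first_position(membership):
    """Assign each node to the lowest-numbered partition it belongs to (assumed meaning of "first")."""
    return {n: min(p) for n, p in membership.items()}


def random_assignment(membership, seed=0):
    """Assign each node to one of its partitions at random."""
    rng = random.Random(seed)
    return {n: rng.choice(sorted(p)) for n, p in membership.items()}


def count_intervals(order, writer):
    """Count runs of consecutive nodes (in file order) written by the same processor."""
    runs, prev = 0, None
    for node in order:
        if writer[node] != prev:
            runs += 1
            prev = writer[node]
    return runs


# Hypothetical mesh: node -> set of partitions that hold a copy of it
membership = {0: {0}, 1: {0, 1}, 2: {0, 1}, 3: {0},
              4: {0, 1}, 5: {1}, 6: {1}, 7: {0, 1}}
order = list(range(8))
fp = first_position(membership)
rnd = random_assignment(membership)
fp_runs = count_intervals(order, fp)
rnd_runs = count_intervals(order, rnd)
```

Here First Position yields three intervals because node 7 is sent back to partition 0 after partition 1 has written nodes 5 and 6; a scheduler that takes neighbouring nodes into account can avoid that extra interval.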

18 Results: percentage of shared nodes Percentage of shared nodes for 8, 16, 32, and 64 partitions for each mesh. • IDG: larger number of partitions → more shared nodes → larger data intervals → improved performance.
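The metric on this slide can be computed directly from a node-to-partitions table, as in the short sketch below. The function name `shared_percentage` and the sample table are hypothetical; the figures reported in the presentation come from the real meshes, not from this example.

```python
def shared_percentage(membership):
    """Percentage of nodes that belong to more than one partition."""
    shared = sum(1 for parts in membership.values() if len(parts) > 1)
    return 100.0 * shared / len(membership)


# Hypothetical mesh: node -> set of partitions that hold a copy of it
membership = {0: {0}, 1: {0, 1}, 2: {0, 1}, 3: {0},
              4: {0, 1}, 5: {1}, 6: {1}, 7: {0, 1}}
pct = shared_percentage(membership)
```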

19 V. Optimizing the I/O : evaluation (II)

20 VI. Conclusions We optimized and evaluated the parallel I/O operations of the BIPS3D simulator. We built a decision tree to choose the best I/O configuration. We introduced a novel technique that exploits the data replication of mesh nodes to schedule disk accesses. With this proposal the performance of the parallel I/O operations is improved.

