Fire Benchmark Parallelisation Programming of Supercomputers WS 11/12 Sam Maurus.

1 Fire Benchmark Parallelisation. Programming of Supercomputers, WS 11/12. Sam Maurus

2 What is Fire Benchmark? A CFD solver for arbitrary geometries. This project concerned itself with the gccg solver.

3 How Fast is Fire Benchmark Sequentially?

4 What effect does the input file format have?

5 Data structures in gccg: points, elements

6 Data structures in gccg: points array (x, y, z)

7 Data structures in gccg: elems array

8 Data structures in gccg: lcc array
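
The three arrays named on slides 6 to 8 can be pictured with a tiny mesh. The sketch below is an illustration under stated assumptions, not the gccg layout itself: it assumes hexahedral elements with 8 corner points and 6 face neighbours each, a fixed face order, and that a neighbour index equal to the element count marks a domain boundary. Only the names points, elems and lcc come from the slides; build_two_cell_mesh, pid, hexa and BOUNDARY are hypothetical.

```python
def build_two_cell_mesh():
    """Build two unit cubes side by side along x (illustrative mesh only)."""
    # points: one (x, y, z) triple per mesh point; x varies fastest.
    points = [(float(x), float(y), float(z))
              for z in range(2) for y in range(2) for x in range(3)]

    def pid(x, y, z):
        # Index of the point at integer coordinates (x, y, z) in `points`.
        return x + 3 * y + 6 * z

    def hexa(x0):
        # Corner point indices of the cube whose low-x face sits at x0.
        return [pid(x0 + dx, dy, dz)
                for dz in range(2) for dy in range(2) for dx in range(2)]

    # elems: for each element, the indices of its 8 corner points (assumed).
    elems = [hexa(0), hexa(1)]

    # lcc: for each element, the neighbouring element across each face.
    # Assumed face order: -x, +x, -y, +y, -z, +z.
    # Assumed convention: an index equal to the element count is a boundary.
    BOUNDARY = len(elems)
    lcc = [
        [BOUNDARY, 1, BOUNDARY, BOUNDARY, BOUNDARY, BOUNDARY],
        [0, BOUNDARY, BOUNDARY, BOUNDARY, BOUNDARY, BOUNDARY],
    ]
    return points, elems, lcc
```

In this toy mesh the two elements share the four points on the plane x = 1, and each lists the other exactly once in its lcc row.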

9 Data distribution approach
Processes: Process 0 (root), Process 1, Process 2, Process 3
Root process tasks:
- Read the input file
- Partition the elements using the chosen approach
- Create and send the relevant mapping arrays to each process
- Broadcast the common data package to each process (lcc, ne, epart, countPart, bs_local, be_local, …)
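
The root-side bookkeeping on this slide could look roughly like the sketch below. It is written in plain Python without MPI and rests on assumptions: epart is taken to map each global element index to its owning process, and countPart to hold the number of elements per process. The contiguous block partition is only a stand-in for "the chosen approach", which the slide does not specify. block_partition, build_mappings and local_to_global are hypothetical names.

```python
def block_partition(ne, nprocs):
    """Assign ne elements to nprocs processes in contiguous blocks.

    Stand-in partitioner: the slides do not say which approach was used.
    Returns epart, where epart[g] is the owner of global element g.
    """
    base, extra = divmod(ne, nprocs)
    epart = []
    for rank in range(nprocs):
        # The first `extra` ranks take one additional element each.
        epart.extend([rank] * (base + (1 if rank < extra else 0)))
    return epart


def build_mappings(epart, nprocs):
    """Derive per-process element counts and local-to-global index maps."""
    count_part = [0] * nprocs
    local_to_global = [[] for _ in range(nprocs)]
    for g, rank in enumerate(epart):
        count_part[rank] += 1
        local_to_global[rank].append(g)
    return count_part, local_to_global
```

In an MPI program the root would compute these once, send each process its own mapping array, and broadcast the shared arrays to everyone.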

10 Communication model

11 Communication model: has_ghost_neighbour array (figure: processes P3 and P5; elements labelled has_ghost_neighbour = 0 and has_ghost_neighbour = 1)
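
One way such a flag array could be derived is sketched below. It assumes that lcc[g] lists the global neighbour indices of element g, that epart[g] gives the owning process, and that a neighbour index at or beyond the element count marks a domain boundary. These conventions are assumptions for illustration, and compute_has_ghost_neighbour is a hypothetical helper, not code from the project.

```python
def compute_has_ghost_neighbour(lcc, epart, rank):
    """Flag each element owned by `rank` that touches another process.

    Returns a dict {global_element_index: 0 or 1}, where 1 means at least
    one neighbour is owned by a different process (a ghost cell).
    """
    ne = len(epart)
    flags = {}
    for g, neighbours in enumerate(lcc):
        if epart[g] != rank:
            continue  # not one of our local elements
        # Indices >= ne are treated as domain boundaries, not ghosts.
        has_ghost = any(n < ne and epart[n] != rank for n in neighbours)
        flags[g] = int(has_ghost)
    return flags
```

Elements flagged 0 depend only on local data, so they can be processed while the ghost-cell exchange is still in flight.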

12 Communication model
Figure labels: Process x; Process 0, Process 1, …, Process k (k = count)
Computational loop, phase one:
- Start Isend to the required processes (where cellCountsToSend[i] > 0)
- Start Irecv from the required processes (where cellCountsToRecv[i] > 0)
- Process local elements that have no ghost neighbours
- Wait on all requests
- Update the remaining local elements
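
The ordering of phase one can be sketched without MPI as a plan of steps. The code below is illustrative only: it assumes cellCountsToSend[i] and cellCountsToRecv[i] count the cells exchanged with process i, derives them from lcc and epart under the same assumed conventions as before (neighbour indices at or beyond the element count are boundaries), and returns the step order as plain tuples rather than issuing real Isend, Irecv or wait calls. exchange_counts and phase_one_schedule are hypothetical names.

```python
def exchange_counts(lcc, epart, rank, nprocs):
    """Count the cells `rank` must send to and receive from each process."""
    ne = len(epart)
    to_send = [set() for _ in range(nprocs)]
    to_recv = [set() for _ in range(nprocs)]
    for g, neighbours in enumerate(lcc):
        if epart[g] != rank:
            continue
        for n in neighbours:
            if n < ne and epart[n] != rank:
                owner = epart[n]
                to_send[owner].add(g)  # owner of n needs our cell g
                to_recv[owner].add(n)  # we need ghost cell n from its owner
    return [len(s) for s in to_send], [len(r) for r in to_recv]


def phase_one_schedule(counts_to_send, counts_to_recv):
    """Return the order of operations for phase one as (action, target)."""
    steps = [("isend", p) for p, c in enumerate(counts_to_send) if c > 0]
    steps += [("irecv", p) for p, c in enumerate(counts_to_recv) if c > 0]
    steps.append(("compute", "elements without ghost neighbours"))
    steps.append(("wait_all", None))
    steps.append(("compute", "remaining elements"))
    return steps
```

The point of this ordering is that the first compute step overlaps with the communication started just before it, and the single wait step sits between the two compute steps.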

13 Communication model

14 Problems overcome: MPI_Wait function
Problem: MPI_Wait was being executed for both the send and the receive requests for every element processed.
Solution: The has_ghost_neighbour array was introduced, allowing for intermediate computation. MPI_Wait is then called only once for each request.
(Figure: before and after)

15 Problems overcome: redundant reprocessing of input file
Problem: The input file was being read once at initialisation and again for writing the result (redundant).
Solution: The "write solution" code was refactored to re-use the relevant file information obtained from the first read.
(Figure: before and after)

16 Speedup – cojack

17 Speedup – pent

18 Speedup – drall

19 Speedup – tjunc

20 Speedup – full execution
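
The speedup slides survive only as titles, so no measured values are available here. For reference, the usual definitions are speedup S(p) = T(1) / T(p) and parallel efficiency E(p) = S(p) / p. The helper below is a generic sketch of those formulas; the function and argument names are hypothetical, and whether the slides used exactly this definition is an assumption.

```python
def speedup(t_sequential, t_parallel):
    """Speedup of a parallel run relative to the sequential run time."""
    if t_parallel <= 0:
        raise ValueError("parallel run time must be positive")
    return t_sequential / t_parallel


def efficiency(t_sequential, t_parallel, nprocs):
    """Parallel efficiency: speedup divided by the number of processes."""
    if nprocs <= 0:
        raise ValueError("process count must be positive")
    return speedup(t_sequential, t_parallel) / nprocs
```

Both timings must come from runs on the same input file, since the slides report speedup separately for each input (cojack, pent, drall, tjunc).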

21 Thanks for listening. Discussion time!

