Published by Deborah Lawrence. Modified over 8 years ago.
1
Hybrid Parallel Implementation of the DG Method
Advanced Computing Department / CAAM, 03/03/2016
N. Chaabane, B. Riviere, H. Calandra, M. Sekachev, S. Hamlaoui
2
Outline
Numerical methods
Modern programming models
DG method: Implementation and scalability
4
Classical approaches: finite difference
5
Classical approaches: finite volume (figure source: oldwww.unibas.it)
6
Limitations
The finite volume method is a low-order method.
The approximate solution is piecewise constant.
A very fine mesh means a high number of degrees of freedom, which in turn means a large linear system.
7
DG-Finite Element Method
Allows us to use higher-order approximations.
Allows the modelling of complex geometries.
Modern methods such as the DG method allow hp-refinement to be implemented in a relatively easy way.
(Figure: mesh elements with different polynomial degrees: p=2, p=1, p=3.)
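To make the cost of raising the polynomial degree concrete, the following minimal sketch counts degrees of freedom per element. It assumes the full polynomial space of total degree at most p, which has C(p+d, d) basis functions in d dimensions; the slides do not state which basis is used, and the names `local_dofs` and `total_dofs` are hypothetical.

```python
from math import comb

def local_dofs(p: int, dim: int) -> int:
    """Size of the polynomial space of total degree <= p in `dim` dimensions."""
    if p < 0 or dim < 1:
        raise ValueError("need p >= 0 and dim >= 1")
    return comb(p + dim, dim)

def total_dofs(degrees, dim: int) -> int:
    """Global DOF count for a DG mesh with one polynomial degree per element.

    DG basis functions are not shared between elements, so the global
    count is just the sum of the local counts.
    """
    return sum(local_dofs(p, dim) for p in degrees)

# The three degrees shown in the slide figure, on an assumed 2D mesh.
mesh_degrees = [2, 1, 3]
print([local_dofs(p, 2) for p in mesh_degrees])
print(total_dofs(mesh_degrees, 2))
```

With p=0 the count is one value per element, which is the piecewise-constant case; higher degrees add unknowns per element instead of requiring a finer mesh.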
10
Serial Computers
1 Central Processing Unit (CPU).
1 Memory Unit.
11
From Serial to Parallel: Step I
Idea: Add more cores! => Multi-core processor/CPU
Architecture: Uniform memory access (UMA)
(Diagram: UMA node with a memory unit and a multi-core CPU; memory access labeled Speed A.)
12
From Serial to Parallel: Step II
Idea: Add more processors => Multi-processor nodes
Architecture: Non-uniform memory access (NUMA)
(Diagram: NUMA node with a memory unit and two multi-core CPUs; access paths labeled Speed A and Speed B, with Speed A > Speed B.)
13
From Serial to Parallel: Step III
Idea: Connect nodes by network (actual wires)
Result: The majority of supercomputers around 2010.
Architecture: Interconnected NUMA nodes
(Diagram: NUMA nodes connected by a network link labeled Speed C, with Speed A > Speed B > Speed C.)
15
Domain Decomposition and SPMD
Single program, multiple data (SPMD)
Most common style of parallel programming.
Tasks are split up and run simultaneously on multiple processors with different input in order to obtain results faster.
The same program is executed on every processor.
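The SPMD idea can be illustrated with a small serial emulation, offered here as a sketch and not as the authors' code. Every "rank" runs the same function and differs only in the slice of data it picks from its rank number. The names `spmd_program` and `run_spmd` are hypothetical, and the Python loop stands in for launching real processes (for example with MPI).

```python
def spmd_program(rank: int, size: int, data: list):
    """The single program: identical on every rank, different input slice."""
    # Block distribution: the first `extra` ranks receive one extra item.
    base, extra = divmod(len(data), size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    # Local work on this rank's slice only (here: a sum of squares).
    return sum(x * x for x in data[start:stop])

def run_spmd(size: int, data: list):
    """Serial stand-in for running `size` copies of the program in parallel."""
    partial_results = [spmd_program(rank, size, data) for rank in range(size)]
    # Stand-in for a reduction that combines the per-rank results.
    return sum(partial_results)

data = list(range(10))
print(run_spmd(4, data) == sum(x * x for x in data))
```

The result does not depend on the number of ranks, which is the property a correct SPMD decomposition should have.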
16
Domain Decomposition
(Diagram: domain split between Core 1 and Core 2, with a ghost region.)
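As a rough illustration of what a ghost region is for, the sketch below splits a 1D array between two "cores". Each half carries one extra ghost cell copied from its neighbour so that a 3-point stencil can be applied up to the subdomain boundary. The 1D setting, the averaging stencil, and the function names are assumptions made for this example only.

```python
def smooth(u):
    """One 3-point averaging sweep; the two end values are left unchanged."""
    interior = [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]

def smooth_two_cores(u):
    """Same sweep, computed on two subdomains that each hold one ghost cell."""
    if len(u) < 4:
        raise ValueError("need at least 4 cells to split in two")
    mid = len(u) // 2
    # Ghost exchange: each core receives a copy of its neighbour's boundary cell.
    left = u[:mid] + [u[mid]]        # owned cells + ghost from core 2
    right = [u[mid - 1]] + u[mid:]   # ghost from core 1 + owned cells
    # Each core updates independently; ghosts are read but not kept.
    new_left = smooth(left)[:-1]
    new_right = smooth(right)[1:]
    return new_left + new_right

u = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
print(smooth_two_cores(u) == smooth(u))
```

In a distributed run the two list copies that fill the ghost cells would be replaced by messages between the cores.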
17
Domain Decomposition of the FE Method
(Diagram: Core 1.)
18
Domain Decomposition of the FE Method
(Diagram: Core 1 and Core 2, communicating via MPI.)
19
Load Balance
The domain decomposition is done by elements.
Assign weights to the elements to ensure load balance.
(Figure: mesh elements with different polynomial degrees: p=2, p=1, p=3.)
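A minimal sketch of weighted load balancing follows. It assumes that an element's weight is its number of local degrees of freedom in 3D, which the slides do not specify, and it uses a simple greedy heuristic that ignores mesh connectivity. This is a stand-in for illustration, not the graph partitioner used in the talk; all names are hypothetical.

```python
import heapq
from math import comb

def element_weight(p: int, dim: int = 3) -> int:
    """Assumed cost model: number of local basis functions of degree <= p."""
    return comb(p + dim, dim)

def greedy_partition(degrees, n_parts: int):
    """Give each element, heaviest first, to the currently lightest part."""
    heap = [(0, part) for part in range(n_parts)]  # (load, part id)
    heapq.heapify(heap)
    owner = [None] * len(degrees)
    order = sorted(range(len(degrees)), key=lambda e: -element_weight(degrees[e]))
    for e in order:
        load, part = heapq.heappop(heap)
        owner[e] = part
        heapq.heappush(heap, (load + element_weight(degrees[e]), part))
    loads = [0] * n_parts
    for e, part in enumerate(owner):
        loads[part] += element_weight(degrees[e])
    return owner, loads

degrees = [3, 3, 2, 2, 1, 1, 1, 1]  # polynomial degree of each element
owner, loads = greedy_partition(degrees, 2)
print(owner)
print(loads)
```

Splitting these eight elements by count alone (first four, last four) would give loads of 60 and 16 under the same cost model, which is the imbalance that weighting is meant to avoid.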
20
Strong Scalability
CRAY machine: 52 nodes with 2 CPUs each => total number of cores = 1040
We use Hypre* to solve the linear system.
* http://acts.nersc.gov/hypre/
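The core count on this slide implies 10 cores per CPU if the cores are spread evenly over the 104 CPUs. The sketch below checks that arithmetic and shows one common way to compute strong-scaling speedup and efficiency relative to the smallest run. The timing values are hypothetical placeholders, not measurements from the talk, and the function names are assumptions.

```python
def cores_per_cpu(total_cores: int, nodes: int, cpus_per_node: int) -> int:
    """Cores per CPU, assuming cores are evenly distributed over all CPUs."""
    cpus = nodes * cpus_per_node
    if total_cores % cpus != 0:
        raise ValueError("cores do not divide evenly over CPUs")
    return total_cores // cpus

def strong_scaling(timings: dict) -> dict:
    """Map cores -> (speedup, efficiency) for a fixed problem size.

    Both are measured relative to the run with the fewest cores.
    """
    base = min(timings)
    result = {}
    for cores in sorted(timings):
        speedup = timings[base] / timings[cores]
        efficiency = speedup * base / cores
        result[cores] = (speedup, efficiency)
    return result

print(cores_per_cpu(1040, 52, 2))  # 10

# Hypothetical wall-clock times in seconds, for illustration only.
hypothetical_timings = {20: 100.0, 40: 52.0, 80: 28.0}
for cores, (s, e) in strong_scaling(hypothetical_timings).items():
    print(cores, round(s, 2), round(e, 2))
```

An efficiency of 1.0 corresponds to ideal strong scaling, where doubling the cores halves the run time.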
22
Weak Scalability
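In a weak-scaling study the work per core is held fixed while the core count grows, so the ideal outcome is a constant run time. A minimal sketch of the usual efficiency measure follows; the timings are hypothetical placeholders, not results from the talk, and the function name is an assumption.

```python
def weak_scaling_efficiency(timings: dict) -> dict:
    """Map cores -> efficiency when the problem size grows with the cores.

    Efficiency is the time of the smallest run divided by the time of
    each larger run; 1.0 means the run time stayed constant.
    """
    base = min(timings)
    return {cores: timings[base] / timings[cores] for cores in sorted(timings)}

# Hypothetical wall-clock times in seconds, same work per core in every run.
hypothetical_timings = {20: 10.0, 40: 10.5, 80: 12.5}
for cores, eff in weak_scaling_efficiency(hypothetical_timings).items():
    print(cores, round(eff, 2))
```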
23
Evolution of Supercomputers: GPUs
Idea: Complement CPUs with accelerators/co-processors
Result: The biggest supercomputers today.
Architecture: Hybrid
(Diagram: NUMA nodes, each with a CPU and a GPU, connected by a network link labeled Speed C.)
24
Domain Decomposition of the FE Method
(Diagram: Node 1.)
25
Domain Decomposition of the FE Method
(Diagram: Node 1 and Node 2, communicating via MPI.)
26
Scalability of the Hybrid Implementation I
Comparison between HYPRE and AMGX, using 2 CPUs per node for HYPRE and one Tesla K40 GPU per node for AMGX.
27
Drawbacks
(Diagram: NUMA node with CPU cores and a GPU; labels: "SUBDOMAIN i", "Uniform Access", "Linear system".)
28
Optimized Implementation: OpenMP
(Diagram: NUMA node with CPU cores and a GPU; labels: "SUBDOMAIN i", "Access", "Linear system", "OpenMP".)
29
Scalability of The Hybrid Implementation II
30
Conclusion
We developed highly scalable software that takes modern computing architectures into account to simulate geophysical applications.
hp-refinement is fairly easy as a result of using the DG method.
Load balancing is ensured using ParMETIS.