Slide 1: Parallel Computation of the 2D Laminar Axisymmetric Coflow Nonpremixed Flames
Qingan Andy Zhang, PhD Candidate
Department of Mechanical and Industrial Engineering, University of Toronto
ECE 1747 Parallel Programming Course Project, Dec. 2006
Slide 2: Outline
- Introduction
- Motivation
- Objective
- Methodology
- Results
- Conclusion
- Future Improvement
- Work in Progress
Slide 3: Introduction
The multi-dimensional laminar flame:
- Easy to model
- Computationally feasible even with detailed sub-models (chemistry, transport, etc.)
- Lots of experimental data available
- Resembles turbulent flames in some cases (e.g. the flamelet regime)
[Figure: flow configuration]
Slide 4: Motivation
The run time is expected to be long when the problem involves:
- A complex chemical mechanism: the Appel et al. (2000) mechanism (101 species, 543 reactions)
- A complex geometry: a large 2D coflow laminar flame (1,000 x 500 = 500,000 grid points) or a 3D laminar flame (1,000 x 500 x 100 = 50,000,000 grid points)
- A complex physical problem: soot formation, multi-phase phenomena
Slide 5: Objective
To develop a parallel flame code based on the sequential flame code, with attention to:
- Speedup
- Feasibility
- Accuracy
- Flexibility
Slide 6: Methodology -- Options
- Shared memory: OpenMP, Pthreads
- Distributed memory: MPI
- Distributed shared memory: Munin, TreadMarks
MPI is chosen because it is widely used for scientific computation, it is easy to program, and the cluster is a distributed-memory system (see the sketch below).
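As a minimal sketch (not from the original code) of how an MPI program starts up on a distributed-memory cluster, each process initializes the MPI runtime and learns its rank; the program and variable names here are illustrative only.

```fortran
! Minimal MPI start-up sketch: each process learns its rank and the
! total number of processes, then prints a greeting.
program mpi_hello
   use mpi
   implicit none
   integer :: ierr, rank, nprocs

   call MPI_Init(ierr)                                ! start the MPI runtime
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)     ! this process's id (0..nprocs-1)
   call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)   ! total number of processes

   print *, 'Hello from process ', rank, ' of ', nprocs

   call MPI_Finalize(ierr)                            ! shut down the MPI runtime
end program mpi_hello
```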
Slide 7: Methodology -- Preparation
- Linux OS
- Programming tools (Fortran, Make, IDE)
- Parallel computation concepts
- MPI commands
- Networking (SSH, queuing system)
Slide 8: Methodology -- Sequential Code
Sequential code analysis:
- Algorithm
- Dependency
- Data I/O
- CPU time breakdown
The sequential code is the backbone for parallelization!
Slide 9: Methodology
[Figure: flow configuration and computational domain]
Governing equations (continuity, momentum, gas species, energy) + constitutive relations + initial and boundary conditions, solved by CFD with parallel computation.
Slide 10: Methodology -- CFD
CFD: finite volume method, iterative solution on a staggered grid.
[Figure: flow configuration and computational domain]
Quantities solved (primitive variables): U, V, P', Y_i (i = 1, ..., KK), T
- Y_i: mass fraction of the i-th gas species
- KK: total number of gas species
If KK = 100, then (3 + 100 + 1) = 104 equations must be solved at each grid point. With a 1,000 x 500 mesh, that is 104 x 1,000 x 500 = 52,000,000 equations per iteration. If 3,000 iterations are required to reach a converged solution, a total of 52,000,000 x 3,000 = 156,000,000,000 equations must be solved.
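As a general expression of the operation count worked out above (the symbols N_r, N_z and N_iter for the mesh size and iteration count are mine, not the slide's):

```latex
N_{\mathrm{eq}} = \underbrace{(3 + KK + 1)}_{U,\;V,\;P',\;Y_{1..KK},\;T} \times N_r N_z \times N_{\mathrm{iter}}
```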
Slide 11: General Transport Equation
Unsteady term + convection term = diffusion term + source term
- Unsteady: time-variant term
- Convection: transport caused by the flow motion
- Diffusion: for species, molecular diffusion and thermal diffusion
- Source: for species, chemical reaction
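A hedged reconstruction of the general transport equation in axisymmetric (z, r) coordinates for a generic variable phi; the exact form on the original slide is not in the transcript, so the diffusion coefficient Gamma_phi and source term S_phi are generic placeholders:

```latex
\frac{\partial(\rho\phi)}{\partial t}
+ \frac{\partial(\rho u\phi)}{\partial z}
+ \frac{1}{r}\frac{\partial(r\rho v\phi)}{\partial r}
= \frac{\partial}{\partial z}\left(\Gamma_\phi\frac{\partial\phi}{\partial z}\right)
+ \frac{1}{r}\frac{\partial}{\partial r}\left(r\,\Gamma_\phi\frac{\partial\phi}{\partial r}\right)
+ S_\phi
```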
Slide 12: Mass and Momentum Equations
- Mass (continuity) equation
- Axial momentum equation
- Radial momentum equation
[Equations shown as images on the original slide]
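The equation images did not survive the transcript. For reference, the standard steady axisymmetric continuity (mass) equation, assuming u is the axial and v the radial velocity component (consistent with the Z,U and R,V axis labels on slide 16), is:

```latex
\frac{\partial(\rho u)}{\partial z} + \frac{1}{r}\frac{\partial(r\rho v)}{\partial r} = 0
```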
Slide 13: Species and Energy Equations
- Species equation: diffusion of species, chemical reaction
- Energy equation: radiation heat transfer
[Equations shown as images on the original slide]
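A hedged sketch of a standard steady species transport equation of the kind described on this slide; the diffusion velocities V_{k,z}, V_{k,r}, the molar production rate \dot{\omega}_k and the molar mass W_k are my notation, not the slide's:

```latex
\frac{\partial(\rho u Y_k)}{\partial z}
+ \frac{1}{r}\frac{\partial(r\rho v Y_k)}{\partial r}
= -\frac{\partial(\rho Y_k V_{k,z})}{\partial z}
 - \frac{1}{r}\frac{\partial(r\rho Y_k V_{k,r})}{\partial r}
 + \dot{\omega}_k W_k
```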
Slide 14: Methodology -- Sequential Code
Start the iteration from scratch or from a continued job. Within one iteration:
1. Iteration starts
2. Discretization: compute AP(I,J) and CON(I,J)
3. Solve with TDMA or PbyP Gauss elimination (a TDMA sketch follows this slide)
4. Get new values and update the F(I,J,NF) array
5. Do the other equations
6. Iteration ends
End the iteration loop when convergence is reached.
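A minimal sketch of the TDMA (Thomas algorithm) used for tridiagonal line solves; the subroutine interface and array names are illustrative, not taken from the actual flame code.

```fortran
! Thomas algorithm (TDMA) for a tridiagonal system:
!   a(i)*x(i-1) + b(i)*x(i) + c(i)*x(i+1) = d(i),  i = 1..n
! with a(1) = 0 and c(n) = 0.  c and d are overwritten as scratch space.
subroutine tdma(n, a, b, c, d, x)
   implicit none
   integer, intent(in)    :: n
   real(8), intent(in)    :: a(n), b(n)
   real(8), intent(inout) :: c(n), d(n)
   real(8), intent(out)   :: x(n)
   integer :: i
   real(8) :: m

   ! forward elimination
   c(1) = c(1) / b(1)
   d(1) = d(1) / b(1)
   do i = 2, n
      m    = b(i) - a(i) * c(i-1)
      c(i) = c(i) / m
      d(i) = (d(i) - a(i) * d(i-1)) / m
   end do

   ! back substitution
   x(n) = d(n)
   do i = n - 1, 1, -1
      x(i) = d(i) - c(i) * x(i+1)
   end do
end subroutine tdma
```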
Slide 15: Methodology -- Sequential Code
[Fig. 1: CPU time for each sub-code, summarized after one iteration with radiation included]
The most time-consuming part is the evaluation of the species Jacobian matrix DSDY(K1,K2,I,J). Is there any dependency?
Slide 16: Methodology -- Parallelization
Domain decomposition method (DDM) with Message Passing Interface (MPI) programming.
Six processes are used to decompose the computational domain of 206 x 102 staggered grid points (axes: Z,U axial; R,V radial).
Ghost points are placed at the sub-domain boundaries to reduce communication among processes (a sketch of the ghost-point exchange is shown below).
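A minimal sketch, assumed rather than taken from the original code, of a ghost-point exchange for a one-dimensional decomposition along the axial direction: each process swaps one ghost column with its left and right neighbours using MPI_Sendrecv.

```fortran
! Exchange one ghost column with the left and right neighbour processes.
! phi(nr, 0:nzl+1): locally owned axial stations 1..nzl plus one ghost
! column on each side; left/right may be MPI_PROC_NULL at the domain ends.
subroutine exchange_ghosts(phi, nr, nzl, left, right, comm)
   use mpi
   implicit none
   integer, intent(in)    :: nr, nzl, left, right, comm
   real(8), intent(inout) :: phi(nr, 0:nzl+1)
   integer :: ierr, stat(MPI_STATUS_SIZE)

   ! send last owned column to the right, receive left ghost from the left
   call MPI_Sendrecv(phi(:, nzl), nr, MPI_DOUBLE_PRECISION, right, 0, &
                     phi(:, 0),   nr, MPI_DOUBLE_PRECISION, left,  0, &
                     comm, stat, ierr)

   ! send first owned column to the left, receive right ghost from the right
   call MPI_Sendrecv(phi(:, 1),     nr, MPI_DOUBLE_PRECISION, left,  1, &
                     phi(:, nzl+1), nr, MPI_DOUBLE_PRECISION, right, 1, &
                     comm, stat, ierr)
end subroutine exchange_ghosts
```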
Slide 17: Cluster Information
- Cluster location: icpet.nrc.ca in Ottawa
- 40 nodes connected by Ethernet
- AMD Opteron 250 (2.4 GHz) with 5 GB of memory
- Red Hat Enterprise Linux 4.0
- Batch-queuing system: Sun Grid Engine (SGE)
- Portland Group compilers (v6.2) + MPICH2
[Diagram: node layout, n1-1 through n8-5 arranged in a 5 x 8 grid]
Slide 18: Results -- Speedup
Table 1: CPU time and speedup for 50 iterations with the Appel et al. (2000) mechanism

    Processes       Sequential   4 processes   6 processes   12 processes
    CPU time (s)    51313        15254         10596         5253
    Speedup         1            3.36          4.84          9.77

(1) The speedup is good.
(2) The CPU time for 50 iterations of the original sequential code is 51,313 seconds, i.e. 14.26 hours. Too long!
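For reference, the speedup values in Table 1 follow the usual definition (my notation), shown here worked out for the 12-process case:

```latex
S_p = \frac{T_{\mathrm{sequential}}}{T_p}, \qquad
S_{12} = \frac{51313\ \mathrm{s}}{5253\ \mathrm{s}} \approx 9.77
```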
Slide 19: Results -- Speedup
[Fig. 3: Speedup obtained with different numbers of processes]
Slide 20: Results -- Real Application
Flame field calculation using the parallel code with the Appel et al. (2000) mechanism:
[Figures: temperature field (K), OH field (mole fraction), benzene field (mole fraction), pyrene field (mole fraction)]
The trend is well predicted!
Slide 21: Conclusion
- The sequential flame code has been parallelized with DDM.
- The speedup is good.
- The parallel code has been applied to model a flame using a detailed mechanism.
- Flexibility is good: the geometry and/or the number of processors can be changed easily.
Slide 22: Future Improvement
- Optimized DDM
- Species line solver
Slide 23: Work in Progress
- Fixed sectional soot model, which adds 70 equations to the original system of equations
Slide 24: Experience
- Keep communication down
- Choose the parallelization method wisely
- Debugging is hard
- I/O
Slide 25: Thanks
Questions?