Unstructured Mesh Discretizations and Solvers for Computational Aerodynamics Dimitri J. Mavriplis University of Wyoming
Overview –Discretization Issues Cell versus Vertex Based Grid Alignment Problems Reconstruction for 2nd-order accuracy –Gradient reconstruction issues –Artificial Dissipation –Limiters Viscous Discretizations –Choice of element type –Full NS terms Grid resolution issues –Solvers and Scalability –Conclusions
Cell Centered vs Vertex-Based Tetrahedral Mesh contains 5 to 6 times more cells than vertices –Hexahedral meshes contain same number of cells and vertices (excluding boundary effects) –Prismatic meshes: cells = 2X vertices Tetrahedral cells : 4 neighbors Vertices: 14 neighbors on average
Cell Centered vs Vertex-Based On given mesh: –Cell centered discretization: Higher accuracy –Vertex discretization: Lower cost Equivalent Accuracy-Cost Comparisons Difficult Often based on equivalent numbers of surface unknowns (2:1 for tet meshes) –Levy (1999) –Yields advantage for vertex-discretization
Example: DLR-F4 Wing-body (AIAA Drag Prediction Workshop) Grid Characteristics
                          Vertex-Based Grid   Cell-Based Grid   Cell-Based Grid (wall functions)
Boundary Vertices                    48,339            23,290                             25,175
Boundary Triangles                   96,674            46,576                             50,346
Total Points                      1,647,810           470,427                            414,347
Total Cells                       9,686,802         2,743,386                          2,390,089
Cells in Viscous Layer            6,495,828         2,208,260                          1,281,854
DLRF4-F6 Test Cases (DPW) Wing-Body Configuration Transonic Flow Mach=0.75, Incidence = 0 degrees, Reynolds number=3,000,000
Illustrative Example: DLR-F4 NSU3D: vertex-based discretization –Grid : 48K boundary pts, 1.65M pts (9.6M cells) USM3D: cell-centered discretization –Grid : 50K boundary cells, 2.4M cells (414K pts) –Uses wall functions NSU3D: on cell centered type grid –Grid: 46K boundary cells, 2.7M cells (470K pts)
Cell versus Vertex Discretizations Similar Lift for both codes on cell-centered grid Baseline NSU3D (finer vertex grid) has lower lift
Cell versus Vertex Discretizations Pressure drag –Wall treatment discrepancies NSU3D : cell centered grid –High drag, (10 to 20 counts) –Grid too coarse for NSU3D –Inexpensive computation USM3D on cell-centered grid closer to NSU3D on vertex grid Vertex based: more efficient for given accuracy Cell-Centered: reduced grid generation requirements
Vertex vs. Cell-Centered Discretizations For tetrahedral mesh : N vertices 6N cells 7N edges 12N Faces (triangles) –Cell centered approach has 6 times more d.o.f. on same grid as vertex scheme –But vertex scheme on 6 times finer grid is more accurate than cell-centered scheme Vertex scheme has 7 fluxes per cv Cell centered scheme has 12/6=2 fluxes per cv –Differences less pronounced on mixed element grids AIAA Drag Prediction Workshop Practice: –“Equivalent” vertex grid ~ 3 times finer than cell centered grid (not 6 times) Computational overheads favor vertex approach (in our opinion) Cell centered schemes have other advantages –Easier grid generation and file transfer/archiving Longer term objective: –Single code can be run cell or vertex centered (graph based)
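The counting argument on this slide can be checked with a few lines of arithmetic. The sketch below uses only the ratios quoted above (N vertices, 6N cells, 7N edges, 12N faces, boundary effects ignored); the function names are illustrative and not taken from any solver.

```python
def tet_mesh_counts(n_vertices):
    """Approximate entity counts for a tetrahedral mesh with N vertices,
    using the ratios quoted on the slide (boundary effects ignored)."""
    return {
        "vertices": n_vertices,
        "cells": 6 * n_vertices,
        "edges": 7 * n_vertices,
        "faces": 12 * n_vertices,
    }


def fluxes_per_control_volume(counts):
    """Flux evaluations per control volume, counted as on the slide:
    one per edge for the vertex scheme, one per face for the
    cell-centered scheme."""
    vertex_scheme = counts["edges"] / counts["vertices"]  # 7N / N
    cell_scheme = counts["faces"] / counts["cells"]       # 12N / 6N
    return vertex_scheme, cell_scheme


counts = tet_mesh_counts(1_000_000)
vertex_flux, cell_flux = fluxes_per_control_volume(counts)
print(vertex_flux, cell_flux)  # 7.0 2.0
```

On the same grid the cell-centered scheme carries six times as many unknowns but evaluates fewer fluxes per unknown, which is why a like-for-like cost comparison needs the "equivalent grid" convention discussed above.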
Boundary Conditions Element of boundary is a face (not a vertex) Unambiguous BC prescription requires face based implementation –Weak form for vertex discretizations
Grid Resolution and Discretization Issues –Choice of discretization and effect of dissipation (intricately linked) Cells versus points Discretization formulations –Grid resolution requirements Choice of element type Grid resolution issues –Grid convergence
Spatial Discretization Mixed Element Meshes –Tetrahedra, Prisms, Pyramids, Hexahedra Control Volume Based on Median Duals –Fluxes based on edges –Single edge-based data-structure represents all element types $F_{ik} = \tfrac{1}{2}\left[F(u_L) + F(u_R) + T\,|\Lambda|\,T^{-1}(u_L - u_R)\right]$ –Upwind discretization –Matrix artificial dissipation
Mixed-Element Discretizations Edge-based data structure –Building block for all element types –Reduces memory requirements –Minimizes indirect addressing / gather- scatter –Graph of grid = Discretization stencil Implications for solvers, Partitioners
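A minimal sketch of what a single edge-based loop looks like, assuming a scalar unknown per vertex and a scalar linear-advection flux as a stand-in for the Euler flux. The array layout (`edges`, `normals`) and the function names are assumptions made for illustration, not the NSU3D data structure, and boundary fluxes are omitted.

```python
import numpy as np


def upwind_flux(u_left, u_right, normal, speed=1.0):
    """Scalar analogue of a central flux plus dissipation:
    0.5 * a_n * (uL + uR) + 0.5 * |a_n| * (uL - uR)."""
    a_n = speed * normal
    return 0.5 * a_n * (u_left + u_right) + 0.5 * np.abs(a_n) * (u_left - u_right)


def edge_based_residual(edges, normals, u, numerical_flux):
    """Accumulate interior fluxes with one pass over the edge list.

    edges   : (E, 2) integer array of vertex pairs (i, k)
    normals : (E,) signed dual-face areas, oriented from i to k
    u       : (V,) vertex values
    """
    i, k = edges[:, 0], edges[:, 1]
    flux = numerical_flux(u[i], u[k], normals)  # gather through the edge list
    residual = np.zeros_like(u)
    np.add.at(residual, i, flux)    # flux leaves control volume i
    np.add.at(residual, k, -flux)   # and enters control volume k
    return residual


# Four vertices on a line, three edges, unit dual-face areas.
edges = np.array([[0, 1], [1, 2], [2, 3]])
normals = np.ones(3)
u = np.array([1.0, 2.0, 4.0, 8.0])
residual = edge_based_residual(edges, normals, u, upwind_flux)
```

Because every flux is added to one vertex and subtracted from the other, the interior contributions sum to zero, and the same loop serves any element type once the mesh has been reduced to its edge list.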
Alignment Problems Structured grids can be aligned with flow features Specific examples of unstructured grid alignment –Prismatic layers in boundary layer –More difficult for shocks/shear layers Mesh generation based on structured mesh methods Quad/hex/prism element types Mesh adaptation through point movement –Possibly adjoint based
Adjoint Driven Shock Fitting Optimization of Mesh Based on Minimizing Error from “Exact Solution”
Upwind Discretization First order scheme Second order scheme Gradients evaluated at vertices by Least-Squares Limit Gradients for Strong Shock Capturing
Matrix Artificial Dissipation First order scheme Second order scheme By analogy with upwind scheme: Blending of 1st- and 2nd-order schemes for strong shock capturing
Entropy Fix Λ matrix: diagonal with eigenvalues: u, u, u, u+c, u-c Robustness issues related to vanishing eigenvalues Limit smallest eigenvalues as fraction δ of largest eigenvalue: |u| + c –u = sign(u) * max(|u|, δ(|u|+c)) –u+c = sign(u+c) * max(|u+c|, δ(|u|+c)) –u-c = sign(u-c) * max(|u-c|, δ(|u|+c))
Entropy Fix –u = sign(u) * max(|u|, δ(|u|+c)) –u+c = sign(u+c) * max(|u+c|, δ(|u|+c)) –u-c = sign(u-c) * max(|u-c|, δ(|u|+c)) δ = 0.1 : typical value for enhanced robustness δ = 1.0 : Scalar dissipation: |Λ| becomes scaled identity matrix –T|Λ|T^-1 becomes scalar quantity –Simplified (lower cost) dissipation operator Applicable to upwind and art. dissipation schemes
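The eigenvalue limiting can be sketched as below. The limiting fraction is written `delta` here; the symbol itself did not survive in the slide text, so that name is an assumption. The sketch also treats sign(0) as +1 so that an exactly vanishing eigenvalue is still lifted to the floor, which is an implementation choice and not something the slide states.

```python
import numpy as np


def entropy_fixed_eigenvalues(u, c, delta):
    """Limit each eigenvalue of (u, u, u, u+c, u-c) to at least
    delta * (|u| + c) in magnitude, keeping its sign."""
    lam = np.array([u, u, u, u + c, u - c], dtype=float)
    floor = delta * (abs(u) + c)
    sign = np.where(lam >= 0.0, 1.0, -1.0)  # assumption: sign(0) taken as +1
    return sign * np.maximum(np.abs(lam), floor)


# Stagnation-like state: the convective eigenvalues vanish without the fix.
robust = entropy_fixed_eigenvalues(u=0.0, c=1.0, delta=0.1)

# delta = 1.0: every eigenvalue has magnitude |u| + c (scalar dissipation).
scalar = entropy_fixed_eigenvalues(u=0.5, c=1.0, delta=1.0)
```

With `delta = 1.0` all magnitudes coincide, so |Λ| is a multiple of the identity and the T|Λ|T^-1 product no longer needs to be formed.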
Discretization Formulations Examine effect of discretization type and parameter variations on drag prediction Effect on drag polars for DLR-F4: –Matrix artificial dissipation Dissipation levels Entropy fix Low order blending –Upwind schemes Gradient reconstruction Entropy fix Limiters
Effect of Artificial Dissipation Level Increased accuracy through lower dissipation coef. Potential loss of robustness
Effect of Entropy Fix for Artificial Dissipation Scheme Insensitive to small values of the entropy-fix parameter δ High drag values for large δ and scalar scheme
Effect of Low-Order Dissipation Blending for Shock Capturing Lift and drag relatively insensitive Generally not recommended for transonics
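Effect of Artificial Dissipation Discretization [Table of C_L and C_D by dissipation setting; the numerical entries and coarser mesh sizes did not survive extraction] Fine Mesh (13M pts) Rows list two dissipation coefficients (subscripts 1 and 2) and the entropy-fix parameter: (0.0, 1.0), (0.0, 0.5), (0.0, 1.0), (0.0, 1.0), (1.0, 1.0)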
Comparison of Discretization Formulation (Art. Dissip vs. Grad. Rec.) Least squares approach slightly more diffusive (?) Extremely sensitive to entropy fix value –Unweighted LS Gradient extremely inaccurate in BL region AIAA Paper : Revisiting the LS Gradient
Unweighted Least-Squares Gradient Accuracy compromised in regions of high-stretching and moderate curvature Distance Weighting cures problem but lacks robustness
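The failure mode can be reproduced on a small hand-built stencil. The sketch below assumes a vertex on a unit-radius curved wall, neighbours spaced 0.1 rad along the wall and 1e-4 normal to it (aspect ratio about 1000), and the wall distance r - 1 as the field, whose exact gradient at the vertex is (0, 1). The stencil, the test field and the inverse-distance weight are all illustrative assumptions.

```python
import numpy as np


def ls_gradient(x0, u0, xn, un, weighted=False):
    """Least-squares gradient at x0 from neighbour positions xn and values un.
    With weighted=True each equation is scaled by 1 / distance."""
    dx = xn - x0
    du = un - u0
    w = 1.0 / np.linalg.norm(dx, axis=1) if weighted else np.ones(len(dx))
    grad, *_ = np.linalg.lstsq(dx * w[:, None], du * w, rcond=None)
    return grad


def wall_distance(x):
    """Distance from a unit-radius wall centred at the origin."""
    return np.linalg.norm(x, axis=-1) - 1.0


h, dtheta = 1.0e-4, 0.1  # normal spacing and angular spacing along the wall
x0 = np.array([0.0, 1.0])
offsets = [(0.0, -dtheta), (0.0, dtheta), (h, 0.0), (h, -dtheta), (h, dtheta)]
xn = np.array([[(1.0 + dr) * np.sin(t), (1.0 + dr) * np.cos(t)]
               for dr, t in offsets])

grad_unweighted = ls_gradient(x0, wall_distance(x0), xn, wall_distance(xn))
grad_weighted = ls_gradient(x0, wall_distance(x0), xn, wall_distance(xn),
                            weighted=True)
print(grad_unweighted, grad_weighted)
```

In the unweighted fit the distant wall neighbours, displaced slightly in the normal direction by curvature, dominate the normal equations and swamp the one close neighbour that actually carries the normal derivative. Inverse-distance weighting restores that neighbour's influence, consistent with the slide's remark that weighting cures the accuracy problem.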
Effect of Limiters on Upwind Discretization Limiters reduce accuracy, increase robustness Less sensitive to non-monotone limiters
Effect of Discretization Type [Table of C_L and C_D by discretization; the numerical entries did not survive extraction] Fine Mesh (13M pts) Rows: Baseline matrix dissipation; Least squares, limiter OFF (two entropy-fix values); Least squares, limiter ON
Effect of Element Type Right angle tetrahedra produced in boundary layer regions –Highly stretched elements for efficiency –Non obtuse angle requirement for accuracy Diagonal edge has face not aligned with flow features –Problematic ? De-emphasize diagonal edges: containment dual cv
Effect of Element Type Alternate strategy: Remove culprit edges –Mesh generation task Semi-structured tetrahedra combinable into prisms Prism elements of lower complexity (fewer edges) No significant accuracy benefit (Aftosmis et al. in 2D)
Effect of Element Type in BL Region Little overall effect on accuracy Potential differences between two codes –Further grid refinement shows increased discrepancies (Lee-Rausch et al., 2003, 2004)
Grid Convergence Study (DLR-F4)
Viscous Term Formulation Vertex-based: Linear Galerkin Finite Elements –Extra stencil pts on hybrid elements –Edge data-structure insufficient –Exact Jacobian construction
Viscous Term Formulation Gradients of Gradients: –Extended stencil (neighbors of neighbors) –Odd-even decoupling (stencil 2h) Multi-dimensional thin layer –Laplacian of velocity: Incompressible NS Inconsistent Laplacian on edge-data-structure –Consistent for orthogonal prismatic BL grids Hybrid approach: –Laplacian on edges –Gradients of gradients for remaining terms Relieves odd-even decoupling problem Retains extended stencil (inexact Jacobian)
Sensitivity to Navier-Stokes Terms DPW2 Wing-Body –Mach=0.75, Incidence=0 degrees, Re=3 million –Regions of separated flow –Differences not significant
Grid Resolution Issues Possibly greatest impediment to reliable RANS drag prediction Promise of adaptive meshing held back by development of adequate error estimators Unstructured mesh requirement similar to structured mesh requirements –200 to 500 vertices chordwise (cruise) –Lower optimal spanwise resolution –Y + of order 1 required in BL
Effect of Normal Spacing in BL Inadequate resolution under-predicts skin friction Direct influence on drag prediction
Effect of Normal Resolution for High-Lift (c/o Anderson et al., AIAA J. Aircraft, 1995) Indirect influence on drag prediction Easily mistaken for poor flow physics modeling
DPW3 Wing1-Wing2 Cases
W1-W2 Grid Convergence Study Apparently uniform grid convergence
W1-W2 Grid Convergence Study Good grid convergence of individual drag component
W1-W2 Results Discrepancy between UW and Cessna Results Importance of consistent family of grids
W1-W2 Results Removing effect of lift-induced drag : Results on both grid families converge consistently
DPW2/3 Configurations Up to 72M point meshes
Sensitivity to Dissipation Levels Drag is grid converging Sensitivity to dissipation decreases as expected
65M pt mesh Results 10% drop in C_L at AoA = 0°: closer to experiment Drop in C_D: further from experiment Same trends at Mach=0.3 Little sensitivity to dissipation
Grid Specifications 65 million pt grid 72 million pt grid
Grid Convergence Grid convergence apparent using self-similar family of grids Large discrepancies possible across grid families –Sensitive areas Separation, Trailing edge Pathological cases ? Would grid families converge to same result limit of infinite resolution ? –i.e. Do we have consistency ? –Due to element types ?
Structured vs Unstructured Drag Prediction (AIAA workshop results) Similar predictive ability for both approaches –More scatter for structured methods –More submissions/variations for structured methods
Grid Convergence – All Solutions (Tinoco, 3rd CFD Drag Prediction Workshop, San Francisco, California, June 2006)
Unstructured vs Structured (Transonics) Considerable scatter in both cases No clear advantage of one method over the other in terms of accuracy DPW3 Observation: –Core set of codes which: Agree remarkably well with each other Span all types of grids –Structured, Overset, Unstructured Have been developed and used extensively for transonic aerodynamics
Solution Methodologies Explicit no longer acceptable (40M pt grids) Implicit –Locally or globally Multigrid –Linear or non-linear (FAS) Preconditioned Newton-Krylov –Preconditioners are key Any of above iterative methods Matrix based (ILU)
Solution Methodology To solve R(w) = 0 (steady or unsteady residual) –Newton’s method: –Requires storage/inversion of Jacobian (too big for 2nd-order scheme) –Replace with 1st-order Jacobian Stored as block Diagonals [D] (for each vertex) and off-diagonals [O] (2 for each edge) –Use block Jacobi or Gauss-Seidel to invert Jacobian at each Newton iteration using subiteration k:
Solution Methodology Corresponds to linear Jacobi/Gauss-Seidel in many unstructured mesh solvers Alternately, replace Jacobian simply by [D] (i.e. drop [O] terms) (Point implicit) –Non-linear residual must now be updated at every iteration (no subiterations) –Corresponds to non-linear Jacobi/Gauss-Seidel
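The two variants can be contrasted on a small model problem. The sketch below assumes the usual forms: linear Jacobi subiterations [D] Δw^{k+1} = -R(w) - [O] Δw^k with R(w) frozen, and a point-implicit update [D] Δw = -R(w) with R re-evaluated every step. Scalar entries stand in for the 5x5 blocks, and the tridiagonal test matrix and all names are illustrative assumptions.

```python
import numpy as np


def linear_jacobi_step(diag, offdiag, residual, w, n_sub):
    """One outer step: freeze R(w), then run n_sub Jacobi subiterations
    on (D + O) dw = -R(w)."""
    r = residual(w)  # residual evaluated once
    dw = np.zeros_like(w)
    for _ in range(n_sub):
        dw = (-r - offdiag @ dw) / diag
    return w + dw


def point_implicit_step(diag, residual, w):
    """Drop the off-diagonal terms: D dw = -R(w), residual re-evaluated each call."""
    return w - residual(w) / diag


# Model problem: R(w) = A w - b with a diagonally dominant tridiagonal A.
n = 8
A = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)


def residual(w):
    return A @ w - b


diag = np.diag(A).copy()       # [D], stored as a vector of scalar "blocks"
offdiag = A - np.diag(diag)    # [O]

w_linear = linear_jacobi_step(diag, offdiag, residual, np.zeros(n), n_sub=50)

w_point = np.zeros(n)
for _ in range(50):
    w_point = point_implicit_step(diag, residual, w_point)
```

For this linear residual the two iterations generate the same sequence, so they converge at the same rate. The practical difference is what each one has to store and re-evaluate: the linear form keeps [O] in memory, while the point-implicit form recomputes the residual at every step.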
Solution Methodologies In almost all applications, reduced Jacobian is used for linear and/or non-linear solvers –Nearest neighbor stencil –Reduced memory footprint –Can be viewed as: Defect correction scheme Preconditioning strategy –Preconditioner = 1st-order Jacobian 1st-order multigrid coarse level discretizations –Inherent limit on convergence efficiency
Non-Linear vs Linear Solvers Expense of non-linear solver dominated by residual evaluation (non-linear term) Expense of linear solver determined only by stencil topology (once Jacobian has been constructed) Memory requirements of linear solver can be considerably higher –1st-order Jacobian: ~350 words per vertex –Point implicit: 25 words per vertex
Non-Linear vs Linear Solvers In most cases: –Linear solver faster per iteration –Non-linear solver requires much less memory In asymptotic limit, both deliver identical convergence rates Linear solver can be less robust at startup Prefer : –Non-linear solver for steady-state –Linear solver for unsteady time-implicit problems
Preconditioned AMG Solver Point or line-implicit solver –Reduces stiffness due to anisotropy –Manageable memory overhead (in non-linear form) Agglomeration multigrid –Convergence rate independent of grid resolution (approximately) Can be implemented as linear or non-linear solver or preconditioner for GMRES
Method of Solution Line-implicit solver Strong coupling
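Along each implicit line the coupled unknowns form a tridiagonal system, which can be solved directly. The sketch below is the scalar Thomas algorithm; a block version would replace the divisions with small dense solves. The array convention (all arrays of length n, with `lower[0]` and `upper[n-1]` unused) is an assumption of this example.

```python
import numpy as np


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system along one line.

    lower[j] couples unknown j to j-1 (lower[0] is ignored),
    upper[j] couples unknown j to j+1 (upper[n-1] is ignored).
    No pivoting: assumes a diagonally dominant system.
    """
    n = len(diag)
    c = np.zeros(n)  # modified upper diagonal
    d = np.zeros(n)  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for j in range(1, n):  # forward elimination
        denom = diag[j] - lower[j] * c[j - 1]
        c[j] = upper[j] / denom
        d[j] = (rhs[j] - lower[j] * d[j - 1]) / denom
    x = np.zeros(n)
    x[-1] = d[-1]
    for j in range(n - 2, -1, -1):  # back substitution
        x[j] = d[j] - c[j] * x[j + 1]
    return x


n = 6
lower = np.full(n, -1.0)
diag = np.full(n, 2.5)
upper = np.full(n, -1.0)
rhs = np.arange(1.0, n + 1.0)
x = solve_tridiagonal(lower, diag, upper, rhs)
```

The forward and backward sweeps are sequential along the line, which is why the partitioner has to keep each line inside a single partition.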
Agglomeration Multigrid Agglomeration Multigrid solvers for unstructured meshes –Coarse level meshes constructed by agglomerating fine grid cells/equations
Agglomeration Multigrid Automated Graph-Based Coarsening Algorithm Coarse Levels are Graphs Coarse Level Operator by Galerkin Projection Grid independent convergence rates (order of magnitude improvement)
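A heavily simplified sketch of graph-based agglomeration: each unassigned seed vertex absorbs its unassigned neighbours, and the coarse level is again a graph whose edges join agglomerates that shared a fine-level edge. Production coarsening algorithms add seed ordering, size control and clean-up of singletons; none of that is attempted here, and the function names are illustrative.

```python
def agglomerate(n_vertices, edges):
    """Greedy agglomeration: visit vertices in index order; every still
    unassigned vertex becomes a seed and absorbs its unassigned neighbours."""
    neighbours = [[] for _ in range(n_vertices)]
    for i, k in edges:
        neighbours[i].append(k)
        neighbours[k].append(i)

    agglomerate_of = [-1] * n_vertices
    n_agglomerates = 0
    for seed in range(n_vertices):
        if agglomerate_of[seed] != -1:
            continue
        agglomerate_of[seed] = n_agglomerates
        for nb in neighbours[seed]:
            if agglomerate_of[nb] == -1:
                agglomerate_of[nb] = n_agglomerates
        n_agglomerates += 1
    return agglomerate_of, n_agglomerates


def coarse_graph(agglomerate_of, edges):
    """Coarse-level edges: pairs of distinct agglomerates joined by a fine edge."""
    coarse = set()
    for i, k in edges:
        a, b = agglomerate_of[i], agglomerate_of[k]
        if a != b:
            coarse.add((min(a, b), max(a, b)))
    return sorted(coarse)


# Six vertices on a line.
fine_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
agg, n_agg = agglomerate(6, fine_edges)
coarse_edges = coarse_graph(agg, fine_edges)
print(agg, coarse_edges)  # [0, 0, 1, 1, 2, 2] [(0, 1), (1, 2)]
```

Because the output is itself a vertex and edge list, the same routine can be applied recursively to build the full sequence of coarse levels.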
Parallelization through Domain Decomposition Intersected edges resolved by ghost vertices Generates communication between original and ghost vertex –Handled using MPI and/or OpenMP (Hybrid implementation) –Local reordering within partition for cache-locality Multigrid levels partitioned independently –Match levels using greedy algorithm –Optimize intra-grid communication vs inter-grid communication
Partitioning (Block) Tridiagonal Lines solver inherently sequential Contract graph along implicit lines Weight edges and vertices Partition contracted graph Decontract graph –Guaranteed lines never broken –Possible small increase in imbalance/cut edges
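The contract, partition, decontract sequence can be sketched as follows. Each implicit line collapses to one weighted super-vertex, parallel edges between super-vertices are merged with their multiplicity as the edge weight, and the partition of the contracted graph is copied back to the original vertices. The partitioner itself is outside this sketch: the coarse partition is assigned by hand here, where a weighted graph partitioner would normally be called.

```python
def contract_lines(n_vertices, edges, lines):
    """Collapse each implicit line into a single weighted super-vertex.

    Returns (super_of, vertex_weights, edge_weights) where super_of[v] is the
    super-vertex holding v, vertex_weights[s] counts its members, and
    edge_weights[(a, b)] counts fine edges joining super-vertices a and b.
    """
    super_of = [-1] * n_vertices
    vertex_weights = []
    for line in lines:
        for v in line:
            super_of[v] = len(vertex_weights)
        vertex_weights.append(len(line))
    for v in range(n_vertices):  # vertices that lie on no line
        if super_of[v] == -1:
            super_of[v] = len(vertex_weights)
            vertex_weights.append(1)

    edge_weights = {}
    for i, k in edges:
        a, b = super_of[i], super_of[k]
        if a != b:
            key = (min(a, b), max(a, b))
            edge_weights[key] = edge_weights.get(key, 0) + 1
    return super_of, vertex_weights, edge_weights


def decontract(super_of, super_partition):
    """Copy the partition of each super-vertex back to its member vertices."""
    return [super_partition[s] for s in super_of]


# Two three-vertex lines side by side plus one isolated vertex.
lines = [[0, 1, 2], [3, 4, 5]]
edges = [(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (1, 4), (2, 5), (5, 6)]
super_of, vertex_weights, edge_weights = contract_lines(7, edges, lines)

super_partition = [0, 1, 1]  # hand-assigned stand-in for a partitioner's output
partition = decontract(super_of, super_partition)
```

Since all vertices of a line share one super-vertex, they necessarily receive the same partition number, which is the "lines never broken" guarantee on the slide.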
Partitioning Example 32-way partition of 30,562 point 2D grid Unweighted partition: 2.6% edges cut, 2.7% lines cut Weighted partition: 3.2% edges cut, 0% lines cut
Line Solver Multigrid Convergence Line solver convergence insensitive to grid stretching Multigrid convergence insensitive to grid resolution
(Multigrid) Preconditioned Newton Krylov Mesh independent property of Multigrid GMRES effective (in asymptotic range) but requires extra memory
Scalability Near ideal speedup for 72M pt grid on 2008 cpus of NASA Columbia Machine –Homogeneous Data-Structure –Near perfect load balancing –Near Optimal Partitioners
Conclusions For transonics –Equivalent accuracy on equivalent grids –Equivalent or superior solution technology –Superior scalability Indirect addressing and memory overheads Problems remain particularly for high-speed flows –Alignment –Robustness/Accuracy –Gradient reconstruction –Viscous terms and element type –Grid Convergence and consistency questions
Discretization Governing Equations: Reynolds Averaged Navier-Stokes Equations –Conservation of Mass, Momentum and Energy –Single Equation turbulence model (Spalart-Allmaras) Convection-Diffusion – Production Vertex-Based Discretization –2nd-order upwind finite-volume scheme –6 variables per grid point –Flow equations fully coupled (5x5) –Turbulence equation uncoupled