
1 Progress in Unstructured Mesh Techniques
Dimitri J. Mavriplis, Department of Mechanical Engineering, University of Wyoming, and Scientific Simulations, Laramie, WY

2 Overview
NSU3D unstructured multigrid Navier-Stokes solver
–2nd-order finite-volume discretization
–Fast steady-state solutions (~100M points in 15 minutes on the NASA Columbia supercomputer)
–Extension to design optimization
–Extension to aeroelasticity
Enabling techniques: accuracy and efficiency
High-order Discontinuous Galerkin methods (longer term)
–High-accuracy discretizations through increased p order
–Fast combined h-p multigrid solver
–Steady state (2-D and 3-D Euler)
–Unsteady time-implicit (2-D Euler)

3 NSU3D Discretization
Vertex-based unstructured meshes
–Finite volume / finite element
Arbitrary elements
–Single edge-based data structure
Central difference with matrix dissipation
Roe solver with MUSCL reconstruction (sketched below)
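The MUSCL reconstruction named above can be illustrated in one dimension. A minimal sketch, assuming an unlimited kappa-scheme on a uniform 1-D grid; the solver's actual multidimensional, limited implementation is not shown here:

```python
import numpy as np

def muscl_states(u, kappa=1.0 / 3.0):
    """Hypothetical 1-D sketch: left/right states at faces i+1/2 from
    cell averages u, using an unlimited kappa-scheme reconstruction."""
    uL = u[1:-2] + 0.25 * ((1.0 - kappa) * (u[1:-2] - u[:-3])
                           + (1.0 + kappa) * (u[2:-1] - u[1:-2]))
    uR = u[2:-1] - 0.25 * ((1.0 + kappa) * (u[2:-1] - u[1:-2])
                           + (1.0 - kappa) * (u[3:] - u[2:-1]))
    return uL, uR  # inputs to the (Roe) Riemann solver at each face
```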

4 NSU3D Spatial Discretization
Mixed-element meshes
–Tetrahedra, prisms, pyramids, hexahedra
Control volumes based on median duals
–Fluxes based on edges
–Single edge-based data structure represents all element types

5 Mixed-Element Discretizations
Edge-based data structure (see the loop sketched below)
–Building block for all element types
–Reduces memory requirements
–Minimizes indirect addressing / gather-scatter
–Graph of grid = discretization stencil: implications for solvers and partitioners
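To make the edge-based structure concrete, here is a minimal sketch of the residual accumulation loop it enables; the array layout and the flux callback are illustrative assumptions, not NSU3D's actual interface:

```python
import numpy as np

def accumulate_residual(edges, normals, u, flux):
    """Hypothetical sketch of the single edge-based residual loop.
    edges   : (nedge, 2) int array of vertex pairs (i, j)
    normals : (nedge, 3) dual-face normals stored per edge
    u       : (nvert, nvar) vertex solution
    flux    : callable(u_i, u_j, n) -> numerical flux through the dual face
    """
    res = np.zeros_like(u)
    for (i, j), n in zip(edges, normals):
        f = flux(u[i], u[j], n)  # one flux per edge, any element type
        res[i] += f              # scatter to both endpoints with
        res[j] -= f              # equal and opposite contributions
    return res
```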

6 NSU3D Convergence Acceleration
Methods for steady-state (and unsteady) problems
Multigrid methods
–Fully automated agglomeration techniques
–Provide convergence rates independent of grid size (usually < 500 MG cycles)
Implicit line solver
–Used on each MG level
–Reduces stiffness due to grid anisotropy in the boundary layer (no wall functions)

7 Multigrid Methods
High-frequency (local) error is rapidly reduced by explicit methods
Low-frequency (global) error converges slowly
On a coarser grid:
–Low-frequency error appears as high-frequency error, so it is again reduced rapidly

8 Multigrid Correction Scheme (Linear Problems)
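The correction scheme for a linear problem A_h u = b can be sketched as a two-level cycle; this is a generic illustration (dense NumPy operators standing in for the agglomerated levels), not the NSU3D implementation:

```python
import numpy as np

def two_level_correction(A_h, A_H, R, P, b, u, smooth, nu=2):
    """Generic two-level multigrid correction scheme for A_h u = b.
    R restricts fine residuals to the coarse level, P prolongates
    coarse corrections back; `smooth` damps high-frequency error."""
    for _ in range(nu):
        u = smooth(A_h, b, u)          # pre-smoothing: kill local error
    r_H = R @ (b - A_h @ u)            # restrict the fine-grid residual
    e_H = np.linalg.solve(A_H, r_H)    # coarse problem: A_H e_H = r_H
    u = u + P @ e_H                    # correct with prolongated error
    for _ in range(nu):
        u = smooth(A_h, b, u)          # post-smoothing
    return u
```

In practice the coarse solve is itself handled recursively on still-coarser agglomerated levels, giving the usual V-cycle.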

9 Coarse Level Construction
Agglomeration multigrid solvers for unstructured meshes
–Coarse-level meshes constructed by agglomerating fine-grid cells/equations

10 Anisotropy-Induced Stiffness
Convergence rates for RANS (viscous) problems are much slower than for inviscid flows
–Mainly due to grid stretching
–Thin boundary-layer and wake regions
–Mixed-element (prism-tet) grids
Use a directional solver to relieve stiffness
–Line solver in anisotropic regions

11 Directional Solver for Navier-Stokes Problems
Line solvers for anisotropic problems
–Lines constructed in the mesh using a weighted-graph algorithm
–Strong connections assigned large graph weights
–(Block) tridiagonal line solver, similar to that used on structured grids (scalar sketch below)
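The tridiagonal line solve reduces, in the scalar case, to the Thomas algorithm; a sketch under that simplification (the solver's actual version is block-structured, one block per vertex on the line):

```python
import numpy as np

def thomas(a, b, c, d):
    """Scalar Thomas algorithm for a tridiagonal system.
    a: sub-diagonal (n-1), b: diagonal (n), c: super-diagonal (n-1), d: rhs (n).
    """
    n = len(b)
    cp, dp = np.empty(n - 1), np.empty(n)
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):                 # forward elimination along the line
        m = b[i] - a[i - 1] * cp[i - 1]
        if i < n - 1:
            cp[i] = c[i] / m
        dp[i] = (d[i] - a[i - 1] * dp[i - 1]) / m
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):        # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```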

12 Multigrid Line Solver Convergence
DLR-F4 wing-body, Mach=0.75, incidence=1°, Re=3M
–Baseline mesh: 1.65M points

13 Parallelization through Domain Decomposition
Intersected edges resolved by ghost vertices
Generates communication between original and ghost vertices (sketched below)
–Handled using MPI and/or OpenMP (hybrid implementation)
–Local reordering within each partition for cache locality
Multigrid levels partitioned independently
–Match levels using a greedy algorithm
–Trade off intra-grid communication vs. inter-grid communication
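A hedged sketch of the ghost-vertex update described above, using mpi4py; the send/receive index lists are assumed to have been built at partitioning time and are not part of the original talk:

```python
from mpi4py import MPI
import numpy as np

def exchange_ghosts(comm, u, send_ids, recv_ids, neighbors):
    """Hypothetical ghost-vertex exchange.
    send_ids[p]: local vertices whose values neighbor partition p needs
    recv_ids[p]: local ghost vertices filled with neighbor p's values
    """
    sends, reqs = {}, []
    for p in neighbors:
        sends[p] = np.ascontiguousarray(u[send_ids[p]])  # keep buffer alive
        reqs.append(comm.Isend(sends[p], dest=p))
    for p in neighbors:
        buf = np.empty((len(recv_ids[p]),) + u.shape[1:], dtype=u.dtype)
        comm.Recv(buf, source=p)
        u[recv_ids[p]] = buf          # overwrite the ghost copies
    MPI.Request.Waitall(reqs)
```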

14 Partitioning
The (block) tridiagonal line solver is inherently sequential, so:
–Contract the graph along implicit lines
–Weight edges and vertices
–Partition the contracted graph
–Decontract the graph
Guarantees lines are never broken
–Possible small increase in imbalance/cut edges

15 NASA Columbia Supercluster
20 SGI Altix nodes
–512 Itanium2 cpus each
–1 TByte memory each
–1.5 GHz / 1.6 GHz
–Total: 10,240 cpus
3 interconnects
–SGI NUMAlink (shared memory within a node)
–InfiniBand (across nodes)
–10 Gig Ethernet (file I/O)
Subsystems:
–8 nodes: double-density Altix 3700 BX2
–4 nodes: NUMAlink4 interconnect between nodes (BX2 nodes, 1.6 GHz cpus)

16 NSU3D Test Case
Wing-body configuration
72 million grid points
Transonic flow: Mach=0.75, incidence = 0 degrees, Reynolds number = 3,000,000

17 NSU3D Scalability
72M-point grid
–Assume perfect speedup on 128 cpus
Good scalability up to 2008 cpus using NUMAlink
–Superlinear!
Multigrid slowdown due to coarse-grid communication
~3 TFlops on 2008 cpus

18 Single-Grid Performance up to 4016 cpus
1 OpenMP thread per MPI process possible for InfiniBand on 2008 cpus (8 hosts)
2 OpenMP threads required for InfiniBand on 4016 cpus (8 hosts)
Good scalability up to 4016 cpus
5.2 TFlops at 4016 cpus
First real-world application on Columbia using > 2048 cpus

19 Unstructured NS Solver / NASA Columbia Supercomputer
~100M-point solutions in 15 minutes
10^9-point solutions can become routine
–Ease other bottlenecks (I/O for 10^9 points = 400 GB)
High-resolution MDO
High-resolution aeroelasticity

20 Enabling Techniques: Design Optimization
–Robust mesh deformation (linear elasticity)
–Discrete adjoint for the flow equations
–Discrete adjoint for the mesh motion equations: mesh sensitivities (Park and Nielsen)
–Line-implicit agglomeration multigrid solver for the flow, flow adjoint, mesh motion, and mesh adjoint problems
–Duality-preserving formulation
Adjoint discretization requires almost no additional memory over the first-order Jacobian used for the implicit solver
Modular (subroutine) construction for adjoint and mesh sensitivities
–dR/dx = dR/d(edge) · d(edge)/dx
Similar convergence rates for tangent and adjoint problems (see the sensitivity chain below)
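In symbols, the discrete-adjoint sensitivity chain in the bullets above can be written as follows; this is the standard formulation consistent with the slide, with the notation (L, u, x, D, Λ) chosen here, not taken from the talk:

```latex
% Assumed notation: L objective, u flow state, R(u,x)=0 flow residual,
% x mesh coordinates, D design variables, Lambda the flow adjoint.
\frac{dL}{dD}
  = \left(\frac{\partial L}{\partial x}
  + \Lambda^{T}\frac{\partial R}{\partial x}\right)\frac{dx}{dD},
\qquad
\left(\frac{\partial R}{\partial u}\right)^{T}\Lambda
  = -\left(\frac{\partial L}{\partial u}\right)^{T},
\qquad
\frac{\partial R}{\partial x}
  = \frac{\partial R}{\partial(\mathrm{edge})}\,
    \frac{\partial(\mathrm{edge})}{\partial x}
```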

21 Enabling Techniques: Aeroelasticity
–Robust mesh deformation (linear elasticity)
–Line-implicit agglomeration multigrid solver for the flow (implicit time step) and mesh motion, with a linear multigrid formulation
–High-order temporal discretization
Backward difference (up to 3rd order)
Implicit Runge-Kutta (up to 4th order)
Formulation of the Geometric Conservation Law for high-order time-stepping
–Necessary for non-linear stability

22 Mesh Motion
Developed for MDO and aeroelasticity problems
Emphasis on robustness
–Spring analogy
–Truss analogy, beam analogy
–Linear elasticity: variable modulus
Emphasis on efficiency
–Edge-based formulation
–Gauss-Seidel line solver with agglomeration multigrid
–Fully integrated into the flow solver

23 Formulation
Mesh motion strategies
–Tension spring analogy: Laplace equation, maximum principle, incapable of reproducing solid-body rotation
–Truss analogy (Farhat et al., 1998)

24 Formulation
Linear elasticity equations (see below)
Prescription of E is very important
–Reproduces solid-body translation/rotation in stiff (large-E) regions
–Prescribe large E in critical regions
–Relegates deformation to less critical regions of the mesh
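The linear elasticity equations referenced above take the standard form below, solved for the mesh displacement field δ; having the variable modulus E enter through the Lamé parameters (with ν held fixed) is the mechanism assumed in this sketch:

```latex
% delta = mesh displacement; lambda, mu = Lame parameters from E and nu.
\nabla\cdot\sigma = 0,\qquad
\sigma = \lambda\,\mathrm{tr}(\epsilon)\,I + 2\mu\,\epsilon,\qquad
\epsilon = \tfrac{1}{2}\left(\nabla\delta + \nabla\delta^{T}\right),
\qquad
\lambda = \frac{E\nu}{(1+\nu)(1-2\nu)},\quad
\mu = \frac{E}{2(1+\nu)}
```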

25 Results and Discussion
Mesh motion strategies for a 2D viscous mesh [figure panels: spring; truss; LE, constant E; LE, variable E]

26 IMPORTANT DIFFERENCES (FUN3D)
Navier equations for displacement
–Derived assuming constant E
–Variations only in the Poisson ratio

27 Method of Solution
Linear elasticity equations can be difficult to solve
Apply the same techniques as for the flow solver
–Linear agglomeration multigrid (LMG) method
–Line-implicit solver
–Using the same line/AMG structures

28 Method of Solution Agglomeration multigrid

29 Method of Solution Line-implicit solver Strong coupling

30 Results and Discussion
Line solver + MG4, first 10 iterations [frames: iter = 0 through iter = 10]
Viscous mesh, linear elasticity with variable E

31 3D Dynamic Meshes (NS mesh) DLR wing-body configuration, 473,025 vertices

32 Results and Discussion
Convergence rates for different iterative methods [panels: 2D viscous mesh, linear elasticity; 3D viscous mesh, linear elasticity]

33 Unsteady Flow Solver Formulation
Flow governing equations in Arbitrary Lagrangian-Eulerian (ALE) form, before and after discretization in space (see below)
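The ALE equations appeared as graphics on the original slide; the standard form for a moving control volume V(t) with boundary velocity ẋ is reconstructed below as a hedged sketch, not a transcription:

```latex
% Integral ALE form over a moving control volume V(t) with boundary
% velocity xdot, and its semi-discrete counterpart.
\frac{d}{dt}\int_{V(t)} u\,dV
  + \oint_{\partial V(t)}\bigl[F(u) - u\,\dot{x}\bigr]\cdot n\,dS = 0,
\qquad
\frac{d\,(V u)}{dt} + R(u,\dot{x}) = 0
```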

34 Unsteady Flow Solver Formulation
Flow governing equations in Arbitrary Lagrangian-Eulerian (ALE) form
GCL: maintain uniform flow exactly in the discrete solution (see below)
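The requirement that uniform flow be maintained exactly reduces, on setting u = const in the ALE form above, to a statement of discrete volume conservation; in continuous form:

```latex
% Geometric Conservation Law: the discrete face velocities must
% reproduce the change in cell volume exactly.
\frac{dV}{dt} = \oint_{\partial V(t)} \dot{x}\cdot n\,dS
```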

35 Implicit Runge-Kutta Schemes
Dahlquist barrier: no A-stable (and hence no L-stable) linear multistep method above order 2 (BDF2)
–BDF3 often works... but...
–For higher order: implicit Runge-Kutta schemes
Backward difference (BDF2, BDF3) and implicit Runge-Kutta (up to 4th order in time) previously compared for unsteady flows on static grids
For moving grids, must obey the Geometric Conservation Law (GCL)
–2nd- and 3rd-order BDF-GCL schemes are relatively straightforward
–How to construct high-order Runge-Kutta GCL schemes?

36 New Approach to GCL
Use APPROXIMATE face velocities, evaluated at the RK quadrature points, to respect the GCL while maintaining design accuracy
–i.e., for low-order schemes: dx/dt ≈ (x^{n+1} − x^n)/Δt
–For high-order RK: solve the system given by the DGCL at each time step

37 2D Example
Periodic pitching NACA0012 (motion exaggerated)
Mach=0.755, AoA = 0.016° ± 2.51°
RK accurate with large time steps

38 2D Pitching Airfoil
Error measured as the RMS difference in all flow variables between the solution integrated from t=0 to t=54 and a reference solution at t=54
–Reference solution: RK64 with 256 time steps/period
Slopes of accuracy curves:
–BDF2: 1.9
–RK64: 3.5

39 3D Example: Twisting ONERA M6 Wing
Mach=0.755, AoA = 0.016° ± 2.51°
Reduced frequency = 0.1628

40 3D Validation (RK64 + GCL)
Twisting ONERA M6 wing
Same error measure as in 2D (reference solution: 128 steps/period)
IRK64 enables huge time steps
Slopes of error curves: BDF2 = 2.0, RK64 = 3.3

41 AGARD WING Aeroelastic Test Case: Modal Analysis [figure: 1st and 2nd modes]

42 AGARD WING Aeroelastic Test Case: Modal Analysis [figure: 3rd and 4th modes]

43 AGARD WING Aeroelastic Test Case
First 4 structural modes
Coarse Euler simulation
–45,000 points, 250K cells
Linear elasticity mesh motion
–Multigrid solver
2nd-order BDF time stepping
–Multigrid solver
Flow/structure solved fully coupled at each implicit time step
2 hours on 1 cpu per analysis run

44 Flutter Boundary Prediction [figure panels: flutter boundary; generalized displacements]

45 Current and Future Work
Investigate benefits of implicit Runge-Kutta
–4th-order temporal accuracy
–Investigate optimal time-step size and convergence criteria
–Develop an automated temporal-error control scheme
Viscous simulations, finer meshes
–5M-point unsteady Navier-Stokes solutions: 2-4 hours on 128 cpus of Columbia
Adjoint for unsteady problems
–Time domain
–Frequency domain

46 Higher-Order Methods Simple asymptotic arguments indicate benefit of higher-order discretizations Most beneficial for: –High accuracy requirements –Smooth functions

47 Motivation
Higher-order methods' successes
–Acoustics
–Large-eddy simulation (structured grids)
–Other areas
High-order methods not demonstrated in:
–Aerodynamics, hydrodynamics
–Unstructured-mesh LES
–Industrial CFD
Cost effectiveness not demonstrated:
–Cost of discretization
–Efficient solution of complex discrete equations

48 Motivation
Discretizations well developed
–Spectral methods, spectral elements
–Streamline Upwind Petrov-Galerkin (SUPG)
–Discontinuous Galerkin
Most implementations employ explicit or semi-implicit time stepping
–e.g., multi-stage Runge-Kutta
Need efficient solvers for:
–Steady-state problems
–Time-implicit problems

49 Multigrid Solver for Euler Equations
Develop efficient O(N) solvers for steady-state and time-implicit high-order spatial discretizations
Discontinuous Galerkin
–Well suited for hyperbolic problems
–Compact element-based stencil
–Use of a Riemann solver at inter-element boundaries
–Reduces to 1st-order finite volume at p=0
Natural extension of FV unstructured mesh techniques
Closely related to spectral element methods

50 Discontinuous Galerkin (DG)
Mass matrix
Spatial (convective, or stiffness) matrix
–Element-based matrix
–Element-boundary (edge) matrix
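The matrix definitions were slide graphics; a hedged sketch of the semi-discrete DG system they enter, consistent with the block form given on slide 51:

```latex
% M = mass matrix; K = spatial matrix split into element (E) and
% element-boundary/edge (F) contributions, per the block form below.
M_{ij}\,\frac{du_j}{dt} + K_{ij}\,u_j = 0,
\qquad K = E + F
```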

51 Steady-State Solver
K_ij u_j = 0 (ignore the mass matrix)
Block form of K_ij:
–E_ij = block diagonals (coupling of all modes within an element)
–F_ij = 3 block off-diagonals (coupling between neighboring elements)
Solve iteratively as: E_ij (u_j^{n+1} − u_j^n) = −K_ij u_j^n

52 Steady-State Solver: Element Jacobi
Solve iteratively as: E_ij (u_j^{n+1} − u_j^n) = −K_ij u_j^n, i.e. u^{n+1} = u^n − E^{-1} K u^n
Obtain E^{-1} by Gaussian elimination (LU decomposition)
–10×10 blocks for p=3 on triangles
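A minimal sketch of the element-Jacobi sweep described above, with the block diagonals factored once up front; the array shapes and the `apply_K` callback are illustrative assumptions:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def element_jacobi(E_blocks, apply_K, u, n_iter=100):
    """Hypothetical element-Jacobi iteration u <- u - E^{-1} K u.
    E_blocks : (nelem, m, m) block diagonals (m = modes x vars per element)
    apply_K  : callable(u) -> K u, the full spatial operator
    u        : (nelem, m) modal solution
    """
    lu = [lu_factor(E) for E in E_blocks]   # factor once, reuse every sweep
    for _ in range(n_iter):
        r = apply_K(u)                      # residual K u on all elements
        for e in range(len(E_blocks)):      # only local m x m back-solves
            u[e] -= lu_solve(lu[e], r[e])
    return u
```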

53 DG for Euler Equations Mach = 0.5 over 10% sin bump Cubic basis functions (p=3), 4406 elements

54 Entropy as a Measure of Error
S → 0 for the exact solution
S is smaller for higher-order accuracy
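One common entropy-error measure consistent with this slide; the talk's exact normalization is not shown, so this is an assumption:

```latex
% S vanishes for the exact (isentropic) inviscid flow.
S = \frac{p/p_\infty}{\left(\rho/\rho_\infty\right)^{\gamma}} - 1
```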

55 Single Grid: Accuracy (P = approximation order, N = number of elements)

56 Element Jacobi Convergence
–p-independent convergence
–h-dependence

57 Improving Convergence
Removing the h-dependence requires implicitness between grid elements
Multigrid methods: use coarser meshes to accelerate the solution on the fine mesh

58 Spectral Multigrid
Form coarse "grids" by reducing the order of approximation on the same grid
–Simple implementation using hierarchical basis functions
When 1st order (p=0) is reached, agglomerate (h-coarsen) the grid levels
Perform element Jacobi on each MG level

59 Hierarchical Basis Functions
Low-order basis functions are a subset of the higher-order basis functions
Low-order expansion (linear in 2D):
–U = a1 φ1 + a2 φ2 + a3 φ3
Higher order (quadratic in 2D):
–U = a1 φ1 + a2 φ2 + a3 φ3 + a4 φ4 + a5 φ5 + a6 φ6
To project a high-order solution onto the low-order space:
–Set a4 = 0, a5 = 0, a6 = 0

60 Hierarchical Basis on Triangles
In barycentric coordinates (λ1, λ2, λ3):
–Linear (p=1): φ1 = λ1, φ2 = λ2, φ3 = λ3
–Quadratic (p=2): adds the edge modes λ1λ2, λ2λ3, λ3λ1
–Cubic (p=3): adds edge modes of the form λiλj(λi − λj) and the interior bubble λ1λ2λ3

61 Spectral Multigrid
Fine/coarse grids contain the same elements
Transfer operators are almost trivial for hierarchical basis functions (see the sketch below)
Restriction: fine to coarse
–Transfer low-order (resolvable) modes to the coarse level exactly
–Omit higher-order modes
Prolongation: coarse to fine
–Transfer low-order modes exactly
–Zero out higher-order modes
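Because the basis is hierarchical, the transfer operators amount to truncating or zero-padding the modal coefficients; a sketch, with the (nelem, nmodes) coefficient layout as an assumption:

```python
import numpy as np

def restrict(u_fine, n_low):
    # Keep only the first n_low hierarchical modes (exact for the
    # resolvable content); higher-order modes are simply omitted.
    return u_fine[:, :n_low].copy()

def prolongate(u_coarse, n_high):
    # Inject the low-order modes exactly and zero the higher-order ones.
    u_fine = np.zeros((u_coarse.shape[0], n_high))
    u_fine[:, :u_coarse.shape[1]] = u_coarse
    return u_fine
```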

62 Element Jacobi Convergence
–p-independent convergence
–h-dependence

63 Multigrid Convergence Nearly h-independent

64 4-Element Airfoil (Euler Solution)

65 4-Element Airfoil (Entropy)

66 Agglomeration Multigrid for p=0

67 Four-Element Airfoil (Inviscid)
Mach=0.3
hp-multigrid
–p = 1…4
–V-cycle(5,0)
–Smoother: EGS
Mesh sizes: N=1539, N=3055, N=5918
AMG: 3, 4, and 5 levels

68 Four-Element Airfoil: p- and h-dependence [panels: N = 3055; P = 4]
Improved convergence for higher orders
Slight h-dependence

69 Four-Element Airfoil: Linear (CGC) hp-Multigrid
N=3055, P=4
Newton scheme: quadratic convergence
–Driven by the linear MG scheme
–Linear hp-multigrid between the non-linear updates
Exit strategy ("k" iterations):
–machine epsilon (non-optimized)
–optimization criterion

70 Linear (CGC) vs. Non-Linear (FAS) hp-Multigrid
FAS: non-linear multigrid; CGC: linear multigrid
Linear MG is most efficient
–Avoids the expense of the non-linear residual (NQ = 16 for p=4; NQ = 25 with over-integration)

71 Preliminary 3D DG Results (steady-state) 3D biconvex airfoil mesh –7,000 tetrahedral elements

72 3D Steady State Euler DG P-multigrid convergence

73 3D Steady-State Euler DG
Curved boundaries under development
[panels: p=1 (2nd order); p=4 (5th order)]

74 Unsteady DG (2D)
Implicit time-stepping motivated by low-reduced-frequency problems and the small explicit time-step restriction of high-order schemes
Balance temporal and spatial discretization errors
–p=3 (4th order in space)
–BDF1, BDF2, IRK4
Runge-Kutta is equivalent to DG in time
Use h-p multigrid to solve the non-linear problem at each implicit time step

75 Unsteady Euler DG
Convection of a vortex
p=3: 4th-order spatial accuracy
BDF1, BDF2, IRK64
Time step = 0.2, cell CFL = 2

76 Unsteady Euler DG
p=3: 4th-order spatial accuracy
BDF1: 1st-order temporal accuracy
Time step = 0.2, cell CFL = 2
10 pMG cycles per time step

77 Unsteady Euler DG
p=3: 4th-order spatial accuracy
BDF2: 2nd-order temporal accuracy
Time step = 0.2, cell CFL = 2
10 pMG cycles per time step

78 Unsteady Euler DG
p=3: 4th-order spatial accuracy
IRK4: 4th-order temporal accuracy
Time step = 0.2, cell CFL = 2
5 pMG cycles/stage, 20 pMG cycles/time step

79 Unsteady Euler DG Vortex convection problem

80 Unsteady Euler DG
IRK4 (4th order) best for high accuracy

81 Future Work
Higher-order schemes are still costly in cpu time compared to 2nd-order schemes
–Will they become viable for industrial calculations?
h-p adaptivity
–Flexible approach to using higher order where beneficial
–Incorporate hp-multigrid with hp-adaptivity
Extend to:
–3D viscous
–Unsteady
–Dynamic meshes

