1
Special Solution Strategies inside a Spectral Element Ocean Model
Mohamed Iskandarani, Rutgers University and University of Miami
Craig C. Douglas, University of Kentucky and Yale University
Gundolf Haase, University of Linz, Austria, and University of Kentucky
2
Outline
– Versions of the Spectral Element Ocean Model (SEOM)
– Description of the layered version
– Solving the Laplacian for spectral elements
  – Schur complement method with a BPS-like preconditioner
  – Sparse approximation matrix and AMG
  – A two-grid method with patch smoothing
– What to do in 3D?
3
North East Pacific Grid
4
4 Way Partitioning of the Grid
5
SEOM Versions and Applications
Single layer
– 1.5 layer (wind circulation/abyssal flow)
– global tides
– estuarine modeling
Multiple layers
– wind-driven circulation, 2-5 layers
3D continuous stratification
– gravitational adjustment
– overflow
– basin circulation
6
Highlights of the Spectral Element Method
– h-p type FEM ($C^0$ continuity)
– Geometric flexibility
– Dense computational kernels (operation count $O(KN^3)$)
– Excellent scalability
– Very low phase errors and numerical dissipation
– CPU intensive
7
Motivation for Layered SEOM
– Mathematically simpler than SEOM-3D
– Computationally simpler and faster
– No cross-isopycnal diffusion
– No pressure gradient errors
– Baroclinic processes possible with 2 layers
– Eddy-resolving simulations can be produced relatively easily and cheaply
8
Layered Model Equations
The momentum and continuity equations of layer $k$ are
$$\frac{\partial \mathbf{u}_k}{\partial t} + f\,\hat{z}\times\mathbf{u}_k
 = -\nabla\Phi_k + \frac{\boldsymbol{\tau}_k - \boldsymbol{\tau}_{k+1}}{h_k}
 + \frac{\nabla\cdot(\nu\,h_k\nabla\mathbf{u}_k)}{h_k},
\qquad
\frac{\partial h_k}{\partial t} + \nabla\cdot(h_k\mathbf{u}_k) = 0.$$
The Montgomery potential is $\Phi_k = \Phi_{k-1} + g_k' z_k$ for $k > 1$, where $\Phi_1 = g\eta_1$ is the barotropic pressure contribution to the Montgomery potential. The thickness anomaly of layer $k$ is $h_k' = \eta_k - \eta_{k+1}$, with $\eta_{N+1} = 0$. The total depth of the fluid is $H = \sum_{k=1}^{N} h_k$. The vertical coordinate of the upper interface of layer $k$ is $z_k = z_{k+1} + h_k$. The stress on layer $k$ is $\boldsymbol{\tau}_k$, where $\boldsymbol{\tau}_1$ is the surface wind stress, $\boldsymbol{\tau}_{N+1}$ the bottom drag, and $\boldsymbol{\tau}_{k+1}$ the interfacial drag.
9
Current Limitations
– Layer thickness must be > 0; entrainment kicks in when $h < h_c$: $h_t + \nabla\cdot(h\mathbf{u}) = w_t$
– Topography confined to the deepest layer
– No thermodynamics
10
Time Discretization
– Third-order Adams–Bashforth (AB3), explicit on all terms except surface gravity waves
– Backward Euler (BE), implicit on surface gravity waves
  – Implicit terms isolated in 2D equations
  – Iterative solution via PCG (sketched below)
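A minimal sketch of this mixed AB3/BE step, assuming a generic slow-tendency callable `slow_rhs` and a linear gravity-wave operator `G`; all names are illustrative, not SEOM's actual interfaces:

```python
# AB3 on the slow terms, backward Euler on the gravity-wave terms,
# with the implicit 2D system solved by (preconditioned) CG.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def ab3_be_step(u, rhs_hist, slow_rhs, G, dt, M_inv=None):
    """One time step; rhs_hist holds the two previous slow tendencies,
    newest first.  The implicit problem is (I - dt*G) u_new = u_star."""
    r = slow_rhs(u)
    # Third-order Adams-Bashforth combination of stored tendencies.
    u_star = u + dt * (23.0 * r - 16.0 * rhs_hist[0] + 5.0 * rhs_hist[1]) / 12.0
    n = u.size
    A = LinearOperator((n, n), matvec=lambda v: v - dt * (G @ v), dtype=u.dtype)
    u_new, info = cg(A, u_star, M=M_inv)   # PCG on the isolated 2D system
    assert info == 0, "CG did not converge"
    return u_new, [r, rhs_hist[0]]         # shift the tendency history
```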
11
Filtering
Each layer has to solve
$$\nabla^2\psi = \tilde{\zeta}, \qquad \nabla^2\chi = \tilde{\delta},$$
where $\tilde{\zeta}$ denotes the filtered vorticity and $\tilde{\delta}$ is the filtered divergence field. The filtering is done by series expansion and the Boyd–Vandeven filter in each spectral element (sketched below). Solve these systems on each of the 5 layers.
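The Vandeven filter of order $p$ can be written through the regularized incomplete beta function, $\sigma(\eta) = 1 - I_\eta(p, p)$. The sketch below applies it to the Legendre modal coefficients of one element; the per-element modal layout and the choice $p = 8$ are assumptions:

```python
# Per-element modal filtering with a Vandeven-type filter.
import numpy as np
from scipy.special import betainc

def vandeven_sigma(eta, p=8):
    """Vandeven filter function on [0, 1]: sigma(0)=1, sigma(1)=0."""
    return 1.0 - betainc(p, p, eta)

def filter_element(a, p=8):
    """Damp the Legendre modal coefficients a_0..a_N of one element."""
    N = len(a) - 1
    eta = np.arange(N + 1) / N           # normalized mode number k/N
    return a * vandeven_sigma(eta, p)
```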
12
Spectral Element
– Gauss–Lobatto discretization
– The element is the support of the inner-node f.e. basis functions
– I: inner nodes
– B: boundary nodes, consisting of E (edge nodes) and V (vertex nodes)
13
SEOM Advantage over FDM
The speedup formula shows that the speedup deteriorates as the second term in the denominator increases. This second term decreases quadratically with the spectral truncation and like the square root of the number of elements in the partition. The formula also shows the distinguishing property of the spectral element method, which gives it its coarse-grained character: the communication cost increases only linearly with the order of the method while the computational cost increases cubically, yielding a quadratic ratio between the two. High-order finite difference methods, by contrast, show a quadratic increase of the communication cost with the order, since the halo of points that must be passed between processors grows with the order.
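The speedup formula itself is not reproduced on the slide; a hedged reconstruction consistent with the statements above (machine constants $a$, $b$; $K$ elements of order $N$ on $P$ processors) is:

```latex
% Compute time scales like K N^3 / P, communication like the perimeter
% of a partition of K/P elements times the order N:
T(P) \approx a\,\frac{K N^{3}}{P} + b\,\sqrt{\frac{K}{P}}\;N
\quad\Longrightarrow\quad
S(P) = \frac{T(1)}{T(P)} \approx
\frac{P}{1 + \dfrac{b}{a}\,\dfrac{\sqrt{P/K}}{N^{2}}}
```

The second term in the denominator then indeed decays quadratically with $N$ and like the square root of the number of elements per partition, $\sqrt{K/P}$.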
14
System of Equations
– Spectral element discretization: solve the system $K\underline{u} = \underline{f}$ ten times per time step (two filtered fields on each of the 5 layers)
– Block structure: $K = \begin{pmatrix} K_{II} & K_{IB} \\ K_{BI} & K_{BB} \end{pmatrix}$
– Note that $K_{II}$ and $K_{BB}$ are symmetric.
15
What's the Problem?
– Symmetric, positive definite matrix, but no M-matrix
– Huge
– Many parallel solvers available, but: memory requirements vs. solution time
16
A. Schur Complement CG
– Solve the Laplacian by Schur complement CG
– The preconditioner adapts w.r.t. the spectral elements
17
Factor the Matrix
Factorization of $K$,
$$K = \begin{pmatrix} K_{II} & K_{IB} \\ K_{BI} & K_{BB} \end{pmatrix}
    = \begin{pmatrix} I & 0 \\ K_{BI}K_{II}^{-1} & I \end{pmatrix}
      \begin{pmatrix} K_{II} & K_{IB} \\ 0 & S \end{pmatrix},$$
results in the Schur complement
$$S = K_{BB} - K_{BI}\,K_{II}^{-1}\,K_{IB}.$$
The matrices $K_{II}^{-1}$, $K_{IB}$, and $S$ are stored.
18
Schur Complement and Basis Transformation
Defining the exact harmonic basis transformation
$$\Phi = \begin{pmatrix} -K_{II}^{-1} K_{IB} \\ I \end{pmatrix},$$
the Schur complement can be reinterpreted as
$$S = \Phi^{T} K\,\Phi,$$
i.e., a Galerkin approach.
19
Schur Complement CG
1.) $w = K_{II}^{-1} f_I$
2.) $g_B = f_B - K_{BI}\,w$
3.) Solve $S\,u_B = g_B$ by CG
4.) $u_I = w - K_{II}^{-1} K_{IB}\,u_B$
(A sketch of these steps follows below.)
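A minimal SciPy sketch of the four steps, assuming the blocks $K_{II}$, $K_{IB}$, $K_{BI}$, $K_{BB}$ are available as assembled sparse matrices; in the model $K_{II}^{-1}$ is applied element-by-element instead of via a global factorization:

```python
# Schur-complement CG: condense to the boundary nodes, solve there,
# back-substitute for the inner nodes.
import numpy as np
from scipy.sparse.linalg import splu, cg, LinearOperator

def schur_cg(K_II, K_IB, K_BI, K_BB, f_I, f_B, M=None):
    lu = splu(K_II.tocsc())                 # inner block factored once
    w = lu.solve(f_I)                       # 1.) w = K_II^{-1} f_I
    g_B = f_B - K_BI @ w                    # 2.) condensed right-hand side
    n = f_B.size
    S = LinearOperator((n, n),              # S = K_BB - K_BI K_II^{-1} K_IB
        matvec=lambda v: K_BB @ v - K_BI @ lu.solve(K_IB @ v),
        dtype=f_B.dtype)
    u_B, info = cg(S, g_B, M=M)             # 3.) CG on the Schur complement
    assert info == 0, "CG did not converge"
    u_I = w - lu.solve(K_IB @ u_B)          # 4.) back-substitution
    return u_I, u_B
```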
20
Schur Complement Preconditioner I
Again, we can factor $S$ with respect to the edge and vertex nodes such that
$$S = \begin{pmatrix} S_{EE} & S_{EV} \\ S_{VE} & S_{VV} \end{pmatrix},$$
BUT the edge blocks $S_{E_j E_j}$ are dense and expensive to form, with $j$ the counter of elements/edges/...
21
Schur Complement Preconditioner II
Substitute the exact harmonic extension by linear interpolation from the vertices onto an edge $j$.
22
Schur Complement Preconditioner III
– Calculate element-wise.
– Approximate $S_{E_j E_j}$ on edge $j$ by the square root of the 1D Laplacian [Dryja] (sketched below).
– Derive it directly by symbolic methods [Bramble/Pasciak/Schatz].
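A sketch of a Dryja-type edge preconditioner: the square root of the 1D Dirichlet Laplacian along an edge is diagonalized by the sine transform, so its inverse can be applied fast. The scaling constants per edge are omitted here, which is an assumption:

```python
# Apply M_E^{-1} r on the n interior nodes of one edge via DST-I,
# which diagonalizes the tridiagonal (-1, 2, -1) Laplacian.
import numpy as np
from scipy.fft import dst, idst

def dryja_edge_apply(r):
    n = r.size
    k = np.arange(1, n + 1)
    lam = 2.0 - 2.0 * np.cos(k * np.pi / (n + 1))  # 1D Laplacian eigenvalues
    rhat = dst(r, type=1)                          # sine transform
    rhat /= np.sqrt(lam)                           # divide by sqrt(Laplacian)
    return idst(rhat, type=1)                      # back to nodal values
```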
23
Schur Complement PC
1.) Restrict the residual to the edge and vertex nodes
2.) Solve the vertex system
3.) Apply the edge preconditioner on each edge $j$
4.) Interpolate the vertex correction back onto the edges and add the parts
24
Vertex Node System
– The vertex node system is equivalent to a (non-constant) 9-point stencil (see the sketch below)
– Solve it directly (gather on one processor)
– Combine with parallel AMG (PEBBLES)
– Special cache-optimized and parallel AMG/MG for the 9-point stencil
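For illustration, a matrix-vector product with a non-constant 9-point stencil on a logically rectangular vertex patch might look as follows; the actual vertex grid is unstructured, so this structured layout is an assumption:

```python
# y = A u for per-vertex stencil weights w of shape (ny-2, nx-2, 9);
# u is (ny, nx) with a one-cell halo of zeros around the patch.
import numpy as np

def ninepoint_apply(w, u):
    y = np.zeros_like(u)
    offs = [(-1,-1), (-1,0), (-1,1), (0,-1), (0,0), (0,1), (1,-1), (1,0), (1,1)]
    for s, (di, dj) in enumerate(offs):
        shifted = np.roll(np.roll(u, -di, axis=0), -dj, axis=1)
        y[1:-1, 1:-1] += w[..., s] * shifted[1:-1, 1:-1]
    return y
```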
25
Memory Requirements (A)
Laplacian in 2D; small example: 99 elements, 5146 nodes
– M = O(nelem)
– M(Schur-cg) = 2.35 MB
– M(Schur-cg, pc) = 2.36 MB
26
B. Matrix Approximation
– Save memory
– Approximate the element matrices
– AMG solver
27
Memory for the Stiffness Matrix
Small example:
– Storing $K$ in CRS format requires 4.79 MB
– Storing the full element matrices needs 3.10 MB
– Symmetry ==> half of the memory requirements
– AMG ==> 3 x 4.8 MB = 14.6 MB
28
Sparse Element Matrices
– 4096 entries in each element matrix, many of them small
– Lumping of entries < 5% of the main diagonal ==> sparse matrix with on average 9 entries per row (see the sketch below)
– Future: element preconditioning [Reitzinger], M-matrix, reduced pattern, symbolic methods
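A sketch of one reading of the lumping rule, with the threshold taken relative to each row's diagonal entry; the exact rule used in the model may differ:

```python
# Drop entries of the dense 64x64 element matrix below 5% of the row's
# diagonal value and lump them onto the diagonal (preserves row sums).
import numpy as np
from scipy.sparse import csr_matrix

def lump_element_matrix(K_r, tol=0.05):
    d = np.abs(np.diag(K_r))
    small = np.abs(K_r) < tol * d[:, None]   # relative to the main diagonal
    np.fill_diagonal(small, False)           # never drop the diagonal itself
    C = np.where(small, 0.0, K_r)
    np.fill_diagonal(C, np.diag(K_r) + np.where(small, K_r, 0.0).sum(axis=1))
    return csr_matrix(C)
```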
29
Sparse Matrix: Memory (B)
– M(C) = 0.42 MB
– AMG(C) ==> 3 x 0.42 MB = 1.26 MB
– cg(K) with AMG(C) preconditioning:
  – matrix-free matrix-vector product: M = 1.26 MB
  – stored matrix-vector product: M = 2.82 MB
30
C. Two-Grid Method
– Direct reduction to the vertex system
– Patch smoother
– Matrix-free defect calculation
31
Interpolation
– Bilinear interpolation from the vertices
– The same operator for all elements
32
Factor the Matrix
– Factorization of $K$ with respect to the vertex nodes
– Element-wise vertex Schur complement (= coarse matrix)
– The matrices are stored (16*nelem + 8)
33
Patch Smoother
– The sparse approximation of $K$ is $C$
– Accumulate it and store the inverse element matrices (matrix-free):
DO r = 1, nelem
   correct $u$ on element $r$ with the stored inverse: $u_r := u_r + C_r^{-1}(f - Ku)_r$
OD
(A sketch follows below.)
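A Python transcription of the smoothing loop above, assuming gather/scatter index lists `dof[r]` and a matrix-free operator `apply_K`; recomputing the global defect per patch keeps the sketch short, and all names are illustrative:

```python
# One damped multiplicative sweep over all element patches: apply the
# stored inverse C_r^{-1} to the local defect and correct u there.
import numpy as np

def patch_sweep(u, f, apply_K, Cinv, dof, omega=0.8):
    for r in range(len(dof)):                 # DO r = 1, nelem
        d = (f - apply_K(u))[dof[r]]          # local defect on element r
        u[dof[r]] += omega * (Cinv[r] @ d)    # correct with stored inverse
    return u                                  # OD
```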
34
Element matrix I
35
Element Matrix II
– Store three coefficients in each element (3*nelem)
– Store three 64x64 matrices (3*4096)
– 3 multiplications and 3 additions calculate a matrix entry (sketched below)
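A sketch of this factored storage: each element keeps three geometric coefficients, and entries of its 64x64 stiffness matrix are recombined on the fly from three shared reference matrices. The names and the exact decomposition are assumptions:

```python
# Factored element-matrix storage: K_r = a*Kxx + b*Kxy + c*Kyy with
# per-element coefficients coef[r] = (a, b, c) and three shared 64x64
# reference matrices, so nothing per-element beyond three scalars is kept.
import numpy as np

def element_entry(coef, Kxx, Kxy, Kyy, r, i, j):
    """Rebuild K_r[i, j] on the fly (a few mults/adds per entry)."""
    a, b, c = coef[r]
    return a * Kxx[i, j] + b * Kxy[i, j] + c * Kyy[i, j]

def element_matvec(coef, Kxx, Kxy, Kyy, r, v):
    """Matrix-free product K_r @ v from the factored storage."""
    a, b, c = coef[r]
    return a * (Kxx @ v) + b * (Kxy @ v) + c * (Kyy @ v)
```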
36
Memory Requirements (C)
Laplacian in 2D; small example: 99 elements, 5146 nodes
– M(vertex) = 0.01 MB
– M( ) = 0.13 MB
– M( ) = 0.82 MB
– M(two-grid) = 0.96 MB
37
Memory requirements (A-C)
38
Summary
– 2D: Schur complement PCG is fast
– AMG and the two-grid method require less memory, especially in 3D
– Use parallel AMG for the vertex systems
– Simultaneous iteration for u, v and the layers will save arithmetic in matrix-free methods