1
A Parallel Hierarchical Solver for the Poisson Equation
http://wavelets.mit.edu/~darshan/18.337
Seung Lee, Department of Mechanical Engineering, selee@mit.edu
R. Sudarshan, Department of Civil and Environmental Engineering, darshan@mit.edu
13th May 2003
2
Outline
Introduction
–Recap of hierarchical basis methods, used for preconditioning and adaptivity
Implementation
–Methodology
–Mesh subdivision
–Data layout
–Assembly and solution
Results and Examples
–Comparison with the finite element discretization
–Speed-up with the number of processors
Conclusions and Further Work
3
Introduction
Problem being considered: the Poisson equation $-\nabla^2 u = f$ in $\Omega$, with $u = 0$ on $\partial\Omega$
Discretizing the operator using finite difference or finite element methods leads to stiffness matrices with large condition numbers
–Need a good preconditioner (ILU, IC, circulant, domain decomposition)
–Choosing such a preconditioner is “black magic”
Why create an ill-conditioned matrix and then precondition it, when you can create a well-conditioned matrix to begin with!
Instead of using a single-scale expansion $u = \sum_i u_i \phi_i$, use a multilevel one: a coarse part plus levels of details, $u = \sum_i u_i^0 \phi_i^0 + \sum_{l,i} d_i^l \psi_i^l$
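To see the conditioning problem concretely, here is a minimal sketch (not from the slides) using the 1D finite-difference Laplacian, whose condition number grows like $O(n^2)$ under refinement:

```python
import numpy as np

# Condition number of the single-scale stiffness matrix for the
# 1D Poisson problem grows like O(n^2) as the mesh is refined.
for n in [15, 63, 255, 1023]:
    K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    print(f"n = {n:5d}   cond(K) = {np.linalg.cond(K):12.1f}")
```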
4
Single Level Vs. Multilevel Approaches
[Figure: a single-level decomposition of a function into nodal hat basis functions, versus a multilevel decomposition into a coarse part plus details]
The multilevel stiffness matrix is better conditioned than the single-level one
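A minimal 1D sketch (not from the slides, which use 2D bilinear elements) makes the claim concrete: assemble the single-scale stiffness matrix, change basis to multilevel hat functions, and compare condition numbers.

```python
import numpy as np

def hat(x, center, halfwidth):
    """Piecewise-linear hat function evaluated at the points x."""
    return np.maximum(0.0, 1.0 - np.abs(x - center) / halfwidth)

def hierarchical_basis(L):
    """Change-of-basis matrix S: columns are the multilevel hat
    functions sampled at the 2**L - 1 interior fine-grid nodes."""
    n = 2**L - 1
    x = np.arange(1, n + 1) / 2**L
    cols = []
    for level in range(1, L + 1):
        hw = 2.0**(-level)           # support half-width at this level
        cols += [hat(x, (2*k - 1) * hw, hw)
                 for k in range(1, 2**(level - 1) + 1)]
    return np.column_stack(cols)

for L in [3, 5, 7]:
    n, h = 2**L - 1, 2.0**(-L)
    K = (2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h  # single level
    S = hierarchical_basis(L)
    Kh = S.T @ K @ S                                          # multilevel
    print(f"n = {n:4d}   cond(single) = {np.linalg.cond(K):9.1f}"
          f"   cond(multilevel) = {np.linalg.cond(Kh):7.1f}")
```

In 1D the improvement is especially clean (roughly $O(n^2)$ down to $O(n)$, and diagonal scaling removes even that); in 2D, the hierarchical basis with diagonal preconditioning gives $O(\log^2 n)$ condition-number growth (Yserentant).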
5
Formulation and Solution
Discretize using piecewise bilinear hierarchical bases
Solve the coarsest-level problem with PCG (and diagonal preconditioning); then determine the first-level details, the second-level details, and so on
Big question: how well does this method parallelize?
–The multiscale stiffness matrix is not as sparse
–Are the savings over the single-level method really significant?
6
Implementation – I: The Big Picture
Read input, subdivide the mesh, and distribute node, edge, and element info to the processors (done by the “Oracle”)
For I = 1 : N_levels (N_levels is known a priori), done in parallel:
–Assemble K
–Solve the system of equations, using the solution from the previous mesh as the initial guess
–Perform the inverse wavelet transform and consolidate the solution
Once I > N_levels, we (hopefully) have the converged solution!!
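The same nested-iteration structure can be mocked up serially (a minimal 1D sketch, not the authors' parallel code; no Aztec and no wavelet synthesis step): solve each level with CG, prolonging the previous level's solution as the initial guess.

```python
import numpy as np

def cg(K, b, x0=None, tol=1e-8):
    """Plain (unpreconditioned) conjugate gradients.
    Returns the solution and the number of iterations taken."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - K @ x
    p, rr, bnorm, it = r.copy(), r @ r, np.linalg.norm(b), 0
    while np.sqrt(rr) > tol * bnorm:
        Kp = K @ p
        alpha = rr / (p @ Kp)
        x += alpha * p
        r -= alpha * Kp
        rr_new = r @ r
        p = r + (rr_new / rr) * p
        rr = rr_new
        it += 1
    return x, it

def prolong(u_coarse):
    """Move a coarse 1D solution to the next finer grid: shared nodes
    keep their values, new midpoints average their two neighbours."""
    padded = np.concatenate([[0.0], u_coarse, [0.0]])
    u = np.zeros(2 * len(u_coarse) + 1)
    u[1::2] = u_coarse
    u[0::2] = 0.5 * (padded[:-1] + padded[1:])
    return u

u = None
for level in range(2, 9):
    n, h = 2**level - 1, 2.0**(-level)
    K = (2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    b = np.full(n, h)                       # load vector for f = 1
    u, iters = cg(K, b, x0=None if u is None else prolong(u))
    print(f"level {level}: {n:4d} DoFs solved in {iters:3d} CG iterations")
```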
7
Implementation – II: Mesh Subdivision
Level 0: 1 element; Level 1: 4 elements; Level 2: 16 elements
Each parent element is subdivided into four child elements
The number of elements and DoFs increases geometrically, and the solution converges after only a few subdivision levels
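The geometric growth is easy to tabulate for the square domain (a small sketch; the level-6 count matches the 4225 DoFs reported later):

```python
# Element and DoF counts under uniform subdivision of one square
# element: each level turns every element into four children.
for level in range(7):
    n_elems = 4**level
    n_dofs = (2**level + 1)**2        # bilinear nodes on the refined grid
    print(f"level {level}: {n_elems:5d} elements, {n_dofs:5d} DoFs")
```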
8
Implementation – III: Data Layout
Degrees of freedom in the mesh are distributed linearly (see the sketch below)
–Uses a naïve partitioning algorithm
–Each processor gets roughly NDoF/NP DoFs (its “update set”)
–Each processor assembles the rows of the stiffness matrix corresponding to elements in its update set
Each processor also has info about all faces connected to vertices in its update set, and all vertices connected to such faces
–Equivalent to “basis partitioning”
[Figure: partition of the domain across processors I–IV]
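A sketch of what such a naïve linear partition might look like (hypothetical helper, not the project's routine):

```python
def linear_partition(n_dof, n_proc):
    """Split DoFs 0..n_dof-1 into contiguous update sets, one per
    processor, each of size roughly n_dof / n_proc."""
    base, extra = divmod(n_dof, n_proc)
    sets, start = [], 0
    for p in range(n_proc):
        size = base + (1 if p < extra else 0)
        sets.append(range(start, start + size))
        start += size
    return sets

# e.g. 10 DoFs on 4 processors -> update sets of sizes 3, 3, 2, 2
print([list(s) for s in linear_partition(10, 4)])
```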
9
Implementation – IV: Assembly and Solution
Stiffness matrices are stored in the modified sparse row (MSR) format
–Requires less storage than the CSR or CSC formats
Equations are solved using Aztec
–Solves linear systems in parallel
–Comes with PCG with diagonal preconditioning
Inverse wavelet transform (synthesis of the final solution)
–Implemented using Aztec as a parallel matrix-vector multiply
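For reference, a sketch of the MSR layout as described for Aztec (DMSR) and SPARSKIT; the converter below is illustrative, not Aztec's API. Storing the diagonal by position saves one column index per row relative to CSR.

```python
import numpy as np

def dense_to_msr(A):
    """Convert a dense square matrix to Modified Sparse Row arrays
    (val, bindx), following the DMSR layout used by Aztec/SPARSKIT:
      val[0:n]     -> diagonal entries (position is implicit)
      val[n]       -> unused padding
      val[n+1:]    -> off-diagonal nonzeros, row by row
      bindx[0:n+1] -> pointers into the off-diagonal section
      bindx[n+1:]  -> column indices of the off-diagonal nonzeros"""
    n = A.shape[0]
    val, bindx = list(np.diag(A)) + [0.0], [n + 1]
    offd, cols = [], []
    for i in range(n):
        for j in range(n):
            if i != j and A[i, j] != 0.0:
                offd.append(A[i, j])
                cols.append(j)
        bindx.append(n + 1 + len(offd))
    return np.array(val + offd), np.array(bindx + cols)

# e.g. the 3x3 1D Poisson stiffness matrix
K = np.array([[2., -1., 0.], [-1., 2., -1.], [0., -1., 2.]])
val, bindx = dense_to_msr(K)
print(val)    # [ 2.  2.  2.  0. -1. -1. -1. -1.]
print(bindx)  # [4 5 7 8 1 0 2 1]
```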
10
Input File Format
7              Number of subdivisions
4              Number of vertices
4              Number of edges
1              Number of elements
 1.0  1.0 1    Coordinates (x, y) of the vertices, and boundary info
-1.0  1.0 1    (i.e. 1 = constrained, 0 = free)
-1.0 -1.0 1
 1.0 -1.0 1
3 0 1          Edge definition (vertex1, vertex2), and boundary info
1 0 1          (i.e. 1 = constrained, 0 = free)
2 1 1
2 3 1
0 1 2 3        Element definition (edge1, edge2, edge3, edge4)
[Figure: numbering of the vertices, edges, and elements of the sample square]
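A hypothetical reader for this format (a sketch assuming the explanatory labels above are annotations, not part of the file itself):

```python
def read_mesh(path):
    """Parse the mesh input format described above. Returns the number
    of subdivisions plus vertex, edge, and element lists; the trailing
    flag on vertices and edges is 1 = constrained, 0 = free."""
    with open(path) as f:
        it = iter(f.read().split())
    n_sub, n_vert, n_edge, n_elem = (int(next(it)) for _ in range(4))
    vertices = [(float(next(it)), float(next(it)), int(next(it)))
                for _ in range(n_vert)]               # (x, y, bc flag)
    edges = [(int(next(it)), int(next(it)), int(next(it)))
             for _ in range(n_edge)]                  # (v1, v2, bc flag)
    elements = [tuple(int(next(it)) for _ in range(4))
                for _ in range(n_elem)]               # (e1, e2, e3, e4)
    return n_sub, vertices, edges, elements
```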
11
Results – I: Square Domain
[Figure: meshes of the square domain at level 1 and level 6]
12
Single Scale vs. Hierarchical Basis
Same order of convergence, but fewer iterations for the larger problems
[Plot: number of iterations vs. degrees of freedom]
13
Solution Time Vs. Number of Processors
14
Square Domain Solution
Coarsest mesh (level 1): 9 DoFs, 1 iteration to solve, took 0.004 seconds on 4 procs
Finest mesh (level 6): 4225 DoFs, 43 iterations to solve, took 0.123 seconds on 4 procs
15
Results – II: “L” Domain
Coarsest mesh (level 1): 21 DoFs, 3 iterations to solve, took 0.0055 seconds on 4 procs
Finest mesh (level 6): 12545 DoFs, 94 iterations to solve, took 0.280 seconds on 4 procs
16
Single Level Vs. Hierarchical Basis
17
Solution Time Vs. Number of Processors
18
Results – III: “MIT” Domain
Coarsest mesh (level 1): 132 DoFs, 15 iterations to solve, took 0.012 seconds on 4 procs
Finest mesh (level 6): 91520 DoFs, 219 iterations to solve, took 4.77 seconds on 8 procs
19
Single Level Vs. Hierarchical Basis
[Plot annotation: the single-level method did not converge after 500 iterations]
20
Solution Time Vs. Number of Processors
21
Conclusions and Further Work
The hierarchical basis method parallelizes well
–Provides a cheap and effective parallel preconditioner
–Scales well with the number of processors
With the right libraries, parallelism is easy!
–With Aztec, much of the work involved writing (and debugging!) the mesh subdivision, mesh partitioning, and matrix assembly routines
Further work
–Parallelize parts of the oracle (e.g. mesh subdivision)
–Adaptive subdivision based on error estimation
–More efficient geometry partitioning
–More general element types (right now, we are restricted to rectangular four-node elements)
22
The End
24
(Galerkin) Weak Form
Formally: find $u \in H_0^1(\Omega)$ such that
$$\int_\Omega \nabla u \cdot \nabla v \, d\Omega = \int_\Omega f v \, d\Omega \qquad \forall\, v \in H_0^1(\Omega)$$
Leads to a multilevel system of equations
$$\begin{pmatrix} K_{cc} & K_{cf} \\ K_{fc} & K_{ff} \end{pmatrix} \begin{pmatrix} u_c \\ u_f \end{pmatrix} = \begin{pmatrix} f_c \\ f_f \end{pmatrix}$$
where $K_{cc}$ holds the coarse-coarse interactions, $K_{cf} = K_{fc}^T$ the coarse-fine interactions, and $K_{ff}$ the fine-fine interactions