Published by Katrina Hubbard. Modified over 9 years ago.
1 Ax=b: The Link between Gyrokinetic Particle Simulations of Turbulent Transport in Burning Plasmas and Micro-FE Analysis of Whole Vertebral Bodies in Orthopaedic Biomechanics
  Mark F. Adams, SciDAC, 27 June 2005
2 Outline
  – Algebraic multigrid (AMG) introduction
  – Micro-FE bone modeling
  – Olympus parallel FE framework
  – Scalability study on IBM SPs
  – Gyrokinetic Particle Simulations of Turbulent Transport in Burning Plasmas
3 Multigrid: smoothing and coarse grid correction (projection)
  [Figure: the multigrid V-cycle. Labels: smoothing on the finest grid; restriction ($R$) to the first coarse grid (note: smaller grid); prolongation ($P = R^T$).]
4 Multigrid V($\nu_1, \nu_2$)-cycle
  Given a smoother $S$ and a coarse grid space ($P$)
    – Columns of the "prolongation" operator $P$: discrete representation of the coarse grid space
  MG-V function:
    function $u$ = MG-V($A$, $f$)
      if $A$ is small
        $u \leftarrow A^{-1} f$
      else
        $u \leftarrow S^{\nu_1}(f, u)$  -- $\nu_1$ steps of smoother (pre)
        $r_H \leftarrow P^T (f - Au)$
        $u_H \leftarrow$ MG-V($P^T A P$, $r_H$)  -- recursion (Galerkin)
        $u \leftarrow u + P u_H$
        $u \leftarrow S^{\nu_2}(f, u)$  -- $\nu_2$ steps of smoother (post)
  Iteration matrix with $R = P^T$: $T = S (I - P (RAP)^{-1} R A) S$ (multiplicative)
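The recursion on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the Prometheus implementation: the 1D model matrix, the linear-interpolation prolongation, the damped Jacobi smoother, and all function names (`poisson_1d`, `linear_prolongation`, `jacobi`, `mg_v`, `solve`) are assumptions chosen to keep the example small. Only the structure (pre-smooth, restrict with $P^T$, recurse on the Galerkin operator $P^T A P$, correct, post-smooth) follows the slide.

```python
import numpy as np

def poisson_1d(n):
    """Tridiagonal (-1, 2, -1) matrix: a small SPD stand-in for A (assumption)."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def linear_prolongation(n_fine):
    """Linear interpolation from (n_fine - 1) // 2 coarse points (assumed P)."""
    n_coarse = (n_fine - 1) // 2
    P = np.zeros((n_fine, n_coarse))
    for j in range(n_coarse):
        i = 2 * j + 1                      # fine index of coarse point j
        P[i - 1, j] = 0.5
        P[i, j] = 1.0
        P[i + 1, j] = 0.5
    return P

def jacobi(A, f, u, steps, omega=2.0 / 3.0):
    """Damped Jacobi smoother S (assumed): `steps` sweeps on A u = f."""
    d = np.diag(A)
    for _ in range(steps):
        u = u + omega * (f - A @ u) / d
    return u

def mg_v(A, f, nu1=2, nu2=2, n_small=3):
    """One V(nu1, nu2)-cycle for A u = f, starting from u = 0."""
    n = A.shape[0]
    if n <= n_small:
        return np.linalg.solve(A, f)                   # u <- A^{-1} f
    P = linear_prolongation(n)
    u = jacobi(A, f, np.zeros(n), nu1)                 # pre-smoothing
    r_H = P.T @ (f - A @ u)                            # restriction, R = P^T
    u_H = mg_v(P.T @ A @ P, r_H, nu1, nu2, n_small)    # Galerkin recursion
    u = u + P @ u_H                                    # coarse grid correction
    return jacobi(A, f, u, nu2)                        # post-smoothing

def solve(A, f, cycles=10):
    """Stationary iteration: apply one V-cycle to the current residual."""
    u = np.zeros_like(f)
    for _ in range(cycles):
        u = u + mg_v(A, f - A @ u)
    return u
```

Calling `solve(poisson_1d(31), np.ones(31))` runs the cycle on grids of size 31, 15, 7 and 3, with a direct solve on the smallest one.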
5 Smoothed Aggregation
  Coarse grid space & smoother: MG method
  Piecewise constant function: "plain" aggregation ($P_0$)
    – Start with kernel vectors $B$ of the operator, e.g., 6 RBMs in elasticity
    – Nodal aggregation: $B \rightarrow P_0$
  "Smoothed" aggregation: lower the energy of the functions
    – One Jacobi iteration: $P \leftarrow (I - D^{-1} A) P_0$
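A small sketch of the two steps on this slide, for a scalar 1D model problem where the kernel vector $B$ is simply the constant vector. The model matrix, the contiguous aggregates of fixed size, the optional damping factor `omega` (the slide's formula corresponds to `omega=1.0`), and the helper names are all assumptions made for illustration.

```python
import numpy as np

def plain_aggregation(n, agg_size):
    """Tentative prolongator P0: piecewise constant over contiguous aggregates."""
    n_agg = (n + agg_size - 1) // agg_size
    P0 = np.zeros((n, n_agg))
    P0[np.arange(n), np.arange(n) // agg_size] = 1.0
    return P0

def smoothed_prolongator(A, P0, omega=1.0):
    """P = (I - omega D^{-1} A) P0: one Jacobi iteration applied to P0."""
    d_inv = 1.0 / np.diag(A)
    return P0 - omega * (d_inv[:, None] * (A @ P0))

def column_energy(A, P):
    """Energy p^T A p of every column p of P."""
    return np.einsum("ij,ij->j", P, A @ P)

n = 9
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # assumed model operator
P0 = plain_aggregation(n, agg_size=3)
P = smoothed_prolongator(A, P0)
print(column_energy(A, P0))   # energy of the piecewise constant functions
print(column_energy(A, P))    # energy after one Jacobi smoothing step
```

The smoothed columns overlap neighbouring aggregates, which is why the triple product RAP becomes denser and more expensive than with plain aggregation.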
7 Trabecular Bone
  [Figure: 5-mm cube. Labels: cortical bone; trabecular bone.]
8 Methods: FE modeling
  Micro-Computed Tomography: CT @ 22 µm resolution
  3D image
  FE mesh: 2.5 mm cube, 44 µm elements
  Mechanical Testing: E, yield, ult, etc.
9 The vertebral body you are showing is pretty healthy. It is from an 80-year-old female and it is a T-10, that is, thoracic, so it is pretty close to the mid-spine. Usually research is done from T-10 downward to the lumbar vertebral bodies. There are 12 thoracic VBs and 5 lumbar. The numbers go up as you go down.
10 Motivation
  Calibrate material models for continuum elements
    – e.g., explicit computation of a yield surface
  Validation for low order model
  Investigation of effects that are not accessible with lower order models
    – role of cortical shell in load carrying of vertebra
    – effects of drug treatment on continuum properties
  [Figure: 1 mm slice from vertebral body]
12 Computational Architecture: Olympus
  Athena: Parallel FE
  ParMetis: Parallel mesh partitioner (University of Minnesota)
  Prometheus: Multigrid solver
  FEAP: Serial general purpose FE application (University of California)
  PETSc: Parallel numerical libraries (Argonne National Labs)
  [Diagram labels: FE mesh input file; Athena; ParMetis; FE input file (in memory); partition to SMPs; Athena; ParMetis; file; FEAP; material card; Silo DB; VisIt; Prometheus; PETSc; ParMetis; METIS; pFEAP]
13 Geometric & material nonlinear
  2.25% strain
  8 procs., DataStar (SP4 at UCSD)
14 ParMetis partitions
16 Vertebral Body With Shell
  Large deformation elasticity
    – 6 load steps (3% strain)
  Scaled speedup
    – ~131K dof/processor
    – 7 to 537 million dof
    – 4 to 292 nodes
  IBM SP Power3
    – 14 of 16 procs/node used
    – Double/Single Colony switch
  [Figure label: 80 µm w/ shell]
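The scaled-speedup numbers on this slide can be cross-checked with a little arithmetic. The sketch below assumes that every node contributes exactly 14 processors and uses the 7.5M dof figure quoted for the small problem on a later slide; the function names are illustrative.

```python
PROCS_PER_NODE_USED = 14          # "14 of 16 procs/node used"

def processors(nodes, per_node=PROCS_PER_NODE_USED):
    """Total processors in use for a given node count."""
    return nodes * per_node

def dof_per_processor(total_dof, nodes):
    """Degrees of freedom handled by each processor."""
    return total_dof / processors(nodes)

small = dof_per_processor(7.5e6, 4)      # smallest problem, 4 nodes
large = dof_per_processor(537e6, 292)    # largest problem, 292 nodes
print(processors(4), processors(292))
print(round(small / 1e3), round(large / 1e3))   # in thousands of dof per processor
```

The largest run works out to 4088 processors, which agrees with the processor count quoted with the flop-rate results, and both ends of the study stay close to the stated ~131K dof per processor.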
17 Scalability
  Inexact Newton
  CG linear solver
    – Variable tolerance
  Smoothed aggregation AMG preconditioner
  Nodal block diagonal smoothers:
    – 2nd order Chebyshev (additive)
    – Gauss-Seidel (multiplicative)
  [Figure label: 80 µm w/o shell]
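A minimal sketch of a preconditioned CG solver with a caller-supplied relative tolerance, which is the piece an inexact Newton method varies from one linear solve to the next. The preconditioner is passed in as a function; the usage below plugs in plain diagonal (Jacobi) scaling purely as a stand-in for the smoothed aggregation AMG preconditioner named on the slide. The function name `pcg`, its parameters, and the test matrix are assumptions.

```python
import numpy as np

def pcg(A, b, apply_M_inv, rtol=1e-8, max_iter=500):
    """Preconditioned conjugate gradients for SPD A.

    apply_M_inv(r) applies the preconditioner to a residual vector.
    Returns the solution and the number of iterations taken.
    """
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_M_inv(r)
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        if np.linalg.norm(r) <= rtol * b_norm:
            return x, k
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        z = apply_M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# Usage with an assumed SPD test matrix and a Jacobi stand-in preconditioner.
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1) + np.diag(np.linspace(0.0, 10.0, n))
b = np.ones(n)
jacobi_prec = lambda r: r / np.diag(A)
x_loose, its_loose = pcg(A, b, jacobi_prec, rtol=1e-2)   # early Newton step: loose tolerance
x_tight, its_tight = pcg(A, b, jacobi_prec, rtol=1e-8)   # late Newton step: tight tolerance
```

A loose tolerance on the first Newton iterations keeps the linear solves cheap while the nonlinear iterate is still far from the solution.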
18 Computational phases
  Mesh setup (per mesh):
    – Coarse grid construction (aggregation)
    – Graph processing
  Matrix setup (per matrix):
    – Coarse grid operator construction
    – Sparse matrix triple product RAP (expensive for S.A.)
    – Subdomain factorizations
  Solve (per RHS):
    – Matrix vector products (residuals, grid transfer)
    – Smoothers (matrix vector products)
19 131K dof/proc: Flops/sec/proc
  0.47 Teraflop/s on 4088 processors
20 Sources of scale inefficiencies in solve phase
                 7.5M dof   537M dof
  #iteration     450        897
  #nnz/row       50         68
  Flop rate      76         74
  #elems/proc    19.3K      33.0K
  Model          1.00       2.78
  Measured       1.00       2.61
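The slide does not state how the model row is computed. One reading that reproduces the 2.78 figure is to multiply the growth in iteration count by the growth in nonzeros per row and divide by the relative flop rate; the sketch below works that out. The formula and the names `small`, `large`, and `relative_solve_cost` are assumptions, not the author's stated model.

```python
small = {"iterations": 450, "nnz_per_row": 50, "flop_rate": 76}
large = {"iterations": 897, "nnz_per_row": 68, "flop_rate": 74}

def relative_solve_cost(case, base):
    """Assumed model: work grows with iterations * nnz/row, speed with flop rate."""
    work = (case["iterations"] / base["iterations"]) * (
        case["nnz_per_row"] / base["nnz_per_row"]
    )
    rate = case["flop_rate"] / base["flop_rate"]
    return work / rate

model = relative_solve_cost(large, small)
print(round(model, 2))
```

Under this reading, roughly a factor of 2 comes from the iteration count and a factor of 1.36 from denser matrix rows, with the flop rate contributing only a few percent.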
21 Strong speedup with 7.5M dof problem (1 to 128 nodes)
24 Finite Element (FEM) Elliptic Solver Developed for GTC
  Global field aligned mesh
  FEM adapted for logically non-rectangular grids. Need adjustments of elements at different toroidal angles.
  Linear sparse matrix solver
    – PETSc (ANL)
  Enabled implementing split-weight (Manuilskiy & Lee, PoP 2000) and hybrid electron models (Lin & Chen, PoP 2001)
  Ongoing studies of kinetic electron effects on ITG and TEM turbulence
  Ongoing studies of electromagnetic turbulence
25 Performance
  Multigrid preconditioned Krylov solver
    – Prometheus (Columbia) & HYPRE (LLNL)
  Scaled speedup
    – ~38K dof per processor
    – 1 to 32 processors/plane
    – 8 planes, 20 time steps, 4 particles per cell
26 Thank You
  Gordon Bell Prize winner 2004: "Ultrascalable implicit finite element analyses in solid mechanics with over a half a billion degrees of freedom," M.F. Adams, H.H. Bayraktar, T.M. Keaveny, P. Papadopoulos, ACM/IEEE Proceedings of SC2004: High Performance Networking and Computing
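27 Linear solver iterations
  Rows: load step. Columns: Newton iteration.
         Small (7.5M dof)        Large (537M dof)
  Load    1   2   3   4   5       1   2   3   4   5   6
  1       5  14  20  21  18       5  11  35  25  70   2
  2       5  14  20               5  11  36  26  70   2
  3       5  14  20  22  19       5  11  36  26  70   2
  4       5  14  20  22  19       5  11  36  26  70   2
  5       5  14  20  22  19       5  11  36  26  70   2
  6       5  14  20  22  19       5  11  36  26  70   2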
28 S. Ethier
  [Figure panels: Thunder (LLNL); Jacquard (NERSC)]
29 164K dof/proc