Published by Gervais Briggs. Modified over 9 years ago.
Conundrum Talk, LBL, May 2000
The Cactus Code: A Framework for Parallel Computing
Gabrielle Allen
Albert Einstein Institute, Max Planck Institute for Gravitational Physics
2 Cactus Code Versions 1,2,3
- A code for Numerical Relativity
  - collaborative, portable, parallel,...
- Model of “Flesh” + “Thorns”
  - Flesh: core code, provides parallelisation, IO.
  - Thorns: plug-in modules, written in Fortran or C, which provide the applications and infrastructure.
- Successful for numerical relativity, but problems arose as the number of thorns and number of users increased
- Redesign, incorporating lessons learnt from previous versions
  - Cactus 4.0
3 Current Version Cactus 4.0
- Cactus 4.0 beta 1 released September 1999
- Flesh and many thorns distributed under GNU GPL
- Currently: Cactus 4.0 beta 8
- Supported Architectures:
  - SGI Origin
  - SGI 32/64
  - Cray T3E (142GF on 1024 nodes)
  - Dec Alpha
  - Intel/Mac Linux
  - Windows NT
  - HP Exemplar
  - IBM SP2
  - Sun Solaris
  - Hitachi SR8000-F
  - NEC SX-5
4 Userbase
- Astrophysics (finite differencing, 3D hyperbolic/elliptic PDEs)
  - Einstein equations: Black holes, gravitational waves
  - Relativistic matter: Neutron stars, boson stars
  - Newtonian matter: ZEUS code
- Aerospace
  - Pilot project with DLR, introduction of unstructured grids
- QCD
- Computational Science
  - Application code for Grid, Egrid, GrADs projects
  - Parallel and Distributed IO
  - Remote steering and visualization
  - Adaptive mesh refinement
  - And more...
5 Einstein’s Equations & Gravitational Waves
- Einstein’s General Relativity
  - Fundamental theory of Physics (Gravity)
  - Among most complex equations of physics
    - Dozens of coupled, nonlinear hyperbolic-elliptic equations with 1000’s of terms
    - Barely have capability to solve after a century
  - Predict black holes, gravitational waves, etc.
- Exciting new field about to be born: Gravitational Wave Astronomy
  - Fundamentally new information about Universe
  - What are gravitational waves? Ripples in spacetime curvature, caused by matter motion, causing distances to change: h = Δs/s ~ 10^-22 !
- A last major test of Einstein’s theory: do they exist?
  - Eddington: “Gravitational waves propagate at the speed of thought”
  - 1993 Nobel Prize Committee: Hulse-Taylor Pulsar (indirect evidence)
  - 20xx Nobel Committee: ??? (For actual detection…)
[Figure labels: Δs(t); Colliding BH’s and NS’s...]
6 Detecting Gravitational Waves
LIGO, VIRGO (Pisa), GEO600,… $1 Billion Worldwide
Was Einstein right? 5-10 years, we’ll see!
We’ll need numerical relativity to:
- Detect them… pattern matching against numerical templates to enhance signal/noise ratio
- Understand them… just what are the waves telling us?
[Figure: 4km Hanford Washington Site]
7 Teraflop Computation, AMR, Elliptic-Hyperbolic, ??? Numerical Relativity Waveforms: What Happens in Nature... PACS Virtual Machine Room
8 Resources for 3D Numerical Relativity
- Explicit Finite Difference Codes
  - ~ 10^4 Flops/zone/time step
  - ~ 100 3D arrays
- Require 1000^3 zones or more
  - ~1000 Gbytes
  - Double resolution: 8x memory, 16x Flops
- TFlop, Tbyte machine required
- Parallel AMR, I/O essential
- Etc...
[Diagram labels: Initial Data: 4 coupled nonlinear elliptics. Evolution: hyperbolic evolution coupled with elliptic eqs. Timeline from t=0 to t=100.]
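The memory and cost figures on this slide can be checked with a little arithmetic. The sketch below assumes 8-byte (double precision) values, which the slide does not state, and that an explicit scheme must halve its time step whenever the grid spacing is halved, which is one way to account for the extra factor of 2 in Flops. The function names are illustrative only.

```python
# Back-of-envelope resource estimate for a 3D explicit finite difference code.
# Assumption (not on the slide): each stored value is an 8-byte double.

def memory_bytes(zones_per_side, n_arrays=100, bytes_per_value=8):
    """Memory needed to hold n_arrays 3D arrays on a cubic grid."""
    return zones_per_side ** 3 * n_arrays * bytes_per_value


def flops_per_step(zones_per_side, flops_per_zone=10 ** 4):
    """Floating point operations for one time step over the whole grid."""
    return zones_per_side ** 3 * flops_per_zone


def scaling_on_refinement(factor=2):
    """Return (memory multiplier, total Flop multiplier) for a refinement.

    Memory grows with the number of zones (factor**3). Total work picks up
    one more factor because the time step shrinks with the grid spacing,
    so covering the same physical time takes `factor` times more steps.
    """
    return factor ** 3, factor ** 4


if __name__ == "__main__":
    print(memory_bytes(1000) / 1e9, "GB")                  # 800.0 GB
    print(flops_per_step(1000) / 1e12, "TFlop per step")   # 10.0 TFlop per step
    print(scaling_on_refinement(2))                        # (8, 16)
```

With these assumptions 1000^3 zones need about 800 GB, consistent with the "~1000 Gbytes" quoted on the slide.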
9 Axisymmetric Black Hole Simulations Evolution of Highly Distorted Black Hole Collision of two Black Holes (“Misner Data”)
10 NSF Black Hole Grand Challenge Alliance
- University of Texas (Matzner, Browne)
- NCSA/Illinois/AEI (Seidel, Saylor, Smarr, Shapiro, Saied)
- North Carolina (Evans, York)
- Syracuse (G. Fox)
- Cornell (Teukolsky)
- Pittsburgh (Winicour)
- Penn State (Laguna, Finn)
Develop Code To Solve G_μν = 0
11 NASA Neutron Star Grand Challenge
- NCSA/Illinois/AEI (Saylor, Seidel, Swesty, Norman)
- Argonne (Foster)
- Washington U (Suen)
- Livermore (Ashby)
- Stony Brook (Lattimer)
“A Multipurpose Scalable Code for Relativistic Astrophysics”
Develop Code To Solve G_μν = 8π T_μν
12 Cactus Modularity IOFlexIO FLESH (Parameters, Variables, Scheduling) IOHDF5 PUGH WaveToyF90 CartGrid3D GrACE Boundary WaveToyF77
13 Cactus 4 Design Goals
- Generalization
  - meta-code that can be applied e.g. to any system of PDEs
    - mainly 3D cartesian finite differencing codes (but changing)
- Abstraction
  - Identify key concepts that can be abstracted
    - Evolution skeleton. Reduction operators. I/O. Etc...
- Encapsulation
  - Protect the developers of thorns from other thorns...
- Extension
  - Prepare for new concepts in future thorns
  - Overloading, Inheritance, etc...
- In some way, make it a little Object Oriented
14 Design issues
- Modular and Object Oriented
  - Keep the concept of thorns
  - Encapsulation, Polymorphism, Inheritance,...
- Fortran
  - Influences most design issues
- Portable Parallelism
- Support for FMR and AMR as well as Unigrid
- Powerful Make system
- Tools such as “Testsuite checking technology”
15 Realization
- Perl
  - The final code is created from thorn configuration files by Perl scripts, which act as the seed of a new language:
    - The Cactus Configuration Language (CCL): variables, (functions), parameters, scheduling
  - Perl scripts also take care of testsuite checking, configuration,...
- Flesh written in ANSI C
- Thorns written in C, C++, Fortran77, Fortran90
16 Cactus Flesh : interface between Application Thorns and Computational Infrastructure Thorns
17 The Flesh
- Abstract API
  - evolve the same PDE with unigrid or AMR (MPI or shared memory, etc.) without having to change any of the application code.
- Interfaces
  - the set of data structures that a thorn exports to the world (global), to its friends (protected) and to nobody (private), and how these are inherited.
- Implementations
  - Different thorns may implement e.g. the evolution of the same PDE, and we select the one we want at runtime.
- Scheduling
  - call the routines of every thorn in a certain order and handle their interdependencies.
- Parameters
  - many types of parameters, with all essential consistency checks done before running
18 Cactus Computational Toolkit
- Parallel Evolution Drivers
  - PUGH
    - MPI domain decomposition based unigrid driver
    - Can be distributed using Globus
  - GrACE/PAGH
    - Adaptive Mesh Refinement driver
- Parallel Elliptic Solvers
  - PETSc
  - BAM
- Parallel Interpolators
- Parallel I/O
  - FlexIO, ASCII, HDF5, Panda, Checkpointing, etc...
- Visualization, etc...
19 Data Structures
- Grid Arrays
  - A multidimensional and arbitrarily sized array distributed among processors
- Grid Functions
  - A field distributed on the multidimensional computational grid (a Grid Array sized to the grid)
    - Every point in a grid may hold a different value “f(x,y,z)”
- Grid Scalars
  - Values common to all the grid points
- Parameters
  - Values/Keywords that affect the behavior of the code (initialization, evolution, output, etc.)
    - parameter checking, steerable parameters
20 Data Types
- Cactus data types to provide portability across platforms
- CCTK_REAL
  - CCTK_REAL4, CCTK_REAL8, CCTK_REAL16
- CCTK_INT
  - CCTK_INT2, CCTK_INT4, CCTK_INT8
- CCTK_CHAR
- CCTK_COMPLEX
  - CCTK_COMPLEX8, CCTK_COMPLEX16, CCTK_COMPLEX32
21 Scheduling
- Thorns schedule
  - when their routines should be executed
  - what memory for Grid Arrays should be enabled
  - which Grid Arrays should be synchronized on exit
- Basic evolution skeleton idea
  - standard scheduling points INITIAL, EVOL, ANALYSIS
  - fine control: run this routine BEFORE/AFTER that routine
- Extend/customise with scheduling groups
  - Define own scheduling points MYEVOL
  - Add my routine to this group of routines
  - Run the group WHILE some condition is met
- Future redesign
  - The scheduler is really a runtime selector of the computation flow.
  - We can add much more power to this concept
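The BEFORE/AFTER fine control described on this slide amounts to ordering routines under pairwise constraints. The sketch below is not the Cactus scheduler; it only illustrates the idea with a topological sort from the Python standard library. The routine names and the `order_routines` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def order_routines(routines, before=(), after=()):
    """Return an execution order that honours BEFORE/AFTER constraints.

    before: pairs (a, b) meaning "run a BEFORE b"
    after:  pairs (a, b) meaning "run a AFTER b"
    """
    sorter = TopologicalSorter()
    for name in routines:
        sorter.add(name)       # register routines that have no constraints too
    for a, b in before:
        sorter.add(b, a)       # a must come earlier than b
    for a, b in after:
        sorter.add(a, b)       # b must come earlier than a
    # static_order() raises graphlib.CycleError on contradictory constraints.
    return list(sorter.static_order())


# Hypothetical routines registered at a single scheduling point.
evol = order_routines(
    ["ApplyBoundaries", "EvolveWave", "ComputeEnergy"],
    before=[("EvolveWave", "ApplyBoundaries")],
    after=[("ComputeEnergy", "ApplyBoundaries")],
)
print(evol)  # ['EvolveWave', 'ApplyBoundaries', 'ComputeEnergy']
```

Contradictory requests (A before B and B before A) surface as an error at ordering time rather than as a silent misordering during the run.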
22 Interface
- The concept: contract with the rest of the code
  - Currently it covers only the data structures: variables and parameters
  - Adding thorn utility routines and their arguments
- Private
  - The variables that you want the flesh to allocate/communicate but no other thorn to see.
- Public
  - The variables that you want everybody to see (that means that everybody can modify them too!)
  - Inheritance
- Protected
  - Variables that you want only your friends to see!
  - [Watch out for the change of meaning from C++ names]
23 Implementation
- Why
  - Two or more thorns that provide the same functionality but different internal implementation
    - Interchangeable pieces that allow easy comparison and evolution in the development process
    - They are compiled together and only one is activated at runtime
- How
  - If all the other thorns need to see the same contract, then thorns implementing a certain functionality must
    - Have the same public variables
    - and their protected ones!!
    - The same concept applies to parameters and scheduling
- Example
  - Wildly different evolution approaches for the same equations, so all the analysis and initial data thorns remain the same.
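The idea of several compiled thorns sharing one implementation, with only one activated at runtime, can be sketched as a small registry. This is an illustration, not Cactus code: the thorn names come from the modularity slide, while the implementation names, variable names and helper functions are assumptions.

```python
# Compiled-in thorns and the implementation each one provides.
# "implements" and "public_vars" values are hypothetical.
THORNS = {
    "WaveToyF77": {"implements": "wavetoy", "public_vars": ("phi",)},
    "WaveToyF90": {"implements": "wavetoy", "public_vars": ("phi",)},
    "CartGrid3D": {"implements": "grid", "public_vars": ("x", "y", "z")},
}


def same_contract(thorn_a, thorn_b):
    """Thorns are interchangeable only if they expose the same contract."""
    a, b = THORNS[thorn_a], THORNS[thorn_b]
    return a["implements"] == b["implements"] and a["public_vars"] == b["public_vars"]


def activate(active_thorns):
    """Map each implementation to the single active thorn that provides it."""
    chosen = {}
    for thorn in active_thorns:
        impl = THORNS[thorn]["implements"]
        if impl in chosen:
            raise ValueError(f"{chosen[impl]} and {thorn} both implement {impl!r}")
        chosen[impl] = thorn
    return chosen


print(same_contract("WaveToyF77", "WaveToyF90"))  # True
print(activate(["CartGrid3D", "WaveToyF90"]))
```

Swapping `WaveToyF90` for `WaveToyF77` in the active list changes the evolution code without touching any thorn that only depends on the shared contract.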
24 Parallelism in Cactus
- Cactus is designed around a distributed memory model. Each thorn is passed a section of the global grid.
- The actual parallel driver (implemented in a thorn) can use whatever method it likes to decompose the grid across processors and exchange ghost zone information. Each thorn is presented with a standard interface, independent of the driver.
[Figure label: driver::nghostzones = 1]
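Decomposition with ghost zones can be illustrated in one dimension. The sketch below simulates the processors as plain Python lists in a single process; it is not how a Cactus driver is implemented, and the function names are hypothetical. It assumes a non-periodic grid and that every local section holds at least `nghost` interior points.

```python
def decompose(global_values, nprocs, nghost=1):
    """Split a 1D grid into nprocs sections, each padded with ghost zones."""
    base, extra = divmod(len(global_values), nprocs)
    sections, start = [], 0
    for p in range(nprocs):
        size = base + (1 if p < extra else 0)   # spread any remainder
        interior = list(global_values[start:start + size])
        sections.append([None] * nghost + interior + [None] * nghost)
        start += size
    return sections


def sync_ghostzones(sections, nghost=1):
    """Fill each section's ghost zones from its neighbours' interior points."""
    for p, section in enumerate(sections):
        if p > 0:
            left = sections[p - 1]
            # last nghost interior points of the left neighbour
            section[:nghost] = left[-2 * nghost:-nghost]
        if p < len(sections) - 1:
            right = sections[p + 1]
            # first nghost interior points of the right neighbour
            section[-nghost:] = right[nghost:2 * nghost]
    return sections


sections = sync_ghostzones(decompose(list(range(8)), nprocs=2))
print(sections)  # [[None, 0, 1, 2, 3, 4], [3, 4, 5, 6, 7, None]]
```

After the sync, a finite difference stencil of width one can be applied to every interior point of a section using only local data, which is what lets application code stay independent of the driver.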
25 PUGH
- The standard parallel driver supplied with Cactus is provided by thorn PUGH
- Driver thorn: Sets up grid variables, handles processor decomposition, deals with processor communications
- 1,2,3D (soon n-D) Grid Arrays/Functions
- Uses MPI
- Custom processor decomposition/Load balancing
- Otherwise decomposes in z, then y, then x directions
26 Parallelizing an Application Thorn
All these calls are overloaded by infrastructure thorns:
- CCTK_SyncGroup
  - synchronise ghostzones for a group of grid variables
- CCTK_Reduce
  - call any registered reduction operator, e.g. maximum value over the grid
- CCTK_Interpolate
  - call any registered interpolation operator
- CCTK_MyProc
  - unique processor number within the computation
- CCTK_nProcs
  - total number of processors
- CCTK_Barrier
  - waits for all processors to reach this point
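The notion of a registered reduction operator can be sketched as a lookup table of named operators, each applied first to a processor's local section and then across the per-processor results. This is a single-process illustration under assumed names (`register_reduction`, `reduce_grid`); it does not reproduce the signature of `CCTK_Reduce`.

```python
# Registry of named reduction operators: name -> (local step, combine step).
REDUCTION_OPERATORS = {}


def register_reduction(name, local_op, combine_op):
    """Register an operator under a name so callers can request it by name."""
    REDUCTION_OPERATORS[name] = (local_op, combine_op)


def reduce_grid(name, local_sections):
    """Reduce each local section, then combine the partial results."""
    local_op, combine_op = REDUCTION_OPERATORS[name]
    partials = [local_op(section) for section in local_sections]
    return combine_op(partials)


register_reduction("maximum", max, max)
register_reduction("sum", sum, sum)

# Three simulated processors, each holding its own section of the grid.
sections = [[1, 7], [3, 2], [9, 4]]
print(reduce_grid("maximum", sections))  # 9
print(reduce_grid("sum", sections))      # 26
```

In a real parallel run the combine step would be a communication among processors, which is the part an infrastructure thorn supplies.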
27 Building an executable
- Compiling Cactus involves two stages
  - creating a configuration
  - compiling the source files to an executable
- Configuration:
  - Cactus can be compiled with different compilers, different compilation options, with different lists of thorns, with different external libraries (e.g. MPICH or LAM), and on different architectures.
  - To facilitate this Cactus uses configurations, which store all the distinct information used to build a particular executable (Cactus/configs)
  - Each configuration is given a unique name.
28 Configuration Options
gmake MyConfig-config (or options file)
- Default options decided by autoconf
- Compiler and tool specification e.g.
  - F77=/weirdplace/pgf90
- Compilation and tool flags e.g.
  - CFLAGS=save-temps
  - DEBUG=ALL
- Library and include file specification
- Precision options e.g.
  - REAL_PRECISION=16
29 Configuring with External Packages
- Cactus currently knows about the external packages:
  - MPI (NATIVE, MPICH, LAM, WMPI, CUSTOM)
  - HDF5
  - GRACE
  and will search standard locations for them
  - gmake MyConfig MPI=NATIVE
  - gmake MyConfig MPI=MPICH MPICH_DEVICE=globus GLOBUS_LIB_DIR=/usr/local/globus/lib
  - gmake MyConfig MPI=CUSTOM MPI_LIBS=mpi MPI_LIB_DIRS=/usr/lib MPI_INC_DIRS=/usr/include
30 Compile Options
gmake MyConfig
- Parallel build
  - FJOBS=
  - TJOBS=
- Compilation debugging
  - SILENT=no
- [Compiler warnings]
  - WARN=yes
31 Running Cactus
./exe/cactus_MyConfig MyParameterFile.par
Additional command line options
  -h      help
  -O[v]   details about all parameters
  -o      details about one parameter
  -v      version number, compile date
  -T      list all thorns
  -t      is thorn compiled
  -r      redirect stdout
  -W      reset warning level
  -E      reset error level
32 Parameter Files
- Cactus runs from a user’s parameter file
  - chooses the thorns to be used for the run (so that inactive thorns can’t do any damage)
  - sets parameters which are different from default values

    !desc = "Demonstrates my new application"
    ActiveThorns = "PUGH WaveToyF77 Boundary CartGrid3D"
    driver::global_size = 30          # Change the grid size
    wavetoy::initial_data = "wave"    # Initial data
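To make the structure of such a parameter file concrete, here is a minimal reader for the four lines shown on this slide. It is a simplified sketch, not the Cactus parser: it assumes one `key = value` pair per line, `#` starting a comment outside double quotes, and straight double quotes around strings. The function names are hypothetical.

```python
def strip_comment(line):
    """Drop a trailing '# ...' comment, ignoring '#' inside double quotes."""
    in_quotes = False
    for i, ch in enumerate(line):
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == "#" and not in_quotes:
            return line[:i]
    return line


def parse_parameter_file(text):
    """Return a dict mapping each key to its value as an unquoted string."""
    params = {}
    for raw in text.splitlines():
        line = strip_comment(raw).strip()
        if not line or "=" not in line:
            continue                      # skip blank or malformed lines
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip().strip('"')
    return params


EXAMPLE = '''
!desc = "Demonstrates my new application"
ActiveThorns = "PUGH WaveToyF77 Boundary CartGrid3D"
driver::global_size = 30          # Change the grid size
wavetoy::initial_data = "wave"    # Initial data
'''

params = parse_parameter_file(EXAMPLE)
active_thorns = params["ActiveThorns"].split()
print(active_thorns)                  # ['PUGH', 'WaveToyF77', 'Boundary', 'CartGrid3D']
print(params["driver::global_size"])  # 30
```

Only thorns named in `ActiveThorns` take part in the run, so the list of active thorns is the first thing a reader of a parameter file should look at.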
33 MetaComputing
- Scientists want easy access to available resources
  - Authentication, file systems, batch queues...
- They also want access to many more resources
  - Einstein equations require extreme memory, speed
  - Largest supercomputers too small
  - Want to access multiple supercomputers for large runs
- With AMR etc. will want to acquire resources dynamically during simulation
- Interactive visualization and steering of simulations from anywhere
34 MetaComputing Experiments
- SC93: remote CM-5 simulation with live viz in CAVE
- SC95: Heroic I-Way experiments lead to development of Globus. Cornell SP-2, Power Challenge, with live viz in San Diego CAVE
- SC97: Garching 512 node T3E, launched, controlled, visualized in San Jose
- SC98: HPC Challenge. SDSC, ZIB, and Garching T3E compute collision of 2 Neutron Stars, controlled from Orlando
- SC99: Colliding Black Holes using Garching, ZIB T3E’s, with remote collaborative interaction and viz at ANL and NCSA booths
- April/May 2000: Attempting to use LANL, NCSA, NERSC, SDSC,...ZIB, Garching, … for single simulation
35 Grid Enabled Cactus
- Collaboration between AEI, ANL, U. Chicago, N. Illinois U. to run a 512x512x2048 Black Hole collision
- Cactus + Globus/MPICH-G2
- Machines:
  - 1000 IBM SP2 at SDSC,
  - 512 T3E at NERSC,
  - 1500 Origin 2000 at NCSA,
  - 128 Origin 2000 at ANL.
  - Possibly more
  - Connected via high-speed networks
- Issues: different processor types, memories, operating systems, resource management, varied networks, bandwidths and latencies
36 Cactus + Globus Cactus Application Thorns Distribution information hidden from programmer Initial data, Evolution, Analysis, etc Grid Aware Application Thorns Drivers for parallelism, IO, communication, data mapping PUGH: parallelism via MPI (MPICH-G2, grid enabled message passing library) Grid Enabled Communication Library MPICH-G2 implementation of MPI, can run MPI programs across heterogeneous computing resources Standard MPI Single Proc
37 Remote Steering/Visualization Architecture
38 Coming up
- Thorns written in Java or Perl
- Cactus communication layer
  - Parallel driver thorn (e.g. PUGH) currently provides both variable management and communication …
  - abstract send and receives etc
- Abstract communication from driver thorn
  - easily implement different parallel paradigms
  - shared memory, threads, Corba, OpenMP, PVM,...
- Compact groups (different layout in memory for improved Cache performance)
- Unstructured Meshes/Finite Elements/Spectral Methods
- Unstructured Multigrid Solver
- Convergence/Multiple Coordinate Patches
- Capability browsing mechanism
- Command line interface … connect directly to Cactus, scheduling
- GUIs, Documentation, GUIs, Documentation ….
39
- Documentation
  - IEEE Computer December 1999
  - Users Guide
  - Maintainers Guide
- Download
  - CVS distribution (stable and development versions)
- Development
  - Bugs and feature requests
  - Mailing lists (e.g., )
- Showcase
  - Presentations, publications, movies...
- News, and Links to related institutions, software