Albert-Einstein-Institut
Using Supercomputers to Collide Black Holes
Solving Einstein’s Equations on the Grid
Solving Einstein’s Equations, Black Holes, and Gravitational Wave Astronomy
Cactus, a new community simulation code framework
–Toolkit for any PDE systems, ray tracing, etc...
–Suite of solvers for Einstein and astrophysics systems
Recent Simulations using Cactus
–Black Hole Collisions, Neutron Star Collisions
–Collapse of Gravitational Waves
Grid Computing, remote collaborative tools: what a scientist really wants and needs
Ed Seidel, Albert-Einstein-Institut, MPI-Gravitationsphysik & NCSA/U of IL
What will we be able to do with this new technology?

Einstein’s Equations and Gravitational Waves
This community owes a lot to Einstein...
Einstein’s General Relativity
–Fundamental theory of Physics (Gravity)
–Among the most complex equations of physics
  Dozens of coupled, nonlinear hyperbolic-elliptic equations with 1000’s of terms
–Predict black holes, gravitational waves, etc.
–Barely have the capability to solve them after a century
This is about to change...
Exciting new field about to be born: Gravitational Wave Astronomy
–Fundamentally new information about the Universe
–What are gravitational waves? Ripples in spacetime curvature, caused by matter motion, causing distances to change
A last major test of Einstein’s theory: do they exist?
–Eddington: “Gravitational waves propagate at the speed of thought”
–This is about to change...

Multi-Teraflop Computation, AMR, Elliptic-Hyperbolic Numerical Relativity
Waveforms: We Want to Compute What Actually Happens in Nature...
Can’t do this now, but this is about to change...

Computational Needs for 3D Numerical Relativity: Can’t fulfill them now, but about to change...
Initial Data: 4 coupled nonlinear elliptic equations
Evolution: hyperbolic evolution coupled with elliptic equations
[Figure labels: t=0, t=100]
Multi-TFlop, TByte machine essential
Explicit Finite Difference Codes
–~10^4 Flops/zone/time step
–~100 3D arrays
Require zones or more
–~1000 GBytes
–Double resolution: 8x memory, 16x Flops
Parallel AMR, I/O essential
A code that can do this could be useful to other projects (we said this in all our grant proposals)!
–Last few years devoted to making this useful across disciplines…
–All tools used for these complex simulations available for other branches of science, engineering…
Scientist/engineer wants to know only that!
–But what algorithm? architecture? parallelism?, etc...

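The "8x memory, 16x Flops" rule for doubling the resolution can be checked with a small cost model. This is only a sketch: the 8-byte values and the assumption that the time step shrinks in proportion to the grid spacing are not stated on the slide, and the grid size and step count below are illustrative.

```python
# Rough cost model for an explicit 3D finite-difference code.
# Assumptions (not from the slides): 8-byte double-precision values, and a
# time step that shrinks in proportion to the grid spacing.

BYTES_PER_VALUE = 8          # assumed double precision
ARRAYS = 100                 # ~100 3D arrays (from the slide)
FLOPS_PER_ZONE_STEP = 1e4    # ~10^4 Flops/zone/time step (from the slide)


def memory_bytes(n):
    """Memory for ARRAYS grid functions on an n x n x n grid."""
    return ARRAYS * BYTES_PER_VALUE * n**3


def total_flops(n, steps):
    """Floating-point operations for `steps` time steps on an n^3 grid."""
    return FLOPS_PER_ZONE_STEP * n**3 * steps


def refine(n, steps, factor=2):
    """Grid size and step count after refining the resolution by `factor`.

    Finer spacing means `factor` times more points per dimension and, for an
    explicit scheme, `factor` times more steps to reach the same final time.
    """
    return n * factor, steps * factor


n, steps = 256, 1000                      # illustrative values only
n2, steps2 = refine(n, steps)
print(memory_bytes(n2) / memory_bytes(n))                # 8.0
print(total_flops(n2, steps2) / total_flops(n, steps))   # 16.0
```

The extra factor of 2 in Flops comes entirely from the additional time steps; memory depends only on the number of zones.
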
Grand Challenges: NSF Black Hole and NASA Neutron Star Projects
Creating the momentum for the future...
University of Texas (Matzner, Browne, Choptuik)
NCSA/Illinois/AEI (Seidel, Saylor, Smarr, Shapiro, Saied)
North Carolina (Evans, York)
Syracuse (G. Fox)
Cornell (Teukolsky)
Pittsburgh (Winicour)
Penn State (Laguna, Finn)
NCSA/Illinois/AEI (Saylor, Seidel, Swesty, Norman)
Argonne (Foster)
Washington U (Suen)
Livermore (Ashby)
Stony Brook (Lattimer)
NEW! EU Network
Entire community about to become Grid-enabled...

Cactus
New concept in community-developed simulation code infrastructure
Developed in response to the needs of these projects
Numerical/computational infrastructure to solve PDE’s
Freely available, Open Source community framework: spirit of GNU/Linux
–Many communities contributing to Cactus
Cactus divided into “Flesh” (core) and “Thorns” (modules or collections of subroutines)
–Flesh, written in C, glues together various components
–Multilingual: User apps can be Fortran, C, C++; automated interface between them
Abstraction: Cactus Flesh provides API for virtually all CS type operations
–Driver functions (storage, communication between processors, etc)
–Interpolation
–Reduction
–IO (traditional, socket based, remote viz and steering…)
–Checkpointing, coordinates
–Etc, etc…
Cactus is a Grid-enabling application middleware...

How to use Cactus Features
Application scientist usually concentrates on the application...
–Performance
–Algorithms
–Logically: Operations on a grid (structured or unstructured (coming…))
...Then takes advantage of parallel API features enabled by Cactus
–IO, Data streaming, remote visualization/steering, AMR, MPI, checkpointing, Grid Computing, etc…
–Abstraction allows one to switch between different MPI, PVM layers, different I/O layers, etc, with no or minimal changes to application!
(nearly) All architectures supported and autoconfigured
–Common to develop on laptop (no MPI required); run on anything
–Compaq / SGI Origin 2000 / T3E / Linux clusters + laptops / Hitachi / NEC / HP / Windows NT / SP2 / Sun
Metacode Concept
–Very, very lightweight, not a huge framework (not Microsoft Office)
–User specifies desired code modules in configuration files
–Desired code generated, automatic routine calling sequences, syntax checking, etc…
–You can actually read the code it creates...

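The idea of switching layers without touching the application can be illustrated with a toy registry. Everything here is hypothetical (the class `Flesh`, the thorn names `IOText` and `IOSummary`, the role string `"io"`); it is a sketch of the abstraction pattern, not the Cactus API.

```python
# Toy illustration only: all names are hypothetical, not the Cactus API.

class Flesh:
    """Minimal core: keeps a registry of thorns and dispatches by role."""

    def __init__(self):
        self._available = {}   # thorn name -> (role, callable)
        self._active = {}      # role -> callable currently selected

    def register(self, name, role, func):
        self._available[name] = (role, func)

    def configure(self, thorn_names):
        """Activate the thorns named in a (here in-memory) configuration."""
        for name in thorn_names:
            role, func = self._available[name]
            self._active[role] = func

    def call(self, role, *args):
        return self._active[role](*args)


def io_text(data):
    """One possible I/O layer: comma-separated values."""
    return ",".join(str(x) for x in data)


def io_summary(data):
    """Another I/O layer: a short summary instead of the full data."""
    return f"n={len(data)} min={min(data)} max={max(data)}"


def evolve(flesh, state, steps):
    """Application code: it never names a concrete I/O layer."""
    outputs = []
    for _ in range(steps):
        state = [0.5 * x for x in state]
        outputs.append(flesh.call("io", state))
    return outputs


flesh = Flesh()
flesh.register("IOText", "io", io_text)
flesh.register("IOSummary", "io", io_summary)

flesh.configure(["IOText"])
print(evolve(flesh, [4.0, 8.0], 1))   # ['2.0,4.0']
flesh.configure(["IOSummary"])
print(evolve(flesh, [4.0, 8.0], 1))   # ['n=2 min=2.0 max=4.0']
```

Only the configuration list changes between the two runs; `evolve` is identical in both.
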
Modularity of Cactus...
[Diagram labels: Application 1; Application 2...; Sub-app; Legacy App 2; Cactus Flesh; Abstractions...; AMR (GrACE, etc); MPI layer 3; I/O layer 2; Unstructured...; Remote Steer 2; MDS/Remote Spawn; Globus Metacomputing Services]
User selects desired functionality… Code created...

Computational Toolkit: provides parallel utilities (thorns) for computational scientist
Cactus is a framework or middleware for unifying and incorporating code from Thorns developed by the community
–Choice of parallel library layers (Native MPI, MPICH, MPICH-G(2), LAM, WMPI, PACX and HPVM)
–Various AMR schemes: Nested Boxes, GrACE, Coming: HLL, Chombo, SAMRAI, ???
–Parallel I/O (Panda, FlexIO, HDF5, etc…)
–Parameter Parsing
–Elliptic solvers (PETSc, Multigrid, SOR, etc…)
–Visualization Tools, Remote steering tools, etc…
–Globus (metacomputing/resource management)
–Performance analysis tools (Autopilot, PAPI, etc…)
–Remote visualization and steering
–INSERT YOUR CS MODULE HERE...
[Logos: PAPI, GrACE/DAGH]

High performance: Full 3D Einstein Equations solved on NCSA NT Supercluster, Origin 2000, T3E
Excellent scaling on many architectures
–Origin up to 256 processors
–T3E up to 1024
–NCSA NT cluster up to 128 processors
Achieved 142 Gflop/s on 1024-node T3E-1200 (benchmarked for NASA NS Grand Challenge)
Scaling to thousands of processors possible, necessary...
But, of course, we want much more… Grid Computing

Cactus Community Development Projects
[Diagram labels: Geophysics (Bosl); Numerical Relativity Community; Cornell Crack prop.; NASA NS GC; Livermore; SDSS (Szalay); Intel; Microsoft; Clemson; “Egrid”; NCSA, ANL, SDSC; AEI Cactus Group (Allen); NSF KDI (Suen); EU Network (Seidel); Astrophysics (Zeus); US Grid Forum; DLR; DFN Gigabit (Seidel); “GRADS” (Kennedy, Foster, Dongarra, et al); ChemEng (Bishop); San Diego, GMD, Cornell; Berkeley]

The Relativity Community takes a step forward
Great progress, but computers still too small
Biggest computations ever: 256 proc O2K at NCSA
225,000 SU’s, 1 TByte Output Data in a Few Weeks
Neutron Stars
–Developing capability to do full GR hydro
–Now can follow full orbits!
Black Holes (prime source for GW)
–Increasingly complex collisions: now doing full 3D grazing collisions
Gravitational Waves
–Study linear waves as testbeds
–Move on to fully nonlinear waves
–Interesting Physics: BH formation in full 3D!

Evolving Pure Gravitational Waves
Einstein’s equations are nonlinear, so low amplitude waves just propagate away, but large amplitude waves may…
–Collapse on themselves under their own self-gravity and actually form black holes
Use numerical relativity: Probe GR in highly nonlinear regime
–Form BH?, Critical Phenomena in 3D?, Naked singularities?
–… Little known about generic 3D behavior
Take “Lump of Waves” and evolve
–Large amplitude: get BH to form!
–Below critical value: disperses and can evolve “forever” as system returns to flat space
We are seeing hints of critical phenomena, known from nonlinear dynamics
But need much more power to explore details, discover new physics...

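One common way to home in on a critical value that separates collapse from dispersal is bisection on the wave amplitude. The sketch below assumes a predicate `forms_black_hole(amplitude)` that stands for a full evolution; here it is replaced by a made-up threshold (`TOY_CRITICAL`), so the numbers carry no physical meaning, and the slides do not say that this is the search strategy actually used.

```python
def find_critical_amplitude(forms_black_hole, low, high, tol=1e-6):
    """Bisect between a dispersing run (low) and a collapsing run (high)."""
    if forms_black_hole(low) or not forms_black_hole(high):
        raise ValueError("need a subcritical `low` and a supercritical `high`")
    while high - low > tol:
        mid = 0.5 * (low + high)
        if forms_black_hole(mid):
            high = mid      # collapse: critical value lies below mid
        else:
            low = mid       # dispersal: critical value lies at or above mid
    return 0.5 * (low + high)


# Stand-in for an expensive 3D evolution: a made-up threshold, for the demo only.
TOY_CRITICAL = 0.8125


def toy_run(amplitude):
    return amplitude > TOY_CRITICAL


a_star = find_critical_amplitude(toy_run, 0.0, 2.0)
print(abs(a_star - TOY_CRITICAL) < 1e-6)   # True
```

Each bisection step costs one full simulation, which is one reason exploring the details near the critical value needs so much computing power.
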
Comparison: sub vs. super-critical solutions
Newman-Penrose Ψ4 (showing gravitational waves) with lapse underneath
Subcritical: no BH forms
Supercritical: BH forms!

Numerical Black Hole Evolutions
Binary IVP: Multiple Wormhole Model, other models
Black Holes good candidates for Gravitational Wave Astronomy
–~3 events per year within 200 Mpc
–Very strong sources
–But what are the waveforms? GW astronomers want to know!
[Figure labels: S1, S2, P1, P2]

First Step: Full 3D Numerical Evolution
Head-on, Equal Mass, BH Collisions (Misner Data) in 3D: 512 node CM-5
Event Horizon shown in green.
(representing gravitational waves) shown in blue-yellow

First 3D “Grazing Collision of 2 Black Holes”
Big Step: Spinning, “orbiting”, unequal mass BHs merging.
Evolution of waves
Horizon merger
Alcubierre et al results
384^3, 100GB simulation, largest production relativity simulation
256 Processor Origin 2000 at NCSA, ~500GB output data

Future view: much of it here already...
Scale of computations much larger
–Complexity approaching that of Nature
–Simulations of the Universe and its constituents
  Black holes, neutron stars, supernovae
  Airflow around advanced planes, spacecraft
  Human genome, human behavior
Teams of computational scientists working together
–Must support efficient, high level problem description
–Must support collaborative computational science
–Must support all different languages
Ubiquitous Grid Computing
–Very dynamic simulations, deciding their own future
–Apps find the resources themselves: distributed, spawned, etc...
–Must be tolerant of dynamic infrastructure (variable networks, processor availability, etc…)
–Monitored, viz’ed, controlled from anywhere, with colleagues anywhere else...

Our Team Requires Grid Technologies, Big Machines for Big Runs
[Map labels: WashU, NCSA, Hong Kong, AEI, ZIB, Thessaloniki, Paris]
How Do We:
–Maintain/develop Code?
–Manage Computer Resources?
–Carry Out/monitor Simulation?

What we need and want in simulation science: a higher level Portal to provide the following...
Got idea? Configuration manager: Write Cactus module, link to other modules, and…
Find resources
–Where? NCSA, SDSC, Garching, Boeing…???
–How many computers? Distribute Simulations?
–Big jobs: “Fermilab” at disposal: must get it right while the beam is on!
Launch Simulation
–How do we get the executable there?
–How to store data?
–What are the local queue structure/OS idiosyncrasies?
Monitor the simulation
–Remote Visualization live while running
  Limited bandwidth: compute viz. inline with simulation
  High bandwidth: ship data to be visualized locally
–Visualization server: all privileged users can log in and check status/adjust if necessary
  Are parameters screwed up? Very complex!
  Call in an expert colleague…let her watch it too
–Performance: how efficient is my simulation? Should something be adjusted?
Steer the simulation
–Is memory running low? AMR! What to do? Refine selectively or acquire additional resources via Globus? Delete unnecessary grids? Performance steering...
Postprocessing and analysis
–1 TByte output at NCSA, research groups in St. Louis and Berlin…how to deal with this?

A Portal to Computational Science: The Cactus Collaboratory
[Diagram: Cactus Computational Toolkit: Science, Autopilot, AMR, PETSc, HDF, MPI, GrACE, Globus, Remote Steering...]
1. User has science idea
Selects Appropriate Resources
Collaborators log in to monitor
Steers simulation, monitors performance
Composes/Builds Code Components w/Interface...
Want to integrate and migrate this technology to the generic user…

Grid-Enabled Cactus (static version)
Cactus and its ancestor codes have been using Grid infrastructure since 1993 (part of famous I-Way of SC’95)
Support for Grid computing was part of the design requirements
Cactus compiles “out-of-the-box” with Globus [using globus device of MPICH-G(2)]
Design of Cactus means that applications are unaware of the underlying machine/s that the simulation is running on … applications become trivially Grid-enabled
Infrastructure thorns (I/O, driver layers) can be enhanced to make most effective use of the underlying Grid architecture

Grid Applications so far...
SC93 - SC2000
Typical scenario
–Find remote resource (often using multiple computers)
–Launch job (usually static, tightly coupled)
–Visualize results (usually in-line, fixed)
Need to go far beyond this
–Make it much, much easier
  Portals, Globus, standards
–Make it much more dynamic, adaptive, fault tolerant
–Migrate this technology to general user
Metacomputing the Einstein Equations: Connecting T3E’s in Berlin, Garching, San Diego

Dynamic Distributed Computing
Static grid model works only in special cases; must make apps able to respond to changing Grid environment...
Make use of
–Running with management tools such as Condor, Globus, etc.
–Service providers (Entropia, etc)
Code as Information server, manager
–Scripting thorns (management, launching new jobs, etc)
–Dynamic use of MDS for finding available resources: code decides where to go, what to do next!
Applications
–Portal for simulation launching and management
–Intelligent parameter surveys (Cactus control thorn)
–Spawning off independent jobs to new machines e.g. analysis tasks
–Dynamic staging … seeking out and moving to faster/larger/cheaper machines as they become available (Cactus worm)
–Dynamic load balancing (e.g. inhomogeneous loads, multiple grids)
–Etc…many new computing paradigms

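"Code decides where to go" can be pictured as a small decision rule: move only if the time saved on a faster machine outweighs the cost of migrating. The sketch below is an assumption-laden illustration; the `Resource` record, the machine names and the fixed migration cost are hypothetical, and no real MDS or Globus call is shown.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical record, as might be returned by a resource directory."""
    name: str
    gflops: float    # sustained speed the code expects on this machine
    free: bool       # whether the machine is available right now


def remaining_time(work_gflop, gflops):
    """Seconds needed to finish `work_gflop` of work at `gflops` Gflop/s."""
    return work_gflop / gflops


def choose_next(current, candidates, work_gflop, migration_cost_s):
    """Return the resource to continue on; migrate only if it pays off."""
    best = current
    best_time = remaining_time(work_gflop, current.gflops)
    for r in candidates:
        if not r.free:
            continue
        t = migration_cost_s + remaining_time(work_gflop, r.gflops)
        if t < best_time:
            best, best_time = r, t
    return best


here = Resource("site-A", 10.0, True)
others = [
    Resource("site-B", 40.0, True),
    Resource("site-C", 100.0, False),   # fastest, but busy
]

# Long job: 3600 s if we stay, 900 s + 600 s migration on site-B.
print(choose_next(here, others, 36000.0, 600.0).name)   # site-B
# Short job: 600 s if we stay, 150 s + 600 s migration on site-B.
print(choose_next(here, others, 6000.0, 600.0).name)    # site-A
```
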
Remote Visualization and Steering
[Diagram labels: Remote Viz data; HTTP; HDF5; Amira; OpenDX; Any Viz Client]
Changing any steerable parameter
–Parameters
–Physics, algorithms
–Performance
IsoSurfaces and Geodesics
–Computed inline with simulation
–Only geometry sent across network
Arbitrary Grid Functions
–Streaming HDF5

Remote Offline Visualization
[Diagram labels: Viz Client (Amira); HDF5 VFD; DataGrid (Globus); DPSS; FTP; HTTP; Visualization Client; DPSS Server; FTP Server; Web Server; Remote Data Server]
Downsampling, hyperslabs
Viz in Berlin
4TB distributed across NCSA/ANL/Garching
Only what is needed

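"Only what is needed" is what hyperslabs and downsampling provide: the client asks for a strided sub-block rather than the whole dataset. The sketch below shows the selection logic on plain nested lists, using the usual start/stride/count description of a hyperslab. It is an illustration only and does not call HDF5; the function name `hyperslab` and the 8x8 test grid are made up for the example.

```python
def hyperslab(data, start, stride, count):
    """Select count[d] points from start[d] with step stride[d] in each
    dimension d of a nested list."""
    def select(block, dim):
        idx = [start[dim] + i * stride[dim] for i in range(count[dim])]
        if dim == len(start) - 1:
            return [block[i] for i in idx]
        return [select(block[i], dim + 1) for i in idx]
    return select(data, 0)


# An 8x8 grid whose value encodes its position: 10*row + column.
grid = [[10 * i + j for j in range(8)] for i in range(8)]

# A hyperslab: rows 2 and 4, columns 4..6.
print(hyperslab(grid, start=(2, 4), stride=(2, 1), count=(2, 3)))
# [[24, 25, 26], [44, 45, 46]]

# Downsampling by 4 in each direction is a hyperslab with stride 4.
print(hyperslab(grid, start=(0, 0), stride=(4, 4), count=(2, 2)))
# [[0, 4], [40, 44]]
```

In the second call only 4 of the 64 values would need to cross the network.
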
Grand Picture
Remote steering and monitoring from airport
Origin: NCSA
Remote Viz in St Louis
T3E: Garching
Simulations launched from Cactus Portal
Grid enabled Cactus runs on distributed machines
Remote Viz and steering from Berlin
Viz of data from previous simulations in SF café
[Diagram labels: DataGrid/DPSS; Downsampling; Globus; http; HDF5; IsoSurfaces]

Egrid and Grid Forum Activities
Grid Forum:
–Developed in US over last 18 months
–~200 members (?)
–Meeting every 3-4 months
–Many working groups discussing grid software, standards, grid techniques, scheduling, applications, etc.
Egrid:
–European initiative now 6 months old
–About 2 dozen sites in Europe
–Similar goals, but with European identity
Next meeting: Oct in Boston
We hope to enlist many more application groups to drive Grid Development
–Cactus Grid Application Development Toolkit

Present Testbeds (a sampling…)
Cactus Virtual Machine Room
–Small version of Alliance VMR with European sites (NCSA, ANL, UNM, AEI, ZIB)
–Portal allows access for users to all machines, queues, etc, without knowing local passwords, batch systems, file systems, OS, etc...
–Developed in collaboration with NSF KDI project, AEI, DFN-Verein
–Built on Globus services
–Will copy and develop Egrid version, hopefully tomorrow...
Egrid Demos
–Developing Cactus Worm demo for SC2000!
–Cactus simulation runs, queries MDS, finds next resource, migrates itself to next site, runs, and continues around Europe, with continuous remote viz and control...
Big Distributed Simulation
–Old static model: harness as many supercomputers as possible
–Go for a Tflop, even with tightly coupled simulation distributed across continents
–Developing techniques to make bandwidth/latency tolerant simulations...

Further details... Cactus – – Movies, research overview (needs major updating) – Simulation Collaboratory/Portal Work: – Remote Steering, high speed networking – – EU Astrophysics Network –