ChaNGa CHArm N-body GrAvity

Thomas Quinn, Graeme Lufkin, Joachim Stadel, James Wadsley, Greg Stinson, Laxmikant Kale, Filippo Gioachin, Pritish Jetley, Celso Mendes, Amit Sharma, Lukasz Wesolowski, Edgar Solomonik, Orion Lawlor

Outline
● Scientific background
  – Cosmology and fundamental questions
  – Galaxy catalogs and simulations
  – Simulation challenges
● Charm++ and those challenges
  – Previous state of the art: GASOLINE
  – AMPI and GASOLINE
  – Charm++ and the send paradigm
  – CkCache, etc.
● Future challenges

Cosmology at 130,000 years (image courtesy NASA/WMAP)

Results from CMB

Cosmology at 13.6 Gigayears

... is not so simple

Computational Cosmology
● The CMB has fluctuations of order 1e-5
● Galaxies are overdense by a factor of ~1e7
● The growth happens (mostly) through gravitational collapse
● Making testable predictions from a cosmological hypothesis requires
  – A non-linear, dynamic calculation
  – e.g. a computer simulation

Simulating galaxies: Procedure
1. Simulate a 100 Mpc volume at kpc resolution.
2. Pick candidate galaxies for further study.
3. Resimulate those galaxies with the same large-scale structure, but at higher resolution, with lower resolution in the rest of the computational volume.
4. At the higher resolutions, include gas physics and star formation.

(Figure: panels showing gas, stars, and dark matter)

Dwarf galaxy simulated to the present (i-band image)
Reproduces:
● Light profile
● Mass profile
● Star formation
● Angular momentum

Galactic structure in the local Universe: What’s needed
● 1 million particles per galaxy for proper morphology/heavy-element production
● 800 M core-hours
● Necessary for:
  – Comparing with Hubble Space Telescope surveys of the local Universe
  – Interpreting HST images of high-redshift galaxies

Large Scale Structure: What’s needed
● 700 Megaparsec volume for a “fair sample” of the Universe
● 18 trillion core-hours (~ an exaflop-year)
● Necessary for:
  – Interpreting future surveys (LSST)
  – Relating the Cosmic Microwave Background to galaxy surveys
● Compare the exaflop example from P. Jetley: 200 Mpc volume

Computational Challenges
● Large spatial dynamic range: > 100 Mpc down to < 1 kpc
  – A hierarchical, adaptive gravity solver is needed
● Large temporal dynamic range: 10 Gyr down to < 1 Myr
  – A multiple-timestep algorithm is needed (rung sketch below)
● Gravity is a long-range force
  – Hierarchical information needs to cross processor domains
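To make the multiple-timestep point concrete, here is a minimal C++ sketch assuming power-of-two timestep "rungs", where rung r advances with a step of dtBase / 2^r; the names dtWanted, dtBase, and maxRung are illustrative, not ChaNGa's actual identifiers:

    #include <algorithm>
    #include <cmath>

    // Hedged sketch: assign a particle to a power-of-two timestep "rung".
    // Rung r integrates with dt = dtBase / 2^r, so slowly evolving particles
    // land on rung 0 and dense, fast-evolving regions on high rungs.
    int chooseRung(double dtWanted, double dtBase, int maxRung) {
        if (dtWanted >= dtBase) return 0;          // can take the full step
        int rung = (int)std::ceil(std::log2(dtBase / dtWanted));
        return std::min(rung, maxRung);            // clamp to finest rung
    }

Particles on high rungs are integrated many times per large step, which is precisely what makes domain decomposition and load balancing across timesteps hard (see the multistep load balancer slide later).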

Basic Gravity Algorithm
● Newtonian gravity interaction
  – Each particle is influenced by all others: an O(n²) algorithm
● Barnes-Hut approximation: O(n log n)
  – Influence from distant particles is combined into a center of mass
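As an illustration of the Barnes-Hut idea, a minimal C++ sketch follows; the data structure and the opening-angle parameter theta are assumptions for illustration, and ChaNGa's actual solver uses higher-order multipole expansions rather than this monopole-only version:

    #include <cmath>
    #include <vector>

    // Minimal Barnes-Hut sketch (monopole only, G = 1): a node whose angular
    // size seen from the particle is below theta is used whole; otherwise it
    // is "opened" and its children are visited recursively.
    struct Node {
        double com[3];                 // center of mass of the node
        double mass;                   // total mass in the node
        double size;                   // linear extent of the node
        std::vector<Node*> children;   // empty for a leaf
    };

    void accumulateGravity(const Node* n, const double pos[3],
                           double acc[3], double theta) {
        double dx = n->com[0] - pos[0];
        double dy = n->com[1] - pos[1];
        double dz = n->com[2] - pos[2];
        double r = std::sqrt(dx*dx + dy*dy + dz*dz);
        if (r > 0 && (n->children.empty() || n->size / r < theta)) {
            double f = n->mass / (r * r * r);   // monopole acceleration
            acc[0] += f * dx; acc[1] += f * dy; acc[2] += f * dz;
        } else {
            for (const Node* c : n->children)   // open the node
                accumulateGravity(c, pos, acc, theta);
        }
    }

Walking this tree once per particle costs O(log n) node visits on average, giving the quoted O(n log n) total.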

Legacy Code: PKDGRAV/GASOLINE
● Originally implemented on the KSR2
  – Ported to: PVM, pthreads, MPI, the T3D, and Charm++
● k-D tree domain decomposition/load balancing
● Software cache: latency amortization

PKDGRAV/GASOLINE Issues
● Load balancing creates extra work and systematic errors
● Multistep domain decomposition
● Latency amortization, but not latency hiding, via the software cache
  – A fast network is required
  – SPH scaling is poor
● Porting: MPI became the standard platform

Clustering and Load Balancing

Charm++ features
● “Automatic”, measurement-based load balancing
● Natural overlap of computation and communication
● Not hardwired to a given data structure
● Object-oriented: reuse of existing code
● Portable
● NAMD: molecular dynamics is similar
● Approachable group!

Building a Treecode in Charm++: Porting GASOLINE
● AMPI port of GASOLINE
  – Very straightforward
  – Adding virtual processors gave poor performance: separate caches increased communication
● Charm++ port of GASOLINE
  – Good match to the RMI design
  – Charm++ allowed some minor speed improvements
  – Still, more than one element per processor does not work well

Building a Treecode in Charm++: Starting afresh
● Follow the Charm++ paradigm: send particle data as the walk crosses processor boundaries
● Very large number of messages
● Back to the software cache (conceptual sketch below)
(Figure: user view)
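The following is a conceptual sketch of that software cache in plain C++; the names are hypothetical and this is not the real CkCache API. The pattern: a tree walk that misses on remote data registers a continuation and yields, at most one fetch message is sent per node, and the reply resumes every walk waiting on it.

    #include <functional>
    #include <unordered_map>
    #include <vector>

    // Conceptual software-cache sketch (hypothetical names, not CkCache).
    struct CacheEntry {
        bool ready = false;
        const void* data = nullptr;                  // fetched remote node data
        std::vector<std::function<void()>> waiters;  // suspended tree walks
    };

    class SoftwareCache {
        std::unordered_map<long, CacheEntry> table;  // key: global node id
    public:
        // Returns cached data on a hit; on a miss, queues the walk's
        // continuation, sends at most one fetch per node, returns nullptr.
        const void* request(long key, std::function<void()> resumeWalk) {
            CacheEntry& e = table[key];
            if (e.ready) return e.data;              // hit: keep walking
            bool firstMiss = e.waiters.empty();
            e.waiters.push_back(std::move(resumeWalk));
            if (firstMiss) sendFetchMessage(key);    // one message per node
            return nullptr;
        }
        // Called when the remote reply arrives: fill entry, resume walks.
        void deliver(long key, const void* data) {
            CacheEntry& e = table[key];
            e.ready = true;
            e.data = data;
            for (auto& w : e.waiters) w();           // resume waiting walks
            e.waiters.clear();
        }
    private:
        void sendFetchMessage(long key) {
            (void)key;  // in Charm++ this would be an asynchronous entry-method call
        }
    };

This collapses many fine-grained particle messages into one request/reply per tree node, restoring message aggregation while keeping the asynchronous, send-driven style.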

Overall Algorithm

ChaNGa Features
● Tree-based gravity solver
● High-order multipole expansion
● Periodic boundaries (if needed)
● Individual, multiple timesteps
● Dynamic load balancing with a choice of strategies
● Checkpointing (via migration to disk)
● Visualization

Zoom-in Scaling

Multistep Load Balancer
● Uses the Charm++ measurement-based load balancer
● Modification: provide the LB database with information about timestepping (conceptual sketch below)
  – “Large timestep”: balance based on the previous large step
  – “Small step”: balance based on the previous small step
  – Maintains the principle of persistence
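A conceptual C++ sketch of that idea follows; the class, the greedy strategy, and all names are illustrative choices, since the real balancer plugs into the Charm++ LB framework and its database. The point it demonstrates: loads are recorded per phase (rung), and rebalancing for a phase consults only the last run of that same phase.

    #include <algorithm>
    #include <map>
    #include <queue>
    #include <vector>

    // Hedged sketch of phase-aware balancing (not the Charm++ LB API).
    class PhaseAwareBalancer {
        // loads[phase][i] = measured load of object i in the last run of phase
        std::map<int, std::vector<double>> loads;
    public:
        void record(int phase, std::vector<double> objectLoads) {
            loads[phase] = std::move(objectLoads);
        }
        // Greedy longest-processing-time assignment of objects to processors,
        // using only measurements from the matching previous phase.
        std::vector<int> assign(int phase, int nProcs) {
            const std::vector<double>& w = loads.at(phase);
            std::vector<int> order(w.size()), owner(w.size());
            for (size_t i = 0; i < order.size(); ++i) order[i] = (int)i;
            std::sort(order.begin(), order.end(),
                      [&](int a, int b) { return w[a] > w[b]; });
            // min-heap of (accumulated load, processor)
            std::priority_queue<std::pair<double, int>,
                                std::vector<std::pair<double, int>>,
                                std::greater<>> procs;
            for (int p = 0; p < nProcs; ++p) procs.push({0.0, p});
            for (int obj : order) {               // heaviest object first
                auto [load, p] = procs.top();
                procs.pop();
                owner[obj] = p;
                procs.push({load + w[obj], p});
            }
            return owner;                          // owner[i] = processor of i
        }
    };

Keeping a separate load history per rung is what preserves the principle of persistence: a small step resembles the previous small step far more than it resembles the large step that preceded it.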

Results on a 3-rung example (figure; wall-clock times: 613 s, 429 s, 228 s)

Multistep Scaling

Smooth Particle Hydrodynamics
● Making testable predictions needs Gastrophysics
  – High Mach numbers
  – Large density contrasts
● Gridless, Lagrangian method
● Galilean invariant
● Monte Carlo method for solving the Navier-Stokes equations
● Natural extension of the particle method for gravity (density estimate below)
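For reference, the standard SPH density estimate (a textbook formula, not specific to this code) sums each neighbor's mass weighted by a smoothing kernel:

    \rho_i = \sum_j m_j \, W\left( \lvert \mathbf{r}_i - \mathbf{r}_j \rvert, h_i \right)

where W is a compactly supported smoothing kernel and h_i is the smoothing length, typically set so that a fixed number k of neighbors fall within the kernel support, which is why the k-nearest-neighbor search on the next slide dominates the data access pattern.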

SPH Challenges
● Increased density contrasts/time stepping
● The k-nearest-neighbor problem
  – Trees! (search sketch below)
● More data per particle than gravity
● Less computation than gravity
● Latency is much more noticeable
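As an illustration of the tree-based neighbor search, here is a generic k-d tree sketch in C++ with illustrative structures, not ChaNGa's code: any subtree whose bounding box lies farther than the current k-th nearest distance is pruned.

    #include <algorithm>
    #include <array>
    #include <cmath>
    #include <queue>
    #include <vector>

    // Hedged sketch: gather the k nearest neighbors with a spatial tree,
    // as used to set SPH smoothing lengths.
    struct KdNode {
        double lo[3], hi[3];            // axis-aligned bounding box
        std::vector<int> particles;     // particle indices (leaves only)
        KdNode *left = nullptr, *right = nullptr;
    };

    // Squared distance from point p to the node's bounding box.
    double boxDist2(const KdNode* n, const double p[3]) {
        double d2 = 0;
        for (int a = 0; a < 3; ++a) {
            double d = std::max({n->lo[a] - p[a], p[a] - n->hi[a], 0.0});
            d2 += d * d;
        }
        return d2;
    }

    void knn(const KdNode* n, const double p[3],
             const std::vector<std::array<double, 3>>& pos, size_t k,
             std::priority_queue<std::pair<double, int>>& best) {  // max-heap
        if (!n || (best.size() == k && boxDist2(n, p) >= best.top().first))
            return;                                 // prune: box too far away
        for (int i : n->particles) {                // leaf: test particles
            double d2 = 0;
            for (int a = 0; a < 3; ++a)
                d2 += (pos[i][a] - p[a]) * (pos[i][a] - p[a]);
            best.push({d2, i});
            if (best.size() > k) best.pop();        // keep only k closest
        }
        knn(n->left, p, pos, k, best);
        knn(n->right, p, pos, k, best);
    }

After the call, the heap holds the squared distances and indices of the k nearest particles; when neighbors live on another processor's tree piece, the same software-cache machinery as for gravity fetches them, which is where the latency sensitivity comes from.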

SPH Scaling

Ethernet scaling

Current uses
● Large-scale structure
  – Dynamics of gas in galaxy clusters
  – Galaxy formation in the local Universe
● Galactic dynamics
  – Formation of nuclear star clusters
  – Disk heating from substructure
● Protoplanetary disks
  – Thermodynamics and radiative transfer

Future
● More physics
  – Cooling/star-formation recipes
  – Charm++ allows reuse of PKDGRAV code
● Better gravity algorithms
  – New domain decomposition/load balancing strategies
  – Multicore/heterogeneous machines
● Other astrophysical problems
  – Planet formation
  – Planetary rings

Charm++ features: reprise
● “Automatic”, measurement-based load balancing
  – But it needs thought and work
● Migration to GPGPU and SMP
● Object-oriented: reuse of existing code
● Approachable group
  – Enhance Charm++ to solve our problems

Summary
● Cosmological simulations pose challenges for parallel implementations
  – Non-local data dependencies
  – Hierarchical in space and time
● ChaNGa has been successful in addressing these challenges using Charm++ features
  – Message priorities
  – New load balancers