ChaNGa: CHArm N-body GrAvity. Thomas Quinn, Graeme Lufkin, Joachim Stadel, James Wadsley, Laxmikant Kale, Filippo Gioachin, Pritish Jetley, Celso Mendes, Amit Sharma.

Presentation transcript:

ChaNGa CHArm N-body GrAvity

Thomas Quinn, Graeme Lufkin, Joachim Stadel, James Wadsley, Laxmikant Kale, Filippo Gioachin, Pritish Jetley, Celso Mendes, Amit Sharma, Lukasz Wesolowski, Edgar Solomonik

Outline ● Scientific background – How to build a Galaxy – Types of Simulations – Simulation Challenges ● ChaNGa and those Challenges – Features – Tree gravity – Load balancing – Multistepping ● Future Challenges

Image courtesy NASA/WMAP Cosmology: How does this...

... turn into this?

Computational Cosmology ● The CMB has fluctuations of ~1e-5 ● Galaxies are overdense by ~1e7 ● The growth happens (mostly) through gravitational collapse ● Making testable predictions from a cosmological hypothesis requires – a non-linear, dynamic calculation – e.g. a computer simulation

Simulation process ● Start with fluctuations based on Dark Matter properties ● Follow model analytically (good enough to get CMB) ● Create a realization of these fluctuations in particles. ● Follow the motions of these particles as they interact via gravity. ● Compare final distribution of particles with observed properties of galaxies.

Simulating galaxies: Procedure 1. Simulate 100 Mpc volume at kpc resolution 2. Pick candidate galaxies for further study 3. Resimulate galaxies with same large scale structure but with higher resolution, and lower resolution in the rest of the computational volume. 4. At higher resolutions, include gas physics and star formation.

Gas Stars Dark Matter

Parallel Programming, UIUC, 9/29/2016. Types of simulations: Zoom In, “Uniform” Volume, Isolated Cluster/Galaxy

Star Formation History: What’s needed ● 1 Million particles/galaxy for proper morphology/heavy element production ● 1 Petaflop-week of computation ● Necessary for: – Comparing with Hubble Space Telescope surveys of the local Universe – Interpreting HST images of high redshift galaxies

Large Scale Structure: What’s needed ● 6.5 Gigaparsec volume ● 10 Trillion particles (1 Petabyte of RAM) ● 1 Petaflop week of computation ● Necessary for: – Interpreting future surveys (LSST) – Relating Cosmic Microwave Background to galaxy surveys

Halo Simulations: What's needed ● Billions of particles in a single halo ● Dark Matter detection experiments ● Influence on disk ● Theories of gravitational collapse (Insight!)

Computational Challenges ● Large spatial dynamic range: > 100 Mpc to < 1 kpc – A hierarchical, adaptive gravity solver is needed ● Large temporal dynamic range: 10 Gyr to 1 Myr – A multiple timestep algorithm is needed ● Gravity is a long range force – Hierarchical information needs to go across processor domains

Basic Gravity algorithm... ● Newtonian gravity interaction – Each particle is influenced by all others: O(n²) algorithm ● Barnes-Hut approximation: O(n log n) – Influence from distant particles combined into their center of mass
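The Barnes-Hut idea above can be sketched in a few lines. This is an illustrative 1-D toy, not ChaNGa's implementation; the value of G, the particle data, and the function names are all invented for the example. Replacing a distant group of particles by its combined mass at the center of mass gives nearly the same force at a fraction of the cost.

```python
import math

G = 1.0  # gravitational constant in code units (assumption for the toy)

def direct_force(x, xs, ms):
    """O(n) per target: sum Newtonian attractions from every source particle."""
    f = 0.0
    for xj, mj in zip(xs, ms):
        dx = xj - x
        f += G * mj / (dx * dx) * math.copysign(1.0, dx)
    return f

def monopole_force(x, xs, ms):
    """Barnes-Hut-style approximation: one interaction with the center of
    mass of the whole source group (its monopole moment)."""
    M = sum(ms)
    xcm = sum(mj * xj for mj, xj in zip(ms, xs)) / M
    dx = xcm - x
    return G * M / (dx * dx) * math.copysign(1.0, dx)

# A distant clump of particles, seen from a target at the origin:
xs = [100.0, 100.5, 101.0]
ms = [1.0, 2.0, 1.0]
exact = direct_force(0.0, xs, ms)
approx = monopole_force(0.0, xs, ms)
# For well-separated groups the two agree to a fraction of a percent.
```

A real tree code applies this recursively, opening a tree node into its children whenever the group is too close (the opening-angle criterion) and adding higher multipole moments for accuracy.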

Parallel Tree Issues ● Push vs. Pull data – Push particles: better latency tolerance, more data transferred – Pull nodes: less data, latency ● Domain Decomposition – Shape matters, not just area

Legacy Code: PKDGRAV ● Originally implemented on KSR2 – Ported to: PVM, pthreads, MPI, T3D, CHARM++ ● KD tree domain decomposition/load balancing ● Software cache: latency amortization

Clustering and Load Balancing

PKDGRAV Issues ● Load balancing creates more work, systematic errors. ● Multistep domain decomposition ● Latency amortization, but not hiding ● Porting: MPI became the standard platform

Charm++ features ● “Automatic”, measurement-based load balancing. ● “Medium-grained” approach helps multistepping. ● Not hardwired to a given data structure. ● Object Oriented: reuse of existing code. ● Portable ● NAMD: molecular dynamics is similar. ● Approachable group!

ChaNGa Features ● Tree-based gravity solver ● High order multipole expansion ● Periodic boundaries (if needed) ● Individual multiple timesteps ● Dynamic load balancing with choice of strategies ● Checkpointing (via migration to disk) ● Visualization

Cosmological Comparisons: Mass Function Heitmann, et al. 2005

Overall algorithm (schematic). Each processor hosts several TreePieces (A, B, C, ...). A TreePiece begins with a high-priority prefetch walk of the tree, then performs global and remote work, with local work at low priority. When a walk visits a node that is not present locally, it requests the node from the CacheManager: on a hit the node is returned immediately; on a miss it is fetched from the owning TreePiece on another processor, the reply is buffered, and a callback resumes the computation.

Uniform Volume Scaling

Zoom-in Scaling

Remote/local latency hiding. Timeline of remote data work and local data work for clustered data on 1,024 BlueGene/L processors (5.6s and 5.0s).

Load balancing with GreedyLB: Zoom In 5M on 1,024 BlueGene/L processors (5.6s → 6.1s, 4x messages)

Load balancing with OrbRefineLB: Zoom in 5M on 1,024 BlueGene/L processors (5.6s → 5.0s)

Timestepping Challenges ● Particles with timesteps 1/m of the base step need m times more force evaluations ● Naively, simulation cost scales as N^(4/3) ln(N) – This is a problem when N ~ 1e9 or greater ● If each particle has an individual timestep, scaling reduces to N (ln N)² ● A difficult dynamic load balancing problem
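A back-of-the-envelope sketch of why individual timesteps help. The rung populations below are invented for illustration (real distributions depend on the density field): particles on rung r take 2**r substeps per big step, and only a small dense core needs the fastest rung.

```python
def force_evaluations(rung_counts, individual_timesteps):
    """Toy cost model: rung_counts[r] particles need 2**r substeps per big
    step. With a single global timestep, every particle must be advanced
    at the smallest step required by any particle."""
    smallest = 2 ** (len(rung_counts) - 1)
    total = 0
    for r, n in enumerate(rung_counts):
        total += n * (2 ** r if individual_timesteps else smallest)
    return total

# Hypothetical rung populations: most particles evolve slowly; a dense
# core of 1,000 particles needs 8x shorter steps.
rungs = [900_000, 90_000, 9_000, 1_000]
single = force_evaluations(rungs, individual_timesteps=False)  # 8,000,000
multi = force_evaluations(rungs, individual_timesteps=True)    # 1,124,000
```

The multistep cost is dominated by the slow majority rather than the fast minority, which is the source of the improved scaling; the price is that per-step work now shifts between processors from one substep to the next, hence the load balancing difficulty.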

Multistep Load Balancer ● Use the Charm++ measurement-based load balancer ● Modification: provide the LB database with information about timestepping – “Large step”: balance based on the previous large step – “Small step”: balance based on the previous small step – Maintains the principle of persistence

Results on a three-rung example (613s, 429s, 228s)

Multistep Scaling

Smoothed Particle Hydrodynamics ● Making testable predictions needs “gastrophysics” – High Mach numbers – Large density contrasts ● Gridless, Lagrangian method ● Galilean invariant ● Monte Carlo-like method for solving the Navier-Stokes equations ● A natural extension of the particle method for gravity
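As a concrete illustration of the particle method, here is the standard 3-D cubic-spline SPH kernel (a common textbook choice, not necessarily the exact kernel used in the code described here). Each particle's mass is smeared out over a compact support of radius 2h, and summing the smeared masses of nearby particles gives a density estimate.

```python
import math

def w_cubic_spline(r, h):
    """Standard 3-D cubic-spline SPH kernel with compact support 2h.
    It integrates to 1 over all space, so sum_j m_j * W(|x - x_j|, h)
    is a density estimate at position x."""
    q = r / h
    sigma = 1.0 / (math.pi * h ** 3)  # 3-D normalisation constant
    if q < 1.0:
        return sigma * (1.0 - 1.5 * q * q + 0.75 * q ** 3)
    if q < 2.0:
        return sigma * 0.25 * (2.0 - q) ** 3
    return 0.0

# Sanity check: the kernel integrates to 1 over all space
# (midpoint rule in the radial coordinate).
h, n = 1.0, 20_000
dr = 2.0 * h / n
total = sum(w_cubic_spline((i + 0.5) * dr, h)
            * 4.0 * math.pi * ((i + 0.5) * dr) ** 2 * dr
            for i in range(n))
```

The compact support is what makes the k-nearest-neighbour problem on the next slide central: only particles within 2h contribute, so each density and force evaluation starts with a neighbour search.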

SPH Challenges ● Increased density contrasts/time stepping ● K-nearest neighbor problem – Trees! ● More data per particle than gravity ● Less computation than gravity ● Latency is much more noticeable
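The k-nearest-neighbour search can be sketched in 1-D with a sorted array standing in for the spatial tree (a toy; the function name is invented): binary-search for the query point, then expand a two-sided window outward, always taking the closer side. A tree walk prunes the search the same way, giving roughly O(log n + k) per query instead of scanning all particles.

```python
import bisect

def k_nearest_1d(sorted_xs, x, k):
    """Return the k values in sorted_xs closest to x, nearest first.
    Binary search locates the insertion point; the two-sided expansion
    mirrors how a spatial tree prunes distant branches."""
    i = bisect.bisect_left(sorted_xs, x)
    lo, hi = i - 1, i
    out = []
    while len(out) < k and (lo >= 0 or hi < len(sorted_xs)):
        # Take from the left side if the right is exhausted, or if the
        # left candidate is at least as close as the right one.
        if hi >= len(sorted_xs) or (lo >= 0 and x - sorted_xs[lo] <= sorted_xs[hi] - x):
            out.append(sorted_xs[lo]); lo -= 1
        else:
            out.append(sorted_xs[hi]); hi += 1
    return out
```

In 3-D the same role is played by a k-d tree or the gravity tree itself, which is why reusing the tree for SPH neighbour finding is attractive.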

SPH Scaling

Future ● More Physics – Cooling/Star formation recipes – Charm++ allows reuse of PKDGRAV code ● Better gravity algorithms – New domain decomposition/load balancing strategies – Multicore ● Other Astrophysical problems – Planet formation – Planetary rings

Charm++ features: reprise ● “Automatic”, measurement-based load balancing. – But needs thought and work ● “Medium-grained” approach helps multistepping. – But not fine grained ● Object Oriented: reuse of existing code. – C++ is less mature than C. ● Approachable group – Enhance Charm++ to solve our problems.

Summary ● Cosmological simulations pose challenges to parallel implementations – Non-local data dependencies – Hierarchical in space and time ● ChaNGa has been successful in addressing these challenges using Charm++ features – Message priorities – New load balancers

Computing Challenge Summary ● The Universe is big => we will always be pushing for more resources ● New algorithmic efforts will be needed to make efficient use of the resources we have – Efforts made to abstract away from machine details – Parallelization efforts need to depend on more automated processes