ChaNGa CHArm N-body GrAvity. Thomas Quinn Graeme Lufkin Joachim Stadel James Wadsley Laxmikant Kale Filippo Gioachin Pritish Jetley Celso Mendes Amit.


1 ChaNGa CHArm N-body GrAvity

2 Thomas Quinn Graeme Lufkin Joachim Stadel James Wadsley Laxmikant Kale Filippo Gioachin Pritish Jetley Celso Mendes Amit Sharma Lukasz Wesolowski Edgar Solomonik

3 Outline ● Scientific background – How to build a Galaxy – Types of Simulations – Simulation Challenges ● ChaNGa and those Challenges – Features – Tree gravity – Load balancing – Multistepping ● Future Challenges

4 Image courtesy NASA/WMAP Cosmology: How does this...

5 ... turn into this?

6 Computational Cosmology ● CMB has fluctuations of 1e-5 ● Galaxies are overdense by 1e7 ● It happens (mostly) through Gravitational Collapse ● Making testable predictions from a cosmological hypothesis requires – Non-linear, dynamic calculation – e.g. Computer simulation

7 Simulation process ● Start with fluctuations based on Dark Matter properties ● Follow model analytically (good enough to get CMB) ● Create a realization of these fluctuations in particles. ● Follow the motions of these particles as they interact via gravity. ● Compare final distribution of particles with observed properties of galaxies.

8 Simulating galaxies: Procedure 1. Simulate 100 Mpc volume at 10-100 kpc resolution 2. Pick candidate galaxies for further study 3. Resimulate galaxies with same large scale structure but with higher resolution, and lower resolution in the rest of the computational volume. 4. At higher resolutions, include gas physics and star formation.

9 Gas Stars Dark Matter

10 Types of simulations ● Zoom In ● “Uniform” Volume ● Isolated Cluster/Galaxy (Parallel Programming Laboratory @ UIUC, 9/29/2016)

11 Star Formation History: What’s needed ● 1 Million particles/galaxy for proper morphology/heavy element production ● 1 Petaflop-week of computation ● Necessary for: – Comparing with Hubble Space Telescope surveys of the local Universe – Interpreting HST images of high redshift galaxies

12 Large Scale Structure: What’s needed ● 6.5 Gigaparsec volume ● 10 Trillion particles (1 Petabyte of RAM) ● 1 Petaflop week of computation ● Necessary for: – Interpreting future surveys (LSST) – Relating Cosmic Microwave Background to galaxy surveys

13 Halo Simulations: What's needed ● Billions of particles in a single halo ● Dark Matter detection experiments ● Influence on disk ● Theories of gravitational collapse (Insight!)

14 Computational Challenges ● Large spatial dynamic range: > 100 Mpc to < 1 kpc – Hierarchical, adaptive gravity solver is needed ● Large temporal dynamic range: 10 Gyr to 1 Myr – Multiple timestep algorithm is needed ● Gravity is a long range force – Hierarchical information needs to go across processor domains

15 Basic Gravity algorithm ● Newtonian gravity interaction – Each particle is influenced by all others: O(n²) algorithm ● Barnes-Hut approximation: O(n log n) – Influence from distant particles combined into center of mass
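
The contrast between the O(n²) direct sum and the Barnes-Hut approximation can be sketched as follows. This is a minimal illustration, not ChaNGa's implementation: it uses a monopole (center of mass) only, G = 1, a simple softening length `eps`, and the common opening criterion "cell size / distance < theta". The names `direct_accel`, `Node`, and `tree_accel` are hypothetical.

```python
import math

def direct_accel(pos, mass, i, idx=None, eps=1e-3):
    """Softened Newtonian acceleration on particle i (G = 1) from the particles in idx."""
    acc = [0.0, 0.0, 0.0]
    for j in (range(len(pos)) if idx is None else idx):
        if j == i:
            continue
        d = [pos[j][k] - pos[i][k] for k in range(3)]
        r2 = d[0] ** 2 + d[1] ** 2 + d[2] ** 2 + eps ** 2
        f = mass[j] / (r2 * math.sqrt(r2))
        for k in range(3):
            acc[k] += f * d[k]
    return acc

class Node:
    """Octree cell storing total mass and center of mass of its particles."""
    def __init__(self, pos, mass, idx, center, half):
        self.idx, self.half = idx, half
        self.mass = sum(mass[i] for i in idx)
        self.com = [sum(mass[i] * pos[i][k] for i in idx) / self.mass for k in range(3)]
        self.children = []
        if len(idx) > 1 and half > 1e-9:  # split until one particle per leaf
            octants = {}
            for i in idx:
                key = tuple(pos[i][k] >= center[k] for k in range(3))
                octants.setdefault(key, []).append(i)
            for key, sub in octants.items():
                c = [center[k] + (0.5 if key[k] else -0.5) * half for k in range(3)]
                self.children.append(Node(pos, mass, sub, c, half / 2))

def tree_accel(node, pos, mass, i, theta=0.5, eps=1e-3):
    """Walk the tree; distant cells are replaced by a point mass at their center of mass.

    With theta < 1/sqrt(3) a cell containing particle i is always opened,
    so the particle never interacts with its own cell.
    """
    if not node.children:
        return direct_accel(pos, mass, i, node.idx, eps)
    d = [node.com[k] - pos[i][k] for k in range(3)]
    r2 = d[0] ** 2 + d[1] ** 2 + d[2] ** 2
    if (2 * node.half) ** 2 < theta ** 2 * r2:  # far enough: accept the cell
        s2 = r2 + eps ** 2
        f = node.mass / (s2 * math.sqrt(s2))
        return [f * d[k] for k in range(3)]
    acc = [0.0, 0.0, 0.0]
    for child in node.children:  # too close: open the cell
        a = tree_accel(child, pos, mass, i, theta, eps)
        for k in range(3):
            acc[k] += a[k]
    return acc
```

Setting `theta=0` forces every cell to be opened, which reproduces the direct sum; larger values trade accuracy for fewer interactions.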

16 Parallel Tree Issues ● Push vs. Pull data – Push particles: better latency tolerance, more data transferred – Pull nodes: less data transferred, more exposure to latency ● Domain Decomposition – Shape matters, not just area

17 Legacy Code: PKDGRAV ● Originally implemented on KSR2 – Ported to: PVM, pthreads, MPI, T3D, CHARM++ ● KD tree domain decomposition/load balancing ● Software cache: latency amortization
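
As a rough illustration of KD-tree style domain decomposition, the sketch below recursively bisects the particle set at the median along its longest axis. It is a generic orthogonal recursive bisection under stated assumptions (equal particle counts per domain, a power-of-two number of domains), not PKDGRAV's actual code; `orb_decompose` is a hypothetical name.

```python
def orb_decompose(points, n_domains):
    """Split points into n_domains groups by recursive median bisection.

    Assumes n_domains is a power of two and points is a non-empty list of
    equal-length coordinate tuples. Balances particle count, not work.
    """
    if n_domains == 1:
        return [points]
    dims = len(points[0])
    # Cut across the longest extent so domains stay compact.
    axis = max(range(dims),
               key=lambda k: max(p[k] for p in points) - min(p[k] for p in points))
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (orb_decompose(ordered[:mid], n_domains // 2)
            + orb_decompose(ordered[mid:], n_domains // 2))
```

Balancing on particle count alone is exactly what breaks down for clustered data, where the work per particle varies strongly between domains.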

18 Clustering and Load Balancing

19 PKDGRAV Issues ● Load balancing creates more work, systematic errors. ● Multistep domain decomposition ● Latency amortization, but not hiding ● Porting: MPI became the standard platform

20 Charm++ features ● “Automatic”, measurement-based load balancing. ● “Medium-grained” approach helps multistepping. ● Not hardwired to a given data structure. ● Object Oriented: reuse of existing code. ● Portable ● NAMD: molecular dynamics is similar. ● Approachable group!

21 ChaNGa Features ● Tree-based gravity solver ● High order multipole expansion ● Periodic boundaries (if needed) ● Individual multiple timesteps ● Dynamic load balancing with choice of strategies ● Checkpointing (via migration to disk) ● Visualization

22 Cosmological Comparisons: Mass Function Heitmann, et al. 2005

23 Overall algorithm [Figure: flow diagram. On Processor 1, TreePieces A, B and C each run from "Start computation" to "End computation" through global work, a prefetch visit of the tree, remote work, and local work (low priority). On a miss for remote data, a TreePiece sends "request node" to the CacheManager, which checks "present?": YES: return; NO: fetch from Processor n, where the TreePiece on Processor 2 replies with the requested data (high priority); the reply is buffered and a callback is issued.]
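
The request/fetch/callback path in the diagram can be sketched as a small software cache. Only the name `CacheManager` comes from the slide; the methods `request` and `receive`, the `send_request` hook, and the coalescing of duplicate requests into one fetch are assumptions made for illustration.

```python
class CacheManager:
    """Per-processor cache of remote tree nodes with callback on arrival."""

    def __init__(self, send_request):
        self._send_request = send_request  # hypothetical hook: asks the owner for a node
        self._nodes = {}                   # node key -> cached data
        self._waiting = {}                 # node key -> callbacks awaiting the data

    def request(self, key, callback):
        """Return cached data on a hit; on a miss, register callback and return None."""
        if key in self._nodes:
            return self._nodes[key]
        first_miss = key not in self._waiting
        self._waiting.setdefault(key, []).append(callback)
        if first_miss:
            self._send_request(key)        # fetch once, however many requesters wait
        return None

    def receive(self, key, data):
        """Called when the owner's reply arrives: store it and resume the requesters."""
        self._nodes[key] = data
        for callback in self._waiting.pop(key, []):
            callback(key, data)
```

While a request is outstanding, the requesting TreePiece is free to continue with its low-priority local work, which is how latency is hidden rather than merely amortized.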

24 Uniform Volume Scaling

25 Zoom-in Scaling

26 Remote/local latency hiding ● Clustered data on 1,024 BlueGene/L processors [Figure: timelines of remote data work and local data work; timings shown: 5.6s, 5.0s]

27 Load balancing with GreedyLB ● Zoom In 5M on 1,024 BlueGene/L processors [Figure: processor timelines; timings shown: 5.6s, 6.1s; 4x messages]
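
A greedy strategy of the kind GreedyLB is named for can be sketched as: take objects from heaviest to lightest and give each to the currently least-loaded processor. This is a generic sketch, not the Charm++ source; `greedy_assign` and its arguments are hypothetical names.

```python
import heapq

def greedy_assign(loads, n_procs):
    """Map object index -> processor, heaviest objects first, least-loaded processor first."""
    heap = [(0.0, p) for p in range(n_procs)]  # (accumulated load, processor)
    heapq.heapify(heap)
    assignment = {}
    for obj in sorted(range(len(loads)), key=lambda o: loads[o], reverse=True):
        total, proc = heapq.heappop(heap)
        assignment[obj] = proc
        heapq.heappush(heap, (total + loads[obj], proc))
    return assignment
```

The sketch balances load without looking at where an object's neighbors live, so spatially adjacent objects can end up on different processors. That would be one plausible reading of the "4x messages" noted on the slide.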

28 Load balancing with OrbRefineLB ● Zoom in 5M on 1,024 BlueGene/L processors [Figure: processor timelines; timings shown: 5.6s, 5.0s]

29 Timestepping Challenges ● 1/m particles need m times more force evaluations ● Naively, simulation cost scales as N^(4/3) ln(N) – This is a problem when N ~ 1e9 or greater ● If each particle has an individual timestep, scaling reduces to N (ln(N))^2 ● A difficult dynamic load balancing problem
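
The saving from individual timesteps can be illustrated by counting force evaluations per large step. The sketch assumes power-of-two timestep "rungs" (rung r steps with dt_base / 2^r), a common scheme that is not spelled out on the slide; `assign_rung` and `force_evaluations` are hypothetical names.

```python
import math

def assign_rung(dt_wanted, dt_base):
    """Smallest rung r such that dt_base / 2**r <= dt_wanted."""
    if dt_wanted >= dt_base:
        return 0
    return math.ceil(math.log2(dt_base / dt_wanted))

def force_evaluations(rungs):
    """Force evaluations per large step: (individual timesteps, single global timestep)."""
    multi = sum(2 ** r for r in rungs)          # each particle steps at its own rate
    single = len(rungs) * 2 ** max(rungs)       # everyone steps at the smallest timestep
    return multi, single
```

For example, with 90 particles on rung 0, 9 on rung 1 and 1 on rung 2, `force_evaluations` gives 112 evaluations with individual timesteps against 400 when every particle takes the smallest step.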

30 Multistep Loadbalancer ● Use Charm++ measurement-based load balancer ● Modification: provide LB database with information about timestepping. – “Large timestep”: balance based on previous large step – “Small step”: balance based on previous small step – Maintains principle of persistence
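
The idea of keeping separate measurements per kind of step can be sketched as a load database keyed by the active rung. This is an illustration only: the class `MultistepLoadDB`, its methods, and the choice of key are hypothetical and do not describe the Charm++ LB database interface.

```python
class MultistepLoadDB:
    """Remember measured object loads separately for each kind of step."""

    def __init__(self):
        self._loads = {}  # active rung -> {object id: measured load}

    def record(self, active_rung, obj_loads):
        """Store the loads measured during a step on which active_rung was the lowest active rung."""
        self._loads[active_rung] = dict(obj_loads)

    def predict(self, active_rung):
        """Loads from the previous step of the same kind, or None if there was none."""
        return self._loads.get(active_rung)
```

Predicting a small step from the last small step, rather than from whatever step ran most recently, is what keeps the principle of persistence usable when the set of active particles changes from step to step.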

31 Results on 3 rung example 613s 429s 228s

32 Multistep Scaling

33 Smooth Particle Hydrodynamics ● Making testable predictions needs Gastrophysics – High Mach number – Large density contrasts ● Gridless, Lagrangian method ● Galilean invariant ● Monte-Carlo Method for solving Navier-Stokes equation. ● Natural extension of particle method for gravity.
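
In SPH, fluid quantities are estimated as kernel-weighted sums over neighboring particles. The sketch below shows the density estimate with a cubic spline kernel, which is a common choice but an assumption here since the slides do not name a kernel; `cubic_spline_w` and `sph_density` are hypothetical names, and the neighbor loop is brute force for brevity.

```python
import math

def cubic_spline_w(r, h):
    """3D cubic spline kernel with support 2h, normalized to unit volume integral."""
    q = r / h
    sigma = 1.0 / (math.pi * h ** 3)
    if q < 1.0:
        return sigma * (1.0 - 1.5 * q * q + 0.75 * q ** 3)
    if q < 2.0:
        return sigma * 0.25 * (2.0 - q) ** 3
    return 0.0

def sph_density(pos, mass, i, h):
    """Density at particle i: sum of m_j * W(|r_i - r_j|, h), including j = i."""
    return sum(mass[j] * cubic_spline_w(math.dist(pos[i], pos[j]), h)
               for j in range(len(pos)))
```

Because the kernel vanishes beyond 2h, only nearby particles contribute, which is why the neighbor search dominates the cost in practice.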

34 SPH Challenges ● Increased density contrasts/time stepping. ● K-nearest neighbor problem. – Trees! ● More data/particle than gravity ● Less computation than gravity ● Latency much more noticeable
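
The "Trees!" answer to the k-nearest-neighbor problem can be sketched with a small k-d tree search that prunes subtrees farther away than the current k-th neighbor. This is a generic serial sketch, not ChaNGa's neighbor search; `build_kdtree` and `knn` are hypothetical names.

```python
import heapq

def build_kdtree(points, idx=None, depth=0):
    """Return a nested tuple (point index, split axis, left subtree, right subtree)."""
    if idx is None:
        idx = list(range(len(points)))
    if not idx:
        return None
    axis = depth % len(points[0])
    idx.sort(key=lambda i: points[i][axis])
    mid = len(idx) // 2
    return (idx[mid], axis,
            build_kdtree(points, idx[:mid], depth + 1),
            build_kdtree(points, idx[mid + 1:], depth + 1))

def knn(points, tree, query, k):
    """Indices of the k points nearest to query, nearest first."""
    heap = []  # max-heap on squared distance, stored as (-d2, index)

    def visit(node):
        if node is None:
            return
        i, axis, left, right = node
        d2 = sum((a - b) ** 2 for a, b in zip(points[i], query))
        if len(heap) < k:
            heapq.heappush(heap, (-d2, i))
        elif d2 < -heap[0][0]:
            heapq.heapreplace(heap, (-d2, i))
        diff = query[axis] - points[i][axis]
        near, far = (left, right) if diff < 0 else (right, left)
        visit(near)
        # Search the far side only if the splitting plane is closer than the k-th neighbor.
        if len(heap) < k or diff * diff < -heap[0][0]:
            visit(far)

    visit(tree)
    return [i for _, i in sorted(heap, reverse=True)]
```

In a parallel setting the far-side visits may touch nodes owned by other processors, which ties the neighbor search to the same remote-data latency issues as the gravity walk.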

35 SPH Scaling

36 Future ● More Physics – Cooling/Star formation recipes – Charm++ allows reuse of PKDGRAV code ● Better gravity algorithms – New domain decomposition/load balancing strategies – Multicore ● Other Astrophysical problems – Planet formation – Planetary rings

37 Charm++ features: reprise ● “Automatic”, measurement-based load balancing. – But needs thought and work ● “Medium-grained” approach helps multistepping. – But not fine grained ● Object Oriented: reuse of existing code. – C++ is less mature than C. ● Approachable group – Enhance Charm++ to solve our problems.

38 Summary ● Cosmological simulations provide challenges to parallel implementations – Non-local data dependencies – Hierarchical in space and time ● ChaNGa has been successful in addressing these challenges using Charm++ features – Message priorities – New load balancers

39

40 Computing Challenge Summary ● The Universe is big => we will always be pushing for more resources ● New algorithm efforts will be made to make efficient use of the resources we have – Efforts made to abstract away from machine details – Parallelization efforts need to depend on more automated processes.

