Accelerator Science
Panagiotis Spentzouris, Fermilab; Estelle Cormier, Tech-X

ComPASS Accelerator Science Drivers
New techniques and technologies
  – Optimize and evolve concepts; design accelerator facilities based on new concepts
Maximize performance of "conventional" techniques and technologies
  – Optimize operational parameters; understand dynamics (manipulation and control of beams in full 6D phase space)
Desirable outcome: achieve higher gradients for Energy Frontier applications and minimize losses for Intensity Frontier applications

ComPASS Computational Accelerator Physics Challenges
Energy Frontier: modeling infrastructure to
  – Develop techniques, technologies, and materials to achieve higher acceleration gradients (dielectric and plasma wave structures, beam cooling)
  – Optimize existing technologies (superconducting RF cavities)
  – Optimize and test new designs (Muon and Electron Colliders, …)
Intensity Frontier: modeling infrastructure to
  – Understand and control instabilities to minimize and mitigate beam losses (self-fields, wakefields, interaction with materials, geometry, and long-term tracking accuracy)
  – Optimize existing technologies (SRF cavities)
  – Optimize and test designs (Project-X, …)

ComPASS Requirement Investigation
Energy Frontier (driver)
  – LWFA (10 GeV/m stages): multi-scale (laser, plasma, beam), machine design R&D
  – PWFA ( GeV/m): multi-scale (plasma, beam), machine design R&D
  – DLA (accelerator on a chip, few GeV/m): multi-scale, open structures, radiation, structure design R&D, thermal-mechanical-EM
  – Muon colliders (compact): multi-scale (beam, cooling material), machine design R&D, multi-physics
  – Electron colliders (maximize conventional structure performance, two-beam acceleration, 100 MeV/m)
Intensity Frontier
  – Proton linac (maximize conventional structure performance): wakefields, thermal-mechanical-EM analysis
  – Proton circular machines: loss mitigation, multi-scale (self-interactions, impedance, clouds, …)
Legend (slide color coding): white paper received / no white paper

ComPASS Example: High-Intensity Proton Drivers
Wide range of scales:
  – accelerator complex (10^3 m) → EM wavelength ( m) → component (10^-1 m) → particle bunch (10^-3 m)
  – Need to correctly model intensity-dependent effects and the accelerator lattice elements (fields, apertures) in order to identify and mitigate potential problems due to instabilities that increase beam loss; thousands of elements, millions of revolutions
  – Calculating 1e-5 losses to 1% accuracy requires modeling 1e9 particles, interacting with each other and with the structures around them at every step of the simulation (see the sketch below)
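A quick back-of-the-envelope check of the 1e9 figure, assuming simple Poisson counting statistics (this is my own sketch, not taken from any ComPASS code):

    # Poisson statistics: ~1% relative error needs ~1/0.01^2 = 1e4 lost macroparticles;
    # at a loss fraction of 1e-5 that means ~1e4 / 1e-5 = 1e9 tracked macroparticles.
    loss_fraction = 1e-5         # expected fractional beam loss
    rel_error = 0.01             # target 1% relative error on the loss estimate

    lost_needed = 1.0 / rel_error**2            # ~1e4 lost macroparticles
    total_needed = lost_needed / loss_fraction  # ~1e9 tracked macroparticles
    print(f"macroparticles needed: {total_needed:.1e}")  # -> 1.0e+09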

ComPASS Example: LWFA multi-scale physics

ComPASS Summary of Requirements
Intensity Frontier accelerator needs
  – beam loss characterization and control
  – control room feedback: need fast turnaround of simulations
  – direct comparison between beam diagnostic detectors and simulated data (needs development of new tools)
Energy Frontier accelerator needs
  – beam stability characterization
  – ability to produce end-to-end designs
  – control room feedback: fast turnaround and fast analysis of massive data (requires development of new, automated analysis tools)
  – new physics model capabilities (e.g., radiation and scattering)
  – better numerical models (less numerical noise); identification of the most suitable numerical techniques
All Frontiers
  – integrated / multi-physics modeling: improved algorithms (integrate more physics in the model), massive computing resources
  – common interfaces, geometry and parameter description, job submission, from tablet to supercomputer

ComPASS Summary of Findings
  – All frontiers require integrated modeling. Currently, for high-fidelity modeling, because of the many degrees of freedom involved, we run "single-physics" or "few-physics" simulations. Integrated modeling will require better algorithms able to utilize massive (tightly coupled) computing resources.
  – All frontiers would like "consolidation" of user interfaces and of community libraries and tools, including analysis tools.
  – All frontiers will benefit from coordinated development and computing R&D for a sustainable program.

ComPASS Summary of Findings (continued)
  – Energy Frontier future accelerators have many specific new physics model capability needs, with some overlap (for example, radiation and scattering are relevant to the muon collider, plasma, and gamma-gamma options).
  – Energy Frontier machines also require better numerical models (less numerical noise). This has a direct impact on the choice of numerical techniques for the different physics implementations.

ComPASS Summary of Findings (continued)
  – Intensity Frontier machines would like "control room feedback" capabilities, because of the loss implications. Could new computing technologies deliver such fast turnaround?
  – Such capability is also relevant to experimental R&D efforts (for example, plasma-driven acceleration), but there the requirements are even more stringent: in addition to faster modeling, extracting useful information from a simulation run involves analyzing massive data. This leads to requirements for analysis tools that perform in near real time, while the experiment is running.
  – For the Intensity Frontier and for experimental programs, develop analysis tools and techniques that allow direct comparison between beam diagnostic detectors (what machine physicists see) and simulated quantities (what computational physicists use as input to the simulation and to describe the output of the models).

ComPASS Infrastructure
Resource and infrastructure needs

ComPASS Current Methods and Tools (partial list)
Mature set of codes; most are already HPC capable, with R&D ongoing for new architectures.
Variety of solvers:
  – Electrostatic: multigrid; AMR multigrid; spectral (an illustrative spectral sketch follows this list)
  – Electromagnetic: finite-element direct and hybrid; extended-stencil finite-difference; AMR finite-difference
  – Quasi-static: spectral (QuickPIC)
  – Specialized codes (beam-material interaction for ionization cooling, …)
Software libraries and tools
  – Chombo, FFTW, Geant4, HDF5, LAPACK, METIS, MPI, MPI-IO, MUMPS, ScaLAPACK, SuperLU, TAU, TAO, Trilinos
  – C++, Fortran 90, Python, CUDA
  – Analysis: ParaView, Python, R, ROOT, VisIt
Although the simulated data volume is increasing, it cannot compare with HEP experiment requirements
  – Not a driver in storage, networking, etc.; will leverage advances motivated by HEP experiments
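To make the "spectral" electrostatic solver category concrete, here is a minimal periodic FFT-based Poisson solve in Python/NumPy. It is an illustration only, not code from any ComPASS package; the grid size and charge distribution below are made-up example values.

    import numpy as np

    def solve_poisson_periodic(rho, dx):
        """Solve laplacian(phi) = -rho on a periodic 2D grid via FFT (spectral method)."""
        ny, nx = rho.shape
        kx = 2.0 * np.pi * np.fft.fftfreq(nx, d=dx)
        ky = 2.0 * np.pi * np.fft.fftfreq(ny, d=dx)
        KX, KY = np.meshgrid(kx, ky)
        k2 = KX**2 + KY**2
        rho_hat = np.fft.fft2(rho)
        phi_hat = np.zeros_like(rho_hat)
        nonzero = k2 > 0
        phi_hat[nonzero] = rho_hat[nonzero] / k2[nonzero]   # phi_k = rho_k / k^2
        return np.real(np.fft.ifft2(phi_hat))               # k=0 (mean) mode set to zero

    # Example (placeholder numbers): a Gaussian charge blob on a 64x64 periodic grid
    n, dx = 64, 0.1
    x = (np.arange(n) - n / 2) * dx
    X, Y = np.meshgrid(x, x)
    rho = np.exp(-(X**2 + Y**2) / 0.05)
    phi = solve_poisson_periodic(rho, dx)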

ComPASS HPC example: scaling on current architectures
Synergia modeling of proton drivers
  – Single- and multiple-bunch simulations
  – Single-bunch strong scaling from 16 to 16,384 cores (32x32x1024 grid, 105M particles)
  – Weak scaling from 1M to 256M particles (128 to 32,768 cores)
  – Weak scaling from 64 to 1024 bunches (8,192 to 131,072 cores); up to over … particles
  – Scaling results obtained on ALCF machines: Mira (BG/Q) and Intrepid (BG/P)
(A strong-scaling efficiency sketch follows.)
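As a reminder of how such strong-scaling numbers are usually summarized, here is a small helper that computes speedup and parallel efficiency relative to the smallest run. The core counts and timings in the usage line are hypothetical placeholders, not the actual Synergia/Mira measurements.

    def strong_scaling(cores, times):
        """Print speedup and parallel efficiency relative to the smallest core count."""
        c0, t0 = cores[0], times[0]
        for c, t in zip(cores, times):
            speedup = t0 / t
            efficiency = speedup * c0 / c      # ideal speedup is c / c0
            print(f"{c:7d} cores  speedup {speedup:7.1f}  efficiency {efficiency:5.1%}")

    # Hypothetical example for a fixed problem size:
    strong_scaling([16, 128, 1024, 16384], [1000.0, 130.0, 18.0, 1.6])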

ComPASS HPC example: scaling on current architectures
OSIRIS: 1.6 million cores and 2.2 PFlop/s
  – LLNL Sequoia, IBM BlueGene/Q (#2 on the TOP500, Nov …): 1.6 M cores, Rmax 16.3 PFlop/s
  – Performance tests on Blue Waters (… cores, XE partition)
  – Problem size (LWFA): cells = … × 1024 × …, particles/cell ~ …
  – Computation: 2.2 PFlop/s performance, 31% of Rpeak
[Scaling-efficiency values shown on the original slide: 97% and 75%.]

ComPASS Emerging Technology Research: GPUs and Multicore
  – Shared memory is back: some things could get easier, some are harder
  – Charge deposition in shared-memory systems is the key challenge (PIC example; see the sketch below)
  – Multi-level parallelism is very compatible with current communication-avoidance approaches, but a change in programming paradigm is needed
  – Could provide solutions for integrated high-fidelity modeling: multi-physics and parameter optimization
  – Continuing R&D necessary
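To illustrate why charge deposition is the sticking point in shared memory, here is a small NumPy sketch (my own illustration, not ComPASS code): when many particles deposit to the same grid cell concurrently, naive scatter updates lose contributions, which is the same hazard that forces atomics, tiling, or per-thread grid copies on GPUs and multicore CPUs.

    import numpy as np

    rng = np.random.default_rng(0)
    n_cells, n_particles = 8, 100_000
    cell = rng.integers(0, n_cells, n_particles)   # cell index of each particle (NGP deposit)
    charge = np.ones(n_particles)

    # "Unsynchronized" scatter: colliding writes to the same cell are lost,
    # analogous to racy concurrent deposition without atomics.
    rho_naive = np.zeros(n_cells)
    rho_naive[cell] += charge                      # fancy-index +=: only one update per cell survives

    # Collision-safe accumulation (the role played by atomics / per-thread grids):
    rho_safe = np.zeros(n_cells)
    np.add.at(rho_safe, cell, charge)              # unbuffered scatter-add

    print(rho_naive.sum(), rho_safe.sum())         # ~8.0 vs 100000.0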

ComPASS Example of Current R&D: GPU-Accelerated Results
Benchmark with a 2048x2048 grid, 150,994,944 particles (36 particles/cell); optimal block size = 128, optimal tile size = 16x16; single precision. The GPU algorithm was also implemented in OpenMP.

Electrostatic (mx=16, my=16, dt=0.1), time per particle:
                  CPU: Intel i7   GPU: M2090   OpenMP (12 cores)
  Push            22.1 ns         0.532 ns     1.678 ns
  Deposit          8.5 ns         0.227 ns     0.818 ns
  Reorder          0.4 ns         0.115 ns     0.113 ns
  Total particle  31.0 ns         0.874 ns     2.608 ns
Total speedup was about 35 compared to 1 CPU, and about 3 compared to 12 CPUs.

Electromagnetic (mx=16, my=16, dt=0.04, c/vth=10), time per particle:
                  CPU: Intel i7   GPU: M2090   OpenMP (12 cores)
  Push            66.5 ns         0.426 ns     5.645 ns
  Deposit         36.7 ns         0.918 ns     3.362 ns
  Reorder          0.4 ns         0.698 ns     0.056 ns
  Total particle 103.6 ns         2.042 ns     9.062 ns
Total speedup was about 51 compared to 1 CPU, and about 4 compared to 12 CPUs.

ComPASS Example of Current R&D: GPU and Multicore Results
[Figures on the original slide: OpenMP results and GPU results, comparing Scheme 1 and Scheme 2; node boundary indicated.]

ComPASS Advanced Algorithms: Method of Local Corrections (MLC)
Potential-theoretic domain-decomposition Poisson solver, compatible with AMR grids. One V-cycle solver:
  – Downsweep: build the RHS for coarser grids using discrete convolutions and Legendre polynomial expansions; exploits the higher-order finite-difference property of localization; convolutions performed with small FFTs (Hockney 1970)
  – Coarse solve: either MLC again, or FFT
  – Upsweep: solve for Φ^h on the boundary of each patch (interpolation and summations), then a local discrete sine transform solve
No iteration, accurate, no self-force problems, and a large number of flops per unit of communication (messages and DRAM). (A sketch of the Hockney-style FFT convolution used in the downsweep follows.)
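The "small FFTs and Hockney 1970" step refers to computing aperiodic (free-space) discrete convolutions with FFTs by zero-padding so that circular convolution reproduces the linear one. A minimal 1D NumPy illustration of that idea follows; it is not the MLC code itself, and the source and kernel here are arbitrary placeholders.

    import numpy as np

    def hockney_convolve(source, kernel):
        """Aperiodic convolution via zero-padded FFTs (Hockney's trick)."""
        n = len(source) + len(kernel) - 1          # length of the linear convolution
        S = np.fft.rfft(source, n)                 # zero-pad so circular == linear
        K = np.fft.rfft(kernel, n)
        return np.fft.irfft(S * K, n)

    # Placeholder data: a random "charge" vector and an arbitrary discrete kernel
    rng = np.random.default_rng(1)
    source = rng.standard_normal(64)
    kernel = rng.standard_normal(64)
    assert np.allclose(hockney_convolve(source, kernel),
                       np.convolve(source, kernel))   # matches direct convolution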

ComPASS New Architecture Needs
  – Opportunity for multi-scale, multi-physics modeling and "near-real-time" feedback, if the new architectures can be used efficiently
  – Work on algorithmic and code development to enable our codes to perform on new architectures ("lightweight" processors, accelerators, or hybrid configurations). The current strategy is to abstract and parameterize data structures so that they are portable and enable efficient data flow to a large number of processing units in order to maintain performance (see the portability sketch below).
    – Already ported a subset of solvers and the PIC infrastructure to the GPU
    – Evaluate the current approach; develop workflow tools and frameworks
    – Investigate new algorithms and approaches
  – As technology evolves, we need specifications of new production systems and availability of test machines well in advance of deployment
  – The effort should be funded and coordinated as a program, to develop a common set of libraries and interfaces and avoid duplication
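One common way to realize such data-structure abstraction in Python-based layers is to parameterize the numerical kernel by the array module, so the same code runs on CPU (NumPy) or GPU (e.g., CuPy) without change. This sketch only illustrates the idea, not the actual ComPASS implementation; the CuPy fallback and the field values are assumptions for the example.

    import numpy as np

    try:                                    # optional GPU backend; falls back to CPU if absent
        import cupy as xp_gpu
    except ImportError:
        xp_gpu = None

    def push_particles(x, v, E, dt, xp):
        """Simple particle push written against an abstract array module xp."""
        v = v + dt * E                      # acceleration from a precomputed field (unit charge/mass)
        x = x + dt * v
        return x, v

    # CPU run with NumPy; the same call works with xp = cupy arrays on a GPU if available.
    xp = np
    x = xp.linspace(0.0, 1.0, 1_000_000)
    v = xp.zeros_like(x)
    E = xp.sin(2 * np.pi * x)               # placeholder field values at the particle positions
    x, v = push_particles(x, v, E, 0.01, xp)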

ComPASS Summary
Multi-physics modeling is necessary to advance accelerator science. It requires:
  – Frameworks to support advanced workflows
  – Efficient utilization of large computing resources, HPC in many cases
Evolving technologies (lightweight CPUs plus accelerators) require R&D and could result in significant changes
  – Advanced algorithmic research is underway and will require continuing support
  – Programmatic coordination is necessary to utilize resources efficiently
Not a driver in data and networking, although the program could benefit from collaborative tools
  – Will benefit from advances driven by HEP experiment needs