Lattice QCD and GPUs
Robert Edwards, Theory Group
Chip Watson, HPC & CIO
Jie Chen & Balint Joo, HPC
Jefferson Lab
Outline
Will describe how capability computing + capacity computing + SciDAC
– deliver science & NP milestones
Collaborative efforts involve USQCD + JLab & DOE+NSF user communities
Hadronic & Nuclear Physics with LQCD
Hadronic spectroscopy
– Hadron resonance determinations
– Exotic meson spectrum (JLab 12 GeV)
Hadronic structure
– 3-D picture of hadrons from gluon & quark spin+flavor distributions
– Ground & excited E&M transition form factors (JLab 6 GeV + 12 GeV + Mainz)
– E&M polarizabilities of hadrons (Duke+CERN+Lund)
Nuclear interactions
– Nuclear processes relevant for stellar evolution
– Hyperon-hyperon scattering
– 3- & 4-nucleon interaction properties [Collab. w/ LLNL] (JLab+LLNL)
Beyond the Standard Model
– Neutron decay constraints on BSM from Ultra Cold Neutron source (LANL)
Bridges in Nuclear Physics
NP Exascale
Spectroscopy
Spectroscopy reveals fundamental aspects of hadronic physics
– Essential degrees of freedom?
– Gluonic excitations in mesons: exotic states of matter?
Status
– Can extract excited hadron energies & identify spins
– Pursuing full QCD calculations with realistic quark masses
New spectroscopy programs world-wide
– E.g., BES III (Beijing), GSI/PANDA (Darmstadt)
– Crucial complement to 12 GeV program at JLab
Excited nucleon spectroscopy (JLab)
JLab GlueX: search for gluonic excitations.
USQCD National Effort
US Lattice QCD effort: Jefferson Laboratory, BNL and FNAL
– FNAL: weak matrix elements
– BNL: RHIC physics
– JLab: hadronic physics
SciDAC – R&D vehicle, software R&D
INCITE resources (~20 TF-yr) + USQCD cluster facilities (17 TF-yr): impact on DOE's High Energy & Nuclear Physics Program
Gauge Generation: Cost Scaling
Cost: reasonable statistics, box size and "physical" pion mass
Extrapolate in lattice spacings: 10 – 100 PF-yr
(Plot: cost in PF-years; state-of-art today ~10 TF-yr; 2011: ~100 TF-yr)
Computational Requirements
Gauge generation : analysis ratio for current calculations
– Weak matrix elements: 1 : 1
– Baryon spectroscopy: 1 : 10
– Nuclear structure: 1 : 4
Overall gauge generation : analysis requirement: 10 : 1 (2005) → 1 : 3 (2010)
Core work: Dirac inverters – use GPUs (solver sketch below)
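The dominant analysis kernel is the Dirac inverter: solving the linear system M x = b for many right-hand sides. As a point of reference, here is a minimal conjugate-gradient sketch of the kind of Krylov solver involved; `apply_matrix` is a hypothetical stand-in for the (preconditioned) Dirac normal operator, not the production Chroma/QUDA code.

```cuda
// Illustrative conjugate-gradient solver for A x = b, A symmetric positive definite.
// In LQCD the operator would be the even-odd preconditioned Dirac normal operator
// M^dagger M; `apply_matrix` is a hypothetical stand-in supplied by the caller.
#include <vector>
#include <functional>

using Vec = std::vector<double>;
using MatVec = std::function<void(const Vec&, Vec&)>;   // y = A x

static double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// x holds the initial guess on entry, the solution on exit; returns iterations used.
int cg_solve(const MatVec& apply_matrix, const Vec& b, Vec& x,
             double tol = 1e-8, int max_iter = 1000) {
    const std::size_t n = b.size();
    Vec r(n), p(n), Ap(n);
    apply_matrix(x, Ap);
    for (std::size_t i = 0; i < n; ++i) r[i] = b[i] - Ap[i];   // r = b - A x
    p = r;
    double rr = dot(r, r);
    const double target = tol * tol * dot(b, b);               // stop on relative residual
    for (int k = 0; k < max_iter; ++k) {
        if (rr <= target) return k;
        apply_matrix(p, Ap);
        const double alpha = rr / dot(p, Ap);
        for (std::size_t i = 0; i < n; ++i) { x[i] += alpha * p[i]; r[i] -= alpha * Ap[i]; }
        const double rr_new = dot(r, r);
        const double beta = rr_new / rr;
        for (std::size_t i = 0; i < n; ++i) p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }
    return max_iter;
}
```

Essentially all the time goes into the operator applications and vector updates inside this loop, which is why the inverter is the first piece of code moved onto GPUs.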
SciDAC Impact
Software development
– QCD-friendly APIs and libraries: enable high user productivity
– Allow rapid prototyping & optimization
– Significant software effort for GPUs
Algorithm improvements
– Operators & contractions: clusters (Distillation: PRL (2009))
– Mixed-precision Dirac solvers: INCITE + clusters + GPUs, 2-3X (sketch below)
– Adaptive multi-grid solvers: clusters, ~8X (?)
Hardware development via USQCD Facilities
– Adding support for new hardware
– GPUs
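The 2-3X from mixed precision comes from a defect-correction (iterative-refinement) pattern: run the expensive Krylov iterations in single (or half) precision, and restore full accuracy by recomputing the residual in double precision. Below is a minimal host-side sketch of that pattern, assuming hypothetical stand-ins `apply_d` (double-precision operator) and `solve_f` (approximate single-precision solver); this is not the actual SciDAC library interface.

```cuda
// Illustrative mixed-precision defect correction: accumulate the solution and
// residual in double precision, do each (cheap) inner solve in single precision.
// `apply_d` (double-precision operator) and `solve_f` (approximate single-precision
// solver, e.g. CG run to a loose tolerance) are hypothetical stand-ins, not the
// actual SciDAC library interfaces.
#include <vector>
#include <functional>
#include <algorithm>
#include <numeric>
#include <cmath>

using VecD = std::vector<double>;
using VecF = std::vector<float>;

void mixed_precision_solve(const std::function<void(const VecD&, VecD&)>& apply_d,
                           const std::function<void(const VecF&, VecF&)>& solve_f,
                           const VecD& b, VecD& x,
                           double tol = 1e-10, int max_outer = 50) {
    const std::size_t n = b.size();
    VecD r(n), Ax(n);
    VecF r_f(n), corr_f(n);
    const double bnorm =
        std::sqrt(std::inner_product(b.begin(), b.end(), b.begin(), 0.0));
    for (int k = 0; k < max_outer; ++k) {
        apply_d(x, Ax);                                   // full-precision residual r = b - A x
        double rnorm2 = 0.0;
        for (std::size_t i = 0; i < n; ++i) { r[i] = b[i] - Ax[i]; rnorm2 += r[i] * r[i]; }
        if (std::sqrt(rnorm2) <= tol * bnorm) return;     // converged in double precision
        for (std::size_t i = 0; i < n; ++i) r_f[i] = static_cast<float>(r[i]);
        std::fill(corr_f.begin(), corr_f.end(), 0.0f);
        solve_f(r_f, corr_f);                             // approximate A^{-1} r in float
        for (std::size_t i = 0; i < n; ++i) x[i] += static_cast<double>(corr_f[i]);
    }
}
```

The low-precision inner solver does almost all the work while moving half the bytes per field, which is where the gain on bandwidth-bound GPUs comes from.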
Modern GPU Characteristics
Hundreds of simple cores: high flop rate
SIMD architecture (single instruction, multiple data)
Complex (high-bandwidth) memory hierarchy
Gaming cards: no memory error correction (ECC) – reliability issue
I/O bandwidth << memory bandwidth (kernel sketch below)

Commodity processors    x86 CPU               NVIDIA GT200            New Fermi GPU
#cores                  8                     240                     480
Clock speed             3.2 GHz               1.4 GHz
Main memory bandwidth   20 GB/s               160 GB/s (gaming card)  180 GB/s (gaming card)
I/O bandwidth           7 GB/s (dual QDR IB)  3 GB/s                  4 GB/s
Power                   80 watts              200 watts               250 watts
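The table is headlined by memory bandwidth because the inverter's inner loops stream large fields while doing only a few flops per byte. A minimal CUDA sketch of such a bandwidth-bound, SIMD-style operation (a single-precision axpy over an illustrative lattice-sized array; names and sizes are not from the production code):

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// y <- a*x + y over a large field: ~12 bytes moved for 2 flops per element,
// so device memory bandwidth, not peak flop rate, sets the speed.
__global__ void axpy(float a, const float* x, float* y, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // one element per thread (SIMD-style)
    if (i < n) y[i] = a * x[i] + y[i];               // coalesced loads/stores
}

int main() {
    const int n = 32 * 32 * 32 * 256;                // illustrative lattice-sized volume
    float *x, *y;
    cudaMalloc(&x, n * sizeof(float));
    cudaMalloc(&y, n * sizeof(float));
    cudaMemset(x, 0, n * sizeof(float));
    cudaMemset(y, 0, n * sizeof(float));
    const int block = 256;
    axpy<<<(n + block - 1) / block, block>>>(2.0f, x, y, n);
    cudaDeviceSynchronize();
    printf("done: %d elements\n", n);
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```

Each thread touches 12 bytes to perform 2 flops, so the ~160-180 GB/s figures in the table bound the achievable rate; the same reasoning applies to the Dirac operator itself.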
Inverter Strong Scaling: V = 32^3 x 256
Local volume on GPU too small (I/O bottleneck; estimate below)
(Plot annotation: 3 Tflops)
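A rough way to see why the local volume matters (a back-of-the-envelope sketch, not taken from the slide: it assumes the 32^3 x 256 lattice is split only along the time direction over N GPUs and uses the ~3 GB/s I/O vs ~160 GB/s device bandwidths from the previous slide). The fraction of local sites whose data must cross the I/O link per operator application, and the resulting halo-to-local time ratio, scale like

```latex
% Back-of-the-envelope only: 32^3 x 256 lattice split along the time direction
% over N GPUs, ~3 GB/s I/O vs ~160 GB/s device bandwidth.
\[
  f(N) \;=\; \frac{2}{256/N} \;=\; \frac{N}{128},
  \qquad
  \frac{t_\mathrm{halo}}{t_\mathrm{local}}
    \;\sim\; f(N)\,\frac{160~\mathrm{GB/s}}{3~\mathrm{GB/s}}
    \;\approx\; 0.4\,N .
\]
```

Once that ratio approaches 1 (already at a handful of GPUs in this crude model), the off-card face exchange costs as much as streaming the entire local volume from device memory, and the strong-scaling curve flattens.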
Science / Dollar for (Some) LQCD Capacity Apps
Hardware: ARRA GPU Clusters
GPU clusters: ~530 cards
Quads
– 2.4 GHz Nehalem, 48 GB memory / node
– 117 nodes x 4 GPUs -> 468 GPUs
Singles
– 2.4 GHz Nehalem, 24 GB memory / node
– 64 nodes x 1 GPU -> 64 GPUs
A Large Capacity Resource: 530 GPUs at Jefferson Lab (July)
– 200,000 cores (1,600 million core-hours / year)
– 600 Tflops peak single precision
– 100 Tflops aggregate sustained in the inverter (mixed half / single precision)
– Significant increase in dedicated USQCD resources
– All this for only $1M with hosts, networking, etc.
Disclaimer: to exploit this performance, code has to run on the GPUs, not the CPU (Amdahl's Law problem; worked example below).
SciDAC-2 (& 3) software effort: move more inverters & other code to the GPUs
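The Amdahl's-law point, made concrete (a worked example with assumed numbers, not measured fractions): if a fraction p of the wall-clock time is in code that runs on the GPU with speedup s there, the whole-application speedup is

```latex
% Amdahl's law; p and s below are assumed for illustration, not measured.
\[
  S \;=\; \frac{1}{(1-p) + p/s},
  \qquad
  p = 0.9,\; s = 20
  \;\Rightarrow\;
  S \;=\; \frac{1}{0.1 + 0.045} \;\approx\; 6.9 .
\]
```

So even a 20x inverter leaves the whole job only ~7x faster until the remaining CPU-side code is also moved to the GPUs.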
New Science Reach in 2010-2011: QCD Spectrum
Gauge generation (next dataset)
– INCITE: Crays & BG/Ps, ~16K – 24K cores
– Double precision
Analysis (existing dataset): two classes
– Propagators (Dirac matrix inversions): few-GPU level, single + half precision, no memory error correction
– Contractions: clusters, few cores, double precision + large memory footprint
Cost (TF-yr): new 10 TF-yr, old 1 TF-yr
Isovector Meson Spectrum
Isovector Meson Spectrum (arXiv:1004.4930)
Exotic matter
Exotics: world summary
Exotic matter
– Suggests (many) exotics within range of JLab Hall D
– Previous work: photoproduction rates high
– Current GPU work: (strong) decays – important experimental input
Exotics: first GPU results
Nucleon & Delta Spectrum
First results from GPUs, < 2% error bars
(Plot labels: [56,2+] D-wave and [70,1-] P-wave multiplets)
Discern structure: wave-function overlaps
Change at light quark mass? Decays!
Suggests spectrum at least as dense as the quark model
Extending science reach
USQCD:
– Next calculations at physical quark masses: 100 TF – 1 PF-yr
– New INCITE + Early Science application (ANL+ORNL+NERSC)
– NSF Blue Waters petascale (PRAC)
Need SciDAC-3
– Significant software effort for next-generation GPUs & heterogeneous environments
– Participate in emerging ASCR Exascale initiatives
INCITE + LQCD synergy:
– ARRA GPU system well matched to current leadership facilities
Path to Exascale
Enabled by some hybrid GPU system?
– Cray + NVIDIA??
NSF GaTech: Tier 2 (experimental facility)
– Phase 1: HP cluster + GPU (NVIDIA Tesla)
– Phase 2: hybrid GPU + ASCR Exascale facility
– Case studies for science, software+runtime, hardware
ASCR call for proposals: Exascale Co-Design Center
Exascale capacity resource will be needed
Summary
Capability + capacity + SciDAC
– Deliver science & HEP+NP milestones
Petascale (leadership) + petascale (capacity) + SciDAC-3
– Spectrum + decays
– First contact with experimental resolution
Exascale (leadership) + exascale (capacity) + SciDAC-3
– Full resolution
– Spectrum + transitions
– Nuclear structure
Collaborative efforts: USQCD + JLab user communities
Backup slides
The end
JLab ARRA: Phase 1
JLab ARRA: Phase 2