SciDAC Software Infrastructure for Lattice Gauge Theory
Richard C. Brower
QCD Project Review, May 24-25, 2005
Code distribution: see

Major Participants in Software Project
* Software Coordinating Committee
UK: Peter Boyle, Balint Joo

OUTLINE
  - QCD API Design
  - Completed Components
  - Performance
  - Future

Lattice QCD – extremely uniform
  - (Perfect) load balancing: uniform periodic lattices & identical sublattices per processor.
  - (Complete) latency hiding: overlap of computation and communication.
  - Data parallel: operations on small 3x3 complex matrices per link.
  - Critical kernel: the Dirac solver is ~90% of the flops.
Lattice Dirac operator:
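The operator itself appeared as an image in the original slide. For reference, one standard form (the Wilson discretization, written here from textbook conventions rather than recovered from the slide) is

    D_W(x,y) = (m + 4r)\,\delta_{x,y}
               - \tfrac{1}{2} \sum_{\mu=1}^{4} \Big[ (r - \gamma_\mu)\, U_\mu(x)\, \delta_{x+\hat{\mu},\,y}
                                                   + (r + \gamma_\mu)\, U_\mu^\dagger(x-\hat{\mu})\, \delta_{x-\hat{\mu},\,y} \Big]

where the U_\mu(x) are the SU(3) link matrices (the small 3x3 complex matrices above) and the \gamma_\mu are Dirac matrices.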

SciDAC QCD API
  Level 3: Optimized Dirac operators, inverters
  Level 2: QDP (QCD Data Parallel) – lattice-wide operations, data shifts; exists in C and C++
  Level 1: QMP (QCD Message Passing) – C/C++, implemented over MPI, native QCDOC, M-VIA GigE mesh
           QLA (QCD Linear Algebra) – optimised for P4 and QCDOC
  QIO: binary data files / XML metadata (ILDG collaboration)
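As an illustration of the Level 1 message-passing layer, the fragment below sketches a single nearest-neighbour exchange in the style of the QMP 2.0 interface. The call names (QMP_init_msg_passing, QMP_declare_msgmem, QMP_declare_send_relative, ...) follow the public QMP API, but this is a sketch under assumptions (a 2x2x2x2 machine grid, a hypothetical face buffer size) and the exact signatures should be checked against qmp.h rather than read as authoritative.

    #include <qmp.h>

    int main(int argc, char *argv[])
    {
        QMP_thread_level_t provided;
        QMP_init_msg_passing(&argc, &argv, QMP_THREAD_SINGLE, &provided);

        /* assumed machine grid: 2x2x2x2 = 16 nodes */
        int dims[4] = {2, 2, 2, 2};
        QMP_declare_logical_topology(dims, 4);

        /* faces of the local sublattice exchanged with the mu neighbours */
        static double send_buf[1024], recv_buf[1024];
        QMP_msgmem_t smem = QMP_declare_msgmem(send_buf, sizeof(send_buf));
        QMP_msgmem_t rmem = QMP_declare_msgmem(recv_buf, sizeof(recv_buf));

        int mu = 0;
        QMP_msghandle_t sh = QMP_declare_send_relative(smem, mu, +1, 0);
        QMP_msghandle_t rh = QMP_declare_receive_relative(rmem, mu, -1, 0);

        QMP_start(rh);   /* post receive, then start send:              */
        QMP_start(sh);   /* the transfer can overlap with computation   */
        QMP_wait(sh);
        QMP_wait(rh);

        QMP_free_msghandle(sh);
        QMP_free_msghandle(rh);
        QMP_free_msgmem(smem);
        QMP_free_msgmem(rmem);
        QMP_finalize_msg_passing();
        return 0;
    }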

Fundamental Software Components are in Place:
  - QMP (Message Passing): version 2.0; runs on GigE, native QCDOC, and over MPI.
  - QLA (Linear Algebra): 24,000 C routines generated by Perl, with test code. Optimized subset for P4 clusters in SSE assembly; QCDOC version optimized using BAGEL, an assembly generator for PowerPC, IBM Power, UltraSPARC, Alpha, ... by P. Boyle (UK).
  - QDP (Data Parallel) API: C version for the MILC application code; C++ version for the new code base, Chroma.
  - I/O and ILDG: read/write of XML/binary files compliant with ILDG.
  - Level 3 (optimized kernels): Dirac inverters for QCDOC and P4 clusters.
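To give a feel for what the generated linear-algebra routines compute, here is a stand-alone stand-in for one of them: a site-local 3x3 complex matrix times colour-vector multiply. The function name and types are purely illustrative and do not follow the actual QLA naming scheme; the real routines are machine-generated (and SSE/BAGEL-optimized) rather than written by hand.

    /* Illustrative stand-in for a generated QLA-style kernel: multiply a
       3x3 complex (SU(3)) matrix into a colour vector at one site.
       Names and types are hypothetical, not the real QLA interface. */
    typedef struct { double re, im; } cplx;

    void m_times_v(cplx r[3], const cplx m[3][3], const cplx v[3])
    {
        for (int i = 0; i < 3; i++) {
            double re = 0.0, im = 0.0;
            for (int j = 0; j < 3; j++) {
                re += m[i][j].re * v[j].re - m[i][j].im * v[j].im;
                im += m[i][j].re * v[j].im + m[i][j].im * v[j].re;
            }
            r[i].re = re;
            r[i].im = im;
        }
    }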

Example of QDP++ Expression
A typical term of the Dirac operator, written as QDP/C++ code (see the sketch below):
  - Portable Expression Template Engine (PETE)
  - Temporaries eliminated, expressions optimized
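The code itself was an image in the original slide. A representative expression of the kind shown there, using public QDP++ types (LatticeFermion, LatticeColorMatrix, shift) but written here as a sketch rather than recovered from the slide:

    #include "qdp.h"
    using namespace QDP;

    // One forward hopping term of a Dirac-like operator as a single
    // data-parallel expression: chi(x) = U_mu(x) * psi(x + mu_hat).
    // PETE fuses the right-hand side into one loop over the lattice,
    // with no intermediate temporaries.
    void hop_term(LatticeFermion& chi,
                  const multi1d<LatticeColorMatrix>& u,
                  const LatticeFermion& psi,
                  int mu)
    {
        chi = u[mu] * shift(psi, FORWARD, mu);
    }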

Performance of Optimized Dirac Inverters
  - QCDOC: Level 3 inverters on 1024-processor racks
  - Level 2 on Infiniband P4 clusters at FNAL
  - Level 3 on GigE P4 clusters at JLab and Myrinet P4 clusters at FNAL
  - Future optimization on BlueGene/L et al.

Level 3 on QCDOC

Level 3 Asqtad on QCDOC (double precision)

Level 2 Asqtad on single P4 node

Asqtad on Infiniband Cluster

Level 3 Domain Wall CG Inverter (Ls = 16; P4, 2.8 GHz, 800 MHz FSB; up to 32 nodes)

Alpha Testers on the 1K QCDOC Racks
  - MILC C code: Level 3 Asqtad HMD, 40^3 x 96 lattice, 2+1 flavors
  - Chroma C++ code: Level 3 domain wall HMC, 24^3 x 32 (Ls = 8), 2+1 flavors
  - I/O: reading and writing lattices to RAID and the host AIX box
  - File transfers: one-hop file transfers between BNL, FNAL & JLab

BlueGene/L Assessment
  - June '04: 1024-node BG/L, Wilson HMC sustains 1 Teraflop (peak 5.6); Bhanot, Chen, Gara, Sexton & Vranas, hep-lat/
  - Feb '05: QDP++/Chroma running on the UK BG/L machine
  - June '05: BU and MIT each acquire a 1024-node BG/L
  - '05-'06 software development:
      Optimize linear algebra (BAGEL assembler from P. Boyle) at UK
      Optimize the inverter (Level 3) at MIT
      Optimize QMP at BU/JLab

Future Software Development
  - Ongoing: optimization, testing and hardening of components; new Level 3 and application routines; executables for BG/L, X1E, XT3, Power 3, the PNNL cluster, APEmille, ...
  - Integration: uniform interfaces and a uniform execution environment for the 3 labs; middleware for data sharing with ILDG.
  - Management: data, code and job control as a 3-lab Metacenter (leveraging PPDG).
  - Algorithm research: increased performance (leveraging TOPS & NSF IRT); new physics (scattering, resonances, SUSY, fermion signs, etc.?).