
1 SciDAC Software Infrastructure for Lattice Gauge Theory. Richard C. Brower. Annual Progress Review, JLab, May 14, 2007. Code distribution: see http://www.usqcd.org/software.html

2 Software Committee: Rich Brower (chair) brower@bu.edu, Carleton DeTar detar@physics.utah.edu, Robert Edwards edwards@jlab.org, Don Holmgren djholm@fnal.gov, Bob Mawhinney rdm@phys.columbia.edu, Chip Watson watson@jlab.org, Ying Zhang yingz@renci.org. SciDAC-2 minutes/documents & progress report: http://super.bu.edu/~brower/scc.html

3 Major Participants in SciDAC Project:
Arizona: Doug Toussaint, Dru Renner
BU: Rich Brower *, James Osborn, Mike Clark
BNL: Chulwoo Jung, Enno Scholz, Efstratios Efstathiadis
Columbia: Bob Mawhinney *
DePaul: Massimo DiPierro
FNAL: Don Holmgren *, Jim Simone, Jim Kowalkowski, Amitoj Singh
MIT: Andrew Pochinsky, Joy Khoriaty
North Carolina: Rob Fowler, Ying Zhang *
JLab: Chip Watson *, Robert Edwards *, Jie Chen, Balint Joo
IIT: Xien-He Sun
Indiana: Steve Gottlieb, Subhasish Basak
Utah: Carleton DeTar *, Ludmila Levkova
Vanderbilt: Ted Bapty
* Software Committee. Participants funded in part by the SciDAC grant.

4 QCD Software Infrastructure. Goal: create a unified software environment that will enable the US lattice community to achieve very high efficiency on diverse high-performance hardware. Requirements: (1) build on the 20-year investment in MILC/CPS/Chroma; (2) optimize critical kernels for peak performance; (3) minimize the software effort needed to port to new platforms and to create new applications.

5 Solution for Lattice QCD. (Perfect) load balancing: uniform periodic lattices and identical sub-lattices per processor. (Complete) latency hiding: overlap computation and communication. Data parallel: operations on small 3x3 complex matrices per link. Critical kernels (Dirac solver, HMC forces, etc.) account for 70%-90% of execution time. Lattice Dirac operator:
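The slide's operator formula is an image that does not survive in this transcript. For reference, one standard form of the Wilson lattice Dirac operator (an assumption about which discretization the slide displays) is:

    D_{x,y} = \delta_{x,y} - \kappa \sum_{\mu=1}^{4}
      \left[ (1-\gamma_\mu)\, U_\mu(x)\, \delta_{x+\hat\mu,\,y}
           + (1+\gamma_\mu)\, U_\mu^\dagger(x-\hat\mu)\, \delta_{x-\hat\mu,\,y} \right]

Here the U_\mu(x) are the 3x3 complex link matrices mentioned above and \kappa is the hopping parameter. Applying D couples each site only to its nearest neighbors, which is what makes the uniform sub-lattice decomposition and communication/computation overlap effective.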

6 SciDAC-1 QCD API (C/C++, implemented over MPI, native QCDOC, M-via GigE mesh):
Level 3: Optimized Dirac operators and inverters (optimized for P4 and QCDOC).
Level 2: QDP (QCD Data Parallel): lattice-wide operations, data shifts (exists in C and C++).
Level 1: QMP (QCD Message Passing); QLA (QCD Linear Algebra); QIO: binary/XML metadata files (ILDG collaboration).
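As an illustration of the Level 1 message-passing layer, a nearest-neighbor transfer in QMP might look roughly like the sketch below. This is a minimal sketch only, assuming the standard QMP entry points; buffer size and direction are illustrative, and the QMP headers should be consulted for exact signatures.

    #include <qmp.h>

    int main(int argc, char **argv) {
        QMP_thread_level_t provided;
        QMP_init_msg_passing(&argc, &argv, QMP_THREAD_SINGLE, &provided);

        // Buffer holding the boundary slice to ship to the neighbor in direction mu.
        static double sendbuf[1024];
        QMP_msgmem_t mem = QMP_declare_msgmem(sendbuf, sizeof(sendbuf));

        // Declare a send to the neighboring node in the +mu direction of the machine grid.
        int mu = 0;
        QMP_msghandle_t mh = QMP_declare_send_relative(mem, mu, +1, 0);

        QMP_start(mh);   // start the (possibly asynchronous) transfer
        QMP_wait(mh);    // block until complete; real code overlaps computation here to hide latency

        QMP_free_msghandle(mh);
        QMP_free_msgmem(mem);
        QMP_finalize_msg_passing();
        return 0;
    }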

7 Data Parallel QDP/C, C++ API: hides architecture and data layout; operates on lattice fields across sites; linear algebra tailored for QCD; shifts and permutation maps across sites; reductions; subsets; entry/exit points to attach to existing codes.

8 Example of QDP++ Expression: a typical term of the Dirac operator written as a single QDP/C++ statement. QDP++ uses the Portable Expression Template Engine (PETE), so temporaries are eliminated and expressions are optimized.
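The slide's equation and code listing are images that do not survive in this transcript. A representative QDP++ fragment for one gauge-covariant hopping term, chi(x) = U_mu(x) psi(x + mu-hat), might look like the sketch below; this is an assumption about the kind of expression the slide shows, with types and shift conventions following the public QDP++ interface.

    #include "qdp.h"
    using namespace QDP;

    // One hopping term of the Dirac operator: chi(x) = U_mu(x) * psi(x + mu-hat)
    void hop_term(LatticeFermion& chi,
                  const LatticeColorMatrix& u_mu,   // gauge link in direction mu
                  const LatticeFermion& psi,
                  int mu)
    {
        // shift(psi, FORWARD, mu) fetches psi from the neighboring site x + mu-hat,
        // performing any inter-node communication behind the scenes.
        // PETE fuses the multiply and the assignment into one pass with no temporaries.
        chi = u_mu * shift(psi, FORWARD, mu);
    }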

9 SciDAC-2 QCD API (SciDAC-1/SciDAC-2 components shown in gold/blue in the original diagram):
Level 4: Workflow and data analysis tools; QCD Physics Toolbox: shared algorithms, building blocks, visualization, performance tools; uniform user environment: runtime, accounting, grid.
Application codes: MILC / CPS / Chroma / roll-your-own.
Level 3: QOP (optimized in asm): Dirac operator, inverters, forces, etc.
Level 2: QDP (QCD Data Parallel): lattice-wide operations, data shifts.
Level 1: QMP (QCD Message Passing); QLA (QCD Linear Algebra); QIO: binary/XML files & ILDG; QMC (QCD Multi-core interface).
External collaborations: PERI, TOPS.

10 Level 3 Domain Wall CG Inverter: performance vs. lattice volume, L_s = 16; the "4g" cluster is P4 2.8 GHz with 800 MHz FSB; 16^3 lattice run on 32 nodes. [Performance chart not reproduced in this transcript.]

11 Asqtad Inverter on the Kaon cluster @ FNAL: Mflop/s per core vs. L, comparing MILC C code vs. SciDAC/QDP on L^4 sub-volumes for 16- and 64-core partitions of Kaon. [Performance chart not reproduced.]

12 Level 3 on QCDOC: Asqtad CG on L^4 sub-volumes (Mflop/s vs. L); Asqtad RHMC kernels on 24^3 x 32 with sub-volumes 6^3 x 18; DW RHMC kernels on 32^3 x 64 x 16 with sub-volumes 4^3 x 8 x 16. [Performance charts not reproduced.]

13 Building on SciDAC-1.
Common runtime environment: (1) file transfer, batch scripts, compile targets; (2) a practical three-laboratory "metafacility".
Fuller use of the API in application code: (1) integrate QDP into MILC and QMP into CPS; (2) universal use of QIO, file formats, QLA, etc.; (3) Level 3 interface standards.
Porting the API to INCITE platforms: (1) BG/L & BG/P: QMP and QLA using XLC & Perl script; (2) Cray XT4 & Opteron clusters.

14 New SciDAC-2 Goals.
Workflow and cluster reliability: (1) automated campaign to merge lattices and propagators and extract physics (FNAL & Illinois Institute of Technology); (2) cluster reliability (FNAL & Vanderbilt).
Toolbox of shared algorithms and building blocks: (1) RHMC, eigenvector solvers, etc.; (2) visualization and performance analysis (DePaul & PERC); (3) multi-scale algorithms (QCD/TOPS collaboration).
Exploitation of multi-core: (1) multi-core, not clock speed (Hertz), is the new paradigm; (2) plans for a QMC API (JLab & FNAL & PERC).
More information: http://lqcd.fnal.gov/workflow/WorkflowProject.html and http://www.yale.edu/QCDNA/. See also the SciDAC-2 kickoff workshop, Oct 27-28, 2006: http://super.bu.edu/~brower/workshop

15 QMC – QCD Multi-Threading.
General evaluation of OpenMP vs. an explicit thread library (Chen): an explicit thread library can do better than OpenMP, but OpenMP performance is compiler dependent.
Simple threading API, QMC: based on the older smp_lib (Pochinsky); uses pthreads and investigates barrier-synchronization algorithms.
Evaluate threads for SSE-Dslash.
Consider a threaded version of QMP (Fowler and Porterfield at RENCI).
[Diagram: serial code forks into parallel work on sites 0..7 and 8..15 (other threads idle), then rejoins at finalize / thread join.]
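A minimal sketch of the fork/join pattern in the diagram above, using pthreads and a barrier as the slide describes. This illustrates the idea only and is not the actual QMC interface; the function and variable names and the site counts are illustrative.

    #include <pthread.h>
    #include <cstdio>

    const int NUM_THREADS = 2;
    const int NUM_SITES   = 16;       // sites 0..15, split across threads as in the diagram

    static pthread_barrier_t barrier;
    static double field[NUM_SITES];   // toy per-site data

    struct ThreadArg { int id; };

    // Each thread updates its contiguous block of sites, then synchronizes at the barrier.
    void* site_worker(void* p) {
        const int id    = static_cast<ThreadArg*>(p)->id;
        const int chunk = NUM_SITES / NUM_THREADS;
        const int lo    = id * chunk;
        const int hi    = lo + chunk;              // thread 0: sites 0..7, thread 1: sites 8..15

        for (int s = lo; s < hi; ++s)
            field[s] += 1.0;                       // stand-in for the real per-site kernel

        pthread_barrier_wait(&barrier);            // barrier sync among worker threads
        return nullptr;
    }

    int main() {
        pthread_t tid[NUM_THREADS];
        ThreadArg arg[NUM_THREADS];
        pthread_barrier_init(&barrier, nullptr, NUM_THREADS);

        for (int i = 0; i < NUM_THREADS; ++i) {    // fork: parallel section over site blocks
            arg[i].id = i;
            pthread_create(&tid[i], nullptr, site_worker, &arg[i]);
        }
        for (int i = 0; i < NUM_THREADS; ++i)      // finalize / thread join: serial code resumes
            pthread_join(tid[i], nullptr);

        pthread_barrier_destroy(&barrier);
        std::printf("site 0 = %f\n", field[0]);
        return 0;
    }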

16 Conclusions.
Progress has been made using a common QCD API and shared libraries for communication, linear algebra, I/O, optimized inverters, etc.
But full implementation, optimization, documentation, and maintenance of shared codes is a continuing challenge.
And there is much work to do to keep up with changing hardware and algorithms.
Still, NEW users (young and old) with no prior lattice experience have initiated new lattice QCD research using SciDAC software!
The bottom line: PHYSICS is being well served.

