BU SciDAC Meeting
Bálint Joó, Jefferson Lab

Anisotropic Clover: Why do it?
- Anisotropy -> fine temporal lattice spacing at moderate cost
- Combine with group-theoretical baryon operators -> access to excited states
- Nice preliminary results – with just Wilson: excited states, states with spin 5/2+
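
To put "moderate cost" in perspective (a rough estimate, not a number from the slide): with anisotropy ξ = a_s/a_t and the spatial spacing held fixed, the site count – and hence, to first approximation, the cost – grows only linearly in ξ,

    N_sites = N_s^3 × N_t = N_s^3 × (ξ N_t^iso),

whereas refining isotropically to the same temporal spacing would multiply the four-volume by roughly ξ^4.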

Anisotropic Clover Why do it ?  Part of Jlab 3 prong Lattice QCD programme Prong 1: Dynamical Anisotropic Clover Prong 2: DWF on a staggered sea (MILC Configs) Prong 3: Large Scale Dynamical DWF  This programme was specially commended by the DOE at our recent Science and Technology Review  Anisotropic Clover is a major part of the INCITE proposal (for XT3 and BG/?) machines

Anisotropic Clover: Level 2
- Clover term, its inverse, and the force term wired into Chroma -> provides HMC/RHMC
- Our choice of gauge action: plaquette + rectangle + adjoint term
- Fermion action: anisotropic Clover + stout smearing (stout force recursion)
- Usual barrage of dynamical-fermion techniques:
  - Hasenbusch preconditioning + chronological predictors for the 2 flavours
  - RHMC for the +1 flavour
  - multiple-time-scale integrators (sketched below)
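
On the last point: a multiple-time-scale (Sexton-Weingarten style) integrator puts cheap forces on a fine step and expensive forces on a coarse one. The following is a minimal, self-contained sketch, not Chroma's integrator – the "field" is a single real variable and the force split is artificial.

    // Nested two-scale leapfrog sketch. Cheap force on the inner scale,
    // expensive force on the outer scale; both half-kicked at the ends so the
    // update stays reversible and area-preserving.
    #include <cstdio>
    #include <functional>

    using Kick = std::function<void(double q, double& p, double dt)>;

    void leapfrog_2scale(double& q, double& p,
                         const Kick& outer, const Kick& inner,
                         double tau, int n_outer, int n_inner)
    {
        const double dt_out = tau / n_outer;
        const double dt_in  = dt_out / n_inner;

        outer(q, p, 0.5 * dt_out);                    // expensive half-kick
        for (int i = 0; i < n_outer; ++i) {
            inner(q, p, 0.5 * dt_in);                 // cheap half-kick
            for (int j = 0; j < n_inner; ++j) {
                q += dt_in * p;                       // drift
                inner(q, p, (j + 1 < n_inner) ? dt_in : 0.5 * dt_in);
            }
            outer(q, p, (i + 1 < n_outer) ? dt_out : 0.5 * dt_out);
        }
    }

    int main() {
        // Split V = 0.5 q^2 artificially: 90% "cheap" (inner), 10% "expensive" (outer).
        Kick inner = [](double q, double& p, double dt) { p -= dt * 0.9 * q; };
        Kick outer = [](double q, double& p, double dt) { p -= dt * 0.1 * q; };

        double q = 1.0, p = 0.0;
        leapfrog_2scale(q, p, outer, inner, /*tau=*/1.0, /*n_outer=*/10, /*n_inner=*/5);
        // Energy should be conserved to O(dt^2): H = 0.5*(p^2 + q^2) ~ 0.5
        std::printf("H = %.6f (exact 0.5)\n", 0.5 * (p * p + q * q));
        return 0;
    }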

CG Inverter Performance
We only got 7.3 Tflops on 8K CPUs :( – roughly 0.9 Gflops per CPU – but we didn't put much work into optimization at all.
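
For orientation, here is the shape of a CG iteration (a textbook sketch, not Chroma's inverter; the operator and vector types are stand-ins). Essentially all the flops live in the operator application – the Dslash/clover term – which is why that is the natural optimization target.

    #include <algorithm>
    #include <cstdio>
    #include <functional>
    #include <vector>

    using Vec = std::vector<double>;
    using Op  = std::function<void(const Vec&, Vec&)>;  // y = A x, with A SPD

    static double dot(const Vec& a, const Vec& b) {
        double s = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
        return s;
    }

    // Solve A x = b to relative residual `tol`; returns the iteration count.
    static int cg(const Op& apply_A, const Vec& b, Vec& x, double tol, int max_iter) {
        Vec r = b, p = b, Ap(b.size());                 // x0 = 0  =>  r0 = b
        std::fill(x.begin(), x.end(), 0.0);
        double rr = dot(r, r);
        const double rr_stop = tol * tol * rr;
        for (int k = 0; k < max_iter; ++k) {
            apply_A(p, Ap);                             // the expensive step: ~all flops live here
            const double alpha = rr / dot(p, Ap);
            for (std::size_t i = 0; i < x.size(); ++i) { x[i] += alpha * p[i]; r[i] -= alpha * Ap[i]; }
            const double rr_new = dot(r, r);
            if (rr_new <= rr_stop) return k + 1;
            const double beta = rr_new / rr;
            for (std::size_t i = 0; i < p.size(); ++i) p[i] = r[i] + beta * p[i];
            rr = rr_new;
        }
        return max_iter;
    }

    int main() {
        // Tiny usage example with A = diag(1,2,3,4); the solution is (1, 1/2, 1/3, 1/4).
        Op A = [](const Vec& v, Vec& y) {
            for (std::size_t i = 0; i < v.size(); ++i) y[i] = double(i + 1) * v[i];
        };
        Vec b = {1, 1, 1, 1}, x(4);
        const int iters = cg(A, b, x, 1e-12, 100);
        std::printf("converged in %d iterations, x[3] = %f\n", iters, x[3]);
        return 0;
    }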

Clover Work Under SciDAC 2
Performance is OK, but we want better...
Optimizations:
- Clover SSE optimizations for clusters & the XT3 (see the kernel sketch below)
- BAGEL terms for BG/???
- multi-mass inverter, trace terms
- would like to optimize the actual bottleneck:
  - the CG inverter is not the current bottleneck
  - help from our friends at RENCI in identifying the exact hotspots? (right now we rely on gprof)
Algorithmic: temporal preconditioning (later)
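
To give a flavour of what SSE optimization means here, below is one packed single-precision complex multiply written with SSE3 intrinsics – a generic illustration, not the actual Chroma/Dslash kernels.

    // Each __m128 holds two complex numbers as (re0, im0, re1, im1).
    #include <pmmintrin.h>  // SSE3: _mm_moveldup_ps, _mm_movehdup_ps, _mm_addsub_ps
    #include <cstdio>

    static inline __m128 complex_mul(__m128 a, __m128 b) {
        __m128 re = _mm_moveldup_ps(a);   // (a.re, a.re, ...) per complex lane
        __m128 im = _mm_movehdup_ps(a);   // (a.im, a.im, ...) per complex lane
        __m128 bswap = _mm_shuffle_ps(b, b, _MM_SHUFFLE(2, 3, 0, 1)); // (b.im, b.re, ...)
        // (a.re*b.re - a.im*b.im, a.re*b.im + a.im*b.re) per complex lane
        return _mm_addsub_ps(_mm_mul_ps(re, b), _mm_mul_ps(im, bswap));
    }

    int main() {
        float a[4] = {1, 2, 3, 4}, b[4] = {5, 6, 7, 8}, c[4];
        _mm_storeu_ps(c, complex_mul(_mm_loadu_ps(a), _mm_loadu_ps(b)));
        // (1+2i)(5+6i) = -7+16i ; (3+4i)(7+8i) = -11+52i
        std::printf("(%g,%g) (%g,%g)\n", c[0], c[1], c[2], c[3]);
        return 0;
    }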

Thoughts at the back of my mind
Are we actually going to get any time at ORNL?
- We asked for a lot – I think 20M CPU hours just for the Clover stuff
- The INCITE proposal was extremely hurried; we had to respond very quickly
- Many small groups did not stand a chance
How much effort should we be investing?
Should we be focusing on BlueGene/? and clusters more?

CRE and ILDG
Progress on the CRE (Common Runtime Environment) has been slow. Why?
- Manpower reasons in SciDAC 1?
- People are happily running production already without it? In which case is it just LOW VALUE? Where are the 'armies of new users' who need it?
What are the issues?
- Intimately tied to the infrastructure at each site:
  - site infrastructure leverages off experiments, so it is different everywhere
- High maintenance: PBS, LoadLeveler, NFS? dCache, anyone? upgrades of mvapich, OpenMPI, the IB fabric, etc.
- Inherently non-portable (what about ANL/ORNL?)

CRE and ILDG
If it has low value, no user demand, is high maintenance, and won't work outside our sites...
- is it worth doing?
- can we just drop it? PLEASE?
- Anyway, common environments are so passé and 90s. Nowadays we should think about 'interoperable grid environments' – they're IN!

ILDG Middleware
Progress has been made, but:
- we are still on the eXist MDC (metadata catalogue)
- the RC (replica catalogue) is dumb: it just remaps the LFN to an FNAL dCache name (see the sketch below)
Issues:
- Where is all the markup?
- Eventually we need a more sophisticated RC?
- The markup is NOT anisotropy-aware (future fights in the MDWG – this will take time)
- Working towards interoperability: meeting at JLab in December – can folks from BNL and FNAL come?
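
Concretely, a "dumb RC" is little more than a deterministic string rewrite, along these lines. This is a hypothetical sketch: the LFN scheme and the dCache path below are placeholders, not the real FNAL layout.

    // No database lookup: just rewrite the logical file name (LFN) into a
    // site URL. Both prefixes are hypothetical placeholders.
    #include <stdexcept>
    #include <string>

    std::string lfn_to_url(const std::string& lfn) {
        const std::string scheme = "lfn://ildg/";                              // placeholder LFN scheme
        const std::string site   = "srm://dcache.example.fnal.gov/pnfs/ildg/"; // placeholder site prefix
        if (lfn.compare(0, scheme.size(), scheme) != 0)
            throw std::invalid_argument("not an ILDG LFN: " + lfn);
        return site + lfn.substr(scheme.size());                               // remap LFN -> dCache name
    }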

Testing and Release
Unit testing vs. end-to-end testing: there is too much existing code.
- We intermix QMP, QDP++, QIO, XPathReader, LIME, Chroma, Wilson Dslash or BAGEL Dslash, possibly BAGEL linear algebra, and the level-3 CG-DWF
- Unit testing all of these is difficult
End-to-end tests:
- compare the final result, e.g. correlation functions
- lots of output – selective diffs? QDP++ uses XML, so we do selective diffs through XMLDiff (the idea is sketched below)
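
The selective-diff idea is roughly: pull chosen XPaths out of the expected and actual output XML and compare them numerically within a tolerance. This sketch uses libxml2's XPath API and is not the actual XMLDiff tool; the XPaths and tolerances are hypothetical.

    // Build with: g++ seldiff.cc $(xml2-config --cflags --libs)
    #include <libxml/parser.h>
    #include <libxml/xpath.h>
    #include <cmath>
    #include <cstdio>
    #include <cstdlib>

    static double value_at(xmlDocPtr doc, const char* xpath) {
        xmlXPathContextPtr ctx = xmlXPathNewContext(doc);
        xmlXPathObjectPtr obj = xmlXPathEvalExpression(BAD_CAST xpath, ctx);
        if (!obj || !obj->nodesetval || obj->nodesetval->nodeNr == 0) {
            std::fprintf(stderr, "xpath not found: %s\n", xpath);
            std::exit(1);
        }
        xmlChar* text = xmlNodeGetContent(obj->nodesetval->nodeTab[0]);
        double v = text ? std::atof(reinterpret_cast<const char*>(text)) : 0.0;
        xmlFree(text);
        xmlXPathFreeObject(obj);
        xmlXPathFreeContext(ctx);
        return v;
    }

    int main(int argc, char** argv) {
        if (argc != 3) { std::fprintf(stderr, "usage: seldiff expected.xml actual.xml\n"); return 2; }
        xmlDocPtr exp = xmlParseFile(argv[1]), act = xmlParseFile(argv[2]);
        if (!exp || !act) { std::fprintf(stderr, "parse error\n"); return 2; }

        // In the real setup these xpath/tolerance pairs would come from the
        // per-test "metric file"; here they are hard-coded, hypothetical examples.
        struct { const char* xpath; double tol; } checks[] = {
            { "/chroma/HadronSpectrum/pion_corr/elem[1]", 1e-8 },
            { "/chroma/Observables/plaquette",            1e-10 },
        };
        int failures = 0;
        for (auto& c : checks) {
            double e = value_at(exp, c.xpath), a = value_at(act, c.xpath);
            if (std::fabs(e - a) > c.tol * (std::fabs(e) + 1.0)) {
                std::printf("FAIL %s: expected %.12g got %.12g\n", c.xpath, e, a);
                ++failures;
            }
        }
        xmlFreeDoc(exp); xmlFreeDoc(act);
        return failures ? 1 : 0;
    }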

Structure
A test consists of:
- an executable, input XML, and expected output XML
- a metric file to decide which bits of the output we need to check
A Runner abstracts away the running (see the sketch below):
- trivial runner (just re-echoes your commands)
- MPIRUN runner (runs on 2 JLab IB nodes)
- prototype YOD runner (for the XT3)
- LoadLeveler runner (for BG/L) – yucky
Driver scripts:
- run interactively (e.g. scalar targets) & check
- submit jobs to a queue, check later (for queue systems)
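
Rendered as code, the runner abstraction might look like this. The real drivers are scripts; this C++ rendering, the class names, and the command-line flags are all illustrative assumptions.

    // Same test, different launch mechanisms behind one interface.
    #include <cstdlib>
    #include <string>

    struct Runner {
        virtual ~Runner() = default;
        // Launch `exe` with `input_xml`, writing results to `output_xml`;
        // return the exit status.
        virtual int run(const std::string& exe,
                        const std::string& input_xml,
                        const std::string& output_xml) const = 0;
    };

    // Trivial runner: just executes the command locally (scalar targets).
    struct TrivialRunner : Runner {
        int run(const std::string& exe, const std::string& in,
                const std::string& out) const override {
            return std::system((exe + " -i " + in + " -o " + out).c_str());
        }
    };

    // MPIRUN runner: wraps the same command in mpirun (e.g. 2 IB nodes).
    struct MpirunRunner : Runner {
        int nprocs;
        explicit MpirunRunner(int n) : nprocs(n) {}
        int run(const std::string& exe, const std::string& in,
                const std::string& out) const override {
            return std::system(("mpirun -np " + std::to_string(nprocs) + " "
                                + exe + " -i " + in + " -o " + out).c_str());
        }
    };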

What has testing taught us?
We run through this regression framework nightly: gcc3, gcc4, scalar, parscalar-ib.
- What runs fine with gcc 3.x on RHEL won't necessarily run fine with gcc 4.x on FC5
- Maintenance: keep up with compilers and identify problems:
  - ICC: 'catastrophic error: can't allocate register' (SSE inline)
  - VACPP (XLC): 'Internal compiler error: please contact your IBM representative' on templates
  - PGI: no inline assembler? intrinsics?
- We really MUST focus on this issue
- ...or will it be GCC 3.4.x forever? (it seems the most stable so far)

SciDAC Release Pages?
What's the actual problem here?
- The JLab page has releases that live in the JLab CVS release directory, with previous versions kept (by vox populi)
- We strive to keep the pages up to date
- Not everyone uses the JLab CVS. Why?
  - Do you prefer to run your own repository?
  - Do you want to use Subversion?
  - Do you think only sissies use version control?
- Centralizing release management is bad: imagine if I had to be responsible for the release of a code that I myself could only pick up from a web page
- Is it only John Kogut who is unhappy?

A possible solution to the problem (which may or may not exist)
- A SourceForge-like setup (GForge)
- Provides:
  - per-project web space and release tarball space
  - source code management modules (CVS & SVN)
  - may be able to 'proxy' for your own repo
  - mailing lists, a bug tracker, news feeds, yadda yadda
  - wiki-like authentication
- Our new sysadmins are installing this at JLab
- But all the effort is wasted if folks don't use it...