TAU integration with Score-P
  Allen D. Malony, Sameer Shende
  {malony,sameer}@cs.uoregon.edu
  LMAC meeting, April 2012, T.U. Munich, Garching, Germany
  Virtual Institute High Productivity Supercomputing (VI-HPS)
  Performance Refactoring: Instrumentation, Measurement, Analysis (PRIMA)
Overview
  Introduction
  TAU Overview
  Status of TAU integration with Score-P
  Future work
TAU Performance System®
  Parallel performance framework and toolkit
  Supports all HPC platforms, compilers, runtime systems
  Provides portable instrumentation, measurement, analysis
  http://tau.uoregon.edu
TAU Performance System® (2)
  Instrumentation
    Fortran, C++, C, UPC, Java, Python, Chapel
    Automatic instrumentation
  Measurement and analysis support
    MPI, OpenSHMEM, ARMCI, PGAS
    pthreads, OpenMP, other thread models
    GPU, CUDA, OpenCL, OpenACC
    Parallel profiling and tracing
    Use of Score-P for native OTF2 generation capability
  Analysis
    Parallel profile analysis (ParaProf), data mining (PerfExplorer)
    Performance database technology (PerfDMF, TAUdb)
    3D profile browser
TAU Performance System® (3)
  Automatic instrumentation
    Source level
      Program Database Toolkit (PDT)
      ROSE to PDB generator
      routines, loops, memory, I/O, UPC constructs, …
    Compiler based instrumentation
      GNU, IBM, NAG, Intel, PGI, Pathscale, Cray
    Binary rewriting
      DyninstAPI, MAQAO, PEBIL (in progress)
    Automatic wrapper library generator
      tau_gen_wrapper
      MPI, I/O, memory, …
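The wrapper-library idea on this slide can be pictured with a small sketch. It is only a Python analogue of interposing a timing wrapper under a routine's original name; it is not the output of tau_gen_wrapper and uses no TAU API. The names `wrap_with_timer`, `TIMINGS`, and `write_block` are hypothetical.

```python
import functools
import time
from collections import defaultdict

# Hypothetical in-memory profile: routine name -> [call count, total seconds].
TIMINGS = defaultdict(lambda: [0, 0.0])


def wrap_with_timer(func):
    """Return a wrapper that times each call to func and forwards to it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Record the call even if the wrapped routine raised.
            entry = TIMINGS[func.__name__]
            entry[0] += 1
            entry[1] += time.perf_counter() - start
    return wrapper


def write_block(data):
    """Stand-in for a library routine such as an I/O call."""
    return len(data)


# Interpose the wrapper under the original name, so callers need no changes.
write_block = wrap_with_timer(write_block)

write_block(b"abc")
write_block(b"defgh")
```

Callers keep using `write_block` unchanged. That is the point of the wrapper approach: the measurement lives in the interposed layer, not in the application source.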
TAU Performance System® (4)
  Other
    Callstack debugging
    I/O and heap memory usage
    Leak detection
  Interfaces
    PAPI for hardware counters
    Scalasca and Vampir for OTF2 tracing and visualization
    Dyninst and MAQAO for binary instrumentation
    Integration with Score-P going forward
  BSD style license
  http://tau.uoregon.edu
Score-P with TAU Integration
  Score-P software architecture (diagram)
    Tools: Vampir, Scalasca, TAU, Periscope
    Score-P measurement infrastructure
      Event traces (OTF2 format)
      Call-path profiles (CUBE4 and TAU formats)
      Online interface
      Hardware counters
      Memory management
      Other …
    Target application (MPI, OpenMP, hybrid, serial)
    Instrumentation: Compiler, TAU instrumentor, OPARI 2, User
Score-P Software Components
  Separate software components (diagram):
    Tools: Vampir, Scalasca, TAU, Periscope
    Score-P measurement infrastructure
      Event traces (OTF2 format)
      Call-path profiles (CUBE4 and TAU formats)
      Online interface
      Hardware counters
      Memory management
      Other …
    Target application (MPI, OpenMP, hybrid, serial)
    Instrumentation: Compiler, TAU instrumentor, OPARI 2, User
Example: Score-P with TAU (LU NPB)
  TAU v2.21.2, PDT v3.17, with Score-P v1.0.1
Status and Future Work (PRIMA, LMAC)
  Existing functionality
    Regions mapped to interval timers in TAU
    Cube 4 profile reader integrated in ParaProf [Pavel]
    TAU instrumentation components can use Score-P
    Opari2 integrated in TAU
    ParaProf, PerfDMF, and PerfExplorer interoperability with CUBE
    LiveDVD updated from 32-bit FC11 to 32-bit and 64-bit FC16
    Tutorials and outreach activities
  To Do
    Integrate a generic thread library callback mechanism
    Tau2otf2 trace converter for legacy TAU traces
    Event-based sampling / hybrid profiling integration (Zoltan/Chee Wai Lee)
  PRIMA: VI-HPS visitors to U. Oregon
    2010-2011: Daniel Lorenz, Christian Roessel, Pavel Saviankou
    Summer 2012: Daniel Lorenz, Christian Roessel
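The item "Regions mapped to interval timers" can be pictured with a small sketch. It is an illustration only, not TAU or Score-P code: the event tuples, the function name `profile_from_events`, and the sample data are hypothetical. It assumes properly nested, time-ordered enter/exit events and no recursive regions (recursion would double-count inclusive time in this simple form).

```python
from collections import defaultdict


def profile_from_events(events):
    """Fold (time, kind, region) events into per-region interval timers.

    kind is "enter" or "exit". Returns a dict mapping each region name to
    its call count, inclusive time, and exclusive time.
    """
    timers = defaultdict(
        lambda: {"calls": 0, "inclusive": 0.0, "exclusive": 0.0}
    )
    stack = []  # each entry: [region, start_time, time_spent_in_children]
    for timestamp, kind, region in events:
        if kind == "enter":
            stack.append([region, timestamp, 0.0])
        elif kind == "exit":
            if not stack or stack[-1][0] != region:
                raise ValueError(f"unmatched exit for region {region!r}")
            name, start, child_time = stack.pop()
            elapsed = timestamp - start
            timer = timers[name]
            timer["calls"] += 1
            timer["inclusive"] += elapsed
            timer["exclusive"] += elapsed - child_time
            if stack:
                # Charge this interval to the parent's child time.
                stack[-1][2] += elapsed
        else:
            raise ValueError(f"unknown event kind {kind!r}")
    if stack:
        raise ValueError("some regions were entered but never exited")
    return dict(timers)


# Hypothetical event stream: main calls solve twice.
events = [
    (0.0, "enter", "main"),
    (1.0, "enter", "solve"),
    (4.0, "exit", "solve"),
    (5.0, "enter", "solve"),
    (7.0, "exit", "solve"),
    (10.0, "exit", "main"),
]
profile = profile_from_events(events)
```

Here each region ends up with one timer holding its call count plus inclusive and exclusive time, which is the kind of per-region record a profile viewer displays.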
Projects
  DOE
    PRIMA
      Performance refactoring for core infrastructure
      TAU integration with Score-P
    Vancouver
      Focus on heterogeneous systems
      Development of GPU performance measurement
    SUPER
      Multi-institutional
      Performance, energy, resilience
  NSF
    Glass Box (pending, with University of Houston, Georgia Tech)
      Interfaces within HPC software stack