TAU Performance System
IBM Blue Gene Consortium

TAU Parallel Performance System
  Multi-level performance instrumentation
  Multi-language automatic source instrumentation
  Flexible and configurable performance measurement
  Widely-ported parallel performance profiling system
    Computer system architectures and operating systems
    Different programming languages and compilers
  Support for multiple parallel programming paradigms
    Multi-threading, message passing, mixed-mode, hybrid
TAU Port to IBM BG/P
  Supports automatic instrumentation at:
    Source level (PDT, tau_instrumentor; KOJAK, opari)
    MPI
  Flexible and configurable performance measurement
    Support for profiling and tracing
    Support for PAPI counters on BG/P
  Uses bgxlC_r, bgxlc_r, bgxlf90_r as compilers
  To configure TAU:
    ./installtau -arch=bgp -mpi -pdt= -pdt_c++=xlC -papi=
    ./tau_validate --html --build bgp >& results.html
  Parallel Profile Analysis:
    ParaProf profile browser
    PerfDMF profile database
    PerfExplorer cross-experiment data analysis toolkit
Using TAU on IBM BG/P (surveyor.alcf.anl.gov)
  Choose measurement configuration
    % ls /soft/apps/tau/tau_latest/bgp/lib/Makefile.*
    Makefile.tau-mpi-pdt
    Makefile.tau-mpi-pdt-trace
    Makefile.tau-callpath-mpi-pdt
    Makefile.tau-callpath-mpi-compensate-pdt
    Makefile.tau-depthlimit-mpi-pdt
    Makefile.tau-mpi-compensate-pdt
    Makefile.tau-multiplecounters-mpi-papi-pdt
    Makefile.tau-multiplecounters-mpi-papi-pdt-trace
    Makefile.tau-multiplecounters-papi-pdt
    Makefile.tau-multiplecounters-pthread-papi-pdt
    Makefile.tau-pdt
    Makefile.tau-phase-multiplecounters-mpi-compensate-papi-pdt
    Makefile.tau-phase-multiplecounters-mpi-papi-pdt
    Makefile.tau-pthread-pdt
    …
    % setenv TAU_MAKEFILE /soft/apps/tau/tau-2.17/bgp/lib/Makefile.tau-mpi-pdt
    % set path=(/soft/apps/tau/tau-2.17/ppc64/bin $path)   # Front-end binaries
  Replace mpixlf90_r with tau_f90.sh and compile your application
  Use tau_cxx.sh and tau_cc.sh for the C++ and C compilers, respectively
  Visualize performance data with paraprof, pprof, vampir, jumpshot
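The stub makefile names listed above appear to encode the measurement options as hyphen-separated tokens after "Makefile.tau-". As a minimal sketch under that assumption, the following Python helper picks the name that contains a wanted set of options with the fewest extras. The functions parse_options and pick_makefile are hypothetical illustrations, not part of TAU, and the name list is a subset copied from the listing.

```python
def parse_options(makefile_name):
    """Return the set of option tokens encoded in a TAU stub makefile name."""
    prefix = "Makefile.tau-"
    if not makefile_name.startswith(prefix):
        raise ValueError(f"not a TAU stub makefile name: {makefile_name}")
    return set(makefile_name[len(prefix):].split("-"))


def pick_makefile(names, wanted):
    """Pick the name whose options include `wanted` with the fewest extras.

    Returns None when no listed makefile covers every wanted option.
    """
    wanted = set(wanted)
    candidates = [n for n in names if wanted <= parse_options(n)]
    if not candidates:
        return None
    # Fewest extra options first; the name breaks ties deterministically.
    return min(candidates, key=lambda n: (len(parse_options(n) - wanted), n))


# Subset of the names shown in the `ls` output on the slide.
names = [
    "Makefile.tau-mpi-pdt",
    "Makefile.tau-mpi-pdt-trace",
    "Makefile.tau-callpath-mpi-pdt",
    "Makefile.tau-multiplecounters-mpi-papi-pdt",
    "Makefile.tau-pthread-pdt",
    "Makefile.tau-pdt",
]

choice = pick_makefile(names, {"mpi", "pdt", "trace"})
print(choice)
```

The selected name would then be appended to the lib directory and assigned to TAU_MAKEFILE, as in the setenv line on the slide.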
TAU’s ParaProf 3D Profile Browser: Matmult
Profiling FLASH3 on IBM BG/P
Sedov 2D Auto
  Initial test run did not include a load-balanced problem
  Small problem: too little work for 1024 processors
  Proof of concept to validate porting of tools
PerfExplorer: Cross Experiment Analysis
TAU PerfExplorer: Runtime Breakdown
  Labeled events: MPI_Barrier, IO_OUTPUT
Relative Efficiency
Relative Speedup for One Event
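The charts themselves are not reproduced here, so the following is only a sketch of the usual definitions, which are assumed rather than taken from the slides: relative speedup is the time at the smallest processor count divided by the time at p processors, and relative efficiency is that speedup divided by the ratio of processor counts. The timings are made-up illustrative numbers for one event, not FLASH3 measurements.

```python
def relative_speedup(times, base_procs=None):
    """Speedup of each run relative to the baseline processor count.

    `times` maps processor count -> time for one event (or the whole run).
    The baseline defaults to the smallest processor count present.
    """
    if base_procs is None:
        base_procs = min(times)
    base_time = times[base_procs]
    return {p: base_time / t for p, t in sorted(times.items())}


def relative_efficiency(times, base_procs=None):
    """Relative speedup divided by the ideal speedup (p / base_procs)."""
    if base_procs is None:
        base_procs = min(times)
    speedup = relative_speedup(times, base_procs)
    return {p: s / (p / base_procs) for p, s in speedup.items()}


# Hypothetical timings in seconds for one event at three processor counts.
times = {64: 100.0, 128: 50.0, 256: 40.0}

speedup = relative_speedup(times)
efficiency = relative_efficiency(times)
print(speedup)
print(efficiency)
```

With these numbers the event scales ideally from 64 to 128 processors and falls off at 256, which is the kind of pattern an efficiency chart is meant to expose.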
TAU’s PerfExplorer: IBM BG/P
TAU Portal
  TAU portal supports FLASH regression testing
  Allows groups to share profiling data in a secure way
  Allows users to launch TAU performance tools (paraprof, perfexplorer)
  Nightly regression test cases uploaded to the database automatically
  SVN checkout each night
  TAU:
  TAU Portal:
Portal: Nightly Performance Regression Testing
TAU Portal: Launch ParaProf/PerfExplorer
PerfExplorer: Regression Testing
PerfExplorer: Limiting Events (> 3%), Oct 2007
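One plausible reading of the 3% cutoff is that only events whose share of total exclusive time exceeds 3% are shown. The sketch below illustrates that filter in Python; it is an assumption about the chart, not PerfExplorer's implementation. MPI_Barrier and IO_OUTPUT are event names from the slides, while compute_step, init, and all timings are hypothetical.

```python
def limiting_events(exclusive_times, threshold=0.03):
    """Return events whose share of total exclusive time exceeds `threshold`.

    `exclusive_times` maps event name -> exclusive time. The result maps
    event name -> fraction of the total, ordered from largest to smallest.
    """
    total = sum(exclusive_times.values())
    if total <= 0:
        return {}
    shares = {name: t / total for name, t in exclusive_times.items()}
    kept = {name: s for name, s in shares.items() if s > threshold}
    return dict(sorted(kept.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical exclusive times in seconds for one run.
profile = {
    "MPI_Barrier": 40.0,
    "IO_OUTPUT": 10.0,
    "compute_step": 48.0,  # hypothetical event name
    "init": 2.0,           # hypothetical event name
}

limited = limiting_events(profile)
for name, share in limited.items():
    print(f"{name}: {share:.1%}")
```

Filtering this way keeps a chart readable when a profile contains hundreds of events, most of which contribute negligibly to the runtime.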
PerfExplorer: Exclusive Time for Events (2007)
ParaProf: 3D Visualization
Support Acknowledgements
  Department of Energy (DOE)
  Office of Science
  LLNL, LANL, ASC
  Argonne National Laboratory
  University of Chicago
  Department of Defense
  NSF