TAU integration with Score-P

Allen D. Malony, Sameer Shende
{malony,sameer}@cs.uoregon.edu
LMAC meeting, April 2012, T.U. Munich, Garching, Germany
Virtual Institute High Productivity Supercomputing (VI-HPS)
Performance Refactoring: Instrumentation, Measurement, Analysis (PRIMA)

Overview
  Introduction
  TAU overview
  Status of TAU integration with Score-P
  Future work

TAU Performance System®
  Parallel performance framework and toolkit
  Supports all HPC platforms, compilers, runtime systems
  Provides portable instrumentation, measurement, analysis
  http://tau.uoregon.edu
  (Slide note: Felix's part, 10-12 slides including 2 slides about the four tools Periscope, TAU, Scalasca, Vampir)

TAU Performance System® (2)
  Instrumentation
    Fortran, C++, C, UPC, Java, Python, Chapel
    Automatic instrumentation
  Measurement and analysis support
    MPI, OpenSHMEM, ARMCI, PGAS
    pthreads, OpenMP, other thread models
    GPU, CUDA, OpenCL, OpenACC
    Parallel profiling and tracing
    Use of Score-P for native OTF2 generation capability
  Analysis
    Parallel profile analysis (ParaProf), data mining (PerfExplorer)
    Performance database technology (PerfDMF, TAUdb)
    3D profile browser

TAU Performance System® (3)
  Automatic instrumentation
    Source level
      Program Database Toolkit (PDT)
      ROSE to PDB generator
      routines, loops, memory, I/O, UPC constructs, …
    Compiler-based instrumentation
      GNU, IBM, NAG, Intel, PGI, Pathscale, Cray
    Binary rewriting
      DyninstAPI, MAQAO, PEBIL (in progress)
  Automatic wrapper library generator
    tau_gen_wrapper
    MPI, I/O, memory, …
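The wrapper-library idea on this slide can be illustrated with a small, self-contained sketch. It is not tau_gen_wrapper and does not use any TAU API: the names `WrapperStats`, `wrap_with_timer`, and `write_block` are hypothetical, chosen only to show how a routine can be interposed so that each call is counted and timed without changing the caller.

```python
import functools
import time


class WrapperStats:
    """Call count and accumulated time for one wrapped routine."""

    def __init__(self):
        self.calls = 0
        self.seconds = 0.0


def wrap_with_timer(func, registry, clock=time.perf_counter):
    """Return a wrapper that times `func` and records the result in `registry`.

    Hypothetical helper for illustration; `clock` is injectable so the
    measurement can be made deterministic in tests.
    """
    stats = registry.setdefault(func.__name__, WrapperStats())

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = clock()
        try:
            return func(*args, **kwargs)
        finally:
            # Record the call even if the wrapped routine raises.
            stats.calls += 1
            stats.seconds += clock() - start

    return wrapper


def write_block(data):
    """Stand-in for a library routine such as an I/O call (assumption)."""
    return len(data)


registry = {}
write_block = wrap_with_timer(write_block, registry)
write_block(b"abc")
write_block(b"defgh")
print(registry["write_block"].calls)
```

The caller keeps using `write_block` unchanged; only the binding is replaced, which is the same design choice a generated wrapper library relies on at link or load time.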

TAU Performance System® (4)
  Other
    Callstack debugging
    I/O and heap memory usage
    Leak detection
  Interfaces
    PAPI for hardware counters
    Scalasca and Vampir for OTF2 tracing and visualization
    Dyninst and MAQAO for binary instrumentation
  Integration with Score-P going forward
  BSD style license
  http://tau.uoregon.edu

Score-P with TAU Integration
  Score-P software architecture (diagram)
    Vampir, Scalasca, TAU, Periscope
    Score-P measurement infrastructure
      Event traces (OTF2 format)
      Call-path profiles (CUBE4 and TAU formats)
      Online interface
      Hardware counters
      Memory management
      Other …
    Target application (MPI, OpenMP, hybrid, serial)
    Compiler, TAU instrumentor, OPARI 2, User Instrumentation

Score-P Software Components
  Separate software components (diagram)
    Vampir, Scalasca, TAU, Periscope
    Score-P measurement infrastructure
      Event traces (OTF2 format)
      Call-path profiles (CUBE4 and TAU formats)
      Online interface
      Hardware counters
      Memory management
      Other …
    Target application (MPI, OpenMP, hybrid, serial)
    Compiler, TAU instrumentor, OPARI 2, User Instrumentation

Example: Score-P with TAU (LU NPB) TAU v2.21.2, PDT v3.17, with Score-P v1.0.1

Status and Future Work (PRIMA, LMAC)
  Existing functionality
    Regions mapped to interval timers in TAU
    Cube 4 profile reader integrated in ParaProf [Pavel]
    TAU instrumentation components can use Score-P
    Opari2 integrated in TAU
    ParaProf, PerfDMF, and PerfExplorer interoperability with CUBE
    LiveDVD updated from 32 bit FC11 to 32 bit and 64 bit FC16
    Tutorials and outreach activities
  To Do
    Integrate a generic thread library callback mechanism
    Tau2otf2 trace converter for legacy TAU traces
    Event-based sampling / hybrid profiling integration (Zoltan/Chee Wai Lee)
  PRIMA: VI-HPS visitors to U. Oregon
    2010-2011: Daniel Lorenz, Christian Roessel, Pavel Saviankou
    Summer 2012: Daniel Lorenz, Christian Roessel
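To make "regions mapped to interval timers" concrete, here is a minimal sketch of the general technique, assuming a stream of region enter/exit events with timestamps. It is not the TAU or Score-P implementation; `profile_from_events` and the event tuples are hypothetical. Each region gets one timer that accumulates a call count, inclusive time, and exclusive time (inclusive minus time spent in nested regions).

```python
from collections import defaultdict


def profile_from_events(events):
    """Fold (timestamp, kind, region) events into per-region interval timers.

    Returns {region: {"calls", "inclusive", "exclusive"}}.
    Illustrative only; recursive regions would double-count inclusive time.
    """
    profile = defaultdict(lambda: {"calls": 0, "inclusive": 0.0, "exclusive": 0.0})
    stack = []  # entries: [region, enter_time, time_spent_in_children]

    for timestamp, kind, region in events:
        if kind == "enter":
            stack.append([region, timestamp, 0.0])
        elif kind == "exit":
            name, entered, child_time = stack.pop()
            if name != region:
                raise ValueError(
                    f"exit of {region!r} does not match open region {name!r}"
                )
            elapsed = timestamp - entered
            timer = profile[name]
            timer["calls"] += 1
            timer["inclusive"] += elapsed
            timer["exclusive"] += elapsed - child_time
            if stack:
                # Charge this interval to the parent's child time.
                stack[-1][2] += elapsed
        else:
            raise ValueError(f"unknown event kind {kind!r}")
    return dict(profile)


# Made-up event stream: main calls solve twice.
events = [
    (0.0, "enter", "main"),
    (1.0, "enter", "solve"),
    (4.0, "exit", "solve"),
    (5.0, "enter", "solve"),
    (7.0, "exit", "solve"),
    (10.0, "exit", "main"),
]
profile = profile_from_events(events)
print(profile["main"], profile["solve"])
```

In this made-up stream, `main` is open for 10 time units, of which 5 are spent inside `solve`, so its exclusive time is 5.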

Projects
  DOE
    PRIMA
      Performance refactoring for core infrastructure
      TAU integration with Score-P
    Vancouver
      Focus on heterogeneous systems
      Development of GPU performance measurement
    SUPER
      Multi-institutional
      Performance, energy, resilience
  NSF
    Glass Box (pending, with University of Houston, Georgia Tech)
      Interfaces within HPC software stack