1 Performance Analysis with Vampir DKRZ Tutorial 2013 5 – 7 August, Hamburg Matthias Weber, Frank Winkler, Andreas Knüpfer ZIH, Technische Universität.

Slides:



Advertisements
Similar presentations
Performance Analysis Tools for High-Performance Computing Daniel Becker
Advertisements

Software Analysis at Philips Healthcare MSc Project Matthijs Wessels 01/09/2009 – 01/05/2010.
Scalable Multi-Cache Simulation Using GPUs Michael Moeng Sangyeun Cho Rami Melhem University of Pittsburgh.
Intel® performance analyze tools Nikita Panov Idrisov Renat.
MCTS GUIDE TO MICROSOFT WINDOWS 7 Chapter 10 Performance Tuning.
The Path to Multi-core Tools Paul Petersen. Multi-coreToolsThePathTo 2 Outline Motivation Where are we now What is easy to do next What is missing.
1 Virtual Machine Resource Monitoring and Networking of Virtual Machines Ananth I. Sundararaj Department of Computer Science Northwestern University July.
Robert Bell, Allen D. Malony, Sameer Shende Department of Computer and Information Science Computational Science.
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Parallel Programming with MPI and OpenMP Michael J. Quinn.
Lecture 5 Today’s Topics and Learning Objectives Quinn Chapter 7 Predict performance of parallel programs Understand barriers to higher performance.
MCITP Guide to Microsoft Windows Server 2008 Server Administration (Exam #70-646) Chapter 14 Server and Network Monitoring.
Module 8: Monitoring SQL Server for Performance. Overview Why to Monitor SQL Server Performance Monitoring and Tuning Tools for Monitoring SQL Server.
Advanced Computing Technology Center © 2005 IBM Corporation The IBM High Performance Computing Toolkit Guojing Cong.
1 The VAMPIR and PARAVER performance analysis tools applied to a wet chemical etching parallel algorithm S. Boeriu 1 and J.C. Bruch, Jr. 2 1 Center for.
Hands-On Microsoft Windows Server 2008 Chapter 11 Server and Network Monitoring.
CH 13 Server and Network Monitoring. Hands-On Microsoft Windows Server Objectives Understand the importance of server monitoring Monitor server.
Windows Server 2008 Chapter 11 Last Update
Microsoft ® Official Course Monitoring and Troubleshooting Custom SharePoint Solutions SharePoint Practice Microsoft SharePoint 2013.
With Microsoft Windows 7© 2012 Pearson Education, Inc. Publishing as Prentice Hall1 PowerPoint Presentation to Accompany GO! with Microsoft ® Windows 7.
Computer System Architectures Computer System Software
1 Score-P – A Joint Performance Measurement Run-Time Infrastructure for Periscope, Scalasca, TAU, and Vampir Markus Geimer 2), Bert Wesarg 1), Brian Wylie.
DKRZ Tutorial 2013, Hamburg Analysis report examination with CUBE Markus Geimer Jülich Supercomputing Centre.
MCTS Guide to Microsoft Windows 7
German National Research Center for Information Technology Research Institute for Computer Architecture and Software Technology German National Research.
Lecture 8. Profiling - for Performance Analysis - Prof. Taeweon Suh Computer Science Education Korea University COM503 Parallel Computer Architecture &
Analyzing parallel programs with Pin Moshe Bach, Mark Charney, Robert Cohn, Elena Demikhovsky, Tevi Devor, Kim Hazelwood, Aamer Jaleel, Chi- Keung Luk,
Adventures in Mastering the Use of Performance Evaluation Tools Manuel Ríos Morales ICOM 5995 December 4, 2002.
Trace Generation to Simulate Large Scale Distributed Application Olivier Dalle, Emiio P. ManciniMar. 8th, 2012.
Score-P – A Joint Performance Measurement Run-Time Infrastructure for Periscope, Scalasca, TAU, and Vampir Alexandru Calotoiu German Research School for.
Uncovering the Multicore Processor Bottlenecks Server Design Summit Shay Gal-On Director of Technology, EEMBC.
Chapter 34 Java Technology for Active Web Documents methods used to provide continuous Web updates to browser – Server push – Active documents.
The Vampir Performance Analysis Tool Hans–Christian Hoppe Gesellschaft für Parallele Anwendungen und Systeme mbH Pallas GmbH Hermülheimer Straße 10 D
Chapter 3 – Part 1 Word Processing Writer for Linux CMPF 112 : COMPUTING SKILLS.
Technische Universität München Application Performance Monitoring of a scalable Java web-application in a cloud infrastructure Final Presentation August.
Martin Schulz Center for Applied Scientific Computing Lawrence Livermore National Laboratory Lawrence Livermore National Laboratory, P. O. Box 808, Livermore,
1 Performance Analysis with Vampir ZIH, Technische Universität Dresden.
A Profiler for a Multi-Core Multi-FPGA System by Daniel Nunes Supervisor: Professor Paul Chow September 30 th, 2008 University of Toronto Electrical and.
Profiling, Tracing, Debugging and Monitoring Frameworks Sathish Vadhiyar Courtesy: Dr. Shirley Moore (University of Tennessee)
Measuring Interactive Performance with VNCplay Nickolai Zeldovich, Ramesh Chandra Stanford University.
Issues Autonomic operation (fault tolerance) Minimize interference to applications Hardware support for new operating systems Resource management (global.
UAB Dynamic Tuning of Master/Worker Applications Anna Morajko, Paola Caymes Scutari, Tomàs Margalef, Eduardo Cesar, Joan Sorribes and Emilio Luque Universitat.
ASC Tri-Lab Code Development Tools Workshop Thursday, July 29, 2010 Lawrence Livermore National Laboratory, P. O. Box 808, Livermore, CA This work.
Belgrade, 25 September 2014 George S. Markomanolis, Oriol Jorba, Kim Serradell Performance analysis Tools: a case study of NMMB on Marenostrum.
Debugging parallel programs. Breakpoint debugging Probably the most widely familiar method of debugging programs is breakpoint debugging. In this method,
Parallel Programming with MPI and OpenMP
1 Typical performance bottlenecks and how they can be found Bert Wesarg ZIH, Technische Universität Dresden.
Lab 2 Parallel processing using NIOS II processors
Tool Visualizations, Metrics, and Profiled Entities Overview [Brief Version] Adam Leko HCS Research Laboratory University of Florida.
© 2006, National Research Council Canada © 2006, IBM Corporation Solving performance issues in OTS-based systems Erik Putrycz Software Engineering Group.
Virtual Application Profiler (VAPP) Problem – Increasing hardware complexity – Programmers need to understand interactions between architecture and their.
Full and Para Virtualization
Hardware process When the computer is powered up, it begins to execute fetch-execute cycle for the program that is stored in memory at the boot strap entry.
Parallel Performance Measurement of Heterogeneous Parallel Systems with GPUs Allen D. Malony, Scott Biersdorff, Sameer Shende, Heike Jagode†, Stanimire.
Threaded Programming Lecture 1: Concepts. 2 Overview Shared memory systems Basic Concepts in Threaded Programming.
Projections - A Step by Step Tutorial By Chee Wai Lee For the 2004 Charm++ Workshop.
CEPBA-Tools experiences with MRNet and Dyninst Judit Gimenez, German Llort, Harald Servat
Presented by Jack Dongarra University of Tennessee and Oak Ridge National Laboratory KOJAK and SCALASCA.
Introduction to HPC Debugging with Allinea DDT Nick Forrington
KIT – University of the State of Baden-Wuerttemberg and National Research Center of the Helmholtz Association SYSTEM ARCHITECTURE GROUP DEPARTMENT OF COMPUTER.
Embedded Real-Time Systems Processing interrupts Lecturer Department University.
Beyond Application Profiling to System Aware Analysis Elena Laskavaia, QNX Bill Graham, QNX.
Introduction to Performance Tuning Chia-heng Tu PAS Lab Summer Workshop 2009 June 30,
Navigating TAU Visual Display ParaProf and TAU Portal Mahin Mahmoodi Pittsburgh Supercomputing Center 2010.
Software Architecture in Practice
MCTS Guide to Microsoft Windows 7
Characterization of Parallel Scientific Simulations
Performance Evaluation of Adaptive MPI
QNX Technology Overview
Intro. To Operating Systems
Projections Overview Ronak Buch & Laxmikant (Sanjay) Kale
Presentation transcript:

1 Performance Analysis with Vampir DKRZ Tutorial – 7 August, Hamburg Matthias Weber, Frank Winkler, Andreas Knüpfer ZIH, Technische Universität Dresden

2 DKRZ Tutorial 2013, Hamburg Outline Part I: Welcome to the Vampir Tool Suite –Mission –Event Trace Visualization –Vampir & VampirServer –Vampir Performance Charts Part II: Vampir Hands-on –Visualizing and analyzing NPB-MZ-MPI / BT Part III: Summary and Conclusion

3 DKRZ Tutorial 2013, Hamburg Mission Visualization of dynamics of complex parallel processes Requires two components –Monitor/Collector (Score-P) –Charts/Browser (Vampir) Typical questions that Vampir helps to answer: –What happens in my application execution during a given time in a given process or thread? –How do the communication patterns of my application execute on a real system? –Are there any imbalances in computation, I/O or memory usage and how do they affect the parallel execution of my application?

4 DKRZ Tutorial 2013, Hamburg Event Trace Visualization with Vampir Alternative and supplement to automatic analysis Show dynamic run-time behavior graphically at any level of detail Provide statistics and performance metrics Timeline charts Summary charts –Show application activities and communication along a time axis –Provide quantitative results for the currently selected time interval

5 DKRZ Tutorial 2013, Hamburg Vampir – Visualization Modes (1) Directly on front end or local machine % vampir Score-P Trace File (OTF2) Vampir 8 CPU Multi-Core Program Multi-Core Program Thread parallel Small/Medium sized trace

6 DKRZ Tutorial 2013, Hamburg Vampir – Visualization Modes (2) On local machine with remote VampirServer Score-P Vampir 8 Trace File (OTF2) VampirServer CPU Many-Core Program Many-Core Program Large Trace File (stays on remote machine) MPI parallel application LAN/WAN % vampirserver start –n 12 % vampir

7 DKRZ Tutorial 2013, Hamburg Usage Order of the Vampir Performance Analysis Toolset 1.Instrument your application with Score-P 2.Run your application with an appropriate test set 3.Analyze your trace file with Vampir –Small trace files can be analyzed on your local workstation 1.Start your local Vampir 2.Load trace file from your local disk –Large trace files should be stored on the HPC file system 1.Start VampirServer on your HPC system 2.Start your local Vampir 3.Connect local Vampir with the VampirServer on the HPC system 4.Load trace file from the HPC file system

8 DKRZ Tutorial 2013, Hamburg Main Displays of Vampir Timeline Charts: – Master Timeline – Process Timeline – Counter Data Timeline – Performance Radar Summary Charts: – Function Summary – Message Summary – Process Summary – Communication Matrix View

9 Vampir Hands-on Visualizing and Analyzing NPB-MZ-MPI / BT

10 DKRZ Tutorial 2013, Hamburg Vampir: Visualization of the NPB-MZ-MPI / BT trace % vampir scorep_bt-mz_B_4x4_trace/traces.otf2 Master Timeline Navigation Toolbar Function Summary Function Legend

11 DKRZ Tutorial 2013, Hamburg Vampir: Visualization of the NPB-MZ-MPI / BT trace Master Timeline Detailed information about functions, communication and synchronization events for collection of processes.

12 DKRZ Tutorial 2013, Hamburg Vampir: Visualization of the NPB-MZ-MPI / BT trace Detailed information about different levels of function calls in a stacked bar chart for an individual process. Process Timeline

13 DKRZ Tutorial 2013, Hamburg Vampir: Visualization of the NPB-MZ-MPI / BT trace Typical Program Phases Initialisation Phase Computation Phase

14 DKRZ Tutorial 2013, Hamburg Vampir: Visualization of the NPB-MZ-MPI / BT trace Detailed counter information over time for an individual process. Counter Data Timeline

15 DKRZ Tutorial 2013, Hamburg Vampir: Visualization of the NPB-MZ-MPI / BT trace Performance Radar Detailed counter information over time for a collection of processes.

16 DKRZ Tutorial 2013, Hamburg Vampir: Visualization of the NPB-MZ-MPI / BT trace Zoom in: Inititialization Phase Context View: Detailed information about function “initialize_”.

17 DKRZ Tutorial 2013, Hamburg Vampir: Visualization of the NPB-MZ-MPI / BT trace Feature: Find Function Execution of function “initialize_” results in higher page fault rates.

18 DKRZ Tutorial 2013, Hamburg Vampir: NPB-MZ-MPI / BT source snippet #include "scorep/SCOREP_User.h" c c initialize data c call timer_start(1)... c c start the benchmark time step loop c do step = 1, niter SCOREP_USER_REGION_BEGIN( my_region_handle, "iteration", SCOREP_USER_REGION_TYPE_COMMON )... call exch_qbc(...) do iz = 1, proc_num_zones call adi(...) end do SCOREP_USER_REGION_END( my_region_handle ) end do call timer_stop(1) c c perform verification and print results c

19 DKRZ Tutorial 2013, Hamburg Vampir: Visualization of the NPB-MZ-MPI / BT trace Computation Phase Computation phase results in higher floating point operations.

20 DKRZ Tutorial 2013, Hamburg Vampir: Visualization of the NPB-MZ-MPI / BT trace MPI communication results in lower floating point operations. Zoom in: Computation Phase

21 DKRZ Tutorial 2013, Hamburg Vampir: Visualization of the NPB-MZ-MPI / BT trace Zoom in: Finalization Phase “Early reduce” bottleneck.

22 DKRZ Tutorial 2013, Hamburg Vampir: Visualization of the NPB-MZ-MPI / BT trace Process Summary Function Summary: Overview of the accumulated information across all functions and for a collection of processes. Process Summary: Overview of the accumulated information across all functions and for every process independently.

23 DKRZ Tutorial 2013, Hamburg Vampir: Visualization of the NPB-MZ-MPI / BT trace Process Summary Find groups of similar processes and threads by using summarized function information.

24 Summary and Conclusion

25 DKRZ Tutorial 2013, Hamburg Summary Vampir & VampirServer –Interactive trace visualization and analysis –Intuitive browsing and zooming –Scalable to large trace data sizes (20 TByte) –Scalable to high parallelism ( processes) Vampir for Linux, Windows and Mac OS X Note: Vampir does neither solve your problems automatically nor point you directly at them. Yet, it does give you FULL insight into the execution of your application.

26 DKRZ Tutorial 2013, Hamburg Conclusion Performance analysis very important in HPC Use performance analysis tools for profiling and tracing Do not spend effort in DIY solutions, e.g. like printf-debugging Use tracing tools with some precautions –overhead –data volume Let us know about problems and about feature wishes:

27 DKRZ Tutorial 2013, Hamburg Vampir is available at Get support via

28 DKRZ Tutorial 2013, Hamburg Staff at ZIH - TU Dresden: Ronny Brendel, Holger Brunst, Jens Doleschal, Ronald Geisler, Daniel Hackenberg, Michael Heyde, Matthias Jurenz, Michael Kluge, Andreas Knüpfer, Matthias Lieber, Holger Mickler, Hartmut Mix, Matthias Weber, Bert Wesarg, Frank Winkler, Matthias Müller, Wolfgang E. Nagel Acknowledgement