Performance Analysis Tools for High-Performance Computing Daniel Becker 25-03-2010.

Slides:



Advertisements
Similar presentations
Forschungszentrum Jülich in der Helmholtz-Gesellschaft February 2007 – OGF19 A Collaborative Online Visualization and Steering (COVS) Framework for e-Science.
Advertisements

Barcelona Supercomputing Center. The BSC-CNS objectives: R&D in Computer Sciences, Life Sciences and Earth Sciences. Supercomputing support to external.
Prasanna Pandit R. Govindarajan
Machine Learning-based Autotuning with TAU and Active Harmony Nicholas Chaimov University of Oregon Paradyn Week 2013 April 29, 2013.
K T A U Kernel Tuning and Analysis Utilities Department of Computer and Information Science Performance Research Laboratory University of Oregon.
CSCI 4125 Programming for Performance Andrew Rau-Chaplin
Program Analysis and Tuning The German High Performance Computing Centre for Climate and Earth System Research Panagiotis Adamidis.
Productive Performance Tools for Heterogeneous Parallel Computing Allen D. Malony Department of Computer and Information Science University of Oregon Shigeo.
The Path to Multi-core Tools Paul Petersen. Multi-coreToolsThePathTo 2 Outline Motivation Where are we now What is easy to do next What is missing.
Scripting Languages For Virtual Worlds. Outline Necessary Features Classes, Prototypes, and Mixins Static vs. Dynamic Typing Concurrency Versioning Distribution.
On the Integration and Use of OpenMP Performance Tools in the SPEC OMP2001 Benchmarks Bernd Mohr 1, Allen D. Malony 2, Rudi Eigenmann 3 1 Forschungszentrum.
Parallel Simulation System Victor V. Okol’nishnikov Institute of Computational Mathematics and Mathematical Geophysics of SB RAS.
The shift from sequential to parallel and distributed computing is of fundamental importance for the advancement of computing practices. Unfortunately,
Performance Technology for Complex Parallel Systems REFERENCES.
The hybird approach to programming clusters of multi-core architetures.
Parallelization: Conway’s Game of Life. Cellular automata: Important for science Biology – Mapping brain tumor growth Ecology – Interactions of species.
Parallel Programming in.NET Kevin Luty.  History of Parallelism  Benefits of Parallel Programming and Designs  What to Consider  Defining Types of.
KUAS.EE Parallel Computing at a Glance. KUAS.EE History Parallel Computing.
High Performance Computation --- A Practical Introduction Chunlin Tian NAOC Beijing 2011.
D. Becker, M. Geimer, R. Rabenseifner, and F. Wolf Laboratory for Parallel Programming | September Synchronizing the timestamps of concurrent events.
1 Score-P – A Joint Performance Measurement Run-Time Infrastructure for Periscope, Scalasca, TAU, and Vampir Markus Geimer 2), Bert Wesarg 1), Brian Wylie.
DKRZ Tutorial 2013, Hamburg Analysis report examination with CUBE Markus Geimer Jülich Supercomputing Centre.
Analysis report examination with CUBE Alexandru Calotoiu German Research School for Simulation Sciences (with content used with permission from tutorials.
Judit Giménez, Juan González, Pedro González, Jesús Labarta, Germán Llort, Eloy Martínez, Xavier Pegenaute, Harald Servat Brief introduction.
1 Performance Analysis with Vampir DKRZ Tutorial – 7 August, Hamburg Matthias Weber, Frank Winkler, Andreas Knüpfer ZIH, Technische Universität.
Slide 1 Auburn University Computer Science and Software Engineering Scientific Computing in Computer Science and Software Engineering Kai H. Chang Professor.
Paradyn Week – April 14, 2004 – Madison, WI DPOMP: A DPCL Based Infrastructure for Performance Monitoring of OpenMP Applications Bernd Mohr Forschungszentrum.
WORK ON CLUSTER HYBRILIT E. Aleksandrov 1, D. Belyakov 1, M. Matveev 1, M. Vala 1,2 1 Joint Institute for nuclear research, LIT, Russia 2 Institute for.
ICOM 5995: Performance Instrumentation and Visualization for High Performance Computer Systems Lecture 7 October 16, 2002 Nayda G. Santiago.
Visualization of Parallel Programming A Tool for Understanding Message Passing in Parallel Systems Andrew Schwartz ‘13 Computer Science Union College Advisor:
TRACEREP: GATEWAY FOR SHARING AND COLLECTING TRACES IN HPC SYSTEMS Iván Pérez Enrique Vallejo José Luis Bosque University of Cantabria TraceRep IWSG'15.
SC’13: Hands-on Practical Hybrid Parallel Application Performance Engineering Introduction to VI-HPS Brian Wylie Jülich Supercomputing Centre.
Adventures in Mastering the Use of Performance Evaluation Tools Manuel Ríos Morales ICOM 5995 December 4, 2002.
Score-P – A Joint Performance Measurement Run-Time Infrastructure for Periscope, Scalasca, TAU, and Vampir Alexandru Calotoiu German Research School for.
Presented by KOJAK and SCALASCA Jack Dongarra, Shirley Moore, Farzona Pulatova, and Fengguang Song University of Tennessee and Oak Ridge National Laboratory.
Parallelization: Area Under a Curve. AUC: An important task in science Neuroscience – Endocrine levels in the body over time Economics – Discounting:
General Purpose Computing on Graphics Processing Units: Optimization Strategy Henry Au Space and Naval Warfare Center Pacific 09/12/12.
GPU Computing April GPU Outpacing CPU in Raw Processing GPU NVIDIA GTX cores 1.04 TFLOPS CPU GPU CUDA Architecture Introduced DP HW Introduced.
Profile Analysis with ParaProf Sameer Shende Performance Reseaerch Lab, University of Oregon
Martin Schulz Center for Applied Scientific Computing Lawrence Livermore National Laboratory Lawrence Livermore National Laboratory, P. O. Box 808, Livermore,
Issues Autonomic operation (fault tolerance) Minimize interference to applications Hardware support for new operating systems Resource management (global.
ASC Tri-Lab Code Development Tools Workshop Thursday, July 29, 2010 Lawrence Livermore National Laboratory, P. O. Box 808, Livermore, CA This work.
Belgrade, 25 September 2014 George S. Markomanolis, Oriol Jorba, Kim Serradell Performance analysis Tools: a case study of NMMB on Marenostrum.
Debugging parallel programs. Breakpoint debugging Probably the most widely familiar method of debugging programs is breakpoint debugging. In this method,
1 Typical performance bottlenecks and how they can be found Bert Wesarg ZIH, Technische Universität Dresden.
MESQUITE: Mesh Optimization Toolkit Brian Miller, LLNL
Weather Research & Forecasting Model Xabriel J Collazo-Mojica Alex Orta Michael McFail Javier Figueroa.
Belgrade, 26 September 2014 George S. Markomanolis, Oriol Jorba, Kim Serradell Overview of on-going work on NMMB HPC performance at BSC.
Abstract A Structured Approach for Modular Design: A Plug and Play Middleware for Sensory Modules, Actuation Platforms, Task Descriptions and Implementations.
October 2008 Integrated Predictive Simulation System for Earthquake and Tsunami Disaster CREST/Japan Science and Technology Agency (JST)
Lawrence Livermore National Laboratory S&T Principal Directorate - Computation Directorate Tools and Scalable Application Preparation Project Computation.
Xolotl: A New Plasma Facing Component Simulator Scott Forest Hull II Jr. Software Developer Oak Ridge National Laboratory
Shangkar Mayanglambam, Allen D. Malony, Matthew J. Sottile Computer and Information Science Department Performance.
Parallel Performance Measurement of Heterogeneous Parallel Systems with GPUs Allen D. Malony, Scott Biersdorff, Sameer Shende, Heike Jagode†, Stanimire.
SC’13: Hands-on Practical Hybrid Parallel Application Performance Engineering Analysis report examination with CUBE Markus Geimer Jülich Supercomputing.
Performane Analyzer Performance Analysis and Visualization of Large-Scale Uintah Simulations Kai Li, Allen D. Malony, Sameer Shende, Robert Bell Performance.
Other Tools HPC Code Development Tools July 29, 2010 Sue Kelly Sandia is a multiprogram laboratory operated by Sandia Corporation, a.
Projections - A Step by Step Tutorial By Chee Wai Lee For the 2004 Charm++ Workshop.
Presented by Jack Dongarra University of Tennessee and Oak Ridge National Laboratory KOJAK and SCALASCA.
Introduction to HPC Debugging with Allinea DDT Nick Forrington
Productive Performance Tools for Heterogeneous Parallel Computing
Cyber Physiology Analysis Framework Concept
TAU integration with Score-P
X10: Performance and Productivity at Scale
NSF start October 1, 2014 Datanet: CIF21 DIBBs: Middleware and High Performance Analytics Libraries for Scalable Data Science Indiana University.
A configurable binary instrumenter
Tutorial Overview February 2017
Physics-based simulation for visual computing applications
BigSim: Simulating PetaFLOPS Supercomputers
Lulesh AMT and Ascent Tasks Overview Goal
Presentation transcript:

Performance Analysis Tools for High-Performance Computing Daniel Becker

German Research School for Simulation Sciences Joint venture between Forschungszentrum Jülich and RWTH Aachen University –Research and education in the simulation sciences –International Masters program –Ph.D. program

Jülich Supercomputing Centre Research in Computational Science Computer Science Mathematics Jülich BG/P 294,912 cores Jülich Nehalem Cluster 26,304 cores

Scalable performance-analysis toolset for parallel codes Integrated performance analysis process –Performance overview on call-path level via runtime summarization –In-depth study of application behavior via event tracing –Switching between both options without recompilation or relinking Supported programming models –MPI-1, MPI-2 one-sided communication –OpenMP (basic features) Available under New BSD –

Research Challenges Scalability –Collection and representation of necessary runtime information –Analysis and visualization of performance behavior Analysis of asynchronous tasks –Examples: OpenMP 3.0, StarSs, CUDA, OpenCL,... –Threads and tasks - different dimensions of parallelism –Identification of performance properties –Representation of asynchronous performance data with respect to call-path profile data –Measurement, analysis and result presentation –Integration with current analysis approaches