
Presentation transcript:

SC 2012 © LLNL / JSC

Slide 1: HPCToolkit / Rice University
Performance analysis through callpath sampling
- Designed for low overhead
- Hot path analysis
- Recovery of program structure from binary
Image by John Mellor-Crummey

Slide 2: HPCToolkit: hpcviewer
- Callpath to hotspot
- Associated source code
Image by John Mellor-Crummey
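As a rough illustration of hot path analysis on sampled call paths, the sketch below builds a calling context tree from samples and descends while a single child accounts for most of its parent's samples. This is not HPCToolkit code: the class and function names, the sample data, and the 50% default threshold are all assumptions chosen for the example.

```python
class CCTNode:
    """One node of a calling context tree (hypothetical helper class)."""
    def __init__(self, name):
        self.name = name
        self.inclusive = 0      # samples taken in this frame or anything it called
        self.children = {}      # callee name -> CCTNode

def build_cct(samples):
    """Insert each sampled call path (outermost frame first) into the tree."""
    root = CCTNode("<root>")
    for path in samples:
        root.inclusive += 1
        node = root
        for frame in path:
            node = node.children.setdefault(frame, CCTNode(frame))
            node.inclusive += 1
    return root

def hot_path(root, threshold=0.5):
    """Follow the heaviest child while it holds at least `threshold`
    of its parent's inclusive samples (threshold is an assumed default)."""
    path, node = [], root
    while node.children:
        best = max(node.children.values(), key=lambda c: c.inclusive)
        if best.inclusive < threshold * node.inclusive:
            break
        path.append(best.name)
        node = best
    return path

# Made-up samples: each list is one sampled call path.
samples = [
    ["main", "solve", "matvec"],
    ["main", "solve", "matvec"],
    ["main", "solve", "dot"],
    ["main", "io"],
]
print(" > ".join(hot_path(build_cct(samples))))
```

Raising the threshold makes the descent stop earlier, at the point where cost starts to spread across several callees.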

Slide 3: STAT: Aggregating Stack Traces for Debugging
- Existing debuggers don't scale
  - Inherent limits in the approaches
  - Need for new, scalable methodologies
- Need to pre-analyze and reduce data
  - Fast tools to gather state
  - Help select nodes to run conventional debuggers on
- Scalable tool: STAT (Stack Trace Analysis Tool)
  - Goal: identify equivalence classes
  - Hierarchical and distributed aggregation of stack traces from all tasks
  - Stack trace merge <1s from 200K+ cores
(Project by LLNL, UW, UNM)
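The idea of merging stack traces to find equivalence classes can be sketched as follows. This is a minimal model, not STAT's implementation: traces are merged into a prefix tree whose nodes record which task ranks reached them, and two partial trees can be combined the way an intermediate node in a reduction tree might combine results from its children. The function names and the example traces are assumptions.

```python
def merge_traces(traces):
    """Merge per-task stack traces into a prefix tree.

    traces: dict mapping task rank -> list of frames (outermost first).
    Returns a nested dict: frame -> {"tasks": set of ranks, "children": {...}}.
    """
    tree = {}
    for rank, frames in traces.items():
        level = tree
        for frame in frames:
            node = level.setdefault(frame, {"tasks": set(), "children": {}})
            node["tasks"].add(rank)
            level = node["children"]
    return tree

def merge_trees(a, b):
    """Combine two prefix trees into a new one (one hierarchical merge step)."""
    out = {}
    for src in (a, b):
        for frame, node in src.items():
            tgt = out.setdefault(frame, {"tasks": set(), "children": {}})
            tgt["tasks"] |= node["tasks"]
            tgt["children"] = merge_trees(tgt["children"], node["children"])
    return out

def equivalence_classes(traces):
    """Group task ranks that share an identical stack trace."""
    classes = {}
    for rank, frames in traces.items():
        classes.setdefault(tuple(frames), []).append(rank)
    return classes

# Made-up traces for four tasks; rank 2 is doing something different.
traces = {
    0: ["main", "MPI_Barrier"],
    1: ["main", "MPI_Barrier"],
    2: ["main", "compute", "MPI_Recv"],
    3: ["main", "MPI_Barrier"],
}

# Merge two halves separately, then combine them, as a tree of mergers would.
left = merge_traces({r: traces[r] for r in (0, 1)})
right = merge_traces({r: traces[r] for r in (2, 3)})
combined = merge_trees(left, right)

for trace, ranks in equivalence_classes(traces).items():
    print(" > ".join(trace), ranks)
```

In this example a conventional debugger would only need to attach to one representative of each class, such as rank 2 for the outlier.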

Slide 4: Distinguishing Behavior with Stack Traces

Slide 5: 3D-Trace Space/Time Analysis
[Figure: Appl … …]

Slide 6: Scalable Representation
288 Nodes / 10 Snapshots

Slide 7: Component Based Tool Framework (CBTF)
- Independent components connected by typed pipes
- Transforming data coming from the application on the way to the user
- External specification of which components to connect
- Each combination of components is/can be "a tool"
- Shared services
Partners
- Krell Institute
- LANL, LLNL, SNLs
- ORNL
- UW, UMD
- CMU
[Diagram: Application, Pipeline Comp. (x3), Tool Component Framework, Shared Tool Frameworks Services]

Slide 8: CBTF Modules
- Data-Flow Model
  - Accepts Inputs
  - Performs Processing
  - Emits Outputs
- C++ Based
- Provide Metadata
  - Type & Version
  - Input Names & Types
  - Output Names & Types
- Versioned
  - Concurrent Versions
- Packaging
  - Executable-Embedded
  - Shared Library
  - Runtime Plugin
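The data-flow model on this slide (accept inputs, process, emit outputs, and expose metadata) can be modeled in a few lines. CBTF itself is C++ based; the Python sketch below is only a conceptual illustration, and every class, method, and port name in it is hypothetical rather than part of the CBTF API.

```python
class Component:
    """Hypothetical data-flow component: typed inputs, processing, typed outputs."""
    type_name = "Component"
    version = "0.0.0"
    inputs = {}    # input port name -> expected Python type
    outputs = {}   # output port name -> emitted Python type

    def __init__(self):
        self._listeners = {name: [] for name in self.outputs}

    def metadata(self):
        """Describe type, version, and port names/types."""
        return {
            "type": self.type_name,
            "version": self.version,
            "inputs": {n: t.__name__ for n, t in self.inputs.items()},
            "outputs": {n: t.__name__ for n, t in self.outputs.items()},
        }

    def on_output(self, port, callback):
        """Register a consumer for one output port."""
        self._listeners[port].append(callback)

    def emit(self, port, value):
        if not isinstance(value, self.outputs[port]):
            raise TypeError(f"output {port!r} expects {self.outputs[port].__name__}")
        for callback in self._listeners[port]:
            callback(value)

    def accept(self, port, value):
        if not isinstance(value, self.inputs[port]):
            raise TypeError(f"input {port!r} expects {self.inputs[port].__name__}")
        self.process(port, value)

    def process(self, port, value):
        raise NotImplementedError


class SampleCounter(Component):
    """Example component: receives a list of samples, emits how many there were."""
    type_name = "SampleCounter"
    version = "1.0.0"
    inputs = {"in": list}
    outputs = {"out": int}

    def process(self, port, value):
        self.emit("out", len(value))


results = []
counter = SampleCounter()
counter.on_output("out", results.append)
counter.accept("in", [0x400A10, 0x400A10, 0x400B24])
print(counter.metadata())
print(results)
```

Keeping the port types in the metadata is what lets a framework check connections before any data flows.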

Slide 9: CBTF Component Networks
- Components
  - Specific Versions
- Connections
  - Matching Types
- Arbitrary Component Topology
  - Pipelines
  - Graphs with cycles
  - ...
- Recursive
  - Network itself is a component
- XML-Specified
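Two properties from this slide, connections that require matching types and a network that is itself a component, can be sketched as below. This is a simplified model under stated assumptions: it handles only linear pipelines (not the arbitrary topologies or cycles the slide mentions), and the `Component` and `Network` classes are hypothetical, not CBTF classes.

```python
class Component:
    """Hypothetical component with a single typed input and a single typed output."""
    def __init__(self, name, in_type, out_type, fn):
        self.name = name
        self.in_type = in_type
        self.out_type = out_type
        self.fn = fn

    def run(self, value):
        if not isinstance(value, self.in_type):
            raise TypeError(f"{self.name} expects {self.in_type.__name__}")
        return self.fn(value)


class Network(Component):
    """A chain of components that can be used wherever a component is expected."""
    def __init__(self, name, stages):
        # Reject connections whose output and input types do not match.
        for up, down in zip(stages, stages[1:]):
            if up.out_type is not down.in_type:
                raise TypeError(
                    f"{up.name}.out ({up.out_type.__name__}) does not match "
                    f"{down.name}.in ({down.in_type.__name__})"
                )
        super().__init__(name, stages[0].in_type, stages[-1].out_type, self._run_stages)
        self.stages = stages

    def _run_stages(self, value):
        for stage in self.stages:
            value = stage.run(value)
        return value


split = Component("split", str, list, str.split)
count = Component("count", list, int, len)
double = Component("double", int, int, lambda n: n * 2)

inner = Network("word_count", [split, count])   # a network ...
outer = Network("outer", [inner, double])       # ... used as a component
print(outer.run("a b c"))
```

Because `Network` derives its own input and output types from its first and last stage, nesting works without any special casing.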

Slide 10: Specifying Component Networks to Create New Tools
[XML example, markup lost in extraction: …. ExampleNetwork :/opt/myplugins myplugin A1 TestComponentA … … A1 out A2 in …]
- Users can create new tools by specifying new networks
  - Combine existing functionality
  - Reuse general model
  - Add application specific details
    - Phase/context filters
    - Data mappings
- Connection information
  - Which components?
  - Which ports connected?
  - Grouping into networks
- Implemented as XML
  - User writable
  - Could be generated by a GUI
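To make the idea of an externally specified network concrete, the sketch below reads a small XML description and extracts the three kinds of connection information listed above. The XML schema here is invented for illustration and is not the CBTF schema, whose markup did not survive in this transcript. Only the names ExampleNetwork, A1, A2, TestComponentA, out, and in come from the slide; the element and attribute names, and the type TestComponentB, are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema, NOT the real CBTF format.
SPEC = """
<network name="ExampleNetwork">
  <component instance="A1" type="TestComponentA"/>
  <component instance="A2" type="TestComponentB"/>
  <connection from="A1" output="out" to="A2" input="in"/>
</network>
"""

def load_network(xml_text):
    """Return (network name, {instance: type}, [(src, out_port, dst, in_port)])."""
    root = ET.fromstring(xml_text.strip())
    components = {c.get("instance"): c.get("type") for c in root.findall("component")}
    connections = []
    for conn in root.findall("connection"):
        src, dst = conn.get("from"), conn.get("to")
        for instance in (src, dst):
            if instance not in components:
                raise ValueError(f"connection references unknown component {instance!r}")
        connections.append((src, conn.get("output"), dst, conn.get("input")))
    return root.get("name"), components, connections

name, components, connections = load_network(SPEC)
print(name)
print(components)
print(connections)
```

Because the description is plain data, a GUI or script could generate it just as easily as a user could write it by hand.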

Slide 11: CBTF Structure and Dependencies
- Minimal Dependencies
  - Easier Builds
- Tool-Type Independent
  - Performance Tools
  - Debugging Tools
  - etc.
- Completed Components
  - Base Library (libcbtf)
  - XML-Based Component Networks (libcbtf-xml)
  - MRNet Distributed Components (libcbtf-mrnet)
- Planned Components
  - TCP/IP Distributed Component Networks
  - GUI Definition of Component Networks

Slide 12: Tools on Top of CBTF
Open|SpeedShop v2.0
- CBTF created by componentizing the existing Open|SpeedShop
- Motivation: scalability & maintainability
Extensions for O|SS in CBTF (soon)
- Threading overheads
- Memory consumption
- I/O profiling
Further tools in progress
- GPU performance analysis
- Tools for system administration and health monitoring

Slide 13: Summary: The Need for Components
We need frameworks that enable …
- Independently created and maintained components
- Flexible connection of components
- Assembly of new tools from these components by the user
CBTF is designed as a generic tool framework
- Components are connected by typed pipes
- Infrastructure for hierarchical aggregation with user-defined functions
- Component specification is external through XML files
- Tailor tools by combining generic and application-specific tools
CBTF is available as a pre-release version
- First prototype of Open|SpeedShop v2.0 working
- New extensions for O|SS exploiting CBTF advantages
- Several new tools built on top of CBTF
- Wiki at