Status of Krell Tools Built using Dyninst/MRNet. Paradyn Week 2012, College Park, MD, March 28, 2012. LLNL-PRES-503431.

Presentation transcript:

Status of Krell Tools Built using Dyninst/MRNet
Paradyn Week 2012
College Park, MD
March 28, 2012
LLNL-PRES

• Presenters
    • Jim Galarowicz, Krell
    • Don Maghrak, Krell
• Larger team
    • William Hachfeld, Dave Whitney, Dane Gardner, Krell
    • Martin Schulz, Matt Legendre, Chris Chambreau, LLNL
    • TJ Machado, Mike Mason, Jennifer Green, David Montoya, LANL
    • Mahesh Rajan, SNLs
    • Dyninst group
        • Bart Miller, UW and team
        • Jeff Hollingsworth, UMD and team
    • Phil Roth, ORNL

Outline
• Welcome
① Open|SpeedShop overview and status
② SWAT (Scalable Targeted Debugger for Scientific and Commercial Computing)
③ Component Based Tool Framework (overview)
④ Component Based Tool Framework (tools)
⑤ Next generation: CBTF
• Questions

Open|SpeedShop

Project Overview: What is Open|SpeedShop?
• What is Open|SpeedShop?
    • HPC Linux, platform-independent application performance tool
• What can Open|SpeedShop do for the user?
    • Give a lightweight overview of where the program spends time
    • Find hot call paths in the user program and libraries
    • Give access to hardware counter event information
    • Record calls to POSIX I/O functions; give timing, call paths, and optional info such as bytes read and file names
    • Record calls to MPI functions; give timing, call paths, and optional info such as source and destination ranks
    • Help pinpoint numerical problem areas by tracking floating-point exceptions (FPE)
• Maps the performance information back to the source and displays the source annotated with the performance information
• osspcsamp “How you run your application outside of O|SS”

Open|SpeedShop
• Update on status of Open|SpeedShop
    • More focus on CBTF the past year, but…
    • Added functionality to Open|SpeedShop
        • Improved installation of Cray and Blue Gene versions
        • Support for Cray dynamic executables. Execute similar to cluster: osspcsamp “how you run your application”
        • Blue Gene/P static executable support
        • Added more options to the compare script (osscompare)
        • Support for doing arithmetic on the columns of information to create derived metrics
    • Starting DOE SBIR to research and add performance analysis support for GPU/Accelerators
    • Porting to Blue Gene/Q
    • Adding a CBTF component instrumentor for data collection that leverages Lightweight MRNet for scalable data gathering and filtering. More on this in the CBTF portion of the talk.
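The derived-metric bullet above can be pictured with a small sketch. This is only an illustration of "arithmetic on columns", not osscompare syntax; the column names (`cycles`, `instructions`), the values, and the helper `add_derived` are all hypothetical.

```python
# Hypothetical per-function rows, as a comparison report might hold them.
rows = [
    {"function": "solve", "cycles": 4000, "instructions": 6000},
    {"function": "pack", "cycles": 1000, "instructions": 500},
]


def add_derived(rows, name, fn):
    """Return new rows with one extra column computed from existing columns."""
    return [{**row, name: fn(row)} for row in rows]


# Derived metric: instructions per cycle, computed from two existing columns.
with_ipc = add_derived(rows, "ipc", lambda r: r["instructions"] / r["cycles"])
```

The original rows are left untouched, so several derived columns can be built from the same base data.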

UW/UMD Software - Open|SpeedShop
Currently:
• What UW/UMD software is used in Open|SpeedShop?
    • symtabAPI: for symbol resolution on the Blue Gene/Q
    • dyninstAPI: for dynamic instrumentation and binary rewriting
        • Includes the subcomponents that comprise “Dyninst”
        • Inserts performance info gathering collectors and runtimes into the application
    • MRNet: as the transfer mechanism
Future:
• What UW/UMD software will we use in Open|SpeedShop?
    • Will be using much of the same software, but in a different context through CBTF (code in CBTF components and component networks)

SWAT

SWAT
• What is SWAT?
    • A commercialized version of the STAT debugger primarily developed by LLNL/UW
• UW and Argo Navis* are teaming together on an SBIR to:
    • Port SWAT to more platforms
    • Test and extend the stack walking component used by SWAT, the StackwalkerAPI, to work with more compilers, platforms, …
    • Enhance the GUI so that it is portable, robust, and easy to use. It will support simplified modes of use.
    • Develop more advanced call tree reduction algorithms
    • Improve SWAT’s ability to display complex stack trees
• Uses StackwalkerAPI and MRNet
*Commercial entity associated with Krell

Component Based Tool Framework (CBTF)

CBTF
• What is CBTF?
    • A framework for writing tools that are based on components
    • Consists of:
        • Libraries that support the creation of reusable components and component networks (single node and distributed), and support connection of the networks
        • Tool building libraries (decomposed from O|SS)
• Benefits of CBTF
    • Components are reusable and easily added to new tools
    • With a large component repository, new tools can be written quickly with little code
    • Create scalable tools by virtue of the distributed network based on MRNet
    • Components can be shared with other projects

CBTF: Base CBTF Libraries
• Create components, component networks, distributed component networks
[Figure: cbtf directory tree. Labels: cbtf; framework; tools; libcbtf; libcbtf-xml; libcbtf-mrnet; examples; services; messages; core; contrib; pcsampDemo (using LW MRNet BE); daemonToolDemo (using MRNet components)]

Main concepts: Components
• Reusable objects with 0-N inputs and 0-M outputs
• Designed to be connected together
    • Connections defined in C++ or in an XML file
• Components are written in C++
• Components can do anything your C code can do
    • Run system commands, open files, do calculations
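A minimal sketch of the component idea, written in Python for brevity even though real CBTF components are C++. The class `Component` and its methods `connect`, `emit`, and `receive` are hypothetical names chosen for illustration; they are not the CBTF API.

```python
from collections import defaultdict


class Component:
    """A reusable unit with named inputs and outputs (hypothetical model)."""

    def __init__(self, name, handler):
        self.name = name
        self.handler = handler          # called as handler(component, input_name, value)
        self.links = defaultdict(list)  # output name -> [(component, input name)]

    def connect(self, output, other, input_name):
        """Wire one of this component's outputs to another component's input."""
        self.links[output].append((other, input_name))

    def emit(self, output, value):
        """Send a value to every input connected to the given output."""
        for target, input_name in self.links[output]:
            target.receive(input_name, value)

    def receive(self, input_name, value):
        self.handler(self, input_name, value)


collected = []
to_upper = Component("to_upper", lambda c, _inp, v: c.emit("out", v.upper()))
sink = Component("sink", lambda c, _inp, v: collected.append(v))
to_upper.connect("out", sink, "in")
to_upper.receive("in", "ps -ef")
```

A component with no connected outputs simply emits to nobody, which matches the "0-M outputs" wording on the slide.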

CBTF: Component Networks
• Components
    • Specific versions
• Connections
    • Matching types
• Arbitrary component topology
    • Pipelines
    • Graphs with cycles
    • ….
• Recursive
    • A CBTF component network is itself a component
• XML-specified
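The "matching types" and "recursive" points can be sketched as follows. This is an assumption-laden toy: it models only linear pipelines (not general graphs or cycles), and the classes `Component` and `Network` are hypothetical, not CBTF types.

```python
class Component:
    """One input and one output, each with a declared type (hypothetical model)."""

    def __init__(self, name, in_type, out_type, fn):
        self.name, self.in_type, self.out_type, self.fn = name, in_type, out_type, fn

    def run(self, value):
        return self.fn(value)


class Network(Component):
    """A pipeline of components that is itself usable as a component."""

    def __init__(self, name, stages):
        # A connection is only valid when the output type matches the input type.
        for upstream, downstream in zip(stages, stages[1:]):
            if upstream.out_type is not downstream.in_type:
                raise TypeError(f"{upstream.name} -> {downstream.name}: type mismatch")
        self.stages = stages
        super().__init__(name, stages[0].in_type, stages[-1].out_type, self._run_all)

    def _run_all(self, value):
        for stage in self.stages:
            value = stage.run(value)
        return value


split = Component("split", str, list, str.split)
count = Component("count", list, int, len)
inner = Network("word_count", [split, count])
double = Component("double", int, int, lambda n: n * 2)
outer = Network("outer", [inner, double])  # a network nested inside a network
result = outer.run("a b c")
```

Because `Network` inherits from `Component`, the nested `inner` network is wired into `outer` exactly like any single component.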

XML File
• Simple way to build the CBTF networks and connect the components
• No need to recompile (if the available components provide the capabilities needed)
[XML listing; markup lost in extraction. Surviving values: TestNetwork_Backend; /users/mmason/mycode-rr/psToolGui; psPlugin; mrnetPlugin; psCmdComp; psCmd...; Backend_In; PacketToString; in1; …; psCmd; out; StringToPacket; in2; DownwardStream; Backend_In; UpwardStream; Backend_Output]
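To illustrate why describing the wiring in XML avoids recompilation, here is a small sketch that builds a pipeline from an XML description at run time. The schema (`network`, `component`, `connection` elements and their attributes) and the `REGISTRY` of stand-in components are invented for this example; the real CBTF XML format is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema for illustration only; this is not the CBTF XML format.
NETWORK_XML = """
<network name="demo">
  <component name="strip" type="Strip"/>
  <component name="upper" type="Upper"/>
  <connection from="strip" to="upper"/>
</network>
"""

# Stand-in for a library of already-compiled components.
REGISTRY = {"Strip": str.strip, "Upper": str.upper}


def load_pipeline(xml_text):
    """Build a linear pipeline from the XML description and return a callable."""
    root = ET.fromstring(xml_text.strip())
    funcs = {c.get("name"): REGISTRY[c.get("type")] for c in root.iter("component")}
    successor = {c.get("from"): c.get("to") for c in root.iter("connection")}
    # The entry point is the one component that is nobody's successor.
    (current,) = set(funcs) - set(successor.values())
    order = []
    while current is not None:
        order.append(current)
        current = successor.get(current)

    def run(value):
        for name in order:
            value = funcs[name](value)
        return value

    return run


run = load_pipeline(NETWORK_XML)
output = run("  ps  ")
```

Changing the tool's behavior then means editing the XML text, as long as the needed component types already exist in the registry.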

MRNet
• CBTF uses a transport mechanism to handle all of its communications
• Currently that transport mechanism is MRNet
    • Multicast/Reduction Network
    • Scalable tree structure
    • Hierarchical on-line data aggregation
• CBTF views MRNet as just another component
    • In the future it could be swapped with some other transport mechanism, if desired
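Hierarchical aggregation over a tree can be sketched in a few lines. This does not use the MRNet API; the function `reduce_tree`, the nested-list tree encoding, and the backend names are assumptions made for illustration.

```python
def reduce_tree(node, leaf_value, combine):
    """Aggregate values up a tree.

    node is either a leaf name (str) or a list of child nodes.
    Each internal node combines only its own children's results,
    so no single node ever sees every leaf directly.
    """
    if isinstance(node, str):
        return leaf_value(node)
    return combine(reduce_tree(child, leaf_value, combine) for child in node)


# Two internal nodes, each with two backends (hypothetical topology).
tree = [["be0", "be1"], ["be2", "be3"]]
loads = {"be0": 3, "be1": 5, "be2": 2, "be3": 7}

total = reduce_tree(tree, loads.__getitem__, sum)
peak = reduce_tree(tree, loads.__getitem__, max)
```

Any associative combine function (sum, max, set union) works the same way, which is what makes the tree layout scale.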

CBTF Networks
• Three networks where components can be connected
    • Frontend, Backend, multiple Filter levels
    • Every level is homogeneous
• Each network also has some number of inputs and outputs
• Any component network can be run on any level, but logically:
    • Frontend component network
        • Interact with or display info to the user
    • Filter component network
        • Filter or aggregate info from below
        • Make decisions about what is sent up or down the tree
    • Backend component network
        • Real work of the tool (extracting information)
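The division of labor between the three levels can be sketched as three plain functions. The function names, the report layout, and the threshold rule are hypothetical; the sketch shows only the upward data flow, not CBTF itself.

```python
def backend(node, samples):
    """Backend level: do the real work, here summarizing local samples."""
    return {"node": node, "peak": max(samples)}


def filter_level(reports, threshold):
    """Filter level: decide what is worth sending further up the tree."""
    return [r for r in reports if r["peak"] >= threshold]


def frontend(reports):
    """Frontend level: format the surviving reports for the user."""
    ordered = sorted(reports, key=lambda r: r["node"])
    return [f'{r["node"]}: peak {r["peak"]}' for r in ordered]


local_samples = {"n0": [1, 9, 4], "n1": [2, 3], "n2": [8, 12]}
reports = [backend(node, s) for node, s in local_samples.items()]
lines = frontend(filter_level(reports, threshold=5))
```

Here the filter drops node `n1` before anything reaches the frontend, which is the kind of decision the slide assigns to the filter level.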

CBTF: Tool Building Support
• To enable tool builders to get started
    • Add a tool building side to CBTF (tools subdirectory under cbtf)
[Figure: cbtf directory tree. Labels: cbtf; framework; tools; libcbtf; libcbtf-xml; libcbtf-mrnet; examples; services; messages; core; contrib; pcsampDemo (using LW MRNet BE); daemonToolDemo (using MRNet components)]
• Note:
    • The directory structure is subject to change
    • daemonToolDemo doesn’t rely on any service, message, or core “tools” code

CBTF: Software Stack (Future Tools)
• Open|SpeedShop
    • Using Services, Messages, Core built using CBTF infrastructure
    • Full-fledged multipurpose performance tool
• Customized tools
    • Use the CBTF infrastructure, not necessarily any support from the tools support sub-directories
    • If a tool creator sees a useful service in tools, they can choose to use it (along with any message and/or core library)
    • Aimed at specific tool needs determined by application code teams

Using CBTF Beyond O|SS
• What can this framework be used for?
    • CBTF is flexible and general enough to be used for any tool that needs to “do something” on a large number of nodes and filter or collect the results
• Sysadmin tools
    • Poll information on a large number of nodes
    • Run commands or manipulate files on the backends
    • Make decisions at the filter level to reduce output or interaction
• Performance analysis tools
    • Massively parallel applications need scalable tools
    • Have components running alongside the application
• Debugging tools
    • Use cluster analysis to reduce thousands (or more) of processes into a small number of groups

Tool Example: Simple Memory Tool
• memTool (tools/contrib/memTool)
    • Displays memory info for each node and total memory used for each MPI process
[Figure: Frontend (FE) and Backend (BE); labels PID, MEM]

Tool Example: Sysadmin
• PS Tool (tools/contrib/psTool)
    • Run ps on all backends; use filters to identify the procs running on all nodes and the procs unique to each node
[Figure: Frontend (FE), Filter (F), Backend (BE); labels PS, Diff, Same]
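The "same" and "diff" steps amount to set operations over the per-node process lists. The sketch below is not the psTool source; the function `common_and_unique`, the node names, and the process names are hypothetical, and it assumes at least one node reports.

```python
def common_and_unique(procs_by_node):
    """Split process names into those on every node and those on exactly one node.

    Assumes procs_by_node has at least one entry.
    """
    sets = {node: set(procs) for node, procs in procs_by_node.items()}
    common = set.intersection(*sets.values())
    unique = {}
    for node, procs in sets.items():
        # Everything seen on any other node.
        others = set().union(*(p for n, p in sets.items() if n != node))
        unique[node] = procs - others
    return common, unique


ps_output = {
    "n0": ["sshd", "cron", "app"],
    "n1": ["sshd", "cron", "top"],
}
common, unique = common_and_unique(ps_output)
```

In a tree-based tool the intersection can be computed pairwise at each filter level, since set intersection is associative.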

Tool Example: Sysadmin
• CBTF-based TBON-FS implementation (tools/contrib/tbonFS)
    • Perform group file operations on the backend local file system
    • Run top on all backends
    • Run grep, tail, read or write files on all backends
    • Copy a file from NFS to the local /tmp on all backends
[Figure: Frontend (FE), Filter (F), Backend (BE); labels T, G, R, W, Cat]

Tool Example: Debugging
• Stack trace grouping (tools/contrib/stack)
    • Run gstack on each pid for a parallel application and group similar stack traces together
[Figure: Frontend, Filter, Backend; labels gstack, group]
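Grouping can be sketched as bucketing processes by their call stack. This simplified version groups only identical traces, whereas the slide speaks of "similar" ones; the function `group_by_stack`, the pids, and the frame names are made up for illustration and no gstack output is parsed.

```python
from collections import defaultdict


def group_by_stack(traces):
    """Map each distinct call stack (as a tuple of frames) to the pids showing it."""
    groups = defaultdict(list)
    for pid, frames in traces.items():
        groups[tuple(frames)].append(pid)
    return dict(groups)


# Hypothetical stacks, outermost frame first.
traces = {
    101: ["main", "solve", "MPI_Wait"],
    102: ["main", "solve", "MPI_Wait"],
    103: ["main", "io_write"],
}
groups = group_by_stack(traces)
```

With thousands of processes the result is usually a handful of groups, and the small groups are the natural place to start looking for a hung or diverging rank.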

Connecting CBTF to Open|SpeedShop
• Why do we want to connect CBTF to O|SS?
    • Scalability…
    • Allow filtering of performance data as it moves from the application to the client tool
    • Eliminates the current “fileio” method, which writes temporary files to disk; the new method does not write files
    • CBTF hides many details of MRNet coding by providing components that handle the actual MRNet implementation when using MRNet as a transport mechanism
• For O|SS we will use: MRNet, symtabAPI initially, then dyninstAPI (and components) for the “online/dynamic” version
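One way to picture filtering performance data in flight is merging per-backend sample histograms at a filter node, so the client receives one combined histogram instead of one file per process. The function `merge_samples`, the addresses, and the counts are hypothetical; this is not how O|SS or CBTF encode their data.

```python
from collections import Counter


def merge_samples(child_histograms):
    """Combine per-child {address: sample count} histograms into one."""
    merged = Counter()
    for histogram in child_histograms:
        merged.update(histogram)  # Counter.update adds counts per key
    return merged


# Hypothetical program-counter samples from two backends.
be0 = {0x400A10: 12, 0x400B20: 3}
be1 = {0x400A10: 8, 0x400C30: 5}
merged = merge_samples([be0, be1])
```

The merged histogram grows with the number of distinct addresses rather than the number of processes, which is where the scalability benefit comes from.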

CBTF Next Steps
• Support Active Harmony integration with CBTF
• More detailed documentation of examples, demo tools
• Tool start up investigation/implementation (launchmon, libi, …)
    • Variations dependent on platform type (BG, Cray)
• Tool services, messages, component creation to support more types of collection (don’t have all O|SS in cbtf/tools)
• Continue porting to Cray and Blue Gene platforms
• Add filtering components for MRNet communication node deployment
• Continue the connection of CBTF to Open|SpeedShop

CBTF Information
• Where to find information
    • CBTF wiki:
• Source access
    • Currently: friendly access available through request
    • Probably will be completely open shortly
    • Source hosted at ORNL git repository
• CBTF Tutorial, step-by-step instructional info on CBTF wiki
• Technical paper is needed…
• Always looking to collaborate with others; please contact us

Questions
• Jim Galarowicz
• Don Maghrak
• Questions about Open|SpeedShop or CBTF