UPC Performance Analysis Tool: Status and Plans
21 Sep 2005

Professor Alan D. George, Principal Investigator
Mr. Hung-Hsun Su, Sr. Research Assistant
Mr. Adam Leko, Sr. Research Assistant
Mr. Bryan Golden, Research Assistant
Mr. Hans Sherburne, Research Assistant

HCS Research Laboratory, University of Florida

Outline
 Accomplishments
 PAT High-level Design
 PAT Development Plan
 Q&A

Accomplishments
 Survey of PAT technology (PAT = Performance Analysis Tool)
 Evaluation of existing parallel PATs using a standardized scoring system
 Study of the UPC language with respect to performance analysis
 Usability study
 Identification of reusable components
 Requirements for the PAT
 High-level design for the PAT

Performance Analysis Technologies
Three general performance analysis approaches:
 Analytical modeling (an illustrative model follows this slide)
   Mostly predictive methods
   Can also be used in conjunction with experimental performance measurement
   Pros: easy to use, fast, and possible without running the program
   Cons: usually not very accurate
 Simulation
   Pros: allows performance estimation of a program on various system architectures
   Cons: slow, and not generally practical for everyday UPC/SHMEM users
 Experimental performance measurement
   Strategy used by most modern PATs
   Uses actual event measurements to perform the analysis
   Pros: most accurate
   Cons: can be time-consuming
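
For instance, analytical models often take a latency-bandwidth form. The model below is an illustrative textbook-style sketch, not one taken from the slides, and all symbols are assumptions:

    % Illustrative latency-bandwidth cost model (assumed form).
    % T_comp: total computation time, p: number of threads,
    % n_msg: number of remote transfers, n_bytes: bytes moved,
    % alpha: per-transfer latency, beta: per-byte cost.
    T(p) \approx \frac{T_{comp}}{p} + n_{msg}\,\alpha + n_{bytes}\,\beta

Plugging measured machine constants (alpha, beta) into such a model yields a run-time estimate without executing the program, which is the "pro" above; the simplifications behind the formula are the source of the "con".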

Tools Study: List of Tools
Profiling tools:
 TAU (Univ. of Oregon)
 mpiP (ORNL, LLNL)
 HPCToolkit (Rice Univ.)
 SvPablo (Univ. of Illinois, Urbana-Champaign)
 DynaProf (Univ. of Tennessee, Knoxville)
Tracing tools:
 Intel Cluster Tools (Intel)
 MPE/Jumpshot (ANL)
 Dimemas & Paraver (European Ctr. for Parallelism of Barcelona)
 MPICL/ParaGraph (Univ. of Illinois, Univ. of Tennessee, ORNL)
Other tools:
 KOJAK (Forschungszentrum Jülich, UTK)
 Paradyn (Univ. of Wisconsin, Madison)
Also quickly reviewed:
 CrayPat/Apprentice2 (Cray)
 DynTG (LLNL)
 AIMS (NASA)
 Eclipse Parallel Tools Platform (LANL)
 Open|SpeedShop (SGI)

Tools Study: Lessons Learned
Most effective tools present data to the user at many levels:
 Profile and summary data that give a top-level view of the program's execution
 Trace data that shows exact behavior and communication patterns
 Hardware counters that show how well the program runs on the current architecture
Common problems experienced:
 Difficult installations
 Poor documentation and no "quickstart" guide
 Too much user overhead
 Not enough data available from the tools to troubleshoot performance problems
 Difficulty navigating the tool to reach performance data, or nonstandard user interfaces
 Inability to relate data back to source code

Overall Scores
[Chart of overall evaluation scores for the tools studied; not reproduced in this transcript.]

Language Study & Usability Study
Language study:
 A hybrid pre-processor/wrapper-library approach appears appropriate (see the wrapper sketch below):
   UPC functions → wrappers
   Other constructs (direct shared-memory accesses, upc_forall, etc.) → pre-processor
   Both feed into a UPC profiling interface
 Avoid interfering with compiler optimization
 Obtain low-level, implementation-specific data
Usability study:
 Elicit user feedback through a Performance Tool Usability Survey (limited success thus far)
 Review and develop a concise summary of the literature on usability for parallel performance tools
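
To make the wrapper half of the hybrid approach concrete, here is a minimal sketch of how a wrapper library might interpose on one UPC call. The pat_record_* hooks and the macro-redirection scheme are hypothetical illustrations, not the project's actual interface:

    /* pat_wrap.h -- hypothetical wrapper for one UPC call (sketch only). */
    #include <stddef.h>
    #include <upc.h>

    void pat_record_enter(const char *op);           /* assumed hook: start timer       */
    void pat_record_exit(const char *op, size_t n);  /* assumed hook: stop, log n bytes */

    static inline void pat_upc_memget(void *dst, shared const void *src, size_t n)
    {
        pat_record_enter("upc_memget");
        upc_memget(dst, src, n);            /* the real one-sided get */
        pat_record_exit("upc_memget", n);
    }

    /* Redirect user code to the wrapper. Constructs with no library entry
     * point (shared-pointer dereferences, upc_forall) have nothing to wrap,
     * which is why the slides route those through a pre-processor instead. */
    #define upc_memget(dst, src, n) pat_upc_memget((dst), (src), (n))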

Extensibility & Code Reuse
Instrumentation & measurement units:
 Source instrumentation: tau_instrument, PDToolkit
 Binary instrumentation: DynInst, DPCL, ATOM, Pin
 Hardware counters: PAPI, PCL
 General trace and profile routines: TAU; EPILOG & EARL from KOJAK
 Decision: use PAPI and develop our own instrumentation technique (UPC profiling interface); see the PAPI sketch below
Analysis unit:
 Source-code correlation from wrapper libraries: HPCToolkit, mpiP
 Pattern matching for bottleneck identification: EXPERT from KOJAK
 Decision: borrow the idea and develop a one-sided communication model
Presentation unit:
 Trace visualization: Jumpshot; Intel Trace Analyzer or VampirNG using libvtf
 Profile visualization: CUBE from KOJAK; ParaProf from TAU
 Source code viewer: ToolGear, HPCToolkit
 Decision: develop our own, but borrow small pieces of code
UPC PAT:
 Develop a new tool that borrows ideas from existing tools
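
Since the slides settle on PAPI for hardware counters, a minimal sketch of the calls a measurement unit would issue is shown below; the event choice is arbitrary and error handling is trimmed for brevity:

    /* Count total CPU cycles around a code region with PAPI. */
    #include <stdio.h>
    #include <papi.h>

    int main(void)
    {
        int evset = PAPI_NULL;
        long long cycles = 0;

        if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
            return 1;                          /* header/runtime version mismatch */
        PAPI_create_eventset(&evset);
        PAPI_add_event(evset, PAPI_TOT_CYC);   /* preset event: total cycles */

        PAPI_start(evset);
        /* ... region to measure ... */
        PAPI_stop(evset, &cycles);

        printf("cycles: %lld\n", cycles);
        return 0;
    }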

PAT High-level Design
 Semi-automatic source-level instrumentation as the default
   Minimizes user workload
   Reduces development complexity
   Provides easy source-code correlation
 Tracing mode as the default, with profiling also supported (a sketch of a possible trace record follows this slide)
 Analyses: load balancing, scalability, memory system
 Visualizations:
   Timeline display
   Speedup chart
   Call-tree graph
   Communication-volume graph
   Memory-access graph
   Profiling table
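
Because tracing is the default mode, each instrumented event must carry enough context to rebuild the timeline, communication-volume, and memory-access views afterward. The record layout below is a hedged sketch; the field names and sizes are assumptions, not the tool's actual format:

    /* Hypothetical trace record for the tracing mode (illustrative only). */
    #include <stdint.h>

    typedef struct {
        uint64_t timestamp_ns;  /* wall-clock time of the event            */
        uint32_t thread;        /* UPC thread (MYTHREAD) that logged it    */
        uint32_t event_id;      /* e.g., enter/exit of a wrapped UPC call  */
        uint32_t peer;          /* remote thread for one-sided operations  */
        uint64_t bytes;         /* payload size; 0 for purely local events */
        uint32_t source_line;   /* enables source-code correlation         */
    } pat_trace_record;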

PAT High-level Design: Experimental Performance Measurement Stages
 Instrumentation: user-assisted or automatic insertion of instrumentation code
 Measurement: the actual measuring stage
 Analysis: filtering, aggregation, and analysis of the gathered data (a small aggregation sketch follows this slide)
 Presentation: display of the analyzed data; deals directly with the user
 Optimization: the process of finding and resolving bottlenecks
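
As an illustration of the Analysis stage (a hedged sketch, not the tool's actual algorithm), profile-style aggregation collapses a stream of timed events into one row per operation; the operation names and durations here are invented for the example:

    /* Sketch: aggregate timed events into a per-operation profile table. */
    #include <stdio.h>

    enum { N_OPS = 3 };
    static const char *op_name[N_OPS] = { "upc_memget", "upc_memput", "upc_barrier" };

    int main(void)
    {
        /* Assumed input: (operation index, duration in microseconds). */
        struct { int op; double usec; } events[] = {
            {0, 12.5}, {1, 8.0}, {0, 11.9}, {2, 130.0}, {1, 7.6}
        };
        int    count[N_OPS] = {0};
        double total[N_OPS] = {0.0};

        for (int i = 0; i < (int)(sizeof events / sizeof events[0]); i++) {
            count[events[i].op] += 1;
            total[events[i].op] += events[i].usec;
        }
        for (int op = 0; op < N_OPS; op++)
            printf("%-12s calls=%d total=%6.1f us avg=%5.1f us\n",
                   op_name[op], count[op], total[op],
                   count[op] ? total[op] / count[op] : 0.0);
        return 0;
    }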

PAT High-Level Design: Simplified Version
[Block diagram of the simplified high-level design; not reproduced in this transcript.]

Research and Release Schedule
Beta-testers needed!

PAT Release                                         Delivery Date
I and M modules for 1st platforms                   April 1, 2006 (Month 6)
I, M, and A modules for 1st platforms               July 1, 2006 (Month 9)
I and M modules for 2nd platforms                   July 1, 2006 (Month 9)
Beta-1 (I, M, A, and P modules for all platforms)   October 1, 2006 (Month 12)
Beta-2                                              December 1, 2006 (Month 14)
Beta-3                                              February 1, 2007 (Month 16)
Version 1.0                                         April 1, 2007 (Month 18)

(I, M, A, P = Instrumentation, Measurement, Analysis, and Presentation modules, per the measurement stages above.)

Q & A