
1 11 July 2005 Briefing on Tool Evaluations
Professor Alan D. George, Principal Investigator
Mr. Hung-Hsun Su, Sr. Research Assistant
Mr. Adam Leko, Sr. Research Assistant
Mr. Bryan Golden, Research Assistant
Mr. Hans Sherburne, Research Assistant
HCS Research Laboratory, University of Florida
PAT

2 11 July 2005 Purpose & Methodology

3 11 July 2005 Purpose of Evaluations
Investigate performance analysis methods used in existing tools
  - Determine what features are necessary for tools to be effective
  - Examine usability of tools
  - Find out what performance factors existing tools focus on
  - Create standardized evaluation strategy and apply to popular existing tools
Gather information about tool extensibility
  - Generate list of reusable components from existing tools
  - Identify any tools that may serve as basis for our SHMEM and UPC performance tool
  - Take best candidates for extension and gain experience modifying them to support new features

4 11 July 2005 Evaluation Methodology
Generate list of desirable characteristics for performance tools
Categorize based on influence on a tool's:
  - Usability/productivity
  - Portability
  - Scalability
  - Miscellaneous
Will present list of characteristics and actual scores in later slides
Assign importance rating to each
  - Minor (not really important)
  - Average (nice to have)
  - Important (should include)
  - Critical (absolutely needed)
Formulate a scoring strategy for each
  - Give numerical scores 1-5 (5 = best; 0 = not applicable)
  - Create objective scoring criteria where possible
  - Use relative scores for subjective categories

5 11 July 2005 Performance Tool Test Suite
Method used to ensure subjective scores are consistent across tools
Also used to determine effectiveness of each performance tool
Includes
  - Suite of C MPI microbenchmarks that have specific performance problems: PPerfMark [1,2], based on Grindstone [3]
  - Large-scale program: NAS NPB LU benchmark [4]
  - "Control" program with good parallel efficiency to test for false positives: CAMEL cryptanalysis C MPI implementation (HCS lab)
For each program in test suite, assign
  - FAIL: tool was unable to provide information to identify the bottleneck
  - TOSS-UP: tool indicated a bottleneck was occurring, but the user must be clever to find and fix it
  - PASS: tool clearly showed where the bottleneck was occurring and gave enough information that a competent user could fix it

6 11 July 2005 Performance Tool Test Suite (2)
What should performance tool tell us?
CAMEL
  - No communication bottlenecks, CPU-bound code
  - Performance could be improved by using non-blocking MPI calls
LU
  - Large number of small messages
  - Dependence on network bandwidth and latency
  - Identify which routines take the most time

7 11 July 2005 Performance Tool Test Suite (3)
Big message
  - Several large messages sent
  - Dependence on network bandwidth
Intensive server
  - First node overloaded with work
Ping-pong
  - Many small messages; overall execution time dependent on network latency
Random barrier
  - One node holds up barrier
  - One procedure responsible for slow node behavior
Small messages
  - One node is bombarded with lots of messages
Wrong way
  - Point-to-point messages sent in wrong order
System time
  - Most time spent in system calls
Diffuse procedure
  - Similar to random barrier: one node holds up barrier
  - Time for slow procedure "diffused" across several nodes in round-robin fashion
(The general shape of these barrier-delay benchmarks is sketched below.)
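The microbenchmarks above come from PPerfMark/Grindstone; the fragment below is only a rough illustration of their shape, not the actual benchmark code. It sketches the barrier-delay pattern behind the random-barrier and diffuse-procedure tests: one rank spends extra time in a procedure before a barrier, so every other rank accumulates wait time that a good tool should attribute to that procedure. The round-robin rotation of the slow rank follows the diffuse-procedure variant; function names and the delay length are illustrative.

    /* Illustrative sketch (not PPerfMark/Grindstone source): one rank per
     * iteration delays inside slow_procedure() before MPI_Barrier, so all
     * other ranks wait at the barrier. */
    #include <mpi.h>
    #include <unistd.h>

    static void slow_procedure(void)   /* the procedure a tool should blame */
    {
        sleep(2);
    }

    int main(int argc, char **argv)
    {
        int rank, size, iter;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        for (iter = 0; iter < 10; iter++) {
            if (rank == iter % size)       /* slow rank rotates round-robin */
                slow_procedure();
            MPI_Barrier(MPI_COMM_WORLD);
        }
        MPI_Finalize();
        return 0;
    }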

8 11 July 2005 Overview of Tools Evaluated

9 11 July 2005 List of Tools Evaluated
Profiling tools
  - TAU (Univ. of Oregon)
  - mpiP (ORNL, LLNL)
  - HPCToolkit (Rice Univ.)
  - SvPablo (Univ. of Illinois, Urbana-Champaign)
  - DynaProf (Univ. of Tennessee, Knoxville)
Tracing tools
  - Intel Cluster Tools (Intel)
  - MPE/Jumpshot (ANL)
  - Dimemas & Paraver (European Ctr. for Parallelism of Barcelona)
  - MPICL/ParaGraph (Univ. of Illinois, Univ. of Tennessee, ORNL)

10 11 July 2005 List of Tools Evaluated (2)
Other tools
  - KOJAK (Forschungszentrum Jülich, ICL @ UTK)
  - Paradyn (Univ. of Wisconsin, Madison)
Also quickly reviewed
  - CrayPat/Apprentice 2 (Cray)
  - DynTG (LLNL)
  - AIMS (NASA)
  - Eclipse Parallel Tools Platform (LANL)
  - Open|SpeedShop (SGI)

11 11 July 2005 Profiling Tools

12 11 July 2005 Tuning and Analysis Utilities (TAU)
Developer: University of Oregon
Current versions:
  - TAU 2.14.4
  - Program Database Toolkit 3.3.1
Website: http://www.cs.uoregon.edu/research/paracomp/tau/tautools/
Contact:
  - Sameer Shende: sameer@cs.uoregon.edu

13 11 July 2005 TAU Overview
Measurement mechanisms
  - Source (manual)
  - Source (automatic via PDToolkit)
  - Binary (DynInst)
Key features
  - Supports both profiling and tracing
    - No built-in trace viewer; generic export utility for trace files (.vtf, .slog2, .alog)
  - Many supported architectures
  - Many supported languages: C, C++, Fortran, Python, Java, SHMEM (TurboSHMEM and Cray SHMEM), OpenMP, MPI, Charm
  - Hardware counter support via PAPI
(Manual source instrumentation is sketched below.)
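For reference, manual source instrumentation with TAU's C macros looks roughly like the sketch below. This is a minimal sketch assuming TAU's standard macro API; the instrumented function is hypothetical, and build details (TAU makefiles/flags) are omitted.

    /* Minimal sketch of manual TAU instrumentation in C; compute() is a
     * hypothetical user function.  TAU can also insert these calls
     * automatically via PDToolkit or instrument binaries via DynInst. */
    #include <TAU.h>

    void compute(void)
    {
        TAU_PROFILE_TIMER(t, "compute()", "", TAU_USER);  /* declare a timer */
        TAU_PROFILE_START(t);
        /* ... work to be measured ... */
        TAU_PROFILE_STOP(t);
    }

    int main(int argc, char **argv)
    {
        TAU_PROFILE_INIT(argc, argv);
        TAU_PROFILE_SET_NODE(0);   /* node/rank id; set automatically for MPI codes */
        compute();
        return 0;
    }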

14 11 July 2005 TAU Visualizations

15 11 July 2005 mpiP
Developer: ORNL, LLNL
Current version: mpiP v2.8
Website: http://www.llnl.gov/CASC/mpip/
Contacts:
  - Jeffrey Vetter: vetterjs@ornl.gov
  - Chris Chambreau: chcham@llnl.gov

16 11 July 2005 mpiP Overview
Measurement mechanism
  - Profiling via MPI profiling interface
Key features
  - Simple, lightweight profiling
  - Source code correlation (facilitated by mpipview)
  - Gives profile information for MPI callsites
  - Uses PMPI interface with extra libraries (libelf, libdwarf, libunwind) to do source correlation
(A generic PMPI wrapper is sketched below.)
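The MPI profiling interface mentioned above is the standard PMPI name-shifting mechanism. The fragment below is a generic illustration of how such a wrapper intercepts a call; it is not mpiP's actual code, and a real tool would also record the call site and aggregate statistics rather than discard the timing.

    /* Generic PMPI wrapper illustration: the tool supplies MPI_Send, times
     * the call, and forwards to the real implementation via PMPI_Send. */
    #include <mpi.h>

    int MPI_Send(void *buf, int count, MPI_Datatype datatype,
                 int dest, int tag, MPI_Comm comm)
    {
        double start = MPI_Wtime();
        int rc = PMPI_Send(buf, count, datatype, dest, tag, comm);
        double elapsed = MPI_Wtime() - start;
        (void)elapsed;  /* a real tool would attribute 'elapsed' to the call site */
        return rc;
    }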

17 11 July 2005 mpiP Source Code Browser

18 11 July 2005 HPCToolkit
Developer: Rice University
Current version: HPCToolkit v1.1
Website: http://www.hipersoft.rice.edu/hpctoolkit/
Contact:
  - John Mellor-Crummey: johnmc@cs.rice.edu
  - Rob Fowler: rjf@cs.rice.edu

19 11 July 2005 HPCToolkit Overview
Measurement mechanism
  - Hardware counters (requires PAPI on Linux)
Key features
  - Create hardware counter profiles for any executable via sampling
    - No instrumentation necessary
    - Relies on PAPI overflow events and program counter values to relate PAPI metrics to source code (see the sketch below)
  - Source code correlation of performance data, even for optimized code
  - Navigation pane in viewer assists in locating resource-consuming functions
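The sampling approach described above builds on PAPI's overflow mechanism. A minimal sketch of that mechanism follows; the event choice and threshold are illustrative, and a real profiler would record the program counter into a buffer instead of printing from the handler.

    /* Sketch of PAPI overflow-based sampling: after every 'threshold'
     * occurrences of an event, PAPI invokes the handler with the program
     * counter of the interrupted instruction. */
    #include <papi.h>
    #include <stdio.h>

    static void handler(int event_set, void *address,
                        long long overflow_vector, void *context)
    {
        /* 'address' is the PC to be mapped back to a source line */
        printf("sample at PC %p\n", address);
    }

    int main(void)
    {
        int event_set = PAPI_NULL;
        long long counts[1];

        PAPI_library_init(PAPI_VER_CURRENT);
        PAPI_create_eventset(&event_set);
        PAPI_add_event(event_set, PAPI_TOT_CYC);
        PAPI_overflow(event_set, PAPI_TOT_CYC, 1000000, 0, handler);  /* illustrative threshold */
        PAPI_start(event_set);
        /* ... application work being sampled ... */
        PAPI_stop(event_set, counts);
        return 0;
    }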

20 11 July 2005 HPCToolkit Source Browser

21 11 July 2005 SvPablo
Developer: University of Illinois
Current versions:
  - SvPablo 6.0
  - SDDF component 5.5
  - Trace Library component 5.1.4
Website: http://www.renci.unc.edu/Software/Pablo/pablo.htm
Contact: ?

22 11 July 2005 SvPablo Overview
Measurement mechanism
  - Profiling via source code instrumentation
Key features
  - Single GUI integrates instrumentation and performance data display
  - Assisted source code instrumentation
  - Management of multiple instances of instrumented source code and corresponding performance data
  - Simplified scalability analysis of performance data from multiple runs

23 11 July 2005 SvPablo Visualization

24 11 July 2005 Dynaprof
Developer: Philip Mucci (UTK)
Current versions:
  - Dynaprof CVS as of 2/21/2005
  - DynInst API v4.1.1 (dependency)
  - PAPI v3.0.7 (dependency)
Website: http://www.cs.utk.edu/~mucci/dynaprof/
Contact: Philip Mucci: mucci@cs.utk.edu

25 11 July 2005 Dynaprof Overview
Measurement mechanism
  - Profiling via PAPI and DynInst
Key features
  - Simple, gdb-like command line interface
  - No instrumentation step needed; binary instrumentation at runtime
  - Produces simple text-based profile output similar to gprof for PAPI metrics, wallclock time, and CPU time (getrusage)

26 11 July 2005 Tracing Tools

27 11 July 2005 Intel Trace Collector/Analyzer
Developer: Intel
Current versions:
  - Intel Trace Collector 5.0.1.0
  - Intel Trace Analyzer 4.0.3.1
Website: http://www.intel.com/software/products/cluster
Contact: http://premier.intel.com

28 11 July 2005 Intel Trace Collector/Analyzer Overview
Measurement mechanism
  - MPI profiling interface for MPI programs
  - Static binary instrumentation (proprietary method)
Key features
  - Simple, straightforward operation
  - Comprehensive set of visualizations
  - Source code correlation pop-up dialogs
  - Views are linked, allowing analysis of specific portions/phases of execution trace

29 11 July 2005 Intel Trace Analyzer Visualizations

30 11 July 2005 MPE/Jumpshot
Developer: Argonne National Laboratory
Current versions:
  - MPE 1.26
  - Jumpshot-4
Website: http://www-unix.mcs.anl.gov/perfvis/
Contacts:
  - Anthony Chan: chan@mcs.anl.gov
  - David Ashton: ashton@mcs.anl.gov
  - Rusty Lusk: lusk@mcs.anl.gov
  - William Gropp: gropp@mcs.anl.gov

31 11 July 2005 MPE/Jumpshot Overview
Measurement mechanism
  - MPI profiling interface for MPI programs
Key features
  - Distributed with MPICH
  - Easy to generate traces of MPI programs: compile with mpicc -mpilog (minimal example below)
  - Scalable logfile format for efficient visualization
  - Java-based timeline viewer with extensive scrolling and zooming support
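As a usage illustration, any ordinary MPI program can be traced this way with no source changes; the file name below is illustrative, and the exact flag spelling may vary by MPICH/MPE installation.

    /* ring.c: an ordinary MPI program.  Building it with MPE logging enabled,
     * e.g. "mpicc -mpilog ring.c -o ring" on an MPICH installation, makes
     * MPI_Finalize write a logfile that Jumpshot can display. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, token = 0;
        MPI_Status status;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        if (rank == 0) {
            MPI_Send(&token, 1, MPI_INT, (rank + 1) % size, 0, MPI_COMM_WORLD);
            MPI_Recv(&token, 1, MPI_INT, size - 1, 0, MPI_COMM_WORLD, &status);
        } else {
            MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD, &status);
            MPI_Send(&token, 1, MPI_INT, (rank + 1) % size, 0, MPI_COMM_WORLD);
        }
        MPI_Finalize();
        return 0;
    }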

32 11 July 2005 Jumpshot Visualization

33 11 July 2005 CEPBA Tools (Dimemas, Paraver)
Developer: European Center for Parallelism of Barcelona
Current versions:
  - MPITrace 1.1
  - Paraver 3.3
  - Dimemas 2.3
Website: http://www.cepba.upc.es/tools_i.htm
Contact: Judit Gimenez: judit@cepba.upc.edu

34 11 July 2005 Dimemas/Paraver Overview
Measurement mechanism
  - MPI profiling interface
Key features
  - Paraver
    - Sophisticated trace file viewer; uses "tape" metaphor
    - Supports displaying hardware counter metrics alongside the trace visualization
    - Uses modular software architecture; very customizable
  - Dimemas
    - Trace-driven simulator
    - Uses simple models for real hardware
    - Generates "predictive traces" that can be viewed by Paraver

35 11 July 2005 Paraver Visualizations

36 11 July 2005 Paraver Visualizations (2)

37 11 July 2005 MPICL/ParaGraph
Developer:
  - ParaGraph: University of Illinois, University of Tennessee
  - MPICL: ORNL
Current versions:
  - ParaGraph (no version number; last available update 1999)
  - MPICL 2.0
Websites:
  - http://www.csar.uiuc.edu/software/paragraph/
  - http://www.csm.ornl.gov/picl/
Contacts:
  - ParaGraph: Michael Heath: heath@cs.uiuc.edu; Jennifer Finger
  - MPICL: Patrick Worley: worleyph@ornl.gov

38 11 July 2005 MPICL/ParaGraph Overview
Measurement mechanism
  - MPI profiling interface
  - Other wrapper libraries for obsolete vendor-specific message-passing libraries
Key features
  - Large number of different visualizations (about 27)
  - Several types: utilization visualizations, communication visualizations, "task" visualizations, and other visualizations

39 11 July 2005 ParaGraph Visualizations: Utilization

40 11 July 2005 ParaGraph Visualizations: Communication

41 11 July 2005 Other Tools

42 11 July 2005 KOJAK
Developer: Forschungszentrum Jülich, ICL @ UTK
Current versions:
  - Stable: KOJAK v2.0
  - Development: KOJAK v2.1b1
Websites:
  - http://icl.cs.utk.edu/kojak/
  - http://www.fz-juelich.de/zam/kojak/
Contacts:
  - Felix Wolf: fwolf@cs.utk.edu
  - Bernd Mohr: b.mohr@fz-juelich.de
  - Generic email: kojak@cs.utk.edu

43 11 July 2005 KOJAK Overview
Measurement mechanism
  - MPI profiling interface
  - Binary instrumentation on a few platforms
Key features
  - Generates and analyzes trace files
    - Automatic classification of bottlenecks
  - Simple, scalable profile viewer with source correlation
  - Exports traces to Vampir format

44 11 July 2005 KOJAK Visualization

45 11 July 2005 Paradyn
Developer: University of Wisconsin, Madison
Current versions:
  - Paradyn 4.1.1
  - DynInst 4.1.1
  - KernInst 2.0.1
Website: http://www.paradyn.org/index.html
Contact: Matthew Legendre: legendre@cs.wisc.edu

46 11 July 2005 Paradyn Overview
Measurement mechanism
  - Dynamic binary instrumentation
Key features
  - Dynamic instrumentation at runtime; no separate instrumentation phase
  - Visualizes user-selectable metrics while program is running
  - Automatic performance bottleneck detection via Performance Consultant
  - Users can define their own metrics using a TCL-like language
  - All analysis happens while program is running

47 11 July 2005 Paradyn Visualizations

48 11 July 2005 Paradyn Performance Consultant

49 11 July 2005 Evaluation Ratings

50 11 July 2005 Scoring System
Scores given for each category
  - Usability/productivity
  - Portability
  - Scalability
  - Miscellaneous
Scoring formula shown below
  - Used to generate scores for each category
  - Weighted sum based on each characteristic's importance
Importance multipliers used
  - Critical: 1.0
  - Important: 0.75
  - Average: 0.5
  - Minor: 0.25
Overall score is sum of all category scores
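The formula graphic itself did not survive in this transcript. Based on the description above, the weighted sum presumably has the following form (a reconstruction, not the original slide graphic; any normalization the original formula applied is not preserved here):

    \[
      S_{\text{category}} = \sum_{i \in \text{category}} w_i \, s_i,
      \qquad
      w_i \in \{1.0,\ 0.75,\ 0.5,\ 0.25\},
      \qquad
      s_i \in \{0, 1, \ldots, 5\}
    \]
    \[
      S_{\text{overall}} = \sum_{\text{categories}} S_{\text{category}}
    \]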

51 11 July 2005 Characteristics: Portability, Miscellaneous, Scalability Categories
Portability
  - Critical: Extensibility, Hardware support
  - Important: Software support
  - Minor: Heterogeneity support
Miscellaneous
  - Important: Cost, Interoperability
Scalability
  - Critical: Filtering and aggregation, Multiple executions, Performance bottleneck identification
  - Minor: Searching
Note: See appendix for details on how scores were assigned for each characteristic.

52 11 July 2005 Characteristics: Usability/Productivity Category
Usability/productivity
  - Critical: Available metrics, Learning curve, Multiple analyses/views, Profiling/tracing support, Source code correlation
  - Important: Documentation, Manual (user) overhead, Measurement accuracy, Stability
  - Average: Response time, Technical support
  - Minor: Installation

53 11 July 2005 Usability/Productivity Scores

54 11 July 2005 Portability Scores

55 11 July 2005 Scalability Scores

56 11 July 2005 Miscellaneous Scores

57 11 July 2005 Overall Scores

58 11 July 2005 Extensibility Study & Demo
Question: should we write a new tool from scratch, or reuse an existing tool?
To help answer, we added preliminary support for GPSHMEM to two tools
  - Picked top candidate tools for extension, KOJAK and TAU (based on portability scores)
  - Added weak binding support for GPSHMEM (GCC only); the general wrapper pattern is sketched below
  - Created simple GPSHMEM wrapper libraries for KOJAK and TAU
  - Will study creating comparable components from scratch in near future
Notes/caveats
  - No advanced analyses for one-sided memory operations available in either TAU or KOJAK; only simple support added (analyzing one-sided operations is difficult)
  - GPSHMEM requires source patches for weak binding support; currently works only with GCC compilers
  - Adding UPC support to these tools would require several orders of magnitude more work
(Demo)
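The weak-binding approach mentioned above generally works as sketched below: the library's public entry point becomes a weak alias for an internal strong symbol, so a measurement wrapper library (the kind written here for KOJAK and TAU) can override the public name at link time and still reach the real implementation. This is a generic, GCC-specific sketch with hypothetical function names and print statements standing in for measurement hooks; it is not the actual GPSHMEM patch or wrapper code.

    /* File 1 (patched library side): the public name is a weak alias for
     * the real implementation.  gpshmem_example_put is a hypothetical
     * stand-in for a GPSHMEM routine. */
    void p_gpshmem_example_put(void *dst, const void *src,
                               unsigned long nbytes, int pe)
    {
        (void)dst; (void)src; (void)nbytes; (void)pe;  /* ... real one-sided put ... */
    }
    void gpshmem_example_put(void *dst, const void *src,
                             unsigned long nbytes, int pe)
        __attribute__((weak, alias("p_gpshmem_example_put")));

    /* File 2 (separate translation unit, the tool's wrapper library):
     * a strong definition of the public name overrides the weak alias at
     * link time and forwards to the internal entry point. */
    #include <stdio.h>
    void p_gpshmem_example_put(void *dst, const void *src,
                               unsigned long nbytes, int pe);
    void gpshmem_example_put(void *dst, const void *src,
                             unsigned long nbytes, int pe)
    {
        printf("enter gpshmem_example_put: %lu bytes to PE %d\n", nbytes, pe);
        p_gpshmem_example_put(dst, src, nbytes, pe);   /* call the real routine */
        printf("exit  gpshmem_example_put\n");
    }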

59 11 July 2005 Q&A

60 11 July 2005 References
[1] Kathryn Mohror and Karen L. Karavanic. "Performance Tool Support for MPI-2 on Linux," SC2004, November 2004, Pittsburgh, PA.
[2] Kathryn Mohror and Karen L. Karavanic. "Performance Tool Support for MPI-2 on Linux," PSU CS Department Technical Report, April 2004.
[3] Jeffrey K. Hollingsworth and Michael Steele. "Grindstone: A Test Suite for Parallel Performance Tools," Technical Report CS-TR-3703, University of Maryland, October 1996.
[4] David Bailey, Tim Harris, William Saphir, Rob van der Wijngaart, Alex Woo, and Maurice Yarrow. "The NAS Parallel Benchmarks 2.0," Technical Report NAS-95-020, NASA, December 1995.

61 11 July 2005 Appendix: Tool Characteristics Used in Evaluations

62 11 July 2005 Usability/Productivity Characteristics

63 11 July 2005 Available Metrics
Description
  - Depth of metrics provided by tool
  - Examples: communication statistics or events, hardware counters
Importance rating
  - Critical: users must be able to obtain representative performance data to debug performance problems
Rating strategy
  - Scored using relative ratings (subjective characteristic)
  - Compare tool's available metrics with metrics provided by other tools

64 11 July 2005 Documentation Quality
Description
  - Quality of documentation provided
  - Includes user's manuals, READMEs, and "quick start" guides
Importance rating
  - Important: can have a large effect on overall usability
Rating strategy
  - Scored using relative ratings (subjective characteristic)
  - Correlated to how long it takes to decipher documentation enough to use tool
  - Tools with quick start guides or clear, concise high-level documentation receive higher scores

65 11 July 2005 Installation
Description
  - Measure of time needed for installation
  - Also incorporates level of expertise necessary to perform installation
Importance rating
  - Minor: installation only needs to be done once and may not even be done by end user
Rating strategy
  - Scored using relative ratings based on mean installation time for all tools
  - All tools installed by a single person with significant system administration experience

66 11 July 2005 Learning Curve
Description
  - Difficulty level associated with learning to use tool effectively
Importance rating
  - Critical: tools that users perceive as too difficult to operate will be avoided
Rating strategy
  - Scored using relative ratings (subjective characteristic)
  - Based on time necessary to get acquainted with all features needed for day-to-day operation of tool

67 11 July 2005 Manual Overhead
Description
  - Amount of user effort needed to instrument their code
Importance rating
  - Important: tool must not create more work for the user in the end (instead it should reduce time!)
Rating strategy
  - Use hypothetical test case: MPI program, ~2.5 kloc in 20 .c files with 50 user functions
  - Score one point for each of the following actions that can be completed on a fresh copy of the source code in 10 minutes (estimated):
    - Instrument all MPI calls
    - Instrument all functions
    - Instrument five arbitrary functions
    - Instrument all loops, or a subset of loops
    - Instrument all function callsites, or a subset of callsites (about 35)

68 11 July 2005 Measurement Accuracy
Description
  - How much runtime instrumentation overhead the tool imposes
Importance rating
  - Important: inaccurate data may lead to incorrect diagnosis, which creates more work for the user with no benefit
Rating strategy
  - Use standard application: CAMEL MPI program
  - Score based on runtime overhead of instrumented executable (wallclock time; computation sketched below):
    - 0-4%: five points
    - 5-9%: four points
    - 10-14%: three points
    - 15-19%: two points
    - 20% or greater: one point
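For reference, the percentage overhead behind this scale is presumably computed from wall-clock times in the usual way; the slide does not spell out the exact formula, so this is a sketch:

    \[
      \text{overhead (\%)} =
      \frac{T_{\text{instrumented}} - T_{\text{uninstrumented}}}{T_{\text{uninstrumented}}} \times 100
    \]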

69 11 July 2005 Multiple Analyses/Views
Description
  - Different ways tool presents data to user
  - Different analyses available from within tool
Importance rating
  - Critical: tools must provide enough ways of looking at data so that users may track down performance problems
Rating strategy
  - Score based on relative number of views and analyses provided by each tool
  - Approximately one point for each different view and analysis provided by tool

70 11 July 2005 Profiling/Tracing Support
Description
  - Low-overhead profile mode offered by tool
  - Comprehensive event trace offered by tool
Importance rating
  - Critical: profile mode useful for quick analysis; trace mode necessary for examining what really happens during execution
Rating strategy
  - Two points if a profiling mode is available
  - Two points if a tracing mode is available
  - One extra point if trace file size is within a few percent of best trace file size across all tools

71 11 July 2005 Response Time
Description
  - How much time is needed to get data from tool
Importance rating
  - Average: user should not have to wait an extremely long time for data, but high-quality information should always be the first goal of tools
Rating strategy
  - Score is based on relative time taken to get performance data from tool
  - Tools that perform complicated post-mortem analyses or bottleneck detection receive lower scores
  - Tools that provide data while program is running receive five points

72 11 July 2005 Source Code Correlation
Description
  - How well tool relates performance data back to original source code
Importance rating
  - Critical: necessary to see which statements and regions of code are causing performance problems
Rating strategy
  - Four to five points if tool supports source correlation to function or line level
  - One to three points if tool supports indirect method of attributing data to functions or source lines
  - Zero points if tool does not provide enough data to map performance metrics back to source code

73 11 July 2005 Stability
Description
  - How likely tool is to crash while under use
Importance rating
  - Important: unstable tools will frustrate users and decrease productivity
Rating strategy
  - Scored using relative ratings (subjective characteristic)
  - Score takes into account:
    - Number of crashes experienced during evaluation
    - Severity of crashes
    - Number of bugs encountered

74 11 July 2005 Technical Support
Description
  - How quickly responses are received from tool developers or support departments
  - Quality of information and helpfulness of responses
Importance rating
  - Average: important for users during installation and initial use of tool, but becomes less important as time goes on
Rating strategy
  - Relative rating based on personal communication with our contacts for each tool (subjective characteristic)
  - Timely, informative responses result in four or more points

75 11 July 2005 Portability Characteristics

76 11 July 2005 Extensibility
Description
  - How easily tool may be extended to support UPC and SHMEM
Importance rating
  - Critical: tools that cannot be extended for UPC and SHMEM are almost useless for us
Rating strategy
  - Commercial tools receive zero points
    - Regardless of whether export or import functionality is available
    - Interoperability covered by another characteristic
  - Subjective score based on functionality provided by tool
  - Also incorporates quality of code (after quick review)

77 11 July 2005 Hardware Support
Description
  - Number and depth of hardware platforms supported
Importance rating
  - Critical: essential for portability
Rating strategy
  - Based on our estimate of important architectures for UPC and SHMEM
  - Award one point for support of each of the following architectures:
    - IBM SP (AIX)
    - IBM BlueGene/L
    - AlphaServer (Tru64)
    - Cray X1/X1E (UnicOS)
    - Cray XD1 (Linux w/ Cray proprietary interconnect)
    - SGI Altix (Linux w/ NUMALink)
    - Generic 64-bit Opteron/Itanium Linux cluster

78 11 July 2005 Heterogeneity
Description
  - Tool support for running programs across different architectures within a single run
Importance rating
  - Minor: not very useful on shared-memory machines
Rating strategy
  - Five points if heterogeneity is supported
  - Zero points if heterogeneity is not supported

79 11 July 2005 Software Support
Description
  - Number of languages, libraries, and compilers supported
Importance rating
  - Important: should support many compilers and not hinder library support, but hardware support and extensibility are more important
Rating strategy
  - Score based on relative number of languages, libraries, and compilers supported compared with other tools
  - Tools that instrument or record data for existing closed-source libraries receive an extra point (up to max of five points)

80 11 July 2005 Scalability Characteristics

81 11 July 2005 Filtering and Aggregation
Description
  - How well the tool lets users simplify and summarize the data being displayed
Importance rating
  - Critical: necessary for users to effectively work with the large data sets generated by performance tools
Rating strategy
  - Scored using relative ratings (slightly subjective characteristic)
  - Tools that provide many different ways of filtering and aggregating data receive higher scores

82 11 July 2005 Multiple Executions
Description
  - Support for relating and comparing performance information from different runs
  - Examples: automated display of speedup charts; differences between time taken for methods using different algorithms or variants of a single algorithm
Importance rating
  - Critical: important for doing scalability analysis
Rating strategy
  - Five points if tool supports relating data from different runs
  - Zero points if not

83 11 July 2005 Performance Bottleneck Detection
Description
  - How well tool identifies each known (and unknown) bottleneck in our test suite
Importance rating
  - Critical: bottleneck detection is the most important function of a performance tool
Rating strategy
  - Score proportional to the number of PASS ratings given for test suite programs
  - Slightly subjective characteristic; requires judging whether the user could determine the bottleneck from the data provided by the tool

84 11 July 2005 Searching
Description
  - Ability of the tool to search for particular information or events
Importance rating
  - Minor: can be useful, but difficult to provide users with a powerful search that is user-friendly
Rating strategy
  - Five points if searching is supported
    - Points deducted if only simple search available
  - Zero points if no search functionality

85 11 July 2005 Miscellaneous Characteristics

86 11 July 2005 Cost
Description
  - How much (per seat) the tool costs to use
Importance rating
  - Important: tools that are prohibitively expensive reduce overall availability of tool
Rating strategy
  - Scale based on per-seat cost:
    - Free: five points
    - $1.00 to $499.99: four points
    - $500.00 to $999.99: three points
    - $1,000.00 to $1,999.99: two points
    - $2,000.00 or more: one point

87 11 July 2005 Interoperability
Description
  - How well the tool works and integrates with other performance tools
Importance rating
  - Important: tools lacking in areas like trace visualization can make up for it by exporting data that other tools can understand (also helpful for getting data from 3rd-party sources)
Rating strategy
  - Zero points if data cannot be imported or exported from tool
  - One point for export of data in a simple ASCII format
  - Additional points (up to five) for each additional format the tool can export or import

