Briefing on Tool Evaluations
11 July 2005
Professor Alan D. George, Principal Investigator
Mr. Hung-Hsun Su, Sr. Research Assistant
Mr. Adam Leko, Sr. Research Assistant
Mr. Bryan Golden, Research Assistant
Mr. Hans Sherburne, Research Assistant
HCS Research Laboratory, University of Florida
PAT
Purpose & Methodology
Purpose of Evaluations
- Investigate performance analysis methods used in existing tools
- Determine what features are necessary for tools to be effective
- Examine usability of tools
- Find out what performance factors existing tools focus on
- Create a standardized evaluation strategy and apply it to popular existing tools
- Gather information about tool extensibility
- Generate a list of reusable components from existing tools
- Identify any tools that may serve as a basis for our SHMEM and UPC performance tool
- Take the best candidates for extension and gain experience modifying them to support new features
Evaluation Methodology
- Generate a list of desirable characteristics for performance tools
  - Categorize based on influence on a tool's usability/productivity, portability, scalability, or miscellaneous qualities
  - Will present the list of characteristics and actual scores in later slides
- Assign an importance rating to each: Minor (not really important), Average (nice to have), Important (should include), Critical (absolutely needed)
- Formulate a scoring strategy for each
  - Give numerical scores 1-5 (5 best); 0 means not applicable
  - Create objective scoring criteria where possible
  - Use relative scores for subjective categories
Performance Tool Test Suite
- Method used to ensure subjective scores are consistent across each tool
- Also used to determine the effectiveness of a performance tool
- Includes:
  - Suite of C MPI microbenchmarks with specific performance problems: PPerfMark [1,2], based on Grindstone [3]
  - Large-scale program: NAS NPB LU benchmark [4]
  - "Control" program with good parallel efficiency to test for false positives: CAMEL cryptanalysis C MPI implementation (HCS lab)
- For each program in the test suite, assign:
  - FAIL: tool was unable to provide information to identify the bottleneck
  - TOSS-UP: tool indicated a bottleneck was occurring, but the user must be clever to find and fix it
  - PASS: tool clearly showed where the bottleneck was occurring and gave enough information for a competent user to fix it
Performance Tool Test Suite (2)
What should the performance tool tell us?
- CAMEL
  - No communication bottlenecks; CPU-bound code
  - Performance could be improved by using non-blocking MPI calls
- LU
  - Large number of small messages
  - Dependence on network bandwidth and latency
  - Identify which routines take the most time
Performance Tool Test Suite (3)
Microbenchmark kernels (a minimal MPI ping-pong sketch follows this slide):
- Big message: several large messages sent; dependence on network bandwidth
- Intensive server: first node overloaded with work
- Ping-pong: many small messages; overall execution time dependent on network latency
- Random barrier: one node holds up the barrier; one procedure responsible for the slow node's behavior
- Small messages: one node is bombarded with lots of messages
- Wrong way: point-to-point messages sent in the wrong order
- System time: most time spent in system calls
- Diffuse procedure: similar to random barrier; one node holds up the barrier, but time for the slow procedure is "diffused" across several nodes in round-robin fashion
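To make the flavor of these microbenchmarks concrete, here is a minimal sketch (not the actual PPerfMark/Grindstone source) of a latency-bound ping-pong kernel of the kind listed above, written as a C MPI program like the rest of the test suite; the iteration count and message size are illustrative.

```c
/* Minimal ping-pong sketch: ranks 0 and 1 exchange many 1-byte messages,
 * so total runtime is dominated by network latency, not bandwidth. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, i;
    char byte = 0;
    const int iterations = 100000;   /* assumed iteration count */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double t0 = MPI_Wtime();
    for (i = 0; i < iterations; i++) {
        if (rank == 0) {
            MPI_Send(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&byte, 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&byte, 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    if (rank == 0)
        printf("average round trip: %g us\n",
               (MPI_Wtime() - t0) / iterations * 1e6);

    MPI_Finalize();
    return 0;
}
```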
Overview of Tools Evaluated
List of Tools Evaluated
Profiling tools:
- TAU (Univ. of Oregon)
- mpiP (ORNL, LLNL)
- HPCToolkit (Rice Univ.)
- SvPablo (Univ. of Illinois, Urbana-Champaign)
- DynaProf (Univ. of Tennessee, Knoxville)
Tracing tools:
- Intel Cluster Tools (Intel)
- MPE/Jumpshot (ANL)
- Dimemas & Paraver (European Ctr. for Parallelism of Barcelona)
- MPICL/ParaGraph (Univ. of Illinois, Univ. of Tennessee, ORNL)
List of Tools Evaluated (2)
Other tools:
- KOJAK (Forschungszentrum Jülich, ICL @ UTK)
- Paradyn (Univ. of Wisconsin, Madison)
Also quickly reviewed:
- CrayPat/Apprentice2 (Cray)
- DynTG (LLNL)
- AIMS (NASA)
- Eclipse Parallel Tools Platform (LANL)
- Open|SpeedShop (SGI)
Profiling Tools
Tuning and Analysis Utilities (TAU)
- Developer: University of Oregon
- Current versions: TAU 2.14.4; Program Database Toolkit 3.3.1
- Website: http://www.cs.uoregon.edu/research/paracomp/tau/tautools/
- Contact: Sameer Shende, sameer@cs.uoregon.edu
TAU Overview
Measurement mechanisms:
- Source (manual) (see the instrumentation sketch after this slide)
- Source (automatic via PDToolkit)
- Binary (DynInst)
Key features:
- Supports both profiling and tracing
  - No built-in trace viewer
  - Generic export utility for trace files (.vtf, .slog2, .alog)
- Many supported architectures
- Many supported languages: C, C++, Fortran, Python, Java, SHMEM (TurboSHMEM and Cray SHMEM), OpenMP, MPI, Charm
- Hardware counter support via PAPI
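For reference, the following is a minimal sketch of what TAU-style manual source instrumentation looks like in C, based on TAU's documented TAU_PROFILE / TAU_PROFILE_TIMER macro API; the routine names and workload are illustrative, and the exact macros available may vary by TAU version.

```c
/* Minimal sketch of TAU manual source instrumentation (compile against
 * TAU's stub makefiles or compiler scripts so TAU.h and the runtime
 * library are found).  Macros shown follow TAU's documented C API. */
#include <TAU.h>

void compute(int n)
{
    /* declare and start a timer covering this routine */
    TAU_PROFILE_TIMER(t, "compute()", "", TAU_USER);
    TAU_PROFILE_START(t);

    volatile double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += i * 0.5;               /* stand-in for real work */

    TAU_PROFILE_STOP(t);
}

int main(int argc, char **argv)
{
    TAU_PROFILE_INIT(argc, argv);     /* initialize TAU's measurement library */
    TAU_PROFILE_SET_NODE(0);          /* required for non-MPI programs */
    TAU_PROFILE("main()", "", TAU_DEFAULT);
    compute(1000000);
    return 0;
}
```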
TAU Visualizations
mpiP
- Developer: ORNL, LLNL
- Current version: mpiP v2.8
- Website: http://www.llnl.gov/CASC/mpip/
- Contacts: Jeffrey Vetter, vetterjs@ornl.gov; Chris Chambreau, chcham@llnl.gov
mpiP Overview
Measurement mechanism:
- Profiling via the MPI profiling interface
Key features:
- Simple, lightweight profiling
- Source code correlation (facilitated by mpipview)
- Gives profile information for MPI call sites
- Uses the PMPI interface with extra libraries (libelf, libdwarf, libunwind) to do source correlation (see the wrapper sketch after this slide)
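Since several tools in this briefing (mpiP, MPE, the Intel Trace Collector, KOJAK, MPICL) rely on the MPI profiling interface, the sketch below shows the basic PMPI interception pattern: the tool defines MPI_Send itself, records what it needs, and forwards to the real implementation through PMPI_Send. The bookkeeping shown is illustrative, not mpiP's actual code; MPI-3-style const prototypes are assumed.

```c
/* Minimal PMPI wrapper sketch: intercept MPI_Send, accumulate time and
 * call counts, and report them when MPI_Finalize is intercepted. */
#include <mpi.h>
#include <stdio.h>

static double send_time  = 0.0;   /* accumulated time spent in MPI_Send */
static long   send_calls = 0;

int MPI_Send(const void *buf, int count, MPI_Datatype type,
             int dest, int tag, MPI_Comm comm)
{
    double t0 = PMPI_Wtime();
    int rc = PMPI_Send(buf, count, type, dest, tag, comm);
    send_time += PMPI_Wtime() - t0;
    send_calls++;
    return rc;
}

int MPI_Finalize(void)
{
    fprintf(stderr, "MPI_Send: %ld calls, %.3f s total\n",
            send_calls, send_time);
    return PMPI_Finalize();
}
```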
mpiP Source Code Browser
HPCToolkit
- Developer: Rice University
- Current version: HPCToolkit v1.1
- Website: http://www.hipersoft.rice.edu/hpctoolkit/
- Contacts: John Mellor-Crummey, johnmc@cs.rice.edu; Rob Fowler, rjf@cs.rice.edu
HPCToolkit Overview
Measurement mechanism:
- Hardware counters (requires PAPI on Linux)
Key features:
- Creates hardware counter profiles for any executable via sampling; no instrumentation necessary
- Relies on PAPI overflow events and program counter values to relate PAPI metrics to source code (see the sketch after this slide)
- Source code correlation of performance data, even for optimized code
- Navigation pane in viewer assists in locating resource-consuming functions
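The sketch below illustrates the PAPI overflow mechanism that this style of sampling depends on: PAPI invokes a handler every N occurrences of a hardware event and passes the interrupted program counter, which a profiler can bucket by function or source line. This is illustrative only, not HPCToolkit's implementation; the threshold and workload are arbitrary.

```c
/* Minimal PAPI overflow-sampling sketch: sample the program counter
 * every THRESHOLD total cycles via a PAPI overflow handler. */
#include <papi.h>
#include <stdio.h>
#include <stdlib.h>

#define THRESHOLD 1000000   /* sample every million cycles (assumed) */

static void handler(int event_set, void *address,
                    long long overflow_vector, void *context)
{
    /* a real profiler would bucket `address` into a PC histogram */
    fprintf(stderr, "sample at PC %p\n", address);
}

int main(void)
{
    int event_set = PAPI_NULL;

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        exit(1);
    PAPI_create_eventset(&event_set);
    PAPI_add_event(event_set, PAPI_TOT_CYC);
    PAPI_overflow(event_set, PAPI_TOT_CYC, THRESHOLD, 0, handler);
    PAPI_start(event_set);

    /* ... the code being profiled ... */
    volatile double x = 0.0;
    for (long i = 0; i < 200000000L; i++) x += 1.0;

    long long count;
    PAPI_stop(event_set, &count);
    return 0;
}
```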
HPCToolkit Source Browser
SvPablo
- Developer: University of Illinois
- Current versions: SvPablo 6.0; SDDF component 5.5; Trace Library component 5.1.4
- Website: http://www.renci.unc.edu/Software/Pablo/pablo.htm
- Contact: ?
SvPablo Overview
Measurement mechanism:
- Profiling via source code instrumentation
Key features:
- Single GUI integrates instrumentation and performance data display
- Assisted source code instrumentation
- Management of multiple instances of instrumented source code and the corresponding performance data
- Simplified scalability analysis of performance data from multiple runs
SvPablo Visualization
DynaProf
- Developer: Philip Mucci (UTK)
- Current versions: DynaProf CVS as of 2/21/2005; DynInst API v4.1.1 (dependency); PAPI v3.0.7 (dependency)
- Website: http://www.cs.utk.edu/~mucci/dynaprof/
- Contact: Philip Mucci, mucci@cs.utk.edu
DynaProf Overview
Measurement mechanism:
- Profiling via PAPI and DynInst
Key features:
- Simple, gdb-like command line interface
- No instrumentation step needed; binary instrumentation happens at runtime
- Produces simple text-based profile output, similar to gprof, for PAPI metrics, wallclock time, and CPU time (getrusage)
Tracing Tools
Intel Trace Collector/Analyzer
- Developer: Intel
- Current versions: Intel Trace Collector 5.0.1.0; Intel Trace Analyzer 4.0.3.1
- Website: http://www.intel.com/software/products/cluster
- Contact: http://premier.intel.com
Intel Trace Collector/Analyzer Overview
Measurement mechanism:
- MPI profiling interface for MPI programs
- Static binary instrumentation (proprietary method)
Key features:
- Simple, straightforward operation
- Comprehensive set of visualizations
- Source code correlation pop-up dialogs
- Views are linked, allowing analysis of specific portions/phases of the execution trace
Intel Trace Analyzer Visualizations
MPE/Jumpshot
- Developer: Argonne National Laboratory
- Current versions: MPE 1.26; Jumpshot-4
- Website: http://www-unix.mcs.anl.gov/perfvis/
- Contacts: Anthony Chan, chan@mcs.anl.gov; David Ashton, ashton@mcs.anl.gov; Rusty Lusk, lusk@mcs.anl.gov; William Gropp, gropp@mcs.anl.gov
MPE/Jumpshot Overview
Measurement mechanism:
- MPI profiling interface for MPI programs
Key features:
- Distributed with MPICH
- Easy to generate traces of MPI programs: compile with mpicc -mpilog
- Scalable logfile format for efficient visualization
- Java-based timeline viewer with extensive scrolling and zooming support
Jumpshot Visualization
CEPBA Tools (Dimemas, Paraver)
- Developer: European Center for Parallelism of Barcelona
- Current versions: MPITrace 1.1; Paraver 3.3; Dimemas 2.3
- Website: http://www.cepba.upc.es/tools_i.htm
- Contact: Judit Gimenez, judit@cepba.upc.edu
Dimemas/Paraver Overview
Measurement mechanism:
- MPI profiling interface
Key features:
- Paraver
  - Sophisticated trace file viewer; uses a "tape" metaphor
  - Supports displaying hardware counter metrics alongside the trace visualization
  - Modular software architecture, very customizable
- Dimemas
  - Trace-driven simulator
  - Uses simple models for real hardware
  - Generates "predictive traces" that can be viewed in Paraver
Paraver Visualizations
Paraver Visualizations (2)
MPICL/ParaGraph
- Developers: ParaGraph: University of Illinois, University of Tennessee; MPICL: ORNL
- Current versions: ParaGraph (no version number, but last available update 1999); MPICL 2.0
- Websites: http://www.csar.uiuc.edu/software/paragraph/ and http://www.csm.ornl.gov/picl/
- Contacts: ParaGraph: Michael Heath, heath@cs.uiuc.edu; Jennifer Finger. MPICL: Patrick Worley, worleyph@ornl.gov
MPICL/ParaGraph Overview
Measurement mechanism:
- MPI profiling interface
- Other wrapper libraries for obsolete vendor-specific message-passing libraries
Key features:
- Large number of different visualizations (about 27)
- Several types: utilization visualizations, communication visualizations, "task" visualizations, and other visualizations
ParaGraph Visualizations: Utilization
ParaGraph Visualizations: Communication
Other Tools
KOJAK
- Developer: Forschungszentrum Jülich, ICL @ UTK
- Current versions: stable: KOJAK v2.0; development: KOJAK v2.1b1
- Websites: http://icl.cs.utk.edu/kojak/ and http://www.fz-juelich.de/zam/kojak/
- Contacts: Felix Wolf, fwolf@cs.utk.edu; Bernd Mohr, b.mohr@fz-juelich.de; generic email: kojak@cs.utk.edu
KOJAK Overview
Measurement mechanism:
- MPI profiling interface
- Binary instrumentation on a few platforms
Key features:
- Generates and analyzes trace files
- Automatic classification of bottlenecks
- Simple, scalable profile viewer with source correlation
- Exports traces to Vampir format
KOJAK Visualization
Paradyn
- Developer: University of Wisconsin, Madison
- Current versions: Paradyn 4.1.1; DynInst 4.1.1; KernInst 2.0.1
- Website: http://www.paradyn.org/index.html
- Contact: Matthew Legendre, legendre@cs.wisc.edu
Paradyn Overview
Measurement mechanism:
- Dynamic binary instrumentation
Key features:
- Dynamic instrumentation at runtime; no separate instrumentation phase
- Visualizes user-selectable metrics while the program is running
- Automatic performance bottleneck detection via the Performance Consultant
- Users can define their own metrics using a Tcl-like language
- All analysis happens while the program is running
Paradyn Visualizations
Paradyn Performance Consultant
Evaluation Ratings
Scoring System
- Scores given for each category: usability/productivity, portability, scalability, miscellaneous
- Scoring formula (used to generate the score for each category): a weighted sum of characteristic scores based on each characteristic's importance
  - Importance multipliers: Critical 1.0, Important 0.75, Average 0.5, Minor 0.25
- Overall score is the sum of all category scores
(A sketch of the category-score computation follows this slide.)
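A minimal sketch of the weighted-sum scoring described above, assuming a simple characteristic/score structure; the example characteristics and scores are hypothetical, not taken from the actual evaluation data.

```c
/* Each characteristic's 0-5 score is multiplied by its importance weight
 * (Critical 1.0, Important 0.75, Average 0.5, Minor 0.25) and the products
 * are summed to give the category score. */
#include <stdio.h>

enum importance { MINOR, AVERAGE, IMPORTANT, CRITICAL };

static const double multiplier[] = { 0.25, 0.5, 0.75, 1.0 };

struct characteristic {
    const char     *name;
    enum importance importance;
    int             score;      /* 1-5, or 0 for "not applicable" */
};

static double category_score(const struct characteristic *c, int n)
{
    double total = 0.0;
    for (int i = 0; i < n; i++)
        total += multiplier[c[i].importance] * c[i].score;
    return total;
}

int main(void)
{
    /* hypothetical portability scores for one tool */
    struct characteristic portability[] = {
        { "Extensibility",         CRITICAL,  4 },
        { "Hardware support",      CRITICAL,  3 },
        { "Software support",      IMPORTANT, 5 },
        { "Heterogeneity support", MINOR,     0 },
    };
    printf("portability score: %.2f\n", category_score(portability, 4));
    return 0;
}
```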
Characteristics: Portability, Miscellaneous, and Scalability Categories
Portability:
- Critical: Extensibility, Hardware support
- Important: Software support
- Minor: Heterogeneity support
Miscellaneous:
- Important: Cost, Interoperability
Scalability:
- Critical: Filtering and aggregation, Multiple executions, Performance bottleneck identification
- Minor: Searching
Note: see the appendix for details on how scores were assigned for each characteristic.
Characteristics: Usability/Productivity Category
Usability/productivity:
- Critical: Available metrics, Learning curve, Multiple analyses/views, Profiling/tracing support, Source code correlation
- Important: Documentation, Manual (user) overhead, Measurement accuracy, Stability
- Average: Response time, Technical support
- Minor: Installation
Usability/Productivity Scores
Portability Scores
Scalability Scores
Miscellaneous Scores
Overall Scores
Extensibility Study & Demo
Question: should we write a new tool from scratch, or reuse an existing tool?
- To help answer this, we added preliminary support for GPSHMEM to two tools:
  - Picked the top candidate tools for extension, KOJAK and TAU (based on portability scores)
  - Added weak-binding support for GPSHMEM (GCC only)
  - Created simple GPSHMEM wrapper libraries for KOJAK and TAU (see the weak-binding sketch after this slide)
  - Will study creating comparable components from scratch in the near future
- Notes/caveats:
  - No advanced analyses for one-sided memory operations are available in either TAU or KOJAK; only simple support was added. Analyzing one-sided operations is difficult.
  - GPSHMEM requires source patches for weak-binding support and currently works only with GCC compilers
  - Adding UPC support to these tools would require orders of magnitude more work
(Demo)
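A minimal sketch of the weak-binding wrapper idea used in the GPSHMEM experiment, assuming GCC's #pragma weak alias support: the patched library exposes its real entry point under a p-prefixed name and makes the public name a weak alias, so a separately linked wrapper library can define the strong symbol, record an event, and forward. The function names, the p-prefix convention, and the event-recording call are hypothetical; this is not the actual GPSHMEM patch or the KOJAK/TAU wrapper code.

```c
/* === file 1 (hypothetical): patched communication library ============ */
void pshmem_int_put(int *dest, const int *src, int nelems, int pe)
{
    /* ... real one-sided put implementation ... */
    (void)dest; (void)src; (void)nelems; (void)pe;
}
/* GCC weak alias: callers bind here unless a wrapper library
 * provides a strong definition of shmem_int_put */
#pragma weak shmem_int_put = pshmem_int_put

/* === file 2 (hypothetical): profiling wrapper library ================ */
#include <stdio.h>

void pshmem_int_put(int *dest, const int *src, int nelems, int pe);

void shmem_int_put(int *dest, const int *src, int nelems, int pe)
{
    /* record an event for the measurement system (illustrative only) */
    fprintf(stderr, "shmem_int_put: %d ints to PE %d\n", nelems, pe);
    pshmem_int_put(dest, src, nelems, pe);   /* forward to the real routine */
}
```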
Q&A
References
[1] Kathryn Mohror and Karen L. Karavanic, "Performance Tool Support for MPI-2 on Linux," SC2004, Pittsburgh, PA, November 2004.
[2] Kathryn Mohror and Karen L. Karavanic, "Performance Tool Support for MPI-2 on Linux," PSU CS Department Technical Report, April 2004.
[3] Jeffrey K. Hollingsworth and Michael Steele, "Grindstone: A Test Suite for Parallel Performance Tools," Technical Report CS-TR-3703, University of Maryland, October 1996.
[4] David Bailey, Tim Harris, William Saphir, Rob van der Wijngaart, Alex Woo, and Maurice Yarrow, "The NAS Parallel Benchmarks 2.0," Technical Report NAS-95-020, NASA, December 1995.
Appendix: Tool Characteristics Used in Evaluations
Usability/Productivity Characteristics
Available Metrics
- Description: depth of metrics provided by the tool; examples include communication statistics or events and hardware counters
- Importance: Critical; users must be able to obtain representative performance data to debug performance problems
- Rating strategy: relative ratings (subjective characteristic); compare the tool's available metrics with those provided by other tools
Documentation Quality
- Description: quality of documentation provided, including user's manuals, READMEs, and "quick start" guides
- Importance: Important; can have a large effect on overall usability
- Rating strategy: relative ratings (subjective characteristic); correlated with how long it takes to decipher the documentation well enough to use the tool; tools with quick-start guides or clear, concise high-level documentation receive higher scores
Installation
- Description: measure of the time needed for installation; also incorporates the level of expertise necessary to perform the installation
- Importance: Minor; installation only needs to be done once and may not even be done by the end user
- Rating strategy: relative ratings based on the mean installation time across all tools; all tools were installed by a single person with significant system administration experience
Learning Curve
- Description: difficulty associated with learning to use the tool effectively
- Importance: Critical; tools perceived by users as too difficult to operate will be avoided
- Rating strategy: relative ratings (subjective characteristic); based on the time needed to get acquainted with all features required for day-to-day operation of the tool
Manual Overhead
- Description: amount of user effort needed to instrument their code
- Importance: Important; the tool must not create more work for the user in the end (it should reduce time!)
- Rating strategy: use a hypothetical test case, an MPI program of ~2.5 kloc in 20 .c files with 50 user functions, and score one point for each of the following actions that can be completed on a fresh copy of the source code in an estimated 10 minutes:
  - Instrument all MPI calls
  - Instrument all functions
  - Instrument five arbitrary functions
  - Instrument all loops, or a subset of loops
  - Instrument all function call sites, or a subset of call sites (about 35)
Measurement Accuracy
- Description: how much runtime instrumentation overhead the tool imposes
- Importance: Important; inaccurate data may lead to an incorrect diagnosis, which creates more work for the user with no benefit
- Rating strategy: use a standard application (the CAMEL MPI program) and score based on the runtime overhead of the instrumented executable (wallclock time):
  - 0-4%: five points
  - 5-9%: four points
  - 10-14%: three points
  - 15-19%: two points
  - 20% or greater: one point
Multiple Analyses/Views
- Description: different ways the tool presents data to the user; different analyses available from within the tool
- Importance: Critical; tools must provide enough ways of looking at the data that users can track down performance problems
- Rating strategy: score based on the relative number of views and analyses provided by each tool; approximately one point for each distinct view or analysis
Profiling/Tracing Support
- Description: low-overhead profile mode offered by the tool; comprehensive event trace offered by the tool
- Importance: Critical; profile mode is useful for quick analysis, and trace mode is necessary for examining what really happens during execution
- Rating strategy: two points if a profiling mode is available; two points if a tracing mode is available; one extra point if trace file size is within a few percent of the best trace file size across all tools
Response Time
- Description: how much time is needed to get data from the tool
- Importance: Average; the user should not have to wait an extremely long time for data, but high-quality information should always be the first goal of a tool
- Rating strategy: score based on the relative time taken to get performance data from the tool; tools that perform complicated post-mortem analyses or bottleneck detection receive lower scores; tools that provide data while the program is running receive five points
Source Code Correlation
- Description: how well the tool relates performance data back to the original source code
- Importance: Critical; necessary to see which statements and regions of code are causing performance problems
- Rating strategy: four to five points if the tool supports source correlation at the function or line level; one to three points if the tool supports an indirect method of attributing data to functions or source lines; zero points if the tool does not provide enough data to map performance metrics back to source code
Stability
- Description: how likely the tool is to crash while in use
- Importance: Important; unstable tools frustrate users and decrease productivity
- Rating strategy: relative ratings (subjective characteristic); the score takes into account the number of crashes experienced during evaluation, their severity, and the number of bugs encountered
Technical Support
- Description: how quickly responses are received from tool developers or support departments; quality of information and helpfulness of responses
- Importance: Average; important for users during installation and initial use of the tool, but becomes less important as time goes on
- Rating strategy: relative rating based on personal communication with our contacts for each tool (subjective characteristic); timely, informative responses result in four or more points
Portability Characteristics
Extensibility
- Description: how easily the tool may be extended to support UPC and SHMEM
- Importance: Critical; tools that cannot be extended for UPC and SHMEM are almost useless for us
- Rating strategy: commercial tools receive zero points, regardless of whether export or import functionality is available (interoperability is covered by another characteristic); otherwise a subjective score based on the functionality provided by the tool, also incorporating code quality (after a quick review)
Hardware Support
- Description: number and depth of hardware platforms supported
- Importance: Critical; essential for portability
- Rating strategy: based on our estimate of important architectures for UPC and SHMEM; award one point for support of each of the following:
  - IBM SP (AIX)
  - IBM BlueGene/L
  - AlphaServer (Tru64)
  - Cray X1/X1E (UNICOS)
  - Cray XD1 (Linux with Cray proprietary interconnect)
  - SGI Altix (Linux with NUMAlink)
  - Generic 64-bit Opteron/Itanium Linux cluster
Heterogeneity
- Description: tool support for running programs across different architectures within a single run
- Importance: Minor; not very useful on shared-memory machines
- Rating strategy: five points if heterogeneity is supported; zero points if it is not
Software Support
- Description: number of languages, libraries, and compilers supported
- Importance: Important; should support many compilers and not hinder library support, but hardware support and extensibility are more important
- Rating strategy: score based on the relative number of languages, libraries, and compilers supported compared with other tools; tools that can instrument or record data for existing closed-source libraries receive an extra point (up to a maximum of five points)
Scalability Characteristics
Filtering and Aggregation
- Description: how well the tool lets users simplify and summarize the data being displayed
- Importance: Critical; necessary for users to work effectively with the large data sets generated by performance tools
- Rating strategy: relative ratings (slightly subjective characteristic); tools that provide many different ways of filtering and aggregating data receive higher scores
Multiple Executions
- Description: support for relating and comparing performance information from different runs; examples include automated display of speedup charts and differences in time taken by methods using different algorithms or variants of a single algorithm
- Importance: Critical; important for doing scalability analysis
- Rating strategy: five points if the tool supports relating data from different runs; zero points if not
Performance Bottleneck Detection
- Description: how well the tool identifies each known (and unknown) bottleneck in our test suite
- Importance: Critical; bottleneck detection is the most important function of a performance tool
- Rating strategy: score proportional to the number of PASS ratings given for test suite programs; slightly subjective, since we have to judge whether a user could determine the bottleneck from the data provided by the tool
Searching
- Description: ability of the tool to search for particular information or events
- Importance: Minor; can be useful, but it is difficult to provide users with a powerful search that is also user-friendly
- Rating strategy: five points if searching is supported; points deducted if only a simple search is available; zero points if there is no search functionality
Miscellaneous Characteristics
Cost
- Description: how much the tool costs to use (per seat)
- Importance: Important; tools that are prohibitively expensive reduce overall availability of the tool
- Rating strategy: scale based on per-seat cost:
  - Free: five points
  - $1.00 to $499.99: four points
  - $500.00 to $999.99: three points
  - $1,000.00 to $1,999.99: two points
  - $2,000.00 or more: one point
Interoperability
- Description: how well the tool works and integrates with other performance tools
- Importance: Important; tools lacking in areas like trace visualization can make up for it by exporting data that other tools can understand (also helpful for getting data from third-party sources)
- Rating strategy: zero points if data cannot be imported or exported; one point for export of data in a simple ASCII format; additional points (up to five) for each format the tool can export to and import from