Slide 1: Parallel Performance Analysis with Open|SpeedShop - Half Day Tutorial @ SC 2008, Austin, TX
Tutorial @ SC2008, Austin, TX Slide 2: What is Open|SpeedShop?
- Comprehensive open source performance analysis framework: combining profiling and tracing; common workflow for all experiments; flexible instrumentation; extensibility through plugins
- Partners: DOE/NNSA Tri-Labs (LLNL, LANL, SNL); Krell Institute; Universities of Wisconsin and Maryland
Tutorial @ SC2008, Austin, TX Slide 3: Highlights
- Open source performance analysis tool framework: most common performance analysis steps all in one tool; extensible by using plugins for data collection and representation
- Several instrumentation options: all work on unmodified application binaries; offline and online data collection / attach to running applications
- Flexible and easy to use: user access through GUI, command line, and Python scripting
- Large range of platforms: Linux clusters with x86, IA-64, Opteron, and EM64T CPUs; easier portability with the offline data collection mechanism
- Availability: used at all three ASC labs with lab-size applications; full source available on sourceforge.net
Tutorial @ SC2008, Austin, TX Slide 4: O|SS Target Audience
- Programmers/code teams: use Open|SpeedShop out of the box; powerful performance analysis; ability to integrate O|SS into projects
- Tool developers: single, comprehensive infrastructure; easy deployment of new tools; project/product specific customizations; predefined/custom experiments
Tutorial @ SC2008, Austin, TX Slide 5: Tutorial Goals
- Introduce Open|SpeedShop: basic concepts & terminology; running first examples
- Provide overview of features: sampling & tracing in O|SS; performance comparisons; parallel performance analysis
- Overview of advanced techniques: interactive performance analysis; scripting & Python integration
Tutorial @ SC2008, Austin, TX Slide 6: "Rules"
- Let's keep this interactive: feel free to ask as we go along; online demos as we go along
- Feel free to play along: Live CDs with O|SS installed (for PCs); ask us if you get stuck
- Feedback on O|SS: what is good/missing in the tool? what should be done differently? please report bugs/incompatibilities
Tutorial @ SC2008, Austin, TX Slide 7: Presenters
- Presenters: Martin Schulz (LLNL), Jim Galarowicz (Krell), Don Maghrak (Krell), David Montoya (LANL), Scott Cranford (Sandia)
- Larger team: William Hachfeld (Krell), Samuel Gutierrez (LANL), Joseph Kenny (Sandia NLs), Chris Chambreau (LLNL)
Tutorial @ SC2008, Austin, TX Slide 8: Outline
- Introduction & Overview
- Running a First Experiment
- O|SS Sampling
- Simple Comparisons
- Break (30 minutes)
- I/O Tracing Experiments
- Parallel Performance Analysis
- Installation Requirements and Process
- Advanced Capabilities
Slide 9: Section 1 - Overview & Terminology
Tutorial @ SC2008, Austin, TX Slide 10: Experiment Workflow (diagram) - run the application under an "Experiment", which consists of one or more data "Collectors"; results are stored in an SQL database and can be displayed using several "Views"; the GUI also provides a Process Management Panel.
Tutorial @ SC2008, Austin, TX Slide 11: High-level Architecture (diagram) - user interfaces (GUI, pyO|SS, CLI), experiments, and code instrumentation built on open source software, targeting AMD- and Intel-based clusters/SSI systems running Linux.
Tutorial @ SC2008, Austin, TX Slide 12: Performance Experiments
- Concept of an experiment: what to measure and analyze? experiment chosen by the user; any experiment can be applied to any code
- Consists of Collectors and Views: Collectors define specific data sources (hardware counters, tracing of certain routines); Views specify data aggregation and presentation; multiple collectors per experiment are possible
Tutorial @ SC2008, Austin, TX Slide 13: Experiment Types in O|SS
- Sampling experiments: periodically interrupt the run and record the location; report the statistical distribution of these locations; typically provides a good overview; overhead mostly low and uniform
- Tracing experiments: gather and store individual application events, e.g., function invocations (MPI, I/O, ...); provides detailed, low-level information; higher overhead, potentially bursty
Tutorial @ SC2008, Austin, TX Slide 14: Sampling Experiments (* Updated)
- PC Sampling (pcsamp): record the PC at user-defined time intervals; low-overhead overview of the time distribution
- User Time (usertime): PC sampling plus call stacks for each sample; provides inclusive & exclusive timing data
- Hardware Counters (hwc, hwctime): sample hardware counter overflow events; access to data like cache and TLB misses
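As a minimal illustration (not taken from the slides), the sampling experiments differ only in the experiment name passed to openss; "./myapp" is a placeholder for any unmodified binary:

  openss -offline -f "./myapp" pcsamp      # PC sampling: low-overhead time-per-function overview
  openss -offline -f "./myapp" usertime    # PC sampling plus call stacks (inclusive/exclusive times)
  openss -offline -f "./myapp" hwc         # sample hardware counter overflow events
  openss -offline -f "./myapp" hwctime     # the second hardware counter experiment listed above

Each run writes an .openss database file that can be reopened later (see Section 2).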
Tutorial @ SC2008, Austin, TX Slide 15: Tracing Experiments (* Updated)
- I/O Tracing (io, iot): record invocation of all POSIX I/O events; provides aggregate and individual timings
- MPI Tracing (mpi, mpit, mpiotf): record invocation of all MPI routines; provides aggregate and individual timings
- Floating Point Exception Tracing (fpe): triggered by any FPE caused by the code; helps pinpoint numerical problem areas
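The tracing experiments follow the same pattern; a hedged sketch with the same placeholder application:

  openss -offline -f "./myapp" io      # trace POSIX I/O calls (timings only)
  openss -offline -f "./myapp" iot     # I/O tracing plus parameters and return values
  openss -offline -f "./myapp" mpi     # trace all MPI routine invocations
  openss -offline -f "./myapp" mpit    # MPI tracing with argument capture
  openss -offline -f "./myapp" fpe     # record floating point exceptions

For a parallel run, the quoted command would be the full launcher line (e.g., an mpirun command), as in the parallel examples later in the tutorial.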
Tutorial @ SC2008, Austin, TX Slide 16: Parallel Experiments
- O|SS supports MPI and threaded codes: tested with a variety of MPI implementations; thread support based on POSIX threads; OpenMP only supported through POSIX threads
- Any experiment can be parallel: automatically applied to all tasks/threads; default views aggregate across all tasks/threads; data from individual tasks/threads available
- Specific parallel experiments (e.g., MPI)
Tutorial @ SC2008, Austin, TX Slide 17: High-level Architecture (diagram repeated from Slide 11, highlighting the code instrumentation layer)
Tutorial @ SC2008, Austin, TX Slide 18: Code Instrumentation in O|SS
- Dynamic instrumentation through DPCL: data delivered directly to the tool online; ability to attach to a running application
Tutorial @ SC2008, Austin, TX Slide 19: Initial Solution: DPCL (diagram) - DPCL connects the MPI application directly to O|SS; drawbacks: communication bottleneck and OS limitations.
Tutorial @ SC2008, Austin, TX Slide 20: Code Instrumentation in O|SS
- Dynamic instrumentation through DPCL: data delivered directly to the tool online; ability to attach to a running application
- Offline/external data collection: instrument the application at startup; write data to raw files and convert to O|SS
Tutorial @ SC2008, Austin, TX Slide 21: Offline Data Collection (diagram) - contrasts the DPCL path (MPI application connected directly to O|SS) with the offline path (MPI application writes raw data that O|SS analyzes post-mortem).
Tutorial @ SC2008, Austin, TX Slide 22: Code Instrumentation in O|SS
- Dynamic instrumentation through DPCL: data delivered directly to the tool online; ability to attach to a running application
- Offline/external data collection: instrument the application at startup; write data to raw files and convert to O|SS
- Scalable data collection with MRNet: similar to DPCL, but with a scalable transport layer; ability for interactive online analysis
Tutorial @ SC2008, Austin, TX Slide 23: Hierarchical Online Collection (diagram) - the three collection paths side by side: DPCL (direct connection), offline (post-mortem raw files), and MRNet (hierarchical online collection).
Tutorial @ SC2008, Austin, TX Slide 24: Code Instrumentation in O|SS (recap of the three options above: DPCL dynamic instrumentation, offline/external data collection, and scalable MRNet-based collection with interactive online analysis)
Tutorial @ SC2008, Austin, TX Slide 25: High-level Architecture (diagram repeated from Slide 11)
Tutorial @ SC2008, Austin, TX Slide 26: Three Interfaces
- Experiment commands: expAttach, expCreate, expDetach, expGo, expView
- List commands: listExp, listHosts, listStatus
- Session commands: setBreak, openGui
- Python scripting example:
  import openss
  my_filename = openss.FileList("myprog.a.out")
  my_exptype = openss.ExpTypeList("pcsamp")
  my_id = openss.expCreate(my_filename, my_exptype)
  openss.expGo()
  my_metric_list = openss.MetricList("exclusive")
  my_viewtype = openss.ViewTypeList("pcsamp")
  result = openss.expView(my_id, my_viewtype, my_metric_list)
Tutorial @ SC2008, Austin, TX Slide 27: Summary
- Open|SpeedShop provides comprehensive performance analysis options
- Important terminology: Experiments - types of performance analysis; Collectors - data sources; Views - data presentation and aggregation
- Sampling vs. Tracing: sampling gives overview data at low overhead; tracing gives details, but at higher cost
Slide 28: Section 2 - Running your First Experiment
Tutorial @ SC2008, Austin, TX Slide 29: Running your First Experiment
- What do we mean by an experiment?
- Running a very basic experiment: what does the command syntax look like? what are the outputs from the experiment?
- Viewing and interpreting gathered measurements
- Introduce additional command syntax
Tutorial @ SC2008, Austin, TX Slide 30: What is an Experiment?
- The concept of an experiment: identify the application/executable to profile; identify the type of performance data that is to be gathered; together they form what we call an experiment
- Application/executable: doesn't need to be recompiled, but needs a -g type option to associate gathered data with functions and/or statements
- Type of performance data (metric): sampling based (program counter, call stack, # of events); tracing based (wrap functions and record information)
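As a sketch of that preparation step (compiler, flags, and program name are placeholders, not from the slides), the only requirement is that the binary carries -g debug information before it is handed to openss:

  # build with -g so samples can be mapped back to functions and source lines
  mpicc -g -O2 -o myapp myapp.c
  # run an experiment on the otherwise unmodified binary
  openss -offline -f "./myapp" pcsamp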
Tutorial @ SC2008, Austin, TX Slide 31: Basic experiment syntax
openss -offline -f "executable" pcsamp
- openss is the command to invoke Open|SpeedShop
- -offline indicates the user interface to use (immediate command); there are a number of user interface options
- -f is the option for specifying the executable name; the "executable" can be a sequential or parallel command
- pcsamp indicates what type of performance data (metric) you will gather; here it means we periodically sample the address the program counter is pointing to and associate that address with a function and/or source line; there are several existing performance metric choices
Tutorial @ SC2008, Austin, TX Slide 32: What are the outputs?
Outputs from: openss -offline -f "executable" pcsamp
- Normal program output while the executable is running
- The sorted list of performance information: a list of the top time-taking functions and the corresponding sample-derived time for each function
- A performance information database file: contains all the information needed to view the data at any time in the future without the executable(s) - symbol table information from the executable(s) and system libraries, the performance data openss gathered, and time stamps for when DSOs were loaded and unloaded
Tutorial @ SC2008, Austin, TX Slide 33: Example Run with Output (* Updated)
openss -offline -f "orterun -np 128 sweep3d.mpi" pcsamp
Tutorial @ SC2008, Austin, TX Slide 34: Example Run with Output, continued (* Updated)
openss -offline -f "orterun -np 128 sweep3d.mpi" pcsamp
Tutorial @ SC2008, Austin, TX Slide 35: Using the Database File (* Updated)
- The database file is one of the outputs from running: openss -offline -f "executable" pcsamp
- Use this file to view the data
- How to open the database file with openss: openss -f <database file>; openss (then use menus or wizard to open); openss -cli, then exprestore -f <database file>
- In this example we show both: openss -f X.0.openss (GUI) and openss -cli -f X.0.openss (CLI); X.0.openss is the file name openss creates by default
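Putting these options together, a hedged end-to-end sketch (assuming the database is written under its default name in the working directory; the application is a placeholder):

  openss -offline -f "./myapp" pcsamp    # run the experiment; produces X.0.openss
  openss -f X.0.openss                   # reopen the database later in the GUI
  openss -cli -f X.0.openss              # or reopen it in the command line interface
  # alternatively, start the CLI and restore the database explicitly
  openss -cli
  openss>> exprestore -f X.0.openss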
Tutorial @ SC2008, Austin, TX Slide 36: Output from Example Run (* NEW)
Loading the database file: openss -cli -f X.0.openss
Tutorial @ SC2008, Austin, TX Slide 37: Process Management Panel (* NEW)
openss -f X.0.openss: control your job, focus the stats panel, create process subsets
Tutorial @ SC2008, Austin, TX Slide 38 GUI view of gathered data * Updated
Tutorial @ SC2008, Austin, TX Slide 39 Associate Source & Data * Updated
Tutorial @ SC2008, Austin, TX Slide 40: Additional experiment syntax
- openss -offline -f "executable" pcsamp: -offline indicates the user interface is immediate command mode; uses the offline (LD_PRELOAD) collection mechanism
- openss -cli -f "executable" pcsamp: -cli indicates the user interface is the interactive command line; uses the online (dynamic instrumentation) collection mechanism
- openss -f "executable" pcsamp: no option indicates the graphical user interface; uses the online (dynamic instrumentation) collection mechanism
- openss -batch < input.commands.file: executes from a file of CLI commands
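For side-by-side comparison, a sketch of the four invocation modes with a placeholder binary:

  openss -offline -f "./myapp" pcsamp     # immediate mode, offline (LD_PRELOAD) collection
  openss -cli -f "./myapp" pcsamp         # interactive command line, online (dynamic instrumentation) collection
  openss -f "./myapp" pcsamp              # GUI, online (dynamic instrumentation) collection
  openss -batch < input.commands.file     # batch mode driven by a file of CLI commands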
Tutorial @ SC2008, Austin, TX Slide 41 Demo of Basic Experiment Run Program Counter Sampling (pcsamp) on a sequential application.
Slide 42: Section 3 - Sampling Experiments
Tutorial @ SC2008, Austin, TX Slide 43: Sampling Experiments
- PC Sampling: approximates CPU time for line and function; no call stacks
- User Time: inclusive vs. exclusive time; includes call stacks
- HW Counters: samples hardware counter overflows
Tutorial @ SC2008, Austin, TX Slide 44: Sampling - Considerations
Sampling records a statistical subset of all events: low overhead; low perturbation; good to get an overview / find hotspots
Tutorial @ SC2008, Austin, TX Slide 45: Example 1 - Offline usertime experiment on smg2000
[samuel@yra084 test]$ openss -cli
Welcome to OpenSpeedShop 1.6
openss>> RunOfflineExp("mpirun -np 16 smg2000 -n 100 100 100","usertime")
(The quoted mpirun command is the application; "usertime" is the experiment.)
Tutorial @ SC2008, Austin, TX Slide 46: Example 1 - Views
Default view: values aggregated across all ranks; manually include/exclude individual processes
Tutorial @ SC2008, Austin, TX Slide 47: Example 1 - Views, continued
Load Balance view: calculates min, max, and average across ranks, processes, or threads
Tutorial @ SC2008, Austin, TX Slide 48: Example 1 - Views, continued (* Updated)
Butterfly view: callers and callees of hypre_SMGResidual
Tutorial @ SC2008, Austin, TX Slide 49: Example 1, continued - Source Code Mapping (* Updated)
Tutorial @ SC2008, Austin, TX Slide 50: Example 2 - Offline hwc experiment on gamess
openss -offline -f "./rungms tools/fmo/samples/PIEDA 01 1" hwc
-OR-
openss -cli
Welcome to OpenSpeedShop 1.6
openss>> RunOfflineExp("./rungms tools/fmo/samples/PIEDA 01 1","hwc")
Tutorial @ SC2008, Austin, TX Slide 51: Example 2 - hwc, continued: offline hwc experiment, default view
Slide 52: Section 4 - Simple Comparisons
Tutorial @ SC2008, Austin, TX Slide 53: Simple comparisons
- Views provide basic "default" views: sorted by statements/functions/objects; sum across all tasks
- How to get more insight? Compare results from different sources; combine multiple metrics
- O|SS allows users to customize views: independent of experiment or metric; predefined "canned" views; user-defined views
Tutorial @ SC2008, Austin, TX Slide 54: Compare Wizard (* NEW)
- Compare entire experiments against each other: select compare experiments from the Intro Wizard; select experiments from the dialog page; follow the wizard instructions; creates a side-by-side compare panel
- Example shown in the next slides: create & run the executable from the "orig" directory; copy the source to a "modified" directory & change it; run the new "modified" version & compare results
Tutorial @ SC2008, Austin, TX Slide 55 Compare Wizard Select Compare Saved Performance Data * NEW
Tutorial @ SC2008, Austin, TX Slide 56 Compare Wizard Choose the experiments you want to compare * NEW
Tutorial @ SC2008, Austin, TX Slide 57 Compare Wizard Side by side performance results * NEW
Tutorial @ SC2008, Austin, TX Slide 58 Compare Wizard Go to the source for the selected function * NEW
Tutorial @ SC2008, Austin, TX Slide 59 Compare Wizard Side by Side Source for the two versions * NEW
Tutorial @ SC2008, Austin, TX Slide 60: Customizing Views
- Pick your own views: compare entire experiments against each other; combine any data from any active experiment; select metrics provided by any collector; restrict data to arbitrary node sets
- Configurable through the "Customize Stats Panel" menu option: activate from the stats panel context menu or select the "CC" icon in the toolbar; allows definition of content for each column
Tutorial @ SC2008, Austin, TX Slide 61: Terminology: Views / Columns (annotated screenshot labeling a View and a Column)
Tutorial @ SC2008, Austin, TX Slide 62: Designing a New View
- Decide on what to compare: load all necessary experiments; create all columns using the context menu
- Select data for each column: pick experiment/collector/metric; select the process and thread set (drag and drop from the Process Management Panel, or use the Process Add/Remove dialog box)
- "Focus StatsPanel" to activate the view (context menu of the Customize View StatsPanel)
Tutorial @ SC2008, Austin, TX Slide 63 Customize Comparison Panel Activate from context menu or select icon
Tutorial @ SC2008, Austin, TX Slide 64: Customize Comparison Panel - use the "Load another experiment" menu item to load additional experiments
Tutorial @ SC2008, Austin, TX Slide 65 Customize Comparison Panel Choose experiment to load.
Tutorial @ SC2008, Austin, TX Slide 66: Customize Comparison Panel - add a compare column via the "Add compare column" menu item
Tutorial @ SC2008, Austin, TX Slide 67: Customize Comparison Panel - choose the experiment to compare in the new column; the Available Experiments list has two choices
Tutorial @ SC2008, Austin, TX Slide 68: Customize Comparison Panel - choose "Focus StatsPanel" to create a new comparison stats panel
Tutorial @ SC2008, Austin, TX Slide 69: Comparing Experiments - view after choosing "Focus StatsPanel": data from the "pcsamp" experiment (ID 1) and from the "hwc" experiment (ID 2)
Tutorial @ SC2008, Austin, TX Slide 70: Summary
- Simple comparison of two different metrics on the same executable
- Views can be customized: enable detailed analysis
- Customized views contrast results from multiple experiments and multiple metrics from several collectors
Slide 71: Section 5 - I/O Tracing Experiments
Tutorial @ SC2008, Austin, TX Slide 72: I/O Tracing - Overview
- What does I/O tracing record? I/O events; intercepts all calls to I/O functions; records the current stack trace & start/end time
- What about IOT tracing? Collects additional information: function parameters and function return value
Tutorial @ SC2008, Austin, TX Slide 73: I/O Tracing - Overview, continued (* Updated)
What I/O events are recorded? read, readv, pread, pread64; write, writev, pwrite, pwrite64; open, open64; close; pipe, dup, dup2; creat, creat64; lseek, lseek64
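To make the io/iot distinction concrete, a hedged sketch with a placeholder application (the gamess and smg2000 runs on the following slides are the real examples):

  # record call counts and timings of the POSIX I/O functions listed above
  openss -offline -f "./myapp" io
  # same interception points, but additionally record parameters and return values
  openss -offline -f "./myapp" iot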
Tutorial @ SC2008, Austin, TX Slide 74: I/O Tracing - Considerations
Trace experiments: collect large amounts of data; more overhead (compared to sampling); more perturbation (compared to sampling); allow for fine-grained analysis
Tutorial @ SC2008, Austin, TX Slide 75: Example 1 - Offline IOT experiment on gamess
openss -offline -f "./ddikick.x gamess.01.x tools/efp/lysine -ddi 1 1 lanz -scr /scr/scranfo" iot
-OR-
openss -cli
Welcome to OpenSpeedShop 1.6
openss>> RunOfflineExp("./ddikick.x gamess.01.x tools/efp/lysine -ddi 1 1 lanz -scr /scr/scranfo","iot")
Tutorial @ SC2008, Austin, TX Slide 76: Example 1, continued - view experiment status from the CLI: expstatus
Tutorial @ SC2008, Austin, TX Slide 77: Example 1, continued
- View experiment output from the CLI: expview
- CLI view help: help expview
- View experiment output in the GUI: opengui
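For instance, a short CLI session sketch combining these commands (the database name is the default from Section 2):

  openss -cli -f X.0.openss
  openss>> expstatus        # check the state of the loaded experiment
  openss>> expview          # default view of the gathered data
  openss>> help expview     # list expview options and modifiers
  openss>> opengui          # hand the same experiment over to the GUI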
Tutorial @ SC2008, Austin, TX Slide 78: Example 1, continued - expview: show the top N I/O calls
expview -m count iotN
openss>> expview -m count iot6
Number of Calls  Function (defining location)
6994  __libc_write (/lib/libpthread-2.4.so)
 548  llseek (/lib/libpthread-2.4.so)
 384  __libc_read (/lib/libpthread-2.4.so)
   8  __libc_close (/lib/libpthread-2.4.so)
   5  __libc_open64 (/lib/libpthread-2.4.so)
   5  __dup (/lib/libc-2.4.so)
Tutorial @ SC2008, Austin, TX Slide 79: Example 1, continued - expview: show time spent in a particular function
expview -f <function>
openss>> expview -f __libc_read
Exclusive I/O Call Time(ms)  % of Total Time  Function (defining location)
11.742272  0.416965  __libc_read (/lib/libpthread.so)
Tutorial @ SC2008, Austin, TX Slide 80: Example 1, continued - expview: show the call count for a particular function
expview -f <function> -m count
openss>> expview -f __libc_read -m count
Number of Calls  Function (defining location)
384  __libc_read (/lib/libpthread-2.4.so)
Tutorial @ SC2008, Austin, TX Slide 81: Example 1, continued - expview: show return value & start time for the top N calls
expview -v trace -f <function> -m retval -m start_time iotN
openss>> expview -v trace -f __libc_read -m retval -m start_time iot5
Return Value  Start Time           Call Stack Function (defining location)
32720  2008/07/08 15:36:37  >>__libc_read (/lib/libpthread-2.4.so)
32720  2008/07/08 15:36:38  >>__libc_read (/lib/libpthread-2.4.so)
20960  2008/07/08 15:36:37  >>__libc_read (/lib/libpthread-2.4.so)
 8192  2008/07/08 15:36:38  >>__libc_read (/lib/libpthread-2.4.so)
Tutorial @ SC2008, Austin, TX Slide 82: Example 1, continued - expview: show the top N most time-consuming call trees (* Updated)
expview -v calltrees,fullstack iotN
openss>> expview -v calltrees,fullstack -m count,time iot1
Number of Calls / Exclusive I/O Call Time(ms) / Call Stack Function (defining location)
_start (gamess.01.x)
...
>>>>main (gamess.01.x)
>>>>> @ 545 in MAIN__ (gamess.01.x: gamess.f,352)
>>>>>> @ 741 in brnchx_ (gamess.01.x: gamess.f,695)
>>>>>>> @ 989 in energx_ (gamess.01.x: gamess.f,777)
>>>>>>>> @ 1742 in wfn_ (gamess.01.x: gamess.f,1637)
...
>>>>>>>>>>>>>>>find_or_create_unit (libgfortran.so.1.0.0)
>>>>>>>>>>>>>>>>find_or_create_unit (libgfortran.so.1.0.0)
845  167.248327  >>>>>>>>>>>>>>>>>__libc_write (libpthread-2.4.so)
Tutorial @ SC2008, Austin, TX Slide 83 Example 1 Cont. - opengui * Updated
Tutorial @ SC2008, Austin, TX Slide 84: Example 2 - Parallel Application
Let's look at a parallel example: an offline io experiment on smg2000
[samuel@yra131 smg2000]$ openss -cli
Welcome to OpenSpeedShop 1.6
openss>> RunOfflineExp("mpirun -np 16 smg2000 -n 100 100 100","io")
...
openss>> opengui
Tutorial @ SC2008, Austin, TX Slide 85 Example 2 – Parallel App. * Updated
Tutorial @ SC2008, Austin, TX Slide 86: Summary
- I/O collectors: intercept all calls to I/O functions; record the current stack trace & start/end time; can collect detailed ancillary data (iot)
- Trace experiments collect large amounts of data but allow fine-grained analysis
Slide 87: Section 6 - Parallel Performance Analysis
Tutorial @ SC2008, Austin, TX Slide 88: Experiment Types
- Open|SpeedShop is designed to work on parallel jobs; focus here: parallelism using MPI
- Sequential experiments: apply the experiment/collectors to all nodes; by default display aggregate results; optionally select individual groups of processes
- MPI experiments: tracing of MPI calls; can be combined with sequential collectors
Tutorial @ SC2008, Austin, TX Slide 89: Configuration
Multiple MPI versions possible - Open|SpeedShop MPI and VampirTrace
- OPENSS_MPI_LAM: set to the MPI LAM installation dir
- OPENSS_MPI_OPENMPI: set to the Open MPI installation dir
- OPENSS_MPI_MPICH: set to the MPICH installation dir
- OPENSS_MPI_MPICH2: set to the MPICH2 installation dir
- OPENSS_MPI_MPICH_DRIVER: mpich/mpich2 driver name [ch-p4]
- OPENSS_MPI_MPT: set to the SGI MPI MPT installation dir
- OPENSS_MPI_MVAPICH: set to the MVAPICH installation dir
Specify during installation (using the install.sh script)
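As an illustration (paths are placeholders), if Open MPI is the implementation in use, the corresponding variable is exported before the build so install.sh picks it up:

  export OPENSS_MPI_OPENMPI=/opt/openmpi    # point the build at the MPI installation
  ./install.sh                              # MPI support is then configured during installation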
Tutorial @ SC2008, Austin, TX Slide 90: MPI Job Start & Attach (offline)
MPI process control: link the O|SS collector into the MPI application
Examples:
openss -offline -f "mpirun -np 4 sweep3d.mpi" pcsamp
openss -offline -f "srun -N 4 -n 16 sweep3d.mpi" pcsamp
openss -offline -f "orterun -np 16 sweep3d.mpi" usertime
Tutorial @ SC2008, Austin, TX Slide 91: Parallel Result Analysis
- Default views: values aggregated across all ranks; manually include/exclude individual processes
- Rank comparisons: use the Customize Stats Panel view; create columns for process groups
- Cluster analysis: automatically create process groups of similar processes; available from the Stats Panel context menu
Tutorial @ SC2008, Austin, TX Slide 92 Viewing Results by Process Choice of ranks
Tutorial @ SC2008, Austin, TX Slide 93: MPI Tracing
- Similar to I/O tracing: record all MPI call invocations; by default record call times (mpi); optionally record all arguments (mpit)
- Equal events will be aggregated: saves space in the database; reduces overhead
- Future plans: full MPI traces in a public format
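A sketch of the two MPI tracing variants, reusing the sweep3d example from earlier slides (launcher and rank count are placeholders):

  # record the call times of all MPI routines
  openss -offline -f "mpirun -np 16 sweep3d.mpi" mpi
  # additionally record the arguments of every MPI call
  openss -offline -f "mpirun -np 16 sweep3d.mpi" mpit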
Tutorial @ SC2008, Austin, TX Slide 94 Tracing Results: Default View * Updated
Tutorial @ SC2008, Austin, TX Slide 95 Tracing Results: Event View * Updated
Tutorial @ SC2008, Austin, TX Slide 96 Tracing Results: Creating Event View * Updated
Tutorial @ SC2008, Austin, TX Slide 97 Tracing Results: Specialized Event View
Tutorial @ SC2008, Austin, TX Slide 98 Results / Show: Callstacks
Tutorial @ SC2008, Austin, TX Slide 99: Predefined Analysis Views
- O|SS provides common analysis functions: designed for quick analysis of MPI applications; create new views in the StatsPanel; accessible through the context menu or toolbar
- Load Balance view: calculates min, max, and average across ranks, processes, or threads
- Comparative Analysis view: uses a "cluster analysis" algorithm to group like-performing ranks, processes, or threads
Tutorial @ SC2008, Austin, TX Slide 100: Quick Min, Max, Average View - select "LB" in the toolbar
Tutorial @ SC2008, Austin, TX Slide 101: Comparative Analysis: Clustering Ranks - select "CA" in the toolbar
Tutorial @ SC2008, Austin, TX Slide 102: Comparing Ranks (1) - use the Customize StatsPanel; create columns for each process set to compare; select the set of ranks for each column
Tutorial @ SC2008, Austin, TX Slide 103: Comparing Ranks (2) - screenshot with side-by-side columns for Rank 0 and Rank 1
Tutorial @ SC2008, Austin, TX Slide 104: Summary
- Open|SpeedShop manages MPI jobs: works with multiple MPI implementations; process control using the MPIR interface (dynamic)
- Parallel experiments: apply sequential collectors to all nodes; specialized MPI tracing experiments
- Results: aggregated across all ranks by default; optionally select individual processes; compare or group ranks & specialized views
Slide 105: Section 7 - System Requirements & Installation
Tutorial @ SC2008, Austin, TX Slide 106: System Requirements
- System architecture: AMD Opteron/Athlon; Intel x86, x86-64, and Itanium-2
- Operating system: tested on many popular Linux distributions (SLES, RHEL, Fedora Core, etc.)
Tutorial @ SC2008, Austin, TX Slide 107: System Software Prerequisites
- libelf (devel): system-specific version highly recommended
- Python (devel)
- Qt3 (devel)
- Typical Linux development environment: GNU Autotools, GNU libtool, etc.
Tutorial @ SC2008, Austin, TX Slide 108: Getting the Source
- SourceForge project home: http://sourceforge.net/projects/openss
- CVS access: http://sourceforge.net/cvs/?group_id=176777
- Packages: accessible from the project home Download tab
- Additional information: http://www.openspeedshop.org/
Tutorial @ SC2008, Austin, TX Slide 109: Preparing the Build Environment (* Updated)
- Important environment variables: OPENSS_PREFIX; OPENSS_INSTRUMENTOR; OPENSS_MPI_<implementation> (e.g., OPENSS_MPI_OPENMPI); QTDIR
- Simple bash example:
  export OPENSS_PREFIX=/home/skg/local
  export OPENSS_INSTRUMENTOR=mrnet
  export OPENSS_MPI_OPENMPI=/opt/openmpi
  export QTDIR=/usr/lib64/qt-3.3
Tutorial @ SC2008, Austin, TX Slide 110: Building OpenSpeedShop
- Preparing the directory structure for the install script (needed if OSS & OSS_ROOT were pulled from CVS):
  - Rename OpenSpeedShop to openspeedshop-1.6
  - tar -czf openspeedshop-1.6.tar.gz openspeedshop-1.6 (note: in general, openspeedshop-<version>)
  - Move openspeedshop-1.6 to OSS_ROOT/SOURCES
- Navigate to OpenSpeedShop_ROOT; we now have two options:
  - Use install.sh to build prerequisites/OSS interactively (use option 9 for an automatic build)
  - ./install.sh [--with-option] [OPTION]
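A condensed sketch of those preparation steps as listed above, assuming a CVS checkout named OpenSpeedShop and the OSS_ROOT/OpenSpeedShop_ROOT layout the slide refers to:

  mv OpenSpeedShop openspeedshop-1.6                      # rename to the versioned name (openspeedshop-<version> in general)
  tar -czf openspeedshop-1.6.tar.gz openspeedshop-1.6     # create the source tarball
  mv openspeedshop-1.6 OSS_ROOT/SOURCES/                  # place the sources where install.sh looks for them
  cd OpenSpeedShop_ROOT
  ./install.sh                                            # interactive build; option 9 builds everything automatically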
Tutorial @ SC2008, Austin, TX Slide 111 Installation Script Overview
Tutorial @ SC2008, Austin, TX Slide 112 Basic Installation Structure
Tutorial @ SC2008, Austin, TX Slide 113: Post-Installation Setup (* Updated)
Important environment variables (all instrumentors):
- OPENSS_PREFIX: path to the installation directory
- OPENSS_PLUGIN_PATH: path to the directory where plugins are stored
- OPENSS_MPI_IMPLEMENTATION: if multiple MPI implementations are installed, this points openss at the one you are using in your application
- QTDIR, LD_LIBRARY_PATH, PATH: Qt installation directory and Linux path variables
Tutorial @ SC2008, Austin, TX Slide 114: Post-Installation Setup, continued (* Updated)
Simple bash example:
export OPENSS_PREFIX=/home/skg/local
export OPENSS_MPI_IMPLEMENTATION=openmpi
export OPENSS_PLUGIN_PATH=$OPENSS_PREFIX/lib64/openspeedshop
export LD_LIBRARY_PATH=$OPENSS_PREFIX/lib64:$LD_LIBRARY_PATH
export QTDIR=/usr/lib64/qt-3.3
export PATH=$OPENSS_PREFIX/bin:$PATH
Tutorial @ SC2008, Austin, TX Slide 115: Post-Installation Setup (MRNet)
Additional MRNet-specific environment variables: OPENSS_MRNET_TOPOLOGY_FILE (additional info: http://www.paradyn.org/mrnet); DYNINSTAPI_RT_LIB; MRNET_RSH; XPLAT_RSHCOMMAND; OPENSS_RAWDATA_DIR
Simple bash example:
export OPENSS_MRNET_TOPOLOGY_FILE=/home/skg/oss.top
export DYNINSTAPI_RT_LIB=$OPENSS_PREFIX/lib64/libdyninstAPI_RT.so.1
export MRNET_RSH=ssh
export XPLAT_RSHCOMMAND=ssh
export OPENSS_RAWDATA_DIR=/panfs/scratch2/vol3/skg/rawdata
Tutorial @ SC2008, Austin, TX Slide 116: Post-Installation Setup (Offline)
Additional offline-specific environment variable: OPENSS_RAWDATA_DIR
- OPENSS_RAWDATA_DIR must be defined within the user's environment before OpenSpeedShop is invoked. If it is not defined in the user's current terminal session, all raw performance data files will be placed in /tmp/$USER/offline-oss; that is, OPENSS_RAWDATA_DIR is automatically initialized to /tmp/$USER/offline-oss for the duration of the OpenSpeedShop session.
- Note: in general, it is best to select a target directory located on scratch space.
Simple bash example:
export OPENSS_RAWDATA_DIR=/panfs/scratch2/vol3/skg/rawdata
Slide 117: Section 8 - Advanced Capabilities
Tutorial @ SC2008, Austin, TX Slide 118: Advanced Features Overview
- Dynamic instrumentation and online analysis: pros & cons, or what to use when; interactive performance analysis wizards
- Advanced GUI features
- Scripting Open|SpeedShop: using the command line interface; integrating O|SS into Python
- Plugin concept to extend Open|SpeedShop; future plugin plans
Tutorial @ SC2008, Austin, TX Slide 119: Interactive Analysis (diagram: MPI application connected to O|SS via MRNet)
- Dynamic instrumentation: works on binaries; add/change instrumentation at runtime
- Hierarchical communication: efficient broadcast of commands; online data reduction
- Interactive control: available through the GUI and CLI; start/stop/adjust data collection
Tutorial @ SC2008, Austin, TX Slide 120: Pros & Cons
- Advantage - new capabilities: attach to/detach from a running job; intermediate/partial results
- Advantage - easier use: control directly from one environment
- Disadvantage - longer wait times: less information available at database creation time; binary parsing is expensive
- Disadvantage - stability: techniques are low-level and OS specific
Tutorial @ SC2008, Austin, TX Slide 121: Using Dynamic Instrumentation
- MRNet instrumentor: can be installed in parallel with the offline capabilities; site-specific daemon launch instructions (site.py)
- Launch control from within O|SS: select the binary at experiment creation
- Attach option from within O|SS: select the PID of the target process at experiment creation
- Alternative option - Wizards: the GUI offers wizards for all default experiments; easy-to-use walk-through of all options
Tutorial @ SC2008, Austin, TX Slide 122: Wizard Panels (1) - options: gather data from new runs; analyze and/or compare existing data from previous runs; O|SS command line interface
Tutorial @ SC2008, Austin, TX Slide 123 Wizard Panels (2) Select type of data to be gathered by Open|SpeedShop
Tutorial @ SC2008, Austin, TX Slide 124: Using the Wizards
- Wizards guide through the standard steps: access to all available experiments; options in separate screens; attach to and start of applications possible
- Example - pcsamp Wizard: select "... where my program spends time ..."; select the sampling rate (100 is a good default); select the program to run or attach to; ensure all online daemons are started (site.py); review and select Finish
Tutorial @ SC2008, Austin, TX Slide 125: Dynamic MPI Job Start & Attach
- MPI process control: starting depends on the MPI launcher; uses the TotalView MPIR interface for MPI-1
- Examples: MPICH - run on the actual binary; SLURM - run on "srun ..."
- Attach: attach to the process with the MPIR table; select "-v mpi" or check "Attach to MPI job"
Tutorial @ SC2008, Austin, TX Slide 126: Process Management (screenshot) - execution control, process overview, process details
Tutorial @ SC2008, Austin, TX Slide 127: Experiment Control
- Process management panel: access to all tasks/processes/threads; information on status and location
- Execution control: start/pause/resume/stop jobs; select process groups for views
- Collector control: add additional collectors to tasks; create custom experiments
Tutorial @ SC2008, Austin, TX Slide 128: General GUI Features
- GUI panel management: peel off and rearrange any panel; color-coded panel groups per experiment
- Context-sensitive menus: right click at any location; access to different views; activate additional panels
- Access to the source location of events: double click on the stats panel; opens the source panel with (optional) statistics
Tutorial @ SC2008, Austin, TX Slide 129: Leaving the GUI
- Three different options to run without the GUI: equal functionality; can transfer state/results
- Interactive command line interface: openss -cli
- Batch interface: openss -batch < openss_cmd_file; openss -batch -f <executable> <experiment>
- Python scripting API: python openss_python_script_file.py
Tutorial @ SC2008, Austin, TX Slide 130: CLI Language
- An interactive command line interface with gdb/dbx-like processing
- Several interactive commands: create experiments; provide process/thread control; view experiment results
- Where possible, commands execute asynchronously
- http://www.openspeedshop.org/docs/cli_doc/
Tutorial @ SC2008, Austin, TX Slide 131: User-Time Example
Create the experiment and load the application, then start it:
lnx-jeg.americas.sgi.com-17> openss -cli
openss>> Welcome to OpenSpeedShop 1.9
openss>> expcreate -f test/executables/fred/fred usertime
The new focused experiment identifier is: -x 1
openss>> expgo
Start asynchronous execution of experiment: -x 1
openss>> Experiment 1 has terminated.
Tutorial @ SC2008, Austin, TX Slide 132: Showing CLI Results
openss>> expview
Excl CPU time  Incl CPU time  % of Total Exclusive  Function (defining location)
in seconds.    in seconds.    CPU Time
5.2571   5.2571   49.7297  f3 (fred: f3.c,2)
3.3429   3.3429   31.6216  f2 (fred: f2.c,2)
1.9714   1.9714   18.6486  f1 (fred: f1.c,2)
0.0000  10.5714    0.0000  __libc_start_main (libc.so.6)
0.0000  10.5714    0.0000  _start (fred)
0.0000  10.5429    0.0000  work (fred: work.c,2)
0.0000  10.5714    0.0000  main (fred: fred.c,5)
Tutorial @ SC2008, Austin, TX Slide 133: CLI Batch Scripting (1)
Create a batch file with CLI commands (plain text file). Example:
# Create batch file
echo expcreate -f fred pcsamp >> input.script
echo expgo >> input.script
echo expview pcsamp10 >> input.script
# Run OpenSpeedShop
openss -batch < input.script
Tutorial @ SC2008, Austin, TX Slide 134: CLI Batch Scripting (2)
Open|SpeedShop batch example results:
The new focused experiment identifier is: -x 1
Start asynchronous execution of experiment: -x 1
Experiment 1 has terminated.
CPU Time  Function (defining location)
24.2700  f3 (mutatee: mutatee.c,24)
16.0000  f2 (mutatee: mutatee.c,15)
 8.9400  f1 (mutatee: mutatee.c,6)
 0.0200  work (mutatee: mutatee.c,33)
Tutorial @ SC2008, Austin, TX Slide 135: CLI Batch Scripting (3)
Open|SpeedShop batch example, direct form:
# Run Open|SpeedShop as a single non-interactive command
openss -batch -f fred pcsamp
The new focused experiment identifier is: -x 1
Start asynchronous execution of experiment: -x 1
Experiment 1 has terminated.
CPU Time  Function (defining location)
24.2700  f3 (mutatee: mutatee.c,24)
16.0000  f2 (mutatee: mutatee.c,15)
 8.9400  f1 (mutatee: mutatee.c,6)
 0.0200  work (mutatee: mutatee.c,33)
Tutorial @ SC2008, Austin, TX Slide 136 CLI Integration in GUI Program Output & Command Line Panel
Tutorial @ SC2008, Austin, TX Slide 137: Command Line Panel
- Integrated with the output panel: O|SS command line language; optional Python support
- Access to all loaded experiments: same context as the GUI; use the experiment ID listed in the panel name
- "History" command: displays all commands issued by the GUI; cut & paste output for scripting
Tutorial @ SC2008, Austin, TX Slide 138 History Command
Tutorial @ SC2008, Austin, TX Slide 139: CLI Command Overview
- Experiment creation: expcreate, expattach
- Experiment control: expgo, expwait, expdisable, expenable
- Experiment storage: expsave, exprestore
- Result presentation: expview, opengui
- Misc. commands: help, list, log, record, playback, history, quit
Tutorial @ SC2008, Austin, TX Slide 140: Python Scripting
- Open|SpeedShop Python API that executes the "same" interactive/batch Open|SpeedShop commands
- Users can intersperse "normal" Python code with the Open|SpeedShop Python API
- Run Open|SpeedShop experiments via the Open|SpeedShop Python API
Tutorial @ SC2008, Austin, TX Slide 141: Python Example (1)
Necessary steps: import the O|SS Python module; prepare arguments for the target application; set view and experiment type; create the experiment
import openss
my_filename = openss.FileList("usability/phaseII/fred")
my_viewtype = openss.ViewTypeList()
my_viewtype += "pcsamp"
exp1 = openss.expCreate(my_filename, my_viewtype)
Tutorial @ SC2008, Austin, TX Slide 142: Python Example (2)
After experiment creation: start the target application (asynchronous!); wait for completion; write the results
try:
    openss.expGo()
    openss.wait()
except openss.error:
    print "expGo(exp1, my_modifier) failed"
openss.dumpView()
Tutorial @ SC2008, Austin, TX Slide 143: Python Example Output
Two interfaces to dump data: plain text (similar to the CLI) for viewing; as Python objects for post-processing
>python example.py
/work/jeg/OpenSpeedShop/usability/phaseII/fred: successfully completed.
Excl. CPU time  % of CPU Time  Function (def. location)
4.6700  47.7994  f3 (fred: f3.c,23)
3.5100  35.9263  f2 (fred: f2.c,2)
1.5900  16.2743  f1 (fred: f1.c,2)
Tutorial @ SC2008, Austin, TX Slide 144: Summary
- Multiple non-graphical interfaces: interactive command line; batch scripting; Python module
- Equal functionality: similar commands in all interfaces
- Results are transferable: e.g., run in Python and view in the GUI; possibility to switch between GUI and CLI
Tutorial @ SC2008, Austin, TX Slide 145: Extensibility
- O|SS is more than a performance tool: all functionality in one toolset with one interface; a general infrastructure to create new tools
- Plugins to add new functionality: cover all essential steps of performance analysis; automatically loaded at O|SS startup
- Three types of plugins: Collectors - how to acquire performance data; Views - how to aggregate and present data; Panels - how to visualize data in the GUI
Tutorial @ SC2008, Austin, TX Slide 146: Plugin Dataflow (diagram) - Wizard, Collector, View, and Panel plugins in the execution environment, connected through the Instrumentor, Data Abstraction, CLI, and SQL database.
Tutorial @ SC2008, Austin, TX Slide 147: Plugin Plans
- Existing extensions to base experiments: MPI tracing directly to OTF; extended parameter capture for MPI and I/O traces
- New plugins planned: code coverage; memory profiling; extended hardware counter profiling
- Interested in creating your own plugins? Step 1: design the tool workflow and map it to plugins; Step 2: define the data format (XDR encoding); examples and a plugin creation guide are available
Tutorial @ SC2008, Austin, TX Slide 148: Summary / Advanced Features
- Two techniques for instrumentation: online vs. offline; different strengths for different target scenarios
- Flexible GUI that can be customized
- Several compatible scripting options: command line language; direct batch interface; integration of O|SS into Python
- GUI and scripting interoperable
- Plugin concept to extend Open|SpeedShop
Slide 149: Conclusions
Tutorial @ SC2008, Austin, TX Slide 150: Open|SpeedShop
- Goal - one-stop performance analysis: all basic experiments in one tool; extensible using the plugin mechanism; supports parallel programs
- Multiple interfaces: flexible GUI; multiple scripting and batch options
- Rich features: comparing results; tracebacks/callstacks in most experiments; full I/O tracing support
Tutorial @ SC2008, Austin, TX Slide 151: Documentation
- Open|SpeedShop User Guide: http://www.openspeedshop.org/docs/users_guide/ or /opt/OSS/share/doc/packages/OpenSpeedShop/users_guide (where /opt/OSS is the installation directory)
- Python scripting API: http://www.openspeedshop.org/docs/pyscripting_doc/ or /opt/OSS/share/doc/packages/OpenSpeedShop/pyscripting_doc
- Command line interface: http://www.openspeedshop.org/docs/cli_doc/ or /opt/OSS/share/doc/packages/OpenSpeedShop/cli_doc
Tutorial @ SC2008, Austin, TX Slide 152: Current Status (* Updated)
- Open|SpeedShop 1.9 available: packages and source from sourceforge.net; tested on a variety of platforms
- Keep in mind, though: Open|SpeedShop is still under development; low-level, OS-specific components
- Keep us informed: we are happy to help with problems; interested in getting feedback
Tutorial @ SC2008, Austin, TX Slide 153: Building a Community (* Updated)
- More than yet another research tool: large-scale environment; flexible framework; cooperative atmosphere
- Wanted: new users and open source tool developers; tools/software companies
- Goal: a self-sustaining project
Tutorial @ SC2008, Austin, TX Slide 154: Availability and Contact
- Open|SpeedShop website: http://www.openspeedshop.org/
- Download options: package with install script; source for the tool and base libraries
- Feedback: bug tracking available from the website; contact information on the website; feel free to contact the presenters directly