1
A Framework for Online Performance Analysis and Visualization of Large-Scale Parallel Applications
Kai Li, Allen D. Malony, Robert Bell, Sameer Shende
{likai,malony,bertie,sameer}@cs.uoregon.edu
Department of Computer and Information Science
Computational Science Institute, NeuroInformatics Center
University of Oregon
2
PPAM 2003: Framework for Online Performance Analysis, and Visualization
Outline
  Problem description
    Scaling and performance observation
    Interest in online performance analysis
  General online performance system architecture
    Access models
    Profiling issues and control issues
  Framework for online performance analysis
    TAU performance system
    SCIRun computational and visualization environment
  Experiments
  Conclusions and future work
3
Problem Description
  Need for parallel performance observation
    Instrumentation, measurement, analysis, visualization
  In general, there is the concern for intrusion
    Seen as a tradeoff with accuracy of performance diagnosis
  Scaling complicates observation and analysis
    Issues of data size, processing time, and presentation
  Online approaches add capabilities as well as problems
    Performance interaction, but at what cost?
  Tools for large-scale performance observation online
    Supporting performance system architecture
    Tool integration, effective usage, and portability
4
Scaling and Performance Observation
  Consider “traditional” measurement methods
    Profiling: summary statistics calculated during execution
    Tracing: time-stamped sequence of execution events
  More parallelism => more performance data overall
    Performance specific to each thread of execution
    Possible increase in number of interactions between threads
    Harder to manage the data (memory, transfer, storage, …)
  More parallelism / performance data => harder analysis
    More time consuming to analyze
    More difficult to visualize (meaningful displays)
  Need techniques to address scaling at all levels
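The difference between the two "traditional" methods can be sketched in a few lines of Python. This is only an illustration, not how TAU implements either method; the event tuple layout `(time, kind, name)` and the function names `trace_events` and `profile_events` are assumptions made for the example.

```python
from collections import defaultdict


def trace_events(events):
    """Tracing: keep every time-stamped event, so size grows with event count."""
    return list(events)


def profile_events(events):
    """Profiling: fold enter/exit events into per-routine summary statistics."""
    stats = defaultdict(lambda: {"calls": 0, "inclusive": 0.0})
    stack = []  # call stack of (routine name, entry time)
    for timestamp, kind, name in events:
        if kind == "enter":
            stack.append((name, timestamp))
        else:  # "exit": close the most recently entered routine
            entered_name, entry_time = stack.pop()
            stats[entered_name]["calls"] += 1
            stats[entered_name]["inclusive"] += timestamp - entry_time
    return dict(stats)


# Hypothetical event stream for one thread of execution.
events = [
    (0.0, "enter", "main"),
    (1.0, "enter", "solve"),
    (4.0, "exit", "solve"),
    (5.0, "enter", "solve"),
    (7.0, "exit", "solve"),
    (8.0, "exit", "main"),
]

trace = trace_events(events)      # six records, one per event
profile = profile_events(events)  # two records, one per routine
```

The trace keeps one record per event, while the profile keeps one record per routine regardless of how often it runs, which is why the two methods scale differently as parallelism and run length grow.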
5
Why Complicate Matters with Online Methods?
  Adds interactivity to performance analysis process
  Opportunity for dynamic performance observation
    Instrumentation change
    Measurement change
  Allows for control of performance data volume
  Post-mortem analysis may be “too late”
    View on status of long running jobs
    Allow for early termination
    Computation steering to achieve “better” results
    Performance steering to achieve “better” performance
  Online performance observation may be intrusive
6
Related Ideas
  Computational steering
    Falcon (Schwan, Vetter): computational steering
  Dynamic instrumentation and performance search
    Paradyn (Miller): online performance bottleneck analysis
  Adaptive control and performance steering
    Active Harmony (Hollingsworth): auto decision control
    Autopilot (Reed): actuator/sensor performance steering
  Scalable monitoring
    Peridot (Gerndt): automatic online performance analysis
    MRNet (Miller): multicast/reduction for access / control
  Scalable analysis and visualization
    VNG (Brunst): parallel trace analysis
7
General Online Performance Observation System
  [Figure: system diagram; surviving labels: Performance Data, Performance Measurement, Performance Control, Performance Analysis, Performance Visualization, Performance Instrument]
8
Models of Performance Data Access (Monitoring)
  Push Model
    Producer/consumer style of access and transfer
    Application decides when/what/how much data to send
    External analysis tools only consume performance data
    Availability of new data is signaled passively or actively
  Pull Model
    Client/server style of performance data access and transfer
    Application is a performance data server
    Access decisions are made externally by analysis tools
    Two-way communication is required
  Push/Pull Models
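The two access models can be contrasted with a small in-process Python sketch. It is a toy under stated assumptions: the class names `PushProducer` and `PullServer`, the every-other-step send policy, and the use of `queue.Queue` as the transfer channel are all hypothetical and stand in for whatever transport a real system would use.

```python
import queue


class PushProducer:
    """Push model: the application decides when and what to send."""

    def __init__(self, channel):
        self.channel = channel

    def step(self, step_number, profile):
        # Assumed policy for the example: send a snapshot every second step.
        if step_number % 2 == 0:
            self.channel.put((step_number, dict(profile)))


class PullServer:
    """Pull model: the application acts as a performance data server."""

    def __init__(self):
        self.profile = {}

    def update(self, profile):
        self.profile = dict(profile)

    def handle_request(self, event_names):
        # The external tool chooses which events it wants and when to ask.
        return {n: self.profile[n] for n in event_names if n in self.profile}


channel = queue.Queue()
producer = PushProducer(channel)
server = PullServer()

profile = {"compute": 0.0, "MPI_Recv": 0.0}
for step in range(4):
    profile["compute"] += 1.0
    profile["MPI_Recv"] += 0.5
    producer.step(step, profile)  # push: one-way, application-driven
    server.update(profile)        # pull: data waits until it is requested

pushed_steps = []
while not channel.empty():
    pushed_steps.append(channel.get()[0])

reply = server.handle_request(["MPI_Recv"])  # pull: request, then response
```

The push side needs only one-way transfer, whereas the pull side needs a request and a reply, which matches the slide's point that two-way communication is required.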
9
Online Profiling Issues
  Profiles are summary statistics of performance
    Kept with respect to some unit of parallel execution
  Profiles are distributed across the machine (in memory)
    Must be gathered and delivered to profile analysis tool
    Profile merging must take place (possibly in parallel)
  Consistency checking of profile data
    Callstack must be updated to generate correct profile data
    Correct communication statistics may require completion
    Event identification (not necessary is save event names)
  Sequence of profile samples allows interval analysis
    Interval frequency depends on profile collection delay
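Two of the steps above, profile merging and interval analysis over a sequence of samples, can be sketched as follows. This is a simplified illustration that assumes each profile is a plain mapping from event name to cumulative seconds; the function names `merge_profiles` and `interval_profile` and the sample values are hypothetical.

```python
def merge_profiles(per_thread_profiles):
    """Sum cumulative time per event across threads of execution."""
    merged = {}
    for profile in per_thread_profiles:
        for event, seconds in profile.items():
            merged[event] = merged.get(event, 0.0) + seconds
    return merged


def interval_profile(previous, current):
    """Time spent per event between two cumulative profile samples."""
    return {
        event: current[event] - previous.get(event, 0.0)
        for event in current
    }


# Two successive samples, each holding one profile per thread.
sample_1 = [{"compute": 2.0, "MPI_Recv": 1.0}, {"compute": 3.0}]
sample_2 = [
    {"compute": 5.0, "MPI_Recv": 1.5},
    {"compute": 7.0, "MPI_Waitsome": 0.5},
]

merged_1 = merge_profiles(sample_1)
merged_2 = merge_profiles(sample_2)
interval = interval_profile(merged_1, merged_2)
```

Because each sample is cumulative, the difference between consecutive samples gives the activity within that interval, and the interval length is bounded below by how long it takes to collect a sample.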
10
Performance Control
  Instrumentation control
    Dynamic instrumentation
    Inserts / removes instrumentation at runtime
  Measurement control
    Dynamic measurement
    Enabling / disabling / changing of measurement code
    Dynamic instrumentation or measurement variables
  Data access control
    Selection of what performance data to access
    Control of frequency of access
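Measurement control, as opposed to instrumentation control, leaves the probes in place and toggles whether they record anything. A minimal Python sketch of that idea follows; the `measured` decorator, the `enabled_groups` set, and the group names are hypothetical and are not TAU's API.

```python
import functools
import time

# Measurement variable: groups whose probes currently record data.
enabled_groups = {"MPI", "COMPUTE"}
call_counts = {}
elapsed_seconds = {}


def measured(group):
    """Instrument a function; the probe records only if its group is enabled."""

    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if group not in enabled_groups:
                return func(*args, **kwargs)  # probe present but switched off
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                name = func.__name__
                call_counts[name] = call_counts.get(name, 0) + 1
                elapsed_seconds[name] = (
                    elapsed_seconds.get(name, 0.0)
                    + time.perf_counter()
                    - start
                )

        return wrapper

    return decorate


@measured("MPI")
def exchange():
    return "halo"


@measured("COMPUTE")
def relax():
    return "done"


exchange()
relax()
enabled_groups.discard("MPI")  # disable MPI measurement at runtime
exchange()
relax()
```

After the group is disabled, `exchange` still runs but no longer contributes to the recorded data, which is one way to control performance data volume without re-instrumenting the program.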
11
TAU Performance System Framework
  Tuning and Analysis Utilities (aka Tools Are Us)
  Performance system framework for scalable parallel and distributed high-performance computing
  Targets a general complex system computation model
    nodes / contexts / threads
    Multi-level: system / software / parallelism
    Measurement and analysis abstraction
  Integrated toolkit for performance instrumentation, measurement, analysis, and visualization
    Portable performance profiling/tracing facility
    Open software approach
12
TAU Performance System Architecture
  [Figure: architecture diagram; surviving labels: EPILOG, Paraver, ParaProf]
13
Online Profile Measurement and Analysis in TAU
  Standard TAU profiling
    Per node/context/thread
  Profile “dump” routine
    Context-level
    Profile file for each thread in context
    Appends to profile file
    Selective event dumping
  Analysis tools access files through shared file system
  Application-level profile “access” routine
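The dump-and-read pattern described above can be sketched in Python. This is not TAU's dump routine or file format: the `dump_profile` and `read_dumps` helpers, the `dump.<node>.<context>.<thread>.jsonl` file name, and the JSON-lines encoding are all assumptions chosen to keep the example short.

```python
import json
import os
import tempfile


def dump_profile(directory, node, context, thread, sample, profile, events=None):
    """Append one profile sample to the file owned by this thread."""
    # Selective event dumping: keep only the requested events, if any.
    if events is None:
        selected = dict(profile)
    else:
        selected = {e: profile[e] for e in events if e in profile}
    path = os.path.join(directory, f"dump.{node}.{context}.{thread}.jsonl")
    with open(path, "a") as handle:  # append, never overwrite
        handle.write(json.dumps({"sample": sample, "profile": selected}) + "\n")
    return path


def read_dumps(path):
    """Analysis side: read every sample appended so far."""
    with open(path) as handle:
        return [json.loads(line) for line in handle if line.strip()]


# A temporary directory stands in for the shared file system.
with tempfile.TemporaryDirectory() as shared_dir:
    path = dump_profile(
        shared_dir, 0, 0, 0, 0, {"compute": 1.0, "MPI_Recv": 0.2}
    )
    dump_profile(
        shared_dir, 0, 0, 0, 1,
        {"compute": 2.5, "MPI_Recv": 0.4},
        events=["MPI_Recv"],
    )
    samples = read_dumps(path)
```

Appending keeps earlier samples available to a reader that falls behind, at the cost of files that grow with the number of dumps.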
14
Online Performance Analysis and Visualization
  [Figure: system diagram; surviving labels: Application, Performance Steering, Performance Visualizer, Performance Analyzer, Performance Data Reader, TAU Performance System, Performance Data Integrator, SCIRun (Univ. of Utah), performance data streams, performance data output, file system, sample sequencing, reader synchronization, accumulated samples]
15
Profile Sample Data Structure in SCIRun
  [Figure: data structure diagram; surviving labels: node, context, thread]
16
Performance Analysis/Visualization in SCIRun
  [Figure: SCIRun program]
17
ParCo 2003 Mini-Symposium: Online Performance Monitoring, Analysis, and Visualization
Uintah Computational Framework (UCF)
  University of Utah
  UCF analysis
    Scheduling
    MPI library
    Components
  500 processes
  Use for online and offline visualization
  Apply SCIRun steering
18
“Terrain” Performance Visualization
19
Scatterplot Displays
  Each point coordinate determined by three values:
    MPI_Reduce
    MPI_Recv
    MPI_Waitsome
  Min/Max value range
  Effective for cluster analysis
    Relation between MPI_Recv and MPI_Waitsome
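One plausible reading of the scatterplot construction is that each thread becomes a 3D point whose coordinates are its times in the three routines, scaled to the min/max range along each axis. The slide does not state how the range is applied, so the normalization below is an assumption, as are the function name `scatter_points` and the sample values.

```python
def scatter_points(profiles, axes):
    """Map each profile to a point scaled to [0, 1] along each axis."""
    lows = [min(p[axis] for p in profiles) for axis in axes]
    highs = [max(p[axis] for p in profiles) for axis in axes]
    points = []
    for profile in profiles:
        point = []
        for axis, low, high in zip(axes, lows, highs):
            if high == low:
                point.append(0.0)  # no spread on this axis
            else:
                point.append((profile[axis] - low) / (high - low))
        points.append(tuple(point))
    return points


axes = ("MPI_Reduce", "MPI_Recv", "MPI_Waitsome")

# Hypothetical per-thread times in seconds.
profiles = [
    {"MPI_Reduce": 1.0, "MPI_Recv": 2.0, "MPI_Waitsome": 4.0},
    {"MPI_Reduce": 3.0, "MPI_Recv": 2.0, "MPI_Waitsome": 8.0},
    {"MPI_Reduce": 2.0, "MPI_Recv": 2.0, "MPI_Waitsome": 6.0},
]

points = scatter_points(profiles, axes)
```

Threads with similar communication behavior land close together in this space, which is what makes the display useful for spotting clusters.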
20
Online Uintah Performance Profiling
  Demonstration of online profiling capability
  Colliding elastic disks
    Test material point method (MPM) code
    Executed on 512 processors
    ASCI Blue Pacific at LLNL
  Example 1 (Terrain visualization)
    Exclusive execution time across event groups
    Multiple time steps
  Example 2 (Bargraph visualization)
    MPI execution time and performance mapping
  Example 3 (Domain visualization)
    Task time allocation to “patches”
21
Example 1 (Event Groups)
22
Example 2 (MPI Performance)
23
Example 3 (Domain-Specific Visualization)
24
ParaProf Framework Architecture
  Portable, extensible, and scalable tool for profile analysis
  Offer “best of breed” capabilities to performance analysts
  Build as profile analysis framework for extensibility
25
ParaProf Profile Display (VTF)
  Virtual Testshock Facility (VTF), Caltech, ASCI Center
  Dynamic measurement, online analysis, visualization
26
Full Profile Display (SAMRAI++)
  512 processes
  Structured AMR toolkit (SAMRAI++), LLNL
27
Evaluation of Experimental Approaches
  Currently only supporting push model
  File system solution for moving performance data
    Is this a scalable solution?
    Robust solution that can leverage high-performance I/O
    May result in high intrusion
    However, does not require IPC
  Should be relatively portable
  Analysis and visualization only run sequentially
28
Possible Improvements
  Profile merging at context level to reduce number of files
    Merging at node level may require explicit processing
  Concurrent trace merging could also reduce files
    Hierarchical merge tree
    Will require explicit processing
  Could consider IPC transfer
    MPI (e.g., used in mpiP for profile merging)
    Create own communicators
    Sockets or PACX between computer server and analyzer
  Leverage large-scale systems infrastructure
  Parallel profile analysis
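The hierarchical merge tree idea can be sketched in Python as a level-by-level reduction. This is a sequential illustration of the tree shape only; in a real system each group merge would run on a different process. The function names `merge_group` and `tree_merge`, the `fanout` parameter, and the sample profiles are hypothetical.

```python
def merge_group(profiles):
    """Combine a small group of profiles by summing time per event."""
    merged = {}
    for profile in profiles:
        for event, seconds in profile.items():
            merged[event] = merged.get(event, 0.0) + seconds
    return merged


def tree_merge(profiles, fanout=2):
    """Merge level by level; return the merged profile and the tree depth."""
    level = list(profiles)
    if not level:
        return {}, 0
    rounds = 0
    while len(level) > 1:
        # Each group could be merged concurrently by a separate process.
        level = [
            merge_group(level[i:i + fanout])
            for i in range(0, len(level), fanout)
        ]
        rounds += 1
    return level[0], rounds


# Eight hypothetical per-process profiles.
profiles = [{"compute": float(rank), "MPI_Recv": 1.0} for rank in range(8)]

merged, rounds = tree_merge(profiles, fanout=2)
wide_merged, wide_rounds = tree_merge(profiles, fanout=4)
```

With a fanout of 2, eight profiles are combined in three rounds instead of seven sequential merges at a single collector; a larger fanout gives a shallower tree at the cost of more work per merge step.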
29
Concluding Remarks
  Interest in online performance monitoring, analysis, and visualization for large-scale parallel systems
  Need to intelligently use
    Benefit from other scalability considerations of the system software and system architecture
    Seen as an extension to the parallel system architecture
    Avoid solutions that have portability difficulties
  In part, this is an engineering problem
    Need to work with the system configuration you have
    Need to understand if approach is applicable to problem
    Not clear if there is a single solution
30
Future Work
  Build online support in TAU performance system
    Extend to support PULL model capabilities
    Develop hierarchical data access solutions
  Performance studies of full system
    Latency analysis
    Bandwidth analysis
  Integration with other performance tools
    System performance monitors
    ParaProf parallel profile analyzer
  Development of 3D visualization library
    Portability focus