Published by Devon Viles. Modified over 10 years ago.
1
MATE: Monitoring, Analysis and Tuning Environment
Anna Morajko, Tomàs Margalef and Emilio Luque
Universitat Autònoma de Barcelona
Paradyn/Condor Week 2004, April 2004
2
Content
1. Introduction
2. Dynamic Performance Tuning
3. MATE
4. Tuning Techniques
5. Conclusions and future work
3
Introduction: Application performance
Demand for high-performance computation
The main goal of parallel/distributed applications: solve a given problem as fast as possible
Performance is one of the most important issues
Developers must optimize application performance to provide efficient and useful applications
4
Introduction: Application performance optimization
Steps: monitoring, analysis, tuning
[Diagram: manual optimization cycle. Labels: application development, source, application, monitored execution, instrumentation, measurements, monitoring, performance data, performance analysis, bottlenecks, solutions, source code relation, tuning, modifications, changes]
5
Introduction: Application performance optimization
Finding bottlenecks and determining their solutions is difficult for parallel/distributed applications
  – Many tasks that cooperate with each other
A high degree of expertise is required
Application behavior may change depending on input data or environment
A difficult task, especially for non-expert users
6
Introduction: Our goals
Investigate whether the performance of parallel/distributed applications can be optimized dynamically, without user intervention
Investigate the applicability of dynamic tuning
Create a tool that is able to dynamically optimize applications:
  – automatically improve application performance
  – improve the application execution during run time
  – tune without recompiling and rerunning
  – adapt the application to existing conditions
Evaluate the profitability of dynamic tuning in practice
7
Introduction: Dynamic automatic tuning
[Diagram: optimization cycle in which a tool performs monitoring, performance analysis and tuning. Labels: user, application development, source, application, execution, instrumentation, events, monitoring, performance data, performance analysis, problem/solution, tuning, modifications, tool]
8
Content
1. Introduction
2. Dynamic Performance Tuning
3. MATE
4. Tuning Techniques
5. Conclusions and future work
9
Dynamic Performance Tuning: Requirements
No user intervention
No source recompilation
Performance analysis on the fly
  – Global analysis
  – Decisions taken in a short time
  – No complex analysis or modifications
Run-time monitoring
Run-time tuning
  – Modifications performed carefully
Parallel/distributed application control
Low intrusion
10
Dynamic Performance Tuning: Key question
What can be tuned in an application?
Application knowledge
Limited information about the application
Tuning layers
Approaches to tuning
11
Dynamic Performance Tuning: Tuning layers
Application-specific code
Standard and custom libraries (API + code)
Operating system libraries (API + code)
[Diagram: layer stack. Labels: hardware, operating system kernel, OS API, libraries code, API, application code]
12
Dynamic Performance Tuning: Tuning layers
Application
  Application code changes
  – Different bottlenecks that depend on the application implementation
Libraries
  Library code changes
  API usage
  – Standard C/C++ library -> memory management, dynamic containers
  – Custom PVM, MPI -> communication
OS
  Kernel code changes
  API usage
  – Adjustment of options (e.g. TCP/IP socket), I/O request grouping
[Diagram: layer stack (hardware, operating system kernel, OS API, libraries code, API, application code), annotated "more bottlenecks common to a wider group of applications"]
13
Dynamic Performance Tuning: Approaches to tuning
Cooperative
  – Application must be prepared for tuning
  – Application-specific knowledge is provided
Automatic (black-box)
  – Tuning of any application
  – No application-specific knowledge is required
  – Knowledge about the bottleneck is required
  – No changes are introduced into the application source code
[Diagram: layer stack (hardware, operating system kernel, OS API, libraries code, API, application code), annotated "more automatic, more generic information available" and "more cooperative, more application-specific"]
14
Dynamic Performance Tuning: Knowledge representation
Measure points
  – Where the instrumentation must be inserted to provide measurements
Performance model
  – Determines the minimal execution time of the entire application
Tuning points/actions/synchronization
  – What can be changed in the application, and when
  – point – element that may be changed
  – action – what to invoke on a point
  – synchronization – when a tuning action can be invoked to ensure application correctness
[Diagram labels: formulas and conditions for optimal behavior; measurements; optimal values]
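The three kinds of knowledge listed on this slide can be pictured as a small data structure. The sketch below is only an illustration: every type, field and function name is hypothetical, and the toy model (lower a variable called `factor` when a measured idle fraction exceeds 10%) is an assumption, not MATE's actual model.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// All names below are hypothetical and chosen only for illustration.

// Measure point: where instrumentation is inserted to obtain a measurement.
struct MeasurePoint {
    std::string function;  // function to instrument
    bool atEntry;          // true: function entry, false: function exit
};

// Tuning point/action/synchronization: what to change, how, and when.
struct TuningAction {
    std::string point;   // element that may be changed (e.g. a variable)
    std::string action;  // what to invoke on the point
    std::string sync;    // when the change can be applied safely
    double value;        // value computed by the performance model
};

using Measurements = std::map<std::string, double>;

// Knowledge about one performance problem: measure points plus a
// performance model that maps measurements to tuning actions.
struct TuningKnowledge {
    std::vector<MeasurePoint> measurePoints;
    std::function<std::vector<TuningAction>(const Measurements&)> model;
};

// Toy knowledge (assumed): react when the measured idle fraction is too high.
TuningKnowledge makeToyKnowledge() {
    TuningKnowledge k;
    k.measurePoints.push_back({"distribute_work", true});
    k.model = [](const Measurements& m) {
        std::vector<TuningAction> actions;
        auto it = m.find("idle_fraction");
        if (it != m.end() && it->second > 0.10) {
            actions.push_back(
                {"factor", "set variable", "before next iteration", 0.5});
        }
        return actions;
    };
    return k;
}
```

Keeping the model as a plain function of the measurements keeps the analysis simple, which fits the requirement that decisions be taken in a short time.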
15
Dynamic Performance Tuning: Application knowledge
Measure points
Performance model
Tuning point, action, sync
Provided by the user
Provided automatically by a tuning system
[Diagram: layer stack. Labels: hardware, operating system kernel, OS API, libraries code, API, application code]
16
Dynamic Performance Tuning: Manipulation of a running application
monitoring – collect information about the behavior of a running application
tuning – insert tuning code into a running application to improve its performance
Dynamic instrumentation – DynInst
17
Dynamic Performance Tuning: Dynamic modifications of a running application with DynInst
Function replacement
Function invocation
One-time function invocation
Function call elimination
Function parameter changes
Variable changes
18
Content
1. Introduction
2. Dynamic Performance Tuning
3. MATE
4. Tuning Techniques
5. Conclusions and future work
19
MATE
MATE – Monitoring, Analysis and Tuning Environment
  – prototype implementation in C++
  – for PVM-based applications
  – Sun Solaris 2.x / SPARC
20
MATE
[Diagram: MATE architecture across Machine 1, Machine 2 and Machine 3. Labels: pvmd, AC, DMLib, Task 1, Task 2, Task 3, Analyzer; arrows labeled instr., events, modif.]
Application Controller – AC
Dynamic Monitoring Library – DMLib
Analyzer
21
MATE: Application Controller
Services
Distributed application control
  – Startup/exit of tasks (Tasker)
  – Startup/exit of PVM daemons, slave ACs (Hoster)
  – Clock synchronization
Application model management (Task Manager)
Performance monitoring (Monitors)
  – Manage monitoring instrumentation
  – Provide monitoring API for Analyzer
Performance tuning (Tuners)
  – Manage tuning instrumentation
  – Provide tuning API for Analyzer
22
MATE: Application Controller
Monitors
Instrumentation management via DynInst
  – Dynamically load DMLib
  – Generate monitoring snippets that call appropriate library functions
  – Insert/remove snippets in/from requested points
API
  – AddEventTrace(tid, eventId, funcName, instrPlace, attrs)
  – RemoveEventTrace(tid, eventId)
[Diagram: the Analyzer on Machine 2 sends add event/remove event requests to the AC Monitor on Machine 1, which instruments Task 1 and Task 2 (with DMLib) via DynInst]
23
MATE: Application Controller
Tuners
Tuning via DynInst
  – Generate tuning snippet according to the request
  – Insert tuning snippet
API
  – LoadLibrary(tid, path)
  – SetVariableValue(tid, params, brkpt)
  – ReplaceFunction(…)
  – InsertFunctionCall(…)
  – OneTimeFunctionCall(…)
  – RemoveFunctionCall(…)
  – FunctionParamChange(…)
[Diagram: the Analyzer on Machine 2 sends apply tuning requests to the AC Tuner on Machine 1, which tunes Task 1 and Task 2 via DynInst]
24
MATE: Dynamic Monitoring Library
Services
Register event
  – What – event type (id, place)
  – When – global timestamp
  – Where – task identifier
  – Requested attributes – e.g. function call parameters, return value
Deliver event to the Analyzer
API
  – DMLib_InitLogger(tid, analyzerHost, port, clockDiff)
  – DMLib_OpenEvent(id, nAttrs)
  – DMLib_AddIntAttr(value)
  – DMLib_AddFloatAttr(value)
  – DMLib_AddCharAttr(value)
  – DMLib_AddStringAttr(value)
  – DMLib_CloseEvent()
  – DMLib_DoneLogger()
[Diagram: on Machine 1, a snippet inserted into pvm_send(p1, p2) in Task 1 calls DMLib_OpenEvent(), DMLib_AddIntAttr() and DMLib_CloseEvent(); DMLib (API and implementation) delivers the event to the Analyzer over TCP/IP. Labels on the event: entry 1 0 64884 524247 262149 1]
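The open/add/close sequence named on this slide can be illustrated with a small mock. The function names come from the slide; everything else is an assumption made for the sketch: the parameter types, the `bool` return of `DMLib_CloseEvent`, the record layout (id, tid, timestamp, attributes), the sign convention of `clockDiff`, the fake local clock, and the in-memory vector that stands in for the TCP/IP connection to the Analyzer. `onSendEntry` is a hypothetical stand-in for an inserted snippet.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Mock logger state. Nothing here reflects the real DMLib internals.
struct LoggerState {
    int tid = 0;
    long clockDiff = 0;   // assumed: offset added to local time
    long localClock = 0;  // fake local clock, set by the caller in this mock
    int expected = 0;     // attributes announced in DMLib_OpenEvent
    int added = 0;        // attributes actually added
    std::ostringstream record;
    std::vector<std::string> delivered;  // stands in for the Analyzer socket
};
static LoggerState g;

void DMLib_InitLogger(int tid, const std::string& analyzerHost, int port,
                      long clockDiff) {
    (void)analyzerHost;  // no network in this mock
    (void)port;
    g.tid = tid;
    g.clockDiff = clockDiff;
}

void DMLib_OpenEvent(int id, int nAttrs) {
    g.record.str("");
    g.record.clear();
    g.expected = nAttrs;
    g.added = 0;
    // What (event id), where (task id), when (global timestamp).
    g.record << id << ' ' << g.tid << ' ' << (g.localClock + g.clockDiff);
}

void DMLib_AddIntAttr(int value) {
    g.record << ' ' << value;
    ++g.added;
}

// Returns false if the attribute count does not match the announcement.
bool DMLib_CloseEvent() {
    if (g.added != g.expected) return false;
    g.delivered.push_back(g.record.str());
    return true;
}

// What a snippet inserted at the entry of a send routine might do.
void onSendEntry(int dest, int msgtag) {
    DMLib_OpenEvent(1, 2);
    DMLib_AddIntAttr(dest);
    DMLib_AddIntAttr(msgtag);
    DMLib_CloseEvent();
}
```

Stamping each event with a global timestamp (local time corrected by the clock difference) is what allows the Analyzer to order events coming from different machines.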
25
MATE: Analyzer
Services
Automatic performance analysis on the fly
  – Request events
  – Collect incoming events
  – Find bottlenecks among events by applying the performance model
  – Find solutions that overcome the bottlenecks
  – Send tuning requests
The Analyzer is provided with application knowledge about performance problems
The information related to one problem is called a tuning technique
A tuning technique describes a complete performance optimization scenario
26
MATE: Analyzer
Tunlets
Each technique is implemented in MATE as a tunlet
A tunlet contains specific code (analysis logic) related to one concrete performance problem
  – measure points – what events are needed
  – performance model – how to determine bottlenecks and solutions
  – tuning actions/points/synchronization – what to change, where, when
A tunlet is a C/C++ library dynamically loaded into the Analyzer process
[Diagram: a tunlet inside the Analyzer, made up of measure points, performance model, and tuning point, action, sync]
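One way to picture a tunlet is as a plug-in that tells the Analyzer which events it needs and turns incoming events into tuning requests. The interface below is a hypothetical sketch; the slides do not show MATE's actual tunlet API, so the class names, the `Event` and `TuningRequest` types, and the idle-time rule are all assumptions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical types; not MATE's real interface.
struct Event {
    int id;        // event type
    int tid;       // task that produced the event
    double value;  // one numeric attribute, for simplicity
};

struct TuningRequest {
    int tid;               // task to tune
    std::string variable;  // tuning point
    double newValue;       // value to set
};

class Tunlet {
public:
    virtual ~Tunlet() = default;
    // Measure points: which event ids the Analyzer should request.
    virtual std::vector<int> requiredEvents() const = 0;
    // Performance model: consume one event, possibly emit tuning requests.
    virtual std::vector<TuningRequest> onEvent(const Event& e) = 0;
};

constexpr int kIdleEvent = 1;  // assumed event id for "idle time measured"

// Toy tunlet: request a new factor when a task reports too much idle time.
class IdleTimeTunlet : public Tunlet {
public:
    explicit IdleTimeTunlet(double threshold) : threshold_(threshold) {}

    std::vector<int> requiredEvents() const override {
        return std::vector<int>{kIdleEvent};
    }

    std::vector<TuningRequest> onEvent(const Event& e) override {
        std::vector<TuningRequest> out;
        if (e.id == kIdleEvent && e.value > threshold_) {
            out.push_back({e.tid, "factor", 0.5});
        }
        return out;
    }

private:
    double threshold_;
};
```

Because the Analyzer only depends on the abstract interface, a new performance problem can be covered by loading another tunlet without changing the Analyzer itself.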
27
MATE: Analyzer
[Diagram: Analyzer internals. Components: Event Collector thread, Event Repository, Application model, DTAPI, Controller, AC Proxy, tunlets. Inputs: events (from DMLibs) via TCP/IP; metadata (from ACs) via TCP/IP. Outputs: instrumentation requests (to monitor) via TCP/IP; tuning requests (to tuner) via TCP/IP]
28
Content
1. Introduction
2. Dynamic Performance Tuning
3. MATE
4. Tuning Techniques
5. Conclusions and future work
29
Tuning techniques
Catalog (set of tuning techniques)
OS
  – Message aggregation
  – Send/receive TCP/IP buffer size
Standard library
  – Memory allocation
PVM library
  – Communication mode
  – Data encoding mode
  – Message fragment size
Application
  – Workload balancing
  – Number of workers
[Diagram annotations: automatic approach, cooperative approach]
30
Tuning Example: Workload balancing (application layer)
Imbalance problem:
  – Heterogeneous computing and communication power
  – Varying amount of distributed work
Goal:
  – Minimize idle time by balancing the work among the processes, taking machine efficiency into account
Balancing -> faster machines process more work than slower ones
The work cannot be balanced statically before program execution (different input data, network load, machine power and load)
31
Tuning Example: Workload balancing (application layer)
Many scheduling methods -> Factoring Scheduling method
  – Work is divided into different-size tuples according to the factor
Application must be tunable:
  – a well-known variable represents the factor
  – the factor must be checked before each iteration of the work distribution
  – the work tuples are calculated using the factoring scheduling method and according to the current factor value
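A minimal sketch of a tunable factoring loop follows. It assumes one common formulation of factoring, in which each batch gives every worker a tuple of size ceil(factor × remaining / workers); the slides do not give the exact formula used in the experiments. The function name `distributeWork` and the `readFactor` callback are hypothetical; the callback stands in for re-reading the tunable variable before each iteration, so a value changed at run time takes effect on the next batch.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

// Split `total` work units into tuples. Before every batch the current
// factor is re-read through `readFactor`, so an external tuner can change it
// while the distribution is in progress. Assumed formula, see lead-in.
std::vector<int> distributeWork(int total, int workers,
                                const std::function<double()>& readFactor) {
    std::vector<int> tuples;
    if (workers <= 0) return tuples;  // nothing to distribute to

    int remaining = total;
    while (remaining > 0) {
        double factor = readFactor();  // checked before each iteration
        int size = static_cast<int>(std::ceil(factor * remaining / workers));
        if (size < 1) size = 1;  // always make progress

        // One batch: one tuple of the current size per worker.
        for (int w = 0; w < workers && remaining > 0; ++w) {
            int give = size < remaining ? size : remaining;
            tuples.push_back(give);
            remaining -= give;
        }
    }
    return tuples;
}
```

With a constant factor of 0.5, 100 units and 2 workers, the tuple sizes shrink batch by batch (25, 25, 13, 13, 6, 6, ...). A smaller factor yields more, smaller tuples, which evens out load at the price of more distribution steps; a larger factor does the opposite.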
32
Tuning Example: Example application
Forest fire propagation – Xfire
High computation cost
Scenarios and benefits:
  1) homogeneous and dedicated: up to 2%
  2) heterogeneous and dedicated: up to 49%
  3) heterogeneous and non-dedicated: up to 48%
33
Content
1. Introduction
2. Dynamic Performance Tuning
3. MATE
4. Tuning Techniques
5. Conclusions and future work
34
Conclusions
The principal conclusion: dynamic tuning works; it is applicable, effective and useful under certain conditions
Limits of such tuning -> incomplete application information
Classification of layers where tuning can be performed (OS, libraries, applications)
Approaches to tuning: automatic and cooperative
Application knowledge representation:
  – measure points, performance model, tuning point/action/sync
35
Conclusions
A working prototype environment, MATE, automatically monitors, analyzes and tunes running applications
Practical experiments conducted with MATE on parallel/distributed applications show that it automatically adapts application behavior to existing conditions at run time
36
Future work
Global and local analysis
  – Scalability (problems with global analysis)
  – Some problems can be treated locally
Performance analysis
  – How tuning techniques influence other techniques
  – Approaches other than a performance model
Metrics
  – Complementary information provided by metrics
Provision of the application knowledge
  – Tunlet provided externally in a declarative manner
Instrumentation evaluation
  – Prediction of monitoring and tuning instrumentation cost
37
Future work
Tuning techniques
  – OS layer
      TCP/IP options (e.g. sending without delay – Nagle's algorithm)
      I/O operations (e.g. read/write operations, I/O buffer size)
  – Library layer
      Investigation of problems in MPI, numerical libraries
  – Application layer
      Automatic selection of algorithm (e.g. sorting algorithm)
Recommendations
  – Provision of a good explanation to the user
Towards grid
38
Thesis, March 2004
Thank you very much