CEPBA-Tools experiences with MRNet and Dyninst
Judit Gimenez, German Llort, Harald Servat
Paradyn Week, April-May 2007

Outline
- CEPBA-Tools environment
- OpenMP instrumentation using Dyninst
- Tracing control through MRNet
- Our wish list

Where we live
- Traceland …
- … aiming at detailed analysis and flexibility in the tools

Importance of details
- Variance is important
  - Along time
  - Across processors
- Highly non-linear systems
- Microscopic effects are important
  - May have large macroscopic impact

CEPBA-Tools
[Figure: tool environment diagram. Labels: MPtrace, OMPItrace, MPIDtrace, TraceDriver, Java, WAS, GT4, JIS, Nanos Compiler, AIXtrace and aixtrace2prv, LTTtrace and LTT2prv, GPFStrace and GPFS2prv, trace2trace, data display tools. Files: .prv, .pcf, .cfg (Paraver) and .trf (Dimemas). Analysis tools: Paraver, Paramedir, Dimemas. Sample metrics table: Aaa miss ratio 0.8; Bbb IPC 0.5; Ccc Efficiency 0.4; Ddd bandwidth 520.]

CEPBA-Tools challenge
- What can we say about an unknown application/system in a short time, without looking at the source code?

OpenMP instrumentation
- OMPtrace: instrumentation of OpenMP
- Insight on: application, Run Time, scheduling
- Based on
  - DiTools (SGI/Irix): only calls to dynamic libraries
  - DPCL (IBM/AIX): functions and calls referenced within the binary
  - Dyninst (Itanium): functions and calls referenced within the binary
  - LD_PRELOAD (some Linux): only calls to dynamic libraries
- “Evolution” through the available platforms, except for Itanium (NASA-AMES request)

OpenMP compilation and Run Time

Source program:
    Call A
    A() {
      !$omp parallel do
      do I=1,N
        loop body
      enddo
    }

Compiler generated:
    Call A
    A() {
      kmpc_fork_call
    }
    _A_LN_par_regionID {
      do I=start,end
        loop body
      enddo
    }

libomp:
    Idle() { ... }
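
The transformation on this slide can be illustrated with a minimal Python sketch. It is not the compiler's or libomp's actual code: `fork_call`, `static_chunks` and `a_par_region` are hypothetical stand-ins for the run-time fork, the iteration split and the outlined region, and the contiguous static split is an assumption (real run-time scheduling may differ).

```python
import threading

def static_chunks(n, num_threads):
    """Split iterations 1..n into contiguous (start, end) ranges, one per thread."""
    base, extra = divmod(n, num_threads)
    chunks = []
    start = 1
    for t in range(num_threads):
        size = base + (1 if t < extra else 0)
        # When size is 0 the range is (start, start - 1), i.e. an empty loop.
        chunks.append((start, start + size - 1))
        start += size
    return chunks

def fork_call(outlined, n, num_threads):
    """Hypothetical stand-in for the run-time fork: run the outlined
    region once per thread on its own (start, end) range, then join."""
    threads = [threading.Thread(target=outlined, args=chunk)
               for chunk in static_chunks(n, num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

def A(data, num_threads=4):
    """Plays the role of A(): the loop body has been moved into an outlined region."""
    out = [0] * len(data)

    def a_par_region(start, end):
        # Outlined loop: do I=start,end ... enddo (1-based, inclusive bounds).
        for i in range(start, end + 1):
            out[i - 1] = 2 * data[i - 1]

    fork_call(a_par_region, len(data), num_threads)
    return out
```

Instrumenting the call to the fork routine and the outlined function is what exposes the parallel region to the tracing library, since both are visible as calls in the binary.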

OpenMP instrumentation points
Events emitted along the main thread timeline while executing A(), kmpc_fork_call and _A_LN_par_regionID:
- USR_FCT, idA; HWC i, Delta
- OMP_PAR (Fork/join)
- PAR_FCT, A_LN_par_regionID; HWC i, Delta
- PAR_FCT, 0; HWC i, Delta
- OMP_PAR (Fork/join)
- USR_FCT, 0; HWC i, Delta
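
The entry/exit pattern in this timeline (an event carrying an identifier at entry and the value 0 at exit, each with a counter delta) can be sketched as follows. This is only an illustration: the `Tracer` class and the injected `read_counter` callable are hypothetical, and the value 1 used for OMP_PAR at the fork is an assumption, since the slide does not show it.

```python
from contextlib import contextmanager
import itertools

class Tracer:
    """Collects (event type, value, counter delta) records in order."""

    def __init__(self, read_counter):
        # read_counter is a hypothetical stand-in for a hardware counter read.
        self.read_counter = read_counter
        self.last = read_counter()
        self.events = []

    def emit(self, ev_type, value):
        now = self.read_counter()
        # Delta = counter increase since the previous emitted event.
        self.events.append((ev_type, value, now - self.last))
        self.last = now

    @contextmanager
    def region(self, ev_type, value):
        self.emit(ev_type, value)   # entry: identifier of the region
        try:
            yield
        finally:
            self.emit(ev_type, 0)   # exit: value 0 closes the region

def run_a(tracer):
    """Replays the nesting of the slide: user function, fork/join, outlined region."""
    with tracer.region("USR_FCT", "idA"):
        with tracer.region("OMP_PAR", 1):  # value 1 is an assumption
            with tracer.region("PAR_FCT", "A_LN_par_regionID"):
                pass  # loop body would run here
    return [(ev_type, value) for ev_type, value, _ in tracer.events]
```

With a fake counter such as `itertools.count().__next__`, `run_a` yields the six records in the same order as the timeline.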

CEPBA-Tools: the issue
- Sufficient information / sufficiently detailed
- Usable by the presentation tool
- The environment evolution
  - from few processes to instrumenting hours of execution
  - including more and more information: hardware counters, call stack, network counters, system resource usage, MPI collective internals...
  - ...from traces of a few MB to hundreds of GB

Scalability of tracing
- Techniques for achieving scalability
  - User specified on/off
  - Limit file size (stop when reached, circular buffer)
  - Only computing burst + counters + statistics
  - Library summarization (software counters: MPI_Iprobe / MPI_Test)
  - Trace2trace utilities
  - Partial views
- ... autonomic tracing library
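
The two size-limiting policies named above (stop when the limit is reached, or keep a circular buffer) can be contrasted in a small sketch. It is an illustration only, not the tracing library's code; `BoundedTrace` and its fields are hypothetical names, and the limit is counted in records rather than bytes for simplicity.

```python
from collections import deque

class BoundedTrace:
    """Record store with a fixed capacity and one of two overflow policies."""

    def __init__(self, capacity, mode="circular"):
        if mode not in ("circular", "stop"):
            raise ValueError("mode must be 'circular' or 'stop'")
        self.capacity = capacity
        self.mode = mode
        # A deque with maxlen silently discards the oldest entry when full.
        self.records = deque(maxlen=capacity) if mode == "circular" else []
        self.dropped = 0

    def append(self, record):
        """Store a record; return False if it was rejected."""
        full = len(self.records) >= self.capacity
        if full and self.mode == "stop":
            self.dropped += 1       # newest record is lost
            return False
        if full:
            self.dropped += 1       # oldest record is about to be overwritten
        self.records.append(record)
        return True
```

The "stop" policy keeps the beginning of the run, while the circular buffer keeps the most recent window, which is what allows a later decision about what to flush.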

MPItrace + MRNet
[Figure: deployment diagram. Labels: user, login node.]

First target with MRNet
- A real problem scenario on MareNostrum
  - some large runs occasionally show severely degraded collectives
  - instrumenting the full run, including details of the collectives implementation, would produce a huge trace
- Solution: MPItrace + MRNet
  - control which information is flushed to disk
  - discard all the details except those related to large collectives

Implementation
- Instrumenting on a circular buffer
- Periodically
  - the MRNet front-end requests information on the duration of the collectives
  - the “spy” thread stops the main thread and analyzes the tracing buffer
    - collects information on the collectives
    - sends details on the range and duration
  - the root sends back a selection mask
  - the “spy” thread flushes the selected data to disk and resumes the application
[Figure: ranges of collectives held in the buffers (i … i+n, i … i+m) and the values sent back by the root.]
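
The round trip described above can be sketched in plain Python, without any MRNet calls. Everything here is an assumption made for illustration: the record layout (`kind`, `index`, `duration_ms`), the function names, and the rule that the root selects a collective when its longest reported duration reaches the limit.

```python
def summarize(buffer):
    """Back-end side: map each collective index in the buffer to its duration (ms)."""
    return {rec["index"]: rec["duration_ms"]
            for rec in buffer if rec["kind"] == "collective"}

def build_mask(summaries, limit_ms):
    """Root side: select the collectives whose longest duration over all
    processes is at least limit_ms (assumed selection rule)."""
    longest = {}
    for summary in summaries:
        for index, duration in summary.items():
            longest[index] = max(duration, longest.get(index, 0.0))
    return {index for index, duration in longest.items() if duration >= limit_ms}

def flush_selected(buffer, mask):
    """Back-end side: keep only records that belong to a selected collective."""
    return [rec for rec in buffer if rec["index"] in mask]

# Example with two processes and two collectives (made-up durations).
buf0 = [
    {"kind": "collective", "index": 0, "duration_ms": 2.0},
    {"kind": "detail", "index": 0},
    {"kind": "collective", "index": 1, "duration_ms": 40.0},
    {"kind": "detail", "index": 1},
]
buf1 = [
    {"kind": "collective", "index": 0, "duration_ms": 3.0},
    {"kind": "collective", "index": 1, "duration_ms": 10.0},
]
mask = build_mask([summarize(buf0), summarize(buf1)], limit_ms=35.0)
kept0 = flush_selected(buf0, mask)
kept1 = flush_selected(buf1, mask)
```

In this example only collective 1 is selected, and both processes flush their records for it, including the process on which it was short. Keeping the same collectives on every process is one possible design choice for a comparable trace; the slides do not state which rule is used.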

First traces – CPMD
[Figure: three traces]
- 245 MB, >15500 collectives
- <1 MB, <85 collectives
- 25 MB, <85 collectives
- LIMIT >= 35 ms

First traces – MRNet front-end analysis

Next steps for MPItrace+MRNet
- Analysis of MRNet
  - Evaluate the impact of topology / mapping
- Library control: maximum information, minimum data
  - Automatic switching driven by on-line analysis
    - Tracing level, type of data (counters set, instr. points), on/off
  - Clustering, periodicity detection

Our wish list
- Dyninst
  - Support for MPI+OpenMP instrumentation
  - Available for PowerPC
- MRNet
  - Automatically compute the best topology based on available resources
    - maybe considering user preferences about mapping, dispersion degree (fan-out)...
  - Improve MRNet integration with MPI applications
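
As a rough illustration of the topology wish, the sketch below derives the shape of a balanced tree from the number of back-ends and a preferred fan-out. It is not an MRNet feature or API: `tree_levels` is a hypothetical helper, and a balanced tree is only one possible notion of "best" topology that ignores host mapping.

```python
def tree_levels(num_backends, fan_out):
    """Return the number of processes per tree level, front-end first,
    so that no parent has more than fan_out children."""
    if num_backends < 1 or fan_out < 2:
        raise ValueError("need at least 1 back-end and a fan-out of 2 or more")
    levels = [num_backends]
    while True:
        parents = -(-levels[-1] // fan_out)  # ceiling division
        levels.append(parents)
        if parents == 1:                     # reached the front-end
            break
    return levels[::-1]

def internal_processes(levels):
    """Extra communication processes between the front-end and the back-ends."""
    return sum(levels[1:-1])
```

For example, 512 back-ends with a fan-out of 8 give the levels 1, 8, 64, 512, that is, 72 intermediate processes that the available resources would have to host.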