Published by Francis Burke. Modified over 9 years ago.
1
TRACEREP: GATEWAY FOR SHARING AND COLLECTING TRACES IN HPC SYSTEMS
Iván Pérez, Enrique Vallejo, José Luis Bosque
University of Cantabria
TraceRep, IWSG'15
2
Overview
- HPC Traces - Introduction
  - Traces for Application Developers
  - Traces for Computer Architects
  - Traces - Objections
  - Goals
- BSC Trace Tools
  - Extrae
  - Paraver
- TraceRep
  - Architecture
  - Design
  - Implementation
  - Limitations
  - Snapshots
- Conclusions and Future Work
3
1. HPC Traces – Introduction
HPC traces are sequences of events and messages recorded during the execution of a parallel HPC program.
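To make the definition concrete, here is a minimal Python sketch of a trace as a list of timestamped events. The field names and event kinds are hypothetical, chosen only for illustration; they are not the format used by Extrae, Paraver or TraceRep.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEvent:
    """One record of a trace (hypothetical layout, for illustration only)."""
    time_ns: int    # timestamp of the event
    rank: int       # process that recorded the event
    kind: str       # e.g. "compute_begin", "send", "recv" (made-up labels)
    peer: int = -1  # partner rank for messages, -1 otherwise
    size: int = 0   # message size in bytes, 0 for non-message events


def events_per_rank(trace):
    """Group a trace by rank, keeping each rank's events in time order."""
    grouped = defaultdict(list)
    for event in sorted(trace, key=lambda e: e.time_ns):
        grouped[event.rank].append(event)
    return dict(grouped)


trace = [
    TraceEvent(0, 0, "compute_begin"),
    TraceEvent(50, 0, "send", peer=1, size=1024),
    TraceEvent(0, 1, "compute_begin"),
    TraceEvent(80, 1, "recv", peer=0, size=1024),
]
by_rank = events_per_rank(trace)
```

Grouping by rank is only one possible view; a real trace format also records states, hardware counters and communicator information.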
4
1.1. Traces for Application Developers
Evaluation, tuning and optimization of applications.
[Figure labels: Computation, Synchronization waits, Point-to-point messages, Load imbalance]
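As an illustration of the kind of tuning information a trace can yield, the sketch below computes a simple load-balance ratio from per-rank computation times. The metric (mean over maximum) is a commonly used one and an assumption here; the slides do not state which metric the authors use, and the input values are made up.

```python
def load_balance(compute_time_per_rank):
    """Return mean/max of per-rank computation time.

    1.0 means perfectly balanced work; lower values mean some ranks
    sit idle while the slowest rank finishes.
    """
    times = list(compute_time_per_rank)
    if not times or max(times) <= 0:
        raise ValueError("need at least one positive computation time")
    return (sum(times) / len(times)) / max(times)


# Made-up computation times (seconds) for four ranks.
times = [8.0, 10.0, 6.0, 8.0]
lb = load_balance(times)  # mean is 8.0, max is 10.0
```

In practice the per-rank times would be derived from the computation intervals recorded in the trace.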
5
1.2. Traces for Computer Architects
Evaluate computer architectures.
Workloads for feeding simulators.
[Figure labels: Application Binaries, Application Execution, Extraction Tool, Simulator, Hardware model 1 to 3, Stats 1 to 3]
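The idea of feeding one workload to a simulator under several hardware models can be sketched as follows. This is a toy latency/bandwidth cost model, not the simulator the authors use; the class, the function and all parameter values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareModel:
    """Toy network model (hypothetical parameters)."""
    name: str
    latency_s: float        # fixed cost per message, in seconds
    bandwidth_bps: float    # bytes per second


def replay(message_sizes, model):
    """Estimate total communication time of the traced messages."""
    return sum(model.latency_s + size / model.bandwidth_bps
               for size in message_sizes)


# Message sizes (bytes) as they might be read from a trace.
message_sizes = [1_000_000, 2_000_000]
models = [
    HardwareModel("model 1", latency_s=1e-6, bandwidth_bps=1e9),
    HardwareModel("model 2", latency_s=1e-6, bandwidth_bps=1e10),
]
# One statistic per hardware model, from the same trace.
stats = {m.name: replay(message_sizes, m) for m in models}
```

The same trace is reused for every model, which is what makes traces attractive as simulator workloads: the application only has to be executed once.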
6
1.3. Traces - Objections
- Complexity of tools and environment.
- Limited access to HPC clusters.
- Traces can reach very large sizes.
- Traces are often not shared between researchers:
  - Traces are hard to obtain and distribute.
  - The tracing effort is not recognized.
7
1.4. TraceRep - Goals
- User-friendly interface to collect traces.
  - Support for multiple clusters.
  - Easy to incorporate new clusters.
- Public trace repository.
  - Computer architects can access traces of parallel applications for their experiments.
  - Users can upload their own traces for the community.
- Author encouragement:
  - Authorship: users can set Creative Commons licenses that protect the authorship of their traces.
  - Citation of related work: users can add a citation (.bib file) of a paper that studied the traced application, so it can be cited when the trace is used.
9
2.1. Extrae
- Collects information during program execution and generates traces: runtime entries and exits, hardware counters, user functions, periodic samples…
- Supported programming models: MPI, OpenMP, CUDA, OpenCL, pthreads, OmpSs, Java, Python.
- Supported platforms: Linux clusters, BlueGene/Q, Cray, NVIDIA GPUs, Intel Xeon Phi, ARM, Android.
[Figure: Extrae configuration file]
10
2.2. BSC Tools - Paraver
A very flexible visualization tool for trace files.
12
3.1. TraceRep - Architecture
13
3.2. TraceRep - Design
14
3.3. TraceRep - Implementation
- Drupal's modules covered most of the features.
- The trace extraction service is implemented on both sides:
  - Gateway side: a new Drupal module.
  - Cluster side: Python scripts adapted to the specific cluster.
[Figure labels: Drupal, Cluster, Trace Extraction, Experiment, Periodic Task, Cluster Filesystem, TraceRep directory, Compilation Tools, Extrae, Resource Manager, Makefile, Scripts, "Is the experiment over?"]
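The periodic task that asks "Is the experiment over?" can be sketched as a simple polling loop. This is an assumption about how such a check could look, not the TraceRep code: the function name, its parameters and the result file name are hypothetical, and the real gateway side is a Drupal module.

```python
import time


def wait_for_experiment(is_over, collect, interval_s=60.0,
                        max_checks=100, sleep=time.sleep):
    """Poll until the experiment finishes, then collect its trace.

    is_over: callable returning True once the cluster job has ended.
    collect: callable that fetches the trace and returns a handle to it.
    Returns None if the experiment is still running after max_checks.
    """
    for _ in range(max_checks):
        if is_over():
            return collect()
        sleep(interval_s)
    return None


# Simulated experiment that finishes on the third check.
states = iter([False, False, True])
result = wait_for_experiment(
    is_over=lambda: next(states),
    collect=lambda: "trace.prv",   # hypothetical result file name
    interval_s=0,
    sleep=lambda seconds: None,    # skip real waiting in this demo
)
```

Passing `sleep` as a parameter keeps the loop testable; a deployed version would query the cluster's resource manager inside `is_over`.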
15
3.4. TraceRep – Current prototype limitations
- Security: TraceRep users upload code to the HPC clusters.
  - Alternatives:
    - Restricted privileges for the TraceRep user account.
    - Require a per-user cluster account to extract traces.
- Compilation:
  - Paths to compilers and libraries can vary from cluster to cluster.
  - Compilation constraints: a generic Makefile is currently used for all source codes. Applications that use complex build tools are currently not supported.
  - Alternative: provide a unified environment for compilation.
- Storage: storage in the gateway server is limited (a limitation of the service used).
  - Alternative: $$$
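One way to cope with compiler and library paths that differ per cluster while keeping a single generic Makefile is to pass the cluster-specific values as `make` variable overrides. The sketch below only builds the command line; the cluster names, paths and variable names are hypothetical and not taken from TraceRep.

```python
# Hypothetical per-cluster settings; every path and name is made up.
CLUSTERS = {
    "cluster_a": {"CC": "/opt/mpi/bin/mpicc", "EXTRAE_HOME": "/opt/extrae"},
    "cluster_b": {"CC": "/usr/local/bin/mpicc", "EXTRAE_HOME": "/apps/extrae"},
}


def make_command(cluster, makefile="Makefile"):
    """Build the argument list for invoking the generic Makefile.

    Variables given as NAME=value on the make command line override
    assignments inside the Makefile, so one Makefile can serve all
    clusters.
    """
    settings = CLUSTERS[cluster]
    args = ["make", "-f", makefile]
    args += [f"{name}={value}" for name, value in sorted(settings.items())]
    return args


cmd = make_command("cluster_a")
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` on the cluster side. This does not address applications with complex build systems, which the slide lists as unsupported.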
16
3.5. Snapshots
http://tracerep.unican.es
18
4. TraceRep – Conclusions
- Traces are very useful for HPC parallel application developers and computer architects.
- TraceRep provides a user-friendly interface to collect and share traces.
- It encourages trace sharing through trace licensing and citations.
- Some limitations regarding security, compilation and storage must still be addressed.
19
4. TraceRep – Future work
- Alternative frameworks to replace the Drupal prototype:
  - Liferay [1]
  - Apache Airavata [2]
- Improve the compilation toolchain to present a consistent view across different clusters and allow for more complex codes.
- Exploiting the advanced features of Paraver is complex. We are looking for a way to integrate Paraver into TraceRep.
[1] "Liferay," 2015. Available: http://www.liferay.com/
[2] "Apache Airavata architecture overview," 2015. Available: http://airavata.apache.org/architecture/overview.html
20
Thank you for your attention