A Compiler-Based Tool for Array Analysis in HPC Applications Presenter: Ahmad Qawasmeh Advisor: Dr. Barbara Chapman 2013 PhD Showcase Event.

Presentation transcript:

1 A Compiler-Based Tool for Array Analysis in HPC Applications Presenter: Ahmad Qawasmeh Advisor: Dr. Barbara Chapman 2013 PhD Showcase Event

2 Outline
1. Motivation
2. Related Work
3. Array Analysis Techniques
4. Array Analysis Module in OpenUH
5. Our Integrated System

3 Outline
6. Dragon Tool
7. Conclusion
8. Future Work

4 Motivation
A. Identify and fix inefficiencies in defining arrays
B. Reduce data movement
C. Identify auto-parallelization opportunities
D. Enhance code analysis

5 Parallelization/Reduce Data Movement
[Diagram: a host (host cores, main memory holding application data) and a GPU (GPU cores, GPU memory holding application data), with the sub-array A[lb:ub] shown between them.]
Example directive: !$acc region copyin(A(1:100,1:100))
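As a rough illustration of why offloading only a sub-array matters, the sketch below compares the bytes copied for a whole array with the bytes copied for the region named in a clause such as copyin(A(1:100,1:100)). The helper name `region_bytes`, the 1000 x 1000 array shape, and the 8-byte element size are assumptions made for this example; none of them come from the presentation.

```python
from math import prod

def region_bytes(bounds, elem_size):
    """Bytes needed to copy a rectangular region.

    bounds is a list of inclusive (lb, ub) pairs, one per dimension.
    """
    extents = [max(0, ub - lb + 1) for lb, ub in bounds]
    return prod(extents) * elem_size

# Hypothetical 1000 x 1000 array of 8-byte elements (assumed, not from the slide).
full = region_bytes([(1, 1000), (1, 1000)], 8)

# Only the sub-array A(1:100,1:100) is accessed on the accelerator.
sub = region_bytes([(1, 100), (1, 100)], 8)

# How many times less data is moved when only the sub-array is copied.
reduction_factor = full // sub
```

Under these assumed sizes the sub-array transfer moves a hundredth of the data; the real saving depends entirely on how much of the array the offloaded code touches.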

6 Access Density/Array Region
[Figure: DEF and USE accesses to array A, with the labels "Access Density" and "Region"; axis ticks run from 5 to 20 and from 5 to 25.]
Example program:
start
Declare char A[20]
for i = 0 to 19: A[i] = …
…
for i = 0 to 10: … = A[i]
for i = 10 to 15: … = A[i]
…
for i = 10 to 15: … = A[i]
…
for i = 15 to 17: … = A[i]
end
Annotation on the figure: 4 times at different positions
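The loops in this slide's example can be summarized mechanically. The sketch below transcribes them as (kind, first index, last index) tuples and reports, for DEF and USE separately, the bounding region and an access density. Density is taken here to mean the number of accesses divided by the size of the bounding region; that definition is an assumption, since the slide does not define the term, and the "4 times at different positions" annotation is not modelled.

```python
from collections import Counter

# (kind, first index, last index), bounds inclusive, transcribed from the slide.
LOOPS = [
    ("DEF", 0, 19),
    ("USE", 0, 10),
    ("USE", 10, 15),
    ("USE", 10, 15),
    ("USE", 15, 17),
]

def summarize(loops, kind):
    """Return (region, access_count, density) for all accesses of one kind.

    density = accesses / region size (an assumed definition, see lead-in).
    """
    counts = Counter()
    for k, lo, hi in loops:
        if k == kind:
            counts.update(range(lo, hi + 1))
    if not counts:
        return None, 0, 0.0
    region = (min(counts), max(counts))
    total = sum(counts.values())
    size = region[1] - region[0] + 1
    return region, total, total / size

def_region, def_accesses, def_density = summarize(LOOPS, "DEF")
use_region, use_accesses, use_density = summarize(LOOPS, "USE")
```

With this input the USE region is A[0:17] even though A is declared with 20 elements, which is the kind of gap between declared and accessed extent that the tool is meant to expose.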

7 Related Work
A. The PGI accelerator compiler applies array region analysis to reduce memory transfers.
B. The Par4All compiler tackles data transfer management between host and accelerator using array region analysis.
C. CAPO depends on interprocedural data dependence information to insert compiler directives that facilitate parallelism.
D. The HPM toolkit, PAPI, and OProfile provide facilities to instrument programs, record HWC data, and analyze results.
E. Dragon was previously developed, with some limitations.
F. Array regrouping was targeted.

8 Array Access Analysis Techniques
A. What is array region analysis?
B. Its importance for optimizations in a parallel compiler
C. It is usually impractical to simply list the elements referenced

9 Array Access Analysis Techniques
Methods in terms of efficiency and precision:
- Triplet-based (RS)
- Linear-based (Region)
- Reference-based (Atom)
[Figure: the methods placed along Precision and Efficiency axes, with the label "Classic".]
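To make the efficiency/precision trade-off concrete, here is a minimal sketch of a triplet-based summary, assuming the slide's "RS" refers to regular sections of the form lb:ub:stride. The `Triplet` class and `union_hull` function are hypothetical names written for this example and are not taken from OpenUH or Dragon. The union is deliberately conservative: it returns one triplet that covers both inputs, which is cheap to store but may include elements that neither input accesses.

```python
from dataclasses import dataclass
from math import gcd

@dataclass(frozen=True)
class Triplet:
    """A 1-D region lb:ub:stride with inclusive bounds and stride >= 1."""
    lb: int
    ub: int
    stride: int = 1

    def elements(self):
        """Explicit element list, used here only to check the summary."""
        return set(range(self.lb, self.ub + 1, self.stride))

def union_hull(a, b):
    """Return a single triplet covering every element of a and b."""
    lb = min(a.lb, b.lb)
    ub = max(a.ub, b.ub)
    # The stride must divide both strides and the offset between the starts.
    stride = gcd(gcd(a.stride, b.stride), abs(a.lb - b.lb))
    return Triplet(lb, ub, stride)

a = Triplet(0, 10, 2)   # 0, 2, ..., 10
b = Triplet(4, 20, 4)   # 4, 8, ..., 20
hull = union_hull(a, b)
exact = a.elements() | b.elements()
extra = hull.elements() - exact  # elements the summary over-approximates
```

Here the hull is 0:20:2, which covers the nine elements actually accessed plus two that are not (14 and 18). Listing the references exactly avoids that loss of precision at the cost of a summary that grows with the number of accesses.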

10 Our Integrated System
[Diagram components: HPC Application; OpenUH; IPA Phase Extension; ARA Module; HL-Whirl-Tree; Lowering; .rgn file; Dragon; Array Analysis Graph.]

11 Dragon Array Analysis Graph

12 Dragon Call Graph for NAS LU Benchmark

13 Dragon Array Graph for NAS LU Benchmark

14 Dragon Array Graph for NAS LU Benchmark

15 Conclusion
A. We present an interactive tool for finding the hotspot portions of interprocedural arrays in HPC applications.
B. We show that this information can be critical for better parallelization and for better cache and memory utilization.
C. Data transfers can be reduced by exploiting the sub-array offloading functionality supported by directive-based (D-B) GPU programming models.
D. Our tool has been tested on some HPC benchmarks.

16 Future Work
A. Combine the Array Analysis and Data Dependency modules in OpenUH to enhance memory and cache utilization.
B. Extend our array analysis tool to support the analysis and visualization of remote array accesses in a PGAS context.
C. Enrich our tool's features by supporting high-performance 3D visualization via the Qt OpenGL module.

17 Bibliography
[1] P. Group. (2008) PGI compilers, GPUs and you! pgi presentation sc08.pdf. [Online]. Available: http://www.pgroup.com/lit/presentations/
[2] M. Amini, F. Coelho, F. Irigoin, and R. Keryell, "Static compilation analysis for host-accelerator communication optimization," in The 24th International Workshop on Languages and Compilers for Parallel Computing, Fort Collins, Colorado, Sep. 2011.
[3] (2001) Code parallelization with CAPO: a user manual. [Online]. Available: http://people.nas.nasa.gov/hjin/CAPO/nas-01-008-abstract.html
[4] (2008) Hardware Performance Monitor (HPM) toolkit users guide. [Online]. Available: https://wiki.alcf.anl.gov/images/5/59/HPM ug.pdf
[5] P. J. Mucci, S. Browne, C. Deane, and G. Ho. (1999, Sep.) PAPI: A portable interface to hardware performance counters. dodugc99-papi.pdf. [Online]. Available: http://web.eecs.utk.edu/ mucci/latest/pubs/

18 Bibliography
[6] W. E. Cohen. (2004) Tuning programs with OProfile. Oprofile.pdf. [Online]. Available: http://people.redhat.com/wcohen/
[7] O. Hernandez, C. Liao, and B. Chapman, "Dragon: A static and dynamic tool for OpenMP," in Workshop on OpenMP Applications and Tools (WOMPAT 2004), 2005, pp. 53–66.
[8] A. Qawasmeh, B. Chapman, and A. Banerjee, "A Compiler-Based Tool for Array Analysis in HPC Applications," in Proceedings of the 41st International Conference on Parallel Computing Workshops, Pittsburgh, PA, USA, Sep. 2012, pp. 454–463.
[9] X. Shen, Y. Gao, C. Ding, and R. Archambault, "Lightweight reference affinity analysis," in Proceedings of the 19th ACM International Conference on Supercomputing, Boston, MA, USA, Jun. 2005, pp. 131–140.
[10] (2012) High Performance Computing and Tools Research Group. [Online]. Available: http://www2.cs.uh.edu/~hpctools/
