A Compiler-Based Tool for Array Analysis in HPC Applications
Presenter: Ahmad Qawasmeh
Advisor: Dr. Barbara Chapman
2013 PhD Showcase Event
Outline
1. Motivation
2. Related Work
3. Array Analysis Techniques
4. Array Analysis Module in OpenUH
5. Our Integrated System
6. Dragon Tool
7. Conclusion
8. Future Work
Motivation
- Reduce data movement
- Identify and fix inefficiencies in defining arrays
- Enhance code analysis
- Identify auto-parallelization opportunities
Parallelization / Reduce Data Movement
[Figure: a host (host cores, main memory) and a GPU (GPU cores, GPU memory), each holding application data, with the array section labelled A[lb:ub].]
!$acc region copyin(A(1:100,1:100))
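As a rough illustration of why copying only a sub-array matters, the sketch below compares the bytes moved for the region named in the directive with the bytes moved for a whole array. The declared shape (1000 x 1000) and the 8-byte element size are assumptions made for this example only; the slide does not state them, and `region_bytes` is a hypothetical helper.

```python
def region_bytes(bounds, elem_size):
    """Bytes occupied by a rectangular region, given inclusive (lb, ub) per dimension."""
    count = 1
    for lb, ub in bounds:
        count *= ub - lb + 1
    return count * elem_size


ELEM_SIZE = 8  # assumption: double-precision elements

# Assumption: A is declared as A(1000, 1000); the slide gives no declared shape.
full = region_bytes([(1, 1000), (1, 1000)], ELEM_SIZE)

# Region taken from the directive: copyin(A(1:100,1:100)).
sub = region_bytes([(1, 100), (1, 100)], ELEM_SIZE)

ratio = full // sub  # how many times more data a whole-array copy would move
```

Under these assumptions the whole-array copy moves 100 times as much data as the region in the directive; the actual saving depends on the real declared shape and element type.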
Access Density / Array Region
start
  Declare char A[20]
  for i = 0 to 19
    A[i] = …
  …
  for i = 0 to 10
    … = A[i]
  for i = 10 to 15
    … = A[i]
  …
  for i = 10 to 15
    … = A[i]
  …
  for i = 15 to 17
    … = A[i]
end
[Figure: access density (y-axis) plotted against array region (x-axis), showing one DEF and USE accesses occurring 4 times at different positions.]
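The access-density idea on this slide can be sketched in a few lines. The example assumes that each loop in the pseudocode touches every element of its inclusive index range exactly once; `Access` and `access_density` are illustrative names, not part of the tool described in the talk.

```python
from collections import Counter
from typing import NamedTuple


class Access(NamedTuple):
    """One loop's access to array A: kind is 'DEF' or 'USE', bounds are inclusive."""
    kind: str
    lb: int
    ub: int


# Regions transcribed from the slide's pseudocode for `char A[20]`.
ACCESSES = [
    Access("DEF", 0, 19),
    Access("USE", 0, 10),
    Access("USE", 10, 15),
    Access("USE", 10, 15),
    Access("USE", 15, 17),
]


def access_density(accesses, size):
    """Count how many DEF and USE accesses touch each element index."""
    density = {"DEF": Counter(), "USE": Counter()}
    for acc in accesses:
        # Clamp each region to the declared bounds of the array.
        for i in range(max(acc.lb, 0), min(acc.ub, size - 1) + 1):
            density[acc.kind][i] += 1
    return density


density = access_density(ACCESSES, size=20)

# First index with the highest number of USE accesses.
hottest = max(range(20), key=lambda i: density["USE"][i])
```

With these regions, elements 10 and 15 are each read three times, elements 18 and 19 are defined but never read, and `hottest` is 10 because `max` returns the first index with the highest count.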
Related Work
- The Par4All compiler tackles data transfer management between host and accelerator using array region analysis.
- Array regrouping was targeted.
- Dragon was previously developed, with some limitations.
- The PGI accelerator compiler applies array region analysis to reduce memory transfers.
- The HPM toolkit, PAPI, and OProfile provide facilities to instrument programs, record hardware counter (HWC) data, and analyze results.
- CAPO depends on interprocedural data dependence information to insert compiler directives that facilitate parallelism.
Array Access Analysis Techniques
- Important for optimizations in a parallelizing compiler
- What is array region analysis?
- It is usually impractical to simply list the elements referenced
Array Access Analysis Techniques
Methods in terms of efficiency and precision:
- Classic
- Reference-based (Atom)
- Triplet-based (RS)
- Linear-based (Region)
[Figure: the methods placed on Precision and Efficiency axes.]
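To make the trade-off between precision and efficiency concrete, here is a minimal sketch of a triplet-style summary, which describes a set of referenced indices as (lower bound, upper bound, stride) instead of listing them. It is an illustrative approximation, not the OpenUH implementation; `to_triplet` and `expand` are hypothetical names, and taking the stride as the gcd of the offsets is an assumption made for this example.

```python
from functools import reduce
from math import gcd


def to_triplet(indices):
    """Summarize a sequence of referenced indices as (lb, ub, stride).

    The stride is the gcd of the offsets from the lower bound, so the triplet
    covers every referenced index but may also cover indices never referenced.
    """
    if not indices:
        raise ValueError("no indices to summarize")
    lb, ub = min(indices), max(indices)
    # gcd of all offsets; falls back to 1 when only one distinct index exists.
    stride = reduce(gcd, (i - lb for i in indices), 0) or 1
    return lb, ub, stride


def expand(triplet):
    """List every index covered by a (lb, ub, stride) triplet."""
    lb, ub, stride = triplet
    return list(range(lb, ub + 1, stride))


exact = to_triplet(range(0, 100, 2))   # regular access: summary is exact
approx = to_triplet([0, 4, 6])         # irregular access: summary over-approximates
```

Here `exact` is `(0, 98, 2)`, three numbers standing in for fifty indices, while `approx` is `(0, 6, 2)`, which also covers index 2 even though it was never referenced. That loss of precision is the price of the compact form.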
Our Integrated System
[Figure: system diagram with the following labels: HPC Application, OpenUH, Lowering, HL-Whirl-Tree, IPA Phase Extension, ARA Module, .rgn file, Dragon, Array Analysis Graph.]
Conclusion
- We show that this information can be critical for better parallelization and for better cache and memory utilization.
- We present an interactive tool for finding the hotspot portions of interprocedural arrays in HPC applications.
- Data transfers can be reduced by exploiting the sub-array offloading functionality supported by directive-based GPU programming models.
- Our tool has been tested on some HPC benchmarks.
Future Work
- Extend our array analysis tool to support the analysis and visualization of remote array accesses in a PGAS context
- Combine the Array Analysis and Data Dependency modules in OpenUH to enhance memory and cache utilization
- Enrich our tool's features by supporting high-performance 3D visualization via the Qt OpenGL module
Bibliography
[1] P. Group. (2008) PGI compilers, GPUs and you! pgi presentation sc08.pdf. [Online]. Available: http://www.pgroup.com/lit/presentations/
[2] M. Amini, F. Coelho, F. Irigoin, and R. Keryell, "Static compilation analysis for host-accelerator communication optimization," in The 24th International Workshop on Languages and Compilers for Parallel Computing, Fort Collins, Colorado, Sep. 2011.
[3] (2001) Code parallelization with CAPO – a user manual. [Online]. Available: http://people.nas.nasa.gov/hjin/CAPO/nas-01-008-abstract.html
[4] (2008) Hardware Performance Monitor (HPM) toolkit users guide. [Online]. Available: https://wiki.alcf.anl.gov/images/5/59/HPM ug.pdf
[5] P. J. Mucci, S. Browne, C. Deane, and G. Ho. (1999, Sep.) PAPI: A portable interface to hardware performance counters. dodugc99-papi.pdf. [Online]. Available: http://web.eecs.utk.edu/ mucci/latest/pubs/
[6] W. E. Cohen. (2004) Tuning programs with OProfile. Oprofile.pdf. [Online]. Available: http://people.redhat.com/wcohen/
[7] O. Hernandez, C. Liao, and B. Chapman, "Dragon: A static and dynamic tool for OpenMP," in Workshop on OpenMP Applications and Tools (WOMPAT 2004), 2005, pp. 53–66.
[8] A. Qawasmeh, B. Chapman, and A. Banerjee, "A Compiler-Based Tool for Array Analysis in HPC Applications," in Proceedings of the 41st International Conference on Parallel Processing Workshops, Pittsburgh, PA, USA, Sep. 2012, pp. 454–463.
[9] X. Shen, Y. Gao, C. Ding, and R. Archambault, "Lightweight reference affinity analysis," in Proceedings of the 19th ACM International Conference on Supercomputing, Boston, MA, USA, Jun. 2005, pp. 131–140.
[10] (2012) High Performance Computing and Tools Research Group. [Online]. Available: http://www2.cs.uh.edu/~hpctools/