A Compiler-Based Tool for Array Analysis in HPC Applications

Presenter: Ahmad Qawasmeh. Advisor: Dr. Barbara Chapman. 2013 PhD Showcase Event.

Presentation Transcript


  1. A Compiler-Based Tool for Array Analysis in HPC Applications. Presenter: Ahmad Qawasmeh. Advisor: Dr. Barbara Chapman. 2013 PhD Showcase Event.

  2. Outline
     1. Motivation
     2. Related Work
     3. Array Analysis Techniques
     4. Array Analysis Module in OpenUH
     5. Our Integrated System

  3. Outline (continued)
     6. Dragon Tool
     7. Conclusion
     8. Future Work

  4. Motivation: reduce data movement; identify and fix inefficiencies in how arrays are defined; improve code analysis; identify auto-parallelization opportunities.

  5. Parallelization/Reduce Data Movement. [Figure: a host (host cores, main memory, application data) and a GPU (GPU cores, GPU memory, application data), with the sub-array A[lb:ub] copied to the GPU.] Example directive: !$acc region copyin(A(1:100,1:100))
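
The idea behind the directive on this slide is to copy only the part of an array that a kernel actually touches. The following is a minimal Python sketch of that bookkeeping, not the OpenUH or PGI implementation. The helper names (`bounding_region`, `region_size`, `copyin_clause`), the 1000x1000 array shape, and the set of touched indices are all hypothetical; only the clause format is taken from the slide.

```python
from math import prod

def bounding_region(indices):
    """Smallest rectangular region (inclusive bounds) covering all (row, col) pairs."""
    rows = [r for r, _ in indices]
    cols = [c for _, c in indices]
    return (min(rows), max(rows)), (min(cols), max(cols))

def region_size(region):
    """Number of elements in a region given as ((lb, ub), (lb, ub), ...)."""
    return prod(ub - lb + 1 for lb, ub in region)

def copyin_clause(name, region):
    """Format the region in the style of the directive shown on the slide."""
    dims = ",".join(f"{lb}:{ub}" for lb, ub in region)
    return f"!$acc region copyin({name}({dims}))"

# Hypothetical kernel: it touches only the 100x100 corner of a 1000x1000 array
# (1-based indices, as in the Fortran directive).
full_shape = (1000, 1000)
touched = [(i, j) for i in range(1, 101) for j in range(1, 101)]

region = bounding_region(touched)
clause = copyin_clause("A", region)
fraction = region_size(region) / prod(full_shape)

print(clause)
print(f"fraction of the array transferred: {fraction:.2%}")
```

A rectangular bounding region is a conservative choice: it may include elements the kernel never reads, but it maps directly onto the sub-array notation that the directive accepts.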

  6. Access Density/Array Region. Pseudocode from the slide:
       start
       Declare char A[20]
       for i = 0 to 19: A[i] = …
       ……….
       for i = 0 to 10: … = A[i]
       for i = 10 to 15: … = A[i]
       ……….
       for i = 10 to 15: … = A[i]
       ……….
       for i = 15 to 17: … = A[i]
       end
     [Figure: access density (y-axis, ticks 5 to 25) plotted against array region (x-axis, ticks 5 to 20), with one DEF region and three USE regions; annotation: "4 times at diff positions".]
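
The loops on this slide can be replayed to see how accesses pile up on particular parts of the array. The following is a minimal Python sketch under stated assumptions, not the tool's actual metric: it treats density as the number of DEF or USE accesses that touch each element, and it counts only the loops visible in the transcript, each run once. The `accesses` list and the `access_density` helper are hypothetical names; the loop bounds (inclusive) are copied from the pseudocode.

```python
from collections import Counter

# (mode, lower bound, upper bound), inclusive, copied from the slide's loops.
# Assumption: each loop shown on the slide executes exactly once.
accesses = [
    ("DEF", 0, 19),
    ("USE", 0, 10),
    ("USE", 10, 15),
    ("USE", 10, 15),
    ("USE", 15, 17),
]

def access_density(accesses, size):
    """Count DEF and USE accesses per element of an array with `size` elements."""
    density = {"DEF": Counter(), "USE": Counter()}
    for mode, lb, ub in accesses:
        if not (0 <= lb <= ub < size):
            raise ValueError(f"region {lb}:{ub} lies outside the array")
        for i in range(lb, ub + 1):
            density[mode][i] += 1
    return density

density = access_density(accesses, size=20)

# First element with the highest USE count (ties resolve to the lowest index).
hottest = max(range(20), key=lambda i: density["USE"][i])

print("USE count per element:", [density["USE"][i] for i in range(20)])
print("most-used element:", hottest)
```

Elements that are defined but never used afterwards (here, the tail of the array) show up with a USE count of zero, which is the kind of inefficiency in array definitions that the motivation slide mentions.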

  7. Related Work. The Par4All compiler tackles data transfer management between host and accelerator using array region analysis. Array regrouping has been targeted by earlier work. Dragon was previously developed, with some limitations. The PGI accelerator compiler applies array region analysis to reduce memory transfers. The HPM toolkit, PAPI, and OProfile provide facilities to instrument programs, record hardware counter (HWC) data, and analyze results. CAPO depends on interprocedural data dependence information to insert compiler directives that facilitate parallelism.

  8. Array Access Analysis Techniques. Importance for optimizations in parallel compilers. What is array region analysis? It is usually impractical to simply list the elements referenced.

  9. Array Access Analysis Techniques. Methods in terms of efficiency and precision: Linear-based (Region), Reference-based (Atom), Triplet-based (RS), Classic. [Figure: the methods arranged along precision and efficiency axes.]
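
To illustrate why a compact summary is preferred over listing every referenced element, the sketch below represents one array dimension as a lower bound, upper bound, and stride, which is the usual meaning of a triplet. This is a hypothetical Python illustration, not code from OpenUH: the `Triplet` class and the `loop_access` and `bounding_union` helpers are invented names, and the sketch assumes positive strides and non-empty regions.

```python
from dataclasses import dataclass
from math import gcd

@dataclass(frozen=True)
class Triplet:
    """Section lb:ub:stride (bounds inclusive) for one array dimension."""
    lb: int
    ub: int
    stride: int = 1

    def elements(self):
        return range(self.lb, self.ub + 1, self.stride)

def loop_access(lo, hi, step=1, coeff=1, offset=0):
    """Triplet touched by A[coeff*i + offset] in `for i = lo to hi step step`.

    Assumes hi >= lo, step > 0 and coeff > 0.
    """
    last_i = lo + ((hi - lo) // step) * step  # last value i actually takes
    return Triplet(coeff * lo + offset, coeff * last_i + offset, coeff * step)

def bounding_union(a, b):
    """Single triplet covering both inputs; may include extra elements."""
    lb = min(a.lb, b.lb)
    ub = max(a.elements()[-1], b.elements()[-1])
    # The stride must divide both strides and the distance between the starts.
    stride = gcd(gcd(a.stride, b.stride), abs(a.lb - b.lb))
    return Triplet(lb, ub, stride)

a = loop_access(0, 9, step=2)            # A[i] for i = 0, 2, 4, 6, 8
b = loop_access(0, 4, coeff=2, offset=4) # A[2*i + 4] for i = 0..4
c = Triplet(1, 7, 3)                     # elements 1, 4, 7

exact = bounding_union(a, b)   # strides line up, so nothing extra is added
loose = bounding_union(a, c)   # strides clash, so the summary widens

print(a, b, exact)
print(loose, "covers", len(loose.elements()), "elements")
```

The second union shows the trade-off the slide alludes to: the summary stays a single triplet, but it now covers elements that neither access touches, so precision is given up in exchange for a cheap representation.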

  10. Our Integrated System. [Figure: workflow diagram with the following labeled components: HPC Application, OpenUH, Lowering, HL-Whirl-Tree, IPA Phase Extension, ARA Module, .rgn file, Dragon, Array Analysis Graph.]

  11. Dragon Array Analysis Graph

  12. Dragon Call Graph for NAS LU Benchmark

  13. Dragon Array Graph for NAS LU Benchmark

  14. Dragon Array Graph for NAS LU Benchmark

  15. Conclusion. We show that array access information can be critical for better parallelization and for better cache and memory utilization. We present an interactive tool for finding the hotspot portions of arrays across procedures in HPC applications. The tool helps reduce data transfers by exploiting the sub-array offloading functionality supported by directive-based (D-B) GPU programming models. Our tool has been tested on some HPC benchmarks, including the NAS LU benchmark.

  16. Future Work. Extend our array analysis tool to support the analysis and visualization of remote array accesses in a PGAS context. Combine the Array Analysis and Data Dependency modules in OpenUH to enhance memory and cache utilization. Enrich our tool’s features by supporting high-performance 3D visualization via the Qt OpenGL module.

  17. Bibliography
     [1] The Portland Group. (2008) PGI compilers, GPUs and you! pgi presentation sc08.pdf. [Online]. Available: http://www.pgroup.com/lit/presentations/
     [2] M. Amini, F. Coelho, F. Irigoin, and R. Keryell, “Static compilation analysis for host-accelerator communication optimization,” in The 24th International Workshop on Languages and Compilers for Parallel Computing, Fort Collins, Colorado, Sep. 2011.
     [3] (2001) Code parallelization with CAPO – a user manual. [Online]. Available: http://people.nas.nasa.gov/hjin/CAPO/nas-01-008-abstract.html
     [4] (2008) Hardware Performance Monitor (HPM) toolkit users guide. [Online]. Available: https://wiki.alcf.anl.gov/images/5/59/HPM ug.pdf
     [5] P. J. Mucci, S. Browne, C. Deane, and G. Ho. (1999, Sep.) PAPI: A portable interface to hardware performance counters. dodugc99-papi.pdf. [Online]. Available: http://web.eecs.utk.edu/ mucci/latest/pubs/

  18. Bibliography
     [6] W. E. Cohen. (2004) Tuning programs with OProfile. Oprofile.pdf. [Online]. Available: http://people.redhat.com/wcohen/
     [7] O. Hernandez, C. Liao, and B. Chapman, “Dragon: A static and dynamic tool for OpenMP,” in Workshop on OpenMP Applications and Tools (WOMPAT 2004), 2005, pp. 53–66.
     [8] A. Qawasmeh, B. Chapman, and A. Banerjee, “A Compiler-Based Tool for Array Analysis in HPC Applications,” in Proceedings of the 41st International Conference on Parallel Computing Workshops, Pittsburgh, PA, USA, Sep. 2012, pp. 454–463.
     [9] X. Shen, Y. Gao, C. Ding, and R. Archambault, “Lightweight reference affinity analysis,” in Proceedings of the 19th ACM International Conference on Supercomputing, Boston, MA, USA, Jun. 2005, pp. 131–140.
     [10] (2012) High Performance Computing and Tools Research Group. [Online]. Available: http://www2.cs.uh.edu/~hpctools/

  19. Thank You!
