A Case for Source-Level Transformations in MATLAB

A Case for Source-Level Transformations in MATLAB. Vijay Menon and Keshav Pingali Cornell University. The MaJic Project at Illinois/Cornell. George Almasi Luiz De Rose David Padua. MATLAB. High-Level Interpreted Language for Numerical Computing Matrix is 1st class type


Presentation Transcript


  1. A Case for Source-Level Transformations in MATLAB Vijay Menon and Keshav Pingali Cornell University The MaJic Project at Illinois/Cornell George Almasi Luiz De Rose David Padua

  2. MATLAB • High-Level Interpreted Language for Numerical Computing • Matrix is 1st class type • Library of numerical functions • Application Domains • Image Processing • Structural Mechanics • Computational Finance

  3. The Problem • Development is fast... • ~10X as concise as C/Fortran • Performance is slow! • ~10X as slow as C/Fortran • Conventional Approach: • Rewrite • Compile

  4. Our Approach: Source-Level Optimization • Apply high-level transformations directly on MATLAB codes • Significant performance benefit for: • interpreted code • compiled code

  5. Outline • Overheads in MATLAB • Conventional Compilation • Source-Level Optimization • Comparison • Implementation Status

  6. Outline • Overheads in MATLAB • Type/Shape Checking • Memory Management • Array Bounds Checking • Conventional Compilation • Source-Level Optimization • Comparison • Implementation Status

  7. Type/Shape Checking • MATLAB has no type/shape declarations • Consider: A * B • Interpreter checks to perform multiply (*) • Type • Real*Real • Real*Complex • Complex*Complex • Shape • Scalar*Scalar • Scalar*Matrix • Matrix*Matrix
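The type and shape cases listed on this slide can be made concrete with a minimal sketch. It is illustrative only: the helper names `typeOf`, `shapeOf` and `describe` are hypothetical, and an interpreter performs this dispatch internally rather than through user-level calls to `isreal` and `isscalar`. Anything that is not a scalar (including a vector) is labeled 'Matrix' here.

```matlab
% Hypothetical helpers that classify an operand the way the slide does.
types  = {'Complex', 'Real'};     % indexed by isreal(v) + 1
shapes = {'Matrix', 'Scalar'};    % indexed by isscalar(v) + 1
typeOf  = @(v) types{isreal(v) + 1};
shapeOf = @(v) shapes{isscalar(v) + 1};

% Report which type case and which shape case applies to A * B.
describe = @(A, B) sprintf('%s*%s, %s*%s', ...
    typeOf(A), typeOf(B), shapeOf(A), shapeOf(B));

d1 = describe(2, [1 2; 3 4]);       % real scalar times real matrix
d2 = describe([1+2i 3], [4; 5]);    % complex row vector times real column
disp(d1);
disp(d2);
```

Every evaluation of `A * B` has to settle both questions before any arithmetic happens, which is the per-operation cost the slide refers to.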

  8. Type/Shape Checking • Consider: for i = 1:n y = y + a * x(i) end • Loops perform redundant checks and magnify interpreter overhead

  9. Memory Management: Dynamic Resizing • Consider: x(10) = 10; • C/Fortran: x must have >= 10 elements • MATLAB: x is resized if needed • Memory reallocated • Data copied

  10. Memory Management: Dynamic Resizing • MATLAB dynamically grows arrays: for i = 1 : 1000 x(i) = i; end • Every iteration triggers resize! • 1,000 memory allocations • ~500,000 elements copied • Execution Time: • x is undefined: 14.2 seconds • x is already defined: 0.37 seconds
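The copy count quoted on this slide follows from simple arithmetic. The sketch below assumes, as the slide's figures imply, that each resize grows the array by exactly one element and copies every existing element; it models the cost rather than measuring any particular interpreter.

```matlab
n = 1000;

% Iteration i grows x from i-1 to i elements, copying the i-1 existing ones
% (an assumed grow-by-one model, not a measurement).
copiesPerIteration = (1:n) - 1;
totalCopies = sum(copiesPerIteration);   % equals n*(n-1)/2

fprintf('%d allocations, %d elements copied\n', n, totalCopies);
```

Under this model the total is 499,500 copied elements, which matches the slide's "~500,000" and shows the quadratic growth in the array length.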

  11. Array Bounds Checking • Consider array indexing: x(i) = y(i); • Failed Bounds Check on • x(i) can trigger resize • y(i) can trigger error

  12. Array Bounds Checking • In a loop: for i = 3:100 x(i) = x(i-1) + x(i-2); end • Interpreter performs redundant checks • Compiler work: • Nonresizable arrays: Gupta PLDI’90 • Resizable arrays: more difficult

  13. Common Theme • Loops magnify overheads • every iteration: redundant checks, resizes, … • MATLAB interprets naively • computes as is • no reorganization to optimize

  14. Outline • Overheads in MATLAB • Conventional Compilation • Compile to C/Fortran • Rely on C/Fortran compiler for optimization • Source-Level Optimization • Comparison • Implementation Status

  15. MATLAB Compilers • Compile to C/C++/Fortran • MCC -> C (The MathWorks) • MATCOM -> C++ (Mathtools) • FALCON -> F90 (U of Illinois) • Native compiler generates executable code: • Link back into MATLAB environment • Run as stand-alone program

  16. The MCC Compiler • Safe Optimization: • Type Inference - no declarations in MATLAB • Eliminate Type Checks / Reduce Storage • Specialize for real input variables • Always legal! • Unsafe Optimization: • Assume all data is real • Eliminate all bounds checks - disallow resizing • User must ensure legality!

  17. Falcon Benchmarks • Collected by DeRose from MATLAB users at Illinois/NCSA • Element/Loop Intensive • CN - Crank-Nicolson PDE Solver • Di - Dirichlet PDE Solver • FD - Finite Difference PDE Solver • Ga - Galerkin PDE Solver • IC - Incomplete Cholesky Factorization • Memory Intensive • AQ - Adaptive Quadrature w/ Simpson’s Rule • EC - Euler-Cromer 2-body problem • RK - Runge-Kutta 2-body problem • Library Intensive • CG - Conjugate Gradients Iterative Solver • Mei - 3D surface Generation • QMR - Quasi-Minimal Residual • SOR - Successive Over-Relaxation

  18. MCC: Safe Optimizations

  19. MCC: Unsafe Optimizations Note: User must ensure legality!

  20. Outline • Overheads in MATLAB • Conventional Compilation • Source-Level Optimization • Vectorization • Preallocation • Expression Optimization • Comparison • Implementation Status

  21. Vectorization • Loops are expensive • Overheads are magnified • Idea: Eliminate Loops • Map loops to higher-level matrix operations • Interpreter uses efficient libraries • BLAS • LINPACK/EISPACK

  22. Example of Vectorization • In Galerkin, 98% of execution time is spent in: for i = 1:N for j = 1:N phi(k) = phi(k) + a(i,j)*x(i)*y(j); end end

  23. Vectorized Code • In Optimized Galerkin: phi(k) = phi(k) + x*a*y'; • Fragment Speedup: 260 • Program Speedup: 110 • Note: Not always possible!
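The equivalence behind this rewrite can be checked on small data. The sketch below assumes that `x` and `y` are 1-by-N row vectors and `a` is N-by-N (the shapes for which `x*a*y'` is a scalar), and that the loop's summand pairs `x(i)` with `y(j)`, which is what the matrix product computes. The test data and the names `loopSum` and `vecSum` are illustrative, not taken from the benchmark.

```matlab
N = 4;
a = magic(N);            % arbitrary test data, not from the benchmark
x = 1:N;                 % assumed 1-by-N row vector
y = (N:-1:1) / 2;        % assumed 1-by-N row vector

% Loop form: accumulate a(i,j)*x(i)*y(j) over all i and j.
loopSum = 0;
for i = 1:N
    for j = 1:N
        loopSum = loopSum + a(i,j) * x(i) * y(j);
    end
end

% Vectorized form: one row-vector * matrix * column-vector product.
vecSum = x * a * y';

fprintf('loop = %g, vectorized = %g\n', loopSum, vecSum);
```

The loop form executes N^2 interpreted statements, while the vectorized form is a single expression handed to the matrix library.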

  24. Effect of Vectorization

  25. Preallocation • Eliminate Dynamic Resizing • Try to predict eventual size of array • Insert early allocation when possible: • x = zeros(1000,1); • Resizing will not be triggered

  26. Example of Preallocation • In Euler-Cromer, 87% of time spent in: for i = 1:N r(i) = … th(i) = … t(i) = … k(i) = … p(i) = … … end

  27. Preallocated Code • In Optimized Euler-Cromer: r = zeros(1,N); ... for i = 1:N r(i) = … … end • Fragment Speedup: 7 • Program Speedup: 4
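A minimal sketch of the transformation, using a made-up loop body (`i^2`) in place of the elided Euler-Cromer assignments. The variable names `g` and `p` are hypothetical. Both loops produce the same array; only the second avoids growing it inside the loop.

```matlab
N = 1000;

% Growing version: g starts empty, so each assignment past the current
% end of the array extends it.
g = [];
for i = 1:N
    g(i) = i^2;
end

% Preallocated version: the array has its final size before the loop,
% so no assignment triggers a resize.
p = zeros(1, N);
for i = 1:N
    p(i) = i^2;
end

sameResult = isequal(g, p);
disp(sameResult);
```

Wrapping each loop in `tic` and `toc` gives a rough timing comparison; the ratio depends heavily on the MATLAB version and on N, so no figure is suggested here.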

  28. Effect of Preallocation

  29. Expression Optimization • MATLAB interprets expressions naïvely in left-to-right order • Simple restructuring can significantly affect execution time, e.g.: • A*B*x : O(n^3) flops • A*(B*x) : O(n^2) flops
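The gap between the two groupings can be sketched with operation counts. The counts below use the textbook cost model for dense products (n^3 multiplications for an n-by-n times n-by-n product, n^2 for a matrix-vector product), which is an assumption; optimized libraries change the constants but not the growth rates. The size `n = 200` and the random operands are arbitrary.

```matlab
n = 200;
A = rand(n);
B = rand(n);
x = rand(n, 1);

% Multiplication counts under the assumed textbook cost model.
leftToRight = n^3 + n^2;   % (A*B) costs n^3, then (A*B)*x costs n^2
regrouped   = 2 * n^2;     % B*x costs n^2, then A*(B*x) costs n^2

r1 = A * B * x;            % evaluated left to right as (A*B)*x
r2 = A * (B * x);          % same value up to rounding, far less work
relErr = norm(r1 - r2) / norm(r1);

fprintf('%d vs %d multiplications, relative difference %.2e\n', ...
    leftToRight, regrouped, relErr);
```

The regrouping is legal because matrix multiplication is associative; the two results differ only by floating-point rounding.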

  30. Example of Expression Optimization • In QMR, 70% of execution time is spent in: w = A'*q; • A : 420x420 matrix • q, w : 420x1 vectors • A' = transpose(A)

  31. Expression Optimized Code • In Optimized QMR: • A'*q == (q'*A)' • w = (q'*A)'; • Transpose 2 vectors instead of 1 matrix • Fragment Speedup: 20 • Program Speedup: 3
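The identity used here can be checked numerically. The sketch below uses random stand-in data of the size quoted for QMR, not the benchmark's actual matrix, and the names `w1`, `w2` and `relErr` are illustrative.

```matlab
n = 420;                 % size quoted for the QMR example
A = rand(n);             % random stand-in data, not the QMR matrix
q = rand(n, 1);

w1 = A' * q;             % original form: written with a matrix transpose
w2 = (q' * A)';          % rewritten form: transposes only two n-vectors

relErr = norm(w1 - w2) / norm(w1);
fprintf('relative difference %.2e\n', relErr);
```

Both forms yield an n-by-1 column vector. The rewritten form moves 2n elements through transposes where a naive evaluation of the original would move n^2.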

  32. Effect of Expression Optimization

  33. Summary Source-Level

  34. Comparison

  35. Point #1: • Source optimizations can outperform MCC

  36. Point #2: • Source optimizations complement MCC

  37. Benefits of Source-Level Optimizations • Vectorization • Directly eliminates loop overhead • Move work to hand-optimized BLAS • Preallocation • Eliminates resizing overhead • Enables MCC array bounds elimination • Expression Optimization • Uses algebraic info unavailable in C/Fortran

  38. Implementation Status • Illinois/Cornell MaJic system • Just-in-time MATLAB interpreter/compiler • Incorporates Source-Level Transformation • Semantic Optimization (Menon/Pingali ICS’99) • Vectorization/BLAS call generation • Expression Optimization • Preallocation/Bounds Check Optimization (Work in progress)

  39. Conclusion • Source Level Optimizations are important for enhancing performance of MATLAB whether code is just interpreted or later compiled

  40. THE END

  41. Unsafe Type Check Removal • Correct on 11/12 Codes

  42. Unsafe Bounds Check Removal • Correct on 7/12 Codes
