
Data Shackling




  1. Data Shackling • Locality enhancement of dense numerical linear algebra codes • Traversals along co-ordinate axes • Data-centric reference for each statement • No array copying

  2. Workloads • Basic Linear Algebra Subroutines (BLAS) • Dot Product : x^T y • Saxpy : α * x + y • Matrix vector : y = A * x • Triangular Solve : Lx = b • Matrix Multiplication : C = C + A*B • Matrix Factorizations • Cholesky • LU • QR

  3. Why is the BLAS approach not good enough? • What is BLAS? • Machine-specific code for each BLAS routine • Matrix factorizations are blocked • Exposes a BLAS-like interface • BLAS is not portable, but matrix codes are • Automating the production of blocked codes is not easy • Too difficult an approach for compilers

  4. What is a data shackle? • One array in the program • Divided into blocks • Using parallel equally spaced cutting planes • Order for visiting blocks of data • One reference of that array is selected for each statement

  5. Intuitively…A Data Shackle • Specifies the order for touching blocks • Data centric reference • Which iterations of each statement are performed • When that block is touched • Code is generated to group those iterations together
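
The grouping described on this slide can be sketched in a few lines. This is a minimal illustration, not the generated code from the presentation: the names `block_order` and `shackled_iterations` are hypothetical, the array is assumed to be two-dimensional, and the blocks are assumed to be visited left to right, top to bottom.

```python
from itertools import product


def block_order(n_rows, n_cols, block):
    """Yield (row_range, col_range) per block, left to right, top to bottom."""
    for r0 in range(0, n_rows, block):
        for c0 in range(0, n_cols, block):
            yield (range(r0, min(r0 + block, n_rows)),
                   range(c0, min(c0 + block, n_cols)))


def shackled_iterations(iterations, reference, n_rows, n_cols, block):
    """Group iterations by the block that their data-centric reference touches.

    iterations: iteration vectors in original program order.
    reference:  maps an iteration vector to the (row, col) element it touches.
    Within each group the original program order is preserved.
    """
    iterations = list(iterations)
    for rows, cols in block_order(n_rows, n_cols, block):
        yield [it for it in iterations
               if reference(it)[0] in rows and reference(it)[1] in cols]


# A 4 x 4 array cut into 2 x 2 blocks; each iteration (i, j) touches element (i, j).
groups = list(shackled_iterations(product(range(4), range(4)),
                                  lambda it: it, 4, 4, 2))
```

Each entry of `groups` holds the iterations performed while one block is being touched, which is the set that the generated code would execute together.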

  6. Example – Matrix Multiplication • Obtain the data shackle on C • Divide C into 25 x 25 blocks • Vertical and horizontal cutting planes • Blocks visited left to right, top to bottom • C(i,j) is the data-centric reference

  7. Example Contd…
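
The code from this slide did not survive in the transcript. As a stand-in, here is a minimal sketch of what shackling C(i,j) could look like for C = C + A*B, written in the naive form where every iteration is guarded by a test on the current block. The function name `matmul_shackled` and the list-of-lists matrix representation are assumptions made for illustration.

```python
def matmul_shackled(A, B, C, block=25):
    """Compute C = C + A*B, visiting C one block at a time (naive guarded form)."""
    n, m, p = len(C), len(C[0]), len(B)
    for bi in range(0, n, block):          # block rows, top to bottom
        for bj in range(0, m, block):      # block columns, left to right
            # Original loop nest, unchanged, with a guard on the reference C(i,j).
            for i in range(n):
                for j in range(m):
                    for k in range(p):
                        # Run the statement only if C[i][j] is in the current block.
                        if bi <= i < bi + block and bj <= j < bj + block:
                            C[i][j] += A[i][k] * B[k][j]
    return C
```

The guard makes the intent explicit but wastes work: every block re-scans the full iteration space and discards most of it.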

  8. Discussion on example • More optimized code can be produced by folding the block test into the loop bounds • Within a block, the order of instructions is preserved • The same cannot be said for the program as a whole • Not the real blocked code • Low locality for A and B • Compose shackles (covered later)
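
Folding the block test into the loop bounds can be sketched as follows. This is an illustrative version under the same assumptions as before (list-of-lists matrices, blocks visited left to right, top to bottom); the name `matmul_shackled_folded` is hypothetical.

```python
def matmul_shackled_folded(A, B, C, block=25):
    """Compute C = C + A*B with the block membership test folded into loop bounds."""
    n, m, p = len(C), len(C[0]), len(B)
    for bi in range(0, n, block):                      # block rows of C
        for bj in range(0, m, block):                  # block columns of C
            for i in range(bi, min(bi + block, n)):    # only rows inside the block
                for j in range(bj, min(bj + block, m)):
                    # k still spans the full dimension: each block of C sweeps
                    # whole rows of A and whole columns of B.
                    for k in range(p):
                        C[i][j] += A[i][k] * B[k][j]
    return C
```

The untouched `k` loop is why this is not yet the fully blocked code: only C is accessed block by block, while A and B are still traversed along entire rows and columns.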

  9. Are all shackles legal? • Definitely not • Program instructions reordered
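
A small example shows why reordering can break a program. The loop below is a hypothetical illustration, not taken from the presentation: each iteration reads the value written by the previous one, so the order in which blocks are visited matters. The function name and the reversed block order are assumptions chosen to expose the problem.

```python
def prefix_sum_in_block_order(b, block, block_starts):
    """Run x[i] = x[i-1] + b[i], visiting blocks of x in the given order."""
    x = list(b)
    n = len(x)
    for start in block_starts:
        for i in range(max(start, 1), min(start + block, n)):
            # Reads x[i-1], which an earlier iteration is supposed to have written.
            x[i] = x[i - 1] + b[i]
    return x


b = [1, 1, 1, 1]
in_order = prefix_sum_in_block_order(b, 2, [0, 2])   # blocks in original order
reversed_ = prefix_sum_in_block_order(b, 2, [2, 0])  # blocks visited in reverse
print(in_order)    # [1, 2, 3, 4]
print(reversed_)   # [1, 2, 2, 3]
```

Visiting the second block first reads `x[1]` before it has been updated, so the result changes. A shackle that induces such an order violates a data dependence and is therefore not legal.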
