
Programming on IBM Cell Triblade

Programming on IBM Cell Triblade. Jagan Jayaraj, Pei-Hung Lin, Mike Knox and Paul Woodward, University of Minnesota, April 1, 2009. Rayleigh–Taylor instability.


Presentation Transcript


  1. Programming on IBM Cell Triblade • Jagan Jayaraj, Pei-Hung Lin, Mike Knox and Paul Woodward • University of Minnesota • April 1, 2009

  2. Rayleigh–Taylor instability • An instability of the interface between two fluids of different densities, which occurs when the lighter fluid pushes against the heavier fluid. • The R-T instability simulation is implemented with the multifluid Piecewise-Parabolic Method (PPM). • The program is written in Fortran.

  3. TriBlade • Two QS22 blades, each with 2 PowerXCell 8i CPUs • LS21 blade with two dual-core AMD Opterons • 16GB memory for LS21 and 8GB memory for QS22

  4. LCSE Cell Cluster • 6 Triblades • 4 QS22 Cell blades • 2 QS20 Cell blades • 4 AMD Quadcore Systems

  5. Login instructions • Account credentials should be in your email. • Guest account: lcse / lcse$ncsa! • Login steps: • SSH to frodo.lcse.umn.edu • Once logged in to frodo SSH to an assigned Cell Processor host • AMD – rra001a ~ rra006a • Cell – rra001b / rra001c ~ rra006b/rra006c

  6. Software available • Cell SDK 3.1 • OpenMPI 1.3 • DaCS Fortran bindings • Compilers • AMD: gfortran, gcc 4.1.2 • PPU: ppuxlf, ppu-gcc • SPU: spuxlf, spu-gcc • Example code is available in /mnt/scratch/NCSA_Example

  7. Compilation and Execution • On AMD node: • make ppm4f-x86 • On Cell node: • make ppm4f-ppu • On AMD node: • ./ppm4f-x86

  8. Triblade programming paradigm • Three levels of parallelism: • within-Cell • within-node • node-to-node • Compute-communication overlap • DMA • DaCS • MPI
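The compute-communication overlap named on this slide is commonly achieved with double buffering: while one buffer is being computed on, the next chunk of data is already being fetched into the other. The following is only a minimal sketch of that loop structure in portable C. The names `start_fetch`, `sum_squares_double_buffered` and `CHUNK` are hypothetical, and the `memcpy` merely stands in for an asynchronous DMA transfer, so nothing actually overlaps when this runs on an ordinary CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4  /* elements per transfer; hypothetical size chosen for illustration */

/* Stand-in for starting an asynchronous DMA "get" from main memory into
 * a local buffer. Here it is a plain synchronous copy (an assumption made
 * so that the sketch runs anywhere). */
void start_fetch(float *local, const float *remote, size_t n)
{
    memcpy(local, remote, n * sizeof *local);
}

/* Sum the squares of `total` floats, processing CHUNK elements at a time
 * and alternating between two local buffers. */
float sum_squares_double_buffered(const float *remote, size_t total)
{
    assert(total % CHUNK == 0);

    float buf[2][CHUNK];
    size_t nchunks = total / CHUNK;
    float sum = 0.0f;

    if (nchunks == 0)
        return sum;

    start_fetch(buf[0], remote, CHUNK);  /* prime the pipeline */

    for (size_t c = 0; c < nchunks; c++) {
        int cur = (int)(c & 1);

        /* Request the next chunk before computing on the current one. */
        if (c + 1 < nchunks)
            start_fetch(buf[cur ^ 1], remote + (c + 1) * CHUNK, CHUNK);

        /* With a real asynchronous transfer, the code would wait here
         * until the transfer that filled buf[cur] has completed. */
        for (size_t i = 0; i < CHUNK; i++)
            sum += buf[cur][i] * buf[cur][i];
    }
    return sum;
}
```

The same pattern can in principle be applied at each of the three levels listed above, with DMA, DaCS or MPI as the transfer mechanism.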

  9. Programming for IBM Cell Tri-blade • Single code for Roadrunner and non-RR systems • Using lots of #ifdef, #if, #endif… • Using the preprocessor to generate three codes • Minimize manual translation of the SPU code • Using a Fortran-to-Cell-C translator, • Tedious portions of the SPU code can be translated. • Fortran codes for PPU and AMD • Fortran binding programs for C intrinsic libraries • Keep memory footprint small
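The single-source approach above can be illustrated with a small preprocessor sketch. It is written in C rather than Fortran only to keep it short, and the macro names `BUILD_SPU` and `BUILD_PPU`, as well as the helper functions, are hypothetical and not taken from the original code. Compiling the same file with a different `-D` flag selects a different variant, and with no flag the AMD variant is assumed.

```c
/* Select per-target settings at compile time (macro names are hypothetical). */
#if defined(BUILD_SPU)
#  define TARGET_NAME "spu"
#  define TARGET_DOES_MPI 0   /* SPU: just compute */
#elif defined(BUILD_PPU)
#  define TARGET_NAME "ppu"
#  define TARGET_DOES_MPI 0   /* PPU: transfer data, manage SPUs */
#else
#  define TARGET_NAME "amd"
#  define TARGET_DOES_MPI 1   /* AMD: I/O and MPI */
#endif

/* Name of the target this translation unit was built for. */
const char *target_name(void)
{
    return TARGET_NAME;
}

/* Nonzero if this target is the one responsible for MPI traffic. */
int target_does_mpi(void)
{
    return TARGET_DOES_MPI;
}
```

For example, `gcc -DBUILD_PPU -c file.c` would build the PPU variant of this sketch, while a plain `gcc -c file.c` would build the AMD variant.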

  10. Build flow (diagram) • Single Source Code • Preprocessor • SPU Fortran code, PPU Fortran code, AMD Fortran code • Translation • SPU C code • Fortran Binding Programs • SPU C Compiler, PPU Fortran Compiler, GNU Fortran Compiler • Embedded SPU Executable, PPU Executable, AMD Executable

  11. Division of labor • Define jobs for AMD, PPU and SPU clearly • AMD: I/O, MPI, relay data to Cell… • PPU: Transfer data, manage SPUs • SPU: Just compute

  12. Items to take care of • Three codes for three different ISAs • Different endianness between PPU and AMD • Need to do byte-swapping • 64-bit/32-bit conversion • SPU supports 32-bit addresses only, but DaCS requires 64-bit address mode
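The byte-swapping mentioned above is needed because the PPU is big-endian while the AMD Opteron is little-endian, so multi-byte values exchanged between them must have their byte order reversed on one side. A minimal sketch for 32-bit data follows. The function names `swap32` and `swap_floats` are hypothetical, and which side performs the swap in the original code is not stated in the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of one 32-bit word. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Byte-swap an array of 32-bit floats in place. The bits are moved with
 * memcpy so that no floating-point conversion is applied to them. */
void swap_floats(float *buf, size_t n)
{
    assert(buf != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        uint32_t bits;
        memcpy(&bits, &buf[i], sizeof bits);
        bits = swap32(bits);
        memcpy(&buf[i], &bits, sizeof bits);
    }
}
```

Applying `swap_floats` twice restores the original values, which makes a convenient sanity check when wiring it into a data-relay path.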

  13. Translator • Fortran to C with Cell extensions • Needs directives • Built with ANTLR • Handles: • Vector and scalar loops • DMAs (Including List DMAs) • Variable declarations • Conditional vector moves

  14. References • Woodward, P. R., J. Jayaraj, P.-H. Lin, and P.-C. Yew, “Moving Scientific Codes to Multicore Microprocessor CPUs,” Computing in Science & Engineering, special issue on novel architectures, Nov. 2008, pp. 16-25. Also available at www.lcse.umn.edu/CiSE. • Woodward, P. R., J. Jayaraj, P.-H. Lin, and D. Porter, “Programming Techniques for Moving Scientific Simulation Codes to Roadrunner,” tutorial given 3/12/08 at Los Alamos, link available at www.lanl.gov/roadrunner/rrtechnicalseminars2008. • Woodward, P. R., J. Jayaraj, P.-H. Lin, and W. Dai, “First Experience of Compressible Gas Dynamics Simulation on the Los Alamos Roadrunner Machine,” submitted to Concurrency and Computation: Practice and Experience, preprint available at www.lcse.umn.edu/RR-docs. • http://www.lcse.umn.edu/NCSA_Workshop/
