
Ocelot and the SST-MacSim Simulator

Ocelot and the SST-MacSim Simulator. Genie Hsieh§, Andrew Kerr, Hyesoon Kim, Jaekyu Lee, Nagesh Lakshminarayana, Arun Rodrigues§, Sudhakar Yalamanchili. School of Computer Science and School of Electrical and Computer Engineering, Georgia Institute of Technology.


Presentation Transcript


  1. Ocelot and the SST-MacSim Simulator. Genie Hsieh§, Andrew Kerr, Hyesoon Kim, Jaekyu Lee, Nagesh Lakshminarayana, Arun Rodrigues§, Sudhakar Yalamanchili. School of Computer Science and School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332. §Scalable Computer Architecture Department, Sandia National Laboratories, Albuquerque, NM 87185.

  2. System Diversity: Amazon EC2 GPU Instances; Mobile Platforms; Heterogeneity is Mainstream; Tianhe-1A; Keeneland System

  3. Heterogeneity On-Chip: Vector Extensions; AES Instructions; Programmable Pipeline (GEN6); Programmable Accelerator; Multiple Models of Computation; Multi-ISA; Denver; Sandy Bridge; 16 PowerPC cores; ARM Style; Accelerators • Crypto Engine • RegEx Engine • XML Engine; Memory; PowerEN

  4. Heterogeneous Systems: Keeneland (courtesy J. Vetter, GT/ORNL). Hierarchy: Keeneland System (7 Racks); Rack (6 Chassis); S6500 Chassis (4 Nodes); ProLiant SL390s G7 (2 CPUs, 3 GPUs); M2070; Xeon 5660. Figure labels: 201528 GFLOPS; 40306 GFLOPS; 12000-Series Director Switch; 6718 GFLOPS; 1679 GFLOPS; 24/18 GB; 515 GFLOPS; 67 GFLOPS. Integrated with NICS Datacenter GPFS and TG. Full PCIe x16 bandwidth to all GPUs.
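The GFLOPS labels on the Keeneland slide appear to roll up from device to node to chassis to rack. The sketch below is an illustration only: it assumes that 67 GFLOPS belongs to the Xeon 5660 and 515 GFLOPS to the M2070 (the transcript does not pair the labels with the components explicitly), and the constant and function names are made up for this example.

```python
# Per-device peak figures as printed on the slide (rounded values).
# ASSUMPTION: 67 GFLOPS is the Xeon 5660 and 515 GFLOPS is the M2070;
# the flattened transcript does not state this pairing.
XEON_5660_GFLOPS = 67
M2070_GFLOPS = 515


def node_peak_gflops(cpus=2, gpus=3):
    """Peak GFLOPS of one ProLiant SL390s G7 node (2 CPUs, 3 GPUs)."""
    return cpus * XEON_5660_GFLOPS + gpus * M2070_GFLOPS


node = node_peak_gflops()   # 2 * 67 + 3 * 515
chassis = 4 * node          # S6500 chassis holds 4 nodes
rack = 6 * chassis          # a rack holds 6 chassis

print(node, chassis, rack)
```

Under these assumptions the node total reproduces the slide's 1679 GFLOPS. The chassis and rack totals (6716 and 40296) land slightly below the slide's 6718 and 40306, presumably because the slide aggregates unrounded per-device numbers.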

  5. Heterogeneous Architecture & Systems Research: Common Research Themes. Focus on explicitly data parallel languages – bulk synchronous models. • Lexical Analyzer • Parser • Semantic analysis • Memory Optimizations • Program Transformations • Control Flow Optimizations • + Many more • Optimization • Code generation • Post pass optimization • Instruction set architecture • Microarchitecture • Memory systems • Network on Chip • Power Management • + Many more. SIMT (Fermi); VLIW (Cayman); New Designs

  6. Research Infrastructure Challenges • Open source • Compiler infrastructures for GPU computing • Microarchitecture cycle-level timing simulators for heterogeneous architectures • Integration between compiler, simulators, and models • Scalable simulation infrastructures • Simulation wall! • Ability to integrate point tools

  7. Tutorial Overview. Ocelot: Dynamic Execution Infrastructure; low-level compiler infrastructure for GPU computing (Andrew Kerr, Sudhakar Yalamanchili). MacSim: Heterogeneous Architecture Simulator; heterogeneous cycle-level architecture models (J. Lee, N. Lakshminarayana, H. Kim). SST: Structural Simulation Toolkit; parallel simulation infrastructure (G. Hsieh, A. Rodrigues).

  8. Tutorial Schedule
