
An Execution Model for Heterogeneous Multicore Architectures

Gregory Diamos, Andrew Kerr, and Sudhakar Yalamanchili. Computer Architecture and Systems Laboratory, Center for Experimental Research in Computer Systems, School of Electrical and Computer Engineering, Georgia Institute of Technology.


Presentation Transcript


  1. An Execution Model for Heterogeneous Multicore Architectures. Gregory Diamos, Andrew Kerr, and Sudhakar Yalamanchili. Computer Architecture and Systems Laboratory, Center for Experimental Research in Computer Systems, School of Electrical and Computer Engineering, Georgia Institute of Technology

  2. Software Challenges of Heterogeneity • Programming Model • Execution Model • Portability • Performance

  3. System Space
     • Axes: Level of Abstraction; System Size and Configuration
     • System sizes: Single GPU, multicore CPU; Multi GPU, multicore CPU; Multi-node
     • Abstraction levels: Runtime Execution Model (Harmony); Runtime Translation of Data-Parallel IR (Ocelot)

  4. Scalable Portable Execution – Harmony Runtime
     • Application structure: readInputs(); computeInvariants(); for all chunks { simulateChunk(); } generateResults();
     • Transparent scheduling, execution management of chunks
     • Diagram labels: Cap Model 3; Memory; kernel (Inputs, Outputs); chunk; Harmony Run-time; CPUs and accelerators (ACC), each with FIFO, Local Memory, Cache, and DMA; Network (e.g., HyperTransport, QPI, PCIe)
     • Binary compatibility across system sizes
     • Minimize/avoid retuning and porting applications as you add accelerators
     • Advanced optimizations: speculation, performance prediction, kernel fusion
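The idea on this slide, that the same chunked application can run unchanged as accelerators are added because a runtime decides where each chunk executes, can be illustrated with a minimal sketch. This is not Harmony's API or its scheduling algorithm; the function name `schedule_chunks`, the work-unit costs, and the device speeds are all hypothetical, and the policy shown is plain greedy list scheduling (each chunk goes to whichever device frees up first).

```python
import heapq


def schedule_chunks(chunk_costs, device_speeds):
    """Greedy list scheduling: give each chunk to the device that frees up first.

    chunk_costs: abstract work units per chunk (hypothetical numbers).
    device_speeds: work units per second for each CPU or accelerator (hypothetical).
    Returns (makespan, assignment), where assignment[i] is the device index
    chosen for chunk i.
    """
    # Min-heap of (time at which the device becomes free, device index).
    ready = [(0.0, dev) for dev in range(len(device_speeds))]
    heapq.heapify(ready)

    assignment = []
    makespan = 0.0
    for cost in chunk_costs:
        free_at, dev = heapq.heappop(ready)
        finish = free_at + cost / device_speeds[dev]
        assignment.append(dev)
        makespan = max(makespan, finish)
        heapq.heappush(ready, (finish, dev))
    return makespan, assignment


if __name__ == "__main__":
    chunks = [4] * 8  # eight equal chunks, as in "for all chunks { simulateChunk(); }"
    print(schedule_chunks(chunks, [1.0]))       # one CPU only
    print(schedule_chunks(chunks, [1.0, 4.0]))  # same chunks, one accelerator added
```

With these made-up numbers, the single-CPU run finishes at time 32.0 and the run with one added 4x accelerator finishes at 8.0, without any change to the list of chunks. A real runtime would also have to account for data movement between memories, which this sketch ignores.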

  5. Emerging Environment
     • Components: Datalog; CUDA/OpenCL; Language Front End (one per language); Kernel IR; Run Time (Harmony); Ocelot; Emulator; LLVM I/F; CUDAJIT; GPGPU Simulator; Supported ISAs (MIPS, SPARC, x86, etc.)
     • Status notes (in diagram order): Summer 2009, with Prof. Nate Clark; Single node/multi-GPU; Test and Debug; In progress (Fall 2009)
     • Also named in the diagram: Prof. H. Kim

  6. Emerging HVM Platform Architecture (with K. Schwan and A. Gavrilovska)

  7. Problem Scaling – Risk Analysis Application
     • Measured execution times
     • GPU interactive overhead dominates
     • With the latest CPUs (2x faster) and GPUs (4x faster), the GPU advantage should grow by 2x
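The 2x claim on this slide follows from simple arithmetic, sketched below. The symbols are assumptions introduced here, not taken from the slides: $T_{\text{CPU}}$ and $T_{\text{GPU}}$ are the measured execution times, $S$ is the GPU advantage, and execution time is assumed to scale inversely with the stated hardware speedup.

```latex
S = \frac{T_{\text{CPU}}}{T_{\text{GPU}}}, \qquad
S' = \frac{T_{\text{CPU}} / 2}{T_{\text{GPU}} / 4}
   = 2 \cdot \frac{T_{\text{CPU}}}{T_{\text{GPU}}}
   = 2S
```

If part of the GPU time is a fixed overhead $T_o$ that does not shrink with faster hardware (again an assumption, with $T_{\text{GPU}} = T_o + T_k$), the gain is smaller than 2x:

```latex
S' = \frac{T_{\text{CPU}} / 2}{T_o + T_k / 4}
   < \frac{T_{\text{CPU}} / 2}{(T_o + T_k) / 4}
   = 2S \quad \text{for } T_o > 0
```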

  8. Other Applications

  9. GPU Compilation Flow
     • Abstract Syntax Tree (Datalog Clauses)
     • Clauses to Execution Units: Execution Group; P; GPU (EU)
     • Predicates to Data Structures; Execution Units to Algorithms (Kernels)
     • Data Structures; Compute Kernels
     • Runtime Mapping of Kernels to Cores
     • Runtime: GPU Core; CPU Core
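To make two of the steps in this flow concrete ("Predicates to Data Structures" and "Execution Units to Algorithms (Kernels)"), here is a minimal sketch under stated assumptions. It is not the authors' compiler or its output: the rule `two_hop(x, z) :- edge(x, y), edge(y, z)`, the relation names, and the function `join_kernel` are hypothetical, chosen only to show a predicate stored as a set of tuples and a clause body evaluated as a join.

```python
def join_kernel(left, right):
    """Join two binary relations where left's second column equals right's first.

    Stands in for a compute kernel generated from a clause body; on a GPU the
    probe loop would be the data-parallel part.
    """
    # Build phase: index the right-hand relation by its first column.
    index = {}
    for y, z in right:
        index.setdefault(y, []).append(z)
    # Probe phase: each left tuple looks up its matches independently.
    return {(x, z) for x, y in left for z in index.get(y, [])}


# Predicate -> data structure: the facts of edge/2 as a set of tuples.
edge = {(1, 2), (2, 3), (3, 4)}

# Clause -> kernel: two_hop(x, z) :- edge(x, y), edge(y, z).
two_hop = join_kernel(edge, edge)
print(sorted(two_hop))
```

Because each probe is independent of the others, a runtime is free to map that step to a GPU core or a CPU core, which is the role the last two stages of the flow assign to the runtime.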
