
vCUDA: GPU Accelerated High Performance Computing in Virtual Machines

Lin Shi, Hao Chen and Jianhua Sun. vCUDA: GPU Accelerated High Performance Computing in Virtual Machines. IEEE 2009.

Presentation Transcript


  1. Lin Shi, Hao Chen and Jianhua Sun. vCUDA: GPU Accelerated High Performance Computing in Virtual Machines. IEEE 2009

  2. Lecture Outline • Abstract 3 • Background 4 • Motivation 5 • CUDA Architecture 7 • vCUDA Architecture 8 • Experiment Result 13 • Conclusion 19

  3. Abstract • This paper describes vCUDA, a GPGPU computing solution for virtual machines. The authors claim that API interception and redirection can give applications transparent, high-performance access to the GPU. • The paper also evaluates the performance overhead of the framework.

  4. Background • VM (Virtual Machine) • CUDA (Compute Unified Device Architecture) • API (Application Programming Interface) • API Interception and Redirection • RPC (Remote Procedure Call)

  5. Motivation • Virtualization may be the simplest route to a heterogeneous computation environment. • Hardware varies by vendor, so it is impractical for VM developers to implement drivers for every device (for licensing reasons, vendors rarely publish their driver source code or kernel techniques).

  6. Motivation ( cont. ) • Existing virtualization solutions (e.g., VMGL) support only accelerated graphics APIs such as OpenGL, which are not intended for general-purpose computation.

  7. CUDA Architecture • Component stack: User Application (CUDA Extensions to C) → CUDA Runtime API → CUDA Driver API → CUDA Driver → CUDA-Enabled Device

  8. vCUDA Architecture • Split the stack into a software binding and a hardware binding: User Application / Part of SDK (CUDA Extensions to C) → CUDA Runtime API (soft binding) → CUDA Driver API (communicates directly with the driver) → CUDA Driver (hard binding) → CUDA-Enabled Device
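The soft binding works by interposing on the CUDA Runtime API: the guest application links against a stand-in library whose entry points record and redirect calls instead of touching hardware. A minimal sketch of that idea, modeled in Python (the `cudaMalloc` signature and the fake handle are illustrative, not the real CUDA ABI):

```python
# Illustrative model of API interception: the guest links against a
# fake runtime library whose entry points capture each call (the
# "soft binding") instead of driving real hardware.

intercepted_calls = []

def cudaMalloc(size):
    """Fake runtime entry point: record the call for later forwarding."""
    intercepted_calls.append(("cudaMalloc", size))
    # Return a placeholder handle; the real device pointer lives on
    # the host side, behind the hard binding.
    return len(intercepted_calls)

handle = cudaMalloc(1024)
print(intercepted_calls)  # [('cudaMalloc', 1024)]
```

In the real system this interposition happens at the C library level (the guest loads the fake `libcudart` in place of NVIDIA's), but the capture-and-redirect pattern is the same.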

  9. vCUDA Architecture ( cont. ) • Re-group the stack into a guest side and a host side: • Remote binding (guest OS): User Application / Part of SDK (CUDA Extensions to C) → [v]CUDA Runtime API → [v]CUDA Driver API → [v]CUDA-Enabled Device (vGPU) • Host binding: CUDA Driver API → CUDA Driver → CUDA-Enabled Device

  10. vCUDA Architecture ( cont. ) • Use a fake API as an adapter between the virtual driver in the guest and the real driver on the host. • API interception must handle: parameter passing, ordering semantics, and hardware state. • Communication: lazy-RPC transmission, with XML-RPC as the high-level transport (chosen for cross-platform portability).
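The redirection step can be sketched with Python's standard XML-RPC modules: a host-side stub exposes an RPC method, and the guest-side fake API forwards intercepted calls to it. The method name `cuda_malloc`, the in-process server, and the returned fields are assumptions for the demo; the real vStub runs in the host OS and calls the actual CUDA driver.

```python
# Sketch of forwarding an intercepted call over XML-RPC, the paper's
# cross-platform transport. Everything here is a stand-in: the host
# stub below fakes the driver's reply instead of calling CUDA.
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

def cuda_malloc(size):
    # Host-side stub: the real vCUDA stub would invoke the CUDA
    # driver here and return the resulting device pointer.
    return {"dev_ptr": 0x1000, "size": size}

# Bind to an ephemeral port and serve in the background.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
port = server.server_address[1]
server.register_function(cuda_malloc)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Guest side: the intercepted cudaMalloc is redirected to the host stub.
proxy = xmlrpc.client.ServerProxy(f"http://127.0.0.1:{port}")
result = proxy.cuda_malloc(1024)
print(result["size"])  # 1024
```

XML-RPC serializes every argument and result as XML text, which is exactly why the paper identifies RPC XML parsing as a data-transmission bottleneck.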

  11. vCUDA Architecture ( cont. ) • [Figure: instant API calls cross from the virtual machine OS to the host OS immediately, while non-instant API calls are batched via lazy RPC.]

  12. vCUDA Architecture ( cont. ) • vCUDA API with a virtual GPU: the vGPU in the guest tracks hardware state, and a stub (vStub) forwards API invocations to the host-side stub. • Lazy RPC: non-instant API calls are packaged and deferred; only instant calls trigger an RPC, which reduces the overhead of switching between host OS and guest OS.
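The batching logic behind Lazy RPC can be sketched as follows. Calls whose results are not needed immediately (e.g., kernel-launch setup) are queued in the guest; an instant call (e.g., a device-to-host copy the application will read) flushes the whole queue in a single RPC. The class and method names are illustrative, not taken from the vCUDA source:

```python
# Toy model of Lazy RPC: defer non-instant calls, pay one guest/host
# round trip only when an instant call forces synchronization.

class VGpuStub:
    def __init__(self):
        self.queue = []           # pending non-instant calls
        self.rpc_round_trips = 0  # guest/host switches actually paid

    def lazy_call(self, name, *args):
        self.queue.append((name, args))  # defer; no RPC yet

    def instant_call(self, name, *args):
        # Flush everything queued, plus this call, in a single RPC.
        batch = self.queue + [(name, args)]
        self.queue = []
        self.rpc_round_trips += 1
        return self._remote_execute(batch)

    def _remote_execute(self, batch):
        # Stand-in for the host-side stub executing the batch in order.
        return len(batch)

gpu = VGpuStub()
gpu.lazy_call("cudaConfigureCall", (64, 1, 1), (256, 1, 1))
gpu.lazy_call("cudaSetupArgument", "arg0")
gpu.lazy_call("cudaLaunch", "kernel")
executed = gpu.instant_call("cudaMemcpy", "dst", "src", 4096)
print(gpu.rpc_round_trips)  # 1 round trip instead of 4
```

Correctness depends on the host replaying the batch in order, which is why the interception layer must preserve ordering semantics.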

  13. Experiment Result • Criteria • Performance • Lazy RPC and Concurrency • Suspend & Resume • Compatibility
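The suspend & resume criterion follows from the vGPU's state tracking: because the guest-side vGPU records the API calls that created device state, that state can be rebuilt after the VM is suspended and resumed. A toy sketch of record-and-replay, with all names illustrative rather than from the vCUDA implementation:

```python
# Toy model of vGPU state tracking for suspend/resume: record the
# state-creating calls so they can be replayed on resume.

class VGpu:
    def __init__(self):
        self.state_log = []  # calls that created device-side state

    def execute(self, name, *args):
        # Hypothetical classification: these calls alter device state.
        if name in {"cudaMalloc", "cudaMemcpy", "cudaBindTexture"}:
            self.state_log.append((name, args))
        # ... forward the call to the host as usual ...

    def resume_on(self, fresh_vgpu):
        # Replay the log to rebuild device state after resume.
        for name, args in self.state_log:
            fresh_vgpu.execute(name, *args)

old = VGpu()
old.execute("cudaMalloc", 4096)
old.execute("cudaMemcpy", "devptr", "hostbuf", 4096)

new = VGpu()
old.resume_on(new)
print(len(new.state_log))  # 2: state rebuilt on the new vGPU
```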

  14. Experiment Result ( cont. ) • [Figure not reproduced]

  15. Experiment Result ( cont. ) • [Figure not reproduced]

  16. Experiment Result ( cont. ) • [Figure not reproduced]

  17. Experiment Result ( cont. ) • [Figure not reproduced]

  18. Experiment Result ( cont. ) • [Figure not reproduced] • Benchmarks: • MV: Matrix–Vector Multiplication algorithm • StoreGPU: exploiting graphics processing units to accelerate distributed storage systems • MRRR: Multiple Relatively Robust Representations • GPUmg: molecular dynamics simulation with GPU

  19. Conclusion • The authors developed a CUDA interface for virtual machines that is compatible with the native interface. Data transmission is a significant bottleneck, largely due to RPC XML parsing. • This presentation briefly described the major architecture of vCUDA and the ideas behind it. The architecture could be extended as a component/solution for GPU support in cloud computing.

  20. End of Presentation • Thank you for listening.
