vCUDA: GPU Accelerated High Performance Computing in Virtual Machines

Lin Shi, Hao Chen and Jianhua Sun

IEEE 2009


Lecture Outline

  • Abstract

  • Background

  • Motivation

  • CUDA Architecture

  • vCUDA Architecture

  • Experiment Result

  • Conclusion


Abstract

  • This paper describes vCUDA, a GPGPU computing solution for virtual machines. The authors claim that API interception and redirection can provide transparent, high-performance GPU access to applications running inside a VM.

  • The paper also evaluates the performance overhead introduced by the framework.


Background

  • VM (Virtual Machine)

  • CUDA (Compute Unified Device Architecture)

  • API (Application Programming Interface)

  • API Interception and Redirection

  • RPC (Remote Procedure Call)


Motivation

  • Virtualization may be the simplest way to provide a uniform interface to a heterogeneous computing environment.

  • Hardware varies across vendors, and it is impractical for VM developers to implement drivers for every device (for licensing reasons, vendors generally do not publish their driver sources or kernel-level techniques).


Motivation (cont.)

  • Existing virtualization solutions such as VMGL only support accelerated graphics APIs like OpenGL and cannot be used for general-purpose computation.


CUDA Architecture

  • Component Stack

    User Application
    << CUDA Extensions to C >>
    CUDA Runtime API
    CUDA Driver API
    CUDA Driver
    CUDA Enabled Device
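To make the stack concrete, here is a minimal, illustrative CUDA host program (not taken from the paper): the <<< >>> launch syntax belongs to the CUDA Extensions to C, while cudaMalloc/cudaMemcpy are runtime API calls serviced through the driver API, the driver, and finally the device.

    #include <cuda_runtime.h>
    #include <stdio.h>

    /* A trivial kernel: the <<< >>> launch syntax below is part of the
       "CUDA Extensions to C" layer and is lowered to runtime API calls. */
    __global__ void scale(float *data, float factor, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= factor;
    }

    int main(void)
    {
        const int n = 1024;
        float host[1024];
        for (int i = 0; i < n; ++i) host[i] = (float)i;

        float *dev = NULL;
        cudaMalloc((void **)&dev, n * sizeof(float));      /* runtime API -> driver API -> driver -> device */
        cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);
        scale<<<(n + 255) / 256, 256>>>(dev, 2.0f, n);     /* kernel launch via the C extensions */
        cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
        cudaFree(dev);

        printf("host[2] = %f\n", host[2]);                 /* expect 4.0 */
        return 0;
    }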


vCUDA Architecture

  • Split the stack into a hardware binding and a software binding.

    User Application
    << CUDA Extensions to C >>   (part of SDK)
    CUDA Runtime API             (soft binding)
    CUDA Driver API              (soft binding; communicates directly with the driver)
    CUDA Driver                  (hard binding)
    CUDA Enabled Device          (hard binding)


vCUDA Architecture (cont.)

  • Re-group the stack into a host side and a remote (guest) side; a host-side stub sketch follows the diagram below.

    Remote binding (guest OS):
      User Application
      << CUDA Extensions to C >>   (part of SDK)
      [v]CUDA Runtime API
      [v]CUDA Driver API
      [v]CUDA Enabled Device (vGPU)

    Host binding (host OS):
      CUDA Driver API
      CUDA Driver
      CUDA Enabled Device
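The host binding is served by a stub process on the host OS. The following is only a rough sketch of what that stub's dispatch might look like, replaying decoded requests against the real CUDA runtime; the vcuda_request_t/vcuda_reply_t wire format and the opcodes are hypothetical, not the paper's actual protocol.

    #include <cuda_runtime.h>
    #include <string.h>

    /* Hypothetical wire format for one forwarded CUDA call (illustrative only). */
    typedef enum { VC_OP_MALLOC, VC_OP_MEMCPY_H2D, VC_OP_FREE } vcuda_op_t;

    typedef struct {
        vcuda_op_t op;
        size_t     size;
        void      *dev_ptr;          /* device pointer, meaningful only on the host side */
        char       payload[4096];    /* inline data for small host-to-device copies      */
    } vcuda_request_t;

    typedef struct {
        cudaError_t status;
        void       *dev_ptr;         /* handed back to the guest as an opaque handle */
    } vcuda_reply_t;

    /* Dispatch one decoded request against the real CUDA runtime (and thus the
       real driver and GPU); the reply is marshalled back to the guest binding. */
    void vcuda_stub_dispatch(const vcuda_request_t *req, vcuda_reply_t *rep)
    {
        memset(rep, 0, sizeof(*rep));
        switch (req->op) {
        case VC_OP_MALLOC:
            rep->status = cudaMalloc(&rep->dev_ptr, req->size);
            break;
        case VC_OP_MEMCPY_H2D:
            rep->status = cudaMemcpy(req->dev_ptr, req->payload, req->size,
                                     cudaMemcpyHostToDevice);
            break;
        case VC_OP_FREE:
            rep->status = cudaFree(req->dev_ptr);
            break;
        }
    }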


vCUDA Architecture (cont.)

  • Use a fake (wrapper) CUDA API as an adapter between the virtual driver in the guest and the real driver on the host (see the sketch after this slide).

    • API Interception

      • Parameters passed

      • Order Semantics

      • Hardware State

      • Communication

  • Use Lazy-RPC Transmission

    • Use XML-RPC for high-level communication (to meet the cross-platform requirement).

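On the guest side, interception could look roughly like the sketch below: a replacement runtime library marshals the parameters of each call and forwards them to the host stub instead of touching a local driver. The vc_* types and vcuda_rpc_call are hypothetical stand-ins for the XML-RPC transport, and in a real fake libcudart these functions would be exported under the original cudaMalloc/cudaMemcpy symbol names.

    #include <cuda_runtime.h>    /* only for cudaError_t */
    #include <stddef.h>

    /* Hypothetical transport: marshal one request, send it to the host stub
       over (XML-)RPC, and block until the reply arrives.                    */
    typedef struct { int op; size_t size; void *dev_ptr; const void *data; } vc_req_t;
    typedef struct { cudaError_t status; void *dev_ptr; } vc_rep_t;
    extern vc_rep_t vcuda_rpc_call(const vc_req_t *req);

    /* Intercepted cudaMalloc: no local driver is involved; the returned
       pointer is valid only on the host GPU and is kept by the guest
       application purely as an opaque handle.                              */
    cudaError_t vcuda_cudaMalloc(void **devPtr, size_t size)
    {
        vc_req_t req = { /*op=*/0, size, NULL, NULL };
        vc_rep_t rep = vcuda_rpc_call(&req);
        *devPtr = rep.dev_ptr;
        return rep.status;
    }

    /* Intercepted cudaMemcpy (host-to-device direction only, for brevity):
       the source buffer is shipped to the host as part of the request.     */
    cudaError_t vcuda_cudaMemcpy(void *dst, const void *src, size_t count)
    {
        vc_req_t req = { /*op=*/1, count, dst, src };
        return vcuda_rpc_call(&req).status;
    }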


vCUDA Architecture (cont.)

[Figure: Instant API calls cross from the virtual machine OS to the host OS immediately, while non-instant API calls are batched and transmitted via lazy RPC.]


vCUDA Architecture (cont.)

  • vCUDA API with virtual GPU

  • Lazy RPC

    • Reduces the overhead of switching between the guest OS and the host OS (a sketch follows below).

[Figure: Lazy RPC with the vGPU. The vGPU tracks hardware state in the guest. Non-instant API calls from the application are packaged by the vStub and transmitted lazily in batches; instant API calls are forwarded immediately to the Stub on the host, which invokes the real GPU.]
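A rough sketch of the lazy-RPC idea, under illustrative assumptions (the names and the classification of calls are not the paper's code): non-instant calls, whose results the caller does not need immediately, are merely queued in the guest; the queue is flushed in a single transmission when an instant call arrives, so many API invocations cost only one guest/host switch.

    #include <stddef.h>

    #define VC_BATCH_MAX 64

    typedef struct { int op; size_t size; void *dev_ptr; const void *data; } vc_req_t;

    /* Hypothetical transport primitives (e.g. built on XML-RPC). */
    extern void vcuda_rpc_send_batch(const vc_req_t *reqs, int n);  /* one transmission, no reply awaited */
    extern int  vcuda_rpc_call_sync(const vc_req_t *req);           /* one round trip, waits for a reply  */

    static vc_req_t vc_pending[VC_BATCH_MAX];
    static int      vc_npending = 0;

    /* Non-instant call (e.g. setting up kernel arguments): just record it;
       no switch between guest and host happens yet.                        */
    void vcuda_enqueue(const vc_req_t *req)
    {
        if (vc_npending == VC_BATCH_MAX) {           /* batch full: flush early */
            vcuda_rpc_send_batch(vc_pending, vc_npending);
            vc_npending = 0;
        }
        vc_pending[vc_npending++] = *req;
    }

    /* Instant call (e.g. a device-to-host copy whose result the caller needs):
       flush everything queued so far, then issue this call synchronously.     */
    int vcuda_call_instant(const vc_req_t *req)
    {
        if (vc_npending > 0) {
            vcuda_rpc_send_batch(vc_pending, vc_npending);
            vc_npending = 0;
        }
        return vcuda_rpc_call_sync(req);
    }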


Experiment Result

  • Criteria

    • Performance

    • Lazy RPC and Concurrency

    • Suspend & Resume

    • Compatibility



  • Benchmarks

    • MV: Matrix-Vector Multiplication Algorithm

    • StoreGPU: Exploiting Graphics Processing Units to Accelerate Distributed Storage Systems

    • MRRR: Multiple Relatively Robust Representations

    • GPUmg: Molecular Dynamics Simulation with GPU


Conclusion

  • The authors developed a CUDA interface for virtual machines that is compatible with the native interface. Data transmission is a significant bottleneck, mainly due to XML parsing in the RPC layer.

  • This presentation briefly presented the main architecture and ideas behind vCUDA. The architecture could be extended as a component or solution for adding GPU support to cloud computing.


End of Presentation

Thank you for listening.

