1 / 21

A general- purpose virtualization service for HPC on cloud computing : an application to GPUs

A general- purpose virtualization service for HPC on cloud computing : an application to GPUs. R.Montella , G.Coviello , G.Giunta * G. Laccetti # , F . Isaila , J . Garcia Blas °. * Department of Applied Science – University of Napoli Parthenope

lumina
Download Presentation

A general- purpose virtualization service for HPC on cloud computing : an application to GPUs

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. A general-purposevirtualization servicefor HPC on cloudcomputing:an application to GPUs R.Montella, G.Coviello, G.Giunta*G. Laccetti#, F. Isaila, J. Garcia Blas° *Department of Applied Science – University of Napoli Parthenope ° Department of Mathematics and Applications – University of Napoli Federico II °Department of Computer Science – University of Madrid Carlos III

  2. Outline • Introduction and contextualization • GVirtuS: GenericVirtualization Service • GPU virtualization • High performance cloudcomputing • WhousesGVirtuS • Conclusions and ongoingprojects GVirtuS

  3. Introduction and contextualization • High Performance Computing • Gridcomputing • Many core technology • GPGPUs • Virtualization • Cloudcomputing

  4. High Performance CloudComputing • Hardware: • High performance computing cluster • Multicore / Multi processorcomputingnodes • GPGPUs • Software: • Linux • Virtualizationhypervisor • Private cloud management software • +Specialingredients…

  5. GVirtuS • GenericVirtualization Service • Framework for split-driver basedabstractioncomponents • Plug-in architecture • Independentform • Hypervisor • Communication • Target of virtualization • High performance: • Enabilingtransparentvirtualization • Wthoverall performances nottoo far from un-virtualizedmachines

  6. Split-Driver approach Appliction • Split-Driver • Hardware access by priviledged domain. • Unpriviledged domains access the device using a frontend/backhend approach • Frontend (FE): • Guest-side software component. • Stub: redirect requests to the backend. • Backend (BE): • Mange device requests. • Device multiplexing. Wrap library Unpriviledged Domain Frontend driver Communicator Backend driver Interface library Priviledged Domain Device driver Requests Device

  7. gVirtus Frontend • Libreria dinamica . • Stessa ABI della libreria CUDA Runtime. • Presente nelle macchine virtuali che utilizzano le GPU. GVirtuS approach Appliction • GVirtuS Frontend • Dyinamic loadable library • Same application binary interface • Run on guest user space Wrap library Unpriviledged Domain Frontend driver Communicator • GVirtuS Backend • Server application • Run in host user space • Concurrent requests Backend driver Interface library Priviledged Domain Device driver Requests Device

  8. The Communicator • Provides a high performance communication between virtual machines and their hosts. • The choice of the hypervisor deeply affects the efficiency of the communication.

  9. An application: VirtualizingGPUs • GPUs • Hypervisor independent • Communicator independent • GPU independent

  10. GVirtuS - CUDA • Currently full threadsafesupport to • CUDA drivers • CUDA runtime • OpenCL • Partiallysupporting (itisneeded more work) • OpenGL integration

  11. GVirtuS – libcudart.so Virtual Machine Real Machine #include <stdio.h> #include <cuda.h> int main(void) { int n; cudaGetDeviceCount(&n); printf("Number of CUDA GPU(s): %d\n", n); return 0; } GVirtuS Backend Process Handler GVirtuS Frontend Result *handleGetDeviceCount( CudaRtHandler * pThis, Buffer *input_buffer) { int *count = input_buffer->Assign<int>(); cudaError_texit_code; exit_code = cudaGetDeviceCount(count); Buffer *out = new Buffer(); out->Add(count); return new Result(exit_code, out); } cudaError_tcudaGetDeviceCount(int *count) { Frontend *f = Frontend::GetFrontend(); f->AddHostPointerForArguments(count); f->Execute("cudaGetDeviceCount"); if(f->Success()) *count = *(f->GetOutputHostPointer<int>()); return f->GetExitCode(); }

  12. Choices and Motivations • We focused on VMware and KVM hypervisors. • vmSocket is the component we have designed to obtain a high performance communicator • vmSocket exposes Unix Sockets on virtual machine instances thanks to a QEMU device connected to the virtual PCI bus. vmSocket

  13. vmSocket: virtual PCI device • Programming interface: • Unix Socket • Communication between guest and host: • Virtual PCI interface • QEMU has been modified • GPU based high performance computing applications usually require massive data transfer between host (CPU) memory and device (GPU) memory… • FE/BE interaction efficiency: • there is no mapping between guest memory and device memory • the memory device pointers are never de-referenced on the host side • CUDA kernels are executed on the BE where the pointers are fully consistent.

  14. Performance Evaluation • CUDA Workstation • Genesis GE-i940 Tesla • i7-940 2,93 133 GHz fsb, Quad Core hyper-threaded 8 Mb cache CPU and 12Gb RAM. • 1 nVIDIAQuadro FX5800 4Gb RAM video card • 2 nVIDIA Tesla C1060 4 Gb RAM • The testing system: • Ubuntu 10.04 Linux • nVIDIA CUDA Driver, and the SDK/Toolkit version 4.0. • VMware vs. KVM/QEMU (using different communicators).

  15. GVirtuS-CUDA runtime performances Evaluation:CUDA SDK benchmarks • Results: • 0: No virtualization, no accleration(blank) • 1: Acceleration withoutvirtualization(target) • 2,3: Virtualization with no acceleration • 4...6: GPU acceleraion, Tcp/Ipcommunication⇒ Similar performances due to communicationoverhead • 7: GPU accelerationusingGVirtuS, Unix Socketbasedcommunication • 8,9: GVirtuSvirtualization • ⇒Good performances, no so far from the target • ⇒4..6 better performances than 0 Computing timesas Host-CPU rate

  16. Distributed GPUs VMSocket CUDA Application CUDA Application UNIX Socket Hilights: • Using theTcp/Ip Communicator FE/BE could be on different machines. • Real machines can access remote GPUs. Applications: • GPU for embedded systems as network machines • High Performance Cloud Computing Inter node load balanced … Frontend Frontend Inter node among VMs Inter node Guest VM Linux Guest VM Linux tcp/ip? Security? Compression? CUDA Application Hypervisor Frontend Host OS Linux Backend CUDA Runtime Driver node01 node02 Host OS Linux Backend CUDA Runtime Driver

  17. High Performance Cloud Computing matrixMulMPI+GPU • Ad hoc performance test for benchmarking • Virtual cluster on local computing cloud • Benchmark: • Matrix-matrix multiplication • 2 parallelims levels: distributed memory and GPU • Results:⇒Virtual nodes with just CPUs⇒Better performances with virtual nodes GPUs equipped⇒2 nodes with GPUs perform better than 8 nodes without virtual acceleration.

  18. GVirtuS in the world • GPU support to OpenStack cloud software • Heterogeneous cloud computingJohn Paul Walters et Al.University of Southern California / Information Sciences Institute • http://wiki.openstack.org/HeterogeneousGpuAcceleratorSupport • HPC at NEC Labs of America in Princeton, University of Missouri Columbia, Ohio State University Columbus • Supporting GPU Sharing in Cloud Environments with a Transparent Runtime Consolidation Framework • Awarded as the best paper at HPDC2011

  19. Download, taste, contribute! • http://osl.uniparthenope.it/projects/gvirtusGPL/LGPL License

  20. Conclusions • The GVirtuS generic virtualization and sharing system enables thin Linux based virtual machines to use hosted devices as nVIDIA GPUs. • The GVirtuS-CUDA stack permits to accelerate virtual machines with a small impact on overall performance respect to a pure host/gpu setup. • GVirtuS can be easily extended to other enabled devices as high performance network devices

  21. Ongoingprojects • Elastic Virtual Parallel File System • MPI wrap for High Performance Cloud Computing • XEN Support (is a big challenge!) Download Try & Contribute!

More Related