Accelerated Computing in Cloud
Viet Tran (IISAS), Jan Astalos (IISAS), Miroslav Dobrucky (IISAS)

IISAS-GPUCloud site
• OpenStack GPGPU site at IISAS
• Hardware:
  • IBM dx360 M4 servers with 2x Intel Xeon E5-2650v2
  • 16 CPU cores, 64 GB RAM, 1 TB storage on each worker node (WN)
  • 2x NVIDIA Tesla K20m on each WN
• Software:
  • Base OS: Ubuntu 14.04 LTS
  • KVM hypervisor with PCI passthrough virtualisation of GPU cards
  • OpenStack Kilo middleware
  • Newest Federated Cloud tools

IISAS-GPUCloud site
• GPU-enabled flavors: gpu1cpu6 (1 GPU + 6 CPU cores), gpu2cpu12 (2 GPUs + 12 CPU cores)
• Pre-defined images with NVIDIA drivers and the CUDA toolkit installed for the most widely used Linux distributions
• Supported VOs: fedcloud.egi.eu, ops, dteam, moldyngrid, enmr.eu, vo.lifewatch.eu, acc-comp.egi.eu

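For site administrators, a GPU-enabled flavor such as gpu1cpu6 is commonly an ordinary flavor that carries a PCI passthrough alias property. The sketch below only prints the nova CLI calls instead of running them. The alias name `K20m` and the RAM and disk sizes are assumptions for illustration, not values taken from the IISAS configuration.

```shell
#!/bin/sh
# Dry-run sketch: print the nova CLI calls that would define a GPU flavor.
# Assumptions (not from the slides): alias name "K20m", RAM and disk sizes.

# usage: print_gpu_flavor_cmds NAME GPUS VCPUS RAM_MB DISK_GB [ALIAS]
print_gpu_flavor_cmds() {
    name=$1; gpus=$2; vcpus=$3; ram_mb=$4; disk_gb=$5; alias_name=${6:-K20m}
    # nova flavor-create <name> <id> <ram MB> <disk GB> <vcpus>
    echo "nova flavor-create $name auto $ram_mb $disk_gb $vcpus"
    # The alias must match a pci_alias entry in nova.conf on the site.
    echo "nova flavor-key $name set pci_passthrough:alias=$alias_name:$gpus"
}

# Hypothetical sizes for the two flavors named on the slide.
print_gpu_flavor_cmds gpu1cpu6 1 6 24576 400
print_gpu_flavor_cmds gpu2cpu12 2 12 49152 800
```

Printing the commands first makes it easy to review them before they are run against a production site.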
Using IISAS-GPUCloud site
• Via rOCCI client
  • Choose a GPU-enabled flavor (e.g. gpu2cpu12) as the resource template
• Or via the OpenStack Horizon portal
  • Graphical interface
  • Support for EGI users to log in via token (no username/password) is being added

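As a rough illustration of the rOCCI route, the sketch below prints an `occi` command that requests a VM with the gpu2cpu12 resource template. It does not contact any site. The endpoint URL, the `os_tpl` identifier and the VM title are hypothetical placeholders; the real values come from the site and from AppDB.

```shell
#!/bin/sh
# Dry-run sketch: print an rOCCI client call that requests a GPU-enabled VM.
# Hypothetical placeholders: endpoint URL, os_tpl identifier, VM title.

# usage: print_occi_create ENDPOINT OS_TPL RESOURCE_TPL TITLE
print_occi_create() {
    endpoint=$1; os_tpl=$2; resource_tpl=$3; title=$4
    # X509_USER_PROXY is expected to point at a VOMS proxy certificate.
    proxy=${X509_USER_PROXY:-/tmp/x509up_u$(id -u)}
    echo "occi --endpoint $endpoint --auth x509 --user-cred $proxy --voms" \
         "--action create --resource compute" \
         "--mixin os_tpl#$os_tpl --mixin resource_tpl#$resource_tpl" \
         "--attribute occi.core.title=$title"
}

# The GPU flavor is selected purely through the resource_tpl mixin.
print_occi_create https://occi.example.org:8787/ IMAGE_ID gpu2cpu12 gpu-test
```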
Docker support
• Docker containers with GPGPU access can be run at the IISAS-GPUCloud site
• Create a VM with a GPGPU-enabled flavor and image
• Run docker with the proper device mappings to access the GPUs:

    docker run --name=XXXXXX \
      --device=/dev/nvidia0:/dev/nvidia0 \
      --device=/dev/nvidia1:/dev/nvidia1 \
      --device=/dev/nvidiactl:/dev/nvidiactl \
      --device=/dev/nvidia-uvm:/dev/nvidia-uvm \
      …..

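The device mappings in the docker command follow one pattern: each host device node is mapped to the same path inside the container. A minimal sketch of a helper that builds those flags from a list of device nodes follows; the function name `gpu_device_flags` is made up for this example.

```shell
#!/bin/sh
# Sketch: build the --device flags for docker run from a list of device nodes.

# usage: gpu_device_flags DEVICE...
gpu_device_flags() {
    flags=""
    for dev in "$@"; do
        # Map each host device node to the same path inside the container.
        flags="$flags --device=$dev:$dev"
    done
    # Strip the leading space left by the first iteration.
    echo "${flags# }"
}

# On a VM with the NVIDIA driver loaded, the nodes could be found with a glob:
#   docker run --name=XXXXXX $(gpu_device_flags /dev/nvidia*) ...
gpu_device_flags /dev/nvidia0 /dev/nvidia1 /dev/nvidiactl /dev/nvidia-uvm
```

Note that `/dev/nvidia-uvm` may not exist until the nvidia-uvm kernel module has been loaded, so it is worth listing the device nodes on the VM before starting the container.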
IISAS-Nebula site
• Hardware:
  • IBM dx360 M4 servers with 2x Intel Xeon E5-2650v2
  • 16 CPU cores, 64 GB RAM, 1 TB storage on each WN
  • 2x NVIDIA Tesla K20m on each WN
• OpenNebula 5.0
• Fully integrated into EGI FedCloud, recently certified
• Access via rOCCI client
  • GPU properties are specified in os_tpl instead of resource_tpl
  • Work in progress to move GPU properties to resource_tpl

GPU-specific VO
• The acc-comp.egi.eu VO has been established for testing and development with GPGPU:
  • A VO image list with preinstalled GPU drivers and CUDA libraries is available via AppDB
  • Supported only at sites with GPGPU hardware
• More info at https://wiki.egi.eu/wiki/Accelerated_computing_VO

Applications using GPGPU
• Machine Learning, Artificial Neural Networks (ANN) and Pattern Recognition
• Biodiversity and Ecosystem Research
  • LifeWatch VO
• Bioinformatics and Biomolecular simulations
  • Molecular Dynamics simulations
  • MoBrain, MolDynGrid VOs

Support
• User tutorial:
  • How to use GPGPU on IISAS-GPUCloud
    • Access via rOCCI client
    • Access via OpenStack dashboard with token
  • How to create your own GPGPU server in the cloud
• Site admin guide
  • How to enable GPGPU passthrough in OpenStack
• Additional tools
  • Automation via scripts:
    • NVIDIA + CUDA installer
  • Keystone-VOMS client for getting a token
  • Keystone-VOMS module for OpenStack Horizon
• All this in a wiki: https://wiki.egi.eu/wiki/GPGPU-FedCloud

Work in progress
• A conceptual model of the Cloud Computing Service is being defined in the GLUE 2.1 draft
  • The CloudComputingInstanceType class describes the hardware environment of the VM (i.e. the flavour)
  • A new CloudComputingVirtualAccelerator entity is defined to describe a set of homogeneous virtual accelerator devices, which can be associated with one or more CloudComputingInstanceTypes
• GPU accounting is easier in a cloud environment (1 VM, 1 GPU)
  • Cloud systems currently return wallclock time only
  • If the wallclock time for which a GPU was attached to a VM is available, then GPU reporting would be in line with cloud CPU time, i.e. wallclock only
  • The APEL team is involved in defining an extended usage record and new views to display GPU usage in the Accounting Portal

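To make the wallclock-only idea concrete, here is a small arithmetic sketch. It assumes the GPUs stay attached for the whole lifetime of the VM, as with PCI passthrough, so GPU usage is the number of attached GPUs multiplied by the VM wallclock time. The function name and the timestamps are illustrative and are not part of any APEL usage record.

```shell
#!/bin/sh
# Sketch: GPU wallclock usage for one VM, assuming its GPUs stay attached
# for the VM's whole lifetime (the PCI passthrough case).

# usage: gpu_wallclock_seconds GPU_COUNT START_EPOCH END_EPOCH
gpu_wallclock_seconds() {
    echo $(( $1 * ($3 - $2) ))
}

# A gpu2cpu12 VM (2 GPUs) that ran for one hour:
gpu_wallclock_seconds 2 1470000000 1470003600   # prints 7200
```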
Work in progress
• Experimental cloud site set up at IISAS to enable GPGPU support with the LXC/LXD hypervisor in OpenStack
  • LXC/LXD is a full container solution supported by Linux
  • Expected to provide better performance and stability than KVM (much faster startup time, better integration with the OS), especially in terms of GPGPU support (simpler site setup, more stable than KVM PCI passthrough)
• New cloud sites soon joining the EGI Federated Cloud
  • At INCD/LIP: OpenStack PCI passthrough with NVIDIA Tesla K40 GPUs
  • At Seville (CSIC-EBD-LW): supporting the LifeWatch community with ~500 cores, 1 PB of storage, and NVIDIA Tesla K20m GPUs

More information
• https://wiki.egi.eu/wiki/GPGPU-FedCloud
• https://wiki.egi.eu/wiki/Accelerated_computing_VO
• https://accelerated.ui.sav.sk/?page_id=21
• https://horizon.ui.savba.sk/