

  1. Advanced Computing Facility Introduction Dr. Feng Cen 09/16/16 Modified: Yuanwei Wu, Wenchi Ma 09/12/18

  2. Overview
  • The Advanced Computing Facility (ACF) houses High Performance Computing (HPC) resources dedicated to scientific research
  • 458 nodes, 8568 processing cores and 49.78TB memory
  • 20 nodes have over 500GB memory per node
  • 13 nodes have 64 AMD cores per node and 109 nodes have 24 Intel cores per node
  • Coprocessors: Nvidia Titan Xp, Nvidia Tesla P100, Nvidia K80: 52, Nvidia K40C: 2, Nvidia K40m: 4, Nvidia K20m: 2, Nvidia M2070: 1
  • Virtual machine operating system: Linux
  http://ittc.ku.edu/cluster/acf_cluster_hardware.html

  3. Cluster Usage Website http://ganglia.acf.ku.edu/

  4. Useful Links
  • ACF Cluster computing resources
    http://ittc.ku.edu/cluster/acf_cluster_hardware.html
  • Advanced Computing Facility (ACF) documentation main page
    https://acf.ku.edu/wiki/index.php/Main_Page
  • Cluster Jobs Submission Guide
    https://acf.ku.edu/wiki/index.php/Cluster_Jobs_Submission_Guide
  • Advanced guide
    http://www.adaptivecomputing.com/support/documentation-index/torque-resource-manager-documentation/
  • ACF Portal Website
    http://portal.acf.ku.edu/
  • Cluster Usage Website
    http://ganglia.acf.ku.edu/

  5. ACF Portal Website http://portal.acf.ku.edu/

  6. ACF Portal Website
  • Monitor jobs
  • View cluster loads
  • Download files
  • Upload files
  • ...

  7. Access Cluster System via Linux Terminal
  • Access the cluster in Nichols Hall
    1. Log in to a login server (login1 or login2) → 2. Submit cluster jobs or start an interactive session from the login server.
    The cluster will create a virtual machine to run your job or for your interactive session.
  • Access the cluster from off campus
    Use the KU Anywhere VPN first: http://technology.ku.edu/software/ku-anywhere-0

  8. Access Cluster System via Linux Terminal
  • Log in to a login server
  • Use “ssh” to connect directly to the cluster login servers: login1 or login2
  • Examples:
    ssh login1                # login with your default linux account
    ssh -X login1             # “-X” accesses the login server with X11 forwarding
    ssh <username>@login1     # login with a different linux account
    ssh -X <username>@login1
  • The login server is only an entry point to the cluster and cannot support computationally intensive tasks

  9. Access Cluster System via Linux Terminal (updated by Yuanwei @ 9/12/18)
  • Request GPU resources on cluster
    1. Reference document: https://help.ittc.ku.edu/Cluster_Documentation
    2. The GPU resources on the cluster:
       g002: 4 K20 (4G memory per K20)
       g003: 2 K20 + 2 K40 (4G memory per K20, 12G memory per K40)
       g015: 4 K80 (12G memory per K80)
       g017: 4 P100 (16G memory per P100)
       g018: 4 P100 (16G memory per P100), might be reserved
       g019: 2 Titan Xp (12G memory per Titan Xp) + 1T SSD (I saved the ImageNet12 images here for experiments)
       g020: 2 Titan Xp (12G memory per Titan Xp)
       g021: 2 Titan Xp (12G memory per Titan Xp)

  10. Access Cluster System via Linux Terminal
  • Request GPU resources on cluster
    3. The steps for requesting a GPU from your local machine @ ITTC (see the sketch below)
       3.1 Log in to the 'login1' node of the cluster
       3.2 Load the slurm module
       3.3 Load the versions of the CUDA, cuDNN, python and matlab modules you want
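
  A minimal sketch of steps 3.1-3.3, assuming the cluster uses environment modules; the module names and versions shown (e.g. cuda/9.0, cudnn/7.0) are placeholders, so run 'module avail' to see what is actually installed:

    ssh -X login1                   # 3.1 log in to the login1 node (with X11 forwarding)
    module load slurm               # 3.2 load the slurm module
    module avail                    # list the installed modules and their versions
    module load cuda/9.0 cudnn/7.0  # 3.3 load CUDA/cuDNN (version numbers are placeholders)
    module load python matlab       # load python/matlab only if your job needs them
    module list                     # verify which modules are currently loaded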

  11. Access Cluster System via Linux Terminal
  • Request GPU resources on cluster
    3. The steps for requesting a GPU from your local machine @ ITTC
       3.4 Check the usage of the GPUs on the cluster (see the sketch below)
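
  One way to check GPU availability before requesting a node, assuming the standard Slurm client commands are available after loading the slurm module (the node names are taken from the table on slide 9):

    sinfo -o "%N %G %t"             # list nodes with their GPUs (GRES) and current state
    squeue -w g015,g017,g019        # show the jobs currently running on selected GPU nodes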

  12. Access Cluster System via Linux Terminal
  • Request GPU resources on cluster
    3. The steps for requesting a GPU from your local machine @ ITTC
       3.5 Request a GPU from the cluster (see the example below)
    Meaning of the options (check the ITTC cluster documentation for more details):
    --gres="gpu:gpu_name_u_request:gpu_num"  the type and number of GPUs you request
    -w      selects the GPU node
    --mem   specifies the requested CPU memory per node
    -t      the requested time for using this GPU resource to run your job, format is D-HH:MM:SS
    -N      sets the number of requested nodes for the interactive session (add if you want)
    -n      specifies the number of tasks or processes to run on each allocated node (add if you want)
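
  A hedged example of an interactive GPU request built from the options above; the node name, GPU type, memory and time limit are illustrative, so adjust them to the node table on slide 9 and to your job:

    srun --gres="gpu:p100:1" -w g017 --mem=32G -t 0-06:00:00 -N 1 -n 1 --pty bash
    # --gres      one P100 GPU on the node (gpu:<type>:<count>)
    # -w g017     request the g017 node specifically
    # --mem=32G   32 GB of CPU memory on the node
    # -t          wall time of 6 hours, in D-HH:MM:SS format
    # --pty bash  open an interactive shell on the allocated node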

  13. Access Cluster System via Linux Terminal
  • Request GPU resources on cluster
    3. The steps for requesting a GPU from your local machine @ ITTC
       3.6 Check the usage of your requested GPU (or use 'watch -n 0.5 nvidia-smi' to watch the GPU usage dynamically; see below)
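
  Once the interactive shell opens on the GPU node, the usual check is:

    nvidia-smi                      # one-off snapshot of GPU utilization and memory use
    watch -n 0.5 nvidia-smi         # refresh the snapshot every 0.5 s (Ctrl+C to stop)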

  14. Access Cluster System via Linux Terminal
  • Request GPU resources on cluster
    3. The steps for requesting a GPU from your local machine @ ITTC
       3.7 Quit your GPU job (see the sketch below)
    Ending by Yuanwei @ 9/12/18
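
  A sketch of the clean-up in step 3.7, assuming an interactive allocation like the one above (the job ID is a placeholder):

    exit                            # leave the interactive shell; the allocation ends with it
    squeue -u $USER                 # confirm that no jobs are still listed under your account
    scancel <jobid>                 # if one is still listed, cancel it by its job ID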

  15. Thank you!
