Nanco: a large HPC cluster for RBNI (Russell Berrie Nanotechnology Institute)





Anne Weill-Zrahia

Technion, Computer Center

October 2008

Resources needed for applications arising from Nanotechnology

  • Large memory – Tbytes

  • High floating point computing speed – Tflops

  • High data throughput – state of the art …

SMP architecture

Cluster architecture

Interconnection network

Why not a cluster?

  • A single SMP system is easier to purchase and maintain

  • Ease of programming in SMP systems

Why a cluster?

  • Scalability

  • Total available physical RAM

  • Reduced cost

  • But …

Having an application which exploits the parallel capabilities


Studying the application or applications which will run on the cluster

Things to include in design

Our choices

Other requirements

  • Space, power, cooling constraints, strength of floors

  • Software configuration:

  • Operating system

  • Compilers & application development tools

  • Load balancing and job scheduling

  • System management tools














InfiniBand switch

Before finalizing our choice …

One should check, on a similar system:

  • Single processor peak performance

  • InfiniBand interconnect performance

  • SMP behaviour

  • Behaviour of non-commercial parallel applications

Parallel applications issues

  • Execution time

  • Parallel speedup: Sp = T1/Tp, where T1 is the execution time on one processor and Tp the execution time on p processors

  • Scalability
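The speedup definition above, together with the parallel efficiency that comes up later under scheduling, can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, efficiency is taken in its usual sense of Sp/p, and the timings are hypothetical values, not measurements from Nanco.

```python
def speedup(t1: float, tp: float) -> float:
    """Parallel speedup Sp = T1 / Tp."""
    if t1 <= 0 or tp <= 0:
        raise ValueError("execution times must be positive")
    return t1 / tp


def efficiency(t1: float, tp: float, p: int) -> float:
    """Parallel efficiency Sp / p; 1.0 would mean ideal scaling."""
    if p < 1:
        raise ValueError("processor count must be at least 1")
    return speedup(t1, tp) / p


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, keyed by processor count.
    # These are NOT Nanco measurements; they only exercise the formulas.
    timings = {1: 1000.0, 2: 520.0, 4: 275.0, 8: 150.0}
    t1 = timings[1]
    for p, tp in sorted(timings.items()):
        print(f"p={p:2d}  Sp={speedup(t1, tp):5.2f}  "
              f"efficiency={efficiency(t1, tp, p):4.2f}")
```

An application scales well when the efficiency column stays close to 1 as p grows; a rapid drop suggests that communication or serial sections dominate.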

Benchmark design

  • Must give a good estimate of performance of your application

  • Acceptance test – should match all its components

Comparison of performance

Execution time of Monte-Carlo parallel code (MPI)

What did work

  • Running MPI code interactively

  • Running a serial job through the queue

  • Compiling C code with MPI

What did not work

  • Compiling F90 or C++ code with MPI

  • Running MPI code through the queue

  • Queues do not do accounting per CPU

Parallel performance results

Theoretical peak: 2.1 Tflops

Nanco performance on HPL: 0.58 Tflops
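To put the two figures side by side, the fraction of theoretical peak reached on HPL can be worked out directly. The short sketch below uses only the two numbers quoted on this slide; the function name is an assumption made for this example.

```python
def hpl_efficiency(measured_tflops: float, peak_tflops: float) -> float:
    """Fraction of theoretical peak achieved by a measured result."""
    if peak_tflops <= 0:
        raise ValueError("peak performance must be positive")
    return measured_tflops / peak_tflops


if __name__ == "__main__":
    THEORETICAL_PEAK_TFLOPS = 2.1   # from the slide
    HPL_TFLOPS = 0.58               # from the slide
    ratio = hpl_efficiency(HPL_TFLOPS, THEORETICAL_PEAK_TFLOPS)
    print(f"HPL reaches {ratio:.1%} of theoretical peak")  # 27.6%
```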

Comparison with Sun Benchmark

Execution time – comparison of compilers

Performance with different optimizations

Conclusions from acceptance tests

  • New gcc (gcc4) is faster than PathScale for some applications

  • MPI collective communication functions are implemented differently in various MPI versions

  • Disk access times are crucial – use attached storage when possible

Scheduling decisions

  • Assessing priorities between user groups

  • Assessing the parallel efficiency of different job types (MPI, serial, OpenMP, commercial software) and designing special queues for them

  • Avoiding starvation by giving weight to the urgency parameter
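The idea behind the last point can be illustrated with a toy priority function. This is a sketch under stated assumptions, not the scheduler configuration used on Nanco: the slides do not say how the urgency parameter is defined, so here it is modelled as a weight on the time a job has already waited, and all names (Job, group_priority, urgency_weight) are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    name: str
    group_priority: float  # hypothetical base priority of the user group
    wait_hours: float      # time the job has already spent in the queue


def effective_priority(job: Job, urgency_weight: float) -> float:
    """Base priority plus a term that grows with waiting time (assumed model)."""
    return job.group_priority + urgency_weight * job.wait_hours


def dispatch_order(jobs: List[Job], urgency_weight: float) -> List[Job]:
    """Jobs sorted from highest to lowest effective priority."""
    return sorted(jobs,
                  key=lambda j: effective_priority(j, urgency_weight),
                  reverse=True)


if __name__ == "__main__":
    # Made-up queue: a fresh job from a favoured group and an old job
    # from a low-priority group.
    queue = [Job("high_group_new", 10.0, 0.0), Job("low_group_old", 1.0, 48.0)]
    for weight in (0.0, 0.5):
        order = [j.name for j in dispatch_order(queue, weight)]
        print(f"urgency_weight={weight}: {order}")
```

With a weight of zero the low-priority job can wait indefinitely behind newer high-priority work; with a positive weight its priority eventually overtakes them, which is the starvation-avoidance effect described above.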

Observations during production mode

  • Assessing users' understanding of the machine – support in writing scripts and in efficient parallelization

  • Lack of visualization tools – writing a script to show current usage of the cluster
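A usage script of the kind mentioned above might look roughly like the following. The actual Nanco script is not shown in the slides, so this is only a guess at its shape: the node names and core counts are invented sample data, and a real script would obtain them from the queueing system rather than from a hard-coded dictionary.

```python
from typing import Dict, Tuple


def usage_bar(used: int, total: int, width: int = 20) -> str:
    """Text bar with '#' for busy cores and '.' for idle ones."""
    if total <= 0 or not 0 <= used <= total:
        raise ValueError("need 0 <= used <= total and total > 0")
    filled = round(width * used / total)
    return "#" * filled + "." * (width - filled)


def usage_report(nodes: Dict[str, Tuple[int, int]]) -> str:
    """One line per node, plus a cluster-wide summary line."""
    lines = []
    for name, (used, total) in sorted(nodes.items()):
        lines.append(f"{name:<8} [{usage_bar(used, total)}] {used}/{total} cores")
    used_all = sum(u for u, _ in nodes.values())
    total_all = sum(t for _, t in nodes.values())
    lines.append(f"cluster  {used_all}/{total_all} cores "
                 f"({used_all / total_all:.0%})")
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented sample data: node name -> (cores in use, cores available).
    sample = {"node01": (4, 4), "node02": (2, 4), "node03": (0, 4)}
    print(usage_report(sample))
```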

Utilization of cluster

Utilization of Nanco, September 2008

Nanco jobs by type


  • Correct benchmark design is crucial for testing the capabilities of the proposed architecture

  • Acceptance tests allow negotiation with vendors and give insight into future choices

  • Only after several weeks of running the cluster at full capacity can we make informed decisions on its management
