
Nanco: a large HPC cluster for RBNI (Russell Berrie Nanotechnology Institute)



Anne Weill-Zrahia

Technion, Computer Center

October 2008



Resources needed for applications arising from Nanotechnology

  • Large memory – Tbytes

  • High floating point computing speed – Tflops

  • High data throughput – state of the art …


SMP architecture

[Diagram: four processors (P) sharing a single memory]


Cluster architecture

[Diagram: nodes, each with its own processor and memory, connected by an interconnection network]


Why not a cluster

  • Single SMP system easier to purchase/maintain

  • Ease of programming in SMP systems


Why a cluster

  • Scalability

  • Total available physical RAM

  • Reduced cost

  • But …


Having an application which exploits the parallel capabilities requires studying the application or applications which will run on the cluster


Things to include in design


Our choices


Other requirements

  • Space, power and cooling constraints, strength of floors

  • Software configuration:

  • Operating system

  • Compilers & application development tools

  • Load balancing and job scheduling

  • System management tools


Configuration

[Diagram: nodes node1, node2, ..., node64, each drawn with two processors (P) sharing one memory (M), all connected to an Infiniband switch]


Before finalizing our choice …

One should check, on a similar system:

  • Single processor peak performance

  • Infiniband interconnect performance

  • SMP behaviour

  • Behaviour of non-commercial parallel applications


Parallel applications issues

  • Execution time

  • Parallel speedup Sp = T1/Tp (T1: execution time on one processor, Tp: execution time on p processors)

  • Scalability
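The speedup definition above, together with the parallel efficiency mentioned later under scheduling, can be sketched in a few lines. This is only an illustration: the timings are hypothetical values, not measurements from Nanco, and the efficiency formula Ep = Sp/p is the usual definition rather than something stated on the slide.

```python
def speedup(t1, tp):
    """Parallel speedup Sp = T1 / Tp."""
    if t1 <= 0 or tp <= 0:
        raise ValueError("execution times must be positive")
    return t1 / tp


def efficiency(t1, tp, p):
    """Parallel efficiency Ep = Sp / p (1.0 means ideal scaling)."""
    if p < 1:
        raise ValueError("processor count must be at least 1")
    return speedup(t1, tp) / p


# Hypothetical timings in seconds, keyed by processor count (not measured on Nanco).
timings = {1: 1000.0, 2: 520.0, 4: 270.0, 8: 150.0}

for p, tp in sorted(timings.items()):
    sp = speedup(timings[1], tp)
    ep = efficiency(timings[1], tp, p)
    print(f"p={p:2d}  Sp={sp:5.2f}  Ep={ep:4.2f}")
```

An efficiency that falls as p grows is one simple indicator of limited scalability.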


Benchmark design

  • Must give a good estimate of performance of your application

  • Acceptance test: should match all its components


Comparison of performance


Execution time of Monte-Carlo parallel code (MPI)


What did work

  • Running MPI code interactively

  • Running a serial job through the queue

  • Compiling C code with MPI


What did not work

  • Compiling F90 or C++ code with MPI

  • Running MPI code through the queue

  • Queues do not do accounting per CPU


Parallel performance results

Theoretical peak: 2.1 Tflops

Nanco performance on HPL: 0.58 Tflops
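As a quick worked check, the two figures above imply that HPL reached a little over a quarter of the theoretical peak. The variable names are illustrative only; the numbers are the ones quoted on the slide.

```python
# Figures quoted on the slide.
theoretical_peak_tflops = 2.1
hpl_measured_tflops = 0.58

# Fraction of the theoretical peak actually achieved on HPL.
hpl_efficiency = hpl_measured_tflops / theoretical_peak_tflops
print(f"HPL efficiency: {hpl_efficiency:.1%}")  # prints "HPL efficiency: 27.6%"
```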


Comparison with Sun Benchmark


Execution time: comparison of compilers


Performance with different optimizations


Conclusions from acceptance tests

  • The new gcc (gcc 4) is faster than PathScale for some applications

  • MPI collective communication functions are implemented differently in different MPI versions

  • Disk access times are crucial: use attached storage when possible


Scheduling decisions

  • Assessing priorities between user groups

  • Assessing the parallel efficiency of different job types (MPI, serial, OpenMP, commercial software) and designing special queues for them

  • Avoiding starvation by giving weight to the urgency parameter
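One way to read the last point is as priority aging: a job's effective priority grows with the time it has waited, so low-priority jobs are not postponed forever. The sketch below assumes that reading. The `Job` fields, the linear formula and the weights are hypothetical and are not taken from the scheduler actually used on Nanco.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    group_priority: float  # hypothetical base priority of the user group
    hours_waiting: float   # time the job has spent in the queue


def effective_priority(job, urgency_weight=1.0):
    """Base group priority plus a term that grows with waiting time."""
    return job.group_priority + urgency_weight * job.hours_waiting


def schedule_order(jobs, urgency_weight=1.0):
    """Return jobs sorted from highest to lowest effective priority."""
    return sorted(
        jobs,
        key=lambda j: effective_priority(j, urgency_weight),
        reverse=True,
    )


queue = [
    Job("group_a_mpi", group_priority=10.0, hours_waiting=1.0),
    Job("group_b_serial", group_priority=2.0, hours_waiting=12.0),
]

# With no urgency weight the low-priority job always waits behind group A.
print([j.name for j in schedule_order(queue, urgency_weight=0.0)])
# With a weight of 1.0 per hour, the long-waiting job moves to the front.
print([j.name for j in schedule_order(queue, urgency_weight=1.0)])
```

Raising the weight favours fairness over group priorities, and lowering it does the opposite.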


Observations during production mode

  • Assessing users' understanding of the machine: support in writing scripts and in efficient parallelization

  • Lack of visualization tools: writing a script to show current usage of the cluster
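A usage script of the kind mentioned above could be as simple as a text report with one bar per node. The following is a minimal sketch, not the script written for Nanco: the node names, the 8 cores per node and the in-memory `snapshot` are hypothetical, and a real script would fill that dictionary from the scheduler or monitoring system.

```python
def usage_bar(used, total, width=20):
    """Render used/total cores as a fixed-width text bar."""
    if total <= 0 or not 0 <= used <= total:
        raise ValueError("need 0 <= used <= total and total > 0")
    filled = round(width * used / total)
    return "#" * filled + "." * (width - filled)


def usage_report(nodes):
    """Build one line per node plus a cluster-wide total.

    `nodes` maps a node name to a (used_cores, total_cores) pair.
    """
    lines = []
    for name, (used, total) in sorted(nodes.items()):
        lines.append(f"{name:<8} [{usage_bar(used, total)}] {used}/{total}")
    used_all = sum(u for u, _ in nodes.values())
    total_all = sum(t for _, t in nodes.values())
    lines.append(
        f"{'total':<8} [{usage_bar(used_all, total_all)}] {used_all}/{total_all}"
    )
    return "\n".join(lines)


# Hypothetical snapshot; a real script would query the scheduler instead.
snapshot = {"node1": (8, 8), "node2": (4, 8), "node64": (0, 8)}
print(usage_report(snapshot))
```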


Utilization of cluster


Utilization of Nanco, September 2008


Nanco jobs by type


Conclusion

  • Correct benchmark design is crucial for testing the capabilities of the proposed architecture

  • Acceptance tests make it possible to negotiate with vendors and give insight into future choices

  • Only after several weeks of running the cluster at full capacity can we make informed decisions on its management

