
Building a Beowulf: My Perspective and Experience

Ron Choy

Lab. for Computer Science

MIT

Ver 1.02


First of all …

  • Why do we care?

    • Make an informed purchase decision

    • Know how to evaluate one

    • Build one yourself!


Outline

  • History/Introduction

  • Hardware aspects

  • Software aspects

  • Our class Beowulf

  • Beowulf design exercise


What is a Beowulf ?

  • A massively parallel computer built out of COTS (commercial off-the-shelf) components

  • Runs a free operating system (not Wolfpack/MSCS, Microsoft's clustering products)

  • Connected by a high-speed interconnect

  • Compute nodes are dedicated (unlike a Network of Workstations)


Who uses Beowulfs?

  • Pharmaceutical companies

  • Investment firms

  • Animation makers

  • Me and you


The Beginning

  • Thomas Sterling and Donald Becker, CESDIS, Goddard Space Flight Center, Greenbelt, MD

  • Summer 1994: built an experimental cluster

  • Called their cluster Beowulf


The First Beowulf

  • 16 x 486DX4, 100MHz processors

  • 16MB of RAM each, 256MB in total

  • Channel bonded Ethernet (2 x 10Mbps)

  • Not that different from our Beowulf



Current Beowulfs

  • Faster processors, faster interconnect, but the idea remains the same

  • Cluster database: http://clusters.top500.org/db/Query.php3

  • Top cluster: 1,920 processors, 11.06 TFLOPS peak



Why Beowulf?

  • It’s cheap!

  • Our Beowulf, 18 processors, 9GB RAM: $15000

  • A Sun Enterprise 250 Server, 2 processors, 2GB RAM: $16000

  • Everything in a Beowulf is open source and open standard - easier to manage and upgrade


Essential Components of a Beowulf

  • Processors

  • Memory

  • Interconnect

  • Software


Processors

  • Major vendors: AMD, Intel

  • AMD: Athlon MP

  • Intel: Pentium 4


Comparisons

  • Athlon MP and P4 are close in performance

  • Each is better for some applications and worse for others – it all depends on what you want

  • However, Athlon MP is cheaper


Comparisons (2)

  • The P4 supports the SSE2 instruction set, which performs SIMD operations on double-precision data (2 x 64-bit) – see the sketch after this list

  • The Athlon MP supports only SSE, which handles single-precision data (4 x 32-bit)
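
To make the difference concrete, here is a minimal C sketch (my own illustration, not from the slides) using the SSE2 compiler intrinsics to add two pairs of doubles with a single SIMD instruction. It assumes an SSE2-capable compiler; with gcc, build with something like gcc -msse2.

    /* Minimal SSE2 illustration: one instruction adds 2 x 64-bit doubles. */
    #include <stdio.h>
    #include <emmintrin.h>   /* SSE2 intrinsics */

    int main(void)
    {
        double a[2] = { 1.0, 2.0 };
        double b[2] = { 10.0, 20.0 };
        double c[2];

        __m128d va = _mm_loadu_pd(a);     /* load two doubles */
        __m128d vb = _mm_loadu_pd(b);
        __m128d vc = _mm_add_pd(va, vb);  /* SIMD add: both sums at once */
        _mm_storeu_pd(c, vc);

        printf("%g %g\n", c[0], c[1]);    /* prints 11 22 */
        return 0;
    }

On an Athlon MP you would be limited to the single-precision analogue from SSE (_mm_add_ps on 4 x 32-bit floats), which is why the P4 has an edge on double-precision SIMD codes.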


Memory

  • DDR SDRAM (double data rate) – used mainly with Athlons, though the P4 can use it as well

  • RDRAM (Rambus DRAM) – used by P4s


Memory Bandwidth

  • Good summary:

    http://www6.tomshardware.com/mainboard/02q1/020311/sis645dx-03.html

  • DDR beats out RDRAM in bandwidth, and is also cheaper (see the sketch below)
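
Theoretical peak bandwidth is easy to sanity-check yourself: it is just clock rate x transfers per cycle x bus width. A throwaway C sketch with era-typical parts (the specific modules are my examples, not from the slides):

    /* Peak memory bandwidth = clock (MHz) * transfers/cycle * bus width (bytes),
       giving MB/s. Check your actual module's ratings. */
    #include <stdio.h>

    static double peak_mb_s(double clock_mhz, int transfers, int bus_bytes)
    {
        return clock_mhz * transfers * bus_bytes;
    }

    int main(void)
    {
        /* PC2100 DDR: 133MHz clock, 2 transfers/cycle, 64-bit (8-byte) bus */
        printf("PC2100 DDR : %.0f MB/s\n", peak_mb_s(133, 2, 8));   /* ~2100 */

        /* PC800 RDRAM: 400MHz clock, 2 transfers/cycle, 16-bit channel
           (per channel; dual-channel P4 boards double this figure) */
        printf("PC800 RDRAM: %.0f MB/s\n", peak_mb_s(400, 2, 2));   /* ~1600 */
        return 0;
    }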


Interconnect

  • The most important component

  • Factors to consider

    • Bandwidth

    • Latency

    • Price

    • Software support


Ethernet

  • Relatively inexpensive, reasonably fast and very popular.

  • Developed by Bob Metcalfe and D.R. Boggs at Xerox PARC

  • A variety of flavors (10Mbps, 100Mbps, 1Gbps)



Myrinet

  • Developed by Myricom

  • “OS bypass”: the network card talks directly to host processes

  • Proprietary, but very popular because of its low latency and high bandwidth

  • Usually used in high-end clusters




Cost Comparison

  • To equip our Beowulf with:

    • Fast Ethernet: ~$1700

    • Gigabit Ethernet: ~$5600

    • Myrinet: ~$17300


How to choose?

  • Depends on your application! (a simple cost model is sketched below)

  • Requires really low latency, e.g. QCD? Myrinet

  • Requires high bandwidth and can live with higher latency, e.g. ScaLAPACK? Gigabit Ethernet

  • Embarrassingly parallel? Anything
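
A rough way to quantify this choice is the standard first-order model: message time = latency + message size / bandwidth. The sketch below plugs in ballpark figures for fast Ethernet over TCP and for Myrinet; the exact numbers are my assumptions for illustration, not measurements from any of the clusters mentioned here.

    /* First-order message cost: t = latency + bytes / bandwidth.
       Handy fact: MB/s (decimal) is numerically equal to bytes/us.
       The latency/bandwidth figures are rough era-typical assumptions. */
    #include <stdio.h>

    static double msg_time_us(double latency_us, double mb_per_s, double bytes)
    {
        return latency_us + bytes / mb_per_s;
    }

    int main(void)
    {
        double sizes[] = { 64.0, 4096.0, 1048576.0 };   /* message sizes in bytes */
        int i;

        for (i = 0; i < 3; i++) {
            printf("%8.0f B   fast eth %9.1f us   myrinet %8.1f us\n",
                   sizes[i],
                   msg_time_us(100.0, 12.5, sizes[i]),  /* ~100us, 12.5 MB/s */
                   msg_time_us(9.0, 250.0, sizes[i]));  /* ~9us, 250 MB/s */
        }
        return 0;
    }

For small messages the roughly 10x latency gap dominates (the QCD case); for bulk transfers the bandwidth term takes over - which is exactly the decision rule above.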


What would you gain from fast interconnect?

  • Our cluster: single fast Ethernet (100Mbps)

    • 36.8 GFLOPS peak, HPL: ~12 GFLOPS

    • 32.6% efficiency

  • GALAXY: Gigabit Ethernet

    • 20 GFLOPS peak, HPL: ~7 GFLOPS

    • 35% efficiency (old, slow TCP/IP stack!)

  • HELICS: Myrinet 2000

    • 1.4 TFLOPS peak, HPL: ~864 GFLOPS

    • 61.7% efficiency


My experience with hardware

  • How long did it take me to assemble the 9 machines? 8 hours, nonstop


Real issue 1 - space

  • Getting a Beowulf is great, but do you have the space to put it?

  • Often space is at a premium, and a Beowulf is not as dense as a traditional supercomputer

  • Rackmount? Extra cost! e.g. cabinet ~$1500, case for one node ~$400


Real issue 2 – heat management

  • The nodes, with all their high-powered processors and network cards, run hot

  • Especially true for Athlons - they can reach 60°C

  • If not properly managed, the heat can cause crashes or even hardware damage!


Real issue 3 - power

  • Do you have enough power in your room?

  • UPS? Surge protection?

  • You don’t want a thunderstorm to fry your Beowulf!

  • In our case we have a managed machine room - lucky


Real issue 4 - noise

  • Beowulfs are loud. Really loud.

  • You don’t want it on your desktop.

Bad idea


Noise (2)

  • You might want to consider using ‘quiet’ power supplies and a diskless architecture

  • Or a special purpose ‘personal cluster’


Real issue 5 - cables

  • Color-code your cables!


Software

  • We’ll concentrate on the cluster management core

  • Three choices:

    • Vanilla Linux/FreeBSD

    • Free cluster management software (a heavily patched Linux)

    • Commercial cluster management software (an even more heavily patched Linux, with technical support)


The issues

  • Beowulfs can get very large (hundreds of nodes)

  • Compute nodes should set themselves up automatically

  • Software updates must be automated across all the nodes

  • Keeping software coherent across nodes is an issue


Vanilla Linux

  • Most customizable, easiest to make changes

  • Easiest to patch

  • Harder for someone else to inherit the cluster – a real issue

  • You need to know a lot about Linux to set one up properly


Free cluster management software

  • Oscar: http://oscar.sourceforge.net/

  • Rocks: http://rocks.npaci.edu

  • MOSIX: http://www.mosix.org/

  • (usually patched) Linux that comes with software for cluster management

  • Dramatically reduces the time needed to get things up and running

  • Open source, but if something breaks, you have one more piece of software to hack


Commercial cluster management

  • Scyld: www.scyld.com - founded by Donald Becker

  • Scyld is the father of Beowulf in the epic poem

  • Sells a heavily patched Linux distribution for clustering; a free version is available but old

  • Based on bProc, which is similar to MOSIX


My experience/opinions

  • I chose Rocks because I needed the Beowulf up fast, and it’s the first cluster management software I came across

  • It was a breeze to set up

  • But now the pain begins … a severe lack of documentation

  • I have since stripped the cluster of all Rocks features - it is now almost a plain Red Hat 7.1


Experience (2)

  • Also, a batch system like OpenPBS (http://www.openpbs.org) is a must in a multi-user environment


Software (cont’d)

  • Note that I skipped a lot of details: e.g. file system choice (NFS? PVFS?), MPI choice (MPICH? LAM?), libraries to install … (a minimal MPI sanity check is sketched after this list)

  • I could talk about Beowulfs forever, but it won’t fit in one talk
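
Whichever MPI you pick - MPICH and LAM both implement the same standard - a small sanity check like the ping-pong below will run under either (compile with mpicc, launch on 2 processes). This is a generic sketch, not the actual test code used on our cluster.

    /* Minimal MPI ping-pong: rank 0 bounces an int off rank 1. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, buf = 42;
        MPI_Status st;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            MPI_Send(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &st);
            printf("round trip done, buf = %d\n", buf);
        } else if (rank == 1) {
            MPI_Recv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &st);
            MPI_Send(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }

Timing many such round trips with MPI_Wtime() gives a crude measurement of the interconnect latency discussed earlier.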


Recipe we used for our Beowulf

  • Ingredients: $15000, 3 x 6-packs of Coke, 1 grad student

1. Web surf for 1 week, trying to focus on the Beowulf sites; decide on hardware.

2. Spend 2 days filling in various forms for purchasing and obtaining “competitive quotes”.

3. Wait 5 days for the hardware to arrive; meanwhile web surf some more, and enjoy your last few days of free time for a while.


Recipe (cont’d)

4. Lock grad student, hardware (not money), and Coke in an office. Ignore screams. The hardware should be ready after 8 hours.

Office of the future


Recipe (cont’d 2)

5. Move grad student and hardware to their final destination. By this time grad student will be emotionally attached to the hardware. This is normal. Have grad student set up software. This will take 2 weeks.



Things I would have done differently

  • Plain Linux

  • Color-code the cables!

  • Try a diskless setup (saves on cost and management – but no local scratch space)

  • Get a rackmount setup


Design a $30000 Beowulf

  • One node (2 processors, 1GB RAM) costs $1400, with 4.6 GFLOPS peak

  • Should we get:

    • 16 nodes, with fast ethernet, or

    • 8 nodes, with Myrinet?


Design (cont’d)

  • 16 nodes with fast Ethernet:

    • 73.6 GFLOPS peak

    • 23.99 GFLOPS real (using the 32.6% efficiency of our cluster; arithmetic sketched after this list)

    • 16 GB of RAM

  • 8 nodes with Myrinet:

    • 36.8 GFLOPS peak

    • 22.7 GFLOPS real (using the 61.7% efficiency of HELICS)

    • 8 GB of RAM
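
The estimates above follow mechanically from peak = nodes x per-node peak, and real is roughly peak x measured HPL efficiency (borrowed from the clusters cited earlier). A quick sketch of the arithmetic:

    /* Design exercise: peak = nodes * per-node GFLOPS; real ~ peak * efficiency. */
    #include <stdio.h>

    int main(void)
    {
        const double node_gflops = 4.6;     /* 2-processor node, from the slides */

        double peak16 = 16 * node_gflops;   /* fast Ethernet option */
        double peak8  = 8  * node_gflops;   /* Myrinet option */

        printf("16 nodes: %.1f GFLOPS peak, %.2f real\n", peak16, peak16 * 0.326);
        printf(" 8 nodes: %.1f GFLOPS peak, %.2f real\n", peak8,  peak8  * 0.617);
        return 0;
    }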


Design (cont’d 2)

  • The first choice is good if you work on embarrassingly parallel problems, which do not require much communication

  • The second choice is more general-purpose

