Building a scientific research computing environment
Building a scientific research computing environment. Eric Wu, BBN Technologies 10/29/2003.

Presentation Transcript

Building a scientific research computing environment

Eric Wu, BBN Technologies

10/29/2003



BBN Technologies

  • Consulting firm founded by MIT Professors and a student in 1948. Leo Beranek (B) receiving the 2002 National Medal Of Science

  • Located in Cambridge, MA

  • Accomplishments

    • First ARPAnet

    • @ symbol in email

    • First router

    • Analyzed Nixon Watergate tapes

  • My department

    • Speech recognition. Transcription, not translation.

    • English, Arabic, Japanese

    • ~150 node network

  • http://www.bbn.com


What should I buy?

Hardware and software

Hardware depends on software to realize full potential

Software depends on hardware to realize full potential

$$$


Software

  • Test speed of software (benchmark)

  • Rules for benchmarking

    • First rule of benchmarking:

      • The only benchmark that matters is your code!!!

      • SPEC, Vendor benchmarks are worthless (my opinion)

    • Always try to benchmark before buying a new architecture

  • Benchmarking resources

    • Your friends

    • Web

    • Supercomputing centers

    • Testdrive.hp.com (Alpha, Pentium, Itanium)

    • Buy one
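A minimal sketch of timing your own code, in Python. The names `best_of` and `toy_workload` are hypothetical and not from the talk; swap in the real job you plan to run on the candidate machine.

```python
import time

def best_of(func, repeats=3):
    """Run func() `repeats` times and return the fastest wall-clock time in seconds."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        times.append(time.perf_counter() - start)
    return min(times)

def toy_workload():
    # Stand-in for your real code (hypothetical); replace with an actual run.
    return sum(i * i for i in range(100_000))

elapsed = best_of(toy_workload)
print(f"best of 3: {elapsed:.4f} s")
```

Taking the fastest of several runs reduces noise from other activity on the machine. Run the same script on each architecture you are considering.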


Software - Benchmarking example

  • Performance

    • VASP – Alpha is better than Xeon

    • CP90 - Alpha and Xeon are same

  • Alpha costs 4-5x as much
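The trade-off above can be sketched as speed per dollar. The 2x VASP figure is the one quoted on the clock speed slide; the 4.5x cost ratio is an assumed midpoint of the 4-5x range, and `speed_per_dollar` is a hypothetical helper.

```python
def speed_per_dollar(relative_speed, relative_cost):
    """Performance per unit cost relative to a baseline machine (1.0 = break-even)."""
    return relative_speed / relative_cost

# VASP: Alpha about 2x a Xeon; cost ratio 4.5 is an assumed midpoint of 4-5x.
vasp = speed_per_dollar(2.0, 4.5)
# CP90: Alpha and Xeon run at the same speed.
cp90 = speed_per_dollar(1.0, 4.5)

print(f"VASP: {vasp:.2f}  CP90: {cp90:.2f}")
```

A value below 1.0 means the cheaper machine delivers more work per dollar, even when the expensive one is faster per node.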


Hardware

  • Hardware features

    • Memory speed

    • Interconnects (Front Side Bus)

    • Clock speed

    • 32 bit vs. 64 bit

    • Cache

    • Processor architecture

  • Understanding hardware can help to understand or predict speed


Hardware

[Diagram of Hardware: processors and memory connected by an interconnect (bus)]


Memory and Front Side Bus

  • Don’t ignore memory and interconnects (FSB)!

    • Memory and Front Side Bus (FSB) speed make a difference in performance

    • Be careful when vendors are upgrading

    • FSB speeds for Xeons lag behind those for the Pentium 4

  • FSB effects on a dual-processor machine

    • 1 job (+ 1 free processor) takes 1 hour

    • 2 jobs (no free processors) each take 1.25 hours

    • Bandwidth limitations!
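The dual-processor numbers above work out as follows. This is a sketch; `throughput` is a hypothetical helper, and the timings are the ones on the slide.

```python
def throughput(jobs, hours_per_job):
    """Jobs completed per hour when `jobs` run concurrently."""
    return jobs / hours_per_job

one_job = throughput(1, 1.0)     # 1 job, other processor idle
two_jobs = throughput(2, 1.25)   # both processors busy, sharing the bus
speedup = two_jobs / one_job     # gain from using the second processor
efficiency = speedup / 2         # fraction of the ideal 2x

print(f"speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

The second processor adds about 60% more throughput rather than 100%; the rest is lost to contention for the shared bus.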


Processor: clock speed

  • Defined as the rate at which the processor runs (cycles per second)

    • Useful only when comparing within an architecture (Pentium to Pentium)

    • Useless when comparing across architectures

      • For VASP, Alpha 1.25 GHz is 2x as fast as Xeon 2.8 GHz

      • For VASP, Itanium 900 MHz is 1.6x as fast as Xeon 2.8 GHz

  • Many other factors matter

  • Example: Instructions per clock cycle also matter (IPC)

    • Pentium 4 – 2

    • Itanium “Madison” – 6
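A rough sketch of why clock speed alone misleads: peak instruction rate is clock rate times IPC. The IPC values are from the slide; pairing them with the 2.8 GHz and 900 MHz clocks from the VASP bullets is an assumption made for illustration, and `peak_instructions_per_second` is a hypothetical name.

```python
def peak_instructions_per_second(clock_hz, ipc):
    """Upper bound on instruction rate: clock rate times instructions per cycle."""
    return clock_hz * ipc

# IPC values are from the slide; the clock pairings are assumptions.
pentium4 = peak_instructions_per_second(2.8e9, 2)   # 2.8 GHz, IPC 2
itanium = peak_instructions_per_second(0.9e9, 6)    # 900 MHz, IPC 6

print(f"Pentium 4 class: {pentium4 / 1e9:.1f} G instructions/s peak")
print(f"Itanium class:   {itanium / 1e9:.1f} G instructions/s peak")
```

The peaks come out close despite a 3x difference in clock speed. Peak numbers are still only an upper bound, so real code needs to be benchmarked.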


Processor: 32 bit vs. 64 bit

  • Definitions

    • 32 bit – can store range of 2^32 integers

    • 64 bit – can store range of 2^64 integers

  • Does not mean 64 bit is automatically faster or better!

  • Advantages of 64 bit

    • High memory applications

      • Each number points to an address space in memory

      • 2^32 = 4x10^9, or 4G

      • 2^64 = 1.8x10^19, or 18 billion G

      • 32 bit can access > 4G with OS tricks, but slow

    • Applications with large range of numbers

      • Scientific computing

      • Cryptography

      • 32 bit can handle a range of 2^64 with compiler tricks, but slow
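The address-space arithmetic can be checked directly. A small sketch; `addressable_bytes` is a hypothetical helper, and GiB here means 2^30 bytes.

```python
def addressable_bytes(bits):
    """Number of distinct byte addresses that an address of `bits` bits can name."""
    return 2 ** bits

GIB = 2 ** 30  # bytes in one binary gigabyte

for bits in (32, 64):
    n = addressable_bytes(bits)
    print(f"{bits} bit: {n:.1e} bytes = {n // GIB:,} GiB")
```

In binary units 2^64 bytes is about 17 billion GiB; the "18 billion G" figure uses decimal gigabytes (10^9 bytes).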


Processor: Cache

[Diagram of Hardware: cache sits next to the processor (fast); memory is reached over the interconnect (bus) (slow)]


Processor: Cache

  • Cache

    • Bypass slow interconnect and memory

    • Reduce access time to information

    • Reduce bandwidth requirements to memory

  • L2 vs L3

    • Lower L[n] means closer to processor, more potential for improvement

  • Effects

    • Faster code

    • Superlinear speedup in parallel code

  • Examples

    • Xeon 3.06 GHz 512k L2, 1MB L3

    • Opteron 1MB L2

    • Itanium “Madison” 6MB L3

    • Alpha 16 MB L2

[Diagram: Processor -> L2 Cache -> L3 Cache -> Memory]
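One way to see where superlinear speedup comes from: splitting a problem over more processors shrinks each processor's share of the data until it fits in cache. A simplified sketch; the 24 MB working set is hypothetical, the 6 MB cache size is the Itanium "Madison" figure above, and both function names are made up for illustration.

```python
def per_processor_working_set(total_mb, n_procs):
    """Data each processor touches if the problem is split evenly."""
    return total_mb / n_procs

def fits_in_cache(total_mb, n_procs, cache_mb):
    """True when each processor's share of the data fits in its own cache."""
    return per_processor_working_set(total_mb, n_procs) <= cache_mb

# Hypothetical 24 MB working set, 6 MB of cache per processor.
for n in (1, 2, 4, 8):
    share = per_processor_working_set(24, n)
    print(f"{n} processors: {share:.0f} MB each, fits in cache: {fits_in_cache(24, n, 6)}")
```

Once each share fits, every processor stops waiting on memory, so N processors can run more than N times faster than one.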


Hardware

  • Many processor features can influence speed

  • Effect on speed will depend on software

There is no substitute for benchmarking


Purchasing Strategies

  • Don’t forget to ask your friends

    • How much did they pay?

    • Which vendors?

    • How reliable?

  • Picking vendors

    • Know your group

      • How many students?

      • How many machines?

    • Know the differences between vendors

      • Vendor A vs. Vendor B

        • Hardware: Repair on site vs. send it back

        • Memory: Next day air replacement vs. send it back

        • Diagnosing problems: Motherboard lights vs. send it back

        • Rack Rails: Snap in vs. Screw in

        • Problem rate: 2/16 machines (13%) vs. 5/24 machines (21%)

        • Machine cooling: 5 fans vs. 2 fans

        • Cost: Vendor A is $550 more per node, 1/6th more!


Purchasing Strategies

  • Beware new hardware

    • 3 points of failure: hardware, compiler, software

    • Case study 1: Pentium 2 Xeons (1998) (donation)

      • Operating system?

        • Windows was slow

        • Linux was buggy

      • Compilers were new, no standards

      • Software (VASP) did not have Pentium support

    • Case study 2: Itanium I 600 (2000) on testdrive.hp.com

      • Processors were slower than expected

      • Intel compiler operated differently on Itanium and Xeon

      • Math libraries had bugs (MKL)

      • Software (VASP) did not have Itanium support

  • Sometimes, it’s better to let somebody else be the guinea pig


Purchasing Strategies - Examples

  • Buying Xeons

    • Quotation from Vendor A : ~$4500.

    • Quotation from Vendor B: ~$3000!

    • Go back to Vendor A, Vendor A lowers price to $3000

    • This is extreme, but you should price shop.

      • SW Technologies http://www.swt.com gives prices of cheap Xeons.

      • Ask your friends what they paid.

  • Buying Alphas

    • Quotation from Vendor A : ~$12,500

    • Threaten to buy all Xeons!

    • New quotation from Vendor A: $11,000


Parallel computing

  • Moore’s law is slowing down

  • Source: http://www.nersc.gov/~simon/cs267/


Parallel computing

  • Even with Moore’s law, at best we can only double system size every two years (with N scaling)

  • Parallel computing

    • Advancements in hardware

      • SMP machines

        • More processors/machine

      • Networking of Intel-type machines

        • Myrinet

        • Gigabit is cheaper

    • Advancements in software

      • MPICH and LAM are more robust

      • Your favorite code is probably parallel now

  • Cost

    • Usually cheaper (can be 50%). Some costs (cooling, power) usually covered by school or lab


Parallel computing Hardware

  • Networking Hardware

    • Fast Ethernet (100 Mbits/s)

    • Gigabit (1000 Mbits/s)

    • Myrinet

    • Quadrics, Infiniband, etc…

  • Definition of terms

    • Latency – Time to decide where to send packet.

      • Low latency is good for many small packets

    • Bandwidth – How fast does it transmit?

    • Maximum switching capacity – Maximum volume it can handle (relevant for gigabit)
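A common first-order model of how latency and bandwidth combine: time per message = latency + size / bandwidth. The sketch below uses the link speeds listed above; the 50 microsecond latency is a hypothetical figure applied to both networks, and `transfer_time` is a made-up name.

```python
def transfer_time(size_bytes, latency_s, bandwidth_bits_per_s):
    """Simple model: fixed per-message latency plus time on the wire."""
    return latency_s + size_bytes * 8 / bandwidth_bits_per_s

LATENCY = 50e-6  # hypothetical per-message latency, in seconds

for size in (100, 1_000_000):
    fast = transfer_time(size, LATENCY, 100e6)    # Fast Ethernet, 100 Mbits/s
    gig = transfer_time(size, LATENCY, 1000e6)    # Gigabit, 1000 Mbits/s
    print(f"{size:>9} bytes: fast ethernet {fast * 1e6:9.1f} us, gigabit {gig * 1e6:9.1f} us")
```

For small messages the latency term dominates and extra bandwidth barely helps; for large messages the bandwidth term dominates.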


    Parallel computing Hardware

    • Buy a vendor architecture

      • 8-16 processors on each machine

      • Examples: HP GS160, HP GS320, IBM Power 4

      • Advantages

        • Less sysadmin

        • More reliable

        • Easier in every way

        • Division of machine into OS partitions (more for businesses)

      • Disadvantages

        • Cost - ~$500,000 vs. $50,000-$150,000

        • Can pay for sysadmins instead


    Parallel computing Hardware

    • Gigabit

      • Pricing

        • Cards are often free (standard)

        • Switches are moderately expensive, and falling

        • Few ports = cheap. Pricing does not scale well to >60 ports.

      • Latency

        • Moderate. Depends on switch and packet size

      • Be careful of switching capacity!! Make sure to buy a switch that is made for high performance computing, not routing.

      • Brands:

        • Foundry

        • Extreme

        • Cisco


    Parallel computing Hardware

    • Myrinet

      • Pricing

        • Total ~$1100 a port (http://www.myri.com)

        • Linear scaling up to 128 ports.

      • Latency

        • Lowest latency

      • Needs setup of drivers (not too bad, but…)

      • Easy to expand

      • Best performance for large number of processors (at highest price)
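A quick sketch of what linear Myrinet pricing means for a budget. The ~$1100 per port and the 128-port limit are from the slide; the $3000 node price is the Xeon quotation from the purchasing examples, and `myrinet_cost` is a hypothetical helper.

```python
def myrinet_cost(ports, per_port=1100, max_linear_ports=128):
    """Total interconnect cost, assuming linear pricing up to the stated port limit."""
    if ports > max_linear_ports:
        raise ValueError("linear pricing is only stated up to 128 ports")
    return ports * per_port

NODE_PRICE = 3000  # Xeon node price from the purchasing examples

for nodes in (16, 64, 128):
    network = myrinet_cost(nodes)
    overhead = network / (nodes * NODE_PRICE)
    print(f"{nodes:>3} nodes: network ${network:,}, {overhead:.0%} on top of the nodes")
```

With linear pricing the network adds the same fraction to every node, so whether Myrinet is worth it depends on how much your code communicates.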


    Parallel computing Hardware

    • Remember the first rule of benchmarking. Example:

      • PWSCF or ABINIT, parallelize over k-points

        • Little communication

        • Drawback – need a lot of memory, need kpoints>processors

        • No need for either gigabit or Myrinet

      • VASP, parallelize over plane waves

        • A lot of communication

        • Reduce memory usage

        • Gigabit or Myrinet is essential

    • Know your code and how you will use it!
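A toy illustration of the k-points > processors requirement. This is not how PWSCF or ABINIT distribute work internally; the round-robin scheme and both function names are assumptions for illustration only.

```python
def distribute_kpoints(n_kpoints, n_procs):
    """Round-robin assignment: processor p gets k-points p, p + n_procs, ..."""
    return [list(range(p, n_kpoints, n_procs)) for p in range(n_procs)]

def idle_processors(n_kpoints, n_procs):
    """Count processors that receive no k-points at all."""
    return sum(1 for chunk in distribute_kpoints(n_kpoints, n_procs) if not chunk)

print(distribute_kpoints(10, 4))
print("idle with 4 k-points on 8 processors:", idle_processors(4, 8))
```

Each processor works on its own k-points with little communication, but any processor beyond the number of k-points has nothing to do.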


    Parallel computing Software

    • PVM Parallel Virtual Machine

    • MPI Message Passing Interface

      • LAM http://www.lam-mpi.org

        • Designed for TCP/IP (clusters)

        • Performance (?)

      • MPICH http://www-unix.mcs.anl.gov/mpi/mpich/

        • Stack architecture = flexibility. Not just TCP/IP

        • More popular

        • Slightly easier to use

      • Both MPICH and LAM can coexist. Pick the one you like.


    Compilers

    • Often overlooked

    • Compilers can increase speed 10-100%

    • Compilers are cost-effective

      • Compiler may cost $500

      • Cost to increase speed 10-100% can be $200-$2000/machine!

    • Disadvantages

      • Each compiler is different – alter code for each compiler

      • Students hate compiling codes
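The cost argument can be sketched as follows. The $500 compiler price and the $200-$2000 per-machine hardware range are from the slide; treating the compiler as a single purchase for the whole group is an assumption, and `hardware_cost` is a hypothetical helper.

```python
def hardware_cost(n_machines, per_machine):
    """Cost of buying the same speedup through hardware on every machine."""
    return n_machines * per_machine

COMPILER_COST = 500  # assumed to be a one-time purchase covering the group

for n in (1, 4, 16):
    low = hardware_cost(n, 200)
    high = hardware_cost(n, 2000)
    print(f"{n:>2} machines: hardware ${low:,}-${high:,} vs compiler ${COMPILER_COST}")
```

The hardware cost grows with every machine you add, while the compiler cost under this assumption stays flat.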


    Compilers – gcc (2.95.3, 3.3)

    • Available at http://gcc.gnu.org

    • Advantages

      • Free

      • Portable

      • Wide base of users

      • Newer versions produce fast code

    • Disadvantages

      • Poor Fortran support


    Compilers – Intel

    • Available at

      • http://www.intel.com/software/products/compilers/flin/noncom.htm

      • http://www.intel.com/software/products/compilers/clin/noncom.htm

    • Advantages

      • Free (academia)

      • Wide base of users (more so for Fortran)

      • FAST code on Intel chips. Reported fast code for AMD chips

    • Disadvantages

      • Harder to use (my opinion)

      • “Character” of different versions

      • No Red Hat 9 support


    Compilers – Portland, and others

    • Pricing info at http://www.pgroup.com/pricing/ae.htm

    • Advantages

      • Works for all platforms

      • Robust

    • Disadvantages

      • Some cost

      • Not as fast

    • Other compilers

      • NAG

      • Fujitsu

      • Absoft


    Math Libraries

    • BLAS/LAPACK

      • Intel MKL - http://www.intel.com/software/products/mkl/

      • ATLAS - http://math-atlas.sourceforge.net

      • K. Goto’s BLAS - http://www.cs.utexas.edu/users/flame/goto/

    • FFTW (http://www.fftw.org)

    • Vendor only

      • HP/Compaq cxml

      • IBM essl

      • SGI scsl


    Disk Storage

    • Should be done on a RAID (Redundant Array of Inexpensive Disks)

    • RAID configuration provides fault tolerance

    • Different types of RAID

      • RAID 1 (mirroring) - 2 disks (two 100 G disks = 100G of data)

      • RAID 5 – 3+ disks (three 100G disks = 200G data, four 100G disks = 300G data, etc…)

    • Implemented within software or hardware

    • Disk type SCSI or IDE

    • May take one day to set up. Can save your hide!!!
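The capacity examples above follow a simple rule, sketched here for the two RAID levels on the slide. `usable_capacity` is a hypothetical helper and assumes all disks are the same size.

```python
def usable_capacity(level, n_disks, disk_gb):
    """Usable space in GB for RAID 1 (mirroring) or RAID 5, with equal-size disks."""
    if level == 1:
        if n_disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return disk_gb                  # every disk holds the same data
    if level == 5:
        if n_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (n_disks - 1) * disk_gb  # one disk's worth of space goes to parity
    raise ValueError("only RAID 1 and RAID 5 are covered in this sketch")

print(usable_capacity(1, 2, 100))
print(usable_capacity(5, 3, 100))
print(usable_capacity(5, 4, 100))
```

RAID 5 gives up one disk's worth of space no matter how many disks are in the array, so larger arrays waste a smaller fraction.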


    What type of RAID should I use?

    • Software or Hardware?

      • Software RAID is free

      • Hardware RAID has better performance (especially with more clients), but costs $$$. Usually can buy a PCI card and some cables.

    • SCSI or IDE disks?

      • IDE is cheap

      • SCSI is $$$, but better performance. Most believe better quality.

      • SATA disks are another alternative.

    • Costs

      • Hardware/SCSI can cost 3x more

      • Don’t forget cost of computer to house disks

    • My recommendation

      • Hardware/SCSI

        • Graduate students hate to do sysadmin tasks.

        • Graduate students tend to be lax with sysadmin tasks

        • Force your students to delete old files/use gzip

      • Hardware/IDE – If you need 100’s of G of storage


    Backups

    • “You’re only as good as your last backup”

      Ancient computing proverb

    • MIT-TSM backups http://web.mit.edu/is/help/tsm/quickstart.html

      • $7.50 a month

      • Unlimited storage (rsync) – limited only by restore speed

      • With scripts, can backup every day

    • Disk mirroring with rsync

      • Buy a few cheap IDE disks

      • Use an old machine

    • Tape backups

    • “You’re only as good as your last restore”

      • Modern computing proverb
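A minimal sketch of the rsync mirroring option, wrapped in Python so it can run from a daily script. The paths and host name are hypothetical, as are the function names; `-a`, `--delete` and `--dry-run` are standard rsync options.

```python
import subprocess

def build_rsync_command(source, destination, dry_run=False):
    """Build an rsync command that makes destination an exact copy of source."""
    cmd = ["rsync", "-a", "--delete"]  # -a keeps permissions and times; --delete removes stale files
    if dry_run:
        cmd.append("--dry-run")        # report what would change without copying anything
    cmd += [source, destination]
    return cmd

def mirror(source, destination, dry_run=False):
    """Run the mirror; raises CalledProcessError if rsync reports a failure."""
    subprocess.run(build_rsync_command(source, destination, dry_run), check=True)

# Hypothetical paths; printed here rather than executed.
print(" ".join(build_rsync_command("/home/", "backuphost:/mirror/home/", dry_run=True)))
```

Because `--delete` also mirrors deletions, a file removed by mistake disappears from the mirror on the next run. Try restoring a file from the mirror before you rely on it.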


    Further reading

    • 32 vs. 64 bit

      • Good article: http://www.arstechnica.com/cpu/03q1/x86-64/x86-64-1.html

    • Courses on supercomputers (recommended)

      • Berkeley: http://www.nersc.gov/~simon/cs267/

      • Buffalo: http://www.ccr.buffalo.edu/content/education.htm#courses

    • Building a Beowulf

      • Ron [email protected] http://www.mit.edu/people/cly/beowulf.ppt

      • ROCKS, “automatic” install of Beowulf cluster http://www.x2ca.com/articles/ICCS2003.pdf

    • Parallel computing/supercomputing links

      • Parascope http://www.computer.org/parascope/

      • Nan’s page http://www.cs.rit.edu/~ncs/parallel.html

      • Top 500 http://www.top500.org/


    Conclusions

    • Hardware understanding can help you make an intelligent decision

    • Nothing beats a benchmark of your code

    • Don’t forget the compiler and math libraries

    • Consider your parallel computing options

    • Be sure to implement fault-tolerant systems (RAID and backups)

