Building a scientific research computing environment


Eric Wu, BBN Technologies, 10/29/2003


BBN Technologies
  • Consulting firm founded by MIT professors and a student in 1948
  • Photo: Leo Beranek (B) receiving the 2002 National Medal of Science
  • Located in Cambridge, MA
  • Accomplishments
    • First ARPAnet
    • @ symbol in email
    • First router
    • Analyzed the Nixon Watergate tapes
  • My department
    • Speech recognition. Transcription, not translation.
    • English, Arabic, Japanese
    • ~150 node network
  • http://www.bbn.com
What should I buy?

Hardware and software

Hardware depends on software to realize full potential

Software depends on hardware to realize full potential

$$$

Software
  • Test speed of software (benchmark)
  • Rules for benchmarking
    • First rule of benchmarking:
      • The only benchmark that matters is your code!!!
      • SPEC and vendor benchmarks are worthless (my opinion)
    • Always try to benchmark before buying a new architecture
  • Benchmarking resources
    • Your friends
    • Web
    • Supercomputing centers
    • Testdrive.hp.com (Alpha, Pentium, Itanium)
    • Buy one
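A minimal sketch of what "benchmark your code" can look like in practice. The `workload` function is a hypothetical stand-in (it is not from the slides); the idea is to swap in a call to your own application and run the same script on each candidate machine.

```python
import statistics
import time


def benchmark(func, *args, repeats=5):
    """Run func(*args) `repeats` times; return the median wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def workload(n):
    # Hypothetical stand-in: replace with a call into your own code.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    print(f"median time: {benchmark(workload, 200_000):.4f} s")
```

The median is used rather than the mean so that one slow run (for example, from a cold disk cache) does not skew the comparison.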
Software - Benchmarking example
  • Performance
    • VASP – Alpha is better than Xeon
    • CP90 – Alpha and Xeon are the same
  • Alpha costs 4-5x as much
Hardware
  • Hardware features
    • Memory speed
    • Interconnects (Front Side Bus)
    • Clock speed
    • 32 bit vs. 64 bit
    • Cache
    • Processor architecture
  • Understanding hardware can help to understand or predict speed
Hardware

Diagram of Hardware: Processor and Memory blocks joined by the Interconnect (Bus)
Memory and Front Side Bus
  • Don’t ignore memory and interconnects (FSB)!
    • Memory and Front Side Bus (FSB) speed make a difference in performance
    • Be careful when vendors are upgrading
    • FSB speeds for Xeons lag behind those for the Pentium 4
  • FSB effects on a dual-processor machine
    • 1 job (+ 1 free processor) takes 1 hour
    • 2 jobs (no free processors) each take 1.25 hours
    • Bandwidth limitations!
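The dual-processor numbers above can be turned into throughput with a little arithmetic. This is only a sketch of that calculation, using the 1 hour and 1.25 hour figures from the slide; the function name is made up for illustration.

```python
def jobs_per_hour(concurrent_jobs, hours_per_job):
    """Throughput when `concurrent_jobs` run side by side, each taking `hours_per_job`."""
    return concurrent_jobs / hours_per_job


one_job = jobs_per_hour(1, 1.0)    # one processor busy, one free
two_jobs = jobs_per_hour(2, 1.25)  # both processors busy, sharing the bus
speedup = two_jobs / one_job

print(f"1 job : {one_job:.2f} jobs/hour")
print(f"2 jobs: {two_jobs:.2f} jobs/hour (speedup {speedup:.2f}x, not 2x)")
```

In other words, the second processor buys about 60% more throughput rather than 100%, which is the cost of the shared bandwidth.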
Processor: clock speed
  • Defined as the rate at which the processor runs (cycles per second)
    • Useful only when comparing within an architecture (Pentium to Pentium)
    • Useless when comparing across architectures
      • For VASP, Alpha 1.25 GHz is 2x as fast as Xeon 2.8 GHz
      • For VASP, Itanium 900 MHz is 1.6x as fast as Xeon 2.8 GHz
  • Many other factors matter
  • Example: instructions per clock cycle (IPC) also matter
    • Pentium 4 – 2
    • Itanium “Madison” – 6
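A rough sketch of why clock speed alone misleads: peak instruction rate is clock speed times IPC. The IPC values are the ones on the slide; pairing them with the 2.8 GHz and 900 MHz clock speeds from the VASP comparison is an assumption made here for illustration, as is treating the Xeon like a Pentium 4.

```python
def peak_instructions_per_second(clock_hz, ipc):
    """Theoretical peak only: clock rate times instructions per clock cycle."""
    return clock_hz * ipc


# Assumed pairings of clock speed and IPC (see the note above).
pentium4_peak = peak_instructions_per_second(2.8e9, 2)
itanium_peak = peak_instructions_per_second(900e6, 6)

print(f"Pentium 4 @ 2.8 GHz: {pentium4_peak / 1e9:.1f} G instructions/s")
print(f"Itanium   @ 900 MHz: {itanium_peak / 1e9:.1f} G instructions/s")
```

Even though the clocks differ by about 3x, the theoretical peaks come out close. The measured VASP result (Itanium 1.6x faster) shows that even this estimate does not replace a benchmark.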
Processor: 32 bit vs. 64 bit
  • Definitions
    • 32 bit – can store a range of 2^32 integers
    • 64 bit – can store a range of 2^64 integers
  • Does not mean 64 bit is automatically faster or better!
  • Advantages of 64 bit
    • High memory applications
      • Each number points to an address in memory
      • 2^32 ≈ 4x10^9, or 4G
      • 2^64 ≈ 1.8x10^19, or 18 billion G
      • 32 bit can access > 4G with OS tricks, but slow
    • Applications with large range of numbers
      • Scientific computing
      • Cryptography
      • 32 bit can handle 2^64 with compiler tricks, but slow
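The address-space figures can be checked directly. A small sketch, assuming one byte per address; the helper name is made up for illustration.

```python
GIB = 2**30  # bytes in one binary gigabyte


def addressable_bytes(bits):
    """Number of distinct byte addresses a pointer of `bits` width can hold."""
    return 2**bits


for bits in (32, 64):
    total = addressable_bytes(bits)
    print(f"{bits} bit: {total:.3e} bytes = {total // GIB:,} GiB")
```

For 32 bit this gives 4 GiB. For 64 bit it gives roughly 1.8x10^19 bytes, which is about 18 billion G when counted in decimal gigabytes.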
Processor: Cache

Diagram of Hardware: Processor, Cache, and Memory blocks joined by the Interconnect (Bus), with paths labeled Fast and Slow
Processor: Cache
  • Cache
    • Bypass slow interconnect and memory
    • Reduce access time to information
    • Reduce bandwidth requirements to memory
  • L2 vs L3
    • Lower L[n] means closer to processor, more potential for improvement
  • Effects
    • Faster code
    • Superlinear speedup in parallel code
  • Examples
    • Xeon 3.06 GHz 512k L2, 1MB L3
    • Opteron 1MB L2
    • Itanium “Madison” 6MB L3
    • Alpha 16 MB L2

Diagram: Processor, L2 Cache, L3 Cache, Memory
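One way to picture the superlinear speedup mentioned above: when a problem is split across processors, each piece may become small enough to fit in cache. The sketch below is a simplified model that assumes the data divides evenly; the 20 MB working set is a made-up number, and the 6 MB cache is the Itanium "Madison" L3 size from the slide.

```python
MB = 1024 * 1024


def per_process_working_set(total_bytes, n_procs):
    """Bytes each process touches, assuming the data splits evenly."""
    return total_bytes / n_procs


def fits_in_cache(total_bytes, n_procs, cache_bytes):
    """True if each process's share of the data fits in its cache."""
    return per_process_working_set(total_bytes, n_procs) <= cache_bytes


working_set = 20 * MB  # hypothetical problem size
cache = 6 * MB         # Itanium "Madison" L3, from the slide

for procs in (1, 2, 4):
    print(procs, "processors:", fits_in_cache(working_set, procs, cache))
```

Once the per-process share drops below the cache size, each process stops waiting on slow memory, so the parallel run can be more than N times faster than the serial one.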

Hardware
  • Many processor features can influence speed
  • Effect on speed will depend on software

There is no substitute for benchmarking

Purchasing Strategies
  • Don’t forget to ask your friends
    • How much did they pay?
    • Which vendors?
    • How reliable?
  • Picking vendors
    • Know your group
      • How many students?
      • How many machines?
    • Know the differences between vendors
      • Vendor A vs. Vendor B
        • Hardware: Repair on site vs. send it back
        • Memory: Next day air replacement vs. send it back
        • Diagnosing problems: Motherboard lights vs. send it back
        • Rack Rails: Snap in vs. Screw in
        • Problem rate: 2/16 machines (12.5%) vs. 5/24 machines (21%)
        • Machine cooling: 5 fans vs. 2 fans
        • Cost: Vendor A is $550 more per node, 1/6th more!
Purchasing Strategies
  • Beware new hardware
    • 3 points of failure: hardware, compiler, software
    • Case study 1: Pentium 2 Xeons (1998) (donation)
      • Operating system?
        • Windows was slow
        • Linux was buggy
      • Compilers were new, no standards
      • Software (VASP) did not have Pentium support
    • Case study 2: Itanium I 600 (2000) on testdrive.hp.com
      • Processors were slower than expected
      • Intel compiler operated differently on Itanium and Xeon
      • Math libraries had bugs (MKL)
      • Software (VASP) did not have Itanium support
  • Sometimes, it’s better to let somebody else be the guinea pig
Purchasing Strategies - Examples
  • Buying Xeons
    • Quotation from Vendor A : ~$4500.
    • Quotation from Vendor B: ~$3000!
    • Go back to Vendor A, Vendor A lowers price to $3000
    • This is extreme, but you should price shop.
      • SW Technologies http://www.swt.com gives prices of cheap Xeons.
      • Ask your friends what they paid.
  • Buying Alphas
    • Quotation from Vendor A : ~$12,500
    • Threaten to buy all Xeons!
    • New quotation from Vendor A: $11,000
Parallel computing
  • Moore’s law is slowing down
  • Source: http://www.nersc.gov/~simon/cs267/
Parallel computing
  • Even with Moore’s law, at best we can only double system size every two years (with N scaling)
  • Parallel computing
    • Advancements in hardware
      • SMP machines
        • More processors/machine
      • Networking of Intel-type machines
        • Myrinet
        • Gigabit is cheaper
    • Advancements in software
      • MPICH and LAM are more robust
      • Your favorite code is probably parallel now
  • Cost
    • Usually cheaper (can be 50%). Some costs (cooling, power) usually covered by school or lab
Parallel computing Hardware
  • Networking Hardware
    • Fast Ethernet (100 Mbits/s)
    • Gigabit (1000 Mbits/s)
    • Myrinet
    • Quadrics, Infiniband, etc…
  • Definition of terms
    • Latency – time to decide where to send a packet
      • Low latency is good for many small packets
    • Bandwidth – how fast data is transmitted
    • Maximum switching capacity – maximum volume the switch can handle (relevant for gigabit)
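A common first-order model for how latency and bandwidth combine is: time = latency + message size / bandwidth. The sketch below uses the Fast Ethernet and Gigabit bandwidths listed above; the 100 microsecond latency is a placeholder assumption, not a measured value for any real switch.

```python
def transfer_time_s(message_bytes, latency_s, bandwidth_mbit_s):
    """First-order estimate: fixed latency plus size divided by bandwidth."""
    return latency_s + (message_bytes * 8) / (bandwidth_mbit_s * 1e6)


ASSUMED_LATENCY_S = 100e-6  # placeholder; measure your own network

for size in (100, 10_000_000):  # a small packet and a 10 MB message
    fast = transfer_time_s(size, ASSUMED_LATENCY_S, 100)
    gig = transfer_time_s(size, ASSUMED_LATENCY_S, 1000)
    print(f"{size:>10} bytes: Fast Ethernet {fast:.6f} s, Gigabit {gig:.6f} s")
```

For small packets the latency term dominates, so extra bandwidth barely helps; for large messages the bandwidth term dominates. That is why low latency matters for codes that send many small packets.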
Parallel computing Hardware
  • Buy a vendor architecture
    • 8-16 processors on each machine
    • Examples: HP GS160, HP GS320, IBM Power 4
    • Advantages
      • Less sysadmin
      • More reliable
      • Easier in every way
      • Division of machine into OS partitions (more for businesses)
    • Disadvantages
      • Cost - ~$500,000 vs. $50,000-$150,000
      • Can pay for sysadmins instead
Parallel computing Hardware
  • Gigabit
    • Pricing
      • Cards are often free (standard)
      • Switches are moderately expensive, and falling
      • Few ports = cheap. Pricing does not scale well to >60 ports.
    • Latency
      • Moderate. Depends on switch and packet size
    • Be careful of switching capacity!! Make sure to buy a switch that is made for high performance computing, not routing.
    • Brands:
      • Foundry
      • Extreme
      • Cisco
Parallel computing Hardware
  • Myrinet
    • Pricing
      • Total ~$1100 a port (http://www.myri.com)
      • Linear scaling up to 128 ports.
    • Latency
      • Lowest latency
    • Needs setup of drivers (not too bad, but…)
    • Easy to expand
    • Best performance for large number of processors (at highest price)
Parallel computing Hardware
  • Remember the first rule of benchmarking. Example
    • PWSCF or ABINIT, parallelize over k-points
      • Little communication
      • Drawback – need a lot of memory, need kpoints>processors
      • No need for either gigabit or Myrinet
    • VASP, parallelize over plane waves
      • A lot of communication
      • Reduce memory usage
      • Gigabit or Myrinet is essential
  • Know your code and how you will use it!
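To see why k-point parallelism needs at least as many k-points as processors, here is an illustrative round-robin assignment. It is not how PWSCF or ABINIT actually distribute work; the function names are made up, and the point is only that extra processors sit idle when k-points run out.

```python
def assign_kpoints(n_kpoints, n_procs):
    """Round-robin: k-point i goes to processor i % n_procs."""
    assignment = [[] for _ in range(n_procs)]
    for k in range(n_kpoints):
        assignment[k % n_procs].append(k)
    return assignment


def idle_processors(n_kpoints, n_procs):
    """Count processors that receive no k-points at all."""
    return sum(1 for ks in assign_kpoints(n_kpoints, n_procs) if not ks)


print(assign_kpoints(10, 4))   # every processor has work
print(idle_processors(4, 8))   # more processors than k-points
```

Each processor works on its own k-points independently, which is why little communication is needed, but each also holds a full copy of the problem, which is where the memory cost comes from.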
Parallel computing Software
  • PVM – Parallel Virtual Machine
  • MPI – Message Passing Interface
    • LAM http://www.lam-mpi.org
      • Designed for TCP/IP (clusters)
      • Performance (?)
    • MPICH http://www-unix.mcs.anl.gov/mpi/mpich/
      • Stack architecture = flexibility. Not just TCP/IP
      • More popular
      • Slightly easier to use
    • Both MPICH and LAM can coexist. Pick the one you like.
Compilers
  • Often overlooked
  • Compilers can increase speed 10-100%
  • Compilers are cost-effective
    • Compiler may cost $500
    • Buying hardware to increase speed 10-100% can cost $200-$2000/machine!
  • Disadvantages
    • Each compiler is different – alter code for each compiler
    • Students hate compiling codes
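The cost argument can be made concrete with a quick calculation. This sketch assumes a single $500 compiler purchase can be shared by every machine in the group (licensing terms vary, so check) and uses a hypothetical group of 16 machines; the $200-$2000 hardware range is from the slide.

```python
def compiler_cost_per_machine(license_cost, n_machines):
    """Spread a one-time compiler purchase across all machines that use it."""
    return license_cost / n_machines


N_MACHINES = 16                    # hypothetical group size
HARDWARE_COST_RANGE = (200, 2000)  # per machine, from the slide

per_machine = compiler_cost_per_machine(500, N_MACHINES)
print(f"compiler: ${per_machine:.2f} per machine")
print(f"hardware: ${HARDWARE_COST_RANGE[0]}-${HARDWARE_COST_RANGE[1]} per machine")
```

The larger the group, the cheaper the compiler looks relative to a hardware upgrade that delivers a similar speedup.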
Compilers – gcc (2.95.3, 3.3)
  • Available at http://gcc.gnu.org
  • Advantages
    • Free
    • Portable
    • Wide base of users
    • Newer versions produce fast code
  • Disadvantages
    • Poor Fortran support
Compilers – Intel
  • Available at
    • http://www.intel.com/software/products/compilers/flin/noncom.htm
    • http://www.intel.com/software/products/compilers/clin/noncom.htm
  • Advantages
    • Free (academia)
    • Wide base of users (more so for Fortran)
    • FAST code on Intel chips. Reported fast code for AMD chips
  • Disadvantages
    • Harder to use (my opinion)
    • “Character” of different versions
    • No Red Hat 9 support
Compilers – Portland, and others
  • Pricing info at http://www.pgroup.com/pricing/ae.htm
  • Advantages
    • Works for all platforms
    • Robust
  • Disadvantages
    • Some cost
    • Not as fast
  • Other compilers
    • NAG
    • Fujitsu
    • Absoft
Math Libraries
  • BLAS/LAPACK
    • Intel MKL - http://www.intel.com/software/products/mkl/
    • ATLAS - http://math-atlas.sourceforge.net
    • K. Goto’s BLAS - http://www.cs.utexas.edu/users/flame/goto/
  • FFTW (http://www.fftw.org)
  • Vendor only
    • HP/Compaq cxml
    • IBM essl
    • SGI scsl
Disk Storage
  • Should be done on a RAID (Redundant Array of Inexpensive Disks)
  • RAID configuration provides fault tolerance
  • Different types of RAID
    • RAID 1 (mirroring) - 2 disks (two 100 G disks = 100G of data)
    • RAID 5 – 3+ disks (three 100G disks = 200G data, four 100G disks = 300G data, etc…)
  • Implemented within software or hardware
  • Disk type SCSI or IDE
  • May take one day to set up. Can save your hide!!!
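The RAID capacity examples above follow a simple rule: RAID 1 keeps one disk's worth of data, and RAID 5 gives up one disk's worth to parity. A small sketch of that rule, assuming equal-sized disks; the function name is made up for illustration.

```python
def usable_capacity(level, n_disks, disk_size):
    """Usable space for the two RAID levels on the slide, with equal-sized disks."""
    if level == 1:
        if n_disks != 2:
            raise ValueError("this sketch models RAID 1 as a two-disk mirror")
        return disk_size
    if level == 5:
        if n_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (n_disks - 1) * disk_size
    raise ValueError(f"RAID level {level} is not covered by this sketch")


print(usable_capacity(1, 2, 100))  # two 100G disks mirrored
print(usable_capacity(5, 3, 100))  # three 100G disks
print(usable_capacity(5, 4, 100))  # four 100G disks
```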
What type of RAID should I use?
  • Software or Hardware?
    • Software RAID is free
    • Hardware RAID has better performance (especially with more clients), but costs $$$. Usually can buy a PCI card and some cables.
  • SCSI or IDE disks?
    • IDE is cheap
    • SCSI is $$$, but better performance. Most believe better quality.
    • SATA disks are another alternative.
  • Costs
    • Hardware/SCSI can cost 3x more
    • Don’t forget cost of computer to house disks
  • My recommendation
    • Hardware/SCSI
      • Graduate students hate to do sysadmin tasks.
      • Graduate students tend to be lax with sysadmin tasks
      • Force your students to delete old files/use gzip
    • Hardware/IDE – If you need 100’s of G of storage
Backups
  • “You’re only as good as your last backup” (ancient computing proverb)
  • MIT-TSM backups http://web.mit.edu/is/help/tsm/quickstart.html
    • $7.50 a month
    • Unlimited storage (rsync) – limited only by restore speed
    • With scripts, can back up every day
  • Disk mirroring with rsync
    • Buy a few cheap IDE disks
    • Use an old machine
  • Tape backups
  • “You’re only as good as your last restore” (modern computing proverb)
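Disk mirroring with rsync can be scripted in a few lines and run daily from cron. This is a sketch only: the paths in the usage comment are hypothetical, and the optional `--delete` flag (which removes files on the mirror that no longer exist at the source) should be used with care.

```python
import subprocess


def build_rsync_command(source, destination, delete=False):
    """Build an rsync command line; -a preserves permissions, times, and links."""
    cmd = ["rsync", "-a"]
    if delete:
        cmd.append("--delete")  # make the mirror exact, including deletions
    cmd += [source, destination]
    return cmd


def mirror(source, destination, delete=False):
    """Run the mirror; raises CalledProcessError if rsync reports a failure."""
    subprocess.run(build_rsync_command(source, destination, delete), check=True)


# Example with hypothetical paths:
# mirror("/home/", "/mnt/backup/home/")
```

In the spirit of the second proverb, it is worth restoring a few files from the mirror from time to time to confirm the backup is usable.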
Further reading
  • 32 vs. 64 bit
    • Good article: http://www.arstechnica.com/cpu/03q1/x86-64/x86-64-1.html
  • Courses on supercomputers (recommended)
    • Berkeley: http://www.nersc.gov/~simon/cs267/
    • Buffalo: http://www.ccr.buffalo.edu/content/education.htm#courses
  • Building a Beowulf
    • Ron [email protected] http://www.mit.edu/people/cly/beowulf.ppt
    • ROCKS, “automatic” install of Beowulf cluster http://www.x2ca.com/articles/ICCS2003.pdf
  • Parallel computing/supercomputing links
    • Parascope http://www.computer.org/parascope/
    • Nan’s page http://www.cs.rit.edu/~ncs/parallel.html
    • Top 500 http://www.top500.org/
Conclusions
  • Hardware understanding can help you make an intelligent decision
  • Nothing beats a benchmark of your code
  • Don’t forget the compiler and math libraries
  • Consider your parallel computing options
  • Be sure to implement fault-tolerant systems (RAID and backups)