
Essential Overview

Louisiana Tech University, Ruston, Louisiana. Charles Grassl, IBM. January 2006.


Presentation Transcript


  1. Essential Overview • Louisiana Tech University, Ruston, Louisiana • Charles Grassl, IBM • January 2006

  2. Agenda • Hardware • Software • Documentation

  3. Hardware Overview • Processors: • Nodes: • Clusters:

  4. Product Naming

  5. Processor Progression

  6. POWER5 Systems • POWER5 processors • Single and Dual processor chips • Modules • Dual Chip Modules (DCM) • Multi Chip Modules (MCM) • Nodes • Multiple modules • p5-575 • p5-595 • Cluster • Multiple nodes • Connected with High Performance Switch (HPS)

  7. Systems (“Nodes”)

  8. POWER5 Processor Systems • [Figure: processor chip, DCM, MCM, p5-575, p5-595, cluster]

  9. Cluster 1600 • [Figure: physical view and logical view; multi-processor nodes, network, disk system]

  10. Local System Name • IBM p5-575 nodes • 1.9 GHz POWER5 processors • Single processor chips • 8 processors per node • HPS interconnect • “575” distinction: • Dual Chip Module (DCM) • 8 DCMs • One or two processors per chip • Single Core (SC) • Dual Core (DC) • “595” distinction: • Multi Chip Module (MCM) construction • 8 MCMs

  11. POWER5 Processors • Multi-processor chip • High clock rate: Multiple GHz • Three cache levels • Bandwidth • Latency hiding • Shared Memory • Large memory size

  12. POWER5 Features • Private L1 cache • Shared L2 cache • Shared L3 cache • Interleaved memory • Hardware Prefetch • Multiple Page Size support

  13. Processor Characteristics • High frequency clocks • Deep pipelines • High asymptotic rates • Superscalar • Speculative out-of-order instructions • Up to 8 outstanding cache line misses • Large number of instructions in flight • Branch prediction • Hardware Prefetching

  14. Processor Features

  15. Caches and Memory

  16. POWER4 – POWER5 Comparison

  17. POWER5 Design: Summary • More gates • 170 million → 260 million • Enhancements • Increased cache associativity • Increased number of rename registers • Reduced L3 and cache latency • New features • Simultaneous Multi Threading • Dynamic power management

  18. Processor Systems (Nodes) • Multiple processors • Multiple modules • Various construction formats • Multi Chip Modules • Dual Chip Modules • Shared memory

  19. Multi Chip and Dual Chip Modules • POWER5 Processor Chip • Dual Chip Module (DCM) • p5-570 • p5-575 • Multi Chip Module (MCM) • p5-590 • p5-595

  20. Dual Chip Module • Each Module: • 1 processor chip • 1 L3 cache • 1 Memory card • Each Processor Chip: • 2 processors • L1 caches • Registers • Functional units • 1 L2 cache • 1 path to memory • [Figure labels: 36 Mbyte L3, Memory]

  21. Multi Chip Module • Each Module: • 4 processor chips • 4 L3 cache chips • 2 Memory cards • Each Processor Chip: • 2 processors • L1 caches • Registers • Functional units • 1 L2 cache • 1 path to memory • [Figure labels: four Memory blocks]
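The module composition on slides 20 and 21 can be turned into a quick processor count. This is only an illustrative sketch: the `Module` class and its field names are hypothetical, and the figure of 8 modules per node is taken from slide 10.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """Hypothetical record of a POWER5 module, using counts from the slides."""
    name: str
    processor_chips: int      # processor chips per module
    processors_per_chip: int  # processors (cores) per chip

    def processors(self) -> int:
        """Processors contributed by one module."""
        return self.processor_chips * self.processors_per_chip


# Slide 10: the local p5-575 nodes use single-processor chips.
dcm_single_core = Module("DCM (single core)", 1, 1)
# Slide 20: a DCM carries 1 processor chip with 2 processors.
dcm_dual_core = Module("DCM (dual core)", 1, 2)
# Slide 21: an MCM carries 4 processor chips with 2 processors each.
mcm = Module("MCM", 4, 2)

MODULES_PER_NODE = 8  # slide 10: 8 DCMs (p5-575) or 8 MCMs (p5-595)

for module in (dcm_single_core, dcm_dual_core, mcm):
    per_node = MODULES_PER_NODE * module.processors()
    print(f"{module.name}: {module.processors()} per module, {per_node} per node")
```

With single-core DCMs this reproduces the 8 processors per node quoted for the local p5-575 system.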

  22. POWER5 Multi Chip Module • Four POWER5 chips • Four L3 cache chips • 95 mm × 95 mm • 4,491 signal I/Os • 89 layers of metal

  23. POWER5 Dual Chip Module • One POWER5 chip • Single or Dual Core • One L3 cache chip

  24. Modifications to POWER4 System Structure • [Figure: block diagram with processors (P), L2 and L3 caches, fabric controllers (Fab Ctl), memory controllers (Mem Ctl), and memory]

  25. Switch Technology • Internal network • In lieu of GigEthernet, Myrinet, Quadrics, etc. • Fourth generation • HPS Switch (POWER2 generation) • SP Switch (POWER2 -> POWER3) • SP Switch 2 (POWER3 -> POWER4) • HPS (POWER4 -> POWER5) • Multiple links per node • Match number of links to number of processors

  26. High Performance Switch (HPS) • Also Known As “Federation” • Follow on to SP Switch2 • Also known as “Colony” • Specifications: • 2 Gbyte/s (bidirectional) • 5 microsecond latency • Configuration: • Up to four adaptors per node • 2 links per adaptor • 16 Gbyte/s per node
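The per-node figure on slide 26 follows from the adaptor and link counts. The sketch below works through that arithmetic, assuming the 2 Gbyte/s bidirectional rate applies to each link (the slide does not say so explicitly); the function and constant names are hypothetical.

```python
LINK_BANDWIDTH_GBYTE_S = 2  # bidirectional rate, assumed to be per link
LINKS_PER_ADAPTOR = 2       # slide 26
MAX_ADAPTORS_PER_NODE = 4   # slide 26


def node_bandwidth(adaptors: int) -> int:
    """Aggregate bidirectional HPS bandwidth for one node, in Gbyte/s."""
    if not 1 <= adaptors <= MAX_ADAPTORS_PER_NODE:
        raise ValueError(f"adaptors must be 1..{MAX_ADAPTORS_PER_NODE}")
    return adaptors * LINKS_PER_ADAPTOR * LINK_BANDWIDTH_GBYTE_S


for n in range(1, MAX_ADAPTORS_PER_NODE + 1):
    print(f"{n} adaptor(s): {node_bandwidth(n)} Gbyte/s")
```

A fully populated node (4 adaptors, 2 links each) comes to the 16 Gbyte/s quoted on the slide.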

  27. HPS Specifications

  28. Software Overview • Operating System • AIX • Compilers • C • C++ • Fortran • Batch Queue • LoadLeveler (IBM) • LSF (Platform) • PBS • Gridware

  29. AIX • Current Version: AIX 5.3 • Processors: • POWER3 • POWER4 • POWER5 • Linux Affinity • Logical PARtitions (LPAR) Nodes • Operating system • Memory • Network connections • Kernel Address Size: • 64-bit • 32-bit

  30. Linux on POWER • Native Linux, SuSE7 → SuSE8 • RPMs and package managers • Cluster Systems Manager • 64-bit kernel • 32/64-bit application support (SuSE8)

  31. Compilers • C and C++ • VisualAge C and C++ Professional for AIX • Versions 6, 7, 8 • ANSI C • C++ • Compiler names: • xlc • xlC • Fortran • XL Fortran for AIX • Versions 8, 9, 10 • Fortran 77 • Fortran 90 • Compiler names: • xlf77 • xlf90

  32. Compiler Names • AIX uses different compiler names to perform some tasks that are handled by compiler flags on most other systems.

  33. Compiler Usage

  34. User Limits • Set by the system administrator • Ulimit: • C or K shell built-in • Sets or reports resource limits • Limits are defined in /etc/security/limits • Sizes are in 512 byte blocks • Times are in seconds • $ ulimit -a
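Because sizes in /etc/security/limits are expressed in 512-byte blocks, a value usually has to be converted before it is meaningful. The helper below is a minimal sketch of that conversion; the function names are hypothetical and the sample value of 262144 blocks is only an illustration, not a statement about any site's settings.

```python
BLOCK_SIZE_BYTES = 512  # slide 34: sizes are in 512-byte blocks


def blocks_to_bytes(blocks: int) -> int:
    """Convert a size given in 512-byte blocks to bytes."""
    return blocks * BLOCK_SIZE_BYTES


def blocks_to_mbytes(blocks: int) -> float:
    """Convert a size given in 512-byte blocks to Mbytes (2**20 bytes)."""
    return blocks_to_bytes(blocks) / (1024 * 1024)


sample_blocks = 262144  # illustrative value only
print(f"{sample_blocks} blocks = {blocks_to_bytes(sample_blocks)} bytes "
      f"= {blocks_to_mbytes(sample_blocks):.0f} Mbyte")
```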

  35. Ulimit Defaults * 64-bit address mode

  36. Other Defaults • Thread control • /etc/environment • AIXTHREAD_SCOPE=S • AIXTHREAD_MNRATIO=1:1 • AIXTHREAD_COND_DEBUG=OFF • AIXTHREAD_GUARDPAGES=4 • AIXTHREAD_MUTEX_DEBUG=OFF • AIXTHREAD_RWLOCK_DEBUG=OFF
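To check which thread-control defaults are in effect, the NAME=VALUE lines of /etc/environment can be read into a dictionary. The sketch below parses an inline copy of the settings from slide 36 rather than the real file, so it runs anywhere; `parse_environment` is a hypothetical helper, and the assumption that lines starting with `#` are comments should be checked against the local file.

```python
SAMPLE = """\
# thread control defaults (copied from slide 36)
AIXTHREAD_SCOPE=S
AIXTHREAD_MNRATIO=1:1
AIXTHREAD_COND_DEBUG=OFF
AIXTHREAD_GUARDPAGES=4
AIXTHREAD_MUTEX_DEBUG=OFF
AIXTHREAD_RWLOCK_DEBUG=OFF
"""


def parse_environment(text: str) -> dict:
    """Parse NAME=VALUE lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:  # ignore lines without an '=' sign
            settings[name.strip()] = value.strip()
    return settings


threads = {name: value
           for name, value in parse_environment(SAMPLE).items()
           if name.startswith("AIXTHREAD_")}

for name, value in threads.items():
    print(f"{name} = {value}")
```

On a real node the same function could be applied to the contents of /etc/environment instead of `SAMPLE`.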

  37. Batch Queuing • Compile on any AIX node • Use -qarch=pwr5 • Submit job with available batch utility • Use appropriate queue name • Available queuing systems: • LoadLeveler • PBS • Gridware • LSF
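The compile step above can be scripted so the architecture flag is never forgotten. This is a small sketch that only builds and prints the command line; it does not invoke the compiler. The helper `compile_command` and the file names `hello.c` and `hello` are hypothetical, while `xlc` and `-qarch=pwr5` come from the slides.

```python
import shlex


def compile_command(compiler: str, source: str, output: str,
                    arch: str = "pwr5") -> list:
    """Build an XL compiler argument list that targets the given architecture."""
    return [compiler, f"-qarch={arch}", "-o", output, source]


# Hypothetical source and executable names, for illustration only.
cmd = compile_command("xlc", "hello.c", "hello")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Returning a list rather than a string lets the same value be passed directly to `subprocess.run` on a node where the compiler is installed.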

  38. Cluster Layout • [Figure: compile-and-submit node, network, Node 0, Node 1, Node 2]

  39. Documentation • Software: • www.software.ibm.com • Products A-Z • X -> xl C, xl C/C++, xl Fortran • www.servers.ibm.com/aix • Compilers • /usr/vac/doc • /usr/vacpp/doc • /usr/lpp/xlf/doc • Redbooks: • www.redbooks.ibm.com/ • IBM eServer p5 590 and 595 System Handbook

  40. Documentation • AIX Commands Reference • AIX command: • /usr/sbin/infocenter • /opt/ibm_help/help_start.sh • http://www.unet.univie.ac.at/aix/aixgen/wbinfnav/aixcmdsrefbooks.htm • Google search: “AIX Commands Reference”

  41. Documentation Library • Google search: “AIX 5L documentation Library” • http://publibn.boulder.ibm.com/cgi-bin/ds_rslt

  42. Summary: Architecture • System architecture • Processors • Nodes • Cluster • Processors • POWER5 • Three levels of cache • Nodes: • Eight processor p5-575 • Cluster: • 14 p5-575 nodes • HPS interconnect
