Interconnection and Packaging in IBM Blue Gene/L

Yi Zhu, Feb 12, 2007

Outline

  • Design goals
  • Architecture
  • Design philosophy
Main Design Goals for Blue Gene/L
  • Improve computing capability, holding total system cost.
  • Reduce cost/FLOP.
  • Reduce complexity and size.
    • ~25 kW/rack is the maximum for air cooling in a standard machine room.
    • The 700 MHz PowerPC 440 core chosen for the ASIC has an excellent FLOPS/watt ratio.
  • Maximize Integration:
    • On chip: ASIC with everything except main memory.
    • Off chip: Maximize the number of nodes in a rack.
Blue Gene/L Packaging
  • 2 nodes per compute card.
  • 16 compute cards per node board.
  • 16 node boards per 512-node midplane.
  • Two midplanes in a 1024-node rack.
  • 64 racks in the full system (65,536 nodes total).
  • Compute card: 206 mm x 55 mm
  • Node card: approximately 0.46 m x 0.61 m
  • Midplane: 0.64m tall x 0.8m x 0.5m
  • Rack: 2m tall x 0.91 m x 0.91 m
  • On one midplane: 16 node cards x 16 compute cards x 2 chips – an 8x8x8 torus
  • Among midplanes: three network switches, one per dimension – 8x4x4 torus
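
The packaging hierarchy above multiplies out to the full system size; a quick arithmetic sanity check (a small Python sketch, with the figures taken from the bullets above):

```python
# Blue Gene/L packaging hierarchy, per the figures above.
NODES_PER_COMPUTE_CARD = 2
COMPUTE_CARDS_PER_NODE_BOARD = 16
NODE_BOARDS_PER_MIDPLANE = 16
MIDPLANES_PER_RACK = 2
RACKS = 64

nodes_per_midplane = (NODES_PER_COMPUTE_CARD
                      * COMPUTE_CARDS_PER_NODE_BOARD
                      * NODE_BOARDS_PER_MIDPLANE)
nodes_per_rack = nodes_per_midplane * MIDPLANES_PER_RACK
total_nodes = nodes_per_rack * RACKS

print(nodes_per_midplane)  # 512, matching the 8x8x8 midplane torus
print(nodes_per_rack)      # 1024
print(total_nodes)         # 65536
```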
Other Networks
  • A global combining/broadcast tree for collective operations
  • A Gigabit Ethernet network for connection to other systems, such as hosts and file systems.
  • A global barrier and interrupt network
  • A Gigabit Ethernet-to-JTAG network for machine control
Node Architecture
  • IBM PowerPC embedded CMOS processors, embedded DRAM, and system-on-a-chip techniques are used.
  • 11.1-mm-square die size, allowing a very high density of processing.
  • The ASIC uses IBM CMOS CU-11 130 nm technology.
  • 700 MHz processor speed, close to memory speed.
  • Two processors per node.
  • The second processor is intended primarily for handling message-passing operations.
First Level Packaging
  • Dimensions: 32 mm x 25 mm
  • 474 pins:
    • 328 signals for the memory interface
    • A bit-serial torus bus
    • A 3-port double-bit-wide bus
    • 4 global OR signals for fast asynchronous barriers
Design Philosophy
  • Key: determine the parameters in order, from the high-level package down to chip pin assignment:
    • Interconnection networks
    • Compute cards
    • Bus widths
    • # pins, # ports
    • Routing and pin assignment
    • Card connectors, dimensions
Interconnection Networks
  • Cables are bigger, costlier, and less reliable than traces, so we want to minimize the number of cables.
  • A 3-dimensional torus was chosen as the main BG/L network, with each node connected to its 6 neighbors.
  • Maximize the number of nodes connected via circuit card(s) only.
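
The 6-neighbor connectivity of a 3-D torus can be sketched as follows (a minimal illustration of the topology, not BG/L's actual routing code):

```python
def torus_neighbors(x, y, z, dims=(8, 8, 8)):
    """Return the six neighbors of node (x, y, z) in a 3-D torus.

    Wraparound (modulo) links close each dimension, so every node,
    including those on the faces, has exactly six neighbors.
    """
    dx, dy, dz = dims
    return [
        ((x + 1) % dx, y, z), ((x - 1) % dx, y, z),
        (x, (y + 1) % dy, z), (x, (y - 1) % dy, z),
        (x, y, (z + 1) % dz), (x, y, (z - 1) % dz),
    ]

# A corner node wraps around to the opposite faces:
print(torus_neighbors(0, 0, 0))
```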
Interconnection Networks
  • A BG/L midplane has 8 x 8 x 8 = 512 nodes.
  • Fraction of connections that must be cables:

(cable connections) / (all connections)
= (6 faces x 8 x 8 nodes) / (6 neighbors x 8 x 8 x 8 nodes)
= 1 / 8
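
The 1/8 figure can be checked by exhaustively counting, over one midplane, the torus links that wrap around a dimension and therefore must leave the circuit cards (a small verification script, not from the paper):

```python
from fractions import Fraction

N = 8  # a midplane is an 8 x 8 x 8 torus
total_links = 6 * N**3  # 6 directed links per node

# A torus link needs a cable when it wraps around a dimension,
# i.e. when the node sits on one of the 6 faces of the 8x8x8 cube
# and the step points off that face.
cable_links = sum(
    1
    for x in range(N) for y in range(N) for z in range(N)
    for step in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                 (0, -1, 0), (0, 0, 1), (0, 0, -1))
    if any(c + s not in range(N) for c, s in zip((x, y, z), step))
)

print(Fraction(cable_links, total_links))  # 1/8
```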

Compute Card
  • Determined by a trade-off among space, function, and cost.
  • The fewest possible compute ASICs per card gives the lowest cost for test, rework, and replacement.
  • Two ASICs per card are more space-efficient because they share SDRAM.
Bus Widths
  • The bus width of the torus network was decided primarily by the number of cables that could be physically connected to a midplane.
  • Collective network and interrupt bus widths and topology were determined by the compute card form factor.
# Pins and # Ports
  • The number of pins per ASIC is determined by the chosen collective network and interrupt bus widths plus the number of ports escaping each ASIC.
  • The number of collective ports per ASIC and between card connectors was a trade-off between collective network latency and system form factor.
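
The latency side of that trade-off can be illustrated with a tree-depth calculation. Assuming, for illustration only, that each ASIC dedicates one collective port to its parent and the rest to children (the actual BG/L tree wiring is more constrained), more ports per ASIC give a shallower tree but cost more connector pins:

```python
def tree_depth(n, ports):
    """Depth of a collective tree over n nodes when each ASIC has
    one uplink and (ports - 1) downlinks -- an illustrative model,
    not BG/L's actual tree construction."""
    fanout = ports - 1
    depth, capacity = 0, 1
    while capacity < n:   # grow the tree one level at a time
        capacity *= fanout
        depth += 1
    return depth

# Depth over a 512-node midplane for a few port counts:
for ports in (3, 4, 5):
    print(ports, tree_depth(512, ports))
```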
Final Choices
  • 3 collective ports per ASIC
  • 2 bidirectional bits per collective port
  • 4 bidirectional global interrupt bits per interrupt bus
  • 32mmx25mm package
  • Other factors (compute card form factor, widths of various buses, …) were chosen to yield the maximal density of ASICs per rack.
Design Philosophy
  • Next to determine:
    • Circuit card connectors
    • Card cross section
    • Card wiring
  • Objectives
    • Compactness
    • Low cost
    • Electrical signaling quality
Card-to-Card Connectors
  • Differential: because all high-speed buses are differential.
  • Two differential signal pairs per column of pins:
    • Signal buses spread out horizontally across nearly the entire width of each connector.
    • Fewer layers to escape, fewer crossings.
  • Final choice: Metral 4000 connector
Circuit Card Cross Sections
  • Fundamental requirement: high electrical signaling quality
  • Alternating signal and ground layers
  • 14 layers in all cards except the midplane, which has 18.
  • The node card requires additional power layers to distribute the 1.5 V core voltage to the compute cards.
Circuit Card Cross Sections
  • Layers carrying long-distance nets need low resistive loss:
    • Wide (190 um to 215 um) 1.0-ounce copper traces.
  • Other layers minimize card thickness:
    • Narrow (100 um) 0.5-ounce traces.
  • Card dielectric: low-cost FR4.
    • Sufficient for signaling at 1.4 Gb/s.
Card Sizes
  • Determined by a combination of manufacturability and system form-factor considerations.
  • Node cards are near the maximum card size obtainable from the industry-standard low-cost 0.46 m x 0.61 m panel.
  • The midplane is confined to the largest panel size that can still be manufactured by multiple card vendors.
Card Wiring
  • Goal: minimize card layers (and thus card cost).
  • Routing order:
    • 3-D torus network (the most regular and numerous nets) on cards.
    • Pin assignment for the torus network chosen to minimize net signal crossings.
Card Wiring
  • Routing order (cont’d)
    • Global collective network & interrupt bus
      • Exact logical structures determined to minimize # layers
    • Layout of the 16-byte-wide SDRAM interface
      • Optimize package escape and # routing layers
    • ASIC pin assignment
    • High-speed clocks
    • Low-speed nets
References
  • “Overview of the Blue Gene/L system architecture”, IBM J. Res. & Dev., Vol. 49, No. 2/3, March/May 2005
  • “Packaging the Blue Gene/L supercomputer”, IBM J. Res. & Dev., Vol. 49, No. 2/3, March/May 2005
  • “Blue Gene/L torus interconnection network”, IBM J. Res. & Dev., Vol. 49, No. 2/3, March/May 2005