The Blue Gene Experience - PowerPoint PPT Presentation


Presentation Transcript

The Blue Gene Experience

Manish Gupta

IBM T. J. Watson Research Center

Yorktown Heights, NY

Blue Gene/L (2005)

136.8 Teraflop/s on LINPACK (64K processors)

Blue Gene/L

  • System: 64 racks (64x32x32); 180/360 TF/s, 32 TB
  • Rack: 32 node cards; 2.8/5.6 TF/s, 512 GB
  • Node Card: 32 chips (4x4x2), 16 compute cards, 0-2 I/O cards; 90/180 GF/s, 16 GB
  • Compute Card: 2 chips (1x2x1); 5.6/11.2 GF/s, 1.0 GB
  • Chip: 2 processors; 2.8/5.6 GF/s, 4 MB
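The per-level figures can be cross-checked by scaling the per-chip numbers up the packaging hierarchy. The sketch below is an illustration, not BG/L software; it assumes the x/y pairs are the peaks with one versus both processors per chip computing, and 512 MB of memory per chip.

```python
# Illustrative cross-check of the Blue Gene/L packaging hierarchy.
# Assumption: x/y = peak with one vs. both processors per chip computing.

chip_gflops = [2.8, 5.6]   # GF/s per chip (2 processors at 700 MHz)
chip_mem_gb = 0.5          # 512 MB per chip (assumed)

levels = [
    ("Compute Card", 2),   # 2 chips per compute card
    ("Node Card", 16),     # 16 compute cards per node card
    ("Rack", 32),          # 32 node cards per rack
    ("System", 64),        # 64 racks in the full system
]

gflops, mem = list(chip_gflops), chip_mem_gb
for name, factor in levels:
    gflops = [g * factor for g in gflops]
    mem *= factor
    print(f"{name}: {gflops[0]:g}/{gflops[1]:g} GF/s, {mem:g} GB")
```

Running this reproduces the slide's rounded figures: 5.6/11.2 GF/s and 1 GB per compute card, 89.6/179.2 GF/s and 16 GB per node card, up to 183.5/367 TF/s and 32 TB (32,768 GB) for the full system.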

Blue Gene/L Compute ASIC
  • Low power processors
  • Chip-level integration
  • Powerful networks
Blue Gene/L Networks

3 Dimensional Torus

  • Interconnects all compute nodes (65,536)
  • Virtual cut-through hardware routing
  • 1.4 Gb/s on all 12 node links (2.1 GB/s per node)
  • 1 µs latency between nearest neighbors, 5 µs to the farthest
  • Communications backbone for computations
  • 0.7/1.4 TB/s bisection bandwidth, 68 TB/s total bandwidth
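The torus bandwidth figures follow from the per-link rate. The back-of-the-envelope check below is an interpretation, not from the slides: it assumes 1.4 Gb/s per link direction, 6 outgoing plus 6 incoming links per node, and reads the 0.7/1.4 TB/s pair as the bisection obtained by cutting the long (64-node) dimension versus a short (32-node) dimension of the 64x32x32 torus.

```python
# Back-of-the-envelope check of the torus figures (assumptions in the text).
link_GBps = 1.4 / 8          # 1.4 Gb/s per link direction = 0.175 GB/s
nodes = 64 * 32 * 32         # 65,536 compute nodes in the full system

# Per-node bandwidth: 12 link directions (6 out, 6 in) at 0.175 GB/s each
per_node = 12 * link_GBps
print(f"{per_node:.1f} GB/s per node")

# Aggregate bandwidth: 6 outgoing links per node, each counted once
total_TBps = nodes * 6 * link_GBps / 1000
print(f"{total_TBps:.1f} TB/s total")

# Bisection: a cut crosses 2 planes of links (the second is the torus
# wrap-around), both directions counted.
bisect_long = 2 * (32 * 32) * 2 * link_GBps / 1000   # cut the 64-node axis
bisect_short = 2 * (64 * 32) * 2 * link_GBps / 1000  # cut a 32-node axis
print(f"{bisect_long:.1f}/{bisect_short:.1f} TB/s bisection")
```

This reproduces 2.1 GB/s per node, roughly 68.8 TB/s aggregate, and 0.7/1.4 TB/s for the two cut directions, matching the rounded slide figures.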

Global Collective

  • One-to-all broadcast functionality
  • Reduction operations functionality
  • 2.8 Gb/s of bandwidth per link
  • Latency of one way traversal 2.5 µs
  • Interconnects all compute and I/O nodes (1024)

Low Latency Global Barrier and Interrupt

  • Latency of round trip 1.3 µs

Ethernet

  • Incorporated into every node ASIC
  • Active in the I/O nodes (1:8-64)
  • All external communication (file I/O, control, user interaction, etc.)

Control Network

RAS (Reliability, Availability, Serviceability)
  • System designed for reliability from top to bottom
    • System issues
      • Redundant bulk supplies, power converters, fans, DRAM bits, cable bits
      • Extensive data logging (voltage, temp, recoverable errors … ) for failure forecasting
      • Nearly no single points of failure
    • Chip design
      • ECC on all SRAMs
      • All dataflow outside processors is protected by error-detection mechanisms
      • Access to all state via noninvasive back door
    • Low power, simple design leads to higher reliability
    • All interconnects have multiple error-detection and error-correction mechanisms
      • Virtually zero escape probability for link errors
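To illustrate the kind of protection the ECC bullets refer to, here is a toy Hamming(7,4) single-error-correcting code, the textbook building block behind SEC-DED memory ECC. This is purely illustrative and not the actual BG/L ECC logic: one flipped bit is located by the parity syndrome and flipped back.

```python
# Toy Hamming(7,4) SEC code (illustration only, not BG/L's actual ECC).

def hamming74_encode(d):
    """Encode 4 data bits [d1,d2,d3,d4] into 7 bits; parity at positions 1,2,4."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4              # covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4              # covers positions 3, 6, 7
    p3 = d2 ^ d3 ^ d4              # covers positions 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(c):
    """Return (corrected codeword, 1-based error position or 0 if clean)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]     # checks positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]     # checks positions 2, 3, 6, 7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]     # checks positions 4, 5, 6, 7
    pos = s1 + 2 * s2 + 4 * s3         # syndrome is the error position
    if pos:
        c[pos - 1] ^= 1                # flip the faulty bit back
    return c, pos

code = hamming74_encode([1, 0, 1, 1])
corrupted = list(code)
corrupted[4] ^= 1                      # a single-bit upset at position 5
fixed, pos = hamming74_correct(corrupted)
print(pos, fixed == code)              # prints: 5 True
```

A real SEC-DED code adds one more overall parity bit so that double-bit errors are detected (though not corrected) rather than silently miscorrected.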
Blue Gene/L System Architecture

[Architecture diagram] The compute nodes are grouped into psets (Pset 0 through Pset 1023), each pairing one I/O node with its compute nodes (C-Node 0 through C-Node 63). Compute nodes run application processes on the Compute Node Kernel (CNK) and communicate with each other over the torus; each I/O node runs Linux with ciod and a file-system client and reaches its pset over the tree network. A functional Gigabit Ethernet connects the I/O nodes to the file servers and front-end nodes. The service node, running CMCS, DB2, and LoadLeveler, together with a system console, manages the machine over a separate control Gigabit Ethernet, reaching node hardware through the IDo chip via JTAG and I2C.
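The pset grouping above amounts to a simple arithmetic mapping from a compute node's rank to the I/O node that serves it. The sketch below assumes the 1:64 compute-to-I/O ratio shown in the diagram (1,024 psets of 64 C-Nodes each); the function name is illustrative, not from the BG/L software stack.

```python
# Hypothetical sketch of the pset mapping (1:64 ratio assumed from the diagram).

COMPUTE_NODES = 65536
PSETS = 1024
NODES_PER_PSET = COMPUTE_NODES // PSETS   # 64 C-Nodes per pset

def pset_of(rank):
    """Map a compute-node rank to (pset index, C-Node index within the pset).

    The pset index also identifies the I/O node that handles this node's
    file I/O and control traffic over the tree network.
    """
    return rank // NODES_PER_PSET, rank % NODES_PER_PSET

print(pset_of(0))        # prints: (0, 0)     first C-Node of Pset 0
print(pset_of(65535))    # prints: (1023, 63) last C-Node of Pset 1023
```

Because the ratio is configurable (the Ethernet slide quotes 1:8 to 1:64), a real configuration would read `NODES_PER_PSET` from the machine partition rather than hard-coding it.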