Configuring a Large-Scale GALS System



M.M. Khan*, J. Navaridas†, L.A. Plana*, M. Luján*, J.V. Woods*, J. Miguel-Alonso† and S.B. Furber*

*School of Computer Science, The University of Manchester, UK

†University of The Basque Country, Spain

SpiNNaker
  • Objectives
    • High-performance
    • Robust
    • Low-power
SpiNNaker CMP
  • System RAM
  • Boot ROM
  • MC Router
  • Sys. Controller
  • Ethernet
  • SDRAM
  • 20 Proc. Nodes
Processing Node
  • ARM968E-S
  • Comm. Ctlr.
  • Interrupt Ctlr.
  • DMA Ctlr.
  • Timer
  • TCM (100K)
Communication Network
  • MC Router
  • Packets
    • MC
    • P2P
    • NN
  • 1Gb/s inter-chip
  • 6Gb/s per Node
  • Six two-way inter-chip links

*L.A. Plana et al. An On-Chip and Inter-Chip Communications Network for the SpiNNaker Massively-Parallel Neural Net Simulator. In Proc. Second ACM/IEEE International Symposium on Networks-on-Chip (NoCS 2008), pages 215–216, 2008.

Performance
  • 64K CMPs
  • > 1 million ARM968 processors
  • 256 tera IPS computing power
  • >8 TB memory
  • 6 Gb/s/Node Comm. NoC (spike channel)
  • 1 Gb/s System NoC (synaptic channel)
  • 10^9 neurons in real time
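
The headline figures above can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope reading of the slide: it assumes all 20 processing nodes per chip count towards the totals, reads the neuron count as 10^9, and treats 8 TB as 8 TiB. The variable names are illustrative, not from the presentation.

```python
# Back-of-the-envelope check of the slide's headline figures.
CHIPS = 64 * 1024                 # 64K CMPs
NODES_PER_CHIP = 20               # processing nodes per CMP
TOTAL_IPS = 256e12                # 256 tera instructions per second
TOTAL_MEMORY_BYTES = 8 * 2**40    # "> 8 TB", read here as 8 TiB (assumption)
NEURONS = 10**9                   # neuron count read as 10^9 (assumption)

processors = CHIPS * NODES_PER_CHIP
mips_per_processor = TOTAL_IPS / processors / 1e6
memory_per_chip_mib = TOTAL_MEMORY_BYTES / CHIPS / 2**20
neurons_per_processor = NEURONS / processors

print(f"processors:            {processors:,}")
print(f"MIPS per processor:    {mips_per_processor:.1f}")
print(f"memory per chip (MiB): {memory_per_chip_mib:.0f}")
print(f"neurons per processor: {neurons_per_processor:.0f}")
```

Under these assumptions the totals correspond to roughly 1.3 million processors at a little under 200 MIPS each, 128 MiB of memory per chip, and several hundred neurons per processor.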
Fault-tolerance
  • Redundancy
  • Fault-detection and Isolation
  • Fault-recovery
  • Min. single-point-of-failure
  • Run-time configuration
  • Run-time recovery
  • Run-time application loading
Low-power
  • Hardware
    • Asynchronous Communication
    • Low-power ARM968
  • Software
    • Asynchronous Event-Driven Model
Standard Application Model
  • Sleepy processors
  • Event-driven application
  • No scheduler
  • No software threads
  • Only ISRs
  • Driven by Interrupts
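
The application model above (processors asleep by default, no scheduler or threads, all work done in interrupt service routines) can be illustrated with a toy simulation. This is a minimal sketch, not SpiNNaker code: `SleepyProcessor`, the interrupt names and the handlers are all hypothetical.

```python
from collections import deque


class SleepyProcessor:
    """Toy model: no scheduler, no threads, only interrupt handlers."""

    def __init__(self):
        self.handlers = {}       # interrupt name -> ISR callable
        self.pending = deque()   # interrupts waiting to be serviced
        self.asleep = True       # the processor starts (and ends) asleep
        self.log = []            # record of what the ISRs did

    def register_isr(self, irq, handler):
        self.handlers[irq] = handler

    def raise_interrupt(self, irq, payload=None):
        self.pending.append((irq, payload))

    def run_until_idle(self):
        # Wake up, service every pending interrupt in arrival order,
        # then go back to sleep once nothing is left to do.
        while self.pending:
            self.asleep = False
            irq, payload = self.pending.popleft()
            self.handlers[irq](self, payload)
        self.asleep = True


def on_packet(cpu, payload):
    cpu.log.append(("packet", payload))


def on_timer(cpu, payload):
    cpu.log.append(("timer", payload))


cpu = SleepyProcessor()
cpu.register_isr("packet", on_packet)
cpu.register_isr("timer", on_timer)
cpu.raise_interrupt("packet", 0x2A)
cpu.raise_interrupt("timer", 1)
cpu.run_until_idle()
print(cpu.log, cpu.asleep)
```

The point of the design is that there is no main loop doing work: between events the processor consumes no cycles, which is where the low-power claim on the software side comes from.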
Configuration Process-I

  • Min Boot-ROM code
  • POST+chip components initialization
  • Batch mode

[Flowchart: POST → Load Boot code in TCM → Select Monitor Proc. → Configure Interrupts → Monitor? (yes: Configure Chip; no: Go to Sleep)]
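
The per-chip boot flow in this slide can be sketched as a small function. The slide does not say how the monitor processor is chosen, so the sketch assumes the first node to pass POST becomes the monitor; that rule, the state labels, and the function name `boot_chip` are hypothetical.

```python
def boot_chip(post_results):
    """Walk each processing node through the boot flow.

    post_results: one bool per node, True if the node passed POST.
    Returns (monitor_index, states), one state string per node.
    """
    monitor = None
    states = []
    for node, passed in enumerate(post_results):
        if not passed:
            # Assumption: a node that fails POST takes no further part.
            states.append("failed POST")
            continue
        # Loading boot code into TCM and configuring interrupts would
        # happen here; neither is modelled in this sketch.
        if monitor is None:
            # Assumption: the first healthy node is selected as monitor.
            monitor = node
            states.append("monitor: configure chip")
        else:
            states.append("sleep")
    return monitor, states


monitor, states = boot_chip([False, True, True, True])
print(monitor, states)
```

Only one node per chip ends up configuring the chip; every other healthy node goes straight to sleep and waits for an interrupt.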

Configuration Process-II

  • Event-driven Model
  • Real-time Configuration
  • Processors on Sleep

[Flowchart boxes: Recovery; Host System Comm.; Host Chip? (yes / no); Packet Comm.; Frame + Packet Comm.; Assign (x, y); Assign (0, 0); Conf. Router (twice); Status to Host Chip; Acc. Status to Host]
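
One way to picture the coordinate-assignment step is as a wave spreading out from the host chip. The sketch below assumes the host-connected chip takes (0, 0) and every other chip takes its (x, y) from the first configured neighbour that reaches it. The toroidal layout and the six link directions in `LINK_OFFSETS` are assumptions chosen for illustration, not details taken from the slides.

```python
from collections import deque

# Hypothetical directions for the six inter-chip links.
LINK_OFFSETS = [(1, 0), (1, 1), (0, 1), (-1, 0), (-1, -1), (0, -1)]


def assign_coordinates(width, height):
    """Spread coordinates from the host chip over a width x height torus.

    Returns a dict mapping each chip's (x, y) to the number of hops
    from the host chip at which it was configured.
    """
    hops = {(0, 0): 0}           # the host chip assigns itself (0, 0)
    frontier = deque([(0, 0)])
    while frontier:
        x, y = frontier.popleft()
        for dx, dy in LINK_OFFSETS:
            neighbour = ((x + dx) % width, (y + dy) % height)
            if neighbour not in hops:
                # First contact: the neighbour derives its own (x, y)
                # from the sender's coordinates and the link used.
                hops[neighbour] = hops[(x, y)] + 1
                frontier.append(neighbour)
    return hops


hops = assign_coordinates(4, 4)
print(len(hops), max(hops.values()))
```

In a real system each chip would then configure its router and report status back towards the host chip, which accumulates the reports for the host; that part is not modelled here.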

Flood-fill Mechanism

  • Event-driven model
  • Droplets of data block to origin chip(s)
  • A pipelined wave of data from origin(s) to other chips

[Animations: 1 Ethernet Connection; 2 Ethernet Connections. Animations from http://physics-animations.com/Physics/English/int_ref.htm#Wlb]

Flood-fill Mechanism
  • Various mechanisms
    • Broadcast
    • Forward to 5 chips
    • Forward to 3 chips
    • Forward to 2 chips
  • Performance vs. robustness
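
The trade-off between forwarding variants can be explored with a toy flood-fill simulation. This is a sketch under stated assumptions, not the authors' SystemC model: chips sit on a small torus, each chip forwards a block once on first receipt, "forward to k chips" is modelled as using the first k entries of a hypothetical `LINK_OFFSETS` table, and one wave corresponds to one time step.

```python
# Hypothetical directions for the six inter-chip links.
LINK_OFFSETS = [(1, 0), (1, 1), (0, 1), (-1, 0), (-1, -1), (0, -1)]


def flood_fill(width, height, fanout, origins=((0, 0),)):
    """Propagate one data block from the origin chip(s) to every chip.

    fanout: number of links each chip forwards on (6 = broadcast).
    Returns (steps, transmissions, chips_reached).
    """
    received = set(origins)
    wave = list(origins)
    steps = 0
    transmissions = 0
    while wave:
        next_wave = []
        for x, y in wave:
            for dx, dy in LINK_OFFSETS[:fanout]:
                transmissions += 1
                neighbour = ((x + dx) % width, (y + dy) % height)
                if neighbour not in received:
                    received.add(neighbour)
                    next_wave.append(neighbour)
        if next_wave:
            steps += 1           # this wave reached at least one new chip
        wave = next_wave
    return steps, transmissions, len(received)


for fanout in (6, 5, 3, 2):
    print(fanout, flood_fill(4, 4, fanout))
```

In this toy setting, broadcast on a 4 x 4 torus reaches every chip in 2 steps at a cost of 96 transmissions, while forwarding on 2 links needs 6 steps but only 32 transmissions. Fewer forwards also means fewer redundant copies, so a lost packet or dead link is more likely to leave a chip without the block. Passing two entries in `origins` gives a rough analogue of the two-Ethernet-connection case.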
Evaluation
  • SystemC system-level model
  • Cycle-accurate
  • Instruction accurate
  • 129,706 cycles for Configuration Process-I