



Configuring a Large-Scale GALS System

M.M. Khan*, J. Navaridas†, L.A. Plana*, M. Luján*, J.V. Woods*, J. Miguel-Alonso† and S.B. Furber*

*School of Computer Science, The University of Manchester, UK

†University of The Basque Country, Spain


SpiNNaker

  • Objectives

    • High-performance

    • Robust

    • Low-power


SpiNNaker CMP

  • System RAM

  • Boot ROM

  • MC Router

  • Sys. Controller

  • Ethernet

  • SDRAM

  • 20 Proc. Nodes


Processing Node

  • ARM968E-S

  • Comm. Ctlr.

  • Interrupt Ctlr.

  • DMA Ctlr.

  • Timer

  • TCM (100K)


Communication Network

  • MC Router

  • Packets

    • MC

    • P2P

    • NN

  • 1 Gb/s inter-chip

  • 6 Gb/s per node

  • Six two-way inter-chip links

*L.A. Plana et al. An On-Chip and Inter-Chip Communications Network for the SpiNNaker Massively-Parallel Neural Net Simulator. In Proc. Second ACM/IEEE International Symposium on Networks-on-Chip (NoCS 2008), pages 215-216, 2008.


Performance

  • 64K CMPs

  • > 1 million ARM968 processors

  • 256 tera IPS computing power

  • >8 TB memory

  • 6 Gb/s/Node Comm. NoC (spike channel)

  • 1 Gb/s System NoC (synaptic channel)

  • 10^9 neurons in real time
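
The headline figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes binary prefixes (64K = 65,536 chips, 1 TB = 2^40 bytes) and 20 processors per CMP, as listed on the SpiNNaker CMP slide. The per-processor and per-chip values it derives are implied by the slide's totals rather than quoted from it.

```python
CMPS = 64 * 1024            # 64K chips (assumed binary K)
PROCS_PER_CMP = 20          # from the SpiNNaker CMP slide
TOTAL_IPS = 256e12          # 256 tera IPS
TOTAL_MEMORY = 8 * 2**40    # 8 TB lower bound (assumed binary TB)
NEURONS = 10**9             # real-time neuron target

processors = CMPS * PROCS_PER_CMP
ips_per_processor = TOTAL_IPS / processors
memory_per_cmp_mib = TOTAL_MEMORY / CMPS / 2**20
neurons_per_processor = NEURONS / processors

print(f"processors:            {processors:,}")                    # 1,310,720
print(f"MIPS per processor:    {ips_per_processor / 1e6:.1f}")     # 195.3
print(f"memory per CMP (MiB):  {memory_per_cmp_mib:.0f}")          # 128
print(f"neurons per processor: {neurons_per_processor:.0f}")       # 763
```

Under these assumptions the totals are mutually consistent: 64K chips of 20 processors give about 1.3 million ARM968 cores, each handling on the order of several hundred neurons.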


Fault-tolerance

  • Redundancy

  • Fault-detection and Isolation

  • Fault-recovery

  • Minimal single points of failure

  • Run-time configuration

  • Run-time recovery

  • Run-time application loading


Low-power

  • Hardware

    • Asynchronous Communication

    • Low-power ARM968

  • Software

    • Asynchronous Event-Driven Model


Standard Application Model

  • Sleepy processors

  • Event-driven application

  • No scheduler

  • No software threads

  • Only ISRs

  • Driven by Interrupts
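
The model on this slide can be illustrated with a toy simulation: a processor that sleeps until an interrupt arrives, runs the matching interrupt service routine (ISR), and goes back to sleep. This is only a sketch in Python, not SpiNNaker code. The class name, the interrupt source names ("timer", "packet") and the log are hypothetical, chosen for illustration.

```python
from collections import deque


class SleepyProcessor:
    """Toy model: no scheduler, no threads, only interrupt service routines."""

    def __init__(self):
        self.isrs = {}            # interrupt source -> handler (hypothetical names)
        self.pending = deque()    # interrupts raised but not yet serviced
        self.asleep = True
        self.log = []

    def register_isr(self, source, handler):
        self.isrs[source] = handler

    def raise_interrupt(self, source, payload=None):
        self.pending.append((source, payload))

    def run(self):
        """Service every pending interrupt in arrival order, then sleep again."""
        while self.pending:
            self.asleep = False
            source, payload = self.pending.popleft()
            self.isrs[source](payload)
        self.asleep = True        # stands in for a wait-for-interrupt state


cpu = SleepyProcessor()
cpu.register_isr("timer", lambda _: cpu.log.append("timer tick"))
cpu.register_isr("packet", lambda key: cpu.log.append(f"packet {key:#x}"))
cpu.raise_interrupt("packet", 0x2A)
cpu.raise_interrupt("timer")
cpu.run()
print(cpu.log)   # ['packet 0x2a', 'timer tick']
```

All application work happens inside the handlers; when nothing is pending the processor does nothing, which is what makes the model attractive for low power.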


Configuration Process-I

  • Minimal Boot-ROM code

  • POST + chip component initialization

  • Batch mode

Flowchart: POST → Load boot code in TCM → Select Monitor Proc. → Configure interrupts → Monitor? (yes: Configure Chip; no: Go to Sleep)
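
The flow on this slide can be sketched as a small simulation of one chip. The sketch below is in Python and rests on assumptions not stated in the slide: that a processor failing POST takes no further part, and that the monitor is chosen first-come among the processors that pass. The function name `configure_chip` and the state strings are hypothetical.

```python
def configure_chip(post_results):
    """Walk each processor through the Configuration Process-I flow.

    post_results: one boolean per processor, True if POST passed.
    Returns (monitor_id, states), where states maps processor id to a label.
    """
    monitor = None
    states = {}
    for proc_id, passed in enumerate(post_results):
        if not passed:
            states[proc_id] = "failed POST"   # assumed: isolated from here on
            continue
        # Loading boot code into TCM and configuring interrupts would go here.
        if monitor is None:                   # assumed first-come arbitration
            monitor = proc_id
            states[proc_id] = "monitor: configures chip"
        else:
            states[proc_id] = "asleep"
    return monitor, states


# 20 processing nodes; processor 0 fails its self-test in this example.
monitor, states = configure_chip([False] + [True] * 19)
print(monitor)        # 1
print(states[0])      # failed POST
print(states[2])      # asleep
```

Because any healthy processor can take the monitor role, a single faulty core does not stop the chip from configuring itself.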


Configuration Process-II

  • Event-driven model

  • Real-time configuration

  • Processors on sleep

Flowchart (as recoverable from the slide): Host Chip?

  • yes: Frame + Packet Comm. → Assign (0, 0) → Conf. Router → Acc. Status to Host

  • no: Packet Comm. → Assign (x, y) → Conf. Router → Status to Host Chip

Other flowchart boxes: Recovery, Host System Comm.
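
One way to picture the coordinate-assignment step: the host chip takes (0, 0) and every other chip derives its (x, y) from a neighbour that already has coordinates. The Python sketch below is an illustration under assumptions, not the authors' procedure. It assumes a toroidal mesh of known size, assumes the (dx, dy) offset associated with each of the six links, and uses a hypothetical function name.

```python
from collections import deque

# Assumed (dx, dy) offset of each of the six inter-chip links.
LINK_OFFSETS = [(1, 0), (1, 1), (0, 1), (-1, 0), (-1, -1), (0, -1)]


def assign_coordinates(width, height, host_pos):
    """Return {physical position: assigned (x, y)} for a width x height torus.

    The host chip is assigned (0, 0); each other chip takes the coordinates
    of the first neighbour it hears from, adjusted by the link offset.
    """
    assigned = {host_pos: (0, 0)}
    queue = deque([host_pos])
    while queue:
        px, py = queue.popleft()
        ax, ay = assigned[(px, py)]
        for dx, dy in LINK_OFFSETS:
            neighbour = ((px + dx) % width, (py + dy) % height)
            if neighbour not in assigned:
                assigned[neighbour] = ((ax + dx) % width, (ay + dy) % height)
                queue.append(neighbour)
    return assigned


assigned = assign_coordinates(4, 4, host_pos=(2, 1))
print(assigned[(2, 1)])   # (0, 0)
print(assigned[(3, 2)])   # (1, 1)
print(len(assigned))      # 16
```

The number of entries in `assigned` plays the role of the accumulated status that the host chip would report back to the host system.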


Flood-fill Mechanism

  • Event-driven model

  • Droplets of a data block are sent to the origin chip(s)

  • A pipelined wave of data spreads from the origin(s) to the other chips

Animations: 1 Ethernet connection; 2 Ethernet connections (animations from http://physics-animations.com/Physics/English/int_ref.htm#Wlb)


Flood-fill Mechanism

  • Various mechanisms

    • Broadcast

    • Forward to 5 chips

    • Forward to 3 chips

    • Forward to 2 chips

  • Performance vs. robustness
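
The trade-off between the forwarding mechanisms can be explored with a small simulation. The Python sketch below is a toy model under stated assumptions: a toroidal mesh, six links per chip with assumed (dx, dy) offsets, and a hypothetical policy in which a chip forwards a block once, on first receipt, over the first `fanout` links of a fixed list. Broadcast corresponds to a fan-out of 6. It does not reproduce the authors' mechanisms or results.

```python
from collections import deque

# Assumed link offsets; the order decides which links a limited fan-out uses.
LINKS = [(1, 0), (0, 1), (1, 1), (-1, 0), (0, -1), (-1, -1)]


def flood_fill(width, height, fanout, origins=((0, 0),)):
    """Spread one data block over a width x height torus.

    Returns (chips_reached, hops_to_last_chip, packets_sent).
    """
    arrival = {origin: 0 for origin in origins}   # chip -> hops at first receipt
    queue = deque(origins)
    packets = 0
    while queue:
        x, y = queue.popleft()
        for dx, dy in LINKS[:fanout]:
            packets += 1
            nxt = ((x + dx) % width, (y + dy) % height)
            if nxt not in arrival:                # duplicate copies are dropped
                arrival[nxt] = arrival[(x, y)] + 1
                queue.append(nxt)
    return len(arrival), max(arrival.values()), packets


for fanout in (6, 5, 3, 2):
    reached, hops, packets = flood_fill(8, 8, fanout)
    print(f"forward to {fanout}: {reached} chips, {hops} hops, {packets} packets")

# Two origins, as with two Ethernet connections.
print(flood_fill(8, 8, 2, origins=((0, 0), (4, 4))))
```

In this model a larger fan-out sends more packets, most of them redundant copies, but reaches the last chip in fewer hops; the redundant copies are what would let a chip still receive the block when a link or neighbour fails. A smaller fan-out is cheaper but slower and less tolerant of faults.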


Evaluation

  • SystemC system-level model

  • Cycle-accurate

  • Instruction accurate

  • 129,706 cycles for Configuration Process-I




