
Configuring a Large-Scale GALS System

M.M. Khan*, J. Navaridas†, L.A. Plana*, M. Luján*, J.V. Woods*, J. Miguel-Alonso† and S.B. Furber*
*School of Computer Science, The University of Manchester, UK
†University of the Basque Country, Spain


Presentation Transcript


  1. Configuring a Large-Scale GALS System M.M. Khan*, J. Navaridas†, L.A. Plana*, M. Luján*, J.V. Woods*, J. Miguel-Alonso† and S.B. Furber* *School of Computer Science, The University of Manchester, UK †University of the Basque Country, Spain

  2. SpiNNaker • Objectives • High-performance • Robust • Low-power

  3. SpiNNaker CMP • System RAM • Boot ROM • MC Router • Sys. Controller • Ethernet • SDRAM • 20 Proc. Nodes

  4. Processing Node • ARM968E-S • Comm. Ctlr. • Interrupt Ctlr. • DMA Ctlr. • Timer • TCM (100K)

  5. Communication Network • MC Router • Packets: MC (multicast), P2P (point-to-point), NN (nearest-neighbour) • 1 Gb/s inter-chip • 6 Gb/s per Node • Six two-way inter-chip links
     *L.A. Plana et al. An On-Chip and Inter-Chip Communications Network for the SpiNNaker Massively-Parallel Neural Net Simulator. In Proc. Second ACM/IEEE International Symposium on Networks-on-Chip (NoCS 2008), pages 215-216, 2008.

  6. Performance • 64K CMPs • > 1 million ARM968 processors • 256 tera IPS computing power • > 8 TB memory • 6 Gb/s/Node Comm. NoC (spike channel) • 1 Gb/s System NoC (synaptic channel) • 10^9 neurons in real-time
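
The headline figures on this slide can be cross-checked with a little arithmetic. The sketch below is only a back-of-envelope check, not something from the presentation: it takes the 20 processing nodes per CMP from slide 3 and reads "8 TB" as 8 TiB, which is an assumption.

```python
# Back-of-envelope check of the figures on the Performance slide.
CHIPS = 64 * 1024                  # "64K CMPs"
NODES_PER_CHIP = 20                # "20 Proc. Nodes" per CMP (slide 3)
TOTAL_IPS = 256e12                 # "256 tera IPS"
TOTAL_MEMORY_BYTES = 8 * 2**40     # "> 8 TB", read here as 8 TiB (assumption)

processors = CHIPS * NODES_PER_CHIP
ips_per_processor = TOTAL_IPS / processors
memory_per_chip_mib = TOTAL_MEMORY_BYTES / CHIPS / 2**20

print(processors)                      # 1310720, i.e. "> 1 million"
print(round(ips_per_processor / 1e6))  # 195 (million instructions/s per processor)
print(memory_per_chip_mib)             # 128.0 (MiB of memory implied per chip)
```

Under these assumptions the totals imply roughly 195 MIPS per ARM968 and about 128 MiB of memory per chip.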

  7. Fault-tolerance • Redundancy • Fault-detection and Isolation • Fault-recovery • Min. single-point-of-failure • Run-time configuration • Run-time recovery • Run-time application loading

  8. Low-power • Hardware • Asynchronous Communication • Low-power ARM968 • Software • Asynchronous Event-Driven Model

  9. Standard Application Model • Sleepy processors • Event-driven application • No scheduler • No software threads • Only ISRs • Driven by Interrupts
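
The model on this slide (processors asleep until an interrupt arrives, no scheduler, no threads, only ISRs) can be illustrated with a toy simulation. This is a minimal sketch rather than SpiNNaker code: the class `SleepyProcessor`, the event names, and the handlers are all hypothetical, and interrupt priorities are not modelled.

```python
from collections import deque


class SleepyProcessor:
    """Toy core that only runs handlers in response to events."""

    def __init__(self):
        self.handlers = {}       # event name -> handler (stands in for an ISR)
        self.pending = deque()   # events raised but not yet serviced
        self.asleep = True
        self.log = []

    def on(self, event, handler):
        self.handlers[event] = handler

    def raise_event(self, event, payload=None):
        self.pending.append((event, payload))

    def run_until_idle(self):
        # Wake up, service every pending event in arrival order, then sleep.
        self.asleep = False
        while self.pending:
            event, payload = self.pending.popleft()
            self.handlers[event](self, payload)
        self.asleep = True


def on_packet(cpu, payload):
    cpu.log.append(("packet", payload))
    cpu.raise_event("dma_done", payload)   # a handler may trigger further work


def on_dma_done(cpu, payload):
    cpu.log.append(("dma_done", payload))


def on_timer(cpu, payload):
    cpu.log.append(("timer", payload))


cpu = SleepyProcessor()
cpu.on("packet", on_packet)
cpu.on("dma_done", on_dma_done)
cpu.on("timer", on_timer)

cpu.raise_event("packet", 42)
cpu.raise_event("timer", 1)
cpu.run_until_idle()
print(cpu.log)     # [('packet', 42), ('timer', 1), ('dma_done', 42)]
print(cpu.asleep)  # True
```

There is no loop that polls for work: all application behaviour lives in the handlers, and the processor returns to sleep as soon as the event queue drains.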

  10. Configuration Process-I • Min Boot-ROM code • POST + chip components initialization • Batch mode
      Flowchart: POST → Load Boot code in TCM → Select Monitor Proc. → Configure Interrupts → Monitor? (yes: Configure Chip; no: Go to Sleep)
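
The flow on this slide can be sketched as a small simulation of one chip booting. The slide does not say how the monitor processor is chosen or what happens to a node that fails POST, so both are assumptions here: the first healthy processor to ask becomes the monitor, and a failed node is simply marked disabled. The function name `boot_chip` and the step labels are hypothetical.

```python
def boot_chip(post_results):
    """Simulate Configuration Process-I for one chip.

    post_results: list of bools, one per processor node (True = POST passed).
    Returns (monitor_id, steps) where steps maps processor id -> list of steps.
    """
    monitor_id = None    # stands in for a shared "monitor already chosen" flag
    steps = {}
    for proc_id, passed in enumerate(post_results):
        if not passed:
            steps[proc_id] = ["post", "disabled"]   # assumed fault isolation
            continue
        done = ["post", "load_boot_code_in_tcm"]
        # Assumed arbitration: the first healthy processor becomes the monitor.
        if monitor_id is None:
            monitor_id = proc_id
        done.append("configure_interrupts")
        if proc_id == monitor_id:
            done.append("configure_chip")   # only the monitor configures the chip
        done.append("sleep")                # every processor ends up asleep
        steps[proc_id] = done
    return monitor_id, steps


monitor, steps = boot_chip([False, True, True, True])
print(monitor)    # 1
print(steps[1])   # monitor: includes 'configure_chip'
print(steps[2])   # ordinary node: goes straight to sleep after interrupts
```

Every healthy processor ends in the sleep state, which matches the event-driven model: nothing further happens until an interrupt arrives.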

  11. Configuration Process-II • Event-driven Model • Real-time Configuration • Processors on Sleep
      Flowchart: Recovery → Host System Comm. → Host Chip? (yes: Frame + Packet Comm. → Assign (0, 0) → Conf. Router → Acc. Status to Host; no: Packet Comm. → Assign (x, y) → Conf. Router → Status to Host Chip)
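
One way to picture the "Assign (0, 0)" and "Assign (x, y)" steps is as addresses spreading outward from the host chip over the six inter-chip links. The sketch below is an illustration under stated assumptions, not the authors' algorithm: the link offsets, the wrap-around (toroidal) grid, and the names `LINK_OFFSETS` and `assign_coordinates` are all hypothetical.

```python
from collections import deque

# Assumed directions of the six inter-chip links (E, NE, N, W, SW, S).
LINK_OFFSETS = [(1, 0), (1, 1), (0, 1), (-1, 0), (-1, -1), (0, -1)]


def assign_coordinates(width, height, dead=frozenset()):
    """Spread (x, y) addresses from the host chip across a wrap-around grid.

    Returns a dict mapping each reached chip's physical position to the
    address it learned. Chips listed in `dead` never respond.
    """
    host = (0, 0)
    learned = {host: (0, 0)}     # the host chip assigns itself (0, 0)
    frontier = deque([host])
    while frontier:
        pos = frontier.popleft()
        x, y = learned[pos]
        for dx, dy in LINK_OFFSETS:
            neighbour = ((pos[0] + dx) % width, (pos[1] + dy) % height)
            if neighbour in dead or neighbour in learned:
                continue
            # The neighbour derives its address from the sender's address
            # and the link the packet arrived on.
            learned[neighbour] = ((x + dx) % width, (y + dy) % height)
            frontier.append(neighbour)
    return learned


addresses = assign_coordinates(4, 4)
print(len(addresses))                               # 16
print(all(a == p for p, a in addresses.items()))    # True

with_fault = assign_coordinates(4, 4, dead={(1, 1)})
print(len(with_fault))                              # 15
```

Because each chip has six neighbours, a single dead chip does not stop the remaining chips from receiving an address through another link.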

  12. Flood-fill Mechanism • Event-driven model • Droplets of data block to origin chip(s) • A pipelined wave of data from origin(s) to other chips
      Figures: 1 Ethernet Connection; 2 Ethernet Connections (animations from http://physics-animations.com/Physics/English/int_ref.htm#Wlb)

  13. Flood-fill Mechanism • Various mechanisms • Broadcast • 5 Chips fwd • 3 Chips fwd • 2 Chips fwd • Performance vs. robustness
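
The trade-off between the forwarding variants can be explored with a toy simulation in which each chip, on first receiving a data block, forwards it on a fixed number of its six links. This is a sketch under assumptions the slides do not confirm: the link offsets, the wrap-around grid, and the rule that a variant with fan-out k uses the first k links in a fixed order are all hypothetical, as is the function name `flood_fill`.

```python
# Assumed directions of the six inter-chip links (E, NE, N, W, SW, S).
LINK_OFFSETS = [(1, 0), (1, 1), (0, 1), (-1, 0), (-1, -1), (0, -1)]


def flood_fill(width, height, fanout, origin=(0, 0)):
    """Flood one data block from `origin`; each chip forwards on `fanout` links.

    Returns a dict with the number of wave steps needed, the total packets
    sent, the number of chips reached, and the fewest copies any chip got.
    """
    copies = {origin: 0}     # chip -> copies received from neighbours
    wave = [origin]
    steps = 0
    packets = 0
    while wave:
        next_wave = []
        for x, y in wave:
            for dx, dy in LINK_OFFSETS[:fanout]:
                target = ((x + dx) % width, (y + dy) % height)
                packets += 1
                if target not in copies:
                    copies[target] = 0
                    next_wave.append(target)   # forwards in the next step
                copies[target] += 1
        if next_wave:
            steps += 1
        wave = next_wave
    return {
        "steps": steps,
        "packets": packets,
        "reached": len(copies),
        "min_copies": min(copies.values()),
    }


for fanout in (6, 5, 3, 2):    # broadcast, then 5, 3 and 2 chips forwarded
    print(fanout, flood_fill(8, 8, fanout))
```

In this model a larger fan-out sends more packets but finishes in no more steps and delivers more redundant copies to every chip, which is one way to read "performance vs. robustness".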

  14. Evaluation • SystemC system-level model • Cycle-accurate • Instruction accurate • 129706 cycles for configuration process-I

  15. Evaluation

  16. Evaluation

  17. Conclusions
