
Adaptive Backpressure: Efficient Buffer Management for On-Chip Networks

Concurrent VLSI Architecture Group. Adaptive Backpressure: Efficient Buffer Management for On-Chip Networks. Daniel U. Becker, Nan Jiang, George Michelogiannakis, William J. Dally, Stanford University. ICCD 2012, 9/30/12–10/3/12, Montreal, Canada.


Presentation Transcript


  1. Concurrent VLSI Architecture Group Adaptive Backpressure: Efficient Buffer Management for On-Chip Networks Daniel U. Becker, Nan Jiang, George Michelogiannakis, William J. Dally Stanford University ICCD 2012, 9/30/12–10/3/12, Montreal, Canada

  2. Overview • Input buffer sharing is attractive in NoCs • Improves area and power efficiency • But facilitates spread of congestion • Adaptive Backpressure mitigates performance degradation by avoiding unproductive use of buffer space in the presence of congestion • Avoid downsides of buffer sharing while maintaining benefits in benign case Becker, Jiang, Michelogiannakis, Dally: Adaptive Backpressure

  3. Dynamic Buffer Management • Buffer space is an expensive resource in NoCs • 30-35% of network power (MIT RAW, UT TRIPS) • Dynamic management increases utilization by sharing buffer space among multiple VCs • Optimize use of expensive buffer resources • Decrease incremental cost of VCs • Improved area and power efficiency • 25% more throughput or 34% less power [Nicopoulos’06]

  4. Buffer Monopolization • Blocked flits from congested VC accumulate in buffer • Effective buffer size reduced for other VCs • Performance degradation (latency / throughput) • Congestion spreads across VCs (flows / apps / VMs / …) [figure: shared buffer holding flits of VC 0 and VC 1]
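The monopolization effect on this slide can be illustrated with a toy model. This is only a sketch under assumptions that are not in the presentation: a single 16-slot shared buffer, two VCs that are each offered one flit per cycle, VC 0 permanently blocked downstream, and VC 1 draining one flit per cycle. The function name `simulate` and the parameter `blocked_cap` (a hypothetical per-VC occupancy limit, used here as a stand-in for a quota) are illustrative names, not the authors' mechanism.

```python
def simulate(cycles, slots=16, blocked_cap=None):
    """Toy shared-buffer model: VC 0 never drains, VC 1 drains one flit per cycle.

    Returns (final occupancy of VC 0, number of VC 1 flits accepted).
    blocked_cap is a hypothetical occupancy limit applied to VC 0 only.
    """
    occ = [0, 0]          # flits currently buffered per VC
    accepted_vc1 = 0
    for _ in range(cycles):
        # VC 0 is offered one flit; it is accepted if a slot is free
        # and VC 0 is below its (optional) occupancy limit.
        if sum(occ) < slots and (blocked_cap is None or occ[0] < blocked_cap):
            occ[0] += 1
        # VC 1 is offered one flit; it only needs a free slot.
        if sum(occ) < slots:
            occ[1] += 1
            accepted_vc1 += 1
        # Only VC 1 makes progress downstream.
        if occ[1] > 0:
            occ[1] -= 1
    return occ[0], accepted_vc1


if __name__ == "__main__":
    print(simulate(100))                  # unrestricted sharing
    print(simulate(100, blocked_cap=2))   # VC 0 limited to two slots
```

In this model, unrestricted sharing lets the blocked VC fill every slot, after which VC 1 can no longer accept flits even though its own path is uncongested. Limiting the blocked VC's occupancy leaves the remaining slots available to VC 1.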

  5. Adaptive Backpressure Goal: • Avoid unproductive use of buffer space • But allow sharing when beneficial Approach: • Match arrival and departure rate for each VC by regulating credit availability (backpressure) • Derive quota from credit round trip times

  6. Quota Motivation (1) [timing diagrams: Router 0 and Router 1 over time, showing the credit round-trip time Tcrt,0, a credit stall, and an idle cycle] • Without congestion, full throughput requires Tcrt,0 credits • Insufficient credit supply causes idle cycle downstream
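The credit requirement on this slide follows from a standard property of credit-based flow control: a sender holding fewer credits than the round-trip time in cycles must stall until a credit returns. A minimal sketch, assuming one flit per cycle per credit and no downstream congestion; the function name `link_utilization` is illustrative and not from the presentation.

```python
def link_utilization(credits: int, t_crt: int) -> float:
    """Upper bound on the fraction of cycles in which a flit can be sent.

    credits: number of credits (buffer slots) available to the sender.
    t_crt:   credit round-trip time in cycles (Tcrt,0 on the slide).
    """
    if credits < 0 or t_crt <= 0:
        raise ValueError("credits must be >= 0 and t_crt must be > 0")
    # Each credit can be reused once per round trip, so at most
    # `credits` flits can be sent in any window of t_crt cycles.
    return min(1.0, credits / t_crt)


if __name__ == "__main__":
    for c in (4, 8, 16):
        print(c, link_utilization(c, t_crt=8))
```

With half as many credits as the round-trip time, the link idles half of the time. Credits beyond the round-trip time add no throughput in the uncongested case.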

  7. Quota Motivation (2) [timing diagrams: Router 0 and Router 1 over time, with round trip Tcrt,0+Tstall; labels: congestion stall, queuing stall, credit stall, excess flits, excess drained] • Congestion stall causes unproductive buffer occupancy • Matching stalls avoids unproductive buffer occupancy

  8. Quota Heuristic • Track credit RTT for each output VC • RTT = RTTmin ⇒ set quota to RTTmin • No downstream congestion • Allow one flit in each cycle of RTT interval • RTT > RTTmin ⇒ subtract the difference (RTT − RTTmin) from RTTmin • Each congestion and queuing stall adds to RTT • Allow one credit stall per downstream stall
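Read literally, the heuristic above sets the quota to RTTmin minus the excess round-trip time, that is, 2·RTTmin − RTT. A minimal sketch of that arithmetic follows. The lower bound `floor` (so that a VC can always send at least one flit) and the upper clamp at RTTmin are assumptions added here for safety; the slide does not state them. The name `compute_quota` is illustrative.

```python
def compute_quota(rtt: int, rtt_min: int, floor: int = 1) -> int:
    """Quota for one output VC, derived from a measured credit RTT.

    rtt:     measured credit round-trip time in cycles.
    rtt_min: zero-load round-trip time of the link in cycles.
    floor:   assumed minimum quota (not stated on the slide).
    """
    excess = rtt - rtt_min          # stall cycles observed downstream
    quota = rtt_min - excess        # same as 2 * rtt_min - rtt
    # Clamp: never below the assumed floor, never above rtt_min.
    return max(floor, min(rtt_min, quota))


if __name__ == "__main__":
    for rtt in (8, 11, 30):
        print(rtt, compute_quota(rtt, rtt_min=8))
```

With RTTmin = 8, an uncongested measurement keeps the quota at 8, three stall cycles reduce it to 5, and a heavily congested measurement drives it down to the assumed floor of 1.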

  9. Implementation • Network design determines RTTmin for each link • Track RTT for single in-flight credit per VC • Update quota value upon return • Switch allocator masks all VCs that exceed quota • Simple extension to existing flow control logic • No additional signaling required • < 5% overhead for 16x64b buffer with 4 VCs
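The bookkeeping listed on this slide can be sketched in software as per-VC state at the upstream router. This is a behavioral model under several assumptions that the slide does not spell out: credits for a VC return in the order the flits were sent, "exceeds quota" is taken to mean that the number of outstanding (unreturned) credits has reached the quota, and the quota is clamped to the range 1..RTTmin. The class and method names are hypothetical, and the ordinary check for available buffer credits is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OutputVCState:
    rtt_min: int                       # zero-load credit RTT of this link
    outstanding: int = 0               # flits sent whose credit has not returned
    quota: int = field(init=False)     # current send limit for this VC
    _start: Optional[int] = None       # cycle at which the tracked flit was sent
    _pending: int = 0                  # credit returns left until the tracked one

    def __post_init__(self) -> None:
        self.quota = self.rtt_min      # start as if there were no congestion

    def may_send(self) -> bool:
        """False means the switch allocator should mask this VC."""
        return self.outstanding < self.quota

    def on_flit_sent(self, now: int) -> None:
        self.outstanding += 1
        if self._start is None:        # track only one in-flight credit
            self._start = now
            self._pending = self.outstanding

    def on_credit_returned(self, now: int) -> None:
        self.outstanding -= 1
        if self._start is None:
            return
        self._pending -= 1
        if self._pending == 0:         # this is the tracked credit
            rtt = now - self._start
            self.quota = max(1, min(self.rtt_min, 2 * self.rtt_min - rtt))
            self._start = None


if __name__ == "__main__":
    vc = OutputVCState(rtt_min=4)
    vc.on_flit_sent(now=10)
    vc.on_credit_returned(now=16)      # RTT of 6 cycles, 2 above the minimum
    print(vc.quota)
```

Because the measurement reuses the credits that already flow between routers, this model needs no extra signaling, which is consistent with the slide's claim.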

  10. Evaluation Methodology • BookSim 2.0 • 8x8 2D mesh, 64-bit channels, DOR • 16-slot input buffers, 4 VCs • Combined VC and switch allocation • Synthetic traffic and application benchmarks • Compare ABP to unrestricted sharing

  11. Network Stability (1) • For adversarial traffic, throughput in a mesh is unstable at high load • Traffic merging causes starvation • Tree saturation causes widespread congestion • ABP improves stability • Throttles sources that inject at very high rate • Efficient buffer use reduces tree saturation • Faster recovery from transient congestion

  12. Network Stability (2) [tornado traffic] 6.3x

  13. Network Stability (3) [foreground traffic at 50% injection rate] 3.3x -13% saturation rate

  14. Performance Isolation (1) • Inject two classes of traffic into network • Shared buffer space, separate VCs • Sharing causes interference between classes • ABP reduces interference • Contains effects of congestion within a class • Better isolation between workloads, VMs, …

  15. Performance Isolation (2) [uniform random foreground traffic] -33% -38% [uniform random background traffic] [hotspot background traffic]

  16. Performance Isolation (3) [50% uniform random background traffic] -31% w/o background

  17. Application Performance (1) • 8 interleaved memory controllers • Heterogeneous network nodes • Array of stream processors • Streaming data to memory • Modeled as hotspot traffic • In-order general purpose core • Running at 4x network frequency • Executing PARSEC benchmarks • Modeled using Netrace [Hestness’11] • Common network • Disjoint VC ranges • Shared buffer space

  18. Application Performance (2) -31% w/o background [12.5% injection rate for streaming traffic]

  19. Conclusions • Sharing improves buffer utilization, but can lead to undesired interference effects • Adaptive Backpressure regulates credit flow to avoid unproductive use of shared buffer space • Mitigates performance degradation in presence of adversarial traffic • But maintains key benefits of buffer sharing under benign conditions

  20. Thank you for your attention!
