
LIBRA: Multi-mode On-Chip Network Arbitration for Locality-Oblivious Task Placement

Master’s degree defense. Gwangsun Kim, Computer Science Department, Korea Advanced Institute of Science and Technology, 2011. 12. 20.


Presentation Transcript


  1. LIBRA: Multi-mode On-Chip Network Arbitration for Locality-Oblivious Task Placement - Master’s degree defense - Gwangsun Kim Computer Science Department Korea Advanced Institute of Science and Technology 2011. 12. 20.

  2. Table of Contents • Motivation • LIBRA • Introduction to Probabilistic Distance-based Arbitration • Virtual Contention-based Arbitration • Hybrid Arbitration • Evaluation • Conclusions

  3. Motivation • The on-chip network (OCN) is an important shared resource in a chip multiprocessor (CMP). • Fair allocation of this shared resource is needed. [Data collected by C. Batten, Y. Pan]

  4. Motivation • Experiment: 16-core CMP. Run a SPEC benchmark and 15 copies of a memory-intensive microbenchmark to create a hotspot. The location of the SPEC benchmark is varied. • A round-robin (RR) arbiter results in significant unfairness. • Why does fairness in the OCN matter? • Hard to predict performance (SLA). • Complicates OS design. • Parallel application slowdown. • This work proposes LIBRA, an OCN support for locality-oblivious task placement. [Figure labels: Hotspot; MC; Up to 12x!]

  5. Overview of LIBRA • Locality-Oblivious Bandwidth Regulatory Arbiter • Libra: the zodiac constellation that symbolizes balance. • Leverages probabilistic distance-based arbitration (MICRO’10) • Consists of two mechanisms: • Virtual contention arbitration (VCA) • Solves the unfairness problem • Hybrid arbitration • Solves the high-latency problem • Combination of 1 and 2: multi-mode arbitration

  6. Probabilistic Distance-based Arbitration (PDBA) • Proposed to provide fairness in on-chip networks. • Probabilistic arbitration • Weight is multiplied by contention degree [Figure: source queues feeding Router 0, Router 1, and Router 2; packet weights 1, 2, and 4 with per-router multipliers x1 and x2]
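The two PDBA bullets above can be sketched as follows. This is a minimal illustration, not the MICRO’10 hardware design: it assumes the grant probability is proportional to each packet's weight and that the winner's weight is multiplied by the number of contending requests. The names `pdba_arbitrate` and `weight_after_path` are hypothetical.

```python
import random

def pdba_arbitrate(weights, rng):
    """Grant one requesting port, chosen with probability proportional to weight.

    weights: dict mapping port name -> packet weight (positive number).
    Returns (winner, new_weight). Assumption: the winner's weight is
    multiplied by the contention degree, taken here as the request count.
    """
    ports = list(weights)
    winner = rng.choices(ports, weights=[weights[p] for p in ports], k=1)[0]
    contention_degree = len(ports)
    return winner, weights[winner] * contention_degree

def weight_after_path(contention_degrees, initial_weight=1):
    """Weight of a packet after winning at routers with the given contention degrees."""
    weight = initial_weight
    for degree in contention_degrees:
        weight *= degree
    return weight

# A packet that wins two 2-way contentions carries weight 1 * 2 * 2.
print(weight_after_path([2, 2]))
```

Under these assumptions, a packet that has already crossed contended routers carries a larger weight and is more likely to win later arbitrations, which is how distance is folded into the decision.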

  7. Limitation of Real Contention-based Arbitration • Real contention: when two or more requests contend. • Real contention-based arbitration (RCA): • Non-contention is not accounted for. • In many cases, there is no real contention → unfairness [Figure: packet weights 1, 1, 4, 4, 2; caption: Unfair bandwidth allocation!]

  8. Virtual Contention-based Arbitration (VCA) • Considers historical non-contention in future arbitration. • Two modes: real contention mode and virtual contention mode • Virtual contention mode example: [Figure: virtual contention at a router; annotation "Increase priority counter by" (remainder lost in extraction); port labels: last weight 1 with priority counter 0 and 1; last weight 4 with priority counter 0 and 4]

  9. Virtual Contention-based Arbitration Cont’d • Real contention mode example: the port whose priority counter is higher wins (4 > 0), and its priority counter is then decremented (4 to 3). • If the priorities of all ports are the same, then do PDBA. [Figure: real contention at a router; port labels: last weight 1 with priority counter 0; last weight 4 with priority counter 4, then 3]
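The real contention mode on this slide can be sketched as below. Only three things come from the slide: the higher priority counter wins, the winner's counter is decremented, and equal priorities fall back to PDBA. Everything else is an assumption: the function name `real_contention_arbitrate` is hypothetical, a partial tie at the top is resolved by a weight-proportional draw among the tied ports, and counters are left unchanged after a tie.

```python
import random

def real_contention_arbitrate(requests, counters, weights, rng):
    """Grant one of two or more contending ports.

    requests: list of contending port names (length >= 2).
    counters: dict port -> priority counter (mutated in place).
    weights:  dict port -> packet weight, used only for the PDBA tie-break.
    """
    if len(requests) < 2:
        raise ValueError("real contention needs two or more requests")
    top = max(counters[p] for p in requests)
    leaders = [p for p in requests if counters[p] == top]
    if len(leaders) == 1:
        winner = leaders[0]
        counters[winner] -= 1  # slide: "Decrement priority counter"
        return winner
    # Tie (assumption covers partial ties too): weight-proportional PDBA draw.
    return rng.choices(leaders, weights=[weights[p] for p in leaders], k=1)[0]

# Slide example: counters 0 and 4, so the port holding 4 wins and drops to 3.
counters = {"a": 0, "b": 4}
print(real_contention_arbitrate(["a", "b"], counters, {"a": 1, "b": 4}, random.Random(0)), counters)
```

The virtual contention update is left out on purpose, because the increment amount is not recoverable from this transcript.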

  10. Hybrid Arbiter • VCA lengthens the router critical path → lower clock frequency. • Observation: fairness matters only at high load. • At low load, there is little contention → RR is fine. • At high load, there is much contention and its impact is large, so VCA is needed; but packets are queued up in the buffer → more time for processing (pre-calculation). [Figure: low load (RR): RR has little impact on fairness; high load (pre-calculation, then VCA): VCA provides fairness]

  11. Hybrid Arbiter Cont’d • If there was no chance for pre-calculation, use RR. • Use VCA whenever possible.
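The fallback rule on this slide can be sketched as a small wrapper. This is an illustration under stated assumptions: `RoundRobinArbiter` is a generic rotating-pointer arbiter, not the evaluated router's RTL, and `hybrid_arbitrate`, `precalc_done`, and `vca_arbitrate` are hypothetical names, with the VCA decision passed in as a callable.

```python
class RoundRobinArbiter:
    """Generic round-robin arbiter with a rotating priority pointer."""

    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.next_port = 0  # port that currently has the highest priority

    def arbitrate(self, requests):
        """requests: set of requesting port indices. Returns the granted port or None."""
        for offset in range(self.num_ports):
            port = (self.next_port + offset) % self.num_ports
            if port in requests:
                # The granted port gets the lowest priority next time.
                self.next_port = (port + 1) % self.num_ports
                return port
        return None

def hybrid_arbitrate(requests, precalc_done, vca_arbitrate, rr):
    """Use VCA when pre-calculation has completed, otherwise fall back to RR."""
    if precalc_done:
        return vca_arbitrate(requests)
    return rr.arbitrate(requests)

rr = RoundRobinArbiter(4)
print(rr.arbitrate({1, 3}), rr.arbitrate({1, 3}))  # alternates between the two requesters
```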

  12. LIBRA: Multi-mode Arbitration • Operate in one of multiple modes depending on contention type and load. • Contention type: # of requests for the output port • Load: whether pre-calculation is done or not
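One way to read the mode selection above is as a function of the two inputs the slide names. The mapping below is an assumption pieced together from the earlier slides (two or more requests is real contention, a lone request is virtual contention, and no pre-calculation means RR); the `Mode` enum and `select_mode` are hypothetical names.

```python
from enum import Enum

class Mode(Enum):
    IDLE = "idle"
    ROUND_ROBIN = "round-robin"
    VIRTUAL_CONTENTION = "virtual contention"
    REAL_CONTENTION = "real contention"

def select_mode(num_requests, precalc_done):
    """Pick the arbitration mode for one output port in one cycle.

    num_requests: number of requests for the output port (contention type).
    precalc_done: whether pre-calculation finished (proxy for load).
    """
    if num_requests == 0:
        return Mode.IDLE
    if not precalc_done:
        return Mode.ROUND_ROBIN  # no chance for pre-calculation: use RR
    if num_requests == 1:
        return Mode.VIRTUAL_CONTENTION
    return Mode.REAL_CONTENTION

print(select_mode(2, True))
```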

  13. Methodology • Area and timing evaluation: Synopsys Design Compiler and IC Compiler. • Synthetic traffic simulation using the cycle-accurate Booksim simulator. • SPEC CPU 2006 application and microbenchmark simulation using the cycle-accurate GEMS + Booksim simulator. [Tables: synthetic traffic simulation parameters; GEMS simulation parameters]

  14. Timing and Area • Baseline (RR): 1.4 GHz and 0.07 mm² • LIBRA reduces latency significantly, while introducing low area overhead. [MICRO’10]

  15. Synthetic Traffic Evaluation • Network stability and throughput [Figure panels: Uniform random; Tornado; Bitcomp]

  16. Support for Locality-oblivious Task Placement • Configuration • 14 copies of memory-intensive microbenchmark. • SPEC bench. placement: closest or farthest to the hotspot. • LIBRA reduces max. slowdown by 2.7x and 1.8x compared to RR and AGE, respectively.

  17. Analysis on Unfairness of AGE • AGE can be unfair in closed-loop evaluation. • Notation (symbols lost in extraction): buffer depth; # of in-flight packets from a given node • Assumptions: • All nodes send packets to MC • Ideal age-based arbitration • Steady state [equations lost in extraction]

  18. Cost Comparison of QoS Mechanisms • Area overhead comparison: additional area overhead per node (µm²) [MICRO’10] [MICRO’09] [MICRO’10] [ISCA’08] • LIBRA achieves 38% lower area overhead (compared to PVC).

  19. Conclusions • Impact of task placement on performance: up to 30x with RR. • This work proposes LIBRA, a multi-mode arbitration. • VCA for providing global fairness. • Hybrid arbitration for reducing latency overhead. • LIBRA can support locality-oblivious task placement. • Analysis on unfairness of age-based arbitration. • LIBRA has 38% lower area overhead compared to PVC.

  20. Q&A Thank you!

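  21. Hybrid Arbiter Cont’d • If there was no chance for pre-calculation, use RR. • Use VCA whenever possible. [Figure: two-stage datapath with a pre-calculation stage (PC) followed by an arbitration stage (SAc), built from comparators (<), adders (+), and multipliers (X)]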
