
Low-Latency Interfaces for Mixed-Timing Domains [in DAC-01]

Tiberiu Chelcea, Steven M. Nowick. Department of Computer Science, Columbia University. {tibi,nowick}@cs.columbia.edu


Presentation Transcript


  1. Low-Latency Interfaces for Mixed-Timing Domains [in DAC-01]
     Tiberiu Chelcea, Steven M. Nowick
     Department of Computer Science, Columbia University
     {tibi,nowick}@cs.columbia.edu

  2. Introduction
     Key trend in VLSI systems: systems-on-a-chip (SoC)
     Two fundamental challenges:
     • mixed-timing domains
     • long interconnect delays
     Our goal: design of efficient interface circuits
     Desirable features:
     • arbitrarily robust
     • low-latency, high-throughput
     • modularity, scalability
     Few satisfactory solutions to date.

  3. Timing Issues in SoC Design
     [Figure: (a) single-clock system; (b) mixed-timing domains, each domain sync or async. In both panels, Domain #1 and Domain #2 are connected by a long interconnect.]

  4. Timing Issues in SoC Design (cont.)
     Solution: provide interface circuits
     • (a) single-clock: Carloni et al., “relay stations”
     • (b) mixed-timing domains: NEW: “mixed-timing FIFO’s” and NEW: “mixed-timing relay stations”
     [Figure: same two panels as the previous slide, with interface circuits placed on the long interconnect.]

  5. Contributions
     Complete set of mixed-timing interface circuits:
     • sync-sync, async-sync, sync-async, async-async
     Features:
     • Arbitrary robustness: with respect to synchronization failures
     • High throughput: in steady-state operation, no synchronization overhead
     • Low latency (“fast restart”): in an empty FIFO, only synchronization overhead
     • Reusability: each interface partitioned into reusable sub-components
     Two contributions:
     • Mixed-Timing FIFO’s
     • Mixed-Timing Relay Stations

  6. Contribution #1: Mixed-Timing FIFO’s
     Addresses the issue of interfacing mixed-timing domains
     Features: token ring architecture
     • circular array of identical cells
     • shared buses: data + control
     • data: “immobile” once enqueued
     • distributed control: allows concurrent put/get operations
     2 circulating tokens: define tail & head of queue
     Potential benefits:
     • low latency
     • low power
     • scalability
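The token-ring organization above can be illustrated with a small behavioral sketch. This is not the authors' circuit: the class name `TokenRingFifo` and its methods are hypothetical, and the model ignores clocks, buses and synchronization entirely. It only shows that data stays in the cell where it was enqueued while the put token (tail) and get token (head) move around the ring.

```python
class TokenRingFifo:
    """Behavioral model only: a circular array of cells with two tokens."""

    def __init__(self, capacity):
        self.cells = [None] * capacity   # data is "immobile" once enqueued
        self.full = [False] * capacity   # per-cell status bit
        self.put_token = 0               # index of the tail cell
        self.get_token = 0               # index of the head cell

    def put(self, item):
        i = self.put_token
        if self.full[i]:
            return False                 # tail cell occupied: FIFO is full
        self.cells[i] = item
        self.full[i] = True
        self.put_token = (i + 1) % len(self.cells)  # pass the put token
        return True

    def get(self):
        i = self.get_token
        if not self.full[i]:
            return None                  # head cell empty: nothing to dequeue
        item = self.cells[i]
        self.full[i] = False
        self.get_token = (i + 1) % len(self.cells)  # pass the get token
        return item
```

Because each operation touches only the cell holding its own token, a put and a get can proceed concurrently whenever the two tokens sit on different cells.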

  7. Contribution #2: Mixed-Timing Relay Stations
     Addresses the issue of long interconnect delays
     “Latency-Insensitive Protocols”: safely tolerate long interconnect delays between systems
     Prior contribution: introduce “relay stations”
     • single-clock domains (Carloni et al., ICCAD-99)
     Our contribution: introduce “mixed-timing relay stations”
     • mixed-clock (sync-sync)
     • async-sync
     First proposed solutions to date.

  8. Related Work
     Single-Clock Domains: handling clock discrepancies
     • clock skew and jitter (Kol98, Greenstreet95)
     • long interconnect delays (Carloni99)
     Mixed-Timing Domains: 3 common approaches
     • Use “Wrapper Logic”:
       add logic layer to synchronize data/control (Seitz80, Seizovic94)
       drawback: long latencies in communication
     • Modify Receiver’s Clock:
       stretchable and pausible clocks (Chapiro84, Yun96, Bormann97, Sjogren/Myers97)
       drawback: penalties in restarting clock

  9. Related Work: Closer Approaches
     Mixed-Timing Domains (cont.):
     • Interface Circuits:
       Mixed-Clock FIFO’s (Intel, Jex et al. 1997)
       drawback: significant area overhead = synchronizer for each cell
     Our approach: mixed-clock FIFO’s
     • only 2 synchronizers for the entire FIFO

  10. Outline • Mixed-Clock Interfaces • FIFO • Relay Station • Async-Sync Interfaces • FIFO • Relay Station • Results • Conclusions

  11. Mixed-Clock FIFO: Block Level
      Synchronous put interface:
      • req_put: initiates put operations
      • data_put: bus for data items
      • full: indicates when FIFO full
      • CLK_put: controls put operations
      Synchronous get interface:
      • req_get: initiates get operations
      • data_get: bus for data items
      • valid_get: indicates data item validity (always 1 in this design)
      • empty: indicates when FIFO empty
      • CLK_get: controls get operations

  12. Mixed-Clock FIFO: Steady-State Simulation
      Steady state: FIFO neither full nor empty
      Annotations: sender starts a put operation; Put Controller enables a put operation (FIFO not full); cell enqueues data at the end of the clock cycle
      [Figure: ring of cells with TAIL and HEAD markers, Full Detector, Put Controller, Get Controller, Empty Detector; signals full, req_put, data_put, CLK_put, CLK_get, data_get, req_get, valid_get, empty.]

  13. Mixed-Clock FIFO: Steady-State Simulation
      Annotation: passes the put token
      [Figure: same block diagram.]

  14. Mixed-Clock FIFO: Steady-State Simulation
      Annotation: get operation
      [Figure: same block diagram.]

  15. Mixed-Clock FIFO: Steady-State Simulation
      Steady-state operation: puts and gets “reasonably spaced”
      • zero probability of synchronization failure
      • zero synchronization overhead
      [Figure: same block diagram.]

  16. Mixed-Clock FIFO: Steady-State Simulation
      [Figure: same block diagram, TAIL marker shown at successive cells.]

  17. Mixed-Clock FIFO: Full Scenario
      FIFO FULL: put interface stalled
      [Figure: FIFO block diagram with TAIL and HEAD markers, Full Detector, Put Controller, Get Controller, Empty Detector; signals full, req_put, data_put, CLK_put, CLK_get, data_get, req_get, valid_get, empty.]

  18. Mixed-Clock FIFO: Full Scenario
      [Figure: same block diagram.]

  19. Mixed-Clock FIFO: Full Scenario
      FIFO NOT FULL
      [Figure: same block diagram.]

  20. Mixed-Clock FIFO: Full Scenario
      [Figure: same block diagram.]

  21. Mixed-Clock FIFO: Cell Implementation
      Synchronous Put Part (reusable), clocked by CLK_put:
      • data_put: data item in
      • en_put: enables a put operation
      • req_put: validity bit in
      • ptok_in, ptok_out: put token
      Data Validity Controller (SR latch), status bits:
      • f_i: cell FULL
      • e_i: cell EMPTY
      Synchronous Get Part (reusable), clocked by CLK_get:
      • data_get: data item out
      • en_get: enables a get operation
      • valid: validity bit out
      • gtok_in, gtok_out: get token
      REG: data register

  22. Mixed-Clock FIFO: Architecture
      Annotation: FIFO not full
      [Figure: cells with Full Detector and Put Controller on the put side (full, req_put, data_put, CLK_put), Get Controller and Empty Detector on the get side (CLK_get, data_get, req_get, valid_get, empty).]

  23. Synchronization Issues
      Challenge: interfaces are highly concurrent
      • global “FIFO state”: controlled by 2 different clocks
      Problem #1: Metastability
      • each FIFO interface needs clean state signals
      Solution: synchronize “full” & “empty” signals
      • “full” with CLK_put
      • “empty” with CLK_get
      Add 2 (or more) synchronizing latches to each signal
      Observable “full”/“empty” safely approximate true FIFO state
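The effect of the synchronizing latches on timing can be sketched as a shift register clocked by the receiving interface. The class `Synchronizer` below is a hypothetical illustration, not the authors' circuit, and it does not model metastability itself. It only shows the consequence the next slides deal with: the observable signal lags the raw one by the number of latch stages.

```python
class Synchronizer:
    """Chain of latches clocked by the receiving interface (behavioral model)."""

    def __init__(self, stages=2, initial=False):
        self.stages = [initial] * stages

    def clock(self, raw):
        """Apply one receiving-clock edge; return the observable value."""
        # Each latch takes the value of the previous one; the first samples raw.
        self.stages = [raw] + self.stages[:-1]
        return self.stages[-1]
```

With two stages, a change in the raw signal becomes observable only after the second clock edge following it.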

  24. Synchronization Issues (cont.)
      Problem #2: FIFO may now underflow/overflow!
      • synchronizing latches add extra latency
      Solution: modify definitions of “full” and “empty”
      • New FULL: 0 or 1 empty cells left
      • New EMPTY: 0 or 1 full cells left
      New Full Detector:
      • two consecutive empty cells = FIFO not full
      • NO two consecutive empty cells = FIFO full
      [Figure: the pairs (e_0, e_1), (e_1, e_2), (e_2, e_3), (e_3, e_0) are checked, and the result passes through synchronizing latches clocked by CLK_put to produce “full”.]
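The new full condition can be written as a short predicate over the per-cell empty bits. The function name `new_full` is hypothetical, and the sketch assumes a ring of at least two cells and leaves out the synchronizing latches.

```python
def new_full(e):
    """Return True when the ring has NO two consecutive empty cells.

    e[i] is True when cell i is empty; cell n-1 is adjacent to cell 0.
    """
    n = len(e)
    return not any(e[i] and e[(i + 1) % n] for i in range(n))
```

For a 4-cell FIFO, `new_full([True, False, False, False])` is True (one empty cell left counts as full), while `new_full([True, False, False, True])` is False because cells 3 and 0 are adjacent in the ring.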

  25. Synchronization Issues (cont.)
      Problem #3: Potential for deadlock
      Scenario: suppose only 1 data item in a quiescent FIFO
      • FIFO still considered “empty” (new definition)
      • get interface: cannot dequeue the data item!
      Solution: bi-modal “empty detector”, combines:
      • “new empty” detector (0 or 1 data items)
      • “true empty” detector (0 data items)
      Two results folded into a single global “empty” signal

  26. Synchronization Issues: Avoiding Deadlock
      Bi-modal empty detection: select either ne or oe
      • ne detects “new empty” (0 or 1 data items): checks the pairs (f_0, f_1), (f_1, f_2), (f_2, f_3), (f_3, f_0), synchronized with CLK_get
      • oe detects “true empty” (0 data items): checks f_0, f_1, f_2, f_3, synchronized with CLK_get
      • ne and oe are combined into the global “empty” signal
      Reconfigure whenever the get interface is active:
      • when NOT reconfigured, use “oe”: FIFO quiescent ⇒ avoids deadlock
      • when reconfigured, use “ne”: FIFO active ⇒ avoids underflow
      [Figure signals: CLK_get, f_0 to f_3, ne, oe, en_get, req_get, empty.]

  27. Mixed-Clock FIFO: Architecture
      Annotation: FIFO not full
      [Figure: cells with Full Detector and Put Controller on the put side (full, req_put, data_put, CLK_put), Get Controller and Empty Detector on the get side (CLK_get, data_get, req_get, valid_get, empty).]
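The bi-modal detector can be sketched as two predicates over the per-cell full bits plus a selection. The slide only says that one of ne or oe is selected; modeling that selection as a plain multiplexer driven by a `get_active` flag is an assumption made here for illustration, as are all function names. Synchronizing latches are omitted.

```python
def new_empty(f):
    """ne: True when there are no two consecutive full cells (0 or 1 items)."""
    n = len(f)
    return not any(f[i] and f[(i + 1) % n] for i in range(n))


def true_empty(f):
    """oe: True only when no cell holds a data item."""
    return not any(f)


def empty(f, get_active):
    """Global empty: ne while the get interface is active, oe when quiescent."""
    return new_empty(f) if get_active else true_empty(f)
```

With a single item in a quiescent FIFO, `empty([True, False, False, False], get_active=False)` is False, so the item can be dequeued. With the get interface active, the same state reports empty, which guards against underflow.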

  28. Put/Get Controllers
      Put Controller:
      • enables put operation
      • disabled when FIFO full
      Get Controller:
      • enables get operation
      • indicates when data valid
      • disabled when FIFO empty
      [Figure signals: req_put, full, en_put; req_get, empty, valid, en_get, valid_get.]
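One plausible reading of the two controllers, using only the signal names on the slide, is sketched below. The exact gate structure is an assumption: the slide states what each controller does, not how its logic is built.

```python
def put_controller(req_put, full):
    """en_put: a put is enabled when requested and the FIFO is not full."""
    return req_put and not full


def get_controller(req_get, empty, valid):
    """Return (en_get, valid_get) for one get-clock cycle."""
    en_get = req_get and not empty   # disabled when the FIFO is empty
    valid_get = en_get and valid     # data is valid only if a get took place
    return en_get, valid_get
```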

  29. Outline • Mixed-Clock Interfaces • FIFO • Relay Station • Async-Sync Interfaces • FIFO • Relay Station • Results • Conclusions

  30. Relay Stations: Overview
      Proposed by Carloni et al. (ICCAD’99)
      • Delay = 1 cycle: system 1 sends “data items” to system 2
      • Delay > 1 cycle: system 1 now sends “data packets” to system 2
      Data packet = data item + validity bit
      Steady state: pass data on every cycle (either valid or invalid)
      “stop” control = stopIn + stopOut
      • apply counter-pressure
      • result: stall communication
      Problem: works only for single-clock systems!
      [Figure: System 1 and System 2 on a common CLK, connected through a chain of four relay stations (RS).]

  31. Relay Stations: Implementation
      In normal operation:
      • packetIn copied to MR and forwarded on packetOut
      When stopped (stopIn = 1):
      • stopOut raised on the next clock edge
      • extra packet copied to AR
      [Figure: registers MR and AR, mux, switch, Control; signals packetIn, packetOut, stopIn, stopOut.]
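The two cases on this slide can be sketched as a small clocked model. It is a simplification under stated assumptions, not Carloni et al.'s design: every packet is treated as valid, and the class name `RelayStation` and the order in which AR is drained after a stall are choices made for this sketch.

```python
class RelayStation:
    """Simplified single-clock relay station; validity bits are ignored."""

    def __init__(self):
        self.mr = None          # main register, drives packetOut
        self.ar = None          # auxiliary register, holds the extra packet
        self.stop_out = False   # counter-pressure towards the sender

    def clock(self, packet_in, stop_in):
        """Apply one clock edge; return (packetOut, stopOut) after the edge."""
        if not stop_in:
            if self.stop_out:
                # Stall released: forward the packet parked in AR first.
                self.mr, self.ar = self.ar, None
                self.stop_out = False
            else:
                # Normal operation: copy packetIn to MR and forward it.
                self.mr = packet_in
        elif not self.stop_out:
            # First stalled edge: the sender has not seen stopOut yet, so the
            # packet already in flight is saved in AR and stopOut is raised.
            self.ar = packet_in
            self.stop_out = True
        # Otherwise (already stalled): MR and AR are held unchanged.
        return self.mr, self.stop_out
```

The auxiliary register exists because stopOut reaches the sender one cycle late: one more packet always arrives after the stall begins, and it must not be lost.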

  32. Relay Station vs. Mixed-Clock FIFO
      Relay Station:
      • steady state: always pass data
      • data items: both valid & invalid
      • stopping mechanism: stopIn & stopOut
      • signals: validIn, dataIn, stopOut; validOut, dataOut, stopIn
      Mixed-Clock FIFO:
      • steady state: only pass data when requested
      • data items: only valid data
      • stopping mechanism: none (only full/empty)
      • signals: req_put, dataIn, full; req_get, dataOut, empty

  33. Mixed-Clock Relay Stations (MCRS)
      Mixed-Clock Relay Station derived from the Mixed-Clock FIFO
      Change ONLY Put and Get Controllers
      [Figure: System 1 (CLK1), relay stations (RS), the NEW MCRS, relay stations (RS), System 2 (CLK2).]
      [Figure: Mixed-Clock FIFO (full, req_put, data_put, CLK_put; req_get, valid_get, empty, data_get, CLK_get) beside Mixed-Clock Relay Station (stopOut, valid_put, packetIn, data_put, CLK1; stopIn, valid_get, packetOut, data_get, CLK2).]

  34. Mixed-Clock Relay Station: Implementation
      Mixed-Clock Relay Station vs. Mixed-Clock FIFO
      Identical:
      • FIFO cells
      • Full/Empty detectors (...or can simplify)
      Only modify: Put & Get Controllers
      • always enqueue data (unless full)
      [Figure: Put Controller and Get Controller; signals en_put, full, validIn, valid (to cells), en_get, stopIn, empty, validOut.]
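A possible form of the modified controllers is sketched below. Only the rule “always enqueue data (unless full)” and the signal names come from the slides; driving stopOut from full, gating en_get with stopIn, and the function names are assumptions made for illustration.

```python
def mcrs_put_controller(valid_in, full):
    """Return (en_put, stop_out, cell_valid) for one put-clock cycle."""
    en_put = not full        # always enqueue, valid or not, unless full
    stop_out = full          # assumed: full is exported as stopOut
    cell_valid = valid_in    # validity bit stored alongside the data item
    return en_put, stop_out, cell_valid


def mcrs_get_controller(stop_in, empty, valid):
    """Return (en_get, valid_out) for one get-clock cycle."""
    en_get = (not empty) and (not stop_in)  # assumed: stopIn blocks dequeuing
    valid_out = en_get and valid            # otherwise an invalid packet goes out
    return en_get, valid_out
```

Compared with the FIFO controllers, there is no request input on either side: a relay station passes a packet on every cycle unless it is stopped.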

  35. Outline • Mixed-Clock Interfaces • FIFO • Relay Station • Async-Sync Interfaces • FIFO • Relay Station • Results • Conclusions

  36. Async-Sync FIFO: Block Level
      Asynchronous put interface: uses handshaking communication
      • put_req: request operation
      • put_ack: acknowledge completion
      • no “full” signal
      Synchronous get interface: no change
      [Figure: Mixed-Clock FIFO (req_put, data_put, full, CLK_put; req_get, data_get, valid_get, empty, CLK_get) beside Async-Sync FIFO (put_req, put_ack, put_data in the async domain; req_get, data_get, valid_get, empty, CLK_get in the sync domain).]

  37. Async-Sync FIFO: Architecture
      Asynchronous put interface:
      • no Full Detector or Put Controller
      • when FIFO full, acknowledgement withheld until safe to perform the put operation
      Get interface: exactly as in Mixed-Clock FIFO
      [Figure: row of cells with put_req, put_ack, put_data on the put side; Get Controller and Empty Detector with CLK_get, data_get, req_get, valid_get, empty on the get side.]

  38. Async-Sync FIFO: Cell Implementation
      • Asynchronous Put Part: reusable, from async FIFO (Async00); signals put_req, put_ack, put_data, we, we1
      • Data Validity Controller (DV): new; status bits e_i, f_i
      • Synchronous Get Part: reusable (from mixed-clock FIFO); signals CLK_get, en_get, get_data, gtok_in, gtok_out
      • REG: data register
      [Figure gate labels: C, OPT, +, En.]

  39. Async-Sync Relay Stations (ASRS)
      [Figure: System 1 (async), ARS, ARS, ASRS, RS, System 2 (sync, CLK2); labels: Micropipeline, optional.]

  40. Outline • Mixed-Clock Interfaces • FIFO • Relay Station • Async-Sync Interfaces • FIFO • Relay Station • Results • Conclusions

  41. Results
      Each circuit implemented:
      • using both academic and industry tools
      • MINIMALIST: Burst-Mode controllers [Nowick et al. ‘99]
      • PETRIFY: Petri-Net controllers [Cortadella et al. ‘97]
      Pre-layout simulations: 0.6 µm HP CMOS technology
      Experiments:
      • various FIFO capacities (4/8/16 cells)
      • various data widths (8/16 bits)

  42. Results: Latency
      Experimental setup:
      • 8-bit data items
      • various FIFO capacities (4, 8, 16)
      Latency = time from enqueuing a data item into an empty FIFO until it is dequeued
      For each design, latency is not uniquely defined: Min/Max

  43. Results: Maximum Operating Rate
      Synchronous interfaces: MegaHertz
      Asynchronous interfaces: MegaOps/sec
      Put vs. Get rates:
      • sync put faster than sync get
      • async put slower than sync get

  44. Conclusions
      Introduced several new low-latency interface circuits
      Address 2 major issues in SoC design:
      • Mixed-timing domains
        mixed-clock FIFO
        async-sync FIFO
      • Long interconnect delays
        mixed-clock relay station
        async-sync relay station
      Other designs implemented and simulated:
      • Sync-Async FIFO + Relay Station
      • Async-Async FIFO + Relay Station
      Reusable components: mix & match to build circuits
      Provide a useful set of interface circuits for SoC design
