
Bridging the gap between asynchronous design and designers

Bridging the gap between asynchronous design and designers. Hao Zheng. Outline. What is an asynchronous circuit ? Asynchronous communication Asynchronous design styles (Micropipelines) Asynchronous logic building blocks Control specification and implementation

Presentation Transcript


  1. Bridging the gap between asynchronous design and designers. Hao Zheng

  2. Outline • What is an asynchronous circuit? • Asynchronous communication • Asynchronous design styles (Micropipelines) • Asynchronous logic building blocks • Control specification and implementation • Delay models and classes of async circuits • Why asynchronous circuits?

  3. Synchronous circuit [Figure: chain R, CL, R, CL, R, CL, R with a global CLK driving every R] • Implicit (global) synchronization between blocks • Clock period > Max Delay (CL + R) • Time is an independent physical variable (quantity)
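
The clock-period rule on this slide can be illustrated with a small calculation. This is a minimal sketch, not part of the original deck: the function name, the stage delays and the register overhead are made-up values chosen only for illustration.

```python
def min_clock_period(cl_delays_ns, register_delay_ns):
    """Lower bound on the clock period for a pipeline of CL blocks.

    The slowest combinational block plus the register overhead sets the
    bound, because every stage is sampled by the same global clock.
    """
    return max(cl_delays_ns) + register_delay_ns


# Hypothetical delays for three CL blocks and one register type.
stage_delays_ns = [2.0, 3.5, 1.2]
bound = min_clock_period(stage_delays_ns, register_delay_ns=0.5)
print(f"clock period must exceed {bound} ns")
```

Every stage pays for the slowest one here, which is the worst-case synchronization that the later motivation slides contrast with average-case asynchronous operation.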

  4. Asynchronous circuit [Figure: chain R, CL, R, CL, R, CL, R with Req and Ack signals between stages] • Explicit (local) synchronization: Req / Ack handshakes • Time = events + quantity • Time does not exist if nothing happens (Aristotle)

  5. Motivation for Asynchronous • Asynchronous design is often unavoidable: • Asynchronous interfaces, arbiters etc. • Modern clocking is multi-phase and distributed – and virtually ‘asynchronous’ (cf. GALS – next slide): • Mesochronous (clock travels together with data) • Local (possibly stretchable) clock generation • Robust asynchronous design flow is coming (e.g. VLSI programming from Philips, NCL from Theseus Logic, fine-grain pipelining from Fulcrum)

  6. Motivation (Technology Aspects) • Low power • Automatic clock gating • Electromagnetic compatibility • No peak currents around clock edges • Security • No ‘electro-magnetic difference’ between logical ‘0’ and ‘1’ in dual-rail code • Robustness • High immunity to technology and environment variations (temperature, power supply, ...)

  7. Motivation (Designer’s View) • Modularity for system-on-chip design • Plug-and-play interconnectivity • Average-case performance • No worst-case delay synchronization • Many interfaces are asynchronous • Buses, networks, ...

  8. Globally Async Locally Sync (GALS) [Figure: a clocked domain (R, CL, R with a Local CLK) inside an async-to-sync wrapper, connected to the asynchronous world through the handshake pairs Req1/Ack1, Req2/Ack2, Req3/Ack3 and Req4/Ack4]

  9. Key Design Differences • Synchronous logic design: • proceeds without taking timing correctness (hazards, signal ack-ing etc.) into account • Combinational logic and memory latches (registers) are built separately • Static timing analysis of CL is sufficient to determine the Max Delay (clock period) • Fixed set-up and hold conditions for latches

  10. Key Design Differences • Asynchronous logic design: • Must ensure hazard-freedom, signal ack-ing, local timing constraints • Combinational logic and memory latches (registers) are often mixed in “complex gates” • Dynamic timing analysis of logic is needed to determine relative delays between paths • To avoid complex issues, circuits may be built as Delay-insensitive and/or Speed-independent (Muller’s theory vs Huffman asynchronous automata)

  11. Verification and Testing Differences • Synchronous logic verification and testing: • Only functional correctness aspect is verified and tested • Testing can be done with standard ATE and at low speed • Asynchronous logic verification and testing: • In addition to functional correctness, temporal aspect is crucial: e.g. causality and order, deadlock-freedom • Testing must cover faults in complex gates (logic+memory) and must proceed at normal operation rate • Delay fault testing may be needed

  12. Synchronous communication • Clock edges determine the time instants where data must be sampled • Data wires may glitch between clock edges (set-up/hold times must be satisfied) • Data are transmitted at a fixed rate (clock frequency) [Figure: example waveform with bit labels 1 1 0 0 1 0]

  13. Dual Rail • Two wires with L (low) and H (high) per bit • “LL” = “spacer”, “LH” = “0”, “HL” = “1” • n-bit data communication requires 2n wires • Each bit is self-timed • Other delay-insensitive codes exist (e.g. k-of-n) and event-based signalling (choice criteria: pin and power efficiency) [Figure: dual-rail waveforms with bit labels 1 1 1 and 0 0 0]
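
The dual-rail code on this slide can be written down as a small encoder and decoder. This is a sketch under assumptions: the helper names are hypothetical, and treating "HH" as an error is a common convention rather than something the slide states.

```python
# Codewords from the slide: "LL" = spacer, "LH" = 0, "HL" = 1.
SPACER = ("L", "L")
ENCODE = {0: ("L", "H"), 1: ("H", "L")}
DECODE = {pair: bit for bit, pair in ENCODE.items()}


def encode_word(bits):
    """Encode n bits onto 2n wires, one (wire, wire) pair per bit."""
    return [ENCODE[b] for b in bits]


def decode_word(pairs):
    """Return the bits, or None while any bit still shows the spacer."""
    if any(p == SPACER for p in pairs):
        return None  # word not complete yet: each bit is self-timed
    if any(p not in DECODE for p in pairs):
        raise ValueError("HH is not a codeword in this scheme")
    return [DECODE[p] for p in pairs]


word = encode_word([1, 1, 0, 0, 1, 0])
print(len(word) * 2, "wires carry", decode_word(word))
```

A receiver can tell from the wires alone when a word has arrived, which is why no separate validity signal is needed.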

  14. Bundled Data • Validity signal • Similar to an aperiodic local clock • n-bit data communication requires n+1 wires • Data wires may glitch when there is no validity signal • Signaling protocols • level sensitive (latch) • transition sensitive (register): 2-phase / 4-phase [Figure: example waveform with bit labels 1 1 0 0 1 0]

  15. Example: Memory Read Cycle • Transition signaling, 4-phase [Figure: timing diagram with signal A marking a valid address on the Address wires and signal D marking valid data on the Data wires]

  16. Example: Memory Read Cycle • Transition signaling, 2-phase [Figure: timing diagram with signal A marking a valid address on the Address wires and signal D marking valid data on the Data wires]
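
The difference between the 4-phase and 2-phase variants of the two memory-read slides can be sketched as event traces. This is an illustrative model only: the generic wire names `req` and `ack` and both function names are assumptions, standing in for the request/acknowledge roles of the signals in the figures.

```python
def four_phase(n_transfers):
    """Return-to-zero handshake: four events per transfer."""
    trace = []
    for _ in range(n_transfers):
        trace += ["req+", "ack+", "req-", "ack-"]
    return trace


def two_phase(n_transfers):
    """Transition handshake: every edge, rising or falling, is an event."""
    req = ack = 0
    trace = []
    for _ in range(n_transfers):
        req ^= 1  # request toggles to announce new data
        trace.append("req+" if req else "req-")
        ack ^= 1  # acknowledge toggles in response
        trace.append("ack+" if ack else "ack-")
    return trace


print(four_phase(1))  # ['req+', 'ack+', 'req-', 'ack-']
print(two_phase(2))   # ['req+', 'ack+', 'req-', 'ack-']
```

The same four wire transitions move one data item in the 4-phase protocol and two items in the 2-phase protocol. The price is that a 2-phase receiver must react to both edge directions.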

  17. Asynchronous Modules [Figure: DATA PATH block with Data IN, Data OUT, start and done; CONTROL block with req in, ack in, req out and ack out] • Signaling protocol: reqin+ start+ [computation] done+ reqout+ ackout+ ackin+ reqin- start- [reset] done- reqout- ackout- ackin- (more concurrency is also possible)

  18. Asynchronous Latches: C-element [Figure: C-element symbol with inputs A, B and output Z; static logic implementation between Vdd and Gnd (van Berkel 91)] • Next-state table (A B → Z+): 0 0 → 0; 0 1 → Z; 1 0 → Z; 1 1 → 1
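
The next-state table of the C-element (Z+ as a function of A, B and the current Z) can be captured in a few lines. This is a behavioural sketch with no delays, and the class name is hypothetical.

```python
class CElement:
    """Behavioural C-element: the output follows the inputs only when they agree."""

    def __init__(self, z=0):
        self.z = z

    def step(self, a, b):
        if a == b:
            self.z = a  # rows 0 0 -> 0 and 1 1 -> 1
        return self.z   # rows 0 1 and 1 0 keep the old Z


c = CElement()
for a, b in [(1, 0), (1, 1), (0, 1), (0, 0)]:
    print(a, b, "->", c.step(a, b))
```

The output rises only after both inputs have risen and falls only after both have fallen, which is what makes the element usable as a latch in handshake circuits.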

  19. C-element: Other Implementations [Figure: dynamic implementation with A and B transistor stacks between Vdd and Gnd driving Z; quasi-static implementation that adds a weak inverter on Z]

  20. Dual-Rail Logic [Figure: dual-rail AND gate with inputs A.t, A.f, B.t, B.f and outputs C.t, C.f] • Valid behavior for monotonic environment
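
One common way to build a dual-rail AND gate is sketched below. The gate structure is an assumption, since the figure itself is not recoverable from the transcript: the true rail is the AND of the true rails and the false rail is the OR of the false rails. Each signal is modelled as a `(t, f)` pair, and the constant names are hypothetical.

```python
SPACER = (0, 0)  # neither rail raised: no data yet
ONE = (1, 0)     # true rail raised
ZERO = (0, 1)    # false rail raised


def dual_rail_and(a, b):
    """Assumed structure: C.t = A.t AND B.t, C.f = A.f OR B.f."""
    a_t, a_f = a
    b_t, b_f = b
    return (a_t & b_t, a_f | b_f)


print(dual_rail_and(ONE, ONE))      # (1, 0)
print(dual_rail_and(ONE, ZERO))     # (0, 1)
print(dual_rail_and(ZERO, SPACER))  # (0, 1): a single 0 input already decides the result
```

The last case shows why the environment has to be monotonic: the output can become valid before all inputs are, so the inputs must move from spacer to data and back without glitches or retractions.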

  21. Completion Detection [Figure: outputs of a dual-rail logic block feed a completion detection tree whose final C-element produces done]
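
Completion detection for a dual-rail word can be sketched as follows. Two simplifications are assumed here and are not taken from the slide: each bit's validity is the OR of its two rails, and the tree of C-elements is flattened into a single n-input C-element. The class name is hypothetical.

```python
class CompletionDetector:
    """done rises when every bit is valid and falls when every bit is spacer."""

    def __init__(self):
        self.done = 0

    def update(self, word):
        valid = [t | f for t, f in word]  # per-bit validity: either rail high
        if all(valid):
            self.done = 1
        elif not any(valid):
            self.done = 0
        # mixed valid/spacer bits: hold the previous value, like a C-element
        return self.done


d = CompletionDetector()
print(d.update([(1, 0), (0, 0)]))  # 0: second bit is still spacer
print(d.update([(1, 0), (0, 1)]))  # 1: whole word is valid
print(d.update([(0, 0), (0, 1)]))  # 1: held until every bit has returned to spacer
```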

  22. Differential Cascode Voltage Switch Logic [Figure: 3-input AND/NAND gate built from an N-type transistor network with inputs A.t, A.f, B.t, B.f, C.t, C.f, a start signal, outputs Z.t and Z.f, and a done signal]

  23. Examples of Dual-Rail Design • Asynchronous dual-rail ripple-carry adder (A. Martin, 1991) • Critical delay is proportional to log N (N = number of bits) • 32-bit adder delay (1.6 µm MOSIS CMOS): 11 ns versus 40 ns for synchronous • Async cell transistor count = 34 versus synchronous = 28 • More recent success stories (modularity and automatic synthesis) of dual-rail logic: NULL Convention Logic from Theseus Logic

  24. Bundled-Data Logic Blocks [Figure: single-rail logic block with a delay element between start and done] • Conventional logic + matched delay

  25. Micropipelines (Sutherland 89) • Micropipeline (2-phase) control blocks: Request-Grant-Done (RGD) Arbiter, Join, Merge, Call, Select, Toggle [Figure: symbols of the control blocks; port labels r1, g1, d1, r2, g2, d2, r1, a1, r, a, r2, a2, in, sel, out0, out1, in, outf, outt]

  26. Micropipelines (Sutherland 89) [Figure: data path of latches (L) separated by logic blocks; control chain of C-elements and delay elements between Rin/Ain and Rout/Aout]
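
The control chain of a 2-phase micropipeline can be approximated by a row of C-elements, each fed by the request from its left neighbour and the inverted acknowledge from its right neighbour. The sketch below rests on assumptions: it ignores the data latches and delay elements, treats Aout as the acknowledge arriving from the right-hand environment, and uses a hypothetical function name.

```python
def settle(c, rin, aout):
    """Fire stage C-elements in place until no stage can change.

    c[i] is the output of stage i. Rout is c[-1] and Ain is c[0].
    """
    n = len(c)
    changed = True
    while changed:
        changed = False
        for i in range(n):
            req = rin if i == 0 else c[i - 1]
            ack = aout if i == n - 1 else c[i + 1]
            # Inputs of the C-element are req and NOT ack; it fires when they agree.
            if req != ack and c[i] != req:
                c[i] = req
                changed = True
    return c


c = [0, 0, 0]
print(settle(c, rin=1, aout=0))  # [1, 1, 1]: first event ripples through to Rout
print(settle(c, rin=0, aout=0))  # [0, 0, 1]: second event stops behind the first
print(settle(c, rin=1, aout=0))  # [1, 0, 1]: pipeline is full
print(settle(c, rin=1, aout=1))  # [1, 1, 0]: an acknowledge lets the events advance
```

Each pair of neighbouring stages with different values holds one event, so the chain behaves like an elastic FIFO that fills from the left and drains to the right.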

  27. Data Path / Control [Figure: data path of latches (L) separated by logic blocks, above a CONTROL block with Rin, Ain, Rout and Aout] • Synthesis of control is a major challenge

  28. Control specification [Figure: block with input A and output B, specified by a graph over the events A+, B+, A-, B-]

  29. Control specification [Figure: block with signals A and B, specified by a graph over the events A+, B-, A-, B+]

  30. Control specification [Figure: element labelled C with signals A, B and C, specified by a graph over the events A+, B+, C+, A-, B-, C-]

  31. Control specification [Figure: element labelled C with signals A, B and C, specified by a second graph over the events A+, B+, C+, A-, B-, C-]

  32. Control Specification [Figure: FIFO controller with signals Ri, Ai, Ro, Ao; specification graph over the events Ri+, Ai+, Ri-, Ai-, Ro+, Ao+, Ro-, Ao-; implementation with two C-elements]

  33. Gate vs Wire delay models • Gate delay model: delays in gates, no delays in wires • Wire delay model: delays in gates and wires

  34. Delay Models for Async. Circuits • Bounded delays (BD): realistic for gates and wires. • Technology mapping is easy, verification is difficult • Speed independent (SI): Unbounded (pessimistic) delays for gates and “negligible” (optimistic) delays for wires. • Technology mapping is more difficult, verification is easy • Delay insensitive (DI): Unbounded (pessimistic) delays for gates and wires. • DI class (built out of basic gates) is almost empty • Quasi-delay insensitive (QDI): Delay insensitive except for critical wire forks (isochronic forks). • In practice it is the same as speed independent [Figure: diagram relating the classes DI, BD, SI and QDI]

  35. Environment models • Slow enough environment = Fundamental mode (Inputs change AFTER system has settled) • Reactive environment = I/O mode (Inputs may change once the first output changes)

  36. Correctness of a Circuit wrt Delay Assumptions • C-element: z = ab + zb + za [Figure: circuits with inputs a, b and output z]
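
The equation z = ab + zb + za can be checked exhaustively against the intended hold behaviour of a C-element. This is a zero-delay check with hypothetical function names. It confirms only the Boolean function and says nothing about hazards under a particular delay model, which is the actual concern of this slide.

```python
from itertools import product


def z_next(a, b, z):
    """Next value of z according to z = ab + zb + za."""
    return (a & b) | (z & b) | (z & a)


def hold_behaviour(a, b, z):
    """Reference: copy the inputs when they agree, otherwise keep z."""
    return a if a == b else z


def equation_matches():
    """Compare both definitions on all eight (a, b, z) combinations."""
    return all(
        z_next(a, b, z) == hold_behaviour(a, b, z)
        for a, b, z in product((0, 1), repeat=3)
    )


print(equation_matches())  # True
```

The equation is the majority function of a, b and z. Whether a gate-level implementation of it stays correct depends on which gate and wire delays are assumed.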

  37. Resistance • Concurrent models for specification • CSP, Petri nets, ...: no more FSMs • Difficult to design • Hazards, synchronization • Complex timing analysis • Difficult to estimate performance • Difficult to test • No way to stop the clock

  38. But ... some success stories • Philips • AMULET microprocessors • Sharp • Intel (RAPPID) • Start-up companies: • Theseus Logic, Fulcrum, Self-Timed Solutions • Recent blurb: It's Time for Clockless Chips, by Claire Tristram (MIT Technology Review, v. 104, no. 8, October 2001: http://www.technologyreview.com/magazine/oct01/tristram.asp)
