Physical Limits of Computing Dr. Mike Frank CIS 6930, Sec. #3753X Spring 2002

**Lecture #23: Adiabatic Electronics & CMOS**
Mon., Mar. 11

**Administrivia & Overview**
• Don’t forget to keep up with homework!
• We are 7 out of 14 weeks into the course.
  • You should have earned ~50 points by now.
• Course outline:
  • Parts I & II, Background, Fundamental Limits - done
  • Part III, Future of Semiconductor Technology - done
  • Part IV, Potential Future Computing Technologies - done
  • Part V, Classical Reversible Computing
    • Fundamentals of Adiabatic Processes & Logic - last Wed. & Fri.
    • (Spring Break)
    • Adiabatic electronics & CMOS logic families - TODAY
    • Limits of adiabatics: leakage and clock/power supplies - Wed. 3/13
    • RevComp theory I: Emulating Irreversible Machines - Fri. 3/15
    • RevComp theory II: Bounds on Space-Time Overheads - Mon. 3/18
    • (plus ~7 more lectures…)
  • Part VI, Quantum Computing
  • Part VII, Cosmological Limits, Wrap-Up

**Conventional Gates are Irreversible**
• Logic gate behavior (on receiving new input):
  • Many-to-one transformation of local state!
  • Required to dissipate ≥ bT by Landauer's principle (reading b as one bit of entropy, k_B ln 2, this is the usual k_B T ln 2 per erased bit).
  • Incurs ½CV² dissipation in 2 out of 4 cases.
[Figure: transformation of local state. Example: static CMOS inverter, in → out.]

**Adiabatic Rules for Transistors**
• Rule 1: Never turn on a transistor if it has a nonzero voltage across it!
  • I.e., between its source & drain terminals.
  • Why: This erases info. & causes ½CV² dissipation.
• Rule 2: Never apply a nonzero voltage across a transistor, even during an on↔off transition!
  • Why: When partially turned on, the transistor has relatively low R and gets high P = V²/R dissipation.
• Corollary: Never turn off a transistor if it has a nonzero current going through it!
  • Why: As R gradually increases, the V = IR voltage drop will build, and then rule 2 will be violated.

**Adiabatic Rules continued**
• Transistor Rule 3: Never suddenly change the voltage applied across any on transistor.
  • Why: So the transition will be more reversible; dissipation will approach CV²(RC/t), not ½CV².
• Adiabatic rules for other components:
  • Diodes: Don’t use them at all!
    • There is always a built-in voltage drop across them!
  • Resistors: Avoid moderate network resistances.
    • E.g., stay away from the range >10 kΩ and <1 MΩ.
  • Capacitors: Minimize, reliability permitting.
    • Note: Dissipation scales with C²!

**Transistor Rules Summarized**
• Legal transitions in green. (For n- or p-FETs.)
• Dissipative states and transitions in red.
[Figure: state-transition diagram over transistor on/off states with high/low terminal voltages.]

**Input-Barrier, Clocked-Bias Retractile**
• Cycle of operation:
  • Inputs raise or lower barriers
    • Do logic w. series/parallel barriers
  • Clock applies bias force which changes state, or not
• Must reset output prior to input.
• Combinational logic only!
• Examples: Hall’s logic, SCRL gates, Rod logic interlocks
[Figure: transformation of local state among states 0, N, 1; labels: input barrier height, clocked force applied.]

**Retractile Logic w. SCRL gates**
• Simple combinational logic of any depth N:
  • Requires N timing phases
  • Non-pipelined
  • No sequential reuse of HW (even worse)
    • Sequential logic is required!
[Figure: timing diagram of the retractile phases vs. time.]

**Sequential Retractile Logic**
• Approach #1 (Hall ‘92):
  • After every N stages, invoke an irreversible latch
    • stores the output of the last stage
  • Then, retract all the stages,
    • and begin a new cycle
  • Problems:
    • Reduces dissipation by at most a factor of N
    • Also reduces HW efficiency by order N!
      • In worst case, compared to a pipelined, sequential circuit
• Approach #2 (Knight & Younis, ‘93):
  • The “store output” stage can also be reversible!
  • Gives fully-adiabatic, sequential, pipelined circuits!
  • N can be as small as 1 or 2 & still have arbitrarily high Q

**Simple Reversible CMOS Latch**
• Uses a standard CMOS transmission gate
• Sequence of operation:
  (1) input initially matches latch contents (output)
  (2) input changes → output changes
  (3) latch closes
  (4) input removed
[Figure: transmission-gate latch clocked by P between in and out, with a state table of (in, out) values before the input, after the input arrived, and after the input was removed.]

**Resetting a Reversible Latch**
• Can reversibly unlatch data as follows (exactly the reverse of the latching process):
  (1) Data value d stored on memory node M.
  (2) Present an exact copy of d on input.
  (3) Open the latch (connecting input to M).
      • No dissipation since voltage levels match
  (4) Retract the copy of d from the input.
      • Retracts copy stored in latch also.

**Input-Bias Clocked-Barrier Logic**
• Cycle of operation:
  • Data input applies bias
    • Add forces to do logic
  • Clock signal raises barrier
  • Data input bias removed
• Can amplify/restore input signal in clocking step.
• Can reset latch reversibly given copy of contents.
• Examples: Adiabatic QDCA, SCRL latch, Rod logic latch, PQ logic, Buckled logic
[Figure: state diagram among states 0, N, 1; transitions labeled Input “0”, Input “1”, Clock barrier up, Retract input.]

**SCRL 6-tick clock cycle**
• Initial state: All gates off, all nodes neutral.
• Tick #1: Input goes valid, forward T-gate opens.
• Tick #2: Forward gate charges, output goes valid. (Tick #1 of subsequent gate.)
• Tick #3: Forward T-gate closes, reverse gate charges.
• Tick #4: Reverse T-gate opens, forward gate discharges.
• Tick #5: Reverse gate discharges, input goes neutral.
• Tick #6: Reverse T-gate closes, output goes neutral. Ready for next input!
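The six ticks above can be tabulated for a chain of gates. The following is a minimal sketch, not taken from the lecture: it assumes every stage lags its predecessor by exactly one tick, which extrapolates the slide's remark that tick #2 of one gate is tick #1 of the subsequent gate. The names `SCRL_TICKS`, `local_tick`, and `schedule` are hypothetical.

```python
# Tick descriptions copied from the SCRL 6-tick clock cycle slides.
SCRL_TICKS = {
    1: "input goes valid, forward T-gate opens",
    2: "forward gate charges, output goes valid",
    3: "forward T-gate closes, reverse gate charges",
    4: "reverse T-gate opens, forward gate discharges",
    5: "reverse gate discharges, input goes neutral",
    6: "reverse T-gate closes, output goes neutral",
}
TICKS_PER_CYCLE = len(SCRL_TICKS)


def local_tick(stage: int, global_tick: int) -> int:
    """Return the local tick (1..6) of a 0-based stage at a 1-based global tick.

    Assumption (not stated in the slides beyond adjacent gates): each stage
    runs the same 6-tick cycle, delayed by one tick per stage.
    """
    return (global_tick - 1 - stage) % TICKS_PER_CYCLE + 1


def schedule(n_stages: int, n_ticks: int) -> list[list[int]]:
    """Rows are global ticks, columns are stages, entries are local ticks."""
    return [
        [local_tick(stage, t) for stage in range(n_stages)]
        for t in range(1, n_ticks + 1)
    ]


if __name__ == "__main__":
    for t, row in enumerate(schedule(n_stages=3, n_ticks=6), start=1):
        cells = ", ".join(f"stage {s}: tick #{k}" for s, k in enumerate(row))
        print(f"global tick {t}: {cells}")
    print("stage 0 at global tick 2:", SCRL_TICKS[local_tick(0, 2)])
```

In steady state each stage accepts a new input every 6 global ticks, while the one-tick offset lets neighbouring stages overlap their cycles.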
**24 ticks/cycle in this version - includes 2-level retractile stages**

**Some Interesting Questions**
• About pipelined, sequential, fully-adiabatic CMOS logic:
  • Q: Does it require these intermediate voltage levels?
    • A: No, you can get by with only 2 different levels.
  • Q: What is the minimum number of externally provided timing signals you can get away with?
    • A: 4 (12 if split levels are used)
  • Q: Can the order-N different timing signals needed for long retractile cascades be internally generated within an adiabatic circuit?
    • A: Yes, but not statically, unless N² hardware is used
      • where N is the number of stages per full sequential cycle
• We now demonstrate these answers.

**2LAL: 2-level Adiabatic Logic**
• Use simplified T-gate symbol.
• Basic buffer element:
  • cross-coupled T-gates
• Only 4 timing signals, 4 ticks per cycle:
  • Timing signal i rises during tick i
  • Timing signal i falls during tick ((i+1) mod 4)+1
[Figure: simplified T-gate symbol clocked by P, buffer element between in and out, and timing signals 1-4 plotted against tick #.]

**2LAL Cycle of Operation**
[Figure: 2LAL buffer over one cycle of the timing signals; labels include in, out, in=0, out=0.]

**Shift Register Structure**
• 1-tick delay per logic stage
• Logic pulse timing & propagation
[Figure: chain of buffer stages from in to out driven by timing signals 1, 2, 3, 4 in rotation, with the logic pulse advancing one stage per tick.]

**More complex logic functions**
• Non-inverting Boolean functions
• For inverting functions, must use quad-rail logic encoding:
  • To invert, just swap the rails!
    • Zero-transistor “inverters.”
[Figure: gates on inputs A and B, e.g. producing AB, and the quad-rail encoding of A = 0 and A = 1 on rails labeled A0 and A1.]

**Hardware Efficiency issues**
• Hardware efficiency: How many logic operations per unit hardware per unit time?
• Hardware spacetime complexity: How much hardware for how much time per logic op?
• We’re interested in minimizing: (# of transistors) × (# of ticks) / (gate cycle)
• SCRL inverter, w. return path:
  • (8 transistors) × (6 ticks) = 48 transistor-ticks
• Quad-rail 2LAL buffer stage:
  • (16 transistors) × (4 ticks) = 64 transistor-ticks

**More SCRL vs. 2LAL**
• SCRL reversible NAND, w. all inverters:
  • (23 transistors) × (6 ticks) = 138 T-ticks
• Quad-rail 2LAL AND:
  • (48 transistors) × (4 ticks) = 192 T-ticks
• Result of comparison: Although 2LAL minimizes # of rails and # of ticks/cycle, it does not minimize overall spacetime complexity.
• The question of whether 6-tick SCRL minimizes per-op spacetime complexity among pipelined adiabatic CMOS logics is still open.
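The transistor-tick figures quoted on the last two slides can be reproduced with a few lines of Python. This is only an illustrative sketch: the transistor and tick counts are the ones stated above, while the `GateDesign` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateDesign:
    """One gate design with the counts quoted on the slides."""
    name: str
    transistors: int
    ticks_per_cycle: int

    @property
    def transistor_ticks(self) -> int:
        # Spacetime cost per gate cycle: (# of transistors) x (# of ticks).
        return self.transistors * self.ticks_per_cycle


DESIGNS = [
    GateDesign("SCRL inverter, w. return path", 8, 6),
    GateDesign("Quad-rail 2LAL buffer stage", 16, 4),
    GateDesign("SCRL reversible NAND, w. all inverters", 23, 6),
    GateDesign("Quad-rail 2LAL AND", 48, 4),
]

if __name__ == "__main__":
    for design in DESIGNS:
        print(f"{design.name}: {design.transistor_ticks} transistor-ticks")
```

The fewer ticks per cycle in 2LAL do not offset its higher transistor count in either pairing, which is the comparison the slide draws.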