Generation of Optimal Bit-Width Topology of Fast Hybrid Adder in a Parallel Multiplier

Sabyasachi Das, Synplicity Inc.
Sunil P. Khatri, Texas A&M University (sunilkhatri@tamu.edu)
Presented by David Pan, UT Austin

What is a Multiplier?
• IC block that performs the multiplication operation
• Well-known logic architectures
• Computationally intensive
• Wide usage in DSP, graphics, microprocessors

Structure of Multiplier
• Multiplier block consists of 3 parts (written in the order of data flow)
  • Partial Product Generator (PPGen)
  • Partial Product Reduction Tree (PPRT)
  • Final Carry-Propagation Adder (CPA)
• Data flow: Inputs → PPGen → PPRT → CPA → Output

Final Adder in a Multiplier
• Frequently used adder architectures
  • Ripple-Carry
    • Area-efficient, but slow
    • Timing-efficient if inputs have skewed arrival times
  • Parallel-Prefix architecture (Brent-Kung, Kogge-Stone)
    • Faster architecture
    • Requires more area
  • Carry-Select
    • Large area overhead (often >100%)
    • Better delay if the Cin signal arrives late

3-stage Hybrid Adder
• Multipliers exhibit a typical arrival-time pattern at the inputs of the CPA
• A hybrid adder produces the best result for multipliers
• It outperforms all stand-alone architectures
• Reference: Stelling et al., "Design Strategies for Optimal Hybrid Final Adders in a Parallel Multiplier", The Journal of VLSI Signal Processing, 1996

3-Stage Hybrid Adder
• SubAdder1 (Ripple), width wrpl
• SubAdder2 (Brent-Kung), width wbk
• SubAdder3 (Carry-Select), width wcs
• There are many possible configurations (w1, w2 and w3, i.e. wrpl, wbk and wcs)
• Exhaustive exploration is not feasible (huge runtime)
• How to identify the best configuration?

Identification of Optimal Topology
• Width of the Ripple adder
  • At every bit (i), compute T(Ci+1) and check if
    • T(Ci+1) ≤ T(ai+1) or
    • T(Ci+1) ≤ T(bi+1)
  • If the check passes, wrpl = i+1
  • Else continue checking until 3 consecutive bits fail the check (Hill Climbing)
  • Return the last passing value of wrpl as the Ripple Adder width
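The ripple-width check can be sketched in Python as follows. This is only an illustration: the slides do not give the carry delay model, so the sketch assumes T(Ci+1) = max(T(ai), T(bi), T(Ci)) + d_carry with a fixed per-bit carry delay and a carry-in arriving at time 0. The names `ripple_width`, `d_carry` and `patience` are hypothetical.

```python
def ripple_width(t_a, t_b, d_carry=1.0, patience=3):
    """Hill-climbing estimate of the ripple sub-adder width.

    t_a, t_b : arrival times of the CPA input bits a_i, b_i (LSB first).
    d_carry  : assumed delay of one ripple carry stage (not from the slides).
    patience : number of consecutive failing bits that stops the search.
    """
    t_c = 0.0      # assumed arrival time of the carry-in C0
    w_rpl = 0      # last width for which the check passed
    fails = 0      # consecutive failures seen so far
    for i in range(len(t_a) - 1):
        # Assumed carry model: T(C_{i+1}) from bit i's inputs and carry-in.
        t_c = max(t_a[i], t_b[i], t_c) + d_carry
        # Check from the slides: the carry is not later than a_{i+1} or b_{i+1}.
        if t_c <= t_a[i + 1] or t_c <= t_b[i + 1]:
            w_rpl = i + 1
            fails = 0
        else:
            fails += 1
            if fails == patience:
                break
    return w_rpl


# Toy arrival profile: the low bits arrive progressively later, then flatten.
arrivals = [0, 2, 4, 6, 6, 6, 6, 6]
print(ripple_width(arrivals, arrivals))
```

With this toy profile the carry keeps up with the inputs for the first three bits only, so the sketch stops extending the ripple section there.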

Delay of the Hybrid Adder
• SubAdder1 (Ripple, width wrpl), SubAdder2 (Brent-Kung, width wbk), SubAdder3 (Carry-Select, width wcs)
• Output paths marked in the figure: Ts2, Tco2 + Dmx, Ts3 + Dmx
• Thybrid = Max(Ts2, Tco2 + Dmx, Ts3 + Dmx)
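As a small worked example, the delay expression can be evaluated directly. The slide does not define the symbols, so this sketch assumes Ts2 is the sum arrival time of SubAdder2, Tco2 its carry-out arrival time, Ts3 the pre-multiplexer sum arrival time of SubAdder3, and Dmx the multiplexer delay. The function name and the numbers are hypothetical.

```python
def hybrid_delay(t_s2, t_co2, t_s3, d_mx):
    """Thybrid = Max(Ts2, Tco2 + Dmx, Ts3 + Dmx), as stated on the slide."""
    return max(t_s2, t_co2 + d_mx, t_s3 + d_mx)


# Hypothetical arrival times in arbitrary units.
print(hybrid_delay(t_s2=10.0, t_co2=9.0, t_s3=9.5, d_mx=1.0))  # 10.5
```

In this example the carry-select sum path (Ts3 + Dmx) is the critical one, which suggests moving bits from SubAdder3 into SubAdder2.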

Identification of Optimal Topology
• Width of the BK and Carry-Select Adders
  • Initial configuration
    • wbk = 2^p, where p = ⌊log2(n - wrpl)⌋
    • wcs = n - wbk - wrpl
    • Example: if n = 32 and wrpl = 7, then wbk = 16 and wcs = 9
  • Iterative approach
    • Estimate the delay of a configuration and explore in the appropriate direction (similar to Binary Search)
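The initial configuration and one possible reading of the iterative step can be sketched in Python. The slides only say the exploration is "similar to Binary Search", so the step-halving loop below is an assumption, not the authors' procedure. `estimate_delay` is a hypothetical callback standing in for whatever delay estimator is available, and the toy estimator in the usage lines is invented purely for illustration.

```python
def initial_config(n, w_rpl):
    """Initial (w_bk, w_cs): w_bk is the largest power of two <= n - w_rpl."""
    rest = n - w_rpl
    p = rest.bit_length() - 1      # floor(log2(rest)) for rest >= 1
    w_bk = 2 ** p
    return w_bk, rest - w_bk


def search_widths(n, w_rpl, estimate_delay):
    """Assumed step-halving search over w_bk (illustrative only).

    estimate_delay(w_rpl, w_bk, w_cs) -> delay is a hypothetical callback.
    """
    rest = n - w_rpl
    w_bk, w_cs = initial_config(n, w_rpl)
    best = estimate_delay(w_rpl, w_bk, w_cs)
    step = max(w_bk // 2, 1)
    while step >= 1:
        moved = False
        # Try shrinking and growing the Brent-Kung section by `step` bits.
        for cand in (w_bk - step, w_bk + step):
            if 1 <= cand <= rest:
                d = estimate_delay(w_rpl, cand, rest - cand)
                if d < best:
                    best, w_bk, moved = d, cand, True
                    break
        if not moved:
            step //= 2             # no improvement: refine the step size
    return w_bk, rest - w_bk, best


print(initial_config(32, 7))       # (16, 9), matching the slide's example

# Toy estimator whose minimum is at w_bk = 20 (not a real delay model).
toy = lambda w_rpl, w_bk, w_cs: abs(w_bk - 20) + 0.0
print(search_widths(32, 7, toy))
```

Using `bit_length` avoids floating-point rounding in the logarithm. The search evaluates only a handful of configurations rather than every split.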

Results
• For different adder widths, our approach always found the best configuration in a very short runtime.
• Runtime example for a 32-bit adder:
  • Trying all possible configurations (561) takes 16-23 hours of runtime
  • Our approach takes 4-18 minutes of runtime and always computes the best configuration
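The figure of 561 configurations is consistent with counting every split of 32 bits into three non-negative widths (wrpl, wbk, wcs). That the authors counted exactly this way is an assumption. The short check below enumerates the splits; the function name is hypothetical.

```python
import math


def count_configurations(n):
    """Count triples (w_rpl, w_bk, w_cs) of non-negative widths summing to n."""
    # Once w_rpl and w_bk are chosen, w_cs = n - w_rpl - w_bk is fixed.
    return sum(1 for w_rpl in range(n + 1) for w_bk in range(n - w_rpl + 1))


print(count_configurations(32))    # 561
print(math.comb(32 + 2, 2))        # 561, the closed-form count
```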

Results • Now, it is feasible to use this powerful hybrid-adder architecture during synthesis (~12% faster adder).

Thank you
