CSE 246: Computer Arithmetic Algorithms and Hardware Design

CSE 246: Computer Arithmetic Algorithms and Hardware Design Prof Chung-Kuan Cheng Lecture 3

How to compare two RNS numbers • We can approximate the magnitude of a RNS number by the following formula where

An Example Suppose, x = (6|3|0) RNS (7|5|3) y = (3|0|1) RNS (7|5|3) Then we have x/105 = [6(1/7) + 3(1/5) + 0(2/3)] mod 1 ≈ 0.457 y/105 = [3(1/7) + 0(1/5) + 1(2/3)] mod 1 ≈ 0.095 Clearly, x (48) is greater than y (10).

Double Base Number System (DBNS) • DBNS is a new kind of number system, where there are two bases, p and q. • Any number x is represented by the equation Also, this number system could be redundant, e.g. 54 = 2030+2130+2131+2031+2032+2232 = 2133

Double Base Number System (DBNS) • We can represent DBNS numbers in a two-dimensional table. For example we can express 54 by this tabular representation. For each entry in the table, we multiply the corresponding row-value and column-value. Then we add up all such entries to get the value of the number represented by the table.

Double Base Number System (DBNS) • DBMS can be of practical use too in some scenarios. • In binary number representation, each bit has approximately 0.5 probability of being 1. • But in DBNS, the number of bits that are logic 1 in the tabular representation could be much less. • Effectively, we can reduce the number of 01 and 10 transitions, thus saving power.

Double Base Number System (DBNS) A greedy approach to minimize the number of TRUE bits in the tabular representation of any integer : GREEDY (x) { if (x > 0) then do{ find the largest 2-integer w such that w ≤ x; write(w); x = x-w; greedy(x); } }

Double Base Number System (DBNS) • It can be shown that expected number of bits that are ‘turned on’ in a DBNS representation of integer is • O[lg x/(lg lg x)], which is significantly lower than the corresponding number in the positional binary system, O(lg x). • As an example, consider the integer 2215 • In binary system, number of ‘1’s ≈ 100 • In DBNS, number of ‘1’s ≈ 30 • In the next few slides we shall discuss how we can implement ADDITION operation on two DBNS numbers.

DBNS Numbers: Addition Consider the integers 14 and 20. In DBNS system, 14 = 2231 + 2130 [We represent this number by a green cross] 20 = 2132 + 2130 [We represent this number by a red cross] The addition operation is performed by representation the numbers in tabular form, and then ‘merging’ the tables.

DBNS System: Addition The final merged table is : And the sum of 14 and 20 is 2230 + 2231 + 2132 = 34, which is indeed correct

DBNS System: Addition • Few rules for ‘shifting’ values in the merged table • We can always use algebraic manipulations to minimize number of entries in a DBNS table, e.g. • 2i3j + 2i3j+1 = 2i+23j • 2i3j + 2i+13j = 2i3j+1 • A variant of 2-integers are represented by using only single digit. They are of the form 2s3t, and might be useful in logarithmic operations.

Chapter 2: ADDERS • Half Adders • Half adders can add two 1-bit binary numbers when there is no carry in. • If the inputs are xi and yi, the sum and carry-out is given by the formula • si = xi ^ yi • ci+1 = xi . yi • We use the following notations throughout the slides • . means logical AND • + means logical OR • ^ means logical XOR • ‘ means complementation

Full Adder • The inputs are x[i], y[i] (operand bits) and c[i] (carry in) • The outputs are s[i] (result bit) and c[i+1] (carry out) • Inputs and outputs are related by these relations • s[i] = x[i] ^ y[i] ^ c[i] • c[i+1] = x[i].y[i] + c[i].(x[i] + y[i]) = x[i].y[i] + c[i].(x[i] ^ y[i])

Full Adder • If carry-in bit is zero, then full adder becomes half adder • If carry-in bit is one, then • s[i] = (x[i] ^ y[i])’ • c[i+1] = x[i] + y[i] • To add two n-bit numbers, we can chain n full adders to build a ripple carry adder

Ripple Carry Adder x[0] y[0] cin/c[0] x[n-1] y[n-1] x[1] y[1] c[n-1] . . . c[1] c[2] cout s[n-1] s[1] s[0] Overflow happen when operands are of same sign, and the result is of different sign. If we use 2’s complement to represent negative numbers, overflow occurs when (cout ^ c[n-1]) is 1

Ripple Carry Adder • For sake of brevity, we use the following notations: • g[i] = x[i].y[i] • p[i] = x[i] + y[i] • In terms of these notations, we can rewrite carry equations as • c[1] = g[0] + p[0].c[0] • c[2] = g[1] + p[1].c[1] • and so on… • We shall use these notations afterwards while discussing the design of other kind of adders • It has been observed that expected length of carry chain is 2, while expected maximal length of carry chain is lg n. Hence, ripple carry adders are in general fast.

Ripple Carry Adder • How do know that an adder has completed the operation? • Worst case scenario: Wait for the longest chain in the carry propagation network • We might inspect c[i+1] and its complement b[i+1] to determine the status of the adder

Improvement to Ripple Carry Adder: Manchester Adders • By intelligently using our device properties, we can reduce the complexity of the circuit used to compute carries in a ripple carry adder. • Define: a[i] = (x[i])’.(y[i])’ • Next we observe that c[i+1] is 1 in exactly these scenarios: • g[i] is 1, i.e. both x[i] & y[i] are 1 • c[i] is 1 and it is propagated because p[i] is 1 • c[i+1] is ‘pulled down’ to logic 0 irrespective of the value of c[i], when a[i] is 1, i.e. both x[i] and y[i] are 0 • From these conditions, and keeping in mind the general characteristics of transistor devices we can design simplified circuits for computing carries – as shown in the next slide

Improvement to Ripple Carry Adder: Manchester Adders

Implementation of Manchester Adder using MOS transistors This is essentially the same circuit for computing carry, but implemented with MOS devices

Manchester Adder: Alternate design • We divide the computation cycle into two distinct half-cycle : ‘precharge’ and ‘evaluate’. In the precharge half-cycle, g[i] and c[i+1] are assigned a tentative value of logic 1. This is evaluated in the next half-cycle with actual value of a[i]. • The actual circuit for computing carries is shown in the next slide.

Manchester Adder: Alternate design evaluation precharge Q Time 

Carry Look-ahead Adder • In a ripple-carry adder m-full adders are grouped together (m is usually equal to 4). Once the carry-in to the group is known, all the internal carries and the output carry is calculated simultaneously. • We can use some algebraic manipulations to minimize hardware complexity. • Consider the carry out of the group • c[i] = g[i-1] + p[i-1].c[i-1] • Putting the value of c[i-1], we can rewrite as c[i] = g[i-1] + p[i-1].g[i-2] + p[i-1].p[i-2].c[i-2] • Proceeding in this manner we get c[i] = g[i-1] + p[i-1].g[i-2] + p[i-1].p[i-2].g[i-3] + p[i-1].p[i-2].p[i-3].g[i-4] + p[i-1].p[i-2].p[i-3].p[i-4].c[i-4] • To further simplify the equation, we note that g[i-1] = g[i-1].p[i-1], and p[i-1] can be factored out

Ling’s Adder c[i] = g[i-1] + p[i-1].g[i-2] + p[i-1].p[i-2].g[i-3] + p[i-1].p[i-2].p[i-3].g[i-4] + p[i-1].p[i-2].p[i-3].p[i-4].c[i-4] We replace p[i]=x[i]^y[i] with t[i]=x[i]+y[i]. Because g[i]=g[i]t[i], we have c[i] = g[i-1]t[i-1] + t[i-1]g[i-2] + t[i-1].t[i-2].g[i-3] + t[i-1].t[i-2].t[i-3].g[i-4] + t[i-1].t[i-2].t[i-3].t[i-4].c[i-4] Let h[i] = g[i-1] + g[i-2] + t[i-2].g[i-3] + t[i-2].t[i-3].g[i-4] + t[i-2].t[i-3].t[i-4].t[i-5] h[i-4] C[i]= h[i]t[i-1]

Generalized Design for Adders: Prefix Adder • Prefix computation • Given n inputs x1, x2, x3…xn and an associative operator ×. We want to compute yi = xi× xi-1× xi-2 …× x2× x1 for all i, 1≤ i ≤n • x can be a scalar/vector/matrix • For design of adders, we define the operator × in the following manner • (g, p) = (g’, p’) × (g’’, p’’) • g = g’’ + p’’.g’ • p = p’.p’’

Alternate modeling of Prefix Computer: Finite State Machine • A finite state machine has a set of states, and it ‘moves’ from one state to another according to input. Mathematically, • sk = f (sk-1, ak-1) • The problem is to determine final state sn in O(lg n) operations, given initial state s0 and sequence of inputs (a0, a1, …an-1) • This problem can be formulated in terms of prefix computation

Alternate modeling of Prefix Computer: Finite State Machine • We assume that number of states are small and finite. • Let sk = fak-1(sk-1), fak-1 can be represented by matrix Mak-1 • Now we are ready to represent our problem in terms of prefix computation.

Alternate modeling of Prefix Computer: Finite State Machine • The algorithm • Compute Mai in parallel • Compute • N1 = Ma1 • N2 = Ma2.Ma1 • … • Nn = Man.Man-1…Ma1 • Compute Si+1= Ni(S0)

CSE 246: Computer Arithmetic Algorithms and Hardware Design