MIPS Architecture Arithmetic/Logic Instructions ALU Design– Chapter 4 By N. Guydosh 2/17/04
Integer Representation
• 32-bit representation: every bit contributes to the magnitude of the number.
• Range for unsigned integers is 0 through \(2^{32} - 1\), i.e. 0 through 4,294,967,295 = 0xFFFFFFFF.
• But what about negative numbers?
  - The universal standard is signed two’s complement.
  - It automatically results in the most significant bit (bit 31, little-endian bit numbering, where bit 0 is the least significant bit) being a “1” for negative numbers and a “0” for zero and positive numbers ... thus we have a sign bit.
  - It results in a simple ALU design.
Two’s Complement - 1
• Two’s complement representation:
  - 0 (zero) is 0x00000000
  - Positive numbers (\(2^{31} - 1\) of them) are:
      0x00000001, 0x00000002, ... 0x7FFFFFFF = \(2^{31} - 1\) = 2,147,483,647
  - Negative numbers (\(2^{31}\) of them) are:
      -1 = 0xFFFFFFFF
      -2 = 0xFFFFFFFE
      ...
      -2,147,483,647 = 0x80000001 (2’s complement) = \(-(2^{31} - 1)\)
      -2,147,483,648 = 0x80000000 (2’s complement) = \(-2^{31}\), see note below & p. 213 text.
• There are more negative numbers than positive, but we’ll live with it.
• NOTE: the 2’s complement of 0x80000000 is also 0x80000000, an anomaly. Complementing the bits gives 0x7FFFFFFF, and adding 1 carries into the sign bit, which reproduces the original pattern. This number has no positive counterpart in a 32-bit architecture. We will see later that when we use 2’s complement for the exponent of a floating point number, we may treat values such as 1000…0 as “-0” and 000…0 as “+0”.
Two’s Complement - 2
• Three methods for working with 2’s complement:
  1. Complement all the bits (1’s complement) and add 1.
  2. Starting with the least significant bit, retain all low-order zeros, retain the first nonzero digit, and complement the remaining digits.
  3. A real slick method: if the digits in the 32-bit number are \(B_{31} B_{30} B_{29} \ldots B_1 B_0\), then the “sign/magnitude” decimal equivalent is:
       \[ B_{31} \cdot (-2^{31}) + B_{30} \cdot 2^{30} + B_{29} \cdot 2^{29} + \cdots + B_1 \cdot 2^{1} + B_0 \cdot 2^{0} \]
     Note that the sign bit position is multiplied by \(-2^{31}\) and not \(2^{31}\)!
     This is a generalization of converting from binary to decimal which also works for 2’s complement ... see p. 213-214.
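The three methods above can be checked against each other in a few lines. This is a minimal Python sketch rather than anything from the text: the function names and the `MASK32` constant are illustrative, and Python's unbounded integers are masked to 32 bits to imitate a register.

```python
MASK32 = 0xFFFFFFFF  # keep every result to 32 bits, like a MIPS register


def negate_invert_add_one(x):
    """Method 1: complement every bit (1's complement), then add 1."""
    return ((x ^ MASK32) + 1) & MASK32


def negate_keep_low_one(x):
    """Method 2: keep the low-order zeros and the first 1, complement the rest."""
    if x == 0:
        return 0
    lowest_one = x & -x            # isolates the lowest set bit
    keep = (lowest_one << 1) - 1   # mask over that bit and the zeros below it
    return ((x & keep) | (~x & ~keep)) & MASK32


def signed_value(x):
    """Method 3: weight bit 31 by -2**31 and every other bit i by +2**i."""
    total = -((x >> 31) & 1) * 2**31
    for i in range(31):
        total += ((x >> i) & 1) * 2**i
    return total


print(hex(negate_invert_add_one(2)))   # 0xfffffffe
print(signed_value(0xFFFFFFFE))        # -2
print(hex(negate_invert_add_one(0x80000000)))  # 0x80000000, the anomaly
```

Methods 1 and 2 produce the same bit pattern; method 3 does not negate anything, it only reads a bit pattern back as a signed decimal value.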
Uses of 2’s complement
• How 2’s complement is used:
  - For “algebraic” addition of signed numbers, represent negative numbers by the 2’s complement of their magnitude.
  - For the operation of subtraction, add the 2’s complement of the subtrahend to the minuend ... Take the 2’s complement of the subtrahend even if it is a negative number ... We have converted the operation of subtraction to addition: all we need is an adder.
• Note that we distinguish between the sign of a number and the operation (add or subtract) being done. In either case we take the 2’s complement.
2’s Complement Sign Extension
• To convert an integer shorter than 32 bits to a 32-bit integer, we simply extend the sign bit (0 or 1) to the left until we have 32 bits. For example, we convert the 16-bit versions of 2 and -2 (base 10) as follows:
     2 = 0x0002 ==> 0x00000002 (sign bit is 0)
    -2 = 0xFFFE ==> 0xFFFFFFFE (sign bit is 1)
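Sign extension as described above can be sketched as follows. The helper name `sign_extend` and its parameters are hypothetical, chosen only for illustration.

```python
def sign_extend(value, from_bits, to_bits=32):
    """Copy the sign bit of a from_bits-wide value into the upper bits."""
    value &= (1 << from_bits) - 1              # keep only the narrow field
    sign = (value >> (from_bits - 1)) & 1
    if sign:
        # set every bit between from_bits and to_bits
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


print(hex(sign_extend(0x0002, 16)))  # 0x2
print(hex(sign_extend(0xFFFE, 16)))  # 0xfffffffe
```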
2’s Complement and the Instruction Set
• How this affects the MIPS instruction set ... see p. 215
• The “less than” test gets more complicated ... the slt and slti instructions interpret integers as signed:
    -2 (decimal) = 0xfffffffe (2’s complement) is less than 5280 (decimal) = 0x000014a0
• The unsigned versions of these instructions, sltu and sltiu, assume the numbers are unsigned:
    0xfffffffe is now greater than 0x000014a0
Data Types And Signs
• Fundamental principle:
  - Unlike the C language, which intrinsically types data, data represented at the assembly language level is untyped: data types are interpreted by the operation being done on them.
• Example in C:
    int num;           /* represents a signed 32 bit integer */
                       /* num carries a label saying “I am signed” */
    unsigned int num;  /* represents an unsigned 32 bit integer */
                       /* num carries a label saying “I am unsigned” */
• In assembly language the operation or instruction type determines whether num is signed or unsigned.
• The unsigned counterparts of slt and slti are sltu and sltiu. Assuming that
    $16 = 0xFFFFFFFE
    $17 = 0x000014A0
  then:
    slt  $8, $16, $17   # Signed comparison   ==> $8 = 1
    sltu $9, $16, $17   # Unsigned comparison ==> $9 = 0
• As an unsigned number 0xFFFFFFFE is a very large positive integer (4,294,967,294), but as a signed number it is -2.
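The point that the instruction, not the data, decides signedness can be modelled directly. In this hedged Python sketch the functions `slt` and `sltu` only imitate the comparison each MIPS instruction performs on two 32-bit register values; `to_signed` is an illustrative helper.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Read a 32-bit pattern as a two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def slt(rs, rt):
    """Signed set-on-less-than: 1 if rs < rt as signed integers."""
    return 1 if to_signed(rs) < to_signed(rt) else 0


def sltu(rs, rt):
    """Unsigned set-on-less-than: 1 if rs < rt as unsigned integers."""
    return 1 if (rs & MASK32) < (rt & MASK32) else 0


r16, r17 = 0xFFFFFFFE, 0x000014A0
print(slt(r16, r17))   # 1: -2 < 5280
print(sltu(r16, r17))  # 0: 4294967294 is not < 5280
```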
Addition, Subtraction And Overflow
• The most straightforward method of doing binary addition is the way we do it on paper: starting from the low-order end, we add corresponding pairs of bits and the carry-in, generating a sum and a carry-out. The carry-out is then propagated to the next stage.
  - This is the carry-ripple approach.
• Do subtraction by adding the 2’s complement of the subtrahend to the minuend.
• What if the answer cannot be represented in 32 bits?
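The paper-and-pencil procedure can be written out bit by bit. The following Python sketch is illustrative only (the names `full_adder`, `ripple_add` and `ripple_sub` are not from the text); it computes a - b as a + (complement of b) + 1, with the +1 injected as the carry-in of bit 0.

```python
def full_adder(a, b, carry_in):
    """Add three single bits; return (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return sum_bit, carry_out


def ripple_add(a, b, carry_in=0, width=32):
    """Carry-ripple addition: each stage waits for the carry of the one below."""
    result = 0
    carry = carry_in
    for i in range(width):
        sum_bit, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= sum_bit << i
    return result, carry


def ripple_sub(a, b, width=32):
    """a - b done with the adder only: invert b and set the carry-in to 1."""
    mask = (1 << width) - 1
    return ripple_add(a, ~b & mask, carry_in=1, width=width)


print(ripple_add(7, 5))        # (12, 0)
print(hex(ripple_sub(5, 7)[0]))  # 0xfffffffe, i.e. -2
```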
Overflow
• Example: consider a 5-bit word for convenience. Assume the high (most significant) bit is the sign bit for 2’s complement:

      -9 =   10111
      -5 =   11011
     -14 = 1 10010   ==> there is a net carry-out, but the answer is correct because the sign bit is correct: 10010 ==> 2’s complement = -01110 = -14 (decimal)

      -9 =   10111
      -7 =   11001
     -16 = 1 10000   ==> there is a net carry-out, but the answer is still correct because the sign bit is correct: 10000 ==> 2’s complement = -10000 = -16 (decimal), the unique number that has no positive counterpart in a 5-bit scheme

      -9 =   10111
     -10 =   10110
     -19 = 1 01101   ==> overflow because the sign bit is wrong: 01101 = +13 (decimal)
Detecting overflow
• Rule (see p. 222, fig. 4.4):
  - Addition: overflow can occur only if the signs of the operands are the same, and it has occurred if the sign of the answer is different from the sign of the operands.
  - Subtraction: overflow (i.e., underflow) results only if the operands have opposite signs and the sign of the answer is not the same as the sign of the 1st operand (or, equivalently, the sign of the answer is the same as the sign of the 2nd operand).
• Conditions for overflow (fig. 4.4):

    Operation   Operand A   Operand B   Result
    A + B       ≥ 0         ≥ 0         < 0
    A + B       < 0         < 0         ≥ 0
    A - B       ≥ 0         < 0         < 0
    A - B       < 0         ≥ 0         ≥ 0

• Simple hardware test for overflow in 2’s complement add/subtract:
  - Overflow occurs if and only if the carry into the sign bit is not the same as the carry out from the sign bit (thus explaining why net carry-outs are not always overflows). The condition being tested for is necessary & sufficient for overflow. See Exercise 4.42. Now review the previous examples.
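The hardware test can be tried on the 5-bit examples from the Overflow slide. This is a small Python sketch under the assumption of a 5-bit word; `add_with_overflow` is a made-up name, and it returns the truncated sum, the carry out of the sign bit, and the overflow flag.

```python
def add_with_overflow(a, b, width=5):
    """Add two width-bit patterns; flag overflow when the carry into the
    sign bit differs from the carry out of the sign bit."""
    mask = (1 << width) - 1
    low_mask = mask >> 1                      # every bit below the sign bit
    carry_into_sign = ((a & low_mask) + (b & low_mask)) >> (width - 1)
    total = (a & mask) + (b & mask)
    carry_out = total >> width
    return total & mask, carry_out, carry_into_sign != carry_out


print(add_with_overflow(0b10111, 0b11011))  # -9 + -5  -> (18, 1, False): 10010 = -14
print(add_with_overflow(0b10111, 0b11001))  # -9 + -7  -> (16, 1, False): 10000 = -16
print(add_with_overflow(0b10111, 0b10110))  # -9 + -10 -> (13, 1, True): overflow
```

All three sums produce a carry-out of 1, but only the last one has a carry into the sign bit of 0, so only the last one is an overflow.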
Exception Handling And Overflows
• MIPS will generate an overflow exception (interrupt) only for signed arithmetic.
• The address of the offending instruction is saved in the exception program counter (EPC), and a branch is made to an error/interrupt handler for corrective action.
• The MIPS instruction mfc0 is used to move the EPC to a MIPS general purpose register. On returning from the exception handler, the software then has the option of returning to the offending instruction via a jr instruction.
UNSIGNED ARITHMETIC
• Data is interpreted as unsigned by means of unsigned instructions: this applies to addresses and other data for which a negative value has no meaning.
• The unsigned counterparts of add, addi, and sub are addu, addiu, and subu, with the same syntax as the signed counterparts.
• The main difference is that the unsigned versions do not cause exceptions in case of overflow.
Logical Operations ... See p. 225
• Shift left logical and shift right logical:
    sll $rd, $rt, shamt   # reg $rd = reg $rt << shamt (in bits)
    srl $rd, $rt, shamt   # reg $rd = reg $rt >> shamt (in bits)
  - Good for efficient multiplication and division by powers of two (shamt is the power).
• “and” and “or” instructions ... see p. 226-227
  - Good for bit manipulation: testing and setting bits in a word field.
• Type R instructions:
    and $rd, $rs, $rt   # $rd = $rs & $rt
    or  $rd, $rs, $rt   # $rd = $rs | $rt   Note: “|” is the “or” operator
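A short Python sketch of what these instructions compute on 32-bit values. The function names mirror the mnemonics, while `FLAG`, `bit_is_set` and `set_bit` are hypothetical names used only to illustrate bit testing and setting.

```python
MASK32 = 0xFFFFFFFF


def sll(rt, shamt):
    """Shift left logical: bits shifted past bit 31 are lost."""
    return (rt << shamt) & MASK32


def srl(rt, shamt):
    """Shift right logical: zeros are shifted in from the left."""
    return (rt & MASK32) >> shamt


FLAG = 1 << 4  # hypothetical flag living in bit 4 of a word


def bit_is_set(word, mask):
    """Test a bit with 'and'."""
    return (word & mask) != 0


def set_bit(word, mask):
    """Set a bit with 'or'."""
    return (word | mask) & MASK32


print(sll(5, 3))                  # 40, i.e. 5 * 2**3
print(srl(40, 3))                 # 5, i.e. 40 / 2**3
print(hex(srl(0xFFFFFFFE, 1)))    # 0x7fffffff: the sign is not preserved
print(bit_is_set(set_bit(0, FLAG), FLAG))  # True
```

Because srl shifts in zeros, it divides correctly only when the register is read as unsigned; the pattern for -2 shifted right by one becomes a large positive number, not -1.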
Logical Operations (cont.)
• Type I instructions:
    andi $rt, $rs, imm   # $rt = $rs & imm
    ori  $rt, $rs, imm   # $rt = $rs | imm
  - imm is treated as unsigned ... it is expanded to 32 bits by padding with zeros rather than by extending the sign bit.
• Application: we can create a 32-bit constant using lui and ori in much the same way we did using lui and addi.
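Building a 32-bit constant from two 16-bit halves can be sketched as below. These Python functions only model the arithmetic of lui, ori and andi on plain integers; they are an illustration, not an assembler or simulator.

```python
MASK32 = 0xFFFFFFFF


def lui(imm16):
    """Load upper immediate: imm goes to bits 31..16, low half is zero."""
    return (imm16 & 0xFFFF) << 16


def ori(rs, imm16):
    """Or immediate: imm is zero-extended to 32 bits before the 'or'."""
    return (rs | (imm16 & 0xFFFF)) & MASK32


def andi(rs, imm16):
    """And immediate: imm is zero-extended, so the upper half of rs is cleared."""
    return rs & (imm16 & 0xFFFF)


upper = lui(0x1234)
print(hex(ori(upper, 0x5678)))   # 0x12345678
print(hex(ori(upper, 0xFFFF)))   # 0x1234ffff: zero padding leaves the upper half intact
print(hex(andi(0x12345678, 0x00FF)))  # 0x78
```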
Designing An ALU
• We now look at the hardware design which will support the arithmetic instructions just described.
• This is the number cruncher of the machine.
• Start by building a 1-bit ALU and replicating it 32 times.
• Functions of the ALU:
  - add, subtract, and, or, test for less than, overflow detection
• Simple example of a single-stage adder design
Single Stage One Bit Adder From Mano, “Digital Design” 2nd ed. p. 121
One Bit ALU
• Each bit has a full adder, an “and”, and an “or” function.
• Unit for bits 0 - 30.
• Unit for bit 31: has overflow detection and “slt” capability, using the sign bit, which will be fed to the “Less” input of bit 0.
• For slt $s1, $s2, $s3: do $s2 - $s3, put the sign bit in bit 0, and zero out the remaining bits.
Final 32 Bit ALU
• For slt, zero out bits 1-31 via the “Less” input and feed the sign bit into the “Less” input of bit 0.
• Test for zero (all result bits 0): use an “or” gate over all the result bits, followed by an inverter.
• If Bnegate = 1 and Operation = 3, for set on less than (slt), then the subtraction is done “under the covers”, but its result is not transmitted outside.
• Set is the sign bit of the subtract result. <== Note the “wrap-around” connection
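The 32-bit ALU can be modelled as 32 chained one-bit slices. This Python sketch is a behavioural model, not the circuit itself. Operation = 3 for slt comes from the slide; the encodings 0 = and, 1 = or, 2 = add are an assumption made for this example, as are the function names.

```python
def alu_bit(a, b, less, binvert, carry_in, operation):
    """One ALU slice: returns (result, carry_out, adder_sum)."""
    b ^= binvert                               # Binvert selects b or its complement
    adder_sum = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    result = (a & b, a | b, adder_sum, less)[operation]
    return result, carry_out, adder_sum


def alu32(a, b, bnegate, operation):
    """Chain 32 slices; returns (result, zero, overflow)."""
    carry = bnegate                            # Bnegate also drives CarryIn of bit 0
    bits = []
    for i in range(32):
        carry_in = carry
        result, carry, adder_sum = alu_bit(
            (a >> i) & 1, (b >> i) & 1, 0, bnegate, carry_in, operation)
        bits.append(result)
    overflow = carry_in ^ carry                # carry into vs. carry out of bit 31
    if operation == 3:
        bits[0] = adder_sum                    # Set (sign of a - b) wraps around to bit 0
    value = sum(bit << i for i, bit in enumerate(bits))
    return value, int(value == 0), overflow


print(alu32(7, 5, 0, 2))   # add: (12, 0, 0)
print(alu32(7, 5, 1, 2))   # subtract: (2, 0, 0)
print(alu32(5, 7, 1, 3))   # slt: result 1 because 5 < 7
```

As on the slide, Set is simply the sign bit of the subtraction, so this model gives the wrong slt answer when the subtraction itself overflows.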
Carry Look Ahead
• Avoids the slow carry-ripple effect of the above circuit.
• Carry ripple is very intuitive, but inefficient: the carry must sequence through all 31 low-order bit positions before the carry-in for bit position 32 (bit 31) is available.
• A less intuitive, but more efficient method calculates the carries in advance: carry look-ahead.
Carry Look Ahead (cont)
From Mano, “Digital Design” 2nd ed. p. 158
• Reading the diagram directly:
  - \(P_i\) is “carry propagate” for bit position i ... a carry-in propagates through bit position i even if position i does not generate its own carry.
  - \(G_i\) is “carry generate” for bit position i ... a carry is generated directly by this bit position, independently of any carry-in.
  - \(P_i = a_i \oplus b_i\), \(G_i = a_i b_i\)
  - \(S_i = P_i \oplus C_i\), \(C_{i+1} = G_i + P_i C_i\) ... a recursive formula
Carry Look Ahead (cont)
• By reading the full adder circuit diagram directly, we see that \(P_i = P_i^{\mathrm{xor}} = a_i \oplus b_i\). The book uses \(P_i = P_i^{\mathrm{or}} = a_i + b_i\) for this function, a much cheaper and simpler circuit. How is this done? See below.
• If the latter (“OR”) form is used, it must be used only for the carry-out function. The sum function still requires the XOR (former) version. Since the XOR version is already needed, why not also use it for the carry-out? Maybe there are fan-out considerations.
• Thus we may also use:
    \(S_i = P_i^{\mathrm{xor}} \oplus C_i\), where \(P_i^{\mathrm{xor}} = a_i \oplus b_i\)
    \(C_{i+1} = G_i + P_i^{\mathrm{or}} C_i\), where \(P_i^{\mathrm{or}} = a_i + b_i\) and \(G_i = a_i b_i\)
  Remember that \(P_i^{\mathrm{xor}}\) also works for \(C_{i+1}\), but it is a more complicated circuit.
• Proof of the equivalence of \(P_i^{\mathrm{xor}}\) and \(P_i^{\mathrm{or}}\) for carry-out calculations. From the circuit diagram on the previous slide we have the Boolean equations (a prime means complement):
    \[
    \begin{aligned}
    C_{i+1} &= G_i + P_i^{\mathrm{xor}} C_i \\
            &= a_i b_i + (a_i \oplus b_i) C_i \\
            &= a_i b_i + (a_i b_i' + a_i' b_i) C_i \\
            &= a_i b_i + a_i b_i + a_i b_i' C_i + a_i' b_i C_i \\
            &= a_i (b_i + b_i' C_i) + b_i (a_i + a_i' C_i) \\
            &= a_i (b_i + C_i) + b_i (a_i + C_i) \\
            &= a_i b_i + a_i b_i + a_i C_i + b_i C_i \\
            &= a_i b_i + (a_i + b_i) C_i \\
            &= G_i + P_i^{\mathrm{or}} C_i
    \end{aligned}
    \]
• Note: we used the Boolean identities x = x + x and x + x’y = x + y.
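The equivalence can also be confirmed by brute force over the eight input combinations. A small Python check, with function names invented for this example:

```python
from itertools import product


def carry_with_xor(a, b, c):
    """Carry-out using the XOR form of propagate."""
    return (a & b) | ((a ^ b) & c)


def carry_with_or(a, b, c):
    """Carry-out using the OR form of propagate."""
    return (a & b) | ((a | b) & c)


inputs = list(product((0, 1), repeat=3))

# The two carry formulas agree on every input.
carry_mismatches = [t for t in inputs if carry_with_xor(*t) != carry_with_or(*t)]

# The sum bit is different: substituting OR for XOR breaks it when a = b = 1.
sum_mismatches = [(a, b, c) for a, b, c in inputs if (a ^ b) ^ c != (a | b) ^ c]

print(carry_mismatches)  # []
print(sum_mismatches)    # [(1, 1, 0), (1, 1, 1)]
```

The only case where XOR and OR differ is a = b = 1, and there the generate term already forces the carry-out to 1, which is why the cheaper OR form is safe for the carry but not for the sum.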
Carry Look Ahead (cont)
• Formulas for carries, showing increasing fan-in. Recursively expand out the carry-out formulas:
From Mano, “Digital Design” 2nd ed. p. 159
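As a worked expansion of the recursion \(C_{i+1} = G_i + P_i C_i\), the first four carries are written out below. The indexing here assumes bit 0 is the lowest stage and \(C_0\) is the external carry-in; the cited figure may number its stages differently.

```latex
\begin{align*}
C_1 &= G_0 + P_0 C_0 \\
C_2 &= G_1 + P_1 G_0 + P_1 P_0 C_0 \\
C_3 &= G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0 \\
C_4 &= G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_0
\end{align*}
```

Each carry depends only on the \(P\), \(G\) and \(C_0\) inputs, so all carries are available after two gate levels, but the widest "and" term and the "or" gate for \(C_{i+1}\) each need \(i + 2\) inputs.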
Carry Look Ahead (cont) Note that “fan-in” increases as the bit position increases. From Mano, “Digital Design” 2nd ed. p. 159
Carry Look Ahead (cont) 4 bit carry look-ahead adder From Mano, “Digital Design” 2nd ed. p. 160 See also fig. 4.24, p. 426 P & H
Carry Look-Ahead – Reality sets in!
• Note that the fan-in of the “and” gates and the “or” gate for the carry-out of any bit position increases with the bit position. At some point it becomes impossible to implement due to the fan-in/fan-out constraints of the technology.
• If we decompose a 32-bit adder into 4 groups of 8-bit carry look-ahead adders, we can limit the fan-in of the carry look-ahead circuitry. The four 8-bit adders, in turn, can also use carry look-ahead from group to group. This is an example of applying the carry look-ahead concept at a second level of abstraction.
• See the 16-bit example on pp. 242-248 for an illustration.
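The two-level scheme can be sketched in Python as follows. This is a behavioural model under stated assumptions, not the book's circuit: the names `lookahead_carries` and `cla_add` are invented, the OR form of propagate is used for the carries, and each carry is computed as a flat sum of products of the inputs rather than from the previous carry.

```python
def lookahead_carries(p, g, c0):
    """Return [c0, c1, ..., cn], each carry a flat sum of products of p, g, c0."""
    carries = [c0]
    for i in range(len(p)):
        carry = 0
        for j in range(i, -2, -1):            # j = i..0 starts from g[j]; j = -1 from c0
            term = g[j] if j >= 0 else c0
            for k in range(j + 1, i + 1):     # and with every propagate above position j
                term &= p[k]
            carry |= term
        carries.append(carry)
    return carries


def cla_add(a, b, c0=0, width=32, group=8):
    """Add using look-ahead inside each group and again between groups."""
    group_mask = (1 << group) - 1
    slices = []
    for base in range(0, width, group):
        ga, gb = (a >> base) & group_mask, (b >> base) & group_mask
        p = [((ga | gb) >> i) & 1 for i in range(group)]   # OR-form propagate
        g = [((ga & gb) >> i) & 1 for i in range(group)]   # generate
        slices.append((base, ga ^ gb, p, g))
    # Second level: one propagate/generate pair per group.
    group_p = [int(all(p)) for _, _, p, _ in slices]
    group_g = [lookahead_carries(p, g, 0)[-1] for _, _, p, g in slices]
    group_carries = lookahead_carries(group_p, group_g, c0)
    result = 0
    for n, (base, xor_bits, p, g) in enumerate(slices):
        carries = lookahead_carries(p, g, group_carries[n])
        for i in range(group):
            sum_bit = ((xor_bits >> i) & 1) ^ carries[i]   # sum still needs XOR
            result |= sum_bit << (base + i)
    return result, group_carries[-1]


print(cla_add(0xFFFFFFFF, 1))                    # (0, 1)
print(cla_add(0b1011, 0b0110, width=4, group=4))  # (1, 1): 11 + 6 = 17 in 4 bits
```

With 8-bit groups the widest product term inside a group has 9 inputs and the group-level logic has at most 5, compared with 33 for a single flat 32-bit look-ahead.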