Number Representation. How to represent negative numbers? So far: unsigned numbers. Obvious solution: define the leftmost bit to be the sign (0 => +, 1 => -); the rest of the bits can be the numerical value of the number. This representation is called sign and magnitude. Shortcomings of sign and magnitude?
Number Representation
[Figure: 5-bit patterns 00000, 00001, ..., 01111, 10000, ..., 11110, 11111 and their two's complement values: 00000 = 0, 00001 = 1, 11111 = -1, 00010 = 2, 11110 = -2, ..., 01111 = 15, 10001 = -15, 10000 = -16]
0000 ... 0000 0000 0000 0000two = 0ten
0000 ... 0000 0000 0000 0001two = 1ten
0000 ... 0000 0000 0000 0010two = 2ten
. . .
0111 ... 1111 1111 1111 1101two = 2,147,483,645ten
0111 ... 1111 1111 1111 1110two = 2,147,483,646ten
0111 ... 1111 1111 1111 1111two = 2,147,483,647ten
1000 ... 0000 0000 0000 0000two = -2,147,483,648ten
1000 ... 0000 0000 0000 0001two = -2,147,483,647ten
1000 ... 0000 0000 0000 0010two = -2,147,483,646ten
. . .
1111 ... 1111 1111 1111 1101two = -3ten
1111 ... 1111 1111 1111 1110two = -2ten
1111 ... 1111 1111 1111 1111two = -1ten
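1111 1111 1111 1111 1111 1111 1111 1100two
= -1 x 2^31 + 1 x 2^30 + 1 x 2^29 + ... + 1 x 2^2 + 0 x 2^1 + 0 x 2^0
= -2^31 + 2^30 + 2^29 + ... + 2^2 + 0 + 0
= -2,147,483,648ten + 2,147,483,644ten
= -4ten
Sign extension: the 16-bit pattern 1111 1111 1111 1100two and the 32-bit pattern 1111 1111 1111 1111 1111 1111 1111 1100two both represent -4ten.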
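The weighted-sum reading of a two's complement pattern, and sign extension, can be sketched in a few lines of Python. The function names to_signed and sign_extend are illustrative only and do not come from the slides.

```python
def to_signed(pattern, bits):
    """Interpret the low `bits` bits of `pattern` as a two's complement value."""
    pattern &= (1 << bits) - 1
    sign_weight = 1 << (bits - 1)
    # The leftmost bit carries weight -2^(bits-1); all other bits are positive.
    return (pattern & (sign_weight - 1)) - (pattern & sign_weight)


def sign_extend(pattern, from_bits, to_bits):
    """Widen a pattern by copying its sign bit into the new upper bits."""
    return to_signed(pattern, from_bits) & ((1 << to_bits) - 1)


print(to_signed(0xFFFFFFFC, 32))                    # -4
print(format(sign_extend(0xFFFC, 16, 32), "032b"))  # 28 ones followed by 00
```

Both the 16-bit and the 32-bit pattern decode to the same value, which is the point of copying the sign bit.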
Addition of Positive Numbers
Carries: 1 1 1 0
a:       0 0 1 1
b:       0 1 0 1
Sum:     1 0 0 0
Definition (X' denotes the complement of X)
Sum = A'B'Cin + A'BCin' + AB'Cin' + ABCin
CarryOut = AB + ACin + BCin
[Figure: 1-bit full adder (FA) with inputs A, B, CarryIn and outputs Sum, CarryOut]
[Figure: 4-bit ripple-carry adder made of four 1-bit FAs (A0..A3, B0..B3, Sum0..Sum3), with the CarryOut of each stage feeding the CarryIn of the next]
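As a minimal sketch of the full-adder definition and the ripple-carry chain, the Python below models each bit as 0 or 1. Bit lists are given least significant bit first, which is a convention chosen here for convenience rather than something the slides specify.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin                          # odd number of 1s -> sum bit is 1
    cout = (a & b) | (a & cin) | (b & cin)   # CarryOut = AB + ACin + BCin
    return s, cout


def ripple_add(a_bits, b_bits, cin=0):
    """Chain full adders; bit lists are LSB first. Returns (sum_bits, carry_out)."""
    sum_bits = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


# 0011 + 0101 from the slide, written LSB first.
print(ripple_add([1, 1, 0, 0], [1, 0, 1, 0]))  # ([0, 0, 0, 1], 0), i.e. 1000
```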
Fast Adders
A  B  Cout
0  0  0    "kill"
0  1  Cin  "propagate"
1  0  Cin  "propagate"
1  1  1    "generate"
[Figure: 1-bit adder cells for bits 0 to 3; each takes A and B and produces S, G, and P]
P = A + B
G = A B
C1 = G0 + C0 · P0
   = A0B0 + C0(A0 + B0)
C2 = G1 + G0 · P1 + C0 · P0 · P1
C3 = G2 + G1 · P2 + G0 · P1 · P2 + C0 · P0 · P1 · P2
Generate and propagate for the 4-bit block:
G = G3 + P3·G2 + P3·P2·G1 + P3·P2·P1·G0
P = P3·P2·P1·P0
C4 = . . .
[Figure: 4-bit adder blocks, each producing G and P, connected to a carry-lookahead (CLA) unit with inputs C0, G0, P0, ...]
C1 = G0 + C0 · P0
C2 = G1 + G0 · P1 + C0 · P0 · P1
C3 = G2 + G1 · P2 + G0 · P1 · P2 + C0 · P0 · P1 · P2
C4 = . . .
Delay = 3 + 2 + 3 = 8
DelayRC = 15*2 + 3 = 33 (ripple carry)
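The lookahead equations can be checked against the ripple recurrence C(i+1) = Gi + Pi · Ci with a short Python sketch. It only verifies that the expanded sum-of-products forms give the same carries; it does not model gate delays or the parallel hardware. The function names are illustrative, and bit lists are LSB first by assumption.

```python
from itertools import product


def lookahead_carries(a_bits, b_bits, c0=0):
    """Carries [C0, C1, ..., Cn] from the expanded generate/propagate equations."""
    g = [a & b for a, b in zip(a_bits, b_bits)]   # G = A B
    p = [a | b for a, b in zip(a_bits, b_bits)]   # P = A + B
    carries = [c0]
    for i in range(len(g)):
        c = c0
        for k in range(i + 1):
            c &= p[k]                    # C0 . P0 . P1 ... Pi
        for j in range(i + 1):
            term = g[j]
            for k in range(j + 1, i + 1):
                term &= p[k]             # Gj . P(j+1) ... Pi
            c |= term
        carries.append(c)
    return carries


def ripple_carries(a_bits, b_bits, c0=0):
    """Same carries, computed one stage at a time."""
    carries = [c0]
    for a, b in zip(a_bits, b_bits):
        carries.append((a & b) | ((a | b) & carries[-1]))
    return carries


def check_all_4bit():
    """Compare both methods on every 4-bit operand pair and carry-in."""
    for bits in product([0, 1], repeat=9):
        a, b, c0 = list(bits[:4]), list(bits[4:8]), bits[8]
        if lookahead_carries(a, b, c0) != ripple_carries(a, b, c0):
            return False
    return True


print(check_all_4bit())  # True
```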
Signed Addition & Subtraction
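0 1 0 1 (+5) + 0 0 1 0 (+2) = 0 1 1 1 (+7)
0 1 0 1 (+5) + 1 0 1 0 (-6) = 1 1 1 1 (-1)
0 0 1 0 (+2) - 0 1 0 0 (+4): 0 0 1 0 + 1 1 0 0 (-4) = 1 1 1 0 (-2)
1 0 1 1 (-5) + 1 1 1 0 (-2) = 1 1 0 0 1 (-7), carry out of the leftmost bit discarded
0 1 1 1 (+7) + 1 1 0 1 (-3) = 1 0 1 0 0 (+4), carry out of the leftmost bit discarded
1 1 1 0 (-2) - 1 0 1 1 (-5): 1 1 1 0 + 0 1 0 1 (+5) = 1 0 0 1 1 (+3), carry out of the leftmost bit discarded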
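The examples above follow one rule: add the bit patterns, discard the carry out, and subtract by adding the two's complement of the second operand. A small Python sketch for 4-bit values, with helper names chosen here for illustration:

```python
BITS = 4
MASK = (1 << BITS) - 1


def decode(pattern):
    """4-bit pattern -> signed value."""
    return pattern - (1 << BITS) if pattern & (1 << (BITS - 1)) else pattern


def add(x, y):
    """Add two patterns and discard the carry out of the leftmost bit."""
    return (x + y) & MASK


def negate(x):
    """Two's complement: invert every bit, then add 1."""
    return (~x + 1) & MASK


def sub(x, y):
    """x - y is computed as x + (-y)."""
    return add(x, negate(y))


print(decode(add(0b1011, 0b1110)))  # -5 + -2 = -7
print(decode(sub(0b0010, 0b0100)))  # +2 - +4 = -2
print(decode(sub(0b1110, 0b1011)))  # -2 - -5 = +3
```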
Decimal  Binary   Decimal  2's Complement
0        0000     0        0000
1        0001     -1       1111
2        0010     -2       1110
3        0011     -3       1101
4        0100     -4       1100
5        0101     -5       1011
6        0110     -6       1010
7        0111     -7       1001
                  -8       1000
Overflow examples (4-bit two's complement):
0 1 1 1 (7) + 0 0 1 1 (3) = 1 0 1 0 (-6)
1 1 0 0 (-4) + 1 0 1 1 (-5) = 0 1 1 1 (7)
[Figure: 4-bit ripple-carry adder built from 1-bit FAs (A0..A3, B0..B3, Result0..Result3); an XOR gate on CarryIn3 and CarryOut3 produces Overflow]
X  Y  X XOR Y
0  0  0
0  1  1
1  0  1
1  1  0
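The Overflow signal can be modeled as the XOR of the carry into and the carry out of the leftmost bit. The sketch below assumes that reading of the figure labels (CarryIn3, CarryOut3, Overflow); the function name is illustrative.

```python
def add4_with_overflow(a, b):
    """Add two 4-bit patterns bit by bit; returns (result, carry_out, overflow)."""
    carry = 0
    carry_in = 0
    result = 0
    for i in range(4):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        carry_in = carry                      # carry entering bit i
        result |= (ai ^ bi ^ carry_in) << i
        carry = (ai & bi) | (ai & carry_in) | (bi & carry_in)
    overflow = carry_in ^ carry               # CarryIn3 XOR CarryOut3
    return result, carry, overflow


print(add4_with_overflow(0b0111, 0b0011))  # (10, 0, 1): 7 + 3 overflows
print(add4_with_overflow(0b1100, 0b1011))  # (7, 1, 1): -4 + -5 overflows
print(add4_with_overflow(0b0111, 0b1101))  # (4, 1, 0): 7 + -3 = 4, no overflow
```

A carry out alone does not signal overflow: the last example produces a carry out but a correct result.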
Arithmetic & Branching Conditions
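0 1 0 1 (+5) + 0 1 0 0 (+4) = 1 0 0 1 (-7?)
0 1 0 1 (+5) + 1 0 1 0 (-6) = 1 1 1 1 (-1)
0 0 1 1 (+3) + 1 1 0 1 (-3) = 1 0 0 0 0 (0), with a carry out of the leftmost bit
0 1 1 1 (+7) + 1 1 0 1 (-3) = 1 0 1 0 0 (+4), with a carry out of the leftmost bit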
Multiplication of Positive Numbers
Multiplicand       1 1 0 1   (13)
Multiplier       x 1 0 1 1   (11)
                   1 1 0 1
                 1 1 0 1
               0 0 0 0
             1 1 0 1
Product      1 0 0 0 1 1 1 1   (143)
[Figure: 4 x 4 array multiplier; each row combines the multiplicand bits A3 A2 A1 A0 with one multiplier bit (B0, B1, B2, B3), rows are shifted and summed with 0 inputs at the edges, producing product bits P7..P0]
Multiplicand 1101
C  Product  Multiplier
0  0000     1011
0  1101     1011       Add
0  0110     1101       Shift
1  0011     1101       Add
0  1001     1110       Shift
0  1001     1110       NoAdd
0  0100     1111       Shift
1  0001     1111       Add
0  1000     1111       Shift
[Figure: sequential multiplier hardware: 4-bit Multiplicand register and 0 into a MUX (Add/NoAdd), 4-bit FA, carry bit C, 8-bit Product (Multiplier) register with Shift Right, and Control]
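The add-and-shift trace can be reproduced with a short Python simulation of the C, Product, and Multiplier registers. This is a sketch for unsigned operands; the register names high and low (upper and lower halves of the product register) are chosen here and are not from the slides.

```python
def shift_add_multiply(multiplicand, multiplier, bits=4):
    """Unsigned shift-add multiplication; returns (product, trace)."""
    mask = (1 << bits) - 1
    c, high, low = 0, 0, multiplier & mask
    trace = []
    for _ in range(bits):
        if low & 1:
            total = high + multiplicand
            c, high = total >> bits, total & mask   # C holds the carry out
            trace.append(("Add", c, high, low))
        else:
            trace.append(("NoAdd", c, high, low))
        # Shift C, high, low right by one position as a single register.
        low = (low >> 1) | ((high & 1) << (bits - 1))
        high = (high >> 1) | (c << (bits - 1))
        c = 0
        trace.append(("Shift", c, high, low))
    return (high << bits) | low, trace


product_value, steps = shift_add_multiply(0b1101, 0b1011)
for name, c, high, low in steps:
    print(c, format(high, "04b"), format(low, "04b"), name)
print(product_value)  # 143
```

The printed rows match the Add, NoAdd, and Shift steps of the trace for 1101 x 1011.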
Signed-Operand Multiplication
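Multiplicand         1 0 0 1 1   (-13)
Multiplier         x 0 1 0 1 1   (+11)
           1 1 1 1 1 1 0 0 1 1
           1 1 1 1 1 0 0 1 1
           0 0 0 0 0 0 0 0
           1 1 1 0 0 1 1
           0 0 0 0 0 0
Product    1 1 0 1 1 1 0 0 0 1   (-143)
(partial products are sign-extended; each row is shifted one position left of the row above)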
Current Bit  Bit to the Right  Explanation          Example     Op
1            0                 Begins run of 1s     0001111000  sub
1            1                 Middle of run of 1s  0001111000  none
0            1                 End of run of 1s     0001111000  add
0            0                 Middle of run of 0s  0001111000  none
Originally for Speed (when shift was faster than add)
Example: 2 x 7 (multiplicand 0010, multiplier 0111)
Operation          Multiplicand  Product              next?
0. initial value   0010          0000 0111 0          10 -> sub
1a. P = P - m      1110          + 1110
                                 1110 0111 0          shift P (sign ext)
1b.                0010          1111 0011 1          11 -> nop, shift
2.                 0010          1111 1001 1          11 -> nop, shift
3.                 0010          1111 1100 1          01 -> add
4a.                0010          + 0010
                                 0001 1100 1          shift
4b.                0010          0000 1110 0          done

Example: 2 x -3 (multiplicand 0010, multiplier 1101)
Operation          Multiplicand  Product              next?
0. initial value   0010          0000 1101 0          10 -> sub
1a. P = P - m      1110          + 1110
                                 1110 1101 0          shift P (sign ext)
1b.                0010          1111 0110 1          01 -> add
                                 + 0010
2a.                              0001 0110 1          shift P
2b.                0010          0000 1011 0          10 -> sub
                                 + 1110
3a.                0010          1110 1011 0          shift
3b.                0010          1111 0101 1          11 -> nop
4a.                              1111 0101 1          shift
4b.                0010          1111 1010 1          done
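The two traces can be reproduced with a compact Python sketch of Booth's algorithm. The product register is modeled as one integer of 2 x bits + 1 bits (upper half, multiplier, and the extra bit to the right). The sketch assumes the multiplicand is not the most negative value (-8 for 4 bits), since its negation does not fit in the upper half; the function name is illustrative.

```python
def booth_multiply(m, r, bits=4):
    """Booth multiplication of two signed `bits`-wide values; returns m * r."""
    mask = (1 << bits) - 1
    total = 2 * bits + 1
    pmask = (1 << total) - 1
    p = (r & mask) << 1                      # 0000 | multiplier | extra 0
    add_m = (m & mask) << (bits + 1)         # +m aligned with the upper half
    sub_m = (-m & mask) << (bits + 1)        # -m aligned with the upper half
    for _ in range(bits):
        pair = p & 0b11                      # current bit, bit to the right
        if pair == 0b10:                     # begins run of 1s: subtract
            p = (p + sub_m) & pmask
        elif pair == 0b01:                   # end of run of 1s: add
            p = (p + add_m) & pmask
        sign = p >> (total - 1)
        p = (p >> 1) | (sign << (total - 1)) # shift right with sign extension
    product = p >> 1                         # drop the extra bit
    if product & (1 << (2 * bits - 1)):      # reinterpret as a signed value
        product -= 1 << (2 * bits)
    return product


print(booth_multiply(2, 7))    # 14
print(booth_multiply(2, -3))   # -6
```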
Fast Multiplication
(read yourself!)
Integer Division
                     1001   Quotient
Divisor 1000 ) 1001010      Dividend
              -1000
                  10
                  101
                  1010
                 -1000
                    10      Remainder (or Modulo result)
Binary => 1 * divisor or 0 * divisor
[Figure: division hardware: 33-bit Divisor register, 33-bit FA, 65-bit Remainder (Quotient) register with Shift Left, and Control with Q Setting and Sign-bit Checking]
              0010   Quotient
Divisor 11 ) 1000    Dividend
              -11
               10    Remainder

Step       Remainder  Quotient
Initially  00000      1000
Shift      00001      000_
Sub (-11)  11101
Set q0     11110
Restore    00001      0000
Shift      00010      000_
Sub (-11)  11101
Set q0     11111
Restore    00010      0000
Shift      00100      000_
Sub (-11)  11101
Set q0     00001      0001
Shift      00010      001_
Sub (-11)  11101
Set q0     11111
Restore    00010      0010
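The shift, subtract, set q0, and restore steps can be sketched as a Python loop. This is a minimal model for unsigned operands with a nonzero divisor; a negative trial remainder stands in for the sign-bit check, and the function name is illustrative.

```python
def restoring_divide(dividend, divisor, bits=4):
    """Restoring division; returns (quotient, remainder)."""
    mask = (1 << bits) - 1
    remainder, quotient = 0, dividend & mask
    for _ in range(bits):
        # Shift the (remainder, quotient) pair left by one bit.
        remainder = (remainder << 1) | ((quotient >> (bits - 1)) & 1)
        quotient = (quotient << 1) & mask
        remainder -= divisor          # trial subtraction
        if remainder < 0:
            remainder += divisor      # restore; q0 stays 0
        else:
            quotient |= 1             # set q0 = 1
    return quotient, remainder


print(restoring_divide(0b1000, 0b11))             # (2, 2): 1000 / 11 = 0010 rem 10
print(restoring_divide(0b1001010, 0b1000, bits=7))  # (9, 2): 1001 rem 10
```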
Floating-point Numbers & Operations
Unsigned integers: 0 to 2^N - 1
Signed (two's complement) integers: -2^(N-1) to 2^(N-1) - 1
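Scientific notation (decimal): 6.02 x 10^23, with mantissa 6.02, radix (base) 10, exponent 23, and a decimal point
Scientific notation (binary): 1.0two x 2^-1, with mantissa 1.0two, radix (base) 2, exponent -1, and a "binary point"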
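Single precision (32 bits):
Bit 31: S (1 bit) | Bits 30-23: Exponent (8 bits) | Bits 22-0: Significand (23 bits)
Double precision (two 32-bit words):
Bit 31: S (1 bit) | Bits 30-20: Exponent (11 bits) | Bits 19-0: Significand (20 bits)
Significand (cont'd): 32 bits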
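To see the three fields of a single-precision number, the bit pattern can be unpacked with Python's standard struct module. This is only an illustrative sketch; float32_fields is a name chosen here.

```python
import struct


def float32_fields(x):
    """Return (sign, exponent field, significand field) of x as a 32-bit float."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    sign = word >> 31                 # bit 31
    exponent = (word >> 23) & 0xFF    # bits 30-23, stored with a bias of 127
    significand = word & 0x7FFFFF     # bits 22-0, hidden leading 1 not stored
    return sign, exponent, significand


print(float32_fields(0.5))    # (0, 126, 0)
print(float32_fields(2.0))    # (0, 128, 0)
print(float32_fields(-1.0))   # (1, 127, 0)
```

The exponent fields 126 and 128 are the patterns 0111 1110 and 1000 0000, i.e. the true exponents -1 and +1 plus the bias of 127.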
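Biased exponent (exponent field = true exponent + 127):
1/2 = 0 | 0111 1110 | 000 0000 0000 0000 0000 0000
2   = 0 | 1000 0000 | 000 0000 0000 0000 0000 0000
Two's complement exponent, for comparison:
1/2 = 0 | 1111 1111 | 000 0000 0000 0000 0000 0000
2   = 0 | 0000 0001 | 000 0000 0000 0000 0000 0000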
Exponent  Significand  Object
0         0            0
0         nonzero      ???
1-254     anything     +/- fl. pt. #
255       0            +/- infinity
255       nonzero      NaN
Infinity: result of operation overflows, i.e., is larger than the largest number that can be represented
Overflow is not the same as divide by zero (raises a different exception)
S 1 . . . 1 0 . . . 0 = +/- infinity
It may make sense to do further computations with infinity, e.g., X/0 > Y may be a valid comparison
NaN: not a number, but not infinity (e.g., sqrt(-4))
Invalid operation exception (unless operation is = or ≠)
S 1 . . . 1 nonzero = NaN (HW decides what goes in the nonzero significand)
NaNs propagate: f(NaN) = NaN
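The infinity and NaN encodings can be built directly from their bit patterns and then used in further computations. The sketch below uses Python's struct and math modules; from_bits32 is a name chosen here. Note that plain Python raises ZeroDivisionError for 1.0 / 0 instead of returning infinity, so the values are constructed from bits.

```python
import math
import struct


def from_bits32(word):
    """Build a float from a 32-bit single-precision pattern."""
    return struct.unpack(">f", struct.pack(">I", word))[0]


pos_inf = from_bits32(0x7F800000)   # S = 0, exponent = 255, significand = 0
neg_inf = from_bits32(0xFF800000)   # S = 1, exponent = 255, significand = 0
nan = from_bits32(0x7FC00000)       # exponent = 255, significand nonzero

print(pos_inf > 1e308)                 # True: comparisons with infinity are valid
print(math.isnan(pos_inf - pos_inf))   # True: invalid operation gives NaN
print(math.isnan(nan + 1.0))           # True: NaNs propagate
print(nan == nan)                      # False
```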
"Floating Point numbers are like piles of sand; every time you move one you lose a little sand, but you pick up a little dirt."
How many extra bits?
IEEE: As if computed the result exactly and rounded.
Addition:
   1.xxxxx             1.xxxxx              1.xxxxx
 + 1.xxxxx           + 0.001xxxxx         + 0.01xxxxx
  1x.xxxxy             1.xxxxxyyy          1x.xxxxyyy
 postnormalization    prenormalization     pre and post
Normalized result, but some nonzero digits to the right of the significand -> the number should be rounded
E.g., B = 10, p = 3 (fields: sign, exponent, significand):
  0 2 1.69 =   1.6900 * 10^(2-bias)
- 0 0 7.85 = -  .0785 * 10^(2-bias)
  0 2 1.61 =   1.6115 * 10^(2-bias)
One round digit must be carried to the right of the guard digit so that, after a normalizing left shift, the result can be rounded according to the value of the round digit.
IEEE Standard: four rounding modes:
  round to nearest (default)
  round towards plus infinity
  round towards minus infinity
  round towards 0
Round to nearest:
  round digit < B/2 then truncate
  round digit > B/2 then round up (add 1 to ULP: unit in last place)
  round digit = B/2 then round to nearest even digit
It can be shown that this strategy minimizes the mean error introduced by rounding.
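The four rounding modes can be tried out in decimal (B = 10) with Python's standard decimal module, which offers a matching rounding constant for each. This is an illustration of the modes only, not of IEEE binary hardware; the MODES table and round_to helper are names chosen here.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_CEILING, ROUND_FLOOR, ROUND_DOWN

# Assumed mapping from the slide's mode names to decimal-module constants.
MODES = {
    "nearest": ROUND_HALF_EVEN,
    "toward +infinity": ROUND_CEILING,
    "toward -infinity": ROUND_FLOOR,
    "toward 0": ROUND_DOWN,
}


def round_to(value, mode, places="0.01"):
    """Round a decimal string to the given number of places in the given mode."""
    return Decimal(value).quantize(Decimal(places), rounding=MODES[mode])


print(round_to("1.6115", "nearest"))            # 1.61
print(round_to("1.615", "nearest"))             # 1.62 (tie goes to the even digit)
print(round_to("1.625", "nearest"))             # 1.62 (tie goes to the even digit)
print(round_to("-1.6115", "toward +infinity"))  # -1.61
print(round_to("-1.6115", "toward -infinity"))  # -1.62
print(round_to("-1.6115", "toward 0"))          # -1.61
```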
Additional bit to the right of the round digit to better fine tune rounding
    d0 . d1 d2 d3 . . . dp-1  0 0 0
  + 0  . 0  0  X  . . . X     X X S     ->  X X S
Sticky bit: set to 1 if any 1 bits fall off the end of the round digit
    d0 . d1 d2 d3 . . . dp-1  0 0 0
  - 0  . 0  0  X  . . . X     X X 0     ->  X X 0
    d0 . d1 d2 d3 . . . dp-1  0 0 0
  - 0  . 0  0  X  . . . X     X X 1     ->  generates a borrow
Rounding Summary:
Radix 2 minimizes wobble in precision
Normal operations in +, -, *, / require one carry/borrow bit + one guard digit
One round digit needed for correct rounding
Sticky bit needed when round digit is B/2 for max accuracy
Rounding to nearest has mean error = 0 if a uniform distribution of digits is assumed
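For radix 2, the role of the round and sticky bits can be sketched on an integer significand. The simplified model below treats the first dropped bit as the round bit and the OR of all lower dropped bits as the sticky bit; it assumes at least one bit is dropped and does not renormalize if rounding up carries out of the kept bits. The function name is illustrative.

```python
def round_nearest_even(significand, extra_bits):
    """Drop `extra_bits` (>= 1) low bits, rounding to nearest with ties to even."""
    kept = significand >> extra_bits
    dropped = significand & ((1 << extra_bits) - 1)
    half = 1 << (extra_bits - 1)
    round_bit = 1 if dropped & half else 0
    sticky = 1 if dropped & (half - 1) else 0    # any 1 below the round bit
    # Round up when above the halfway point, or exactly halfway and kept is odd.
    if round_bit and (sticky or (kept & 1)):
        kept += 1
    return kept


print(bin(round_nearest_even(0b1010011, 3)))  # 0b1010: below half, truncate
print(bin(round_nearest_even(0b1010101, 3)))  # 0b1011: sticky set, round up
print(bin(round_nearest_even(0b1010100, 3)))  # 0b1010: tie, already even
print(bin(round_nearest_even(0b1011100, 3)))  # 0b1100: tie, round to even
```

Without the sticky bit, the second and third cases could not be told apart, which is why it is needed when the round digit is exactly B/2.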
[Figure: number line near zero for B = 2, p = 4, marking 0, 2^-bias, 2^(1-bias), and 2^(2-bias); the "denorm gap" lies next to 0, followed by the normal numbers with hidden bit]
The gap between 0 and the next representable number is much larger
than the gaps between nearby representable numbers.
IEEE standard uses denormalized numbers to fill in the gap, making the
distances between numbers near 0 more alike.
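[Figure: the same number line (0, 2^-bias, 2^(1-bias), 2^(2-bias)) with denormalized numbers filling the gap: p-1 bits of precision in the denormalized range, p bits of precision above it; same spacing, half as many values!]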
NOTE: PDP-11 and VAX cannot represent subnormal numbers. These machines underflow to zero instead.
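On a platform whose Python float is an IEEE double (an assumption, though it holds for common CPython builds), the effect of denormalized numbers near zero can be observed directly:

```python
import math
import sys

smallest_normal = sys.float_info.min          # 2^-1022 for an IEEE double
smallest_subnormal = math.ldexp(1.0, -1074)   # 2^-1074, the last step before 0

# Halving the smallest normal number gives a subnormal, not zero.
print(smallest_normal / 2 > 0.0)              # True

# Two distinct numbers near the bottom of the normal range have a
# nonzero, representable difference (gradual underflow).
a = 1.5 * smallest_normal
b = smallest_normal
print(a - b == smallest_normal / 2)           # True

# Below the smallest subnormal, the result finally underflows to zero.
print(smallest_subnormal / 2 == 0.0)          # True
```

A machine that flushes to zero instead would print False for the first two checks.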