Number Representation
How to Represent Negative Numbers?
So far, unsigned numbers.
Obvious solution: define leftmost bit to be sign! 0 => +, 1 => -
Rest of bits can be numerical value of number.
Representation called sign and magnitude.
Shortcomings of sign and magnitude?
[Figure: number line of 5-bit patterns, 00001 ... 01111 on one side and 10000 ... 11111 on the other]
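Another try: complement the bits
[Figure: number line of 5-bit patterns with their decimal labels: 00000 = 0, 00001 = 1, 00010 = 2, ..., 01111 = 15 on the positive side; 11111, 11110, ..., 10001, 10000 on the negative side, with labels running 1, 2, ..., 15, 16]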
0000 ... 0000 0000 0000 0000two = 0ten
0000 ... 0000 0000 0000 0001two = 1ten
0000 ... 0000 0000 0000 0010two = 2ten
. . .
0111 ... 1111 1111 1111 1101two = 2,147,483,645ten
0111 ... 1111 1111 1111 1110two = 2,147,483,646ten
0111 ... 1111 1111 1111 1111two = 2,147,483,647ten
1000 ... 0000 0000 0000 0000two = -2,147,483,648ten
1000 ... 0000 0000 0000 0001two = -2,147,483,647ten
1000 ... 0000 0000 0000 0010two = -2,147,483,646ten
. . .
1111 ... 1111 1111 1111 1101two = -3ten
1111 ... 1111 1111 1111 1110two = -2ten
1111 ... 1111 1111 1111 1111two = -1ten
1111 1111 1111 1111 1111 1111 1111 1100two
= -1x2^31 + 1x2^30 + 1x2^29 + ... + 1x2^2 + 0x2^1 + 0x2^0
= -2^31 + 2^30 + 2^29 + ... + 2^2 + 0 + 0
= -2,147,483,648ten + 2,147,483,644ten
= -4ten
1111 1111 1111 1100two
1111 1111 1111 1111 1111 1111 1111 1100two
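The 16-bit and 32-bit patterns above encode the same value. A minimal Python sketch of that idea follows; the helper names `to_signed` and `sign_extend` are illustrative and not from the slides. It treats the leftmost bit as having negative weight and widens a pattern by copying that bit to the left.

```python
def to_signed(bits: str) -> int:
    """Interpret a bit string (spaces allowed) as a two's complement integer."""
    bits = bits.replace(" ", "")
    value = int(bits, 2)
    # The leftmost bit has weight -2^(n-1) rather than +2^(n-1),
    # which is the same as subtracting 2^n when that bit is set.
    if bits[0] == "1":
        value -= 1 << len(bits)
    return value


def sign_extend(bits: str, width: int) -> str:
    """Widen a two's complement pattern by replicating its sign bit."""
    bits = bits.replace(" ", "")
    return bits[0] * (width - len(bits)) + bits


narrow = "1111 1111 1111 1100"
wide = sign_extend(narrow, 32)
print(to_signed(narrow), wide, to_signed(wide))
```

Both calls to `to_signed` give the same number, which is why copying the sign bit preserves the value.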
a: 0 0 1 1
b: 0 1 0 1
Sum: 1 0 0 0
One-Bit Full Adder (1/3)
Sum = A'·B'·Cin + A'·B·Cin' + A·B'·Cin' + A·B·Cin     (X' denotes the complement of X)
CarryOut = A·B + A·Cin + B·Cin
One-Bit Full Adder (2/3)
[Figure: adder block "+" with inputs A and B and outputs Sum and CarryOut]
One-Bit Full Adder (3/3)
[Figure: four 1-bit full adders (FA); stage i has inputs CarryIn_i, A_i, B_i and outputs Sum_i and CarryOut_i, for i = 0 .. 3]
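The full-adder equations and the four-stage figure can be sketched in Python as follows. This is an illustration under assumptions: the function names are invented here, bit lists are stored least significant bit first, and each stage's CarryOut is assumed to feed the next stage's CarryIn. The sum uses XOR, which is equivalent to the four-term sum-of-products form.

```python
def full_adder(a: int, b: int, cin: int):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin                          # 1 when an odd number of inputs are 1
    cout = (a & b) | (a & cin) | (b & cin)   # 1 when at least two inputs are 1
    return s, cout


def ripple_carry_add(a_bits, b_bits, cin=0):
    """Chain full adders; index 0 of each list is the least significant bit."""
    sums = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)   # carry out feeds the next stage
        sums.append(s)
    return sums, carry


# 0011 + 0101, written least significant bit first
print(ripple_carry_add([1, 1, 0, 0], [1, 0, 1, 0]))
```

The carry has to pass through every stage in turn, which is the delay that carry lookahead tries to shorten.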
A  B  Cout
0  0  0     "kill"
0  1  Cin   "propagate"
1  0  Cin   "propagate"
1  1  1     "generate"
[Figure: four adder stages, each with inputs A and B and outputs S, G and P; the carries between stages are given by the equations below]
P = A + B
G = A·B
C1 = G0 + C0·P0
   = A0·B0 + C0·(A0 + B0)
C2 = G1 + G0·P1 + C0·P0·P1
C3 = G2 + G1·P2 + G0·P1·P2 + C0·P0·P1·P2
C4 = . . .
G = G3 + P3·G2 + P3·P2·G1 + P3·P2·P1·G0
P = P3·P2·P1·P0
Carry Look Ahead: reducing Carry Propagation delay
Sum = A'·B'·Cin + A'·B·Cin' + A·B'·Cin' + A·B·Cin     (X' denotes the complement of X)
Carry Look Ahead: Delays
[Figure: cascaded 4-bit adders]
Cascaded CLA: overcoming fan-in constraint
[Figure: 4-bit adder blocks supplying G and P signals to a lookahead unit with inputs C0, G0, P0]
C1 = G0 + C0·P0
C2 = G1 + G0·P1 + C0·P0·P1
C3 = G2 + G1·P2 + G0·P1·P2 + C0·P0·P1·P2
C4 = . . .
Delay = 3 + 2 + 3 = 8
DelayRC = 15*2 + 3 = 33
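The lookahead equations can be checked with a short Python sketch. The function name `cla_4bit` and the bit ordering (index 0 is the least significant bit) are choices made here, not taken from the slides. The slides leave C4 elided; the sketch assumes the common form C4 = G + P·C0 built from the group signals, and that assumption is marked in the code.

```python
def cla_4bit(a_bits, b_bits, c0=0):
    """4-bit carry-lookahead add; lists are least significant bit first."""
    g = [a & b for a, b in zip(a_bits, b_bits)]   # generate:  G = A·B
    p = [a | b for a, b in zip(a_bits, b_bits)]   # propagate: P = A + B

    # Every carry is computed directly from C0 and the G/P signals.
    c1 = g[0] | (c0 & p[0])
    c2 = g[1] | (g[0] & p[1]) | (c0 & p[0] & p[1])
    c3 = g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2]) | (c0 & p[0] & p[1] & p[2])

    # Group signals for the whole 4-bit block.
    group_g = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    group_p = p[3] & p[2] & p[1] & p[0]
    c4 = group_g | (group_p & c0)   # ASSUMPTION: C4 = G + P·C0 (elided in the slides)

    carries = [c0, c1, c2, c3]
    sums = [a ^ b ^ c for a, b, c in zip(a_bits, b_bits, carries)]
    return sums, c4


def to_bits(x):
    """Hypothetical helper: 4-bit value to a list, least significant bit first."""
    return [(x >> i) & 1 for i in range(4)]


print(cla_4bit(to_bits(0b0011), to_bits(0b0101)))
```

No carry waits on the previous carry here; the price is wider gates, which is the fan-in constraint that cascading addresses.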
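0 1 0 1 (+5) + 0 0 1 0 (+2) = 0 1 1 1 (+7)
0 1 0 1 (+5) + 1 0 1 0 (-6) = 1 1 1 1 (-1)
0 0 1 0 (+2) - 0 1 0 0 (+4):  0 0 1 0 + 1 1 0 0 (-4) = 1 1 1 0 (-2)
1 0 1 1 (-5) + 1 1 1 0 (-2) = (1) 1 0 0 1 (-7)
0 1 1 1 (+7) + 1 1 0 1 (-3) = (1) 0 1 0 0 (+4)
1 1 1 0 (-2) - 1 0 1 1 (-5):  1 1 1 0 + 0 1 0 1 (+5) = (1) 0 0 1 1 (+3)
A leading (1) is the carry out of the leftmost bit; it is not part of the 4-bit result.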
Decimal  Binary    Decimal  2’s Complement
0        0000       0       0000
1        0001      -1       1111
2        0010      -2       1110
3        0011      -3       1101
4        0100      -4       1100
5        0101      -5       1011
6        0110      -6       1010
7        0111      -7       1001
                   -8       1000
Overflow examples (4-bit two's complement):
0 1 1 1 (7)  + 0 0 1 1 (3)  = 1 0 1 0 (-6)    carry into the sign bit = 1, carry out = 0
1 1 0 0 (-4) + 1 0 1 1 (-5) = 0 1 1 1 (7)     carry into the sign bit = 0, carry out = 1
Overflow Detection Logic
[Figure: four 1-bit full adders (FA) with inputs A_i, B_i, CarryIn_i and outputs Result_i, CarryOut_i; the Overflow signal is formed by XORing CarryIn3 with CarryOut3]
X  Y  X XOR Y
0  0  0
0  1  1
1  0  1
1  1  0
0 1 0 1 (+5) + 0 1 0 0 (+4) = 1 0 0 1 (-7?)
0 1 0 1 (+5) + 1 0 1 0 (-6) = 1 1 1 1 (-1)
0 0 1 1 (+3) + 1 1 0 1 (-3) = (1) 0 0 0 0 (0)
0 1 1 1 (+7) + 1 1 0 1 (-3) = (1) 0 1 0 0 (+4)
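The examples above can be reproduced with a small Python sketch. It assumes that overflow is detected as the XOR of the carry into and the carry out of the sign bit, as the detection-logic figure suggests. The function name `add_4bit` is made up for illustration.

```python
def add_4bit(a: int, b: int):
    """Add two 4-bit patterns bit by bit.

    Returns (result_pattern, carry_out, overflow).
    """
    carry = 0
    result = 0
    carry_into_msb = 0
    for i in range(4):
        if i == 3:
            carry_into_msb = carry           # remember CarryIn3
        bit_a = (a >> i) & 1
        bit_b = (b >> i) & 1
        result |= (bit_a ^ bit_b ^ carry) << i
        carry = (bit_a & bit_b) | (bit_a & carry) | (bit_b & carry)
    overflow = carry_into_msb ^ carry        # CarryIn3 XOR CarryOut3
    return result, carry, overflow


for a, b in [(0b0101, 0b0100), (0b0101, 0b1010), (0b0011, 0b1101), (0b0111, 0b1101)]:
    result, carry_out, overflow = add_4bit(a, b)
    print(f"{a:04b} + {b:04b} = {result:04b}  carry_out={carry_out} overflow={overflow}")
```

A carry out alone does not mean overflow: +3 + (-3) and +7 + (-3) both produce a carry out with a correct result, while +5 + +4 overflows without one.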
Multiplicand      1101  (13)
Multiplier        1011  (11)
                  1101
                 1101
                0000
               1101
Product       10001111  (143)
Unsigned Combinational Multiplier
[Figure: 4 x 4 array multiplier; four rows of multiplicand bits A3 A2 A1 A0, one row each for multiplier bits B0, B1, B2, B3, with 0 inputs at the edges, producing product bits P7 ... P0]
How does it work?
[Figure: the same array, showing the rows for B0 ... B3 and the 0 inputs combining into P7 ... P0]
Multiplicand 1101
C  Product  Multiplier
0  0000     1011
0  1101     1011     Add
0  0110     1101     Shift
1  0011     1101     Add
0  1001     1110     Shift
0  1001     1110     NoAdd
0  0100     1111     Shift
1  0001     1111     Add
0  1000     1111     Shift
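The trace above can be reproduced with a short Python sketch of an add-and-shift-right multiplier. The register names (`c`, `high`, `low`) and the function name are assumptions chosen to mirror the C / Product / Multiplier columns; real hardware would differ in detail.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, n: int = 4):
    """Unsigned n-bit multiply; returns (product, trace of (op, C, high, low))."""
    mask = (1 << n) - 1
    c = 0                      # carry out of the n-bit adder
    high = 0                   # upper half of the product register
    low = multiplier & mask    # lower half starts out holding the multiplier
    trace = []
    for _ in range(n):
        if low & 1:            # low multiplier bit is 1: add the multiplicand
            total = high + (multiplicand & mask)
            c = total >> n
            high = total & mask
            trace.append(("Add", c, high, low))
        else:
            trace.append(("NoAdd", c, high, low))
        # Shift C:high:low right by one position.
        low = (low >> 1) | ((high & 1) << (n - 1))
        high = (high >> 1) | (c << (n - 1))
        c = 0
        trace.append(("Shift", c, high, low))
    return (high << n) | low, trace


product, trace = shift_add_multiply(0b1101, 0b1011)
for op, c, high, low in trace:
    print(f"{c}  {high:04b}  {low:04b}  {op}")
print(product)
```

Each multiplier bit is consumed as it is shifted out of the low half, so the product and the multiplier can share one register.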
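[Figure: multiplier datapath; 4-bit Multiplicand register, MUX, 4-bit FA, carry bit C, 8-bit Product register whose lower 4 bits hold the Multiplier, Shift Right, and Control issuing Add/NoAdd]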
Multiplicand       10011  (-13)
Multiplier         01011  (+11)
              1111110011
              111110011
              00000000
              1110011
              000000
Product       1101110001  (-143)
Current Bit  Bit to the Right  Explanation          Example     Op
1            0                 Begins run of 1s     0001111000  sub
1            1                 Middle of run of 1s  0001111000  none
0            1                 End of run of 1s     0001111000  add
0            0                 Middle of run of 0s  0001111000  none
Originally for Speed (when shift was faster than add)
Operation          Multiplicand  Product       next?
0. initial value   0010          0000 0111 0   10 -> sub
1a. P = P - m      1110          + 1110
                                 1110 0111 0   shift P (sign ext)
1b.                0010          1111 0011 1   11 -> nop, shift
2.                 0010          1111 1001 1   11 -> nop, shift
3.                 0010          1111 1100 1   01 -> add
4a.                0010          + 0010
                                 0001 1100 1   shift
4b.                0010          0000 1110 0   done
Operation          Multiplicand  Product       next?
0. initial value   0010          0000 1101 0   10 -> sub
1a. P = P - m      1110          + 1110
                                 1110 1101 0   shift P (sign ext)
1b.                0010          1111 0110 1   01 -> add
                                 + 0010
2a.                              0001 0110 1   shift P
2b.                0010          0000 1011 0   10 -> sub
                                 + 1110
3a.                0010          1110 1011 0   shift
3b.                0010          1111 0101 1   11 -> nop
4a.                              1111 0101 1   shift
4b.                0010          1111 1010 1   done
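The two traces above follow the recoding table (current bit, bit to the right). A minimal Python sketch of that procedure, usually known as Booth's algorithm, is given here under some assumptions: the product register is modeled as a single (2n+1)-bit integer, the name `booth_multiply` is invented, and the multiplicand must not be the most negative n-bit value, because its negation does not fit in n bits.

```python
def booth_multiply(multiplicand: int, multiplier: int, n: int = 4) -> int:
    """Signed n-bit multiply following the sub / add / none table.

    Assumes multiplicand != -2**(n-1).
    """
    width = 2 * n + 1                        # product register plus one extra bit
    mask = (1 << width) - 1
    field = (1 << n) - 1
    add_m = (multiplicand & field) << (n + 1)    # +m aligned with the upper half
    sub_m = (-multiplicand & field) << (n + 1)   # -m aligned with the upper half
    p = (multiplier & field) << 1            # upper half 0, multiplier, extra bit 0
    for _ in range(n):
        pair = p & 0b11                      # (current bit, bit to the right)
        if pair == 0b10:                     # begins a run of 1s: subtract
            p = (p + sub_m) & mask
        elif pair == 0b01:                   # ends a run of 1s: add
            p = (p + add_m) & mask
        sign = p >> (width - 1)
        p = (p >> 1) | (sign << (width - 1)) # arithmetic shift right (sign ext)
    product = p >> 1                         # drop the extra bit
    if product >> (2 * n - 1):               # reinterpret 2n bits as signed
        product -= 1 << (2 * n)
    return product


print(booth_multiply(2, 7), booth_multiply(2, -3))
```

The two calls correspond to the multiplier values 0111 and 1101 used in the traces.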
(read yourself!)
                  1001  Quotient
Divisor 1000 ) 1001010  Dividend
              -1000
                  10
                  101
                  1010
                 -1000
                    10  Remainder (or Modulo result)
Binary => 1 * divisor or 0 * divisor
[Figure: divider datapath; 33-bit divisor register, 33-bit FA, 65-bit Remainder register whose lower half holds the Quotient, Shift Left, and Control with Q Setting and Sign-bit Checking]
             0010  Quotient
Divisor 11 ) 1000  Dividend
             -11
               10  Remainder

Restoring Division Algorithm
            Remainder  Quotient
Initially   00000      1000
Shift       00001      000_
Sub(11)     11101
Set q0      11110
Restore     00001      0000
Shift       00010      000_
Sub(11)     11101
Set q0      11111
Restore     00010      0000
Shift       00100      000_
Sub(11)     11101
Set q0      00001      0001
Shift       00010      001_
Sub(11)     11101
Set q0      11111
Restore     00010      0010
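The restoring-division steps above can be sketched in Python as follows. This is an illustration only: the function name is invented, the operands are unsigned, and a negative intermediate remainder stands in for the hardware's sign-bit check.

```python
def restoring_divide(dividend: int, divisor: int, n: int = 4):
    """Unsigned restoring division of an n-bit dividend; returns (quotient, remainder)."""
    mask = (1 << n) - 1
    remainder = 0
    quotient = dividend & mask
    for _ in range(n):
        # Shift the remainder:quotient pair left by one bit.
        remainder = (remainder << 1) | ((quotient >> (n - 1)) & 1)
        quotient = (quotient << 1) & mask
        remainder -= divisor                 # trial subtraction
        if remainder < 0:
            remainder += divisor             # restore; q0 stays 0
        else:
            quotient |= 1                    # subtraction succeeded; q0 = 1
    return quotient, remainder


print(restoring_divide(0b1000, 0b11))           # the 8 / 3 example
print(restoring_divide(0b1001010, 0b1000, n=7)) # the long-division example
```

Each iteration produces exactly one quotient bit, which is why an n-bit dividend needs n shift/subtract steps.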
Unsigned N-bit integers: 0 to 2^N - 1
Two's complement N-bit integers: -2^(N-1) to 2^(N-1) - 1
Scientific Notation Review
6.02 x 10^23     (mantissa: 6.02; radix (base): 10; the "." is the decimal point)
Scientific Notation for Binary Numbers
1.0two x 2^-1    (Mantissa: 1.0two; radix (base): 2; the "." is the “binary point”)
Floating Point Representation (1/2)
Bit:    31     30 ... 23   22 ... 0
Field:  S      Exponent    Significand
Width:  1 bit  8 bits      23 bits

Double Precision Fl. Pt. Representation
Bit:    31     30 ... 20   19 ... 0
Field:  S      Exponent    Significand
Width:  1 bit  11 bits     20 bits
Significand (cont’d): 32 bits

IEEE 754 Floating Point Standard (3/4)
      S  Exponent   Significand
1/2   0  0111 1110  000 0000 0000 0000 0000 0000
2     0  1000 0000  000 0000 0000 0000 0000 0000
(above: exponent stored with a bias)
1/2   0  1111 1111  000 0000 0000 0000 0000 0000
2     0  0000 0001  000 0000 0000 0000 0000 0000
(above: the same exponents written in two's complement)

Bit:    31     30 ... 23   22 ... 0
Field:  S      Exponent    Significand
Width:  1 bit  8 bits      23 bits
IEEE 754 Floating Point Standard (4/4)
Exponent  Significand  Object
0         0            0
0         nonzero      ???
1-254     anything     +/- fl. pt. #
255       0            +/- infinity
255       nonzero      NaN
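The table above can be exercised with a short Python sketch that splits a single-precision word into its three fields and classifies it. It relies on the standard `struct` module to obtain the 32-bit pattern. The function name and the label "denorm" for the "???" row are assumptions made here (the slides use "denorm" later for numbers near zero).

```python
import struct


def classify_float32(x: float):
    """Return (sign, exponent, significand, kind) for x stored in single precision."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))   # raw 32-bit pattern
    sign = word >> 31                 # bit 31
    exponent = (word >> 23) & 0xFF    # bits 30..23
    significand = word & 0x7FFFFF     # bits 22..0
    if exponent == 0:
        kind = "zero" if significand == 0 else "denorm"   # ASSUMPTION: name for "???"
    elif exponent == 255:
        kind = "infinity" if significand == 0 else "NaN"
    else:
        kind = "normal"               # exponent field 1..254
    return sign, exponent, significand, kind


for value in (0.5, 2.0, -0.0, float("inf"), float("nan"), 2.0 ** -149):
    print(value, classify_float32(value))
```

For 0.5 and 2.0 the exponent fields come out as 126 and 128, matching the patterns 0111 1110 and 1000 0000.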
+/- infinity:  S 1 . . . 1 0 . . . 0
result of operation overflows, i.e., is larger than the largest number that can be represented
overflow is not the same as divide by zero (raises a different exception)
It may make sense to do further computations with infinity
e.g., X/0 > Y may be a valid comparison
NaN:  S 1 . . . 1 nonzero
Not a number, but not infinity (e.g., sqrt(-4))
invalid operation exception (unless operation is = or ≠)
HW decides what goes in the nonzero significand field
NaNs propagate: f(NaN) = NaN
"Floating Point numbers are like piles of sand; every time you move one you lose a little sand, but you pick up a little dirt."
How many extra bits?
IEEE: As if computed the result exactly and rounded.
Addition:
    1.xxxxx              1.xxxxx              1.xxxxx
  + 1.xxxxx            + 0.001xxxxx         + 0.01xxxxx
   1x.xxxxy              1.xxxxxyyy          1x.xxxxyyy
post-normalization   pre-normalization    pre and post
normalized result, but some nonzero digits to the right of the significand => the number should be rounded
E.g., B = 10, p = 3:
    0 2 1.69   =   1.6900 * 10^(2-bias)
  - 0 0 7.85   = -  .0785 * 10^(2-bias)
    0 2 1.61   =   1.6115 * 10^(2-bias)
(each operand and the result is written as sign, exponent, significand)
one round digit must be carried to the right of the guard digit so that
after a normalizing left shift, the result can be rounded, according
to the value of the round digit
IEEE Standard: four rounding modes:
  round to nearest (default)
  round towards plus infinity
  round towards minus infinity
  round towards 0
round to nearest:
  round digit < B/2 then truncate
  round digit > B/2 then round up (add 1 to ULP: unit in last place)
  round digit = B/2 then round to nearest even digit
it can be shown that this strategy minimizes the mean error introduced by rounding
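The four modes can be tried out in decimal (B = 10) with Python's standard `decimal` module. The mapping from the slide's mode names to the module's constants and the helper name `round_to_places` are choices made for this sketch, not part of the standard's text.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_CEILING, ROUND_FLOOR, ROUND_DOWN

# Assumed mapping from the mode names above to decimal-module constants.
MODES = {
    "nearest": ROUND_HALF_EVEN,          # ties go to the even digit
    "toward +infinity": ROUND_CEILING,
    "toward -infinity": ROUND_FLOOR,
    "toward 0": ROUND_DOWN,
}


def round_to_places(text: str, mode: str, places: int = 2) -> Decimal:
    """Round a decimal literal to `places` fractional digits in the given mode."""
    quantum = Decimal(1).scaleb(-places)     # e.g. 0.01 for places=2
    return Decimal(text).quantize(quantum, rounding=MODES[mode])


for mode in MODES:
    print(mode, round_to_places("1.615", mode), round_to_places("-1.615", mode))
```

With round to nearest, 1.615 and 1.625 both round to 1.62: in each case the tie goes to the neighbor with an even last digit.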
Additional bit to the right of the round digit to better fine tune rounding
    d0 . d1 d2 d3 . . . dp-1  0 0 0
  + 0 . 0 0 X . . . X  X X S
Sticky bit: set to 1 if any 1 bits fall off the end of the round digit
[Figure: the same alignment repeated with S = 1 and with S = 0, for a subtraction; the case marked "generates a borrow" shows how the sticky bit affects the result]
Rounding Summary:
Radix 2 minimizes wobble in precision
Normal operations in +, -, *, / require one carry/borrow bit + one guard digit
One round digit needed for correct rounding
Sticky bit needed when round digit is B/2 for max accuracy
Rounding to nearest has mean error = 0 if a uniform distribution of digits is assumed
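[Figure: number line with marks at 0, 2^(-bias), 2^(1-bias) and 2^(2-bias); the interval next to 0 is labeled "denorm" and "gap", and normal numbers with hidden bit lie to its right; drawn for B = 2, p = 4]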
The gap between 0 and the next representable number is much larger
than the gaps between nearby representable numbers.
IEEE standard uses denormalized numbers to fill in the gap, making the
distances between numbers near 0 more alike.
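The effect of denormalized numbers can be observed from Python. The sketch assumes the platform's `float` is an IEEE 754 double and that `math.ulp` is available (Python 3.9 or later); the variable names are chosen here for illustration.

```python
import math
import sys

smallest_normal = sys.float_info.min   # smallest positive normalized double
smallest_denorm = math.ulp(0.0)        # smallest positive denormalized double

# Halving the smallest normal number does not flush to zero; the result
# lands in the denormalized range instead.
half = smallest_normal / 2
print(smallest_normal, half, half > 0.0)

# The spacing just above the smallest normal number equals the spacing of
# the denormalized numbers, so the gap next to zero is filled evenly.
print(math.ulp(smallest_normal) == smallest_denorm)

# Below the smallest denormalized number, results finally underflow to zero.
print(smallest_denorm / 2)
```

Without denormalized numbers the first result would be zero.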
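[Figure: number line with marks at 0, 2^(-bias), 2^(1-bias) and 2^(2-bias), with denormalized numbers filling the interval next to 0; labels: "p-1 bits of precision", "p bits of precision", "same spacing, half as many values!"]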
NOTE: PDP-11, VAX cannot represent subnormal numbers. These
machines underflow to zero instead.