1 / 73

Chapter 3

Chapter 3. Arithmetic for Computers. Arithmetic for Computers. §3.1 Introduction. Operations on integers Addition and subtraction Multiplication and division Dealing with overflow Floating-point real numbers Representation and operations. Integer Addition. Example: 7 + 6.

asasia
Download Presentation

Chapter 3

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Chapter 3 Arithmetic for Computers

  2. Arithmetic for Computers §3.1 Introduction • Operations on integers • Addition and subtraction • Multiplication and division • Dealing with overflow • Floating-point real numbers • Representation and operations Chapter 3 — Arithmetic for Computers — 2

  3. Integer Addition • Example: 7 + 6 §3.2 Addition and Subtraction • Overflow if result out of range • Adding +ve and –ve operands, no overflow • Adding two +ve operands • Overflow if result sign is 1 • Adding two –ve operands • Overflow if result sign is 0 Chapter 3 — Arithmetic for Computers — 3

  4. Integer Subtraction • Add negation of second operand • Example: 7 – 6 = 7 + (–6) +7: 0000 0000 … 0000 0111–6: 1111 1111 … 1111 1010+1: 0000 0000 … 0000 0001 • Overflow if result out of range • Subtracting two +ve or two –ve operands, no overflow • Subtracting +ve from –ve operand • Overflow if result sign is 0 • Subtracting –ve from +ve operand • Overflow if result sign is 1 Chapter 3 — Arithmetic for Computers — 4

  5. Dealing with Overflow • Some languages (e.g., C) ignore overflow • Use MIPS addu, addui, subu instructions • Other languages (e.g., Ada, Fortran) require raising an exception • Use MIPS add, addi, sub instructions • On overflow, invoke exception handler • Save PC in exception program counter (EPC) register • Jump to predefined handler address • mfc0 (move from coprocessor reg) instruction can retrieve EPC value, to return after corrective action Chapter 3 — Arithmetic for Computers — 5

  6. CarryIn0 A0 1-bit ALU Result0 B0 CarryOut0 CarryIn1 A1 1-bit ALU Result1 B1 CarryOut1 CarryIn2 A2 1-bit ALU Result2 B2 CarryOut2 CarryIn3 A3 1-bit ALU Result3 B3 CarryOut3 But What about Performance? • Critical Path of n-bit Rippled-carry adder is n*CP Design Trick: Throw hardware at it Chapter 3 — Arithmetic for Computers — 6

  7. A0 S G B0 P A1 S G B1 P A2 S G B2 P A3 S G B3 P Carry Look Ahead (Design trick: peek) C0 = Cin A B C-out 0 0 0 “kill” 0 1 C-in “propagate” 1 0 C-in “propagate” 1 1 1 “generate” C1 = G0 + C0  P0 G = A and B P = A or B C2 = G1 + G0 P1 + C0  P0  P1 C3 = G2 + G1 P2 + G0  P1  P2 + C0  P0  P1  P2 G P Chapter 3 — Arithmetic for Computers — 7 C4 = . . .

  8. Plumbing as Carry Lookahead Analogy Chapter 3 — Arithmetic for Computers — 8

  9. C L A 4-bit Adder 4-bit Adder 4-bit Adder Cascaded Carry Look-ahead (16-bit): Abstraction C0 G0 P0 C1 = G0 + C0  P0 C2 = G1 + G0 P1 + C0  P0  P1 C3 = G2 + G1 P2 + G0  P1  P2 + C0  P0  P1  P2 G P Chapter 3 — Arithmetic for Computers — 9 C4 = . . .

  10. 2nd level Carry, Propagate as Plumbing Chapter 3 — Arithmetic for Computers — 10

  11. Arithmetic for Multimedia • Graphics and media processing operates on vectors of 8-bit and 16-bit data • Use 64-bit adder, with partitioned carry chain • Operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors • SIMD (single-instruction, multiple-data) • Saturating operations • On overflow, result is largest representable value • c.f. 2s-complement modulo arithmetic • E.g., clipping in audio, saturation in video Chapter 3 — Arithmetic for Computers — 11

  12. 1000 × 1001 1000 0000 0000 1000 1001000 Multiplication §3.3 Multiplication • Start with long-multiplication approach multiplicand multiplier product Length of product is the sum of operand lengths Chapter 3 — Arithmetic for Computers — 12

  13. Multiplication Hardware Initially 0 Chapter 3 — Arithmetic for Computers — 13

  14. Optimized Multiplier • Perform steps in parallel: add/shift • One cycle per partial-product addition • That’s ok, if frequency of multiplications is low Chapter 3 — Arithmetic for Computers — 14

  15. 4 Versions of Multiply • Successive refinements • See next Chapter 3 — Arithmetic for Computers — 15

  16. 0 0 0 0 A3 A2 A1 A0 B0 A3 A2 A1 A0 B1 A3 A2 A1 A0 B2 A3 A2 A1 A0 B3 P7 P6 P5 P4 P3 P2 P1 P0 Unsigned Combinational Multiplier • Stage i accumulates A * 2 i if Bi == 1 Chapter 3 — Arithmetic for Computers — 16

  17. A3 A2 A1 A0 A3 A2 A1 A0 A3 A2 A1 A0 A3 A2 A1 A0 How does it work? • at each stage shift A left ( x 2) • use next bit of B to determine whether to add in shifted multiplicand • accumulate 2n bit partial product at each stage 0 0 0 0 0 0 0 B0 B1 B2 B3 P7 P6 P5 P4 P3 P2 P1 P0 Chapter 3 — Arithmetic for Computers — 17

  18. Unsigned shift-add multiplier (version 1) • 64-bit Multiplicand reg, 64-bit ALU, 64-bit Product reg, 32-bit multiplier reg Shift Left Multiplicand 64 bits Multiplier Shift Right 64-bit ALU 32 bits Write Product Control 64 bits Multiplier = datapath + control Chapter 3 — Arithmetic for Computers — 18

  19. 1. Test Multiplier0 Multiply Algorithm Version 1 Start Multiplier0 = 1 • Product Multiplier Multiplicand 0000 0000 0011 0000 0010 • 0000 0010 0001 0000 0100 • 0000 0110 0000 0000 1000 • 0000 0110 Multiplier0 = 0 1a. Add multiplicand to product & place the result in Product register 2. Shift the Multiplicand register left 1 bit. 3. Shift the Multiplier register right 1 bit. 32nd repetition? No: < 32 repetitions Yes: 32 repetitions Done Chapter 3 — Arithmetic for Computers — 19

  20. Observations on Multiply Version 1 • 1 clock per cycle =>  100 clocks per multiply • Ratio of multiply to add 5:1 to 100:1 • 1/2 bits in multiplicand always 0=> 64-bit adder is wasted • 0’s inserted in right of multiplicand as shifted=> least significant bits of product never changed once formed • Instead of shifting multiplicand to left, shift product to right? Chapter 3 — Arithmetic for Computers — 20

  21. MULTIPLY HARDWARE Version 2 • 32-bit Multiplicand reg, 32 -bit ALU, 64-bit Product reg, 32-bit Multiplier reg Multiplicand 32 bits Multiplier Shift Right 32-bit ALU 32 bits Shift Right Product Control Write 64 bits Chapter 3 — Arithmetic for Computers — 21

  22. 0 0 0 0 A3 A2 A1 A0 B0 A3 A2 A1 A0 B1 A3 A2 A1 A0 B2 A3 A2 A1 A0 B3 P7 P6 P5 P4 P3 P2 P1 P0 How to think of this? Remember original combinational multiplier: Chapter 3 — Arithmetic for Computers — 22

  23. A3 A2 A1 A0 A3 A2 A1 A0 A3 A2 A1 A0 A3 A2 A1 A0 Simply warp to let product move right... 0 0 0 0 • Multiplicand stay’s still and product moves right B0 B1 B2 B3 P7 P6 P5 P4 P3 P2 P1 P0 Chapter 3 — Arithmetic for Computers — 23

  24. Start Multiplier0 = 1 Multiplier0 = 0 1. Test Multiplier0 1a. Add multiplicand to the left half ofproduct & place the result in the left half ofProduct register 2. Shift the Product register right 1 bit. 3. Shift the Multiplier register right 1 bit. 32nd repetition? No: < 32 repetitions Yes: 32 repetitions Done Multiply Algorithm Version 2 Product Multiplier Multiplicand 0000 0000 0011 0010 1: 0010 0000 0011 0010 2: 0001 0000 0011 0010 3: 0001 0000 0001 0010 1: 0011 0000 0001 0010 2: 0001 1000 0001 0010 3: 0001 1000 0000 0010 1: 0001 1000 0000 0010 2: 0000 1100 0000 0010 3: 0000 1100 0000 0010 1: 0000 1100 0000 0010 2: 0000 0110 0000 0010 3: 0000 0110 0000 0010 0000 0110 0000 0010 Chapter 3 — Arithmetic for Computers — 24

  25. Start Multiplier0 = 1 Multiplier0 = 0 1. Test Multiplier0 1a. Add multiplicand to the left half ofproduct & place the result in the left half ofProduct register 2. Shift the Product register right 1 bit. 3. Shift the Multiplier register right 1 bit. 32nd repetition? No: < 32 repetitions Yes: 32 repetitions Done Still more wasted space! Product Multiplier Multiplicand 0000 0000 0011 0010 1: 0010 0000 0011 0010 2: 0001 0000 0011 0010 3: 0001 00000001 0010 1: 0011 00000001 0010 2: 0001 1000 0001 0010 3: 0001 10000000 0010 1: 0001 10000000 0010 2: 0000 11000000 0010 3: 0000 11000000 0010 1: 0000 11000000 0010 2: 0000 0110 0000 0010 3: 0000 0110 0000 0010 0000 0110 0000 0010 Chapter 3 — Arithmetic for Computers — 25

  26. Observations on Multiply Version 2 • Product register wastes space that exactly matches size of multiplier=> combine Multiplier register and Product register Chapter 3 — Arithmetic for Computers — 26

  27. MULTIPLY HARDWARE Version 3 • 32-bit Multiplicand reg, 32 -bit ALU, 64-bit Product reg, (0-bit Multiplier reg) Multiplicand 32 bits 32-bit ALU Shift Right Product (Multiplier) Control Write 64 bits Chapter 3 — Arithmetic for Computers — 27

  28. 1. Test Product0 1a. Add multiplicand to the left half of product & place the result in the left half of Product register Multiply Algorithm Version 3 Start Product0 = 1 Multiplicand Product0010 0000 0011 Product0 = 0 2. Shift the Product register right 1 bit. 32nd repetition? No: < 32 repetitions Yes: 32 repetitions Done Chapter 3 — Arithmetic for Computers — 28

  29. Observations on Multiply Version 3 • 2 steps per bit because Multiplier & Product combined • MIPS registers Hi and Lo are left and right half of Product • Gives us MIPS instruction MultU • How can you make it faster? • What about signed multiplication? • easiest solution is to make both positive & remember whether tocomplement product when done (leave out the sign bit, run for 31 steps) • apply definition of 2’s complement • need to sign-extend partial products and subtract at the end • Booth’s Algorithm is elegant way to multiply signed numbers using same hardware as before and save cycles • can handle multiple bits at a time Chapter 3 — Arithmetic for Computers — 29

  30. Motivation for Booth’s Algorithm • Example 2 x 6 = 0010 x 0110: 0010 x 0110 + 0000 shift (0 in multiplier) + 0010 add (1 in multiplier) + 0010 add (1 in multiplier) + 0000 shift (0 in multiplier) 00001100 • ALU with add or subtract gets same result in more than one way: 6 = – 2 + 8 0110 = – 00010 + 01000 = 11110 + 01000 • For example • 0010 x 0110 0000 shift (0 in multiplier) – 0010 sub (first 1 in multpl.) .0000 shift (mid string of 1s) . + 0010 add (prior step had last1) 00001100 Chapter 3 — Arithmetic for Computers — 30

  31. –1 + 10000 01111 Booth’s Algorithm middle of run end of run beginning of run Current Bit Bit to the Right Explanation Example Op 1 0 Begins run of 1s 0001111000 sub 1 1 Middle of run of 1s 0001111000 none 0 1 End of run of 1s 0001111000 add 0 0 Middle of run of 0s 0001111000 none Originally for Speed (when shift was faster than add) • Replace a string of 1s in multiplier with an initial subtract when we first see a one and then later add for the bit after the last one 0 1 1 1 1 0 Chapter 3 — Arithmetic for Computers — 31

  32. Booths Example (2 x 7) Operation Multiplicand Product next? 0. initial value 0010 0000 01110 10 -> sub 1a. P = P - m 1110 + 1110 1110 01110 shift P (sign ext) 1b. 0010 1111 00111 11 -> nop, shift 2. 0010 1111 10011 11 -> nop, shift 3. 0010 1111 11001 01 -> add 4a. 0010 + 0010 0001 11001 shift 4b. 0010 0000 1110 0 done Chapter 3 — Arithmetic for Computers — 32

  33. Booths Example (2 x -3) Operation Multiplicand Product next? 0. initial value 0010 0000 11010 10 -> sub 1a. P = P - m 1110 + 1110 1110 11010 shift P (sign ext) 1b. 0010 1111 01101 01 -> add + 0010 2a. 0001 01101 shift P 2b. 0010 0000 10110 10 -> sub + 1110 3a. 0010 1110 10110 shift 3b. 0010 1111 0101 1 11 -> nop 4a 1111 0101 1 shift 4b. 0010 1111 10101 done Chapter 3 — Arithmetic for Computers — 33

  34. Faster Multiplier • Uses multiple adders • Cost/performance tradeoff • Can be pipelined • Several multiplication performed in parallel Chapter 3 — Arithmetic for Computers — 34

  35. MIPS Multiplication • Two 32-bit registers for product • HI: most-significant 32 bits • LO: least-significant 32-bits • Instructions • mult rs, rt / multu rs, rt • 64-bit product in HI/LO • mfhi rd / mflo rd • Move from HI/LO to rd • Can test HI value to see if product overflows 32 bits • mul rd, rs, rt • Least-significant 32 bits of product –> rd Chapter 3 — Arithmetic for Computers — 35

  36. Division §3.4 Division • Check for 0 divisor • Long division approach • If divisor ≤ dividend bits • 1 bit in quotient, subtract • Otherwise • 0 bit in quotient, bring down next dividend bit • Restoring division • Do the subtract, and if remainder goes < 0, add divisor back • Signed division • Divide using absolute values • Adjust sign of quotient and remainder as required quotient dividend 1001 1000 1001010 -1000 10 101 1010 -1000 10 divisor remainder n-bit operands yield n-bitquotient and remainder Chapter 3 — Arithmetic for Computers — 36

  37. Division Hardware Initially divisor in left half Initially dividend Chapter 3 — Arithmetic for Computers — 37

  38. Optimized Divider • One cycle per partial-remainder subtraction • Looks a lot like a multiplier! • Same hardware can be used for both Chapter 3 — Arithmetic for Computers — 38

  39. Faster Division • Can’t use parallel hardware as in multiplier • Subtraction is conditional on sign of remainder • Faster dividers (e.g. SRT devision) generate multiple quotient bits per step • Still require multiple steps Chapter 3 — Arithmetic for Computers — 39

  40. MIPS Division • Use HI/LO registers for result • HI: 32-bit remainder • LO: 32-bit quotient • Instructions • div rs, rt / divu rs, rt • No overflow or divide-by-0 checking • Software must perform checks if required • Use mfhi, mflo to access result Chapter 3 — Arithmetic for Computers — 40

  41. Floating Point §3.5 Floating Point • Representation for non-integral numbers • Including very small and very large numbers • Like scientific notation • –2.34 × 1056 • +0.002 × 10–4 • +987.02 × 109 • In binary • ±1.xxxxxxx2 × 2yyyy • Types float and double in C normalized not normalized Chapter 3 — Arithmetic for Computers — 41

  42. Floating Point Standard • Defined by IEEE Std 754-1985 • Developed in response to divergence of representations • Portability issues for scientific code • Now almost universally adopted • Two representations • Single precision (32-bit) • Double precision (64-bit) Chapter 3 — Arithmetic for Computers — 42

  43. IEEE Floating-Point Format • S: sign bit (0  non-negative, 1  negative) • Normalize significand: 1.0 ≤ |significand| < 2.0 • Always has a leading pre-binary-point 1 bit, so no need to represent it explicitly (hidden bit) • Significand is Fraction with the “1.” restored • Exponent: excess representation: actual exponent + Bias • Ensures exponent is unsigned • Single: Bias = 127; Double: Bias = 1203 single: 8 bitsdouble: 11 bits single: 23 bitsdouble: 52 bits S Exponent Fraction Chapter 3 — Arithmetic for Computers — 43

  44. Single-Precision Range • Exponents 00000000 and 11111111 reserved • Smallest value • Exponent: 00000001 actual exponent = 1 – 127 = –126 • Fraction: 000…00 significand = 1.0 • ±1.0 × 2–126 ≈ ±1.2 × 10–38 • Largest value • exponent: 11111110 actual exponent = 254 – 127 = +127 • Fraction: 111…11 significand ≈ 2.0 • ±2.0 × 2+127 ≈ ±3.4 × 10+38 Chapter 3 — Arithmetic for Computers — 44

  45. Double-Precision Range • Exponents 0000…00 and 1111…11 reserved • Smallest value • Exponent: 00000000001 actual exponent = 1 – 1023 = –1022 • Fraction: 000…00 significand = 1.0 • ±1.0 × 2–1022 ≈ ±2.2 × 10–308 • Largest value • Exponent: 11111111110 actual exponent = 2046 – 1023 = +1023 • Fraction: 111…11 significand ≈ 2.0 • ±2.0 × 2+1023 ≈ ±1.8 × 10+308 Chapter 3 — Arithmetic for Computers — 45

  46. Floating-Point Precision • Relative precision • all fraction bits are significant • Single: approx 2–23 • Equivalent to 23 × log102 ≈ 23 × 0.3 ≈ 6 decimal digits of precision • Double: approx 2–52 • Equivalent to 52 × log102 ≈ 52 × 0.3 ≈ 16 decimal digits of precision Chapter 3 — Arithmetic for Computers — 46

  47. Floating-Point Example • Represent –0.75 • –0.75 = (–1)1 × 1.12 × 2–1 • S = 1 • Fraction = 1000…002 • Exponent = –1 + Bias • Single: –1 + 127 = 126 = 011111102 • Double: –1 + 1023 = 1022 = 011111111102 • Single: 1011111101000…00 • Double: 1011111111101000…00 Chapter 3 — Arithmetic for Computers — 47

  48. Floating-Point Example • What number is represented by the single-precision float 11000000101000…00 • S = 1 • Fraction = 01000…002 • Exponent = 100000012 = 129 • x = (–1)1 × (1 + 012) × 2(129 – 127) = (–1) × 1.25 × 22 = –5.0 Chapter 3 — Arithmetic for Computers — 48

  49. Floating-Point Addition • Consider a 4-digit decimal example • 9.999 × 101 + 1.610 × 10–1 • 1. Align decimal points • Shift number with smaller exponent • 9.999 × 101 + 0.016 × 101 • 2. Add significands • 9.999 × 101 + 0.016 × 101 = 10.015 × 101 • 3. Normalize result & check for over/underflow • 1.0015 × 102 • 4. Round and renormalize if necessary • 1.002 × 102 Chapter 3 — Arithmetic for Computers — 51

  50. Floating-Point Addition • Now consider a 4-digit binary example • 1.0002 × 2–1 + –1.1102 × 2–2 (0.5 + –0.4375) • 1. Align binary points • Shift number with smaller exponent • 1.0002 × 2–1 + –0.1112 × 2–1 • 2. Add significands • 1.0002 × 2–1 + –0.1112 × 2–1 = 0.0012 × 2–1 • 3. Normalize result & check for over/underflow • 1.0002 × 2–4, with no over/underflow • 4. Round and renormalize if necessary • 1.0002 × 2–4 (no change) = 0.0625 Chapter 3 — Arithmetic for Computers — 52

More Related