Floating Point Representations


  1. Floating Point Representations • CDA 3101 Discussion Session 02

  2. Question 1 • Convert the binary number $(1010\,0100\,1001\,0010\,0100\,1001\,0010\,0100)_2$ to decimal, interpreting the bits as (a) an unsigned integer, (b) a 2's complement integer, and (c) a single-precision floating-point number.

  3. Question 1.1 • Converting binary (unsigned) to decimal: $(1010\,0100\,1001\,0010\,0100\,1001\,0010\,0100)_2 = 1 \cdot 2^{31} + 1 \cdot 2^{29} + \dots + 1 \cdot 2^{8} + 1 \cdot 2^{5} + 1 \cdot 2^{2} = 2761050404$

  4. Question 1.2 • Converting binary (2's complement) to decimal: $(1010\,0100\,1001\,0010\,0100\,1001\,0010\,0100)_2 = -1 \cdot 2^{31} + 1 \cdot 2^{29} + \dots + 1 \cdot 2^{8} + 1 \cdot 2^{5} + 1 \cdot 2^{2} = -1533916892$

  5. Question 1.3 • Converting binary (single-precision FP) to decimal. Field layout: S (1 bit) | Biased Exponent (8 bits) | Fraction (23 bits)
     Bits: 1 | 01001001 | 00100100100100100100100
     Sign bit: 1
     Exponent: $01001001_2 = 73$
     Fraction: $00100100100100100100100_2 = 2^{-3} + 2^{-6} + \dots + 2^{-15} + 2^{-18} + 2^{-21} \approx 0.1428570747$
     Value: $(-1)^{S} \cdot (1.\text{Fraction}) \cdot 2^{\text{Exponent} - 127} = (-1)^{1} \cdot 1.1428570747 \cdot 2^{73 - 127} = -1.1428570747 \cdot 2^{-54} \approx -6.344131191 \times 10^{-17}$
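
The three readings of Question 1 can be cross-checked with a short Python sketch. This is only an illustration, not part of the original slides: the variable names are made up here, and the manual decode handles the normalized case only. It keeps the fraction exact instead of rounding it to a decimal first, which is why its last digits can differ slightly from a hand calculation.

```python
import struct

# The 32-bit pattern from Question 1, with the spacing removed.
BITS = "1010 0100 1001 0010 0100 1001 0010 0100".replace(" ", "")

# (a) Unsigned: plain base-2 conversion.
unsigned = int(BITS, 2)

# (b) 2's complement: bit 31 has weight -2**31, so subtract 2**32 if it is set.
signed = unsigned - (1 << 32) if BITS[0] == "1" else unsigned

# (c) Single precision, decoded by hand (normalized numbers only).
sign = int(BITS[0])
biased_exp = int(BITS[1:9], 2)
fraction = int(BITS[9:], 2) / 2**23          # exact: 23 bits fit in a double
manual = (-1) ** sign * (1 + fraction) * 2.0 ** (biased_exp - 127)

# Cross-check: reinterpret the same four bytes as a big-endian float.
via_struct = struct.unpack(">f", unsigned.to_bytes(4, "big"))[0]

print(unsigned)      # 2761050404
print(signed)        # -1533916892
print(biased_exp)    # 73
print(manual, via_struct)
```

The `struct` round trip is a useful sanity check because it relies on the machine's own IEEE 754 decoding rather than on the formula typed above.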

  6. Question 2 • Show the IEEE 754 binary representation of the floating-point number $0.1_{10}$ in single precision and in double precision.

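  7. Question 2.1 • Converting $0.1_{10}$ to single-precision FP
     Step 1: Convert the fraction 0.1 to binary by repeated multiplication by 2, recording the integer part at each step: 0.1*2 = 0.2, 0.2*2 = 0.4, 0.4*2 = 0.8, 0.8*2 = 1.6, 0.6*2 = 1.2, 0.2*2 = 0.4, 0.4*2 = 0.8, 0.8*2 = 1.6, 0.6*2 = 1.2, … This gives $0.000110011\ldots_2 = 1.10011\ldots_2 \cdot 2^{-4}$.
     Step 2: Express in single-precision format. The value is $(-1)^{S} \cdot (1.\text{Fraction}) \cdot 2^{\text{Exponent} - 127}$, so the stored (biased) exponent is $-4 + 127 = 123 = 01111011_2$ and $0.1 \approx (-1)^{0} \cdot 1.10011001100110011001100_2 \cdot 2^{123 - 127}$.
     Result (fraction truncated to 23 bits): 0 01111011 10011001100110011001100
     Under the default round-to-nearest mode the discarded bits (1100…) round the fraction up, so a conforming implementation stores 0 01111011 10011001100110011001101.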

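  8. Question 2.2 • Converting $0.1_{10}$ to double-precision FP
     Step 1: Convert the fraction 0.1 to binary by repeated multiplication by 2, recording the integer part at each step: 0.1*2 = 0.2, 0.2*2 = 0.4, 0.4*2 = 0.8, 0.8*2 = 1.6, 0.6*2 = 1.2, 0.2*2 = 0.4, 0.4*2 = 0.8, 0.8*2 = 1.6, 0.6*2 = 1.2, … This gives $0.000110011\ldots_2 = 1.10011\ldots_2 \cdot 2^{-4}$.
     Step 2: Express in double-precision format. The value is $(-1)^{S} \cdot (1.\text{Fraction}) \cdot 2^{\text{Exponent} - 1023}$, so the stored (biased) exponent is $-4 + 1023 = 1019 = 01111111011_2$ and $0.1 \approx (-1)^{0} \cdot 1.1001100110011\ldots1001_2 \cdot 2^{1019 - 1023}$.
     Result (fraction truncated to 52 bits): 0 01111111011 1001100110011001100110011001100110011001100110011001
     Under the default round-to-nearest mode the discarded bits (1001…) round the fraction up, so a conforming implementation stores 0 01111111011 1001100110011001100110011001100110011001100110011010.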
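
Both steps of Question 2 can be reproduced in Python as a rough check. The helper names `fraction_bits`, `float_bits`, and `fields` below are hypothetical, chosen for this sketch. The first helper mirrors the multiply-by-2 procedure with exact rationals; the other two ask the machine for the stored pattern via the standard `struct` module, which applies round-to-nearest, so the final fraction bit differs from a pure truncation.

```python
import struct
from fractions import Fraction

def fraction_bits(x, n):
    """First n binary digits after the point of 0 <= x < 1 (repeated doubling)."""
    bits = []
    for _ in range(n):
        x *= 2
        bit = int(x >= 1)      # the integer part is the next binary digit
        bits.append(str(bit))
        x -= bit
    return "".join(bits)

def float_bits(x, fmt, width):
    """Stored IEEE 754 pattern of x as a zero-padded binary string."""
    raw = struct.pack(fmt, x)
    return format(int.from_bytes(raw, "big"), f"0{width}b")

def fields(bits, exp_width):
    """Split a bit string into (sign, biased exponent, fraction)."""
    return bits[0], bits[1:1 + exp_width], bits[1 + exp_width:]

print(fraction_bits(Fraction(1, 10), 12))     # 000110011001
print(fields(float_bits(0.1, ">f", 32), 8))   # single precision
print(fields(float_bits(0.1, ">d", 64), 11))  # double precision
```

Using `Fraction(1, 10)` rather than the float `0.1` matters in the first call: the float is already a rounded binary value, so doubling it would eventually stop reproducing the repeating 0011 pattern.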

  9. Question 3 • Convert the following single-precision numbers into decimal:
     a. 0 11111111 00000000000000000000000
     b. 0 00000000 00000000000000000000010

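  10. Question 3.1 • Converting binary (single-precision FP) to decimal. Field layout: S (1 bit) | Biased Exponent (8 bits) | Fraction (23 bits)
      Bits: 0 | 11111111 | 00000000000000000000000
      Sign bit: 0
      Exponent: $11111111_2 = 255$ (all ones, reserved for infinity and NaN)
      Fraction: $00000000000000000000000_2 = 0$
      An all-ones exponent with a zero fraction encodes infinity, and the sign bit is 0, so the value is $+\infty$.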

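  11. Question 3.2 • Converting binary (single-precision FP) to decimal. Field layout: S (1 bit) | Biased Exponent (8 bits) | Fraction (23 bits)
      Bits: 0 | 00000000 | 00000000000000000000010
      Sign bit: 0
      Exponent: $00000000_2 = 0$ (with a nonzero fraction, this is a denormalized number)
      Fraction: $00000000000000000000010_2 = 2^{-22} \approx 0.000000238$
      Value: $(-1)^{S} \cdot (0.\text{Fraction}) \cdot 2^{-126} = (-1)^{0} \cdot 2^{-22} \cdot 2^{-126} = 2^{-148} \approx 2.802596929 \times 10^{-45}$
      (Multiplying the rounded decimal 0.000000238 by $2^{-126}$ instead gives about $2.7977 \times 10^{-45}$, which is off in the third significant digit.)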
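
The case analysis in Question 3 (all-ones exponent, all-zeros exponent, everything else) can be sketched as a small decoder. The function name `decode_single` is an assumption of this example, and exact rationals are used for finite values so that no intermediate rounding, such as replacing $2^{-22}$ by 0.000000238, creeps in.

```python
import math
from fractions import Fraction

def decode_single(bits):
    """Decode a 32-bit single-precision pattern given as a '0'/'1' string.

    Finite values come back as exact Fractions; infinities and NaN as floats.
    """
    bits = bits.replace(" ", "")
    sign = -1 if bits[0] == "1" else 1
    e = int(bits[1:9], 2)       # biased exponent
    f = int(bits[9:], 2)        # 23-bit fraction as an integer
    if e == 255:                # reserved: infinity (f == 0) or NaN (f != 0)
        return sign * math.inf if f == 0 else math.nan
    if e == 0:                  # zero or denormalized: 0.f * 2**-126
        return sign * Fraction(f, 2**23) * Fraction(1, 2**126)
    return sign * (1 + Fraction(f, 2**23)) * Fraction(2) ** (e - 127)

a = decode_single("0 11111111 " + "0" * 23)
b = decode_single("0 00000000 " + "0" * 21 + "10")
print(a)          # inf
print(float(b))   # about 2.8026e-45, i.e. 2**-148
```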

  12. Question 4 • Consider the 80-bit extended-precision IEEE 754 floating-point format that uses 1 bit for the sign, 16 bits for the biased exponent, and 63 bits for the fraction (f). For each number below, write (i) its 80-bit extended-precision representation in binary and (ii) its value in the base-10 positional (decimal) system:
      • the third smallest positive normalized number
      • the largest (farthest from zero) negative normalized number
      • the third smallest positive denormalized number that can be represented

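  13. Question 4.1 • The third smallest positive normalized number
      Bias: $2^{15} - 1 = 32767$
      Sign: 0
      Biased Exponent: 0000 0000 0000 0001
      Fraction (f): 61 zeros followed by 10 (the smallest normalized number has f = 0, the second has f = $2^{-63}$, the third has f = $2^{-62}$)
      Decimal Value: $(-1)^{0} \cdot 2^{1 - 32767} \cdot (1 + 2^{-62}) = 2^{-32766} + 2^{-32828}$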

  14. Question 4.2 • The largest (farthest from zero) negative normalized number
      Sign: 1
      Biased Exponent: 1111 1111 1111 1110
      Fraction: 63 ones
      Decimal Value: $(-1)^{1} \cdot 2^{65534 - 32767} \cdot (1 + 2^{-1} + 2^{-2} + \dots + 2^{-63}) = -2^{32767} \cdot (2^{64} - 1) \cdot 2^{-63} \approx -2^{32768}$

  15. Question 4.3 • The third smallest positive denormalized number
      Sign: 0
      Biased Exponent: 0000 0000 0000 0000
      Fraction: 61 zeros followed by 11
      Decimal Value: $(-1)^{0} \cdot 2^{-32766} \cdot (2^{-62} + 2^{-63}) = 3 \cdot 2^{-32829}$
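
The Question 4 values are far outside the range of a hardware double, but they can still be checked with exact rational arithmetic. The sketch below assumes the format exactly as the question states it (1 sign bit, 16 exponent bits, 63 fraction bits, implicit leading 1 for normalized numbers); the names `EXP_BITS`, `FRAC_BITS`, `BIAS`, and `value` are invented for this example.

```python
from fractions import Fraction

EXP_BITS, FRAC_BITS = 16, 63
BIAS = 2 ** (EXP_BITS - 1) - 1          # 32767

def value(sign, biased_exp, frac):
    """Exact value of (sign, biased exponent, fraction-as-integer) in this format."""
    s = -1 if sign else 1
    mant = Fraction(frac, 2 ** FRAC_BITS)
    if biased_exp == 0:                  # denormalized: 0.f * 2**(1 - BIAS)
        return s * mant * Fraction(2) ** (1 - BIAS)
    return s * (1 + mant) * Fraction(2) ** (biased_exp - BIAS)

third_norm = value(0, 1, 0b10)                               # f = 0...010
largest_neg = value(1, 2 ** EXP_BITS - 2, 2 ** FRAC_BITS - 1)  # f = all ones
third_denorm = value(0, 0, 0b11)                             # f = 0...011

print(third_norm == Fraction(1, 2**32766) + Fraction(1, 2**32828))  # True
print(largest_neg == -(2**32768 - 2**32704))                        # True
print(third_denorm == Fraction(3, 2**32829))                        # True
```

The second comparison also shows how close the approximation $-2^{32768}$ is: the exact value differs from it only by $2^{32704}$, a factor of $2^{-64}$ smaller.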
