CSE 246: Computer Arithmetic Algorithms and Hardware Design

1 / 16

CSE 246: Computer Arithmetic Algorithms and Hardware Design - PowerPoint PPT Presentation

CSE 246: Computer Arithmetic Algorithms and Hardware Design. Winter 2004 Lecture 9. Instructor: Prof. Chung-Kuan Cheng. Topics:. Floating Point Numbers (IEEE P754) Standard Operations Exceptional Situations Rounding Modes. Standard. 2 32  Typically. Goal: Dynamic Range:

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

PowerPoint Slideshow about 'CSE 246: Computer Arithmetic Algorithms and Hardware Design' - debra

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

CSE 246: Computer Arithmetic Algorithms and Hardware Design

Winter 2004

Lecture 9

Instructor:

Prof. Chung-Kuan Cheng

Topics:
• Floating Point Numbers (IEEE P754)
• Standard
• Operations
• Exceptional Situations
• Rounding Modes
Standard

232  Typically

• Goal: Dynamic Range:

largest #/ smallest #

• If too large, holes between #’s
Standard
• ulp (unit in the last place)
• Difference between two consecutive values of the significand.

3 Parts  x =  s  be

Sign Bit

Significand

8-bit exponent

Standard
• a1a2a3a4a5a6a7a8b1b2b3b22b23
• 1.* normalized number
• 0.* denormalized number

0 0.b1b2b3b22b23  2-126

1 --------------------------------- 1. b1b2b3b22b23  2-126

2

.

.

.

253

254 ------------------------------- 1.b1b2b3b22b23  2127

•   if bi = 0 for all i = 1,2,…,23, NaN otherwise

NaN  Not a Number

Standard

0.01x2-3 = 0.00x2-2

• Same number, so normalize to remove redundancy
• Smallest Number

0.00…01x2-126 = 1.0x2-23x2-126

= 1x2-149

• 1.1101111001110011100101
• Difference between 2 #’s small for normalized

0.0001 2 times compared to magnitudes

0.0010

Standard - Example

s. eeeeeeee nnnnnnnnnnnnnnnnnnnnnnn

0.00000000 00000000000000000000000 = 0.000…0x2-126

1.00000000 00000000000000000000000 = 0

0.00000001 00000000000000000000000 = 1.000…0x2-126

- minimal normalized #

0.00000001 00000000000000000000001 = 1.000…1x2-126

.

.

.

0.01111111 00000000000000000000001 = 1.000…1x20

0.10000000 00000000000000000000001 = 1.000…0x21

Standard – Example Cont.

0.11111110 00000000000000000000001 = 1.000…1x2127

0.11111110 11111111111111111111111 = 1.111…1x2127

- Normalized Maximum

0.11111111 00000000000000000000000 = 

Nmin = 1.0 x 2-126

Nmax = (2 – 2-23)2127

Double Floating Point
• a1a2…a11b1b2…b52

000…00 0. b1b2…b52 x 2-1023

000…01 1. b1b2…b52 x 2-1022

.

.

.

011…11 1. b1b2…b52 x 20

100…00 1. b1b2…b52 x 21

.

.

.

111…10 1. b1b2…b52 x 21023

111…11 =  if bi = 0 for all i = 1,2,…,52

Overflow/Underflow

Underflow

Denser

Sparser

Nmin

Nmax

Overflow

• s1xbe1 + (s2xbe2) = sxbe

= s1xbe1 + s2/be1-e2 x be1

= (s1  s2/be1-e2) x be1

• (s1xbe1) x (s2xbe2) = (s1xs2)be1+e2
Exceptions

a/0 =  if a > 0

a/ = 0 if a != 0

a·0 = 0

a· =  if a > 0

0· = invalid operation (NaN)

0/0 = invalid operation (NaN)

NaP op a = NaN

a +  = 

 -  = NaN

Rounding Mode
• Adder Output = Cout z1z0.z-1z-2…z-l GRS

Guard Bit

Round Bit

Sticky Bit, OR of all bits below bit R

1.101 x 23

+1.110 x 23

11.011 x 23

1.1011x24 Normalize – need to round or

Rouding

1.110 x 23

- 1.101 x 23

0.001 x 23

1.000 x 20 normalize

1.101 x 23

- 1.111 x 22

1.101 x 23

- 0.1111 x 23

0.1101 x 23

1.101 x 22

Guard bit

Rounding
• Round to the nearest even
• toward 0 1.1011
• Toward + 1.1100
• Toward - 1.1011
Conventional Rounding Error

Rounding Error

1.10100  1.101 = 0

1.10101  1.101 = -0.25

1.10110  1.110 = +0.5

1.10111  1.110 = +0.25

Average Error = 0.5/4 = 0.125