Representing fractions – Fixed point

• The problem:
• How can fractions be represented with a finite number of bits?
Representing fractions – Fixed point

A number with 10 bits:

$a_1 a_2 a_3 a_4 a_5 a_6 a_7 a_8 a_9 a_{10}$

Fixing the point:

$a_1 a_2 a_3 a_4 a_5 a_6 a_7 a_8 \,.\, a_9 a_{10}$
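As a rough illustration of the layout above, the sketch below stores a value in 10 bits with the point fixed after the eighth bit. It assumes an unsigned value (the slides do not say how a sign would be handled), and the names `to_fixed` and `from_fixed` are hypothetical.

```python
TOTAL_BITS = 10     # a1 ... a10
FRACTION_BITS = 2   # a9 a10 sit to the right of the point

def to_fixed(x):
    """Encode a non-negative number as a 10-bit integer with 2 fraction bits."""
    raw = round(x * 2 ** FRACTION_BITS)      # scale, then round to an integer
    if not 0 <= raw < 2 ** TOTAL_BITS:
        raise OverflowError("value does not fit in 10 bits")
    return raw

def from_fixed(raw):
    """Decode the 10-bit integer back to a number."""
    return raw / 2 ** FRACTION_BITS

raw = to_fixed(5.75)
bits = format(raw, "010b")                   # 8 integer bits followed by 2 fraction bits
largest = from_fixed(2 ** TOTAL_BITS - 1)    # largest representable value
```

With two fraction bits the spacing between representable values is 0.25, so a value such as 0.1 cannot be stored exactly under this assumed layout.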

Fixed point: the problem
• Cannot represent a wide range of magnitudes.
• Such ranges are needed in scientific applications.
Representing Fractions – Floating point

$10 = (-1)^{0} \times 1 \times 10^{1}$

$-0.123 = (-1)^{1} \times 1.23 \times 10^{-1}$

• Sign bit: the power of $-1$ (0 for positive, 1 for negative)
• Number (mantissa): here 1 and 1.23
• Exponent: the power of 10, here 1 and $-1$

Problem of uniqueness

$100 \times 10^{-3} = 0.1 = 0.001 \times 10^{2}$

The representation is not unique.

Problem of uniqueness – Normalization

$610 \times 10^{-3} = 0.61 = 0.0061 \times 10^{2}$

Standardization: one nonzero digit to the left of the point.

$0.61 = 6.1 \times 10^{-1}$
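The normalization rule can be sketched in a few lines. This is only an illustration in base 10 using Python's `decimal` module; the function name `normalize` is hypothetical.

```python
from decimal import Decimal

def normalize(mantissa, exponent):
    """Return an equal (mantissa, exponent) pair with 1 <= |mantissa| < 10."""
    if mantissa == 0:
        return mantissa, exponent            # zero has no normalized form
    while abs(mantissa) >= 10:               # too many digits left of the point
        mantissa, exponent = mantissa.scaleb(-1), exponent + 1
    while abs(mantissa) < 1:                 # leading digit is zero
        mantissa, exponent = mantissa.scaleb(1), exponent - 1
    return mantissa, exponent

a = normalize(Decimal("610"), -3)      # 610 * 10^-3
b = normalize(Decimal("0.0061"), 2)    # 0.0061 * 10^2
```

Both calls describe the value 0.61, and both return the same normalized pair.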

Normalized Binary Floating point

$D = (-1)^{a_0} \times (1.a_1 a_2 a_3 \dots)_2 \times 2^{(b_1 b_2 b_3 \dots)_2}$

String of bits:

$a_0 \; b_1 b_2 \dots b_n \; a_1 a_2 a_3 \dots a_m$
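To see the three parts of the normalized binary form for a concrete number, the sketch below splits a Python float with `math.frexp`. The helper name `to_normalized_binary` is hypothetical, and the exponent is returned as a plain signed integer, not as the bit string $b_1 b_2 \dots b_n$.

```python
import math

def to_normalized_binary(x):
    """Split a nonzero float into (sign bit, significand in [1, 2), exponent)."""
    sign = 1 if math.copysign(1.0, x) < 0 else 0
    frac, exp = math.frexp(abs(x))     # abs(x) == frac * 2**exp with 0.5 <= frac < 1
    return sign, frac * 2, exp - 1     # rescale so the significand reads 1.xxx

parts = to_normalized_binary(-0.15625)   # -0.15625 == -(1.01)_2 * 2**-3
```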

Floating point - Questions
• Representing the (signed) exponent
• How to represent zero?
• How to represent NaN and infinity?
• How to add, subtract and multiply?
• Rounding Errors.
Floating point – Representing the exponent

How to represent a signed number?

• Sign bit?
• 2's complement?

Neither.

Floating point – Representing the exponent
• We want exponents to be ordered the same way as their bit patterns, read as unsigned binary numbers:

0000 < 0001 < … < 1000 < … < 1111

Floating point – Representing the exponent

Exponent = Stored number − B

Usually $B = 2^{n-1} - 1$, where $n$ is the number of exponent bits.

The smallest and largest exponents are defined by these stored bit patterns:

emin: 000…0001

emax: 111…1110
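The offset encoding can be sketched as follows, assuming $B = 2^{n-1} - 1$ and the emin and emax patterns given above. The function names are hypothetical, and the two's complement helper is included only to contrast the ordering of the bit patterns.

```python
def bias(n):
    """Offset B for an n-bit exponent field."""
    return 2 ** (n - 1) - 1

def encode_exponent(e, n):
    """Stored number for exponent e; 00...0 and 11...1 are left unused here."""
    stored = e + bias(n)
    if not 1 <= stored <= 2 ** n - 2:
        raise ValueError("exponent out of range")
    return stored

def decode_exponent(stored, n):
    return stored - bias(n)

def twos_complement(e, n):
    """n-bit two's complement pattern of e, read as an unsigned integer."""
    return e & (2 ** n - 1)

exponents = [-6, -1, 0, 1, 7]                               # increasing order
biased = [encode_exponent(e, 4) for e in exponents]         # patterns stay increasing
complemented = [twos_complement(e, 4) for e in exponents]   # negatives sort last
```

With the offset, comparing stored patterns as unsigned integers gives the same order as comparing the exponents, which is not the case for the two's complement patterns.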

Floating point – Representing zero, NaN, ±∞

[Table: IEEE 754 special values. Surviving labels: denormalized number, normalized number.]

[Table: IEEE 754. Surviving note: including the sign bit.]

Infinity
• Provides a safe way to continue a calculation when overflow is encountered.
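The special values can be inspected directly. The sketch below assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits) and uses Python's `struct` module to turn bit patterns into floats; the helper name `float32_from_bits` is hypothetical.

```python
import math
import struct

def float32_from_bits(bits):
    """Interpret a 32-bit integer as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

pos_inf = float32_from_bits(0x7F800000)      # exponent all ones, fraction zero
nan = float32_from_bits(0x7FC00000)          # exponent all ones, fraction nonzero
neg_zero = float32_from_bits(0x80000000)     # sign bit set, everything else zero
denormal = float32_from_bits(0x00000001)     # exponent all zeros, fraction nonzero

# Overflow in a multiplication yields infinity, and the calculation can go on.
overflow = 1e308 * 10
after = 1 / overflow
```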
Calculations with Floating Point numbers
• Equalize the exponents (smaller → larger exponent)
• Sum the mantissas
• Renormalize if necessary
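The three steps can be sketched in base 10 as below. This is a minimal illustration that keeps every digit (no rounding to a fixed mantissa width); the function name `add_decimal_floats` is hypothetical.

```python
from decimal import Decimal

def add_decimal_floats(m1, e1, m2, e2):
    """Add m1 * 10**e1 and m2 * 10**e2; return a normalized (mantissa, exponent)."""
    # Step 1: equalize the exponents by moving the smaller one up to the larger.
    if e1 < e2:
        m1, e1, m2, e2 = m2, e2, m1, e1
    m2 = m2.scaleb(e2 - e1)                  # shift the point of the smaller operand
    # Step 2: sum the mantissas.
    m, e = m1 + m2, e1
    # Step 3: renormalize so that one nonzero digit is left of the point.
    while abs(m) >= 10:
        m, e = m.scaleb(-1), e + 1
    while m != 0 and abs(m) < 1:
        m, e = m.scaleb(1), e - 1
    return m, e

total = add_decimal_floats(Decimal("9.10"), 1, Decimal("9.70"), 0)   # 91 + 9.7
```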
Calculations with Floating Point numbers
• Example (in base 10):

|E| = 1, |M| = 3 (one exponent digit, three mantissa digits)

91 → $9.10 \times 10^{1}$

9.7 → $9.70 \times 10^{0}$

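Calculations with Floating Point numbers

$9.10 \times 10^{1} + 9.70 \times 10^{0}$

Not the same order: the exponents differ, so shift the operand with the smaller exponent.

$9.10 \times 10^{1} + 0.97 \times 10^{1} = 10.07 \times 10^{1}$

Renormalize: $1.007 \times 10^{2}$

With |M| = 3 only three digits fit, so the stored result is $1.01 \times 10^{2}$.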

Calculations with Floating Point numbers
• Example II (in base 10):

|E| = 1, |M| = 3 (one exponent digit, three mantissa digits)

91 → $9.10 \times 10^{1}$

9.75 → $9.75 \times 10^{0}$

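Calculations with Floating Point numbers

$9.10 \times 10^{1} + 9.75 \times 10^{0}$

Not the same order: the exponents differ, so shift the operand with the smaller exponent.

$9.10 \times 10^{1} + 0.975 \times 10^{1} = 10.075 \times 10^{1}$

Renormalize: $1.0075 \times 10^{2}$

With |M| = 3 the trailing digits, including the final 5, are lost (rounding error), so the stored result is $1.01 \times 10^{2}$.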
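The effect of a three-digit mantissa can be reproduced with Python's `decimal` module. The sketch assumes round-to-nearest (ties to even), which the slides do not specify; with that assumption 91 + 9.7 and 91 + 9.75 are stored as the same value.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# A context that keeps three significant digits, like |M| = 3.
ctx = Context(prec=3, rounding=ROUND_HALF_EVEN)

stored_1 = ctx.add(Decimal("91"), Decimal("9.7"))    # exact sum is 100.7
stored_2 = ctx.add(Decimal("91"), Decimal("9.75"))   # exact sum is 100.75

exact_2 = Decimal("91") + Decimal("9.75")            # default context keeps all digits
rounding_error = exact_2 - stored_2
```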

Rounding Errors

The Problem:

Squeezing infinitely many real numbers into a finite number of bits

Measuring Rounding Errors
• Units in the last place (ULPs)
• Relative Error
Measuring Rounding Errors – ULP

If $d.dd \dots d \times r^{e}$ (with $p$ digits) represents $z$:

error $= |d.dd \dots d - z / r^{e}| \times r^{p-1}$ ULPs

Measuring Rounding Errors – ULP

Example I: $r = 10$, $p = 3$

The number $3.14 \times 10^{-2}$ represents 0.0314159

Error $= |3.14 - 3.14159| \times 10^{2} = 0.159$ ULPs
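The ULP formula can be checked with exact rational arithmetic. This is a small sketch; the function name `ulp_error` and its parameter names are hypothetical.

```python
from fractions import Fraction

def ulp_error(mantissa, exponent, z, radix=10, digits=3):
    """Error in ULPs when mantissa * radix**exponent (p = digits) represents z."""
    m, z = Fraction(mantissa), Fraction(z)
    scale = Fraction(radix) ** exponent              # r**e
    return abs(m - z / scale) * Fraction(radix) ** (digits - 1)

err = ulp_error("3.14", -2, "0.0314159")   # the example with r = 10, p = 3
```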

Measuring Rounding Errors – ULP
• What is the maximum error in ULPs if rounding is to the nearest representable number?

0.5 ULP

Measuring Rounding Errors – Relative Error

If $d.dd \dots d \times r^{e}$ (with $p$ digits) represents $z$:

Relative error $= |d.dd \dots d \times r^{e} - z| \, / \, |z|$

Measuring Rounding Errors – Relative Error

Example I: $r = 10$, $p = 3$

The number $3.14 \times 10^{-2}$ represents 0.0314159

Relative error $= |0.0314 - 0.0314159| \, / \, 0.0314159 \approx 0.0005$
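The same example can be evaluated exactly for the relative error. As before this is only a sketch, and the function name `relative_error` is hypothetical.

```python
from fractions import Fraction

def relative_error(mantissa, exponent, z, radix=10):
    """Relative error when mantissa * radix**exponent represents z."""
    approx = Fraction(mantissa) * Fraction(radix) ** exponent
    z = Fraction(z)
    return abs(approx - z) / abs(z)

rel = relative_error("3.14", -2, "0.0314159")
```

Unlike the error in ULPs, this measure does not depend on the number of digits $p$, only on the two values being compared.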