Lecture 13: Integer Arithmetic and Floating Point cont.
CS 2011 Fall 2014, Dr. Rozier

BOMB LAB STATUS

MIDTERM II

Midterm II
• November 13th
• Plan for remaining time

FLOATING POINT

Fractional Binary Numbers
• Representation
  • Bits to right of “binary point” represent fractional powers of 2
  • Represents rational number: $\sum_{k=-j}^{i} b_k \times 2^k$ for bits $b_i b_{i-1} \cdots b_0 . b_{-1} \cdots b_{-j}$

Representable Numbers
• Limitation
  • Can only exactly represent numbers of the form $x/2^k$
  • Other rational numbers have repeating bit representations

Value   Representation
1/3     $0.0101010101[01]\ldots_2$
1/5     $0.001100110011[0011]\ldots_2$
1/10    $0.0001100110011[0011]\ldots_2$
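The repeating patterns in the table can be reproduced by repeated doubling. The following is a minimal sketch, not part of the original slides; the function name `binary_fraction` and its interface are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Writes the first nbits binary digits after the binary point of num/den
 * into out (which must hold nbits + 1 chars).
 * Returns 1 if the expansion terminates within nbits, i.e. num/den has
 * the form x / 2^k with k <= nbits; returns 0 otherwise. */
int binary_fraction(unsigned num, unsigned den, size_t nbits, char *out)
{
    /* Hypothetical preconditions: a proper fraction, and no overflow on doubling */
    assert(den != 0 && num < den && den <= UINT_MAX / 2);

    unsigned rem = num;
    for (size_t i = 0; i < nbits; i++) {
        rem *= 2;               /* shift the remainder left by one bit */
        if (rem >= den) {       /* next bit is 1 */
            out[i] = '1';
            rem -= den;
        } else {                /* next bit is 0 */
            out[i] = '0';
        }
    }
    out[nbits] = '\0';
    return rem == 0;
}
```

Calling it with 1/10 and 13 bits yields the digits 0001100110011 and a return value of 0, since the remainder never reaches zero; 3/4 with 2 bits yields 11 and returns 1.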

Floating Point Standard
• Defined by IEEE Std 754-1985
• Developed in response to divergence of representations
  • Portability issues for scientific code
• Now almost universally adopted
• Two representations
  • Single precision (32-bit)
  • Double precision (64-bit)

IEEE Floating-Point Format

S | Exponent (single: 8 bits, double: 11 bits) | Fraction (single: 23 bits, double: 52 bits)

• S: sign bit (0 = non-negative, 1 = negative)
• Normalize significand: $1.0 \le |\text{significand}| < 2.0$
  • Always has a leading pre-binary-point 1 bit, so no need to represent it explicitly (hidden bit)
  • Significand is Fraction with the “1.” restored
• Exponent: excess representation: actual exponent + Bias
  • Ensures exponent is unsigned
  • Single: Bias = 127; Double: Bias = 1023
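To make the field layout concrete, here is a small sketch that splits a single-precision value into its three fields. It assumes that `float` is a 32-bit IEEE 754 single on the target machine; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw fields of a single-precision value (names chosen for illustration) */
struct fp_fields {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, still biased by 127 */
    uint32_t fraction;  /* 23 bits, hidden leading 1 not included */
};

struct fp_fields decode_single(float f)
{
    uint32_t bits;
    struct fp_fields out;

    assert(sizeof f == sizeof bits);   /* assumption: float is 32 bits wide */
    memcpy(&bits, &f, sizeof bits);    /* copy the bit pattern without converting */

    out.sign = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.fraction = bits & 0x7FFFFFu;
    return out;
}
```

For example, 1.0f decodes to sign 0, exponent 127 (actual exponent 0 plus the bias), and fraction 0. The value -0.75f is $-1.1_2 \times 2^{-1}$, so it decodes to sign 1, exponent 126, and a fraction whose top bit is set.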

Floating-Point Addition
• Consider a 4-digit decimal example
  • $9.999 \times 10^{1} + 1.610 \times 10^{-1}$
• 1. Align decimal points
  • Shift number with smaller exponent
  • $9.999 \times 10^{1} + 0.016 \times 10^{1}$
• 2. Add significands
  • $9.999 \times 10^{1} + 0.016 \times 10^{1} = 10.015 \times 10^{1}$
• 3. Normalize result & check for over/underflow
  • $1.0015 \times 10^{2}$
• 4. Round and renormalize if necessary
  • $1.002 \times 10^{2}$

Floating-Point Addition
• Now consider a 4-digit binary example
  • $1.000_2 \times 2^{-1} + (-1.110_2 \times 2^{-2})$ (0.5 + –0.4375)
• 1. Align binary points
  • Shift number with smaller exponent
  • $1.000_2 \times 2^{-1} + (-0.111_2 \times 2^{-1})$
• 2. Add significands
  • $1.000_2 \times 2^{-1} + (-0.111_2 \times 2^{-1}) = 0.001_2 \times 2^{-1}$
• 3. Normalize result & check for over/underflow
  • $1.000_2 \times 2^{-4}$, with no over/underflow
• 4. Round and renormalize if necessary
  • $1.000_2 \times 2^{-4}$ (no change) = 0.0625
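The four steps can be traced in code on a toy format. This sketch is an illustration only and is not IEEE 754: it assumes a hypothetical `struct toy_fp` whose value is sig × 2^(exp − 3), so `sig` holds one integer bit and three fraction bits. It truncates instead of rounding and does not check for exponent overflow or underflow.

```c
#include <stdlib.h>

/* Toy 4-bit-significand format (an assumption for illustration):
 * value = sig * 2^(exp - 3); a normalized nonzero value has 8 <= |sig| <= 15. */
struct toy_fp {
    int sig;
    int exp;
};

struct toy_fp toy_add(struct toy_fp a, struct toy_fp b)
{
    /* Step 1: align binary points by shifting the operand
     * with the smaller exponent to the right */
    while (a.exp < b.exp) { a.sig /= 2; a.exp++; }
    while (b.exp < a.exp) { b.sig /= 2; b.exp++; }

    /* Step 2: add significands */
    struct toy_fp r = { a.sig + b.sig, a.exp };

    /* Step 3: normalize so that 8 <= |sig| <= 15 (unless the sum is zero) */
    while (abs(r.sig) >= 16) { r.sig /= 2; r.exp++; }
    while (r.sig != 0 && abs(r.sig) < 8) { r.sig *= 2; r.exp--; }

    /* Step 4: rounding is omitted; bits shifted out above are truncated */
    return r;
}
```

With a = {8, -1} (that is, $1.000_2 \times 2^{-1}$) and b = {-14, -2} (that is, $-1.110_2 \times 2^{-2}$), the function returns {8, -4}, matching the result $1.000_2 \times 2^{-4}$ from the slide.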

FP Adder Hardware
[Figure: FP adder datapath, annotated with Step 1 through Step 4]

INTEGER MULTIPLICATION

Multiplication
• Start with long-multiplication approach

      1000    multiplicand
   ×  1001    multiplier
      1000
     0000
    0000
   1000
   1001000    product

• Length of product is the sum of operand lengths
  • Why?

1000 × 1001 = 1001000 (partial products 1000, 0000, 0000, 1000)
• How could we implement this in a better way?
• What is unique about binary multiplication?

Multiplication Hardware
[Figure: multiplier datapath; product register initially 0]

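Multiplying (1000 × 1001, one add/shift step at a time)
• Add: low multiplier bit is 1, so add the multiplicand; product = 1000
• Shift! Multiplicand becomes 10000, multiplier becomes 100
• Add: low multiplier bit is 0, so add 0000; product unchanged
• Shift! Repeat until every multiplier bit has been used
• Done! Product = 1001000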
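The add and shift steps in the example above translate almost directly into software. The following is a sketch of that idea, not a description of any particular hardware; the function name is hypothetical, and the final `assert` merely cross-checks the loop against the built-in multiply.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-and-add multiplication of two 32-bit unsigned operands.
 * The product is kept in 64 bits because its length can be the sum
 * of the operand lengths. */
uint64_t shift_add_multiply(uint32_t multiplicand, uint32_t multiplier)
{
    uint64_t product = 0;                 /* initially 0 */
    uint64_t shifted = multiplicand;      /* multiplicand, shifted left each step */
    uint32_t remaining = multiplier;      /* multiplier, shifted right each step */

    while (remaining != 0) {
        if (remaining & 1u)               /* low multiplier bit is 1: add */
            product += shifted;
        shifted <<= 1;                    /* shift multiplicand left */
        remaining >>= 1;                  /* shift multiplier right */
    }

    /* Cross-check against the machine's own multiply */
    assert(product == (uint64_t)multiplicand * multiplier);
    return product;
}
```

Because each partial product in binary is either 0 or a shifted copy of the multiplicand, the loop needs only a test, an add, and two shifts per multiplier bit.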

Optimized Multiplier
• Perform steps in parallel: add/shift
• One cycle per partial-product addition
  • That’s ok, if frequency of multiplications is low

Faster Multiplier
• Uses multiple adders
  • Cost/performance tradeoff
• Can be pipelined
  • Several multiplications performed in parallel

Multiplication
• Computing exact product of w-bit numbers x, y
  • Either signed or unsigned
• Ranges
  • Unsigned: $0 \le x \cdot y \le (2^w - 1)^2 = 2^{2w} - 2^{w+1} + 1$
    • Up to 2w bits
  • Two’s complement min: $x \cdot y \ge (-2^{w-1})(2^{w-1} - 1) = -2^{2w-2} + 2^{w-1}$
    • Up to 2w–1 bits
  • Two’s complement max: $x \cdot y \le (-2^{w-1})^2 = 2^{2w-2}$
    • Up to 2w bits, but only for $(TMin_w)^2$
• Maintaining exact results
  • Would need to keep expanding word size with each product computed
  • Done in software by “arbitrary precision” arithmetic packages

Unsigned Multiplication in C
• Operands u, v: w bits
• True product u · v: 2w bits
• Discard w bits: UMult_w(u, v) is w bits
• Standard multiplication function
  • Ignores high order w bits
• Implements modular arithmetic
  • $UMult_w(u, v) = u \cdot v \bmod 2^w$
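The definition of UMult_w can be checked directly by computing the true product in a wider type and then discarding the high-order bits. This is a sketch for widths up to 32 bits; the function name `umult_w` mirrors the slide's notation but is otherwise hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* UMult_w(u, v) = (u * v) mod 2^w, for 1 <= w <= 32.
 * u and v are expected to be w-bit values. */
uint32_t umult_w(uint32_t u, uint32_t v, unsigned w)
{
    assert(w >= 1 && w <= 32);

    uint64_t true_product = (uint64_t)u * v;   /* up to 2w bits */
    uint64_t mask = (1ull << w) - 1;           /* keeps the low w bits */
    return (uint32_t)(true_product & mask);    /* discard the high-order bits */
}
```

For w = 3, `umult_w(5, 3, 3)` gives 7, because the true product 15 is $1111_2$ and only the low three bits survive. For w = 32 the result matches what the C `*` operator produces on `uint32_t` operands.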

Code Security Example #2
• SUN XDR library
  • Widely used library for transferring data between machines

    void* copy_elements(void *ele_src[], int ele_cnt, size_t ele_size);

[Figure: objects referenced by ele_src copied into one buffer obtained from malloc(ele_cnt * ele_size)]

XDR Code

    #include <stdlib.h>
    #include <string.h>

    void* copy_elements(void *ele_src[], int ele_cnt, size_t ele_size) {
        /*
         * Allocate buffer for ele_cnt objects, each of ele_size bytes
         * and copy from locations designated by ele_src
         */
        void *result = malloc(ele_cnt * ele_size);
        if (result == NULL)
            /* malloc failed */
            return NULL;
        void *next = result;
        int i;
        for (i = 0; i < ele_cnt; i++) {
            /* Copy object i to destination */
            memcpy(next, ele_src[i], ele_size);
            /* Move pointer to next memory region */
            next += ele_size;
        }
        return result;
    }

XDR Vulnerability

    malloc(ele_cnt * ele_size)

• What if:
  • ele_cnt = $2^{20} + 1$
  • ele_size = 4096 = $2^{12}$
  • Allocation = ??
• How can I make this function secure?
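With a 32-bit size_t, the product $(2^{20} + 1) \cdot 2^{12} = 2^{32} + 2^{12}$ wraps to $2^{12}$, so only 4096 bytes would be allocated while the loop copies far more. One possible hardening is sketched below: it rejects element counts and sizes whose product cannot be represented. This is an assumption about a reasonable fix, not the actual patch applied to the XDR library, and the names `alloc_size_32bit` and `copy_elements_checked` are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Size that malloc would see if size_t were 32 bits wide (illustration only) */
uint32_t alloc_size_32bit(uint32_t ele_cnt, uint32_t ele_size)
{
    return (uint32_t)((uint64_t)ele_cnt * ele_size);   /* product mod 2^32 */
}

/* Variant of copy_elements that refuses requests whose size would overflow */
void *copy_elements_checked(void *ele_src[], int ele_cnt, size_t ele_size)
{
    if (ele_cnt < 0)
        return NULL;
    size_t count = (size_t)ele_cnt;

    /* Reject products that would wrap around modulo 2^w */
    if (ele_size != 0 && count > SIZE_MAX / ele_size)
        return NULL;

    size_t total = count * ele_size;
    char *result = malloc(total != 0 ? total : 1);
    if (result == NULL)
        return NULL;

    char *next = result;
    for (size_t i = 0; i < count; i++) {
        memcpy(next, ele_src[i], ele_size);   /* copy object i to destination */
        next += ele_size;                     /* move to next memory region */
    }
    return result;
}
```

The division-based test avoids performing the multiplication that might overflow; the check happens before any element pointer is read.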

Signed Multiplication in C
• Operands u, v: w bits
• True product u · v: 2w bits
• Discard w bits: TMult_w(u, v) is w bits
• Standard multiplication function
  • Ignores high order w bits
  • Some of which are different for signed vs. unsigned multiplication
  • Lower bits are the same
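The claim that the lower bits agree while the upper bits may differ can be tested for w = 32. The sketch below takes two bit patterns, multiplies them once as unsigned and once as two's complement values, and compares the halves of the 64-bit products. The function names are hypothetical, and the conversion from `uint32_t` to `int32_t` is assumed to reinterpret the bit pattern, as it does on common compilers.

```c
#include <stdint.h>

/* Returns 1 if the low 32 bits of the signed and unsigned products match */
int low_bits_agree(uint32_t u_bits, uint32_t v_bits)
{
    uint64_t unsigned_product = (uint64_t)u_bits * v_bits;
    /* Assumption: this cast reinterprets the pattern as two's complement */
    int64_t signed_product = (int64_t)(int32_t)u_bits * (int32_t)v_bits;

    return (uint32_t)unsigned_product == (uint32_t)(uint64_t)signed_product;
}

/* Returns 1 if the high 32 bits of the signed and unsigned products match */
int high_bits_agree(uint32_t u_bits, uint32_t v_bits)
{
    uint64_t unsigned_product = (uint64_t)u_bits * v_bits;
    int64_t signed_product = (int64_t)(int32_t)u_bits * (int32_t)v_bits;

    return (uint32_t)(unsigned_product >> 32) ==
           (uint32_t)((uint64_t)signed_product >> 32);
}
```

For the patterns 0xFFFFFFFF and 2, the unsigned product is 0x1FFFFFFFE and the signed product is -2 (0xFFFFFFFFFFFFFFFE): the low halves are both 0xFFFFFFFE, while the high halves are 1 and 0xFFFFFFFF.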

Power-of-2 Multiply with Shift
• Operation
  • u << k gives $u \cdot 2^k$
  • Both signed and unsigned
• Operands u and $2^k$: w bits
• True product $u \cdot 2^k$: w+k bits
• Discard k bits: UMult_w(u, $2^k$) and TMult_w(u, $2^k$) are w bits
• Examples
  • u << 3 == u * 8
  • (u << 5) - (u << 3) == u * 24
• Most machines shift and add faster than multiply
  • Compiler generates this code automatically
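The two examples can be written as small C functions. This is a sketch using unsigned operands so that the shifts are well defined for every input; the function names are hypothetical. Note that in C the binary `-` operator binds more tightly than `<<`, so the parentheses in the second function are required.

```c
#include <stdint.h>

/* u * 8 computed with a single shift */
uint32_t times8(uint32_t u)
{
    return u << 3;
}

/* u * 24 computed as u * 32 - u * 8; parentheses are needed because
 * `-` has higher precedence than `<<` in C */
uint32_t times24(uint32_t u)
{
    return (u << 5) - (u << 3);
}
```

Both results are taken modulo $2^{32}$, exactly like the `*` operator on `uint32_t`, so the identities hold even when the true product does not fit in 32 bits.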

Multiply on ARM
• MUL{<cond>}{S} Rd, Rm, Rs
  • Rd = Rm * Rs
• MLA{<cond>}{S} Rd, Rm, Rs, Rn
  • Rd = Rm * Rs + Rn

Division

                 1001      quotient
divisor  1000 ) 1001010    dividend
               -1000
                   10
                   101
                   1010
                  -1000
                     10    remainder

• n-bit operands yield n-bit quotient and remainder
• Check for 0 divisor
• Long division approach
  • If divisor ≤ dividend bits
    • 1 bit in quotient, subtract
  • Otherwise
    • 0 bit in quotient, bring down next dividend bit
• Restoring division
  • Do the subtract, and if remainder goes < 0, add divisor back
• Signed division
  • Divide using absolute values
  • Adjust sign of quotient and remainder as required
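Restoring division as described above can be sketched for unsigned 32-bit operands. This is an illustrative software version and not a model of any specific divider circuit; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct udiv_result {
    uint32_t quotient;
    uint32_t remainder;
};

/* Restoring division: one quotient bit per step, most significant bit first */
struct udiv_result restoring_divide(uint32_t dividend, uint32_t divisor)
{
    assert(divisor != 0);                 /* check for 0 divisor */

    int64_t rem = 0;                      /* signed so a failed subtract is visible */
    uint32_t quotient = 0;

    for (int i = 31; i >= 0; i--) {
        /* Bring down the next dividend bit */
        rem = (rem << 1) | ((dividend >> i) & 1u);

        rem -= divisor;                   /* do the subtract */
        if (rem < 0) {
            rem += divisor;               /* remainder went < 0: add divisor back */
            quotient <<= 1;               /* 0 bit in quotient */
        } else {
            quotient = (quotient << 1) | 1u;   /* 1 bit in quotient */
        }
    }

    struct udiv_result r = { quotient, (uint32_t)rem };
    return r;
}
```

Dividing 74 ($1001010_2$) by 8 ($1000_2$) returns quotient 9 ($1001_2$) and remainder 2 ($10_2$), the same values as in the worked long division.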

Division Hardware
[Figure: divider datapath; divisor initially in left half of its register, remainder register initially holds the dividend]

Optimized Divider
• One cycle per partial-remainder subtraction
• Looks a lot like a multiplier!
  • Same hardware can be used for both

Faster Division
• Can’t use parallel hardware as in multiplier
  • Subtraction is conditional on sign of remainder
• Faster dividers (e.g. SRT division) generate multiple quotient bits per step
  • Still require multiple steps

Division in ARM
• ARMv6 has no DIV instruction.
• $N = D \times Q + R$ with $0 \le |R| < |D|$
• $N/D = Q + R/D$
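For signed operands, one way to obtain a Q and R satisfying this identity is to divide the absolute values and then fix up the signs. The sketch below follows that approach with a quotient that truncates toward zero and a remainder that takes the sign of the dividend; this convention matches C's `/` and `%`, but it is a choice made here for illustration. The names are hypothetical, and the unsigned division itself is left to the C operators for brevity.

```c
#include <assert.h>
#include <stdint.h>

struct sdiv_result {
    int32_t quotient;
    int32_t remainder;
};

/* Signed division via absolute values, so that n == d * q + r and |r| < |d| */
struct sdiv_result signed_divide(int32_t n, int32_t d)
{
    assert(d != 0);                              /* check for 0 divisor */
    assert(!(n == INT32_MIN && d == -1));        /* quotient would not fit */

    /* Work in 64 bits so that |INT32_MIN| is representable */
    int64_t abs_n = n < 0 ? -(int64_t)n : (int64_t)n;
    int64_t abs_d = d < 0 ? -(int64_t)d : (int64_t)d;

    int64_t q = abs_n / abs_d;
    int64_t r = abs_n % abs_d;

    if ((n < 0) != (d < 0))
        q = -q;                                  /* signs differ: negative quotient */
    if (n < 0)
        r = -r;                                  /* remainder follows the dividend */

    struct sdiv_result out = { (int32_t)q, (int32_t)r };
    return out;
}
```

For N = -7 and D = 2 this gives Q = -3 and R = -1, and indeed 2 × (-3) + (-1) = -7.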

An Algorithm for Division

WRAP UP

For next time
• Homework Exercises: 3.4.2, 3.4.4, 3.10.1 – 3.10.5
  • Due Tuesday 11/4
• Read Chapter 4.1-4.4
