Decimal Multiplication with Efficient Partial Product Generation

Mike Schulte

Dept. of Electrical & Computer Engineering

University of Wisconsin at Madison

Mark Erle, Eric Schwarz

Server & Technology Group

IBM


Outline

  • Introduction and motivation

  • Decimal multiplication challenges

  • Novel aspects of algorithm

  • Algorithm components

    • Operand recode

    • Digit-by-digit multiplication

    • Partial product generation

    • Overlap removal & encoding

    • Partial product accumulation

    • Final product correction

  • Summary


Introduction and Motivation

  • Preponderance of business data in decimal form

  • Inexact mapping between decimal and binary

  • Decimal arithmetic used (required) in banking, finance, insurance, accounting

  • Increasing support in arithmetic community (revising IEEE 754/854)

  • Significant speedup achievable in hardware

  • Multiplication a key function


By the way, we’re about 20% through the talk:

0.20₁₀ = 0.00110011…₂
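
The expansion above can be checked with a few lines of Python. This is only an illustrative sketch, not part of the talk; the helper name `binary_fraction_bits` is made up here. It uses exact rational arithmetic so that the repeating 0011 pattern is not hidden by floating-point rounding.

```python
from fractions import Fraction

def binary_fraction_bits(value, count):
    """Return the first `count` bits after the binary point of 0 <= value < 1."""
    bits = []
    for _ in range(count):
        value *= 2
        bit = int(value >= 1)   # next bit of the binary expansion
        bits.append(bit)
        value -= bit
    return bits

bits = binary_fraction_bits(Fraction(1, 5), 12)
print("0." + "".join(str(b) for b in bits))  # 0.001100110011

# The pattern never terminates, so a binary double cannot hold 0.2 exactly.
print(Fraction(0.2) == Fraction(1, 5))       # False
```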


Decimal Multiplication Challenges

  • Greater number of multiplicand tuples

    • Complicates partial product generation

  • Representing decimal values with two-state devices

    • Complicates partial product generation

    • Complicates partial product accumulation

  • Inability to use binary arithmetic techniques directly


Novel Aspects of Algorithm

  • Recode operands

    • Simplify partial product generation

    • Improve latency of partial product generation

  • Restrict magnitude range of partial product digits

    • Simplify partial product accumulation

    • Improve latency of partial product accumulation


Key Aspect of Algorithm

  • Generate partial products as needed, not a priori

  • Benefits:

    • Reduces cycles to generate tuples

    • Reduces wiring to distribute tuples

    • Eliminates registers needed to store tuples

  • Cost: possible added delay during the iterative portion of the algorithm

  • Reduce cost via pipelining

    • Generate partial product in cycle i

    • Accumulate partial product in cycle i+1


Operand Recode - Complexity of Digit-by-digit Products


Operand Recode - Mechanism

  • Need signed-digits to restrict range

  • E.g., 2 5 6 is recoded into 3 -4 -4

  • a_i^S ∈ {-5, -4, …, 0, …, +4, +5}

  • Recode in parallel all digits ≥ 5

  • Four cases: a_i ≥ 5?, a_{i-1} ≥ 5?

  • Need three operations

    • “Do nothing”

    • Increment

    • Radix complement

    • Diminished radix complement
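
The recoding described on this slide can be sketched in Python as follows. This is a software illustration under assumptions, not the hardware from the talk: the function name `recode` and the most-significant-first list layout are choices made here. Each output digit depends only on the digit itself and the next lower digit, which is what allows all digits to be recoded in parallel.

```python
def recode(digits):
    """Recode decimal digits (most significant first, each 0..9) into
    signed digits in -5..+5. One extra leading digit absorbs the carry."""
    out = []
    n = len(digits)
    for i, d in enumerate(digits):
        # The next lower digit sits at index i + 1 in this layout.
        lower_ge5 = i + 1 < n and digits[i + 1] >= 5
        if d >= 5 and lower_ge5:
            s = -(9 - d)      # diminished radix complement, negated
        elif d >= 5:
            s = -(10 - d)     # radix complement, negated
        elif lower_ge5:
            s = d + 1         # increment
        else:
            s = d             # "do nothing"
        out.append(s)
    lead = 1 if digits[0] >= 5 else 0
    return [lead] + out

print(recode([2, 5, 6]))  # [0, 3, -4, -4], the slide's example with a leading 0
```

Summing the recoded digits with their decimal weights gives back the original value: 3·100 - 4·10 - 4 = 256.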


Operand Recode - Implementation

  • Recode entire multiplicand, recode multiplier digit by digit

  • Fig. a: single digit

  • Fig. b: n-digit


Digit-by-digit Product - Mechanism

  • Restrict digits to yield only 16 combinations

    • Magnitude: {0, …, 9} → {-5, …, +5} (100)

    • Absolute value: {-5, …, +5} → {0, …, 5} (36)

    • Zero & identity: {0, …, 5} → {2, …, 5} (16)

  • Lookup-table or combinatorial logic

  • Product characteristics

    • Absolute value → sign correction

    • {0, …, 25}, i.e., two digits → overlap removal

    • Restrict LSD magnitude to at most 5 → signed-digit addition

  • LSD magnitude restriction eases

    • Overlap removal

    • Partial product accumulation
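
A minimal Python sketch of the digit-by-digit product, assuming signed digits in -5..+5 as produced by the operand recode. The names `TABLE` and `digit_product` are hypothetical, and a dictionary stands in for the lookup table or logic mentioned on the slide. Only the 16 combinations with both magnitudes in 2..5 are looked up; zero, identity, and sign are handled separately.

```python
# Hypothetical 16-entry table: magnitudes 2..5 times magnitudes 2..5.
TABLE = {(x, y): x * y for x in range(2, 6) for y in range(2, 6)}

def digit_product(a, b):
    """Multiply two signed digits in -5..+5.
    Returns (msd, lsd) with a * b == 10 * msd + lsd and |lsd| <= 5."""
    negative = (a < 0) != (b < 0)
    x, y = abs(a), abs(b)
    if x == 0 or y == 0:
        mag = 0                  # zero case
    elif x == 1:
        mag = y                  # identity case
    elif y == 1:
        mag = x                  # identity case
    else:
        mag = TABLE[(x, y)]      # one of the 16 remaining combinations
    msd, lsd = divmod(mag, 10)
    if lsd > 5:                  # keep the LSD magnitude at most 5
        lsd -= 10
        msd += 1
    if negative:                 # sign correction
        msd, lsd = -msd, -lsd
    return msd, lsd

print(digit_product(-4, 4))  # (-2, 4), since -16 = -20 + 4
```

With this split the most significant digit never exceeds 2 in magnitude, because the largest product is 25 = 2·10 + 5.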


Partial Product - Implementation

  • LSD mux selects:

    • a_0^S or b_i^S = 0

    • a_0^S = 1

    • b_i^S = 1

    • a_0^S and b_i^S > 1

  • MSD mux selects:

    • a_0^S and b_i^S < 2

    • a_0^S and b_i^S > 1

  • Fig. a: single digit

  • Fig. b: (n+1)-digit


Overlap Removal & Encoding

  • Partial products are sign-corrected, signed-magnitude digits in overlapped form

  • In each digit position

    • Four-bit, signed-magnitude digit {-5, …, +5}

    • Three-bit, signed-magnitude digit {-2, …, +2}

  • Prepare for partial product accumulation via Svoboda signed-digit adder

  • Use combinatorial circuit to

    • remove the overlap

    • produce Svoboda-encoded signed-digits
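
The overlapped form can be illustrated with a short Python sketch. It is an assumption-laden model rather than the circuit from the talk: `split` and `partial_product` are names invented here, overlap removal is modeled as a plain per-position sum, and the Svoboda encoding step is not modeled at all. In each position the LSD of one digit product (range -5..+5) meets the MSD of the product one position lower (range -2..+2).

```python
def split(p):
    """Split a digit product p (|p| <= 25) into (msd, lsd) with |lsd| <= 5."""
    sign = -1 if p < 0 else 1
    msd, lsd = divmod(abs(p), 10)
    if lsd > 5:
        lsd -= 10
        msd += 1
    return sign * msd, sign * lsd

def partial_product(a_digits, b):
    """a_digits: recoded multiplicand, most significant first, digits in -5..+5.
    b: one recoded multiplier digit in -5..+5.
    Returns n+1 digits (most significant first) after summing the overlap."""
    pairs = [split(a * b) for a in a_digits]
    msds = [m for m, _ in pairs] + [0]   # each MSD lands one position higher
    lsds = [0] + [l for _, l in pairs]
    return [m + l for m, l in zip(msds, lsds)]

print(partial_product([3, -4, -4], -4))  # [-1, 0, -2, -4], i.e. 256 * -4 = -1024
```

In this model the summed digits stay within -7..+7; how the actual design maps them into the adder's digit set is not shown here.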


Partial Product Accumulation

  • Addition with signed-digits eliminates carry propagation

  • Use Svoboda signed-digit adder to accumulate

    • Partial product in encoded form

    • Shifted intermediate product (previous iteration)

  • One final product digit converted to BCD each cycle

  • Four cases: IP_i[0] ≥ 0?, IP_{i-1}[0] ≥ 0?

  • Need four operations

    • Convert to BCD

    • Convert to BCD and decrement

    • Convert additive inverse to BCD and radix complement

    • Convert additive inverse to BCD, radix complement, and decrement
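
The four conversion operations can be sketched in Python as below. This is a hedged illustration, not the talk's datapath: the function name `signed_digits_to_bcd` is invented, and the sketch tracks an explicit borrow instead of looking only at the sign of the lower digit, which is an assumption made here so that a zero digit receiving a borrow is also handled. Digits are consumed least significant first, one per step, mirroring one product digit being converted each cycle.

```python
def signed_digits_to_bcd(digits_ls_first):
    """Convert signed digits (least significant first) of a non-negative
    value into digits 0..9. Returns (bcd_digits_ls_first, final_borrow)."""
    bcd = []
    borrow = 0
    for d in digits_ls_first:
        t = d - borrow           # "decrement" when the lower position borrowed
        if t < 0:
            t += 10              # radix complement of the magnitude
            borrow = 1
        else:
            borrow = 0
        bcd.append(t)
    return bcd, borrow

# 3 -4 -4 (most significant first) is the recoded form of 256.
print(signed_digits_to_bcd([-4, -4, 3]))  # ([6, 5, 2], 0)
```

A final borrow of 1 would indicate a negative value, which should not occur for the product magnitude.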


Cycle By Cycle


Block Diagram - Top


Block Diagram - Bottom


Summary

  • Algorithm utilizes restricted-range, signed digits throughout

  • Original aspects include:

    • Recoding operands into restricted-range, signed-digits

    • Generating non-overlapping, sign-corrected partial products from recoded operands

    • Recoding partial products for entry into signed-digit adder

  • Algorithm achieves a latency of n+5 cycles

  • Extendable to floating-point multiplication


Questions & Perhaps Some Answers

End

