Decimal Multiplication with Efficient Partial Product Generation


Mike Schulte
Dept. of Electrical & Computer Engineering
University of Wisconsin at Madison

Mark Erle, Eric Schwarz
Server & Technology Group
IBM

Outline
  • Introduction and motivation
  • Decimal multiplication challenges
  • Novel aspects of algorithm
  • Algorithm components
    • Operand recode
    • Digit-by-digit multiplication
    • Partial product generation
    • Overlap removal & encoding
    • Partial product accumulation
    • Final product correction
  • Summary
Introduction and Motivation
  • Preponderance of business data in decimal form
  • Inexact mapping between decimal and binary
  • Decimal arithmetic used (required) in banking, finance, insurance, accounting
  • Increasing support in arithmetic community (revising IEEE 754/854)
  • Significant speedup achievable in hardware
  • Multiplication a key function
By the way, we’re about 20% through the talk:

$0.20_{10} = 0.00110011\ldots_2$
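The inexact mapping behind this aside can be checked directly. The following is a minimal sketch, not part of the talk; the function name `binary_fraction_bits` is invented here. It expands a fraction into binary digits with exact rational arithmetic, so no floating-point rounding hides the result.

```python
from fractions import Fraction

def binary_fraction_bits(value, n_bits):
    """Return the first n_bits binary digits of 0 <= value < 1 and the leftover fraction."""
    bits = []
    for _ in range(n_bits):
        value *= 2
        bit = 1 if value >= 1 else 0   # next binary digit
        bits.append(bit)
        value -= bit                   # keep only the fractional part
    return bits, value

bits, leftover = binary_fraction_bits(Fraction(1, 5), 12)
print("".join(map(str, bits)))  # 001100110011
print(leftover)                 # 1/5
```

The leftover returns to 1/5 after every four bits, so the pattern 0011 repeats without end and 0.2 has no finite binary representation.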

Decimal Multiplication Challenges
  • Greater number of multiplicand tuples
    • Complicates partial product generation
  • Representing decimal values with two-state devices
    • Complicates partial product generation
    • Complicates partial product accumulation
  • Inability to use binary arithmetic techniques directly
Novel Aspects of Algorithm
  • Recode operands
    • Simplify partial product generation
    • Improve latency of partial product generation
  • Restrict magnitude range of partial product digits
    • Simplify partial product accumulation
    • Improve latency of partial product accumulation
Key Aspect of Algorithm
  • Generate partial products as needed, not a priori
  • Benefits:
    • Reduces cycles to generate tuples
    • Reduces wiring to distribute tuples
    • Eliminates registers needed to store tuples
  • Cost can be delay during iterative portion of algorithm
  • Reduce cost via pipelining
    • Generate partial product in cycle i
    • Accumulate partial product in cycle i+1
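The generate-then-accumulate overlap can be pictured with a small software model. This is only a sketch under loud assumptions: plain Python integers stand in for the signed-digit datapath, `pipelined_multiply` and `pipeline_register` are invented names, and the cycle count it reports describes this toy model, not the latency of the actual design.

```python
def pipelined_multiply(multiplicand, multiplier_digits_lsd_first):
    """Model a two-stage pipeline: generate in cycle i, accumulate in cycle i+1."""
    n = len(multiplier_digits_lsd_first)
    accumulator = 0
    pipeline_register = None   # partial product generated in the previous cycle
    cycles = 0
    for i in range(n + 1):     # one extra cycle drains the pipeline
        # Accumulate stage: consume the partial product generated one cycle earlier.
        if pipeline_register is not None:
            accumulator += pipeline_register
        # Generate stage: form the partial product for multiplier digit i on demand.
        if i < n:
            pipeline_register = multiplicand * multiplier_digits_lsd_first[i] * 10 ** i
        else:
            pipeline_register = None
        cycles += 1
    return accumulator, cycles

product, cycles = pipelined_multiply(256, [3, 2, 1])   # 256 x 123, digits LSD first
print(product, cycles)
```

Only one partial product exists at any time, which is the point of generating them as needed: nothing is precomputed or stored beyond the single pipeline register.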
Operand Recode - Mechanism
  • Need signed-digits to restrict range
  • E.g., 2 5 6 is recoded into 3 -4 -4
  • $a_i^S \in \{-5, -4, \ldots, 0, \ldots, +4, +5\}$
  • Recode in parallel all digits $\ge 5$
  • Four cases: $a_i \ge 5$?, $a_{i-1} \ge 5$?
  • Need three operations
    • “Do nothing”
    • Increment
    • Radix complement
    • Diminished radix complement
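The four cases above can be sketched in software as follows. This is an illustrative model, not the hardware described in the talk; the names `recode_operand` and `signed_digits_value` are invented here, and the input is assumed to be a non-empty list of digits 0..9, most significant digit first. Each output digit depends only on its own input digit and the next less significant one, which is what allows all digits to be recoded in parallel.

```python
def recode_operand(digits_msd_first):
    """Recode digits 0..9 into signed digits -5..+5 (result has one extra leading digit)."""
    n = len(digits_msd_first)
    recoded = []
    for i, digit in enumerate(digits_msd_first):
        # The next less significant digit decides whether this digit is incremented.
        lower = digits_msd_first[i + 1] if i + 1 < n else 0
        increment = 1 if lower >= 5 else 0
        if digit >= 5:
            # digit - 10 is the negated radix complement; with the increment it
            # becomes digit - 9, the negated diminished radix complement.
            recoded.append(digit - 10 + increment)
        else:
            recoded.append(digit + increment)   # "do nothing" or increment
    leading = 1 if digits_msd_first[0] >= 5 else 0
    return [leading] + recoded

def signed_digits_value(digits_msd_first):
    """Evaluate a signed-digit string as an integer."""
    value = 0
    for d in digits_msd_first:
        value = value * 10 + d
    return value

print(recode_operand([2, 5, 6]))                        # [0, 3, -4, -4]
print(signed_digits_value(recode_operand([2, 5, 6])))   # 256
```

The example reproduces the slide's recoding of 2 5 6 into 3 -4 -4, with an explicit leading 0 that becomes 1 whenever the most significant digit is 5 or more.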
Operand Recode - Implementation
  • Recode entire multiplicand, recode multiplier digit by digit
  • Fig. a: single digit
  • Fig. b: n-digit
Digit-by-digit Product - Mechanism
  • Restrict digits to yield only 16 combinations
    • Magnitude: {0, …, 9} → {-5, …, +5} (100)
    • Absolute value: {-5, …, +5} → {0, …, 5} (36)
    • Zero & identity: {0, …, 5} → {2, …, 5} (16)
  • Lookup-table or combinatorial logic
  • Product characteristics
    • Absolute value → sign correction
    • {0, …, 25}, i.e., two digits → overlap removal
    • Restrict LSD to |5| → signed-digit addition
  • LSD magnitude restriction eases
    • Overlap removal
    • Partial product accumulation
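The combination counts on this slide can be reproduced by enumeration. The sketch below is illustrative only; `split_product` is an invented helper, and its tie-breaking rule (a least significant digit of exactly 5 stays positive, so 15 becomes 1 and +5 rather than 2 and -5) is an assumption, since the slides do not say how that case is resolved.

```python
from itertools import product

raw_pairs = list(product(range(10), repeat=2))                  # unrecoded digit pairs
signed_pairs = list(product(range(-5, 6), repeat=2))            # pairs after recoding
magnitude_pairs = {(abs(a), abs(b)) for a, b in signed_pairs}   # signs handled separately
table_pairs = {(x, y) for x, y in magnitude_pairs if x >= 2 and y >= 2}  # 0 and 1 bypass the table

def split_product(magnitude):
    """Split 0..25 into (msd, lsd) with magnitude == 10*msd + lsd and |lsd| <= 5."""
    msd, lsd = divmod(magnitude, 10)
    if lsd > 5:                      # e.g. 16 = 1*10 + 6 becomes 2*10 - 4
        msd, lsd = msd + 1, lsd - 10
    return msd, lsd

table = {(x, y): split_product(x * y) for x, y in sorted(table_pairs)}

print(len(raw_pairs), len(magnitude_pairs), len(table))   # 100 36 16
print(table[(4, 4)])                                      # (2, -4)
```

With this split, the most significant digit of every table entry stays within 0..2 and the least significant digit within -5..+5.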
Partial Product - Implementation
  • LSD mux selects:
    • $a_0^S$ or $b_i^S = 0$
    • $a_0^S = 1$
    • $b_i^S = 1$
    • $a_0^S$ and $b_i^S > 1$
  • MSD mux selects:
    • $a_0^S$ and $b_i^S < 2$
    • $a_0^S$ and $b_i^S > 1$
  • Fig. a: single digit
  • Fig. b: (n+1)-digit
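The mux selections can be modelled in software as below. This is a sketch under stated assumptions, not the circuit from the figures: the mux conditions are applied to digit magnitudes, the "< 2" case is read as "at least one magnitude below 2", the sign is applied afterwards, and all names (`lsd_mux`, `msd_mux`, `digit_product`, `partial_product`, `TABLE`) are invented here.

```python
def split_product(magnitude):
    """Split 0..25 into (msd, lsd) with |lsd| <= 5 (a tie at 5 stays positive: an assumption)."""
    msd, lsd = divmod(magnitude, 10)
    if lsd > 5:
        msd, lsd = msd + 1, lsd - 10
    return msd, lsd

# 16-entry table for magnitudes 2..5 only; 0 and 1 are handled by the muxes.
TABLE = {(x, y): split_product(x * y) for x in range(2, 6) for y in range(2, 6)}

def lsd_mux(x, y):
    if x == 0 or y == 0:
        return 0              # either magnitude is zero
    if x == 1:
        return y              # identity: pass the other magnitude through
    if y == 1:
        return x
    return TABLE[(x, y)][1]   # both magnitudes > 1: table lookup

def msd_mux(x, y):
    if x < 2 or y < 2:
        return 0              # product fits in one digit
    return TABLE[(x, y)][0]

def digit_product(a, b):
    """Signed digit times signed digit, returned as sign-corrected (msd, lsd)."""
    sign = -1 if (a < 0) != (b < 0) else 1
    x, y = abs(a), abs(b)
    return sign * msd_mux(x, y), sign * lsd_mux(x, y)

def partial_product(multiplicand_lsd_first, b):
    """Multiply every multiplicand digit by one multiplier digit (overlapped form)."""
    pairs = [digit_product(a, b) for a in multiplicand_lsd_first]
    return [m for m, _ in pairs], [l for _, l in pairs]

msds, lsds = partial_product([-4, -4, 3], 4)   # recoded 256 (LSD first) times 4
print(msds, lsds)                              # [-2, -2, 1] [4, 4, 2]
```

Each position yields two digits, so neighbouring positions overlap: the MSD produced at position k carries the same weight as the LSD produced at position k+1.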
Overlap Removal & Encoding
  • Partial products are sign-corrected, signed-magnitude digits in overlapped form
  • In each digit position
    • Four-bit, signed-magnitude digit {-5, …, +5}
    • Three-bit, signed-magnitude digit {-2, …, +2}
  • Prepare for partial product accumulation via Svoboda signed-digit adder
  • Use combinatorial circuit to
    • remove the overlap
    • produce Svoboda-encoded signed-digits
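Arithmetically, removing the overlap means merging the two digits that share a weight in each position. The sketch below shows only that merging step, with plain integers; the Svoboda encoding itself is deliberately not modelled, and the function name `remove_overlap` and its list-based interface are assumptions made for illustration.

```python
def remove_overlap(lsds, msds):
    """Merge overlapped digit pairs (both lists LSD first, equal length n).

    Position k receives the LSD produced at position k plus the MSD produced
    at position k-1, so the result has n+1 digits.
    """
    n = len(lsds)
    merged = []
    for k in range(n + 1):
        low = lsds[k] if k < n else 0        # four-bit digit in -5..+5
        high = msds[k - 1] if k >= 1 else 0  # three-bit digit in -2..+2
        merged.append(low + high)
    return merged

# Overlapped partial product of recoded 256 (digits 3 -4 -4) times 4:
digits = remove_overlap([4, 4, 2], [-2, -2, 1])
print(digits)                                            # [4, 2, 0, 1]
print(sum(d * 10 ** k for k, d in enumerate(digits)))    # 1024
```

Given the ranges stated on the slide, each merged digit lies between -7 and +7 in this model; how the actual circuit maps that onto the adder's digit encoding is not covered here.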
Partial Product Accumulation
  • Addition with signed-digits eliminates carry propagation
  • Use Svoboda signed-digit adder to accumulate
    • Partial product in encoded form
    • Shifted intermediate product (previous iteration)
  • One final product digit converted to BCD each cycle
  • Four cases: $IP_i[0] \ge 0$?, $IP_{i-1}[0] \ge 0$?
  • Need four operations
    • Convert to BCD
    • Convert to BCD and decrement
    • Convert additive inverse to BCD and radix complement
    • Convert additive inverse to BCD, radix complement, and decrement
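The digit-at-a-time conversion to BCD can be sketched as a borrow-tracking loop. This is a simplified model and an assumption about the behaviour, not the circuit from the talk: it tracks an explicit borrow rather than testing the sign of the previous intermediate-product digit, it assumes the overall value is non-negative, and `signed_digits_to_bcd` is an invented name.

```python
def signed_digits_to_bcd(digits_lsd_first):
    """Convert a signed-digit number (non-negative value) to digits 0..9, LSD first."""
    bcd = []
    borrow = 0
    for d in digits_lsd_first:
        d -= borrow                # "decrement" when the lower digit borrowed
        if d < 0:
            bcd.append(d + 10)     # radix complement of the additive inverse
            borrow = 1
        else:
            bcd.append(d)          # plain conversion
            borrow = 0
    if borrow:
        raise ValueError("value is negative; not representable as unsigned BCD")
    return bcd

print(signed_digits_to_bcd([-4, -4, 3]))   # [6, 5, 2], i.e. 256
print(signed_digits_to_bcd([4, 2, 0, 1]))  # [4, 2, 0, 1], i.e. 1024
```

The four branches of the loop (negative or not, with or without an incoming borrow) line up with the four operations listed above.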
Summary
  • Algorithm utilizes restricted-range, signed digits throughout
  • Original aspects include:
    • Recoding operands into restricted-range, signed-digits
    • Generating non-overlapping, sign-corrected partial products from recoded operands
    • Recoding partial products for entry into signed-digit adder
  • Algorithm achieves n+5 latency
  • Extendable to floating-point multiplication