Decimal Floating-point Multiplication via Carry-Save Addition

Presentation transcript:

Decimal Floating-point Multiplication via Carry-Save Addition
Mark Erle, Systems & Technology Group, International Business Machines
Brian Hickmann & Mike Schulte, Electrical & Computer Engineering, University of Wisconsin at Madison

2 Outline
Introduction and motivation
Extensions to fixed-point design
Implementation highlights
Verification and synthesis results
Summary

3 Introduction
Preponderance of business data in decimal form
Inexact mapping between decimal and binary
Decimal arithmetic used/required in banking, finance, insurance, accounting
Increasing support in arithmetic community (IEEE P754 in ballot review process)
Multiplication a key function
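The "inexact mapping" point can be seen in a few lines. This is only an illustrative sketch using Python's standard `decimal` module; it is not part of the presented hardware design.

```python
from decimal import Decimal

# Binary floating-point cannot represent 0.1, 0.2 or 0.3 exactly,
# so the sum picks up a small representation error.
binary_exact = (0.1 + 0.2) == 0.3

# Decimal arithmetic keeps the same values exact.
decimal_exact = (Decimal("0.1") + Decimal("0.2")) == Decimal("0.3")

print(binary_exact, decimal_exact)
```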

4 Motivation
What's involved in extending fixed-point multiplication to support floating-point?
What are the similarities and differences with binary floating-point (BFP) multiplication?

5

6 Intermediate Exponent Calculation
Preferred exponent:
$PE = E_A + E_B - \mathrm{bias}$
Based on location of the decimal point (effective shift right):
$IE_{IP} = PE + p$
After left shifting the intermediate product:
$IE_{SIP} = IE_{IP} - SLA$
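The three exponent formulas chain together as in the sketch below. The function name is hypothetical, and the defaults `p=16` and `bias=398` are an assumption taken from the decimal64 format rather than from the slide.

```python
def intermediate_exponents(e_a, e_b, sla, p=16, bias=398):
    """Return (PE, IE_IP, IE_SIP) for biased operand exponents e_a and e_b.

    p and bias default to decimal64 values (an assumption for illustration).
    """
    pe = e_a + e_b - bias   # preferred exponent
    ie_ip = pe + p          # decimal point taken above the low p digits
    ie_sip = ie_ip - sla    # each left-shifted digit lowers the exponent by one
    return pe, ie_ip, ie_sip


print(intermediate_exponents(400, 396, sla=3))
```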

7 Intermediate Product Shifting
Based on leading zero counts of operands
SLA may be off by one; need guard digit
$SLA = \min(LZ_A + LZ_B,\ p)$
Shift right when $IE_{IP} < E_{min}$
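A small software model shows why the shift amount can be off by one: the 2p-digit product has either LZ_A + LZ_B or LZ_A + LZ_B + 1 leading zeros. The helper names and the use of Python integers and strings are assumptions for illustration, not the hardware dataflow.

```python
def leading_zeros(value, width):
    """Leading zero count of a non-negative integer in a width-digit field."""
    return width - len(str(value)) if value else width


def shifted_product(a, b, p):
    """Return (SLA, 2p-digit product left-shifted by SLA digits)."""
    sla = min(leading_zeros(a, p) + leading_zeros(b, p), p)
    digits = str(a * b).zfill(2 * p)
    # The first SLA digits are guaranteed to be zeros, so dropping them
    # and appending zeros on the right models the left shift.
    return sla, (digits + "0" * sla)[sla:]


print(shifted_product(99, 99, p=4))  # leading zeros fully removed
print(shifted_product(12, 34, p=4))  # one leading zero remains
```

In the second call the most significant digit is still zero after the shift, which is the case the guard digit and the corrective left shift exist to handle.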

8 Sticky Bit Generation
Logically, all bits beyond the round digit must be ORed after left shifting
$SC = S_{IP} - p - 2$, where 2 is for g and r
Generate sticky bit on-the-fly, ORing one digit at a time while decrementing SC
$SC = \max(0,\ p - (LZ_A + LZ_B))$
– $S_{IP} - p = ((p - LZ_A) + (p - LZ_B)) - p$
– Calculate two cycles prior to when needed
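On-the-fly sticky generation can be modelled as a loop that ORs product digits as they emerge, least significant first, while a counter runs down. This is a minimal sketch; the function name and the list-of-digits input are assumptions, and the counter value is passed in rather than derived.

```python
def sticky_on_the_fly(digits_lsd_first, sc):
    """OR the first sc digits (least significant first) into a sticky bit."""
    sticky = False
    for digit in digits_lsd_first:
        if sc <= 0:
            break               # remaining digits belong to r, g or the result
        sticky = sticky or digit != 0
        sc -= 1                 # one digit consumed per iteration
    return sticky


print(sticky_on_the_fly([0, 0, 3, 4], sc=2))
print(sticky_on_the_fly([0, 0, 3, 4], sc=3))
```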

9 Rounding – Scheme
No rounding overflow... simplifies scheme
Unique compound adder needed
– Shifted intermediate product (SIP) may be in redundant form
– Require $C_{SIP} + 0$ and $C_{SIP} + 1$, named $C^{+0}$ and $C^{+1}$
Possible corrective left shift (cls) of one digit
– $S_{IP} = S_A + S_B$ or $S_A + S_B - 1$
– Adder p digits wide
– Concatenate g or g + 1

10 Rounding – Scheme Continued
Three cases based on MSDs of $C^{+0}$ and $C^{+1}$
– No leading zeros, no corrective left shift
– Leading zeros, possible corrective left shift
– Zero followed by all nines
Logically, select one among the following
– $C^{+0}$, $C^{+1}$
– $C^{+0} \ll 1 \parallel g$, $C^{+0} \ll 1 \parallel (g + 1)$
– $C^{+1} \ll 1 \parallel g$, $C^{+1} \ll 1 \parallel (g + 1)$
– Zero, largest finite number, infinity
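The selection among the two compound adder outputs can be sketched in software for a single rounding mode. The code below is a simplified model, not the presented control logic: it assumes round-half-even, a non-redundant p-digit value `c`, and it folds the incremented guard digit into the choice of adder output. All names are hypothetical.

```python
def round_half_even(c, g, r, sticky, p):
    """Round the value c.g r (plus sticky) to p digits.

    Returns (significand, corrective_left_shift_applied).
    """
    c0, c1 = c, c + 1                  # compound adder outputs C+0 and C+1
    msd_weight = 10 ** (p - 1)
    if c0 >= msd_weight:
        # No leading zero: g is the rounding digit, r joins the sticky bit.
        rest = r != 0 or sticky
        up = g > 5 or (g == 5 and (rest or c0 % 2 == 1))
        # C+1 is assumed not to overflow p digits (no rounding overflow).
        return (c1 if up else c0), False
    # Leading zero: shift left one digit, g becomes the LSD, r rounds.
    up = r > 5 or (r == 5 and (sticky or g % 2 == 1))
    if not up:
        return c0 * 10 + g, True
    if g < 9:
        return c0 * 10 + g + 1, True
    if c1 >= msd_weight:
        return c1, False               # zero followed by all nines
    return c1 * 10, True               # carry out of g ripples into C+1


print(round_half_even(123, 4, 5, True, p=4))
print(round_half_even(999, 9, 9, False, p=4))
```

When the corrective left shift is applied, the result exponent would also be decremented by one; that bookkeeping is left out of the sketch.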

11 Exception Detection & Handling
Invalid operation
– sNaN (pass significand of sNaN)
– 0 × ∞ (produce qNaN with significand 0)
Overflow (and Inexact)
– $IE_{IP} - SLA > E_{max}$
– Increase SLA until all LZs removed
Underflow (and possibly Inexact)
– $IE_{IP} - SLA < E_{min}$
– Decrease SLA until 0, then shift right
Inexact
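The overflow and underflow adjustments to the shift amount can be sketched as follows. This is an illustrative model under stated assumptions: `max_sla` (the shift that removes all leading zeros) is a hypothetical parameter, and the return tuple is a convenience of the sketch rather than a description of the actual control signals.

```python
def adjust_shift(ie_ip, sla, max_sla, emin, emax):
    """Return (sla, shift_right, exponent, overflow) after range checks."""
    exp = ie_ip - sla
    if exp > emax:
        # Overflow candidate: shift further left while leading zeros remain.
        extra = min(exp - emax, max_sla - sla)
        sla += extra
        exp -= extra
        return sla, 0, exp, exp > emax
    if exp < emin:
        # Underflow candidate: give back left shift first, then shift right.
        give_back = min(emin - exp, sla)
        sla -= give_back
        exp += give_back
        shift_right = max(0, emin - exp)
        exp += shift_right
        return sla, shift_right, exp, False
    return sla, 0, exp, False


print(adjust_shift(390, 2, 10, emin=-383, emax=384))
print(adjust_shift(-390, 2, 10, emin=-383, emax=384))
```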

12

13 Implementation Highlights
Leverage operands' leading zero counts (LZCs)
– SC, SLA, and $IE_{SIP}$
Handle NaNs with minimal overhead
– No dataflow modification
– Coerce multiplicand or multiplier to 1
Support gradual underflow
– No dataflow modification
– Simply extend number of iterations
Simple, control-based rounding scheme

14 RTL Model and Verification
Verilog model for both fixed-point and floating-point multiplier designs
All rounding modes, NaNs, exceptions
Over 500,000 random & directed testcases
– IBM decNumber based
– IBM Haifa's FPgen (IEEE754R compliance)
– IBM dectest
Validated pre- and post-synthesis

15 Synthesis Results
64-bit (16 digit) operands, DPD encoded
LSI Logic's gflxp 0.11um CMOS, 55ps FO4
Synopsys Design Compiler
Results
– Fixed-point: 119,653 um FO4s
– Floating-point: 237,607 um FO4s
Critical path
– Fixed-point: 4:2 compressor (accumulator)
– Floating-point: 128-bit barrel shifter

16 Applicability to Parallel Designs
IE and IP shift generation
Rounding scheme
NaN handling
Exception detection and handling
On-the-fly sticky bit generation... NO

17 Sequential vs. Parallel
Sequential
– Less area
– Potentially better cycle time
Parallel
– Less latency
– Higher throughput

18 Summary
Extended fixed-point, serial multiplier to support floating-point
Leveraged operands' LZCs
Developed an efficient rounding scheme
Verified RTL and gate-level models
Presented area and delay numbers for fixed- and floating-point designs
Discussed applicability to parallel designs

19 And there you have it! Long live the decimal system!

20 Backup Slides

21 No Rounding Overflow
If $S_{IP} = 2p - 1$
– MSD == 0
– Increment will not cause rounding overflow
If $S_{IP} = 2p$
– Then we must have string of p 9s
– p 9s is greater than maximum product
– No rounding overflow possible
Simplifies rounding scheme
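The claim that a string of p nines exceeds the maximum product can be checked numerically. The sketch below relies on the identity (10^p - 1)^2 = (10^p - 2) * 10^p + 1, so the upper p digits of the largest product end in 8 rather than 9; the function name is hypothetical.

```python
def upper_half_of_max_product(p):
    """Upper p digits of the largest 2p-digit product of two p-digit operands."""
    largest_operand = 10 ** p - 1
    return (largest_operand * largest_operand) // 10 ** p


for p in (1, 4, 7, 16, 34):
    upper = upper_half_of_max_product(p)
    all_nines = 10 ** p - 1
    print(p, upper, upper < all_nines)
```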

22 Decimal Storage Format