Floating Point Operations


Floating Point Operations Chap 13 Tae-min Hwang

Index
1. Floating point data
2. Floating point unit (FPU)
3. Lazy stacking
4. Using FPU
5. Floating point exceptions

Floating point data
Single precision (32-bit), e.g. float pf = 3.141592F;

Single precision data format:
  Bit 31      Sign
  Bits 30:23  Exponent
  Bits 22:0   Fraction

$$\text{Value} = (-1)^{\text{Sign}} \times 2^{(\text{Exponent} - 127)} \times \left(1 + \tfrac{1}{2}\,\text{Fraction}[22] + \tfrac{1}{4}\,\text{Fraction}[21] + \tfrac{1}{8}\,\text{Fraction}[20] + \dots + \tfrac{1}{2^{23}}\,\text{Fraction}[0]\right)$$
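The single precision format above can be checked on a host machine with a short C sketch. It assumes that float is a 32-bit IEEE 754 value on the build machine; the names sp_fields, sp_decode and sp_value are made up for this illustration, and sp_value only handles normalized numbers (exponent 1 to 254).

```c
#include <stdint.h>
#include <string.h>

/* The three fields of a single precision value (hypothetical helper type). */
typedef struct {
    uint32_t sign;      /* bit 31 */
    uint32_t exponent;  /* bits 30:23 */
    uint32_t fraction;  /* bits 22:0 */
} sp_fields;

/* Split a float into its fields. Assumes float is IEEE 754 binary32. */
sp_fields sp_decode(float f)
{
    uint32_t bits;
    sp_fields out;

    memcpy(&bits, &f, sizeof bits);   /* well-defined way to read the bits */
    out.sign     = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.fraction = bits & 0x7FFFFFu;
    return out;
}

/* Rebuild the value from the fields with the formula for normalized numbers:
   (-1)^Sign * 2^(Exponent - 127) * (1 + Fraction / 2^23). */
double sp_value(sp_fields p)
{
    double mantissa = 1.0 + (double)p.fraction / 8388608.0;  /* 2^23 */
    double scale = 1.0;
    int e = (int)p.exponent - 127;

    for (; e > 0; --e) scale *= 2.0;
    for (; e < 0; ++e) scale /= 2.0;
    return (p.sign ? -1.0 : 1.0) * mantissa * scale;
}
```

For example, -6.5 is 1.625 x 2^2, so sp_decode(-6.5f) yields sign 1, exponent 129 and fraction 0x500000.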

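Floating point data
Half precision (16-bit), type __fp16; an additional compiler command-line option is needed

Half precision data format:
  Bit 15      Sign
  Bits 14:10  Exponent
  Bits 9:0    Fraction

$$\text{Value} = (-1)^{\text{Sign}} \times 2^{(\text{Exponent} - 15)} \times \left(1 + \tfrac{1}{2}\,\text{Fraction}[9] + \tfrac{1}{4}\,\text{Fraction}[8] + \tfrac{1}{8}\,\text{Fraction}[7] + \dots + \tfrac{1}{2^{10}}\,\text{Fraction}[0]\right)$$

Double precision (64-bit), e.g. double pf = 3.1415926535897932384626433832795;

Double precision data format:
  Bit 63      Sign
  Bits 62:52  Exponent
  Bits 51:0   Fraction

$$\text{Value} = (-1)^{\text{Sign}} \times 2^{(\text{Exponent} - 1023)} \times \left(1 + \tfrac{1}{2}\,\text{Fraction}[51] + \tfrac{1}{4}\,\text{Fraction}[50] + \tfrac{1}{8}\,\text{Fraction}[49] + \dots + \tfrac{1}{2^{52}}\,\text{Fraction}[0]\right)$$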
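Because most host compilers have no native half precision type, the half precision formula is easiest to try on a raw 16-bit pattern. The sketch below is only an illustration: half_to_double is a hypothetical helper, it follows the IEEE half precision layout (sign in bit 15, exponent in bits 14:10, fraction in bits 9:0), and it leaves exponent 0x1F (infinity and NaN) to the caller.

```c
#include <stdint.h>

/* Decode an IEEE half precision bit pattern into a double.
   Returns 1 on success and 0 when the exponent field is 0x1F
   (infinity or NaN), which this sketch does not convert. */
int half_to_double(uint16_t h, double *out)
{
    unsigned sign     = (h >> 15) & 0x1u;
    unsigned exponent = (h >> 10) & 0x1Fu;
    unsigned fraction = h & 0x3FFu;
    double mantissa;
    double scale = 1.0;
    int e;

    if (exponent == 0x1Fu) {
        return 0;
    }
    if (exponent == 0u) {
        /* Zero or denormalized value: no implicit leading 1, exponent fixed at -14. */
        mantissa = fraction / 1024.0;
        e = -14;
    } else {
        /* Normalized value: implicit leading 1, bias of 15. */
        mantissa = 1.0 + fraction / 1024.0;
        e = (int)exponent - 15;
    }
    for (; e > 0; --e) scale *= 2.0;
    for (; e < 0; ++e) scale /= 2.0;

    *out = (sign ? -1.0 : 1.0) * mantissa * scale;
    return 1;
}
```

The pattern 0x7BFF (exponent 30, all fraction bits set) decodes to 65504, the largest finite IEEE half precision value.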

Floating point data
Special cases of the value

Precision  Exponent  Fraction           Sign        Value
Common     0         0                  0           +0
Common     0         0                  1           -0
Single     0         NZ                 0 or 1      DV (between -2^-126 and +2^-126)
Half       0         NZ                 0 or 1      DV (between -2^-14 and +2^-14)
Double     0         NZ                 0 or 1      DV (between -2^-1022 and +2^-1022)
Single     0xFF      0                  0           +∞
Single     0xFF      0                  1           -∞
Single     0xFF      NZ (bit 22 == 0)   Don't care  Signaling NaN
Single     0xFF      NZ (bit 22 == 1)   Don't care  Quiet NaN

NZ: non-zero. DV: denormalized value. NaN: Not a Number.

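Floating point data
Special cases of the value (continued)

Precision  Exponent  Fraction           Sign        Value
Half       0x1F      0                  0           +∞
Half       0x1F      0                  1           -∞
Half       0x1F      NZ (bit 9 == 0)    Don't care  Signaling NaN
Half       0x1F      NZ (bit 9 == 1)    Don't care  Quiet NaN
Double     0x7FF     0                  0           +∞
Double     0x7FF     0                  1           -∞
Double     0x7FF     NZ (bit 51 == 0)   Don't care  Signaling NaN
Double     0x7FF     NZ (bit 51 == 1)   Don't care  Quiet NaN

Signaling NaN: a NaN that cannot be used in an operation (using it as an operand raises an invalid operation exception). Quiet NaN: the opposite; it propagates through operations without raising an exception.

NZ: non-zero. DV: denormalized value. NaN: Not a Number.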
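The special cases for single precision can be expressed as a small classifier over the raw 32-bit pattern. This is a sketch for illustration only; sp_class and sp_classify_bits are hypothetical names, and the rules are the ones listed for single precision (exponent 0 or 0xFF, with bit 22 of the fraction separating signaling from quiet NaNs).

```c
#include <stdint.h>

typedef enum {
    SP_ZERO,
    SP_DENORMAL,
    SP_NORMAL,
    SP_INFINITY,
    SP_SIGNALING_NAN,
    SP_QUIET_NAN
} sp_class;

/* Classify a single precision bit pattern; the sign bit does not
   affect the class, so it is ignored here. */
sp_class sp_classify_bits(uint32_t bits)
{
    uint32_t exponent = (bits >> 23) & 0xFFu;
    uint32_t fraction = bits & 0x7FFFFFu;

    if (exponent == 0u) {
        return (fraction == 0u) ? SP_ZERO : SP_DENORMAL;
    }
    if (exponent == 0xFFu) {
        if (fraction == 0u) {
            return SP_INFINITY;
        }
        /* Bit 22 is the most significant fraction bit. */
        return (fraction & 0x400000u) ? SP_QUIET_NAN : SP_SIGNALING_NAN;
    }
    return SP_NORMAL;
}
```

The same structure applies to half and double precision by changing the field widths (exponent 0x1F with bit 9, or exponent 0x7FF with bit 51).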

Floating point unit (FPU)
Overview
The FPU design in the Cortex-M4 is compliant with the IEEE 754-2008 standard for binary floating point arithmetic, but it is not a complete implementation.

The floating point design supports:
- Thirty-two 32-bit registers, S0 ~ S31
- Single-precision floating point calculations
- Conversions: integer ↔ single precision, fixed point ↔ single precision, half precision ↔ single precision
- Data transfers of single-precision and double-word data between the floating point register bank and memory
- Data transfers of single-precision data between the floating point register bank and the integer register bank

Not implemented (needs to be handled by software):
- Floating point remainder, e.g. fmod(x, y)
- Double-precision calculations, e.g. double pf = 3.1415926535897932384626433832795;
- Binary-to-decimal and decimal-to-binary conversions, e.g. 1001 (base 2) → 9 and 9 → 1001 (base 2)
- Direct comparison between single and double precision values
- Rounding a floating point number to an integer-valued floating point number

Floating point unit (FPU)
Overview
- In the architecture, the FPU is viewed as a co-processor.
- In the Cortex-M4 processor, a set of floating point instructions is used.

Floating point unit (FPU)
Floating point registers
The FPU adds a number of registers to the processor system:
- CPACR (Co-processor Access Control Register) in the SCB (System Control Block)
- Floating point register bank
- Floating point Status and Control Register (FPSCR)
- Additional registers in the FPU for floating point operations and control (FPCCR, FPCAR, FPDSCR, MVFR0, MVFR1)

CPACR register
Description: allows you to enable or disable the FPU
CMSIS-Core: SCB->CPACR

Bits   Field
31:24  Reserved
23:22  CP11
21:20  CP10
19:0   Reserved

CP10 and CP11 settings:
Bits  Setting
00    Access denied (default). Any attempted access generates a Usage fault (type NOCP, No Co-processor)
01    Privileged access only. Unprivileged access generates a Usage fault
10    Reserved; the result is unpredictable
11    Full access

By default, CP10 and CP11 are zero after reset.
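Enabling the FPU therefore means setting both CP10 and CP11 to 11 (full access). The sketch below works on a plain 32-bit value so that it can be tried on a host; cpacr_enable_fpu and cpacr_cp10 are hypothetical helper names, and the bit positions are the ones given for CPACR above.

```c
#include <stdint.h>

/* CP10 occupies bits 21:20 and CP11 bits 23:22 of CPACR. */
#define CPACR_CP10_SHIFT 20u
#define CPACR_CP11_SHIFT 22u

/* Return the CPACR value with CP10 and CP11 both set to 0b11 (full access),
   leaving all other bits unchanged. */
uint32_t cpacr_enable_fpu(uint32_t cpacr)
{
    return cpacr | (3u << CPACR_CP10_SHIFT) | (3u << CPACR_CP11_SHIFT);
}

/* Extract the 2-bit CP10 access setting (0 = denied, 1 = privileged only,
   2 = reserved, 3 = full access). */
unsigned cpacr_cp10(uint32_t cpacr)
{
    return (unsigned)((cpacr >> CPACR_CP10_SHIFT) & 3u);
}
```

On a target with CMSIS-Core headers, the equivalent step would presumably be SCB->CPACR = cpacr_enable_fpu(SCB->CPACR); followed by the __DSB() and __ISB() barrier intrinsics before the first floating point instruction. Device start-up code often does this already when the FPU is in use.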

Floating point unit (FPU)
Floating point register bank
S0 ~ S15: caller-saved registers
S16 ~ S31: callee-saved registers

Floating point unit (FPU)
Floating point status and control register (FPSCR)
Description: holds the arithmetic result flags and sticky status flags, as well as bit fields to control the behavior of the floating point unit
CMSIS-Core: __get_FPSCR(), __set_FPSCR()

Control fields:
  AHP    Alternate half precision control bit: 0 = IEEE half precision, 1 = alternative half precision
  DN     Default NaN (Not a Number) mode control bit: 0 = NaN operands propagate through to the output of a floating point operation (default); 1 = any operation involving one or more NaNs returns the default NaN
  FZ     Flush-to-zero mode control bit: 0 = flush-to-zero mode disabled (default, IEEE 754 standard compliant); 1 = flush-to-zero mode enabled, denormalized values (tiny values with exponent equal to 0) are flushed to 0
  RMode  Rounding mode control field; the specified rounding mode is used by almost all floating point instructions:
         00 = Round to Nearest (RN) mode (default)
         01 = Round towards Plus Infinity (RP) mode
         10 = Round towards Minus Infinity (RM) mode
         11 = Round towards Zero (RZ) mode

Cumulative exception bits (each is set to 1 when the corresponding floating point exception has occurred, and is cleared by writing 0 to the bit):
  IDC  Input Denormal
  IXC  Inexact
  UFC  Underflow
  OFC  Overflow
  DZC  Division by Zero
  IOC  Invalid Operation

Notes: rounding means approximating the result; flush to zero means the value is treated as 0.
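The FPSCR fields can be pulled out of a register value with shifts and masks. The following is a host-side sketch, not vendor code: fpscr_fields, fpscr_decode and fpscr_with_rmode are hypothetical names, and the bit positions (AHP 26, DN 25, FZ 24, RMode 23:22, IDC 7, IXC 4, UFC 3, OFC 2, DZC 1, IOC 0) are taken from the ARMv7-M architecture documentation rather than from the slide, so they should be checked against the reference manual for the device in use.

```c
#include <stdint.h>

/* Decoded view of the FPSCR control fields and cumulative exception bits. */
typedef struct {
    unsigned ahp;    /* bit 26: alternate half precision */
    unsigned dn;     /* bit 25: default NaN mode */
    unsigned fz;     /* bit 24: flush-to-zero mode */
    unsigned rmode;  /* bits 23:22: 0 = RN, 1 = RP, 2 = RM, 3 = RZ */
    unsigned idc;    /* bit 7: input denormal */
    unsigned ixc;    /* bit 4: inexact */
    unsigned ufc;    /* bit 3: underflow */
    unsigned ofc;    /* bit 2: overflow */
    unsigned dzc;    /* bit 1: division by zero */
    unsigned ioc;    /* bit 0: invalid operation */
} fpscr_fields;

fpscr_fields fpscr_decode(uint32_t v)
{
    fpscr_fields f;

    f.ahp   = (unsigned)((v >> 26) & 1u);
    f.dn    = (unsigned)((v >> 25) & 1u);
    f.fz    = (unsigned)((v >> 24) & 1u);
    f.rmode = (unsigned)((v >> 22) & 3u);
    f.idc   = (unsigned)((v >> 7) & 1u);
    f.ixc   = (unsigned)((v >> 4) & 1u);
    f.ufc   = (unsigned)((v >> 3) & 1u);
    f.ofc   = (unsigned)((v >> 2) & 1u);
    f.dzc   = (unsigned)((v >> 1) & 1u);
    f.ioc   = (unsigned)(v & 1u);
    return f;
}

/* Return v with the RMode field replaced by rmode (only the low 2 bits are used). */
uint32_t fpscr_with_rmode(uint32_t v, unsigned rmode)
{
    return (v & ~(3u << 22)) | ((uint32_t)(rmode & 3u) << 22);
}
```

On a target, the value passed in would come from __get_FPSCR() and a modified value would be written back with __set_FPSCR().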

Floating point unit (FPU)
Floating point context control register (FPCCR)
Description:
1. Allows you to control the behavior of exception handling, such as the lazy stacking feature
2. Allows you to access some of the control information
CMSIS-Core: FPU->FPCCR

Bits   Field
31     ASPEN
30     LSPEN
29:9   Reserved
8      MONRDY
7      Reserved
6      BFRDY
5      MMRDY
4      HFRDY
3      THREAD
2      Reserved
1      USER
0      LSPACT

LSPEN: enables or disables lazy stacking (state preservation) for S0-S15 and FPSCR. When this bit is set (default), the exception sequence uses the lazy stacking feature to ensure low interrupt latency.
LSPACT: 0 = lazy state preservation is not active. 1 = lazy state preservation is active: the floating point stack frame has been allocated, but saving state to it has been deferred.

Floating point context address register (FPCAR)
Description: holds the address of the FPU registers in the stack frame, so that the lazy stacking mechanism knows where to push the FPU registers later
CMSIS-Core: FPU->FPCAR

Bits   Field
31:3   ADDRESS
2:0    Reserved

Bits 2 to 0 are not used because the stack frame is double-word aligned.

Floating point unit (FPU)
Floating point default status control register (FPDSCR)
Description:
1. Holds the default configuration information (operation modes) for the floating point status control data. In a complex system there can be different types of applications running in parallel, each with a different FPU configuration.
2. The values are copied to the FPSCR at exception entry
CMSIS-Core: FPU->FPDSCR

Floating point unit (FPU)
Media and floating point feature registers (MVFR0, MVFR1)
Description:
1. Read-only registers
2. Allow software to determine which instruction features are supported
CMSIS-Core: FPU->MVFR0, FPU->MVFR1

Bits   MVFR0                                  MVFR1
31:28  Rounding modes                         Fused MAC
27:24  Short vectors                          Half-precision FP conversion
23:20  Square root                            -
19:16  Divide                                 -
15:12  FP exception trap                      -
11:8   Double precision                       -
7:4    Single precision                       Default NaN mode
3:0    Support 16 x 64-bit FP register bank   Flush-to-zero mode

Bit field value 0: feature is not available. Bit field value 1 or 2: feature is supported.
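Since each feature occupies one 4-bit field, software can test for a feature by extracting the matching nibble. The sketch below is illustrative: mvfr_field and the MVFR0_* index names are hypothetical, and the sample value 0x10110021 is the MVFR0 reset value that ARM documentation reports for the Cortex-M4, which should be treated as an assumption here.

```c
#include <stdint.h>

/* Index of each 4-bit field in MVFR0, counted from bits 3:0. */
enum {
    MVFR0_REGISTER_BANK    = 0,  /* bits 3:0   */
    MVFR0_SINGLE_PRECISION = 1,  /* bits 7:4   */
    MVFR0_DOUBLE_PRECISION = 2,  /* bits 11:8  */
    MVFR0_EXCEPTION_TRAP   = 3,  /* bits 15:12 */
    MVFR0_DIVIDE           = 4,  /* bits 19:16 */
    MVFR0_SQUARE_ROOT      = 5,  /* bits 23:20 */
    MVFR0_SHORT_VECTORS    = 6,  /* bits 27:24 */
    MVFR0_ROUNDING_MODES   = 7   /* bits 31:28 */
};

/* Extract the 4-bit feature field with the given index (0 to 7). */
unsigned mvfr_field(uint32_t reg, unsigned index)
{
    return (unsigned)((reg >> (4u * index)) & 0xFu);
}

/* A field value of 0 means the feature is not available. */
int mvfr_supported(uint32_t reg, unsigned index)
{
    return mvfr_field(reg, index) != 0u;
}
```

With the assumed Cortex-M4 value, the single precision field reads 2 and the double precision field reads 0, consistent with a single-precision-only FPU. On a target the register value would be read from FPU->MVFR0.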

Lazy stacking
Key elements of the lazy stacking feature
- FPCA bit in the CONTROL register: indicates whether the current context (e.g., task) has a floating point operation
- EXC_RETURN: indicates which version of the stack frame is used
- LSPACT bit in FPCCR: set to 1 when 1) lazy stacking is enabled, 2) the task has a floating point context, and 3) the longer stack frame is used
- FPCAR register: holds the address to be used for stacking of floating point registers S0-S15 and FPSCR
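How EXC_RETURN identifies the stack frame version can be sketched in a few lines. This is an illustration under stated assumptions: on ARMv7-M processors with an FPU, bit 4 of EXC_RETURN is 1 for the basic 8-word frame and 0 for the extended frame, which additionally reserves space for S0-S15, FPSCR and one reserved word (26 words in total). The function names are hypothetical, and any extra padding word inserted for stack alignment is ignored.

```c
#include <stdint.h>

/* Bit 4 of EXC_RETURN: 1 = basic frame, 0 = extended (floating point) frame. */
#define EXC_RETURN_BASIC_FRAME (1u << 4)

/* Returns 1 when the stacked frame has space for floating point registers. */
int exc_return_has_fp_frame(uint32_t exc_return)
{
    return (exc_return & EXC_RETURN_BASIC_FRAME) == 0u;
}

/* Number of 32-bit words in the stack frame, excluding alignment padding:
   8 words for R0-R3, R12, LR, PC, xPSR, plus 18 words for S0-S15, FPSCR
   and one reserved word when the extended frame is used. */
unsigned stack_frame_words(uint32_t exc_return)
{
    return exc_return_has_fp_frame(exc_return) ? 26u : 8u;
}
```

With lazy stacking active, the 18 extra words are allocated but not written until the handler executes its first floating point instruction; FPCAR records where they would go.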

Lazy stacking
Scenario #1: No floating point context in the interrupted task
There is no change.

Lazy stacking Scenario #2: Floating point context in interrupted task but not in ISR

Lazy stacking Scenario #3: Floating point context in interrupted task and in ISR

Lazy stacking Scenario #4: Nested interrupt with floating point context in the second handler

Lazy stacking Scenario #5: Nested interrupt with floating point context in both handlers

Using FPU
Pre-processing directives

__FPU_PRESENT  Indicates whether the Cortex-M processor in the microcontroller has an FPU. If it does, this is set to 1 by the device-specific header.
__FPU_USED     Indicates whether the FPU is used. Must be set to 0 if __FPU_PRESENT is 0; can be either 0 or 1 if __FPU_PRESENT is 1. This is set by the compilation tools (e.g., project setting).
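A typical use of these two macros is to guard FPU-specific code at compile time. The sketch below is only an illustration: the fallback definitions exist so that it builds stand-alone on a host (a real project gets the macros from the device header and the tool chain), and fpu_code_enabled is a hypothetical function.

```c
/* Fallbacks for building this sketch outside a CMSIS project. In a real
   project __FPU_PRESENT comes from the device header and __FPU_USED from
   the compilation tools. */
#ifndef __FPU_PRESENT
#define __FPU_PRESENT 0
#endif
#ifndef __FPU_USED
#define __FPU_USED 0
#endif

/* __FPU_USED must be 0 when there is no FPU. */
#if (__FPU_PRESENT == 0) && (__FPU_USED == 1)
#error "__FPU_USED must be 0 when __FPU_PRESENT is 0"
#endif

/* Returns 1 when FPU-specific code paths are compiled in, 0 otherwise. */
int fpu_code_enabled(void)
{
#if (__FPU_PRESENT == 1) && (__FPU_USED == 1)
    return 1;
#else
    return 0;
#endif
}
```

The same condition is commonly used in start-up code to decide whether the FPU should be enabled through CPACR.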

Using FPU
Floating point programming in C

Version using double-precision functions and constants:
  X=T*atan(T2*sin(X)*cos(X)/(cos(X+Y)+cos(X-Y)-1.0));
  Y=T*atan(T2*sin(Y)*cos(Y)/(cos(X+Y)+cos(X-Y)-1.0));

Version using single-precision functions and constants:
  X=T*atanf(T2*sinf(X)*cosf(X)/(cosf(X+Y)+cosf(X-Y)-1.0F));
  Y=T*atanf(T2*sinf(Y)*cosf(Y)/(cosf(X+Y)+cosf(X-Y)-1.0F));

Depending on the tool chain, it may be possible to:
1. Report whether any double-precision calculations have been used
2. Force all calculations to single precision only
3. Generate disassembled code or linker report files to check whether the compiled image contains any double-precision run-time functions
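The difference between the two versions comes from C's usual arithmetic conversions: a constant such as 1.0 has type double, so mixing it with a float promotes the whole expression to double. A minimal way to see this, sketched here with hypothetical function names, is to compare the size of the expression types. Because sizeof does not evaluate its operand, no math library call is actually made.

```c
#include <math.h>
#include <stddef.h>

/* Size of the type of an expression that mixes a float result with the
   double constant 1.0: the expression is promoted to double. */
size_t width_with_double_constant(float x)
{
    return sizeof(cosf(x) - 1.0);
}

/* Size of the type of the same expression with the float constant 1.0F:
   the expression stays in single precision. */
size_t width_with_float_constant(float x)
{
    return sizeof(cosf(x) - 1.0F);
}
```

On a single-precision-only FPU, the first form pulls in software double-precision routines even though every variable is a float, which is why the F suffix and the float variants of the math functions matter.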

Using FPU
Special FPU modes
Using the special FPU modes requires programming the FPSCR and the FPDSCR.

Flush-to-zero mode
Allows some floating point calculations to be faster by avoiding the need to calculate results in the denormalized value range.

Default NaN mode
If any of the inputs of a calculation is a NaN, or if the operation results in an invalid result, the calculation returns the default NaN.

Floating point exceptions
FPSCR provides six sticky bits

Exception          FPSCR bit  Examples
Invalid operation  IOC        Square root of a negative number (returns quiet NaN by default)
Division by zero   DZC        Divide by zero or log(0) (returns ±∞ by default)
Overflow           OFC        A result that is too large to be represented correctly (returns ±∞ by default)
Underflow          UFC        A result that is very small (returns a denormalized value by default)
Inexact            IXC        The result has been rounded (returns the rounded result by default)
Input denormal     IDC        A denormalized input value is replaced with a zero in the calculation due to flush-to-zero mode

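Using FPU
Alternate half-precision mode
Only affects half-precision (__fp16) data. When the AHP bit in FPSCR is set, the alternative half-precision format is used instead of the IEEE 754 format; the alternative format does not support infinity or NaN, which extends the range of representable values.

Rounding modes
The rounding mode can be changed at run-time. In C99, fenv.h defines the four available modes: FE_TONEAREST, FE_UPWARD, FE_DOWNWARD and FE_TOWARDZERO.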

Floating point exceptions
Some cases
C99 has defined a number of functions for checking the floating point exception status:

  #include <fenv.h>
  // check floating point exception flags
  int fegetexceptflag(fexcept_t *flagp, int excepts);
  // clear floating point exception flags
  int feclearexcept(int excepts);

The configuration of the floating point run-time library can be examined and changed using:

  int fegetenv(fenv_t *envp);
  int fesetenv(const fenv_t *envp);