Binary-Level Tools for Floating-Point Correctness Analysis. Michael Lam, LLNL Summer Intern 2011. Bronis de Supinski, Mentor.


Background
Floating-point represents real numbers as ±significand × 2^exponent
o Sign bit
o Exponent
o Significand ("mantissa" or "fraction")
Floating-point numbers have finite precision
o Single precision: 24 significand bits (~7 decimal digits)
o Double precision: 53 significand bits (~16 decimal digits)
IEEE single: 1 sign bit, 8 exponent bits, 23 stored significand bits
IEEE double: 1 sign bit, 11 exponent bits, 52 stored significand bits

Example: the bit patterns for π in single and double precision (images courtesy of BinaryConvert.com).

Example: the bit patterns for a non-terminating fraction (1/…) in single and double precision (images courtesy of BinaryConvert.com).

Motivation
Finite precision causes round-off error
o Compromises ill-conditioned calculations
o Hard to detect and diagnose
Increasingly important as HPC scales
o Need to balance speed (single precision) and accuracy (double precision)
o Double precision may still fail on long-running computations

Previous Solutions
Analytical (Wilkinson, et al.)
o Requires numerical analysis expertise
o Conservative static error bounds are largely unhelpful
Ad-hoc
o Run experiments at different precisions
o Increase precision where necessary
o Tedious and time-consuming

Our Approach
Run Dyninst-based mutator
o Find floating-point instructions
o Insert new code or a call to a shared library
Run instrumented program
o Analysis augments/replaces original program
o Store results in a log file
View output with GUI

Advantages
Automated (vs. manual)
o Minimize developer effort
o Ensure consistency and correctness
Binary-level (vs. source-level)
o Include shared libraries without source code
o Include compiler optimizations
Runtime (vs. compile time)
o Dataset and communication sensitivity

Previous Work
Cancellation detection
o Logs numerical cancellation of binary digits
Alternate-precision analysis
o Simulates re-compiling with different precision

Summer Contributions
Cancellation detection
o Improved support for multi-core analysis
Overflow detection
o New tool for logging integer overflow
o Possibilities for extension and incorporation into floating-point analysis
Alternate-precision analysis
o New in-place analysis
o Much-improved performance and robustness

Cancellation
Loss of significant digits during subtraction operations
Cancellation is a symptom, not the root problem
o Indicates that a loss of information has occurred that may cause problems later
Example: subtracting two 7-digit operands that agree in their leading digits may leave only 2 significant digits (5 digits cancelled), or none at all (all digits cancelled)

Cancellation Detector
Instrument every addition and subtraction
o Simple exponent-based test for cancellation
o Log the results to an output file

Contributions
Better support for multi-core analysis
o Log to multiple files
o Future work: exploring GUI aggregation schemes
Ran experiments on AMG

Contributions
New proof-of-concept overflow detection tool
o Instruments all instructions that set OF (the overflow flag)
o Logs the instruction pointer to output
o Works on integer instructions
o Introduces ~10x overhead
Future work
o Pruning false positives
o Overflow/underflow detection on floating-point instructions
o NaN/Inf detection on floating-point instructions

Alternate-precision Analysis
Previous approach
o Replace each floating-point value with a pointer to a shadow value allocated on the heap
Disadvantages
o Major change in program semantics (copying vs. aliasing)
o Lots of pointer-related bugs
o Required function calls and use of a garbage collector
o Large performance impact
o Increased memory usage (>1.5x)

Contributions
New shadow-value analysis scheme
o Narrowed focus: doubles → singles
o In-place downcast conversion (no heap allocations)
o Flag in the high bits to indicate replacement
Bit layout: a replaced double stores the flag 0x7FF4DEAD (a non-signalling NaN pattern) in its high 32 bits and the single-precision replacement in its low 32 bits

Contributions
Simpler analysis
o Instrument instructions with double-precision operands
o Check and replace operands
o Replace double-precision opcodes
o Fix up flags if necessary
Streamlined instrumentation
o Insert binary blobs of optimized machine code
o Pre-generated by a mini-assembler inside the mutator
o Avoids the overhead of added function calls
o No memory overhead

Example
gvec[i,j] = gvec[i,j] * lvec[3] + gvar

1: movsd 0x601e38(%rax,%rbx,8), %xmm0
2: mulsd -0x78(%rsp), %xmm0
3: addsd -0x4f02(%rip), %xmm0
4: movsd %xmm0, 0x601e38(%rax,%rbx,8)

Example
gvec[i,j] = gvec[i,j] * lvec[3] + gvar

1: movsd 0x601e38(%rax,%rbx,8), %xmm0
   check/replace -0x78(%rsp) and %xmm0
2: mulss -0x78(%rsp), %xmm0
   check/replace -0x4f02(%rip) and %xmm0
3: addss -0x20dd43(%rip), %xmm0
4: movsd %xmm0, 0x601e38(%rax,%rbx,8)

Challenges
Currently handled
o %rip- and %rsp-relative addressing
o %rflags preservation
o Math functions from libm
o Bitwise operations (AND/OR/XOR/BTC)
o Size and type conversions
o Compiler optimization levels
o Packed instructions

Challenges
Future work
o 80-bit long double precision
o 16-bit IEEE half-precision
o 128-bit IEEE quad-precision
o Width-dependent random number generation
o Non-gcc compilers
o Arcane floating-point hacks
   Sqrt: (i >> 1) + (1 << 29) - (1 << 22)
   Fast InvSqrt: 0x5f3759df - (i >> 1)

Results
Runs correctly on Sequoia kernels and other examples, real code with manageable overhead:
o AMGmk: 4x
o CrystalMk: 4x
o IRSmk: 7x
o UMTmk: 3x
o LULESH: 4x
Future work: more optimization; run on full benchmarks

Conclusion
Cancellation detection
o Improved support for multi-core analysis
Overflow detection
o New tool for logging integer overflow
o Possibilities for extension and incorporation into floating-point analysis
Alternate-precision analysis
o New in-place analysis
o Much-improved performance and robustness

Future Goals
Selective analysis
o Data-centric (variables or matrices)
o Control-centric (basic blocks or functions)
Analysis search space
o Minimize precision
o Maximize accuracy
Goal: a tool for automated floating-point precision analysis and recommendation

Acknowledgements
Jeff Hollingsworth, University of Maryland (Advisor)
Bronis de Supinski, LLNL (Mentor)
Tony Baylis, LLNL (Supervisor)
Barry Rountree, LLNL
Matt Legendre, LLNL
Greg Lee, LLNL
Dong Ahn, LLNL
Thank you!

Bitfield Templates
o Double: 1 sign bit, 11 exponent bits, 52 significand bits
o Single: 1 sign bit, 8 exponent bits, 23 significand bits
o Replaced double in an XMM register: 0x7FF4DEAD in the high 32 bits, the downcast IEEE single in the low 32 bits