Background (Floating-Point Representation 101)
Floating-point represents real numbers as ±significand × 2^exponent:
- Sign bit
- Significand ("mantissa" or "fraction")
- Exponent
Floating-point numbers have finite binary precision:
- Single precision: 24 binary digits (~7 decimal digits)
- Double precision: 53 binary digits (~16 decimal digits)
Examples:
- π = 3.141592… = 11.0010010…(binary)
- 1/10 = 0.1 = 0.0001100110…(binary, repeating)
(Image from Wikipedia, "Single precision")
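To make the 1/10 example concrete, here is a minimal C sketch (not from the original slides) that prints 0.1 at full precision in both formats; the trailing digits show where the finite binary significand forces rounding.

```c
/* Minimal illustration: 1/10 has no exact binary representation, so float
 * and double each round it at a different point in the significand. */
#include <stdio.h>

int main(void) {
    float  f = 0.1f;   /* 24-bit significand: ~7 decimal digits  */
    double d = 0.1;    /* 53-bit significand: ~16 decimal digits */

    printf("float : %.20f\n", f);   /* e.g. 0.10000000149011611938... */
    printf("double: %.20f\n", d);   /* e.g. 0.10000000000000000555... */
    return 0;
}
```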
Motivation
Finite precision causes round-off error:
- Compromises ill-conditioned calculations
- Hard to detect and diagnose
- Increasingly important as HPC scales
Need to balance speed and accuracy:
- Lower precision is faster
- Higher precision is more accurate
- Industry-standard double precision may still fail on long-running computations
Previous Solutions
Analytical:
- Requires numerical analysis expertise
- Conservative static error bounds are largely unhelpful
Ad hoc:
- Run experiments at different precisions
- Increase precision where necessary
- Tedious and time-consuming
Instrumentation Solution
Automated (vs. manual):
- Minimizes developer effort
- Ensures consistency and correctness
Binary-level (vs. source-level):
- Includes shared libraries without source code
- Captures the effects of compiler optimizations
Runtime (vs. compile time):
- Captures dataset and communication sensitivity
Solution Components
Dyninst-based instrumentation utility ("mutator"):
- Cross-platform; no special hardware required
- Stack walking and binary rewriting
Shared library with runtime analysis routines:
- Flexibility and ease of development
Java-based log viewer GUI:
- Cross-platform; minimal development effort
Analysis Process
Run the mutator:
- Finds floating-point instructions
- Inserts calls to the shared library
Run the instrumented program:
- Executes the analysis alongside the original program
- Stores results in a log file
View the output with the GUI
Analysis Types
- Cancellation detection
- Shadow-value analysis
Cancellation
Loss of significant digits during subtraction operations. Cancellation is a symptom, not the root problem; it indicates that a loss of information has occurred that may cause problems later.

Example (7 significant decimal digits):

      1.613647 (7 digits)        1.613647 (7 digits)
    - 1.613635 (7 digits)      - 1.613647 (7 digits)
      0.000012 (2 digits)        0.000000 (0 digits)
    (5 digits cancelled)       (all digits cancelled)

In the right-hand case the operands actually differ: 1.6136473 - 1.6136467 = 0.0000006, but the difference is lost after rounding to 7 digits.
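The effect can be reproduced with the slide's own operand values; the following small C sketch (my own illustration, not from the slides) subtracts them in single and double precision.

```c
/* Subtracting nearly equal values: the float result keeps only the leading
 * digit or two of the ~7 significant digits each operand carried; the rest
 * is rounding noise from the operands, not information about the answer. */
#include <stdio.h>

int main(void) {
    float a = 1.6136473f;
    float b = 1.6136467f;

    printf("float  a - b: %.10g\n", (double)(a - b));       /* ~5.96e-07, noisy   */
    printf("double a - b: %.10g\n", 1.6136473 - 1.6136467); /* ~6e-07, the truth  */
    return 0;
}
```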
Detecting Cancellation
For each addition/subtraction:
- Extract the value of each operand
- Calculate the result and compare magnitudes (binary exponents)
- If e_ans < max(e_x, e_y), there is a cancellation
For each cancellation event:
- Calculate its "priority": max(e_x, e_y) - e_ans
- If above the threshold, save the event information to the log
- For some events, record the operand values
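As a rough illustration of the check described above (my own sketch, not the tool's actual instrumentation code; the function name cancellation_priority is illustrative), frexp() exposes each binary exponent, and the priority is the number of cancelled binary digits:

```c
/* Sketch of the cancellation check: compare the binary exponent of the
 * result against the larger operand exponent; the drop is the priority. */
#include <math.h>
#include <stdio.h>

/* Returns the priority (number of cancelled binary digits) of x + y,
 * or 0 if no cancellation occurred. Subtraction is x + (-y). */
int cancellation_priority(double x, double y)
{
    double ans = x + y;
    int ex, ey, eans;

    frexp(x, &ex);
    frexp(y, &ey);
    frexp(ans, &eans);

    int emax = (ex > ey) ? ex : ey;
    return (eans < emax) ? (emax - eans) : 0;
}

int main(void) {
    /* The slide's example pair of nearly equal values. */
    printf("priority = %d bits\n",
           cancellation_priority(1.6136473, -1.6136467)); /* prints 21 */
    return 0;
}
```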
Experiments
Gaussian elimination:
- Benefits of partial pivoting
- Differing runtime behavior of popular algorithms
Gaussian Elimination
Factors A into [L,U].
Partial pivoting (row swap):
- Nominally to avoid division by zero
- Also avoids inaccurate results from small pivots
- This can be detected using cancellation
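For reference, here is a compact sketch of the textbook algorithm (my own illustration, not the code analyzed in the slides; the function name eliminate is illustrative). The row swap moves the largest remaining entry of the pivot column into the pivot position.

```c
/* Gaussian elimination with partial pivoting on a small dense matrix. */
#include <math.h>
#include <stdio.h>

#define N 3

static void eliminate(double a[N][N])
{
    for (int k = 0; k < N - 1; k++) {
        /* Partial pivoting: largest |a[i][k]| at or below row k becomes the pivot. */
        int p = k;
        for (int i = k + 1; i < N; i++)
            if (fabs(a[i][k]) > fabs(a[p][k]))
                p = i;

        /* Swap rows k and p. */
        if (p != k)
            for (int j = 0; j < N; j++) {
                double t = a[k][j]; a[k][j] = a[p][j]; a[p][j] = t;
            }

        /* Eliminate the entries below the pivot. */
        for (int i = k + 1; i < N; i++) {
            double m = a[i][k] / a[k][k];
            for (int j = k; j < N; j++)
                a[i][j] -= m * a[k][j];
        }
    }
}

int main(void) {
    double a[N][N] = { { 1e-10, 1.0, 2.0 },
                       { 1.0,   3.0, 4.0 },
                       { 2.0,   1.0, 3.0 } };
    eliminate(a);
    for (int i = 0; i < N; i++)
        printf("% .6e % .6e % .6e\n", a[i][0], a[i][1], a[i][2]);
    return 0;
}
```

Without the swap, the 1e-10 entry in this example would be used as a pivot, producing multipliers on the order of 1e10 that amplify round-off in the remaining rows; that is the situation the cancellation detector is meant to expose.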
(Figure: elimination step annotated with "pivot", "cancellation", and "loss of data".)
Gaussian Cancellation
(Figure: cancellation counts.)
Gaussian Elimination
- This suggests that cancellation can be used to detect the effects of a small pivot
- Useful in sparse elimination with limited ability to pivot
- The threshold must be kept high enough
Gaussian Elimination
Factoring A into [L,U]: classical vs. bordered algorithms.
Size of diagonal elements / Iterations of algorithm

                            Classical                  Bordered
  threshold /                1    2    3    4    5      1    2    3    4    5
  smallest diag. value
  10^-5                     14    8    1    0    0      8    7    6    5    4
  10^-10                    29   23   16   11    3      8    8    7    7    6
  10^-15                    39   33   27   21   17      9    9    9    8    8
Gaussian Elimination
- Classical method: many small cancellations
- Bordered method: fewer but larger cancellations
- Our tool can detect these differences and inform the developer, who can then decide which algorithm to use
Other Results
- Approximate nearest neighbor: more cancellations in denser point sets
- SPEC benchmarks milc and lbm: cancellations in error calculations indicate good results
- SPEC benchmark povray: cancellations indicate the color black
Conclusions
- It is important to vary the threshold:
  - Most calculations have background cancellations
  - Small cancellations can hide large ones
- Cancellation results require interpretation by someone who is familiar with the algorithm
- Properly employed, cancellation detection can help find "trouble spots" in numerical codes
Ongoing Research
Shadow value analysis: replace floating-point numbers with pointers to auxiliary information (higher precision, etc.)

Example from the slide (shadow values shown in comments):

    double x = 1.0;        // shadow value: 1.000
    void func() {
        double y = 4.0;    // shadow value: 4.000
        x = x + y;         // shadow value: 5.000
    }
    printf("%f", x);
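As a rough sketch of the idea (my own illustration; the tool itself replaces values with pointers to auxiliary data rather than an inline struct, and the names shadow_double and sv_add are hypothetical), each value can carry a higher-precision shadow that is updated alongside it:

```c
/* Shadow-value sketch: keep a higher-precision copy next to every value and
 * update both on each operation, so the two can be compared afterwards. */
#include <stdio.h>

typedef struct {
    float       value;   /* the precision the program actually runs in   */
    long double shadow;  /* higher-precision shadow of the same quantity */
} shadow_double;

static shadow_double sv_add(shadow_double a, shadow_double b) {
    shadow_double r = { a.value + b.value, a.shadow + b.shadow };
    return r;
}

int main(void) {
    shadow_double x = { 1.0f, 1.0L };
    shadow_double y = { 4.0f, 4.0L };

    x = sv_add(x, y);
    /* The value/shadow difference estimates the accumulated rounding error
     * (zero here because these small example values are exact in binary). */
    printf("value = %f, shadow = %Lf, error = %Lg\n",
           x.value, x.shadow, x.shadow - (long double)x.value);
    return 0;
}
```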
Shadow Value Analysis
- Current status: allows programmers to automatically test their entire program in different precisions
- Next step: selectively instrument particular code blocks or data structures
- Goal: an automated floating-point analysis and recommendation framework
Thank you!
Code available upon request.
Questions?
Size of diagonal elements / Iterations of algorithm (C = Classical, B = Bordered)

                             1          2          3          4          5
  threshold /
  smallest diag. value     C    B     C    B     C    B     C    B     C    B
  10^-5                   14    8     8    7     1    6     0    5     0    4
  10^-10                  29    8    23    8    16    7    11    7     3    6
  10^-15                  39    9    33    9    27    9    21    8    17    8
Gaussian Cancellation
(Figure.)