Background (Floating-Point Representation 101)
Floating-point represents real numbers as ±significand × 2^exponent:
- Sign bit
- Significand ("mantissa" or "fraction")
- Exponent
Floating-point numbers have finite binary precision:
- Single precision: 24 binary digits (~7 decimal digits)
- Double precision: 53 binary digits (~16 decimal digits)
Examples:
- π = 3.141592… = 11.0010010…(binary)
- 1/10 = 0.1 = 0.0001100110…(binary, repeating)
(Image from Wikipedia, "Single precision")
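To make the 1/10 example concrete, here is a minimal C sketch (not from the original slides) that prints 0.1 at full precision in both formats; the trailing digits show where the finite binary significand forces rounding.

```c
/* Minimal illustration: 1/10 has no exact binary representation, so float
 * and double each round it at a different point in the significand. */
#include <stdio.h>

int main(void) {
    float  f = 0.1f;   /* 24-bit significand: ~7 decimal digits  */
    double d = 0.1;    /* 53-bit significand: ~16 decimal digits */

    printf("float : %.20f\n", f);   /* e.g. 0.10000000149011611938... */
    printf("double: %.20f\n", d);   /* e.g. 0.10000000000000000555... */
    return 0;
}
```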
Motivation
Finite precision causes round-off error:
- Compromises ill-conditioned calculations
- Hard to detect and diagnose
- Increasingly important as HPC scales
Need to balance speed and accuracy:
- Lower precision is faster
- Higher precision is more accurate
- Industry-standard double precision may still fail on long-running computations
Previous Solutions
Analytical:
- Requires numerical analysis expertise
- Conservative static error bounds are largely unhelpful
Ad hoc:
- Run experiments at different precisions
- Increase precision where necessary
- Tedious and time-consuming
Instrumentation Solution
Automated (vs. manual):
- Minimizes developer effort
- Ensures consistency and correctness
Binary-level (vs. source-level):
- Includes shared libraries without source code
- Captures the effects of compiler optimizations
Runtime (vs. compile time):
- Captures dataset and communication sensitivity
Solution Components
Dyninst-based instrumentation utility ("mutator"):
- Cross-platform; no special hardware required
- Stack walking and binary rewriting
Shared library with runtime analysis routines:
- Flexibility and ease of development
Java-based log viewer GUI:
- Cross-platform; minimal development effort
Analysis Process
Run the mutator:
- Finds floating-point instructions
- Inserts calls to the shared library
Run the instrumented program:
- Executes the analysis alongside the original program
- Stores results in a log file
View the output with the GUI
Analysis Types
- Cancellation detection
- Shadow-value analysis
Cancellation
Loss of significant digits during subtraction operations. Cancellation is a symptom, not the root problem; it indicates that a loss of information has occurred that may cause problems later.

Example (7 significant decimal digits):

      1.613647 (7 digits)        1.613647 (7 digits)
    - 1.613635 (7 digits)      - 1.613647 (7 digits)
      0.000012 (2 digits)        0.000000 (0 digits)
    (5 digits cancelled)       (all digits cancelled)

In the right-hand case the operands actually differ: 1.6136473 - 1.6136467 = 0.0000006, but the difference is lost after rounding to 7 digits.
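The effect can be reproduced with the slide's own operand values; the following small C sketch (my own illustration, not from the slides) subtracts them in single and double precision.

```c
/* Subtracting nearly equal values: the float result keeps only the leading
 * digit or two of the ~7 significant digits each operand carried; the rest
 * is rounding noise from the operands, not information about the answer. */
#include <stdio.h>

int main(void) {
    float a = 1.6136473f;
    float b = 1.6136467f;

    printf("float  a - b: %.10g\n", (double)(a - b));       /* ~5.96e-07, noisy   */
    printf("double a - b: %.10g\n", 1.6136473 - 1.6136467); /* ~6e-07, the truth  */
    return 0;
}
```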
Detecting Cancellation
For each addition/subtraction:
- Extract the value of each operand
- Calculate the result and compare magnitudes (binary exponents)
- If e_ans < max(e_x, e_y), there is a cancellation
For each cancellation event:
- Calculate its "priority": max(e_x, e_y) - e_ans
- If above the threshold, save the event information to the log
- For some events, record the operand values
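As a rough illustration of the check described above (my own sketch, not the tool's actual instrumentation code; the function name cancellation_priority is illustrative), frexp() exposes each binary exponent, and the priority is the number of cancelled binary digits:

```c
/* Sketch of the cancellation check: compare the binary exponent of the
 * result against the larger operand exponent; the drop is the priority. */
#include <math.h>
#include <stdio.h>

/* Returns the priority (number of cancelled binary digits) of x + y,
 * or 0 if no cancellation occurred. Subtraction is x + (-y). */
int cancellation_priority(double x, double y)
{
    double ans = x + y;
    int ex, ey, eans;

    frexp(x, &ex);
    frexp(y, &ey);
    frexp(ans, &eans);

    int emax = (ex > ey) ? ex : ey;
    return (eans < emax) ? (emax - eans) : 0;
}

int main(void) {
    /* The slide's example pair of nearly equal values. */
    printf("priority = %d bits\n",
           cancellation_priority(1.6136473, -1.6136467)); /* prints 21 */
    return 0;
}
```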
Experiments
Gaussian elimination:
- Benefits of partial pivoting
- Differing runtime behavior of popular algorithms
Gaussian Elimination
Factors A into [L,U].
Partial pivoting (row swap):
- Nominally to avoid division by zero
- Also avoids inaccurate results from small pivots
- This can be detected using cancellation
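For reference, here is a compact sketch of the textbook algorithm (my own illustration, not the code analyzed in the slides; the function name eliminate is illustrative). The row swap moves the largest remaining entry of the pivot column into the pivot position.

```c
/* Gaussian elimination with partial pivoting on a small dense matrix. */
#include <math.h>
#include <stdio.h>

#define N 3

static void eliminate(double a[N][N])
{
    for (int k = 0; k < N - 1; k++) {
        /* Partial pivoting: largest |a[i][k]| at or below row k becomes the pivot. */
        int p = k;
        for (int i = k + 1; i < N; i++)
            if (fabs(a[i][k]) > fabs(a[p][k]))
                p = i;

        /* Swap rows k and p. */
        if (p != k)
            for (int j = 0; j < N; j++) {
                double t = a[k][j]; a[k][j] = a[p][j]; a[p][j] = t;
            }

        /* Eliminate the entries below the pivot. */
        for (int i = k + 1; i < N; i++) {
            double m = a[i][k] / a[k][k];
            for (int j = k; j < N; j++)
                a[i][j] -= m * a[k][j];
        }
    }
}

int main(void) {
    double a[N][N] = { { 1e-10, 1.0, 2.0 },
                       { 1.0,   3.0, 4.0 },
                       { 2.0,   1.0, 3.0 } };
    eliminate(a);
    for (int i = 0; i < N; i++)
        printf("% .6e % .6e % .6e\n", a[i][0], a[i][1], a[i][2]);
    return 0;
}
```

Without the swap, the 1e-10 entry in this example would be used as a pivot, producing multipliers on the order of 1e10 that amplify round-off in the remaining rows; that is the situation the cancellation detector is meant to expose.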
(Figure: elimination step annotated with "pivot", "cancellation", and "loss of data".)
Gaussian Cancellation
(Figure: cancellation counts.)
Gaussian Elimination
- This suggests that cancellation can be used to detect the effects of a small pivot
- Useful in sparse elimination with limited ability to pivot
- The threshold must be kept high enough
Gaussian Elimination
Factoring A into [L,U]: classical vs. bordered algorithms.
Size of diagonal elements / Iterations of algorithm

                            Classical                  Bordered
  threshold /                1    2    3    4    5      1    2    3    4    5
  smallest diag. value
  10^-5                     14    8    1    0    0      8    7    6    5    4
  10^-10                    29   23   16   11    3      8    8    7    7    6
  10^-15                    39   33   27   21   17      9    9    9    8    8
Gaussian Elimination
- Classical method: many small cancellations
- Bordered method: fewer but larger cancellations
- Our tool can detect these differences and inform the developer, who can then decide which algorithm to use
Other Results
- Approximate nearest neighbor: more cancellations in denser point sets
- SPEC benchmarks milc and lbm: cancellations in error calculations indicate good results
- SPEC benchmark povray: cancellations indicate the color black
Conclusions
- It is important to vary the threshold:
  - Most calculations have background cancellations
  - Small cancellations can hide large ones
- Cancellation results require interpretation by someone who is familiar with the algorithm
- Properly employed, cancellation detection can help find "trouble spots" in numerical codes
Ongoing Research
Shadow value analysis: replace floating-point numbers with pointers to auxiliary information (higher precision, etc.)

Example from the slide (shadow values shown in comments):

    double x = 1.0;        // shadow value: 1.000
    void func() {
        double y = 4.0;    // shadow value: 4.000
        x = x + y;         // shadow value: 5.000
    }
    printf("%f", x);
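As a rough sketch of the idea (my own illustration; the tool itself replaces values with pointers to auxiliary data rather than an inline struct, and the names shadow_double and sv_add are hypothetical), each value can carry a higher-precision shadow that is updated alongside it:

```c
/* Shadow-value sketch: keep a higher-precision copy next to every value and
 * update both on each operation, so the two can be compared afterwards. */
#include <stdio.h>

typedef struct {
    float       value;   /* the precision the program actually runs in   */
    long double shadow;  /* higher-precision shadow of the same quantity */
} shadow_double;

static shadow_double sv_add(shadow_double a, shadow_double b) {
    shadow_double r = { a.value + b.value, a.shadow + b.shadow };
    return r;
}

int main(void) {
    shadow_double x = { 1.0f, 1.0L };
    shadow_double y = { 4.0f, 4.0L };

    x = sv_add(x, y);
    /* The value/shadow difference estimates the accumulated rounding error
     * (zero here because these small example values are exact in binary). */
    printf("value = %f, shadow = %Lf, error = %Lg\n",
           x.value, x.shadow, x.shadow - (long double)x.value);
    return 0;
}
```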
Shadow Value Analysis
- Current status: allows programmers to automatically test their entire program in different precisions
- Next step: selectively instrument particular code blocks or data structures
- Goal: an automated floating-point analysis and recommendation framework
Thank you!
Code available upon request.
Questions?
Size of diagonal elements / Iterations of algorithm (C = Classical, B = Bordered)

                             1          2          3          4          5
  threshold /
  smallest diag. value     C    B     C    B     C    B     C    B     C    B
  10^-5                   14    8     8    7     1    6     0    5     0    4
  10^-10                  29    8    23    8    16    7    11    7     3    6
  10^-15                  39    9    33    9    27    9    21    8    17    8
Gaussian Cancellation
(Figure.)