Avoiding Measurement Errors from Manipulating Data in Software
Are you sure your software isn't adding uncertainty to your measurements?
Speaker/Author: Logan Kunitz, Staff Calibration Engineer, National Instruments R&D

My motivation for this presentation was my perception that, in the calibration and metrology industry, software is increasingly being used to automate or execute measurements. As a software engineer who also develops automated calibration software, I wanted to analyze the errors that can be introduced within code and come up with best practices to ensure that these errors do not add uncertainty to the measurements.
Sources of Measurement Error
[Block diagram: Signal Source -> External Conditioning and Cabling -> Acquire Data (ADC) -> Process and Display Reading; associated error sources: Offset/Bias and Noise Errors, Digitization and Sampling Errors, Software Errors]

First, let's review the different sources of error you can encounter when making a measurement. These are the errors that will be included in an uncertainty analysis of your results. The signal source can introduce uncertainty in the form of offset/bias errors and noise errors. The signal may then pick up additional noise and offsets from external conditioning and cabling. At the measurement device, additional errors can be introduced when sampling the data from the ADC, including digitization and sampling errors. Finally, the data must be processed into the final results that will be displayed to the end user. Generally, when determining the total uncertainty of a measurement, the source and measurement-device errors are considered, and software errors are assumed to be negligible. Software errors are the ones we'll be discussing, and we'll quantify them to ensure that they are, indeed, negligible.
Sources of Software Errors
- Coding mistakes / logic errors
  - Often a logic error in software will result in very large errors, but sometimes these errors can be small enough to go undetected
  - Example: unit conversion issues (imperial vs. metric)
- Data processing errors
  - e.g., linear approximation, data extrapolation, averaging, etc.
- Data manipulation errors
  - Artifacts of how data is handled by the compiler and processor
  - Quantify these to ensure a negligible impact on the measurement
  - Examples: rounding errors, numerical errors

These are the sources of software errors that can impact a measurement result.
Defining "Negligible" Error Impact
Software error expectation: "negligible impact." We should not need to account for these errors in an uncertainty analysis.
OK... so, how do I define negligible? Common informal answers:
- "Really small"
- "Just go a couple of digits past the least significant digit"
- "10 times smaller is enough"
- "As long as it doesn't affect the application"
I was unable to find a quantifiable and objective definition, so I came up with my own.
Logan's Definition of "Negligible" Error Impact
Definition: a negligible error U(x) is one that does not affect the reported measurement uncertainty U(y) with >95% confidence.
Example: if the reported uncertainty U(y) = 2.3, then an error U(x) < 0.005 has a 95% chance of not affecting U(y).

This is not intended to be a "gold standard" definition of negligible; it is meant more as a reference for some of the subsequent calculations. My definition of negligible is a number that won't affect the reported uncertainty with 95% confidence. Where did 95% come from? It is the typical confidence level assigned to most expanded uncertainties. If a typical uncertainty is reported as 2.3, with two digits of precision displayed, an error of less than 0.005 represents a 95% confidence of not affecting this uncertainty.
Rounding Errors
Rounding errors occur when the number of displayed digits is truncated (and rounded):
  1.2468642 -> 1.2469 -> 1.2 -> 1
Error from rounding: +/- 1/2 of the least significant digit, or +/- 5 x 10^(-digits).
Example with 2 digits displayed: 1.246 -> 1.2, error = +/- 0.05.
[Figure: number line from 1.0 to 1.3 showing an actual value near the reading of 1.2, illustrating the rounding band]
Negligible: equivalent to rounding to 1 digit past the least significant digit of the uncertainty. A short C sketch of this calculation follows.
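The following is a minimal sketch (assumptions: C99 and a double-precision reading; the variable names are illustrative only). It rounds a reading to a chosen number of decimal places and computes the worst-case rounding error of +/- 1/2 of the least significant displayed digit.

    /* Rounding a reading for display and bounding the rounding error */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double reading  = 1.2468642;
        int    decimals = 1;                    /* digits displayed after the decimal point */

        double step      = pow(10.0, -decimals);        /* size of the least significant displayed digit */
        double displayed = round(reading / step) * step;
        double max_error = 0.5 * step;                  /* +/- 1/2 LSD */

        printf("Displayed: %.*f, worst-case rounding error: +/- %g\n",
               decimals, displayed, max_error);
        return 0;
    }

With one decimal displayed, this prints a reading of 1.2 and a worst-case rounding error of +/- 0.05, matching the example above.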
Rounding Error Use Cases
Numeric -> string conversions: when numeric data gets converted to a string, the least significant digits are rounded or truncated based on the application requirements.
Examples:
- Displaying measurement results in a user interface (e.g., the number of digits shown on a DMM)
- Logging results to a report
- Logging data to a file (such as .csv, XML, Excel, etc.) to be read back by the same or a different application
- Converting a number to a string for data transfer (e.g., GPIB/serial communication with an instrument, or transferring to another application over TCP/IP)
A sketch of a lossless numeric-to-string round trip follows.
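The following is a minimal sketch (assumption: IEEE 754 double; the example value is arbitrary). It shows that printing a reading with too few digits loses information when the string is read back, while 17 significant digits are enough to round-trip a double exactly.

    /* Numeric -> string -> numeric round trip with and without precision loss */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        double reading = 1.2468642897531234;
        char   short_str[64], full_str[64];

        snprintf(short_str, sizeof short_str, "%.6g",  reading);   /* 6 digits: lossy       */
        snprintf(full_str,  sizeof full_str,  "%.17g", reading);   /* 17 digits: exact trip */

        double from_short = strtod(short_str, NULL);
        double from_full  = strtod(full_str,  NULL);

        printf("original        : %.17g\n", reading);
        printf("via short string: %.17g (error %g)\n", from_short, reading - from_short);
        printf("via full string : %.17g (error %g)\n", from_full,  reading - from_full);
        return 0;
    }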
Selecting a Format for Numeric Representation
- Standard floating point representation (example: 1234.56789)
  Advantage: simple.
  Disadvantage: wasted 0's displayed for large and small numbers (e.g., 120000000 or 0.000000456).
- SI notation (example: 1.23456789k)
  Advantage: reduces the number of 0's reported for large and small numbers.
  Disadvantage: limited set of prefixes (e.g., no prefix for 10^-5).
- Engineering/scientific notation (example: 1.23456789E3 or 1.23456789x10^3)
  Advantage: more flexible exponent than SI notation.
  Advantage: common format for instrument communication and data storage.
- Binary: representing the binary equivalent of the number in a string (byte array); the bytes are not human-readable text.
  Advantage: no precision loss.
  Advantage: smaller data footprint, lower storage space required.
  Disadvantage: implementation is application-specific.
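The following is a minimal sketch (assumptions: IEEE 754 double and matching byte order on both ends of a transfer; the formatting choices are illustrative). It contrasts a scientific-notation string with a raw binary copy of the same value, which preserves full precision in 8 bytes.

    /* Scientific-notation string vs. raw binary representation of a double */
    #include <stdio.h>
    #include <string.h>
    #include <stdint.h>

    int main(void)
    {
        double value = 1234.56789;

        /* Scientific notation: human-readable, precision limited by digit count */
        char text[32];
        snprintf(text, sizeof text, "%.8E", value);

        /* Binary: copy the 8 bytes of the double into a byte array (no precision loss) */
        uint8_t bytes[sizeof value];
        memcpy(bytes, &value, sizeof value);

        printf("Scientific notation: %s\n", text);
        printf("Raw bytes          :");
        for (size_t i = 0; i < sizeof bytes; i++)
            printf(" %02X", (unsigned)bytes[i]);
        printf("\n");
        return 0;
    }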
Numerical Errors
Numerical errors result from the choice of datatype:
- Floating point numbers: single precision, double precision, extended precision
- Integers (i16, i32, i64, u8, etc.)
Numeric datatypes are approximations, a limitation of the computer's representation of numeric values.
[Figure: high-resolution vs. low-resolution representation of the same range of values]
Floating Point Numeric Details
- IEEE 754 standardizes the floating point implementation.
- Single (32-bit) and double (64-bit) are the most common and consistent implementations.
- Extended precision (80-bit) is also available.
Other numeric datatypes for high precision in software:
- 128-bit extended precision
- Arbitrary-precision arithmetic

The 80-bit extended type is not a significant precision improvement over double; its design is optimized to handle overflow conditions in intermediate calculations. 128-bit floats and arbitrary-precision arithmetic are not standard in most environments, but can be used to extend the precision of numerics when necessary.
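The following is a minimal sketch (assumption: a C99 compiler). The width and precision of long double vary by platform: it may be 80-bit extended precision, 128-bit, or simply an alias for double, so the program reports what the compiler actually provides.

    /* Report the storage size and mantissa width of the floating point types */
    #include <stdio.h>
    #include <float.h>

    int main(void)
    {
        printf("float      : %zu bytes, %d mantissa bits\n", sizeof(float),       FLT_MANT_DIG);
        printf("double     : %zu bytes, %d mantissa bits\n", sizeof(double),      DBL_MANT_DIG);
        printf("long double: %zu bytes, %d mantissa bits\n", sizeof(long double), LDBL_MANT_DIG);
        return 0;
    }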
Numerical Error Example
Error representing pi using different numeric datatypes (20 digits shown for each stored value):

  Datatype                                   Value of pi                               Approx. digits   Error
  Actual                                     3.1415926535897932384626433832795028...   -                -
  Single precision float (32-bit)            3.1415927410125732400                     7                2.78E-6%
  Double precision float (64-bit)            3.1415926535897926700                     15               1.8E-14%
  Extended double precision float (80-bit)   3.1415926535897932380                     19               1.4E-17%
  Integer                                    3                                         -                4.5%

Note: a floating point number can be displayed with any arbitrary number of digits, but digits beyond the datatype's precision are not accurate. A short C sketch of this comparison follows.
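The following is a minimal sketch (assumption: IEEE 754 float/double; the reference constant only carries double precision, so the double row is shown without an error figure). It stores pi in different datatypes and prints many digits of each, showing where the stored value departs from the reference digits.

    /* Representation error of pi in different numeric datatypes */
    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        const double pi_ref = 3.14159265358979323846;   /* reference digits (double precision) */
        float        pi_f   = (float)pi_ref;
        double       pi_d   = pi_ref;
        int          pi_i   = 3;

        printf("single : %.19f  error %.2E %%\n", pi_f, 100.0 * fabs(pi_f - pi_ref) / pi_ref);
        printf("double : %.19f\n", pi_d);   /* quantifying its error needs a higher-precision reference */
        printf("integer: %d  error %.1f %%\n", pi_i, 100.0 * fabs(pi_i - pi_ref) / pi_ref);
        return 0;
    }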
Machine Epsilon: Quantified Error for a Floating Point Number
Machine epsilon is the ratio of the maximum representation error to the stored value. Application development environments (ADEs) often provide a function to calculate it.

  Datatype                   Machine epsilon        Decimal digits of precision
  Single precision           2^-23 = 1.19e-07       ~7.2 digits
  Double precision           2^-52 = 2.22e-16       ~15.9 digits
  Extended double (80-bit)   2^-63 = 1.08e-19       ~19.2 digits
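As a minimal sketch (assumption: a C99 compiler with <float.h>), the standard library exposes these machine epsilon constants directly, so no ADE-specific function is needed.

    /* Print the machine epsilon of each floating point datatype */
    #include <stdio.h>
    #include <float.h>

    int main(void)
    {
        printf("single precision epsilon: %g\n",  FLT_EPSILON);    /* 2^-23 ~ 1.19e-07    */
        printf("double precision epsilon: %g\n",  DBL_EPSILON);    /* 2^-52 ~ 2.22e-16    */
        printf("long double epsilon     : %Lg\n", LDBL_EPSILON);   /* platform-dependent  */
        return 0;
    }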
Using Machine Epsilon
The following error equations describe basic first-order methods for applying machine epsilon based on the math operations performed on the components. Here epsilon_x is the machine epsilon of x's datatype; epsilon_x and epsilon_y differ only if x and y are different datatypes (they are the same as long as both values have the same floating point datatype).
- Error of a single value:    Error(x) = epsilon_x * |x|
- Addition operations:        Error(x + y) = epsilon_x * |x| + epsilon_y * |y|
- Multiplication operations:  Error(x * y) = (epsilon_x + epsilon_y) * |x * y|
A short sketch applying these bounds follows.
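The following is a minimal sketch (assumptions: both operands are doubles, so the same DBL_EPSILON applies to each, and the values are arbitrary examples). It computes the first-order error bounds above for a stored value, a sum, and a product.

    /* First-order error bounds from machine epsilon */
    #include <stdio.h>
    #include <float.h>
    #include <math.h>

    int main(void)
    {
        double x = 1234.56789;
        double y = 0.000123456;

        double err_x   = DBL_EPSILON * fabs(x);                /* single value                     */
        double err_sum = DBL_EPSILON * (fabs(x) + fabs(y));    /* x + y                            */
        double err_mul = 2.0 * DBL_EPSILON * fabs(x * y);      /* x * y, (eps_x + eps_y) = 2*eps   */

        printf("Error bound of x    : %g\n", err_x);
        printf("Error bound of x + y: %g\n", err_sum);
        printf("Error bound of x * y: %g\n", err_mul);
        return 0;
    }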
Issues Comparing Floating Point Numbers
Using the "equals" operation with floating point numbers can fail to return the expected result.
Example: comparing the result of 0.01 / 0.1 to 0.1, which should mathematically be equal, evaluates to "False" (in both the C and G/LabVIEW versions of the example).
Why? It's complicated, but fundamentally because 0.1 is an approximation when represented as a floating point number. A C sketch of the failing comparison follows.
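The following is a minimal sketch (assumption: IEEE 754 doubles). Mathematically 0.01 / 0.1 equals 0.1, but neither 0.01 nor 0.1 is exactly representable in binary, so the direct "==" comparison fails.

    /* Direct equality comparison of floating point results fails */
    #include <stdio.h>

    int main(void)
    {
        double result = 0.01 / 0.1;

        if (result == 0.1)
            printf("Equal\n");
        else
            printf("Not equal: 0.01/0.1 = %.17g, 0.1 = %.17g\n", result, 0.1);
        return 0;
    }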
Use Machine Epsilon to Compare Floating Point Numbers
Apply the calculated machine epsilon error to the floating point results instead of comparing them directly for equality. The original example shows both a C version and a G (LabVIEW) version; a C sketch follows.
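The following is a minimal sketch (assumptions: double operands and the error bounds from the previous slide; the helper name nearly_equal is illustrative). Instead of "==", the values are treated as equal when they differ by less than an epsilon-based bound scaled to their magnitude.

    /* Epsilon-based comparison of floating point numbers */
    #include <stdio.h>
    #include <float.h>
    #include <math.h>
    #include <stdbool.h>

    /* Equal within the combined machine-epsilon error bound of the two values */
    static bool nearly_equal(double a, double b)
    {
        double tolerance = DBL_EPSILON * (fabs(a) + fabs(b));
        return fabs(a - b) <= tolerance;
    }

    int main(void)
    {
        double result = 0.01 / 0.1;
        printf("direct ==      : %s\n", (result == 0.1)          ? "Equal" : "Not equal");
        printf("epsilon compare: %s\n", nearly_equal(result, 0.1) ? "Equal" : "Not equal");
        return 0;
    }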
Datatype Conversions
Errors are also associated with conversion operations on datatypes in an application, and many conversions are handled "automatically" by the ADE.
Example: the function z = x + y
C example (the slide's Single/Double types correspond to float and double in C):
    float z;
    double x;
    double y;
    z = x + y;   /* the double-precision sum is coerced to single precision on assignment */
In the G (LabVIEW) example, the double precision numbers likewise get coerced to single precision after the addition operation.
Use the machine epsilon of the coerced (lower-precision) datatype to determine the error.

Many conversions are handled automatically by the ADE. This can be a problem if your datatypes are defined in one part of your application, but the conversion happens somewhere else. A runnable sketch follows.
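The following is a minimal sketch (assumption: IEEE 754 float/double; the values are arbitrary examples). Assigning a double-precision sum to a float silently discards precision, and the lost amount stays within the machine epsilon bound of the coerced (float) datatype.

    /* Precision loss from an automatic double -> float coercion */
    #include <stdio.h>
    #include <float.h>
    #include <math.h>

    int main(void)
    {
        double x = 1.2345678901234567;
        double y = 2.3456789012345678;

        float  z_single = x + y;   /* implicit coercion to single precision */
        double z_double = x + y;   /* full double-precision result          */

        double loss  = fabs(z_double - (double)z_single);
        double bound = FLT_EPSILON * fabs(z_double);   /* epsilon of the coerced datatype */

        printf("double result : %.17g\n", z_double);
        printf("single result : %.17g\n", (double)z_single);
        printf("precision lost: %g (bound %g)\n", loss, bound);
        return 0;
    }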
Conclusion
- Quantify whether an error is "negligible" for your application
- Avoid rounding errors when converting from numeric to string
- Numeric datatype selection should match the precision requirements of your data
- Use machine epsilon to quantify the error of floating point numbers
- Never compare two floating point numbers using the "equals" operator

References:
"IEEE Floating Point." Wikipedia. <http://en.wikipedia.org/wiki/IEEE_floating_point>
"Extended Precision." Wikipedia. <http://en.wikipedia.org/wiki/Extended_precision>
"Machine Epsilon." Wikipedia. <http://en.wikipedia.org/wiki/Machine_epsilon>
"Floating-point Errors." UAlberta.ca. <http://www.ualberta.ca/~kbeach/comp_phys/fp_err.html>
"Arbitrary-precision Arithmetic." Wikipedia. <http://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic>
"Digital Multimeter Measurement Fundamentals." NI.com. <http://www.ni.com/white-paper/3295/en/>
Schmidt, Darren. "An Introduction to Floating-Point Behavior in LabVIEW." NI.com. <ftp://ftp.ni.com/pub/devzone/tut/floating_point_in_labview.pdf>
Questions?