Comparing floating point numbers

There could be roundings involved and/or infinite decimal expansions that can lead to inaccurate representations or better say, losing of some precision off the number. Owing to that, even though you would expect a number to match a definite expansion that the calculation may have lead to, it may not.

An easy way to handle that is to choose a reasonable tolerance in the comparison. For example, we could choose a tolerance of 0.0000001 when comparing two doubles for equality such that the absolute value of their difference if less than this value then, you could consider them to be equal or better say, ‘close enough’ for most reasonable cases. This tolerance can, however, be insufficient if:

  1. The numbers are very large or if the numbers are very small in value for the tolerance to be able to compare to their difference.
  2. If there have been multiple arithematic operations performed that could lead to compounded incorrectness (owing to multiple roundings/loss of precision involved) in the result.

In C++, there is a value std::numeric_limits::epsilon() that is referred to sometimes as machine epsilon value/tolerance or machine accuracy. It gives the difference between 1 and the smallest value greater than 1 that is representable for data type for which the template’s instantiated, which is double here. This helps us with the standardization of the tolerance value that we previously had chosen as 0.0000001. But doesn’t help us with the limitations 1) and 2) pointed out above.

There could more factors than that mentioned in 2) that can cause the rounding error. Apart from arithematic operations, it could be type promotions or binary conversions that can cause more than epsilon() rounding error. So, combination of 2 arithematic operations could lead to more than 2*epsilon. I don’t know how quantifiable that is, but by a safer choice of N (number of arithematic operations involved), we hope to be able to overcome the problem in most practical cases.

For the short-coming mentioned in 1), we would need to bring the epsilon to the degree of large-ness or small-ness of the numbers involved.

  1. Trap Handlers, Sticky Bits, and Floating-point Comparisons
  2. Floating-point comparison algorithms
  3. numeric_limits::epsilon