Modulo Trouble

Missed it by that much.
Maxwell Smart

Simple problems in color science often involve tables of data and numerous calculations just to get back-of-the-envelope results. As a trivial example, determining the chromaticity of an arbitrary spectral sample requires 120 multiplications and sums.¹ That would take a large envelope–not to mention considerable time. Excel is a fine tool for this kind of task. It handles tables well, easily shows intermediate results, and draws nice graphs on a whim. But there are some quirks.

Excel has a few issues with computation. One spreadsheet reported a #VALUE! error due to an out of range table index. This used the simple formula:

=INT(MOD(number,divisor))+1

But even though number and divisor are always positive, the resulting index was sometimes zero, for example INT(MOD(18,0.9))+1 = 0. Impossible! But unfortunately all too real. It turns out Excel reckoned MOD(18,0.9) = -4.4E-16. The correct answer is zero, since there is no remainder when dividing 18 by 0.9. Excel is very close, but unfortunately INT(-4.4E-16) = -1 which is not close at all. It seemed their MOD function was misbehaving. Excel’s documentation says:

The MOD function can be expressed in terms of the INT function:
=number-divisor*INT(number/divisor)

But obviously that is not its behavior. So I silently cursed the unknown programmer, switched explicitly to the documented formula, and watched my problems disappear. But this is not the end of the story. A few months later I caught an error in C++ where fmod(18,0.9) = 0.9. How can the remainder be the same as the divisor? Out of curiosity I checked all the modulo functions easily at hand. Here are my results:

Software	Result
Excel	-4.44089E-16
JavaScript	0.8999999999999996
C++ `fmod`	0.9
C# `Math.IEEERemainder`	-4.44089209850063E-16
perl	Illegal modulus zero at – line 1.
php	Warning: Division by zero in – on line 2
Graphing Calculator	0
bc (version 1.6)	0
awk (version 20070501)	0
Google Calculator	0

The most useful answer is zero, but some of the others are perhaps more reasonable than they appear. The fundamental issue is that floating point numbers become problematic when applied to sharp cutoffs.

Even in a perfect world, 0.9 cannot be exactly represented in floating point. This means there are two exact floats that form a minimum interval containing 0.9, and one will be chosen as the approximation. When a discontinuous function like modulo is applied, one of these two exact floats will be the cutoff point. Depending on the choices, the modulo result will either be zero or just short of the divisor. Modulo is therefore a very noisy function, where the smallest possible change on the input can make the largest possible swing on the output. This abrupt change is also its utility.

What bit me was the failure to preserve customary invariants, namely that the result of the modulo operation on two positive numbers is strictly bound by 0 ≤ result < divisor. By this criteria Excel, C#², and C++ fmod all fail miserably. One really can’t justify the behavior except by noting that “reasonable” for floating point numbers has its own special meaning.

The moral of this story has two parts:

Don’t use floating point numbers unless you must, and
Floating point operation results may not conform to a range inferred by that operation.

At 10 nm accuracy there are approximately 40 samples in the visible spectrum, so with 3 sets of weights for the standard observer, there are a total of 3*40=120 each of multiplications and additions. ↩
See also the Math.EEEERemainder documentation for a detailed explanation of subtle differences with the modulus operator. ↩

Recent Posts

Categories

Archives