Daily Speculations The Web Site of Victor Niederhoffer and Laurel Kenner

15-Jun-2006
Significant Digits, by Philip J. McDonnell

One thinks back with mixed feelings to the days of the first graduating class in Computer Science from UC Berkeley. Ours was the first class to officially carry the term "Computer Science." Prior to that the relevant degrees were Math with a minor in Numerical Analysis or Electrical Engineering. Berkeley's Computer Science department was one of the first in the country as well.

At Berkeley we had Control Data machines with their 60-bit word size. This naturally allowed floating point arithmetic with plenty of significant digits, about 13 as I recall. When I was a programmer at the Stanford Linear Accelerator Center, we had a \$15 million IBM machine which offered both single and double precision arithmetic. It also featured 2 MB of RAM, very roomy for its day. One of the things I learned at Stanford was how difficult it is to use 6-digit single precision arithmetic for calculations involving lots of numbers.

Consider the problem of adding numbers using, say, 3 digits of precision. Suppose we have 1,000 numbers with average values around 500. By the time we get halfway through our calculation we have a running total of about 250,000. But in three-digit arithmetic that is really 250,xxx. We effectively drop the last three digits. Thus when we add the next number we are doing the following calculation:

```
  250,xxx
+     500
---------
  250,xxx
```

Note that adding the 500 has "no" possible effect on the outcome. Because we are only halfway through our computations we need to realize that we are effectively defenestrating half our data!
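The effect is easy to reproduce. What follows is only a minimal sketch, assuming Python's standard `decimal` module as a stand-in for a three-digit machine and assuming that low-order digits are truncated rather than rounded. The helper `sum_with_digits` and the data (1,000 values of exactly 500) are hypothetical, chosen to mirror the example above.

```python
from decimal import Context, Decimal, ROUND_DOWN

def sum_with_digits(values, digits):
    """Add values one at a time, keeping only `digits` significant digits."""
    # ROUND_DOWN truncates, mimicking "dropping" the low-order digits.
    ctx = Context(prec=digits, rounding=ROUND_DOWN)
    total = Decimal(0)
    for v in values:
        total = ctx.add(total, Decimal(v))
    return total

values = [500] * 1000                        # hypothetical data set
three_digit = sum_with_digits(values, 3)     # three-digit running total
sixteen_digit = sum_with_digits(values, 16)  # 16-digit running total

print(int(three_digit))    # 100000
print(int(sixteen_digit))  # 500000
```

In this sketch the three-digit total stops growing once it reaches 100,000, because from there on the smallest step it can represent is 1,000 and each added 500 is truncated away. With data like these the loss sets in even before the halfway point.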

By convention, when academics publish papers and data they are expected not to overstate their findings. In other words, if their data are accurate to only three decimal places because of measurement, statistical, or round-off error, they are expected to round those results to the representative number of significant digits. However, this leaves us with a quandary as to what happens when others try to use their data. The problem is exacerbated when others use large quantities of such data in calculations. This is a strong argument for publishing more significant digits in raw data, accompanied by disclosure of the actual significance of the results. It also argues for the use of the now ubiquitous 16-digit precision arithmetic for all calculations, so that we do not add round-off error to whatever uncertainty is already intrinsically in our data.
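To get a rough feel for the gap between single and double precision, here is a sketch that assumes IEEE 754 arithmetic, which differs in detail from the older machines described above. Python floats are doubles, so single precision is simulated by round-tripping each intermediate result through the `struct` module's 4-byte format. The helper names and the test data (100,000 copies of 0.1) are hypothetical.

```python
import struct

def to_single(x):
    """Round a Python float (a double) to the nearest IEEE 754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_single(values):
    """Accumulate with every intermediate result rounded to single precision."""
    total = 0.0
    for v in values:
        total = to_single(total + to_single(v))
    return total

def sum_double(values):
    """Accumulate in ordinary double precision."""
    total = 0.0
    for v in values:
        total += v
    return total

values = [0.1] * 100_000   # hypothetical data; the true sum is 10,000
exact = 10_000.0
single_error = abs(sum_single(values) - exact)
double_error = abs(sum_double(values) - exact)

print("single precision error:", single_error)
print("double precision error:", double_error)
```

The single precision error should come out many orders of magnitude larger than the double precision one, which is the practical case for doing the arithmetic in 16 digits even when the inputs are known to far fewer.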