High Precision Numerical Computing:
The 5th dimension of computing power
The performance of computing hardware is usually described by
   * the CPU/bus clock frequency,
   * the instruction word size, dictating both the maximum memory size and the number of parallel execution units,
   * the size of the memory and caches,
   * the speed of the interconnection with other CPUs in parallel systems.
This is essentially a 4-D optimization problem. However, there is a 5th dimension,
somewhat overlooked: the floating-point unit (FPU) bit size.

Available on both 32- and 64-bit CPUs, the double-precision (64/80-bit) FPU provides accuracy
good enough for most applications, but
quadruple-precision capabilities are becoming more and more
important in many research domains, not only in high-energy physics but also in non-linear process
simulation and number theory, and in commercial applications such as finite-element
modeling CAD, 3-D real-time graphics, statistics and security cryptography.
High-precision computation not only gives more precise results but, when large
cancellations or rounding errors play an important role, leads more surely to the
correct result.
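As a minimal illustration of such cancellation (a hypothetical example, not taken from the study): in IEEE double precision, adding 1 to 10^16 is rounded away entirely, while a software higher-precision type recovers it.

```python
from decimal import Decimal, getcontext

# In IEEE 754 double precision (53-bit mantissa), integers above 2**53
# are spaced more than 1 apart, so adding 1 to 1e16 is rounded away
# and the subsequent subtraction cancels to zero.
double_result = (1e16 + 1.0) - 1e16
print(double_result)  # 0.0 -- the "+1" has vanished

# Emulating higher precision in software (here 34 significant digits,
# comparable to IEEE quadruple precision) preserves the small term.
getcontext().prec = 34
quad_result = (Decimal(10) ** 16 + 1) - Decimal(10) ** 16
print(quad_result)  # 1
```

The double-precision result is not merely imprecise; it is qualitatively wrong, which is the kind of failure higher precision guards against.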
In addition, high precision improves the convergence of some iterative algorithms
(such as linear equation solvers), reducing
the number of iterations needed. This is virtually equivalent to an overall computational
speed increase.
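One standard technique of this kind is iterative refinement, sketched below under assumptions of our own: a hypothetical ill-conditioned 2x2 system, a direct double-precision solve, and exact rational arithmetic standing in for a higher-precision FPU when computing residuals.

```python
from fractions import Fraction

# Hypothetical ill-conditioned 2x2 system with solution close to (1, 1);
# the matrix and right-hand side are illustrative only.
A = [[1.0, 1.0], [1.0, 1.0000001]]
b = [2.0, 2.0000001]

def solve2(A, b):
    """Direct 2x2 solve (Cramer's rule) in double precision."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

def residual(A, b, x):
    """Residual b - A*x, accumulated in exact rational arithmetic
    (standing in for higher-precision hardware)."""
    return [float(Fraction(b[i])
                  - Fraction(A[i][0]) * Fraction(x[0])
                  - Fraction(A[i][1]) * Fraction(x[1]))
            for i in range(2)]

# Iterative refinement: each pass solves for a correction driven by a
# residual computed in higher precision, so few passes are needed.
x = solve2(A, b)
for _ in range(3):
    r = residual(A, b, x)
    dx = solve2(A, r)
    x = [x[i] + dx[i] for i in range(2)]

print(x)  # close to [1.0, 1.0]
```

With the residual computed in working precision instead, the correction steps stall near the rounding level of the solver; the higher-precision residual is what lets a handful of iterations suffice.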
Unfortunately, this gain is offset by the
additional computing time imposed by the software emulation of higher-precision arithmetic.
Conventional libraries implementing quadruple-precision floating-point in C++/F90
show a performance drop by a factor of 10-20 compared to double precision for basic operations on a PC.
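One common emulation technique (our assumption about how such libraries work, not a claim about any specific one) is "double-double" arithmetic, where each extended-precision value is a pair of doubles. The sketch below shows why it is slow: a single emulated addition expands into roughly ten double-precision operations.

```python
def two_sum(a, b):
    """Error-free transformation (Knuth's TwoSum, 6 FLOPs):
    returns (s, e) with s = fl(a + b) and a + b = s + e exactly."""
    s = a + b
    t = s - a
    e = (a - (s - t)) + (b - t)
    return s, e

def dd_add(x, y):
    """Add two double-double numbers (hi, lo) -> (hi, lo).
    One emulated add already costs ~10 double-precision FLOPs,
    one source of the reported slowdown over native doubles."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    return two_sum(s, e)

# 1e16 + 1 overflows the 53-bit double mantissa, but the double-double
# pair retains the lost low-order part:
hi, lo = dd_add((1e16, 0.0), (1.0, 0.0))
print(hi, lo)  # hi = 1e16, lo = 1.0
```

Multiplication is costlier still, which is consistent with the order-of-magnitude penalties quoted above and motivates doing this work in hardware instead.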
The Linpack benchmark time jumps by a factor of 36 when converted to quadruple precision.
Providing 128-bit FPU precision in hardware would then result in a substantial reduction of computing time
and therefore in a better FLOPS figure.
In these cases, numerical accuracy is really the 5th dimension of computer system performance.
This is a very complex issue that has triggered the setting up of a multi-disciplinary
working group to propose the best software/hardware balance for achieving high-precision
(quadruple/octuple) numerical computation. The conception and prototype design of a
dedicated co-processor, under the proposed acronym
*HAPPY* for "High Arithmetic
Precision Processor Yoke", is part of this study.
Many numerical calculations would improve dramatically, and such a co-processor
would become an important asset in the Petaflops race.
--
DenisPerretGallix - 19 Oct 2005