A New Speculation Technique to Optimize Floating-Point Performance while Preserving Bit-by-Bit Reproducibility
Abstract
The bit-by-bit reproducibility of floating-point results, which is defined by the IEEE 754 standard, prohibits optimizations such as re-association and the use of native operations such as fused multiply-add (FMA), and thus it significantly impairs floating-point performance. Recent network-oriented languages such as Java strictly conform to the standard, and thus their numerical computing performance becomes inherently lower than conventional languages. In this paper, we propose a new software technique, called floating-point (FP) speculation, to optimize floating-point performance while preserving the bit-by-bit reproducibility of the results. We execute the fast unsafe code and the slow verification code in parallel. The unsafe code does not wait for the verification code, and is immediately followed by the subsequent code that uses the probable result from the unsafe code assuming the speculation will succeed. The improvement from FP speculation results from this earlier start of the subsequent code. Unlike other speculation techniques, FP speculation does not require any special instructions or hardware support. Rather, it exploits unused floating-point registers and execution units. Therefore it is generally applicable for processor architectures that have sufficient floating-point resources.