Re: Floating-point formats and standards

Bradd W. Szonye 05 Jan 2005 12:22 UTC
A couple of corrections to the 754R description:

Bradd W. Szonye wrote:
> New name    Sig   Exp   Old name   Currently implemented by
>
> binary16     10     5
> binary32     23     8   single     all systems (hardware)
> binary64     52    11   double     all systems (hardware)
> binary80     64    15   extended   all x86-based systems (hardware)
> binary128   112    15   quad       most RISC systems (software)
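
For anyone checking the arithmetic in that table, here is a minimal Python sketch, not taken from the draft itself. It assumes that "Sig" counts the stored significand field, so binary16 stores 10 bits and has 11 bits of precision once the implicit leading bit is counted. The names FORMATS and describe are only for illustration.

```python
# Field widths in bits: (stored significand field, exponent field).
# Assumption: "Sig" means the stored field, so binary16 is listed as 10.
FORMATS = {
    "binary16": (10, 5),
    "binary32": (23, 8),
    "binary64": (52, 11),
    "binary80": (64, 15),    # x86 extended: leading bit is stored explicitly
    "binary128": (112, 15),
}

def describe(name):
    """Return (total bits, exponent bias, precision in bits) for a format."""
    sig_bits, exp_bits = FORMATS[name]
    total_bits = 1 + exp_bits + sig_bits      # one sign bit
    bias = 2 ** (exp_bits - 1) - 1
    # All formats except x86 extended gain one bit from the implicit leading 1.
    precision = sig_bits if name == "binary80" else sig_bits + 1
    return total_bits, bias, precision

for name in FORMATS:
    total_bits, bias, precision = describe(name)
    print(f"{name:10} total={total_bits:3} bias={bias:5} precision={precision:3}")
```

With these widths, sign + exponent + significand adds up to the number in each format's name.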

In the current draft, the extended format is called "binaryx" rather
than "binary80." Implementations are supposed to provide at least one
high-precision format for intermediate calculations: either binary128
(a basic format) or binaryx (an implementation-defined format about 50%
more precise than the implementation's widest basic format).

One proposal recommends specifying binaryx as the x86 extended format. I
have no way of knowing for certain, but I suspect that it will be
adopted, since x86 extended and quad are the only /de facto/ standards
for high-precision IEEE 754 flonums.

That proposal also states which formats a system should support.

For high-performance technical systems:

    Binary64 is mandatory for computation and storage.
    Binary32 is recommended for low-precision, high-density storage.
    Binaryx is recommended for computation on x86-compatible systems.
    Binary128 is recommended for expression evaluation (i.e., temps).

For commercial and financial systems:

    Decimal128 is mandatory for computation and storage.
    Decimal32 and decimal64 are recommended for storage.
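
To illustrate why decimal formats matter for commercial work, here is a small Python sketch. It is only an approximation of decimal128: it assumes that a decimal.Context with 34 digits of precision and the exponent limits shown behaves closely enough for this purpose. The name DECIMAL128 is mine, not something from the draft.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# Assumed stand-in for decimal128: 34 significant digits, and the exponent
# range usually quoted for that format. Not a bit-level implementation.
DECIMAL128 = Context(prec=34, rounding=ROUND_HALF_EVEN,
                     Emin=-6143, Emax=6144, clamp=1)

# In binary64, 0.1 and 0.2 are not exactly representable, so the sum
# does not compare equal to the literal 0.3.
binary_ok = (0.1 + 0.2 == 0.3)

# In decimal arithmetic, the same sum is exact.
decimal_ok = (DECIMAL128.add(Decimal("0.1"), Decimal("0.2")) == Decimal("0.3"))

# Inexact results are rounded to 34 significant digits.
third = DECIMAL128.divide(Decimal(1), Decimal(3))

print(binary_ok, decimal_ok, third)
```

The point is only that sums of decimal fractions such as prices stay exact in a decimal format, which is the usual argument for using one in financial code.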

The binary requirements match reality pretty well, except for the
binary128 recommendation. (Currently, only x86 systems use high-
precision temps, and they use binaryx instead of binary128.)
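
As a rough illustration of what high-precision temps buy you, here is a Python sketch. Python gives no portable access to binaryx or binary128, so this uses an analogy: binary32 as the storage format and binary64 as the wider temporary. The helper names (to_binary32, sum_narrow, sum_wide) are made up for the example.

```python
import struct
from fractions import Fraction

def to_binary32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_narrow(values):
    """Sum with every intermediate result rounded back to binary32."""
    acc = 0.0
    for v in values:
        acc = to_binary32(acc + v)
    return acc

def sum_wide(values):
    """Sum in binary64 temporaries, rounding to binary32 only at the end."""
    acc = 0.0
    for v in values:
        acc += v
    return to_binary32(acc)

values = [to_binary32(0.1)] * 10000
exact = sum(Fraction(v) for v in values)   # exact rational reference sum

narrow_err = abs(Fraction(sum_narrow(values)) - exact)
wide_err = abs(Fraction(sum_wide(values)) - exact)

print(float(narrow_err), float(wide_err))
```

The wide version only rounds to the storage format once, so its error stays within half a unit in the last place of the result, while the narrow version accumulates a rounding error on every addition.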

Hope this helps.
--
Bradd W. Szonye
http://www.szonye.com/bradd