Re: Numbers with small mantissa widths
Will Clinger (02 Sep 2024 23:48 UTC)
This is a lengthy statement of my opinions and some facts explaining those opinions, as I hope this can be my last submission to the SRFI 77 discussion of this subject.

BRIEF RESPONSE TO MARC'S MOST RECENT EMAIL.

Marc seems to think he can retain both his Claim 1 and his Claim 2. He is wrong about that. To bypass further back and forth, I will just say his Claim 1 is the one he cares about. He made his Claim 2 as part of an argument that the semantics of his Claim 1 is easier to implement efficiently than I was saying. Because he still doesn't see why he can't retain both claims, I suspect his "simple" algorithm actually relies on Claim 2 and therefore fails to implement the semantics of his Claim 1. All that means is that, if and when he gets around to creating an algorithm that truly implements his Claim 1 semantics, his algorithm will be less efficient than he currently believes.

I will therefore assume Marc will eventually come to realize he must choose between his Claim 1 and Claim 2, and that he will, at that point, retain Claim 1 while abandoning Claim 2. That assumption allows me to proceed to my final summary of the situation.

SEMANTICS.

In my previous missive, I somehow deleted an important SRFI 77 and R6RS loophole from my statement of Marc's Claim 1. I'll correct that statement here.

Claim 1. The R6RS requires the x|p notation to be read as (1) the best possible binary floating-point approximation to x that uses p bits of significand (which is mathematically well-defined), and then, in systems that use binary floating-point representations for inexact reals, (2) that best possible p-bit binary floating-point approximation is converted to the best possible floating-point approximation that uses p bits "if practical, or by a greater precision if a p-bit significand is not practical, or by the largest available precision if p or more bits of significand are not practical within the implementation."

In the part of that claim that quotes R6RS, the three "practical" loopholes are necessary because implementations of the R6RS are unlikely to support inexact reals of all possible precisions p. In at least some implementations, all inexact reals are represented in IEEE double precision, with p = 53.

Throughout most of this note, I will assume all inexact reals are represented in IEEE double precision, because dealing with multiple precisions for inexact reals complicates the exposition to no "real" purpose. I will also assume the implementation uses correct rounding, which is to say external representations of inexact reals (other than the x|p notation, which I must treat separately) are converted into the best possible IEEE double precision approximation.

I will refer to Claim 1 as "Marc's preferred semantics" and will compare it to "Will's preferred semantics", which is this:

    The R6RS allows implementations to read the x|p notation by
    selecting a precision q that satisfies both (1) and (2):

    (1) the implementation supports q-bit floating point as one of
        its representations for inexact reals

    (2) one of the following is true:

        (a) q = p

        (b) q > p, and the implementation does not support inexact
            reals of any precision that lies between p and q

        (c) q < p, but the implementation does not support inexact
            reals of any precision greater than q

    Then, having selected q, x|p is read as an inexact real whose
    value is the best possible q-bit approximation to x.
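To make the contrast concrete, here is a rough Scheme sketch of the two readings under my running assumptions: the only supported representation for inexact reals is IEEE double precision (so q = 53), and x is a positive exact rational in the normal range (overflow, underflow, and signed zero are ignored). BEST-APPROXIMATION is a hypothetical helper, not anything from R6RS; it returns the value of the best p-bit binary floating-point approximation as an exact rational.

  ;; Best p-bit binary floating-point approximation to a positive
  ;; exact rational x, returned as an exact rational.  Scale x into
  ;; [2^(p-1), 2^p), round to the nearest integer significand (R6RS
  ;; `round' rounds ties to even, as IEEE arithmetic does), then
  ;; scale back.
  (define (best-approximation x p)
    (let loop ((e 0) (y x))
      (cond ((>= y (expt 2 p))      (loop (+ e 1) (/ y 2)))
            ((< y (expt 2 (- p 1))) (loop (- e 1) (* y 2)))
            (else (* (round y) (expt 2 e))))))

  ;; Will's preferred semantics: a single rounding, directly to the
  ;; supported precision.  The width p plays no role because, with
  ;; only one supported precision, clause (2b) or (2c) always
  ;; selects q = 53.
  (define (read-with-will-semantics x p)
    (inexact (best-approximation x 53)))

  ;; Marc's preferred semantics: round to p bits first, then to the
  ;; supported precision; that is double rounding whenever p is
  ;; not 53.
  (define (read-with-marc-semantics x p)
    (inexact (best-approximation (best-approximation x p) 53)))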
It is my opinion that the deliberate ambiguity and deliberate loopholes of R6RS 4.2.8 allow (but do not require) Marc's preferred semantics, and also allow implementations of the R6RS that use Will's preferred semantics. It is also my opinion that implementations and most users of the R6RS should prefer Will's preferred semantics. The rest of this note explains why I hold that opinion.

****************************************************************

ACCURACY.

Marc's preferred semantics generally implies double rounding, which is usually regarded as something to be avoided because it can degrade accuracy.

Theorem. For all x and for all p, Will's preferred semantics delivers an inexact real that approximates x at least as well as Marc's preferred semantics, and the inexact real obtained via Will's preferred semantics is often a better approximation to x than the inexact real obtained via Marc's preferred semantics.

That theorem is uncontroversial. It underscores the fact that Marc's preferred semantics is not about using the x|p notation as a good approximation for x. Marc's preferred semantics is about using the x|p notation as an approximation for the p-bit floating-point approximation of x.

Aside from the use case described in Footnote 1, for which the x|p notation is clearly worthless, I am not aware of any common use case in which someone would really want to be computing q-bit floating-point approximations to p-bit floating-point approximations. Such use cases might exist, but they are not common. On the other hand, it is not unheard of for someone to want to compute the inexact real that best approximates x using at least p bits of precision. That might not be a common use case, but I maintain that it is more common than wanting to compute q-bit approximations to p-bit approximations of x.

That is the main reason why I prefer Will's preferred semantics to Marc's preferred semantics, and it is why I believe Will's preferred semantics should be preferred by implementations and most users. But there are other reasons as well, to which I will turn after quantifying the probability that the double rounding implied by Marc's preferred semantics will degrade accuracy. In the following questions, the interval (1, 1000000) serves only to guarantee that any real number within the interval can be approximated by a normalized IEEE double precision result.

QUESTION: If x is a real number selected at random from the interval (1, 1000000), what is the probability that the best IEEE double precision (53-bit) approximation to x is closer to x than the best 53-bit approximation to the best 54-bit approximation of x?

ANSWER: Roughly 1/4.

Sketch of proof: The set of real numbers representable in IEEE double precision has measure zero, so we can calculate the probability without considering them. Let a be the best 53-bit approximation to x for which a < x, and let b be the least 53-bit floating-point value for which a < b. With very high probability, (a+b)/2 is representable using 54 bits of precision, and is the only real number between a and b that can be represented using 54 bits. Dividing the interval (a, b) into four equal quadrants, we find that an x in the quadrants (a, (3a+b)/4) or ((a+3b)/4, b) is best approximated by a or b, respectively, regardless of whether we use 53 or 54 bits of precision.

If the best 53-bit approximation to (a+b)/2 is b, however, then a is the best 53-bit approximation to every x in the interval ((3a+b)/4, (a+b)/2), but converting x to its best 54-bit approximation yields (a+b)/2, whose best 53-bit approximation is b, not a. Thus the process of converting x first to 54 bits and then to 53 bits involves double rounding that degrades accuracy for one quarter of the real numbers in (a, b). The argument is similar if the best 53-bit approximation to (a+b)/2 is a. QED

Generalizing the question, answer, and proof sketch, we find that the probability of degraded accuracy becomes 1/8 if we first convert to 55 bits and then to 53, and 1/16 if we first convert to 56 bits and then to 53. And so on, halving with each additional bit of intermediate precision.

QUESTION: If x is a real number selected at random from the interval (1, 1000000), what is the probability that the best IEEE double precision (53-bit) approximation to x is closer to x than the best 53-bit approximation to the best 52-bit approximation of x?

ANSWER: Roughly 1/2.

Generalizing, we find that the probability of degraded accuracy becomes 3/4 if we first convert to 51 bits and then to 53. If we first convert to 50 bits and then to 53, the probability of degraded accuracy is 7/8. And so on.
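Here is a concrete pair of instances, checkable with exact rational arithmetic in any R6RS system using the hypothetical BEST-APPROXIMATION sketch from the SEMANTICS section:

  ;; a and b are adjacent 53-bit values near 1; their midpoint m is
  ;; the only 54-bit value strictly between them, and m's two 53-bit
  ;; neighbors tie, so m rounds to b (even significand).
  (define a (+ 1 (expt 2 -52)))   ; significand 2^52 + 1 (odd)
  (define b (+ 1 (expt 2 -51)))   ; significand 2^52 + 2 (even)

  ;; 54 bits, then 53: x lies in ((3a+b)/4, m), just below m, so its
  ;; best 53-bit approximation is a, but its best 54-bit
  ;; approximation is m, which then rounds the wrong way, to b.
  (define x (- (+ 1 (* 3 (expt 2 -53))) (expt 2 -60)))

  (best-approximation x 53)                          ; => a
  (best-approximation (best-approximation x 54) 53)  ; => b

  ;; 52 bits, then 53: y lies just above a, so its best 53-bit
  ;; approximation is a itself, but its best 52-bit approximation is
  ;; b, and b is already a 53-bit value, so the damage is done.
  (define y (+ 1 (expt 2 -52) (expt 2 -60)))

  (best-approximation y 53)                          ; => a
  (best-approximation (best-approximation y 52) 53)  ; => b

In the first instance, single rounding yields a with error 2^-53 - 2^-60, while double rounding yields b with error 2^-53 + 2^-60; in the second, single rounding yields a with error 2^-60, while double rounding yields b with error 2^-52 - 2^-60.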
From the above, it is perfectly clear that Marc's preferred semantics is not about finding accurate approximations to x, whereas Will's preferred semantics always delivers the most accurate q-bit approximation to x that is possible within a given implementation, for a particular precision q that is clearly allowed by the constraints stated in R6RS 4.2.8.

****************************************************************

EFFICIENCY.

Marc's preferred semantics implies double rounding, whereas Will's preferred semantics implies single rounding, so Marc's preferred semantics is twice as expensive as Will's even if all rounding steps are equally efficient. It is likely, however, that rounding to one of the precisions supported by an implementation's selection of formats for inexact reals will be less costly than rounding to some other precision. There are three reasons for that.

One is that all external representations for inexact numbers that don't involve the x|p notation round to a supported precision, giving implementations an incentive to optimize that common case.

Another is that at least some supported precisions are likely to match a precision supported by floating-point hardware, which allows implementations to use more efficient algorithms such as "Clinger's method". (David Gay improved and implemented that method in C. Nowadays some variation of that algorithm is part of multiple subsystems installed on virtually every general-purpose computing system and smartphone.)

The third reason is that, with Marc's preferred semantics, implementations that would otherwise use conversion routines available via standard C libraries would probably have to implement arbitrary-precision conversions in Scheme. Chez Scheme might be fast enough to get by with using bignum arithmetic to support arbitrary precisions, but many other implementations are not. In particular, some implementations are interpreted, so using Scheme code to implement the conversions mandated by Marc's preferred semantics might well be considerably slower than the conversions necessary for Will's preferred semantics.

Thus Will's preferred semantics is likely to be implemented more than twice as efficiently as Marc's, and quite possibly a great deal more than twice as efficiently.
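To make the first two reasons concrete: in an implementation whose only inexact-real format is IEEE double precision, Will's preferred semantics lets the reader discard the mantissa-width suffix and reuse whatever correctly rounding conversion it already has (Clinger's method, Gay's C code, or the host library). A hypothetical sketch follows; the procedure names are illustrative, not from any standard or implementation:

  ;; Return STR without a trailing |p mantissa-width suffix, if any.
  (define (strip-mantissa-width str)
    (let loop ((i 0))
      (cond ((= i (string-length str)) str)
            ((char=? (string-ref str i) #\|) (substring str 0 i))
            (else (loop (+ i 1))))))

  ;; With only 53-bit inexact reals, clause (2) of Will's preferred
  ;; semantics always selects q = 53, so every mantissa width reads
  ;; via the ordinary single-rounding decimal-to-double conversion.
  (define (read-flonum-with-width str)
    (string->number (strip-mantissa-width str)))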
****************************************************************

SUMMARY.

The text of R6RS 4.2.8 allows Will's preferred semantics, and also allows Marc's preferred semantics. Will's preferred semantics is more commonly useful, more accurate, and more efficient than Marc's preferred semantics.

To dispute the "more commonly useful" part of that sentence, it would be necessary to cite applications of Marc's preferred semantics that are more common or more useful than expecting the x|p notation to deliver an accurate approximation for x.

To dispute the "more accurate" part of that sentence, it would be necessary to adopt a notion of "accurate" that is not about finding accurate approximations for x. To argue for that alternative notion of accuracy, one would have to explain why that alternative notion is more commonly useful than what "accurate" means in the context of finding an accurate approximation for x.

To dispute the "more efficient" part of that sentence would be silly.

Will Clinger

FOOTNOTE 1. I find it amusing to point out that "Clinger's method" for reading floating-point numbers accurately is the only algorithm I can think of that needs to compute q-bit floating-point approximations to p-bit floating-point approximations. In that algorithm, the precision p *must* be greater than the ultimate precision q, contrary to what R6RS 4.2.8 says about the x|p notation, which insists p *must* be less than or equal to the precision q of the second approximation whenever that is at all practical. So "Clinger's method" is not a use case for the x|p notation, regardless of whose semantics you prefer.