Re: Numbers with small mantissa widths
Will Clinger (02 Sep 2024 23:48 UTC)
This is a lengthy statement of my opinions and some facts explaining those opinions, as I hope this can be my last submission to the SRFI 77 discussion of this subject.

BRIEF RESPONSE TO MARC'S MOST RECENT EMAIL.

Marc seems to think he can retain both his Claim 1 and his Claim 2. He is wrong about that. To bypass further back and forth, I will just say his Claim 1 is the one he cares about. He made his Claim 2 as part of an argument that the semantics of his Claim 1 is easier to implement efficiently than I was saying. Because he still doesn't see why he can't retain both claims, I suspect his "simple" algorithm actually relies on Claim 2 and therefore fails to implement the semantics of his Claim 1. All that means is that, if and when he gets around to creating an algorithm that truly implements his Claim 1 semantics, his algorithm will be less efficient than he currently believes.

I will therefore assume Marc will eventually come to realize he must choose between his Claim 1 and Claim 2, and that he will, at that point, retain Claim 1 while abandoning Claim 2. That assumption allows me to proceed to my final summary of the situation.

SEMANTICS.

In my previous missive, I somehow deleted an important SRFI 77 and R6RS loophole from my statement of Marc's Claim 1. I'll correct that statement here.

Claim 1. The R6RS requires the x|p notation to be read as (1) the best possible binary floating-point approximation to x that uses p bits of significand (which is mathematically well-defined), and then, in systems that use binary floating-point representations for inexact reals, (2) that best possible p-bit binary floating-point approximation is converted to the best possible floating-point approximation that uses p bits "if practical, or by a greater precision if a p-bit significand is not practical, or by the largest available precision if p or more bits of significand are not practical within the implementation."

In the part of that claim that quotes R6RS, the three "practical" loopholes are necessary because implementations of the R6RS are unlikely to support inexact reals of all possible precisions p. In at least some implementations, all inexact reals are represented in IEEE double precision, with p = 53.

Throughout most of this note, I will assume all inexact reals are represented in IEEE double precision, because dealing with multiple precisions for inexact reals complicates the exposition to no "real" purpose. I will also assume the implementation uses correct rounding, which is to say external representations of inexact reals (other than the x|p notation, which I must treat separately) are converted into the best possible IEEE double precision approximation.

I will refer to Claim 1 as "Marc's preferred semantics" and will compare it to "Will's preferred semantics", which is this:

    The R6RS allows implementations to read the x|p notation by
    selecting a precision q that satisfies both (1) and (2):

    (1) the implementation supports q-bit floating point as one of
        its representations for inexact reals

    (2) one of the following is true:

        (a) q = p

        (b) q > p, and the implementation does not support inexact
            reals of any precision that lies between p and q

        (c) q < p, but the implementation does not support inexact
            reals of any precision greater than q

    Then, having selected q, x|p is read as an inexact real whose
    value is the best possible q-bit approximation to x.
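To make the contrast concrete, here is a rough Scheme sketch of the two readings under my running assumptions: the only supported representation for inexact reals is IEEE double precision (so q = 53), and x is a positive exact rational in the normal range (overflow, underflow, and signed zero are ignored). BEST-APPROXIMATION is a hypothetical helper, not anything from R6RS; it returns the value of the best p-bit binary floating-point approximation as an exact rational.

  ;; Best p-bit binary floating-point approximation to a positive
  ;; exact rational x, returned as an exact rational.  Scale x into
  ;; [2^(p-1), 2^p), round to the nearest integer significand (R6RS
  ;; `round' rounds ties to even, as IEEE arithmetic does), then
  ;; scale back.
  (define (best-approximation x p)
    (let loop ((e 0) (y x))
      (cond ((>= y (expt 2 p))      (loop (+ e 1) (/ y 2)))
            ((< y (expt 2 (- p 1))) (loop (- e 1) (* y 2)))
            (else (* (round y) (expt 2 e))))))

  ;; Will's preferred semantics: a single rounding, directly to the
  ;; supported precision.  The width p plays no role because, with
  ;; only one supported precision, clause (2b) or (2c) always
  ;; selects q = 53.
  (define (read-with-will-semantics x p)
    (inexact (best-approximation x 53)))

  ;; Marc's preferred semantics: round to p bits first, then to the
  ;; supported precision; that is double rounding whenever p is
  ;; not 53.
  (define (read-with-marc-semantics x p)
    (inexact (best-approximation (best-approximation x p) 53)))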
It is my opinion that the deliberate ambiguity and deliberate loopholes of R6RS 4.2.8 allow (but do not require) Marc's preferred semantics, and also allow implementations of the R6RS that use Will's preferred semantics. It is also my opinion that implementations and most users of the R6RS should prefer Will's preferred semantics. The rest of this note explains why I hold that opinion.

****************************************************************

ACCURACY.

Marc's preferred semantics generally implies double rounding, which is usually regarded as something to be avoided because it can degrade accuracy.

Theorem. For all x and for all p, Will's preferred semantics delivers an inexact real that approximates x at least as well as Marc's preferred semantics, and the inexact real obtained via Will's preferred semantics is often a better approximation to x than the inexact real obtained via Marc's preferred semantics.

That theorem is uncontroversial. It underscores the fact that Marc's preferred semantics is not about using the x|p notation as a good approximation for x. Marc's preferred semantics is about using the x|p notation as an approximation for the p-bit floating-point approximation of x.

Aside from the use case described in Footnote 1, for which the x|p notation is clearly worthless, I am not aware of any common use case in which someone would really want to be computing q-bit floating-point approximations to p-bit floating-point approximations. Such use cases might exist, but they are not common. On the other hand, it is not unheard of for someone to want to compute the inexact real that best approximates x using at least p bits of precision. That might not be a common use case, but I maintain that it is more common than wanting to compute q-bit approximations to p-bit approximations of x.

That is the main reason why I prefer Will's preferred semantics to Marc's preferred semantics, and it is why I believe Will's preferred semantics should be preferred by implementations and most users. But there are other reasons as well, to which I will turn after quantifying the probability that the double rounding implied by Marc's preferred semantics will degrade accuracy. In the following questions, the interval (1, 1000000) serves only to guarantee that any real number within the interval can be approximated by a normalized IEEE double precision result.

QUESTION: If x is a real number selected at random from the interval (1, 1000000), what is the probability that the best IEEE double precision (53-bit) approximation to x is closer to x than the best 53-bit approximation to the best 54-bit approximation of x?

ANSWER: Roughly 1/4.

Sketch of proof: The set of real numbers representable in IEEE double precision has measure zero, so we can calculate the probability without considering them. Let a be the best 53-bit approximation to x for which a < x, and let b be the least 53-bit floating-point value for which a < b. With very high probability, (a+b)/2 is representable using 54 bits of precision, and is the only real number between a and b that can be represented using 54 bits. Dividing the interval (a, b) into four equal quadrants, we find that an x in the quadrants (a, (3a+b)/4) or ((a+3b)/4, b) is best approximated by a or b, respectively, regardless of whether we use 53 or 54 bits of precision.

If the best 53-bit approximation to (a+b)/2 is b, however, then a is the best 53-bit approximation to every x in the interval ((3a+b)/4, (a+b)/2), but converting x to its best 54-bit approximation yields (a+b)/2, whose best 53-bit approximation is b, not a. Thus the process of converting x first to 54 bits and then to 53 bits involves double rounding that degrades accuracy for one quarter of the real numbers in (a, b). The argument is similar if the best 53-bit approximation to (a+b)/2 is a. QED

Generalizing the question, answer, and proof sketch, we find that the probability of degraded accuracy becomes 1/8 if we first convert to 55 bits and then to 53, and 1/16 if we first convert to 56 bits and then to 53. And so on, halving with each additional bit of intermediate precision.

QUESTION: If x is a real number selected at random from the interval (1, 1000000), what is the probability that the best IEEE double precision (53-bit) approximation to x is closer to x than the best 53-bit approximation to the best 52-bit approximation of x?

ANSWER: Roughly 1/2.

Generalizing, we find that the probability of degraded accuracy becomes 3/4 if we first convert to 51 bits and then to 53. If we first convert to 50 bits and then to 53, the probability of degraded accuracy is 7/8. And so on.
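Here is a concrete pair of instances, checkable with exact rational arithmetic in any R6RS system using the hypothetical BEST-APPROXIMATION sketch from the SEMANTICS section:

  ;; a and b are adjacent 53-bit values near 1; their midpoint m is
  ;; the only 54-bit value strictly between them, and m's two 53-bit
  ;; neighbors tie, so m rounds to b (even significand).
  (define a (+ 1 (expt 2 -52)))   ; significand 2^52 + 1 (odd)
  (define b (+ 1 (expt 2 -51)))   ; significand 2^52 + 2 (even)

  ;; 54 bits, then 53: x lies in ((3a+b)/4, m), just below m, so its
  ;; best 53-bit approximation is a, but its best 54-bit
  ;; approximation is m, which then rounds the wrong way, to b.
  (define x (- (+ 1 (* 3 (expt 2 -53))) (expt 2 -60)))

  (best-approximation x 53)                          ; => a
  (best-approximation (best-approximation x 54) 53)  ; => b

  ;; 52 bits, then 53: y lies just above a, so its best 53-bit
  ;; approximation is a itself, but its best 52-bit approximation is
  ;; b, and b is already a 53-bit value, so the damage is done.
  (define y (+ 1 (expt 2 -52) (expt 2 -60)))

  (best-approximation y 53)                          ; => a
  (best-approximation (best-approximation y 52) 53)  ; => b

In the first instance, single rounding yields a with error 2^-53 - 2^-60, while double rounding yields b with error 2^-53 + 2^-60; in the second, single rounding yields a with error 2^-60, while double rounding yields b with error 2^-52 - 2^-60.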
From the above, it is perfectly clear that Marc's preferred semantics is not about finding accurate approximations to x, whereas Will's preferred semantics always delivers the most accurate q-bit approximation to x that is possible within a given implementation, for a particular precision q that is clearly allowed by the constraints stated in R6RS 4.2.8.

****************************************************************

EFFICIENCY.

Marc's preferred semantics implies double rounding, whereas Will's preferred semantics implies single rounding, so Marc's preferred semantics is twice as expensive as Will's even if all rounding steps are equally efficient. It is likely, however, that rounding to one of the precisions supported by an implementation's selection of formats for inexact reals will be less costly than rounding to some other precision. There are three reasons for that.

One is that all external representations for inexact numbers that don't involve the x|p notation round to a supported precision, giving implementations an incentive to optimize that common case.

Another is that at least some supported precisions are likely to match a precision supported by floating-point hardware, which allows implementations to use more efficient algorithms such as "Clinger's method". (David Gay improved and implemented that method in C. Nowadays some variation of that algorithm is part of multiple subsystems installed on virtually every general-purpose computing system and smartphone.)

The third reason is that, with Marc's preferred semantics, implementations that would otherwise use conversion routines available via standard C libraries would probably have to implement arbitrary-precision conversions in Scheme. Chez Scheme might be fast enough to get by with using bignum arithmetic to support arbitrary precisions, but many other implementations are not. In particular, some implementations are interpreted, so using Scheme code to implement the conversions mandated by Marc's preferred semantics might well be considerably slower than the conversions necessary for Will's preferred semantics.

Thus Will's preferred semantics is likely to be implemented more than twice as efficiently as Marc's, and quite possibly a great deal more than twice as efficiently.
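To make the first two reasons concrete: in an implementation whose only inexact-real format is IEEE double precision, Will's preferred semantics lets the reader discard the mantissa-width suffix and reuse whatever correctly rounding conversion it already has (Clinger's method, Gay's C code, or the host library). A hypothetical sketch follows; the procedure names are illustrative, not from any standard or implementation:

  ;; Return STR without a trailing |p mantissa-width suffix, if any.
  (define (strip-mantissa-width str)
    (let loop ((i 0))
      (cond ((= i (string-length str)) str)
            ((char=? (string-ref str i) #\|) (substring str 0 i))
            (else (loop (+ i 1))))))

  ;; With only 53-bit inexact reals, clause (2) of Will's preferred
  ;; semantics always selects q = 53, so every mantissa width reads
  ;; via the ordinary single-rounding decimal-to-double conversion.
  (define (read-flonum-with-width str)
    (string->number (strip-mantissa-width str)))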
****************************************************************

SUMMARY.

The text of R6RS 4.2.8 allows Will's preferred semantics, and also allows Marc's preferred semantics. Will's preferred semantics is more commonly useful, more accurate, and more efficient than Marc's preferred semantics.

To dispute the "more commonly useful" part of that sentence, it would be necessary to cite applications of Marc's preferred semantics that are more common or more useful than expecting the x|p notation to deliver an accurate approximation for x.

To dispute the "more accurate" part of that sentence, it would be necessary to adopt a notion of "accurate" that is not about finding accurate approximations for x. To argue for that alternative notion of accuracy, one would have to explain why that alternative notion is more commonly useful than what "accurate" means in the context of finding an accurate approximation for x.

To dispute the "more efficient" part of that sentence would be silly.

Will Clinger

FOOTNOTE 1. I find it amusing to point out that "Clinger's method" for reading floating-point numbers accurately is the only algorithm I can think of that needs to compute q-bit floating-point approximations to p-bit floating-point approximations. In that algorithm, the precision p *must* be greater than the ultimate precision q, contrary to what R6RS 4.2.8 says about the x|p notation, which insists p *must* be less than or equal to the precision q of the second approximation whenever that is at all practical. So "Clinger's method" is not a use case for the x|p notation, regardless of whose semantics you prefer.