Re: SRFI-77 with more than one flonum representation

Re: SRFI-77 with more than one flonum representation Alan Watson 03 Jul 2006 02:30 UTC

This is my second reply to John Cowan. In the first, I argue that my
need to use single-precision floating point numbers is based on a desire
for correctness, not efficiency. Here, I address efficiency.

John Cowan wrote:
> I think the burden of persuasion now lies on you (or someone else in
> your position) to show that:
>
> 1) there are still significant architectures in which different kinds
> of floating-point numbers represent a significant tradeoff (as was
> historically the case, single-float being faster but less precise and
> with a smaller range), such that it does not make sense to privilege
> one over the other; and that
>
> 2) this feature warrants support, even if halfhearted, from the Scheme
> standard rather than being left as implementation-dependent.
>
> I believe this will be a difficult burden to meet.

I present two examples. The first is a flight of fantasy, but an
interesting one. The second is real, and something that could be
implemented now.

1. Vectorizable code on an x86 or x86-64 with SSE and SSE2 can run twice
as fast in single precision as in double precision, because a 128-bit
register holds four singles but only two doubles. To a large degree this
is true regardless of the size of the vectors (i.e., it is true even for
small vectors that are not limited by memory bandwidth).

I know of no vectorizing Scheme compiler, so it is difficult to argue
that the Scheme standard should be bent to support such a hypothetical
compiler. On the other hand, I do not think that the Scheme standard
should be written in such a way that it is difficult to write an
efficient vectorizing compiler that behaves naturally and can handle
both single- and double-precision representations. So, sure, the
standard should not require implementations to have more than one
floating point representation, but it should not rule out that
possibility either (i.e., please keep s, d, and l exponents).
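
The factor of two in the first example can be sketched with a little
arithmetic. This is only an illustration, assuming a 128-bit SSE/SSE2
register, 4-byte singles and 8-byte doubles, and equal per-instruction
throughput for both widths. The names SSE_REGISTER_BYTES,
lanes_per_register, and vector_ops are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: one SSE/SSE2 vector register is 128 bits (16 bytes) wide. */
enum { SSE_REGISTER_BYTES = 16 };

/* Number of elements of the given size that fit in one register. */
static size_t lanes_per_register(size_t element_bytes)
{
    assert(element_bytes > 0 && element_bytes <= SSE_REGISTER_BYTES);
    return SSE_REGISTER_BYTES / element_bytes;
}

/* Vector instructions needed to process n elements, rounding up. */
static size_t vector_ops(size_t n, size_t element_bytes)
{
    size_t lanes = lanes_per_register(element_bytes);
    return (n + lanes - 1) / lanes;
}
```

Under these assumptions, a loop over n elements needs half as many
vector instructions in single precision as in double precision, even
when the data fits comfortably in cache.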

2. An implementation on a 64-bit machine can probably represent
single-precision numbers as unboxed immediate values but would probably
have to box double-precision numbers. This may well make
single-precision arithmetic in Scheme faster than double-precision
arithmetic in Scheme, even if both are equally fast at the hardware
level. Of course, more sophisticated compilers may be able to unbox both
types in some circumstances.

I know of no implementation that has unboxed singles, but 64-bit
machines are here and now and this is an obvious optimization. To take
full advantage of this, we would need a version of SRFI 77 that is
specific to s-exponent numbers. Now, let us consider a standard that
mandated a version of SRFI 77 that worked with s-exponent numbers and
another that worked with e-exponent numbers. If a Scheme implementation
uses only one floating-point representation, the two versions will be
identical. If it has unboxed single-precision numbers, the first version
will be more efficient than the second, albeit at some cost in precision.
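
The unboxed representation in the second example might look roughly like
the following C sketch. The layout is hypothetical and not taken from any
existing implementation: a 64-bit value word with a 3-bit type tag in the
low bits and the 32 IEEE bits of the single in the upper half. The names
value_t, TAG_SINGLE, make_single, single_value, and single_add are
invented for illustration, and the code assumes a 4-byte float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: the low 3 bits are a type tag, and an immediate
   single keeps its 32 IEEE bits in the upper half of the word. */
#define TAG_MASK   UINT64_C(0x7)
#define TAG_SINGLE UINT64_C(0x5)   /* arbitrary tag value for this sketch */

typedef uint64_t value_t;

/* True if the word holds an immediate single-precision number. */
static int is_single(value_t v)
{
    return (v & TAG_MASK) == TAG_SINGLE;
}

/* Pack a float into a value word; no heap allocation is needed. */
static value_t make_single(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* well-defined way to read the bits */
    return ((value_t)bits << 32) | TAG_SINGLE;
}

/* Recover the float from a value word tagged as a single. */
static float single_value(value_t v)
{
    uint32_t bits;
    float f;
    assert(is_single(v));
    bits = (uint32_t)(v >> 32);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Addition on immediates: shift, add, shift back, with no boxing. */
static value_t single_add(value_t a, value_t b)
{
    return make_single(single_value(a) + single_value(b));
}
```

A double needs all 64 bits for its payload, so in a scheme like this it
has no room for a tag and would normally live in a heap-allocated box,
which is where the extra cost would come from.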

Is this second example at all convincing?

Regards,

Alan
--
Dr Alan Watson
Centro de Radioastronomía y Astrofísica
Universidad Nacional Autónoma de México