Proposed erratum or PFN: remove f8 storage classes John Cowan (13 Mar 2023 06:27 UTC)
Re: Proposed erratum or PFN: remove f8 storage classes Bradley Lucier (13 Mar 2023 20:36 UTC)
Re: Proposed erratum or PFN: remove f8 storage classes John Cowan (14 Mar 2023 16:14 UTC)
Re: Proposed erratum or PFN: remove f8 storage classes Bradley Lucier (22 Mar 2023 21:18 UTC)
Re: Proposed erratum or PFN: remove f8 storage classes John Cowan (31 Mar 2023 15:31 UTC)
Re: Proposed erratum or PFN: remove f8 storage classes Bradley Lucier (31 Mar 2023 18:22 UTC)
Re: Proposed erratum or PFN: remove f8 storage classes Alex Shinn (28 Mar 2023 15:43 UTC)

Re: Proposed erratum or PFN: remove f8 storage classes Bradley Lucier 13 Mar 2023 20:36 UTC

On 3/13/23 2:27 AM, John Cowan wrote:
>
> Currently, SRFI 231 requires an 8-bit floating point storage class.
> However:
>
> 1) There is no hardware support anywhere for such a thing.
>
> 2) There is no standard of any kind, either IEEE or otherwise, for such
> a thing either.
>
> 3) The sample implementation doesn't provide them.
>
> 4) I researched various articles that discuss them, and basically some
> are in favor of 4-bit exponent and 3-bit mantissa, and others the
> opposite way.  Furthermore, there are different plausible values for the
> exponent bias: 1 is a common value, making 1/8 the minimum positive
> float and 15360 the maximum, but -2 is also a possibility, in which case
> all representable values are integral and the maximum positive float is
> 122880.
>
> It would be trivial to remove the reference from the SRFI and from the
> comments in the implementation file.
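(As a side note, the extremes quoted above for an e4m3 layout do check out, and a few lines of exact arithmetic make the dependence on the bias explicit. The layout assumptions here — 3-bit significand, top exponent code reserved, subnormals at the bottom — are taken from John's message, not from any standard:)

```scheme
;; Hypothetical e4m3 format: 4-bit exponent, 3-bit significand,
;; biased exponent 15 reserved (Inf/NaN), so max biased exponent is 14.
;; Largest finite value: 1.111_2 * 2^(14 - bias) = 15/8 * 2^(14 - bias).
(define (e4m3-max bias)
  (* 15/8 (expt 2 (- 14 bias))))

;; Smallest positive (subnormal) value: 0.001_2 * 2^(1 - bias).
(define (e4m3-min bias)
  (expt 2 (- 1 bias 3)))

(e4m3-max 1)   ; => 15360
(e4m3-min 1)   ; => 1/8
(e4m3-max -2)  ; => 122880
(e4m3-min -2)  ; => 1
```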

Thank you for your proposal; it brings up a number of relevant issues
that I had not considered before.

These two specifically came to mind:

1.  Is an 8-bit floating-point type, arranged in arrays, useful now or
in the reasonably foreseeable future?  (I'll use F8 to denote a general
8-bit floating-point type.)

2.  Can the procedures in SRFI 231 be useful in manipulating arrays of
F8 numbers as things currently stand?

To answer the first, I googled:

https://www.google.com/search?q=training+neural+networks+with+8-bit+floating+point

This brought up a 2018 paper, Training Deep Neural Networks with 8-bit
Floating Point Numbers, that Google Scholar claims has been cited almost
400 times:

https://arxiv.org/abs/1812.08011

and a paper, A Neural Network Training Processor With 8-Bit Shared
Exponent Bias Floating Point and Multiple-Way Fused Multiply-Add Trees,
that describes hardware specifically for 8-bit floating-point numbers:

https://ieeexplore.ieee.org/abstract/document/9515082

and another paper, 8-bit Numerical Formats for Deep Neural Networks,
that investigates one of the issues you mention: the various ways of
interpreting the bit patterns of 8-bit floating-point numbers, and how
useful each variation may be:

https://deepai.org/publication/8-bit-numerical-formats-for-deep-neural-networks

There are further interesting papers on the subject.

The IEEE floating-point standards process has settled somewhat into a
pattern of decennial reports.  I suspect that, given the evident
interest in F8 within machine learning, the working group will adopt at
least one working format for F8 representation (with the associated
implied operations).

To turn to the second question, whether the procedures in SRFI 231 are
useful in manipulating arrays of F8 numbers: I believe, after some
reflection, that the answer is a resounding yes!

Until now, I have thought of arrays as manipulating values of a certain
type: that getters and setters, and specifically array-ref and
array-set!, operate on Scheme values.  That 32-bit floating-point
numbers represent a subset of 64-bit floating-point numbers, and that
there is a more-or-less accepted way to store 64-bit numbers into
32-bit arrays (by rounding to nearest), made the issue cloudy for me.
As your comments imply, there are no 8-bit or 16-bit floating-point
numbers in Scheme implementations, and how to operate on such numbers,
or how best to convert between the formats, is unclear.

But your question caused me to start thinking of manipulating 8-bit
floating-point arrays in terms of manipulating the bit representations
of the elements, without attempting any interpretation as Scheme values.

As long as you don't specifically call an array's getter or setter
(explicitly, or implicitly through array-ref, array-set!, etc.), all
the Bawden-style transformations for slicing, dicing, and rearranging
arrays work just fine.

In particular, these routines would work as-is:

specialized-array-share, array-copy, array-copy!, array-curry,
array-extract, array-tile, array-translate, array-permute,
array-reverse, array-sample, array-stack, array-stack!, array-decurry,
array-decurry!, array-append, array-append!, array-block, array-block!,
specialized-array-reshape.
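To make that concrete, here is a small sketch, using the procedure names from the SRFI itself: the raw F8 bytes are held as uninterpreted u8 bit patterns, and the lattice operations rearrange them using only index arithmetic, never touching an element "as a number".

```scheme
;; Assumes a SRFI 231 implementation is loaded.
;; A 4x4 array of raw F8 bytes, stored as uninterpreted u8 bit
;; patterns (initial bit pattern 0).
(define A
  (make-specialized-array (make-interval '#(4 4))
                          u8-storage-class
                          0))

;; Transpose, then extract the leading 2x2 corner.  Only the index
;; maps change; the stored bytes are never interpreted as numbers.
(define B
  (array-extract (array-permute A '#(1 0))
                 (make-interval '#(2 2))))
```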

array-assign! would not work, because it moves the values of objects,
not their bit representations (which is why you can assign a
u1-storage-class array to a u8-storage-class array); without an
interpretation of what the 8-bit patterns of an F8 array mean, you
couldn't use array-assign! (except from one F8 array to another).

So if you defined an f8-storage-class with the same operations as
u8-storage-class (thinking of the elements as bit patterns, without any
attempt at interpreting them), then all the routines listed above would
work.
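A minimal sketch of that definition, reusing u8-storage-class's component procedures via the SRFI's storage-class accessors (the accessor names below are as I recall them from the SRFI 231 document; double-check against the spec):

```scheme
;; A bit-pattern-only f8 storage class: elements are stored, checked,
;; and copied exactly like u8 bytes; no numeric interpretation is
;; attempted anywhere.
(define f8-storage-class
  (make-storage-class
   (storage-class-getter     u8-storage-class)
   (storage-class-setter     u8-storage-class)
   (storage-class-checker    u8-storage-class)
   (storage-class-maker      u8-storage-class)
   (storage-class-copier     u8-storage-class)
   (storage-class-length     u8-storage-class)
   (storage-class-default    u8-storage-class)
   (storage-class-data?      u8-storage-class)
   (storage-class-data->body u8-storage-class)))
```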

If all other operations on F8 numbers are relegated to an FFI (to be
executed on a graphics or tensor processor, say), then I can imagine
that this SRFI may be useful in preparing that F8 data to be transferred
to, processed by, and then transferred back from, that external processor.

So, without further evidence and discussion, I wouldn't like to remove
the f8-storage-class.

Brad