character-storage-class Lucier, Bradley J (21 Apr 2022 22:58 UTC)
Re: character-storage-class John Cowan (21 Apr 2022 23:03 UTC)
Re: character-storage-class Alex Shinn (22 Apr 2022 08:41 UTC)
Re: character-storage-class Bradley Lucier (23 Apr 2022 18:05 UTC)
Re: character-storage-class John Cowan (23 Apr 2022 20:15 UTC)
Re: character-storage-class Alex Shinn (25 Apr 2022 13:52 UTC)
Re: character-storage-class Bradley Lucier (25 Apr 2022 13:53 UTC)
Re: character-storage-class John Cowan (25 Apr 2022 16:07 UTC)

Re: character-storage-class Bradley Lucier 25 Apr 2022 13:53 UTC

There's a pull request implementing char-storage-class.

On 4/25/22 9:52 AM, Alex Shinn wrote:
> On Sun, Apr 24, 2022 at 5:15 AM John Cowan <xxxxxx@ccil.org> wrote:
>
>     APL does support character arrays of arbitrary dimension, using a
>     convention that trailing spaces are ignored in order to make the
>     array rectangular.  A better convention, assuming that text is being
>     represented, would be to ignore trailing nulls, since nulls should
>     not exist in text.
>
>
> The convention can be up to the user, unless we want utilities to
> convert between native strings and character arrays.
>
> Regarding usefulness - this sort of fixed width with padding is a common
> way to represent strings in neural networks.
>
> --
> Alex
>
>
>     On Sat, Apr 23, 2022 at 2:04 PM Bradley Lucier <xxxxxx@purdue.edu> wrote:
>
>         On 4/22/22 4:41 AM, Alex Shinn wrote:
>          > The bigger question is, is this useful?  The last dimension
>          > would be a sequence of characters, i.e. a string, but all of
>          > the strings in the array would have to consist of the same
>          > number of codepoints, a concept so restricted it is very
>          > close to useless.
>
>         In the sample implementation in 64-bit Gambit, a Unicode string
>         with N codepoints takes 4N bytes, a vector with N characters
>         takes 8N bytes (more or less).
>
>         So this would save some space.
>
>         I'm not insisting that an application interpret the last
>         dimension as strings, but asking whether anyone thinks that
>         applications that work on arrays of characters (if there are
>         any) should be supported by the SRFI.
>
>         Brad
>
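
For readers following the padding discussion quoted above, here is a minimal sketch of the "ignore trailing nulls" convention, written in Python rather than Scheme so it stays independent of any particular SRFI implementation. The names pack, unpack, and estimated_bytes are hypothetical helpers made up for this illustration and are not part of the SRFI or of Gambit. The 4-byte and 8-byte per-character figures are the approximate ones quoted in the thread for 64-bit Gambit, not measurements.

```python
NUL = "\0"  # padding character; assumes real text never contains NUL


def pack(strings, pad=NUL):
    """Pad every string to the longest length, giving a rectangular
    list of rows in which each row is a list of single characters."""
    width = max((len(s) for s in strings), default=0)
    return [list(s.ljust(width, pad)) for s in strings]


def unpack(rows, pad=NUL):
    """Recover the original strings by dropping trailing padding."""
    return ["".join(row).rstrip(pad) for row in rows]


def estimated_bytes(rows, bytes_per_char):
    """Rough storage estimate: element count times bytes per element."""
    return sum(len(row) for row in rows) * bytes_per_char


names = ["car", "cdr", "call/cc"]
rows = pack(names)

# Every row has the same width, so the data is rectangular.
assert len({len(row) for row in rows}) == 1
# The round trip is lossless as long as no string ends in the pad character.
assert unpack(rows) == names

compact = estimated_bytes(rows, 4)  # specialized character storage
boxed = estimated_bytes(rows, 8)    # generic vector of characters
```

Padding with spaces instead (the APL convention mentioned above) only needs pad=" " in both calls, at the cost of losing any trailing spaces that were part of the original strings.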