unbounded chars considered dangerous (Re: strings and char arrays)

Aubrey Jaffer 02 Jan 2005 19:58 UTC

 | Date: Sun, 2 Jan 2005 09:49:04 -0800 (PST)
 | From: bear <xxxxxx@sonic.net>
 |
 | On Sat, 1 Jan 2005, Shiro Kawai wrote:
 |
 | >The new draft (srfi-58-new.html) still says:
 | >
 | >  "All implementations must support the character array type,
 | >  the rank-1 character arrays being strings."
 | >
 | >As Bear pointed out in <xxxxxx@bolt.sonic.net>,
 | >a string may not be implemented as a simple array of characters.
 | >It is always possible to implement array operations on strings
 | >since they can be accessed by index.  However, having distinct
 | >character array objects may be good in some implementations
 | >where indexed access to strings costs more than O(1).
 |
 | It's more than that, actually: string operations such as
 | length-changing mutation may be horribly inefficient on
 | strings-implemented-as-arrays, and array operations such as
 | indexed-reference may be suboptimal on strings.  Presenting strings
 | that look like arrays invites people to implement string operations
 | in terms of array operations, which could result in "worst of both
 | worlds" performance.

I think it is even worse than that!  Does R5RS constrain the number of
distinct characters to be finite?  char->integer maps chars one-to-one
into the integers, which are unbounded.

The core property of uniform arrays is that each element fits into the
same fixed size of storage.  If chars are unbounded, then they don't
qualify.
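
A rough illustration of the fixed-size point, sketched in Python rather
than Scheme: the standard `array` module plays the role of a uniform
array, and the helper name `fits_in_uniform_array` and the sample codes
are made up for this example.  It only shows that a slot of fixed width
rejects any integer code that is too large for it, which is the trouble
an unbounded char type would run into.

```python
from array import array


def fits_in_uniform_array(codes, typecode="B"):
    """Return True if every integer code fits in one slot of `typecode`."""
    slot = array(typecode, [0])  # a single fixed-width storage slot
    try:
        for code in codes:
            slot[0] = code  # raises OverflowError if the code is too wide
    except OverflowError:
        return False
    return True


# Codes of ordinary ASCII characters fit in one unsigned byte each.
ascii_codes = [ord(c) for c in "srfi"]

# A hypothetical char code larger than any 64-bit slot can hold.
wide_codes = ascii_codes + [2 ** 64]

# Every element of a uniform array occupies exactly `itemsize` bytes.
byte_array = array("B", ascii_codes)

if __name__ == "__main__":
    print(fits_in_uniform_array(ascii_codes))      # byte-sized slots
    print(fits_in_uniform_array(wide_codes, "Q"))  # 64-bit slots
    print(byte_array.itemsize, len(byte_array.tobytes()))
```

Whatever slot width is chosen, some sufficiently large code fails to
fit, so a uniform array can only hold chars if the implementation
bounds them.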

I have removed mention of char-arrays from the new version of SRFI-58.