strings draft Tom Lord (22 Jan 2004 04:58 UTC)
Re: strings draft Shiro Kawai (22 Jan 2004 09:46 UTC)
Re: strings draft Tom Lord (22 Jan 2004 17:32 UTC)
Re: strings draft Shiro Kawai (23 Jan 2004 05:03 UTC)
Re: strings draft Tom Lord (24 Jan 2004 00:31 UTC)
Re: strings draft Matthew Dempsky (24 Jan 2004 03:00 UTC)
Re: strings draft Shiro Kawai (24 Jan 2004 03:27 UTC)
Re: strings draft Tom Lord (24 Jan 2004 04:18 UTC)
Re: strings draft Shiro Kawai (24 Jan 2004 04:49 UTC)
Re: strings draft Tom Lord (24 Jan 2004 18:47 UTC)
Re: strings draft Shiro Kawai (24 Jan 2004 22:16 UTC)
Octet vs Char (Re: strings draft) Shiro Kawai (26 Jan 2004 09:58 UTC)
Strings, one last detail. bear (30 Jan 2004 21:12 UTC)
Re: Strings, one last detail. Shiro Kawai (30 Jan 2004 21:43 UTC)
Re: Strings, one last detail. Tom Lord (31 Jan 2004 00:13 UTC)
Re: Strings, one last detail. bear (31 Jan 2004 20:26 UTC)
Re: Strings, one last detail. Tom Lord (31 Jan 2004 20:42 UTC)
Re: Strings, one last detail. bear (01 Feb 2004 02:29 UTC)
Re: Strings, one last detail. Tom Lord (01 Feb 2004 02:44 UTC)
Re: Strings, one last detail. bear (01 Feb 2004 07:53 UTC)
Re: Octet vs Char (Re: strings draft) bear (26 Jan 2004 19:04 UTC)
Re: Octet vs Char (Re: strings draft) Matthew Dempsky (26 Jan 2004 20:12 UTC)
Re: Octet vs Char (Re: strings draft) Matthew Dempsky (26 Jan 2004 20:40 UTC)
Re: Octet vs Char (Re: strings draft) Ken Dickey (27 Jan 2004 04:33 UTC)
Re: Octet vs Char Shiro Kawai (27 Jan 2004 05:12 UTC)
Re: Octet vs Char Tom Lord (27 Jan 2004 05:23 UTC)
Re: Octet vs Char bear (27 Jan 2004 08:35 UTC)
Re: Octet vs Char (Re: strings draft) bear (27 Jan 2004 08:33 UTC)
Re: Octet vs Char (Re: strings draft) Ken Dickey (27 Jan 2004 15:43 UTC)
Re: Octet vs Char (Re: strings draft) bear (27 Jan 2004 19:06 UTC)
Re: Octet vs Char Shiro Kawai (26 Jan 2004 23:39 UTC)
Re: strings draft bear (22 Jan 2004 19:05 UTC)
Re: strings draft Tom Lord (23 Jan 2004 01:53 UTC)
READ-OCTET (Re: strings draft) Shiro Kawai (23 Jan 2004 06:01 UTC)
Re: strings draft bear (23 Jan 2004 07:04 UTC)
Re: strings draft bear (23 Jan 2004 07:20 UTC)
Re: strings draft Tom Lord (24 Jan 2004 00:02 UTC)
Re: strings draft Alex Shinn (26 Jan 2004 01:59 UTC)
Re: strings draft Tom Lord (26 Jan 2004 02:22 UTC)
Re: strings draft bear (26 Jan 2004 02:35 UTC)
Re: strings draft Tom Lord (26 Jan 2004 02:48 UTC)
Re: strings draft Alex Shinn (26 Jan 2004 03:00 UTC)
Re: strings draft Tom Lord (26 Jan 2004 03:14 UTC)
Re: strings draft Shiro Kawai (26 Jan 2004 04:57 UTC)
Re: strings draft Alex Shinn (26 Jan 2004 04:58 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 18:48 UTC)
Re: strings draft bear (24 Jan 2004 02:21 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 02:10 UTC)
Re: strings draft Tom Lord (23 Jan 2004 02:29 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 02:44 UTC)
Re: strings draft Tom Lord (23 Jan 2004 02:53 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 03:04 UTC)
Re: strings draft Tom Lord (23 Jan 2004 03:16 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 03:42 UTC)
Re: strings draft Alex Shinn (23 Jan 2004 02:35 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 02:42 UTC)
Re: strings draft Tom Lord (23 Jan 2004 02:49 UTC)
Re: strings draft Alex Shinn (23 Jan 2004 02:58 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 03:13 UTC)
Re: strings draft Alex Shinn (23 Jan 2004 03:19 UTC)
Re: strings draft Bradd W. Szonye (23 Jan 2004 19:31 UTC)
Re: strings draft Alex Shinn (26 Jan 2004 02:22 UTC)
Re: strings draft Bradd W. Szonye (06 Feb 2004 23:30 UTC)
Re: strings draft Bradd W. Szonye (06 Feb 2004 23:33 UTC)
Re: strings draft Alex Shinn (09 Feb 2004 01:45 UTC)
specifying source encoding (Re: strings draft) Shiro Kawai (09 Feb 2004 02:51 UTC)
Re: strings draft Bradd W. Szonye (09 Feb 2004 03:39 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 03:12 UTC)
Re: strings draft Alex Shinn (23 Jan 2004 03:28 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 03:44 UTC)
Parsing Scheme [was Re: strings draft] Ken Dickey (23 Jan 2004 17:02 UTC)
Re: Parsing Scheme [was Re: strings draft] bear (23 Jan 2004 17:56 UTC)
Re: Parsing Scheme [was Re: strings draft] tb@xxxxxx (23 Jan 2004 18:50 UTC)
Re: Parsing Scheme [was Re: strings draft] Per Bothner (23 Jan 2004 18:56 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (23 Jan 2004 20:26 UTC)
Re: Parsing Scheme [was Re: strings draft] Per Bothner (23 Jan 2004 20:57 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (23 Jan 2004 21:44 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (23 Jan 2004 20:07 UTC)
Re: Parsing Scheme [was Re: strings draft] tb@xxxxxx (23 Jan 2004 21:22 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (23 Jan 2004 22:38 UTC)
Re: Parsing Scheme [was Re: strings draft] tb@xxxxxx (24 Jan 2004 06:48 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (24 Jan 2004 18:41 UTC)
Re: Parsing Scheme [was Re: strings draft] tb@xxxxxx (24 Jan 2004 19:34 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (24 Jan 2004 21:48 UTC)
Re: Parsing Scheme [was Re: strings draft] Ken Dickey (23 Jan 2004 21:47 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (23 Jan 2004 23:22 UTC)
Re: Parsing Scheme [was Re: strings draft] Ken Dickey (25 Jan 2004 01:03 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (25 Jan 2004 03:01 UTC)
Re: strings draft Matthew Dempsky (25 Jan 2004 06:59 UTC)
Re: strings draft Tom Lord (25 Jan 2004 07:16 UTC)
Re: strings draft Matthew Dempsky (26 Jan 2004 23:52 UTC)
Re: strings draft Tom Lord (27 Jan 2004 00:30 UTC)

Re: strings draft Shiro Kawai 22 Jan 2004 09:46 UTC

I think the goal of the document is a bit ambiguous.  Specifically,
I feel there are three issues intermixed; they are related each
other, but it'll be clearer to separate the discussion as much
as possible.

1. Defining unicode API: "If a Scheme implementation supports
   API that explicitly deals with Unicode, it should be so and so".

   This goal tries to define a common API to interface the outer
   Unicode world, while allowing the implemenation to choose their
   internal representations.

2. Addressing unicode-specific issues: "If a Scheme implementation
   uses Unicode as its native character representation, it should be
   so and so".

   Some issues raised in the document are based on the assumption
   that the implementation uses Unicode or iso-8859-* CCS (coded
   character set) in its native representation.

   If the document limit its scope to "the implementations that uses
   Unicode/iso8859-* internally", it's fine.  Is that the intention
   of the document?

    * If the implementation uses EUCJP as its internal CES, it
      will face difficulty for the recommendation of INTEGER->CHAR
      to support [0,255], since EUCJP does not have full mappings
      in this range, although it has much more characters than 256.
      I think it's possible that (integer->char #xa1) on such
      implementations returns a "pseudo character", which doesn't
      corresponds to any character in EUCJP CES but is guaranteed
      that to produce #xa1 when passed to char->integer.  But the
      effects would be undefined if such a character is used within
      a string.  (An implementation can also choose different
      integers than the codepoint value to fulfill this "256 character"
      requirements, but it wouldn't be intuitive).

    * "What _is_ a Character" section doesn't refer to an
      implementation where a CHAR? value corresponts to a
      codepoint of non-Unocde, non-iso8859-* CCS/CES.

    * In the portable FFI section, some APIs state the encoding
      must be one of utf-8, iso8859-* or ascii, and I don't see
      the reason of such restrictions.

3. Determining the least common set of assumptions about characters
   and strings the language/FFI spec should define.

   Mostly in "R6RS recommendation" section.  Some of them seem
   try to be codeset-independent, while some of them seem to
   assume Unicode/iso8859-* codeset implicitly.  So I wonder
   which is the intention of the document.

Another issue: is there a rationale about "strong encouragement"
of O(1) access of string-ref and string-set!?   There are
alrogithms that truly need random access, but in many cases,
index is used just to mark certain location of the string;
e.g. if you want (string-ref str 3), it's rare that you know
'3' is significant before you know about str---it's more likely
that somebody (string search function, regexp matcher, or suffix
database...) told you that the 3rd character of a particular
string in str is significant.  In such cases, the reason you
use index is not because the algorithm requires it, but just
one of the possible means to have a reference within a string.
I feel that accessing strings by index is a kind of premature
optimization, which came from the history when strings were
simply an array of bytes.    Some algorithms may still require
random access, but we already have a primitive datatype that
guarantee O(1) access---a vector.   I don't think we should
make a string just a vector with a restricted element type.

--shiro