Email list hosting service & mailing list manager

strings draft Tom Lord (22 Jan 2004 05:11 UTC)
Re: strings draft Shiro Kawai (22 Jan 2004 09:46 UTC)
Re: strings draft Tom Lord (22 Jan 2004 17:45 UTC)
Re: strings draft Shiro Kawai (23 Jan 2004 05:03 UTC)
Re: strings draft Tom Lord (24 Jan 2004 00:45 UTC)
Re: strings draft Matthew Dempsky (23 Jan 2004 20:01 UTC)
Re: strings draft Shiro Kawai (24 Jan 2004 03:26 UTC)
Re: strings draft Tom Lord (24 Jan 2004 04:31 UTC)
Re: strings draft Shiro Kawai (24 Jan 2004 04:49 UTC)
Re: strings draft Tom Lord (24 Jan 2004 19:01 UTC)
Re: strings draft Shiro Kawai (24 Jan 2004 22:15 UTC)
Octet vs Char (Re: strings draft) Shiro Kawai (26 Jan 2004 09:58 UTC)
Strings, one last detail. bear (30 Jan 2004 21:12 UTC)
Re: Strings, one last detail. Shiro Kawai (30 Jan 2004 21:43 UTC)
Re: Strings, one last detail. Tom Lord (31 Jan 2004 00:27 UTC)
Re: Strings, one last detail. bear (31 Jan 2004 20:25 UTC)
Re: Strings, one last detail. Tom Lord (31 Jan 2004 20:56 UTC)
Re: Strings, one last detail. bear (01 Feb 2004 02:28 UTC)
Re: Strings, one last detail. Tom Lord (01 Feb 2004 02:58 UTC)
Re: Strings, one last detail. bear (01 Feb 2004 07:53 UTC)
Re: Octet vs Char (Re: strings draft) bear (26 Jan 2004 19:04 UTC)
Re: Octet vs Char (Re: strings draft) Matthew Dempsky (26 Jan 2004 13:13 UTC)
Re: Octet vs Char (Re: strings draft) Matthew Dempsky (26 Jan 2004 13:41 UTC)
Re: Octet vs Char Shiro Kawai (26 Jan 2004 23:38 UTC)
Re: Octet vs Char (Re: strings draft) Ken Dickey (26 Jan 2004 19:40 UTC)
Re: Octet vs Char Shiro Kawai (27 Jan 2004 05:10 UTC)
Re: Octet vs Char Tom Lord (27 Jan 2004 05:37 UTC)
Re: Octet vs Char bear (27 Jan 2004 08:35 UTC)
Re: Octet vs Char (Re: strings draft) bear (27 Jan 2004 08:32 UTC)
Re: Octet vs Char (Re: strings draft) Ken Dickey (27 Jan 2004 06:50 UTC)
Re: Octet vs Char (Re: strings draft) bear (27 Jan 2004 19:06 UTC)
Re: strings draft bear (22 Jan 2004 19:05 UTC)
Re: strings draft Tom Lord (23 Jan 2004 02:06 UTC)
READ-OCTET (Re: strings draft) Shiro Kawai (23 Jan 2004 06:00 UTC)
Re: strings draft bear (23 Jan 2004 07:04 UTC)
Re: strings draft bear (23 Jan 2004 07:20 UTC)
Re: strings draft Tom Lord (24 Jan 2004 00:15 UTC)
Re: strings draft Alex Shinn (26 Jan 2004 01:58 UTC)
Re: strings draft Tom Lord (26 Jan 2004 02:35 UTC)
Re: strings draft bear (26 Jan 2004 02:35 UTC)
Re: strings draft Tom Lord (26 Jan 2004 03:01 UTC)
Re: strings draft Alex Shinn (26 Jan 2004 03:00 UTC)
Re: strings draft Tom Lord (26 Jan 2004 03:27 UTC)
Re: strings draft Shiro Kawai (26 Jan 2004 04:57 UTC)
Re: strings draft Alex Shinn (26 Jan 2004 04:57 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 18:48 UTC)
Re: strings draft bear (24 Jan 2004 02:21 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 02:09 UTC)
Re: strings draft Tom Lord (23 Jan 2004 02:42 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 02:44 UTC)
Re: strings draft Tom Lord (23 Jan 2004 03:07 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 03:04 UTC)
Re: strings draft Tom Lord (23 Jan 2004 03:29 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 03:42 UTC)
Re: strings draft Alex Shinn (23 Jan 2004 02:34 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 02:42 UTC)
Re: strings draft Tom Lord (23 Jan 2004 03:02 UTC)
Re: strings draft Alex Shinn (23 Jan 2004 02:58 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 03:13 UTC)
Re: strings draft Alex Shinn (23 Jan 2004 03:18 UTC)
Re: strings draft Bradd W. Szonye (23 Jan 2004 19:31 UTC)
Re: strings draft Alex Shinn (26 Jan 2004 02:21 UTC)
Re: strings draft Bradd W. Szonye (06 Feb 2004 23:30 UTC)
Re: strings draft Bradd W. Szonye (06 Feb 2004 23:33 UTC)
Re: strings draft Alex Shinn (09 Feb 2004 01:45 UTC)
specifying source encoding (Re: strings draft) Shiro Kawai (09 Feb 2004 02:51 UTC)
Re: strings draft Bradd W. Szonye (09 Feb 2004 03:39 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 03:12 UTC)
Re: strings draft Alex Shinn (23 Jan 2004 03:28 UTC)
Re: strings draft tb@xxxxxx (23 Jan 2004 03:44 UTC)
Parsing Scheme [was Re: strings draft] Ken Dickey (23 Jan 2004 08:07 UTC)
Re: Parsing Scheme [was Re: strings draft] bear (23 Jan 2004 17:55 UTC)
Re: Parsing Scheme [was Re: strings draft] tb@xxxxxx (23 Jan 2004 18:50 UTC)
Re: Parsing Scheme [was Re: strings draft] Per Bothner (23 Jan 2004 18:56 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (23 Jan 2004 20:39 UTC)
Re: Parsing Scheme [was Re: strings draft] Per Bothner (23 Jan 2004 20:57 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (23 Jan 2004 21:57 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (23 Jan 2004 20:20 UTC)
Re: Parsing Scheme [was Re: strings draft] tb@xxxxxx (23 Jan 2004 21:22 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (23 Jan 2004 22:52 UTC)
Re: Parsing Scheme [was Re: strings draft] tb@xxxxxx (24 Jan 2004 06:48 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (24 Jan 2004 18:55 UTC)
Re: Parsing Scheme [was Re: strings draft] tb@xxxxxx (24 Jan 2004 19:34 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (24 Jan 2004 22:02 UTC)
Re: Parsing Scheme [was Re: strings draft] Ken Dickey (23 Jan 2004 12:53 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (23 Jan 2004 23:35 UTC)
Re: Parsing Scheme [was Re: strings draft] Ken Dickey (24 Jan 2004 16:10 UTC)
Re: Parsing Scheme [was Re: strings draft] Tom Lord (25 Jan 2004 03:14 UTC)
Re: strings draft Matthew Dempsky (25 Jan 2004 00:00 UTC)
Re: strings draft Tom Lord (25 Jan 2004 07:29 UTC)
Re: strings draft Matthew Dempsky (26 Jan 2004 16:53 UTC)
Re: strings draft Tom Lord (27 Jan 2004 00:44 UTC)

Re: strings draft Tom Lord 24 Jan 2004 00:15 UTC


    > From: bear <xxxxxx@sonic.net>

    > >    > >   While R6RS should not require that CHAR? be a subset of Unicode,
    > >    > >   it should specify the semantics of string indexes for strings
    > >    > >   which _are_ subsets of Unicode.

    > I'll go along with that, and I have no trouble conforming.  In
    > that case the "large characters" may simply be regarded as lying
    > outside unicode.

An interesting interpretation that I hadn't intended to permit but I
don't see any problem with it (for R6RS).   None at all.  Sweet,
sweet, sweet.

Maybe beyond R6RS we can duke it out later over some Unicode SRFIs :-)
(Actually, my hunch is there isn't even _much_ (a little, perhaps) to
disagree about there.)

    > >    > Computing the codepoint-index on demand would require a traversal
    > >    > of the string, an O(N) operation, using my current representation.
    > >    > That's clearly intolerable. But in the same tree structure where I
    > >    > now just keep character indexes, I can add additional fields for
    > >    > codepoint indexes as well, making it an O(log N) operation.

    > > And if you were to use self-balancing trees, it would be an
    > > expected-case O(1) operation.

    > ???  Balanced trees are still trees, and if the tree is exp(n) long
    > there are still O(n) links to follow from the root to a leaf.  Can
    > you explain what you mean?

I mean that you should (maybe, don't know enough about your apps and
environment) use splay trees so that string algorithms displaying a
reasonable degree of locality will be able to locate indexes in
expected-case O(1).  Of course if memory constraints and usage
guarantee that your particular non-splay trees or guaranteed to be
shallow -- that's just as good.

    > > Only because you are in a mindset where you want CHAR? to be a
    > > combining char sequence.  There are so many problems with permitting
    > > that as a conformant Scheme that I think it has to be rejected.  You
    > > need to pick a different type for what you currently call CHAR?.

    > I want char? to be a character, and to never have to care, on the
    > scheme side of things, about how exactly it's represented, whether
    > it's in unicode as a single codepoint or several, or etc.  I don't
    > give a flying leap whether unicode regards it as a combining
    > sequence or not, and I don't want to *have* to give a flying leap
    > about whether it's represented as a combining sequence or not.  If
    > it functions linguistically as a character, I want to be able to
    > write scheme code that treats it as a character.  I don't want to
    > ever have to worry or think about the representation, until I'm doing
    > I/O or passing a value across an FFI and the representation becomes
    > important.

I think that the "In bear's Scheme, a string containing a
(non-unitary) combining sequence simply doesn't count as a string of
Unicode characters" interpretation eliminates all the conflicts I was
worried about between my proposal and your implementation.  Perfect.
Excellent.  Sweet, sweet, sweet.

-t