Re: the "Unicode Background" section Thomas Lord (22 Jul 2005 18:54 UTC)
Re: the "Unicode Background" section John.Cowan (23 Jul 2005 07:57 UTC)

Re: the "Unicode Background" section John.Cowan 23 Jul 2005 07:57 UTC

Thomas Lord scripsit:

> Permitting unpaired surrogates does not damage interoperability
> -- programs need only avoid trying to transmit them on channels
> where strictly well-formed UTF-* is called for.

In fact, it is not ill-formed to have an unpaired surrogate in *any*
UTF encoding; it's just semantically meaningless.
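
As a rough illustration of the "channel" point in the quoted text, here is how
one runtime treats a string holding a lone surrogate. CPython 3 is an assumption
here (the thread itself concerns Scheme), and the helper name try_strict_utf8 is
hypothetical. The string exists in memory, the default strict UTF-8 codec
declines to encode it, and the opt-in "surrogatepass" handler writes it out.
This shows one implementation's choices only, not a ruling on what the Unicode
Standard counts as ill-formed.

```python
# A lone high surrogate (U+D800) inside an in-memory string.
# CPython permits this; other runtimes may not.
s = "a\ud800b"


def try_strict_utf8(text):
    """Return the UTF-8 bytes, or None if the strict codec refuses the text.

    Hypothetical helper, used only for this illustration.
    """
    try:
        return text.encode("utf-8")
    except UnicodeEncodeError:
        return None


# The default (strict) error handler rejects the unpaired surrogate.
strict = try_strict_utf8(s)

# The opt-in "surrogatepass" handler encodes the surrogate code point anyway.
lenient = s.encode("utf-8", "surrogatepass")

# Decoding with the same handler recovers the original string.
roundtrip = lenient.decode("utf-8", "surrogatepass")

print(strict, lenient, roundtrip == s)
```

A program following the quoted advice would keep such strings internal and use
the strict path whenever it writes to a channel that demands well-formed output.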

> In my view, DISPLAY (in R6RS, not forever) should be undefined in that
> case (and in all cases where a string contains a non-8-bit-character) --

There are no such things as "8-bit characters" per se.  There are a variety
of 8-bit encodings that allow up to 256 characters, but they are not the
same characters in all cases.
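
A small sketch of that point, assuming Python 3 and three 8-bit code pages
chosen only as examples (Latin-1, CP437, KOI8-R): the same byte value decodes to
a different character under each one.

```python
# One byte value, interpreted under three different 8-bit encodings.
# The choice of encodings is illustrative, not taken from the thread.
raw = bytes([0xE9])

encodings = ["latin-1", "cp437", "koi8_r"]

# Map each encoding name to the character it assigns to byte 0xE9.
decoded = {name: raw.decode(name) for name in encodings}

for name, ch in decoded.items():
    # Show the resulting code point so the difference is visible
    # even on a terminal that cannot render the glyph.
    print(f"{name:8} U+{ord(ch):04X} {ch}")
```

Each encoding maps the byte to a different code point, so "the character with
value 0xE9" has no meaning until an encoding is named.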

--
John Cowan    http://www.ccil.org/~cowan   <xxxxxx@reutershealth.com>
    "Any legal document draws most of its meaning from context.  A telegram
    that says 'SELL HUNDRED THOUSAND SHARES IBM SHORT' (only 190 bits in
    5-bit Baudot code plus appropriate headers) is as good a legal document
    as any, even sans digital signature." --me