Re: the "Unicode Background" section
John.Cowan 23 Jul 2005 07:57 UTC
Thomas Lord scripsit:
> Permitting unpaired surrogates does not damage interoperability
> -- programs need only avoid trying to transmit them on channels
> where strictly well-formed UTF-* is called for.
In fact, it is not ill-formed to have an unpaired surrogate in *any*
UTF encoding; it's just semantically meaningless.
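
As a concrete illustration, here is a minimal Python sketch (Python is chosen
only for illustration; nothing in this thread depends on it). It shows that a
lone surrogate can be mechanically carried through UTF-8, and that whether a
given codec accepts it is a matter of that codec's policy: Python's default
strict codec refuses, while its "surrogatepass" error handler lets it through.

```python
# A lone (unpaired) high surrogate, U+D800.
lone = "\ud800"

# Python's default (strict) UTF-8 codec refuses to encode it.
try:
    lone.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# The "surrogatepass" error handler encodes it mechanically anyway,
# and the same handler decodes the bytes back to the lone surrogate.
raw = lone.encode("utf-8", "surrogatepass")
round_trip = raw.decode("utf-8", "surrogatepass")

print(strict_ok, raw, round_trip == lone)
```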
> In my view, DISPLAY (in R6RS, not forever) should be undefined in that
> case (and in all cases where a string contains a non-8-bit-character) --
There is no such thing as an "8-bit character" per se. There are a variety
of 8-bit encodings, each allowing up to 256 characters, but they do not all
encode the same characters.
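
A minimal Python sketch of the point, assuming three 8-bit encodings picked
purely as examples (Latin-1, CP437 and KOI8-R): the same byte value decodes to
a different character under each one.

```python
# One 8-bit value, interpreted under three different 8-bit encodings.
byte = b"\xe9"

# Map each (illustrative) encoding name to the character it assigns to 0xE9.
decoded = {name: byte.decode(name) for name in ("latin-1", "cp437", "koi8_r")}

# Print code points rather than glyphs so the output does not depend on
# what the terminal can display.
for name, ch in decoded.items():
    print(f"{name}: U+{ord(ch):04X}")
```

So "the character with value 0xE9" names nothing until an encoding is given.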
--
John Cowan http://www.ccil.org/~cowan <xxxxxx@reutershealth.com>
"Any legal document draws most of its meaning from context. A telegram
that says 'SELL HUNDRED THOUSAND SHARES IBM SHORT' (only 190 bits in
5-bit Baudot code plus appropriate headers) is as good a legal document
as any, even sans digital signature." --me