Re: the "Unicode Background" section
Thomas Bushnell BSG 21 Jul 2005 23:10 UTC
Thomas Lord <xxxxxx@emf.net> writes:
> The Unicode Background section of the new draft has
>
> > It is thus appropriate to define Scheme characters as Unicode scalar
> > values, which includes all code points except those designated as
> > surrogates.
>
> That seems wrong-headed to me. Characters should simply
> be codepoints, instead.
A second ago you were saying that we should not be arguing about how
high-level characters are. I think characters should be graphemes.
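
To make the three candidate definitions concrete, here is a minimal sketch in Python rather than Scheme. It is not taken from the draft, and the helper name is_scalar_value is hypothetical. It shows that a scalar value is a code point outside the surrogate range, and that one user-perceived character (a grapheme) can be one or two code points depending on normalization.

```python
import unicodedata

def is_scalar_value(cp: int) -> bool:
    """True if cp is a code point outside the surrogate range U+D800..U+DFFF."""
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)

# One user-perceived character, "e" with an acute accent, in two forms.
decomposed = "e\u0301"  # U+0065 followed by U+0301 COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)  # the single code point U+00E9

# Python's len() counts code points, not graphemes, so the two forms differ.
print(len(decomposed), len(composed))
```

Under a codepoint or scalar-value definition the two forms have different lengths; under a grapheme definition both would be a single character.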
> If CHARs are codepoints, more basic Unicode algorithms translate
> into Scheme cleanly.
Those algorithms all deal with encodings, and should therefore, it
seems to me, be in the interface between arrays-of-integers and
strings. Strings are not arrays-of-integers!
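
As a rough illustration of that boundary, again in Python and only as a sketch: the encoding step is an explicit conversion between a sequence of integers and a string, and the string itself exposes no integers. The variable names are illustrative only.

```python
# A string, and the array of integers (UTF-8 code units) that encodes it.
text = "\u00e9"                      # "e" with an acute accent
octets = list(text.encode("utf-8"))  # the encoding lives at this boundary

# Decoding goes the other way: integers in, string out.
round_trip = bytes(octets).decode("utf-8")

print(octets, round_trip == text)
```

The encoding algorithm appears only in the encode and decode calls; code that works on the string never sees the octets.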
> If CHARs are codepoints, they have simple algebraic properties
> in relation to integers.
Except characters are not integers. Scheme is not C.
Thomas