the "Unicode Background" section Thomas Lord (21 Jul 2005 22:45 UTC)
Re: the "Unicode Background" section Thomas Bushnell BSG (21 Jul 2005 23:10 UTC)
Re: the "Unicode Background" section Matthew Flatt (21 Jul 2005 23:52 UTC)

Re: the "Unicode Background" section Thomas Bushnell BSG 21 Jul 2005 23:10 UTC

Thomas Lord <xxxxxx@emf.net> writes:

> The Unicode Background section of the new draft has
>
>   > It is thus appropriate to define Scheme characters as Unicode scalar
>   > values, which includes all code points except those designated as
>   > surrogates.
>
> That seems wrong-headed to me.   Characters should simply
> be codepoints, instead.

A second ago you were saying that we should not be arguing about how
high-level characters are.  I think characters should be graphemes.
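
To make the three candidate meanings of "character" concrete, here is a
minimal sketch in Python (not Scheme). The function names are hypothetical,
and the grouping rule is a rough approximation assumed for illustration; it
is not the grapheme cluster algorithm from Unicode Standard Annex #29.

```python
import unicodedata

SURROGATE_FIRST = 0xD800   # first surrogate code point
SURROGATE_LAST = 0xDFFF    # last surrogate code point
MAX_CODE_POINT = 0x10FFFF  # top of the Unicode code space


def is_code_point(n):
    """True for any integer in the Unicode code space, surrogates included."""
    return 0 <= n <= MAX_CODE_POINT


def is_scalar_value(n):
    """True for code points that are not surrogates."""
    return is_code_point(n) and not (SURROGATE_FIRST <= n <= SURROGATE_LAST)


def rough_clusters(s):
    """Attach each combining mark to the code point before it.

    Assumption: this only approximates graphemes. It ignores Hangul
    jamo, emoji sequences, and the other rules a real segmenter needs.
    """
    clusters = []
    for ch in s:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters
```

Under this sketch, 0xD800 is a code point but not a scalar value, and
"e" followed by U+0301 (combining acute accent) is two code points that
group into a single cluster.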

> If CHARs are codepoints, more basic Unicode algorithms translate
> into Scheme cleanly.

Those algorithms all deal with encodings, and should therefore, it
seems to me, be in the interface between arrays-of-integers and
strings.  Strings are not arrays-of-integers!
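
A minimal sketch of that boundary, again in Python rather than Scheme:
encoding and decoding are the only places where the integers show up, and
the string itself is treated as opaque. The function names are hypothetical,
and UTF-8 is assumed only as an example encoding.

```python
def string_to_utf8_octets(s):
    """Cross the boundary outward: string -> list of integers (UTF-8 octets)."""
    return list(s.encode("utf-8"))


def utf8_octets_to_string(octets):
    """Cross the boundary inward: list of integers -> string.

    Raises UnicodeDecodeError if the octets are not well-formed UTF-8.
    """
    return bytes(octets).decode("utf-8")
```

Python's UTF-8 codec also refuses to encode a lone surrogate such as U+D800
(it raises UnicodeEncodeError), which is one place where the difference
between code points and scalar values becomes visible.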

> If CHARs are codepoints, they have simple algebraic properties
> in relation to integers.

Except characters are not integers.  Scheme is not C.

Thomas