Re: constant-time access to variable-width encodings
Thomas Bushnell BSG 14 Jul 2005 07:18 UTC
Per Bothner <xxxxxx@bothner.com> writes:
> Huh? A glyph depends on a specific font. No way can we define Scheme
> characters in terms of glyphs.
Not in Unicode speak.
> Do you mean a (canonicalized) composite (combining) sequence? One
> problem is you can't practically map one of those to a fixed-length
> integer value, so we have to give up char->integer and
> integer->char.
Says who? char->integer does not say anything about a "fixed-length
integer value". You sound like a C programmer! Scheme doesn't have a
concept of "fixed-length integer value" anyhow.
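To make that concrete, here is a rough sketch (in Python rather than Scheme, only because its integers are likewise unbounded) of how a whole combining sequence could map to a single exact integer and back. The names cluster_to_integer and integer_to_cluster, the 21-bit packing, and the choice of NFC are all my own assumptions for illustration; nothing here is proposed wording for char->integer.

```python
import unicodedata

# Assumption: pack each code point into 21 bits, since every Unicode
# code point is below 0x110000, which is less than 2**21.
BITS = 21
MASK = (1 << BITS) - 1


def cluster_to_integer(cluster: str) -> int:
    """Map a combining sequence to one unbounded integer (hypothetical helper)."""
    # Canonicalize first so equivalent sequences yield the same integer.
    normalized = unicodedata.normalize("NFC", cluster)
    value = 1  # sentinel bit, so leading U+0000 code points are not lost
    for ch in normalized:
        value = (value << BITS) | ord(ch)
    return value


def integer_to_cluster(value: int) -> str:
    """Invert cluster_to_integer (hypothetical helper)."""
    code_points = []
    while value > 1:  # stop when only the sentinel bit remains
        code_points.append(value & MASK)
        value >>= BITS
    return "".join(chr(cp) for cp in reversed(code_points))
```

The result simply grows with the length of the sequence, which is no problem for a language whose integers are not fixed-width.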
> Nonetheless, Java defines the String equals routine in terms of code
> point equality, and Java programmers manage to get useful work done.
Yes, by failing to implement Unicode correctly.
If you don't care about correct Unicode implementation, fine, but
please don't create a messy standard that *prevents* those who do care
from doing it right.
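
For anyone who wants to see the difference between code point equality and what Unicode calls canonical equivalence, here is a small sketch. It is in Python, not Java or Scheme, and the function names are hypothetical; I am also assuming that "correct" here means comparing under canonical equivalence, which is only one reading.

```python
import unicodedata


def code_point_equal(a: str, b: str) -> bool:
    """Compare two strings code point by code point (Python's default ==)."""
    return a == b


def canonically_equal(a: str, b: str) -> bool:
    """Compare under canonical equivalence by normalizing both sides to NFD."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# The two spellings denote the same text but differ as code point sequences.
differs_by_code_point = not code_point_equal(precomposed, decomposed)
same_canonically = canonically_equal(precomposed, decomposed)
```

A comparison defined purely on code points treats the two spellings as different strings; the normalizing comparison treats them as the same.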