Re: Why are byte ports "ports" as such?
Jonathan S. Shapiro 23 May 2006 18:43 UTC
On Tue, 2006-05-23 at 14:15 -0400, John Cowan wrote:
> Jonathan S. Shapiro scripsit:
> > Unfortunately it is quite wrong, which is something that the UNICODE
> > people go to great lengths to make clear (and, IMO, a serious failing of
> > UNICODE).
>
> It's not *wrong*. It's not a matter of the Right Thing and the Wrong Thing.
> For some purposes, code units (8 or 16 or 32 bits) are the Right Thing;
> for some purposes, codepoints are; for some purposes, higher-level units
> are. It's about appropriate choices.
I did not mean "wrong" in the sense of "immoral, unethical, or
fattening". I meant "wrong" in the sense of "incorrect or inaccurate".
For better or worse, the real world has decided that characters are not
code points. Given that this is true, I am simply suggesting that it is
a mistake to mislabel code points as characters through poor choices of
names for the standard procedures.
READ-CHAR must conceptually be built on top of READ-CODEPOINT, which in
turn must conceptually be built on top of READ-BYTE. From our experience
in BitC, it appears that READ-CODEPOINT is sufficient to implement the
compiler/interpreter, and READ-CHAR can therefore be implemented as a
library procedure. A sketch of that layering follows.
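To make the layering concrete, here is a minimal sketch assuming UTF-8
byte ports and a READ-BYTE that returns one octet or the eof object. The
names and the error handling are illustrative only, not a proposal for
specific signatures:

;; READ-CODEPOINT: decode one UTF-8 sequence from a byte port and
;; return the code point as an exact integer, or the eof object.
(define (read-codepoint port)
  (let ((b0 (read-byte port)))
    (cond
      ((eof-object? b0) b0)
      ((< b0 #x80) b0)                                  ; 1-byte sequence
      ((< b0 #xC0) (error "unexpected continuation byte"))
      ((< b0 #xE0) (decode-rest port (- b0 #xC0) 1))    ; 2-byte sequence
      ((< b0 #xF0) (decode-rest port (- b0 #xE0) 2))    ; 3-byte sequence
      (else        (decode-rest port (- b0 #xF0) 3))))) ; 4-byte sequence

;; Fold N continuation bytes onto the bits taken from the leading byte.
(define (decode-rest port acc n)
  (if (zero? n)
      acc
      (let ((b (read-byte port)))
        (if (eof-object? b)
            (error "truncated UTF-8 sequence")
            (decode-rest port (+ (* acc 64) (- b #x80)) (- n 1))))))

;; READ-CHAR as a library procedure on top of READ-CODEPOINT: it simply
;; wraps the code point in a character object, which is exactly the
;; character/code-point conflation whose naming is at issue here.
(define (my-read-char port)
  (let ((cp (read-codepoint port)))
    (if (eof-object? cp) cp (integer->char cp))))

Note that READ-CODEPOINT knows nothing about characters in the Unicode
sense; that is why it suffices for a compiler/interpreter, and why the
character-level procedure can live in a library.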
> And if Unicode is complicated (and it is), it's because it's embedded in
> a complicated world.
Indeed.