Re: Why are byte ports "ports" as such?
John Cowan 23 May 2006 18:15 UTC
Jonathan S. Shapiro scripsit:
> > Unless char is defined to be a code point. Which is IMHO the most
> > reasonable choice: code points are the natural atomic units of Unicode
> > text, and most Unicode algorithms are expressed in terms of code points.
>
> In many respects I agree that this would be sensible from the
> programmer's perspective.
>
> Unfortunately it is quite wrong, which is something that the UNICODE
> people go to great lengths to make clear (and, IMO, a serious failing of
> UNICODE).

It's not *wrong*. It's not a matter of the Right Thing and the Wrong Thing.
For some purposes, code units (8 or 16 or 32 bits) are the Right Thing;
for some purposes, codepoints are; for some purposes, higher-level units
are. It's about appropriate choices.

And if Unicode is complicated (and it is), it's because it's embedded in
a complicated world.
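
As an illustration of the three levels named above (not taken from the
thread itself), here is a minimal Python sketch that counts one short
string in code units, in code points, and in a rough higher-level unit.
The function name unit_counts and the sample string are hypothetical, and
the "non_combining" count is only a crude stand-in for user-perceived
characters; it is not the grapheme cluster segmentation that Unicode
defines.

```python
import unicodedata


def unit_counts(text: str) -> dict:
    """Count the same text at several levels of granularity."""
    return {
        # Code units: the fixed-width pieces of each encoding form.
        "utf8_code_units": len(text.encode("utf-8")),
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        "utf32_code_units": len(text.encode("utf-32-le")) // 4,
        # Code points: a Python str is a sequence of code points.
        "code_points": len(text),
        # Crude higher-level unit (an assumption for illustration only):
        # count code points whose canonical combining class is zero.
        "non_combining": sum(
            1 for ch in text if unicodedata.combining(ch) == 0
        ),
    }


# 'e' + COMBINING ACUTE ACCENT + MUSICAL SYMBOL G CLEF (outside the BMP)
sample = "e\u0301\U0001D11E"
counts = unit_counts(sample)

for name, value in counts.items():
    print(f"{name}: {value}")
```

The same three-code-point string comes out as a different length at each
level, which is the point: which count is "the" length depends on what
the program is trying to do with the text.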
--
John Cowan xxxxxx@ccil.org http://www.ccil.org/~cowan
Most languages are dramatically underdescribed, and at least one is
dramatically overdescribed. Still other languages are simultaneously
overdescribed and underdescribed. Welsh pertains to the third category.
--Alan King