Re: Why are byte ports "ports" as such?

Thomas Bushnell BSG 24 May 2006 04:45 UTC
Per Bothner <xxxxxx@bothner.com> writes:

>> Do you ever use C-x =?
>
> No.

Well, I do.  And it's not all that advanced.  It happens a lot when I
have trouble visually distinguishing the glyph under point.

> The argument is that we have nothing better that we can call characters,
> and if we use code-points we can use the historical Scheme functions
> and names.

I can't tell what you're arguing for.

We *do* have something we can call characters: characters.  You might
find them useless, but their semantics are quite clear.

Which of the following are you arguing for?

1) We should have neither code points nor characters;
2) We should have code points and not characters, and call code points
   something like "code-points";
3) We should have code points and not characters, and call code points
   something like "characters";
4) We should have both code points and characters, call code points
   something like "characters" and call characters something else.

If you are arguing (1), then fine, let's drop both.  If you are
arguing (3) or (4), there is no defense for your position.  If you
are arguing (2), then great, but I must have misunderstood you
mightily.

>> No, [fonts] are not [indexed by code-point].  They are indexed by
>> character.  Consider an accented character that is represented by
>> several code points.
>
> This can be handled the same way an ffi ligature is handled.  Are you
> proposing that #\ffi be a character?

No.  I'm proposing that *characters* be the already-well-understood
concept of "character", which is neither glyph nor code point.  I
happen to think it is the optimal unit for understanding text
editing.  If you disagree, feel free to use something else, but
please don't go and call it "character".
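
To make the distinction concrete, here is a small illustration in
Python rather than Scheme.  It is only a sketch, not anything proposed
for the standard: it spells one accented character two ways, once as a
single code point and once as two, using the standard `unicodedata`
module.  The variable names are made up for the example.

```python
import unicodedata

# "e with acute accent" as a single precomposed code point, U+00E9.
composed = "\u00e9"
# The same character as "e" followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "e\u0301"

# Python strings are sequences of code points, so the lengths differ.
composed_len = len(composed)
decomposed_len = len(decomposed)

# Naive equality compares code points, so the two spellings differ
# even though a reader sees one and the same character.
same_code_points = composed == decomposed

# Canonical composition (NFC) maps both spellings to the same sequence.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed
```

Both strings hold one character in the sense used here, yet they
contain one and two code points respectively.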

> What does char->integer return?  How does char<? work?  What is your
> proposed implementation for a "character" in the Unicode world, given
> that it is not a code-point?  How would you store characters in a
> string?

Storage is irrelevant.  An implementation would be free to store
characters however it wished.  char->integer and char<? can return
whatever the implementation pleases.  I would rather drop them, since
they have nothing really to do with characters.  They are functions on
*code points*, which are there because the R5RS authors did not bother
to distinguish code points from characters.
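
As a rough illustration of why these are code-point operations, here
is a sketch in Python, whose `ord` plays roughly the role of
char->integer.  The helper `char_to_integer` is a hypothetical name
for this example, not part of any Scheme or Python API.

```python
def char_to_integer(ch: str) -> int:
    """Hypothetical analogue of char->integer, built on Python's ord()."""
    return ord(ch)

# A single code point maps to its scalar value.
single_value = char_to_integer("\u00e9")

# A character spelled with two code points ("e" + combining acute)
# has no single integer, so ord() rejects it with a TypeError.
try:
    char_to_integer("e\u0301")
    multi_accepted = True
except TypeError:
    multi_accepted = False

# An ordering in the style of char<? compares code point values, not
# alphabetical position: U+00E9 sorts after "z".
accented_after_z = "\u00e9" > "z"
```

The integer and the ordering are properties of the code point; a
character made of several code points has neither.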

If you think R6RS should distinguish thus, *great*.  Let's decide
whether the standard should have code points or characters (obviously
you think the former) and then let's call it whatever it actually is.
Certainly we should not require code points but then go and call them
something else.

Thomas