Re: Why are byte ports "ports" as such? John Cowan 25 May 2006 05:10 UTC

bear scripsit:

> I feel that "unicode default grapheme clusters" more closely
> map to what users call "characters" than codepoints do.  In
> the interests of keeping the abstractions used by the programmer
> as close as possible to the abstractions used by ordinary users,
> I therefore support defining scheme characters as DCG's.

While I disagree with this position, it is entirely coherent and
consistent, and I wouldn't weep if R6RS went with it.  The main
argument against it is that the "user" view of characters can cross
DCG boundaries.  From the codepoint level, other levels can be
built up, including the DCG, syllable, word, sentence, and paragraph.
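
To illustrate building a higher level on top of code points, here is a
minimal Python sketch.  It assumes a simplified rule (each combining mark
attaches to the preceding base character) rather than the full default
grapheme cluster algorithm of UAX #29, and the name simple_clusters is
hypothetical.

```python
import unicodedata

def simple_clusters(text):
    """Group each base code point with the combining marks that follow it.

    A deliberate simplification, not the full UAX #29 algorithm: it
    ignores Hangul jamo sequences, ZWJ emoji sequences, and so on.
    """
    clusters = []
    for ch in text:
        # A nonzero canonical combining class marks a combining character.
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

# "cafe" followed by U+0301 COMBINING ACUTE ACCENT
word = "cafe\u0301"
code_points = list(word)           # the code point level
clusters = simple_clusters(word)   # a level built on top of it

print(len(code_points), len(clusters))
```

The same approach extends upward: a word or sentence segmenter can take
the cluster list as input, just as the cluster pass takes code points.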

> A fourth technical advantage is that it's "future proof."
> There is still dispute about Unicode's appropriateness,
> particularly in asian scripts, and it is reasonable to presume
> that Unicode is no more the Last Encoding Ever than was ASCII.

Those disputes are AFAIK long dead, and the alternatives to Unicode
are openly dependent on it and in any case have no traction.
I see no reason why ISO/IEC 10646 should not last for centuries.

--
John Cowan   xxxxxx@ccil.org    http://ccil.org/~cowan
If a soldier is asked why he kills people who have done him no harm, or a
terrorist why he kills innocent people with his bombs, they can always
reply that war has been declared, and there are no innocent people in an
enemy country in wartime.  The answer is psychotic, but it is the answer
that humanity has given to every act of aggression in history.  --Northrop Frye