Re: Issues with Unicode John Cowan 10 May 2006 19:31 UTC

bear scripsit:

> Didn't want it both ways.  String-set!, with unchanged contract,
> can be implemented on top of purely functional methods for
> manipulating string bodies and an atomic single mutation for
> manipulating the string head.

Ah.  Okay.
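
A minimal sketch of that arrangement, written in Python rather than Scheme
for brevity.  The class and method names are hypothetical and come from no
SRFI; Python's immutable str stands in for a purely functional string body,
and the "head" is an object whose only mutable state is one reference to
that body.

```python
class StringHead:
    """A string 'head': its only mutable state is one reference to a body.

    The body is a Python str, which is immutable, so every update builds a
    new body functionally and then swaps the reference in one assignment.
    """

    def __init__(self, text: str) -> None:
        self._body = text

    def string_ref(self, k: int) -> str:
        return self._body[k]

    def string_set(self, k: int, ch: str) -> None:
        body = self._body
        if len(ch) != 1:
            raise ValueError("expected a single character")
        if not 0 <= k < len(body):
            raise IndexError("index out of range")
        # Purely functional step: the old body is left untouched.
        new_body = body[:k] + ch + body[k + 1:]
        # The single mutation: repoint the head at the new body.
        self._body = new_body

    def body(self) -> str:
        return self._body


s = StringHead("hello")
alias = s                # a second reference to the same head
old_body = s.body()      # a snapshot of the body before the update
s.string_set(0, "j")
```

Every holder of the head sees the update, as the string-set! contract
requires, while anything that captured the old body still sees the old
text.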

> Hah?  Unicode already encompasses, I believe, every living
> language with a writing system.  If you mean that there are
> programmers who can't get meaningful identifiers using the
> character set defined as of Unicode 4.1.0, I want to know
> who those programmers are.

Perhaps "potential programmers" is the correct expression.  I currently
count about 15 scripts commonly used to write living languages in the
pipeline that are not yet in Unicode 4.1, and there are a number of
natural languages that use existing scripts but don't quite have all
their characters: e.g. the Myanmar script currently handles Burmese but
not the minority languages of Myanmar.  It may be that no speakers of
those languages are currently programmers, but nothing fundamental
prevents that from changing.

Certainly things have improved quite a bit since XML 1.0, which froze
identifiers at Unicode 2.0.

> Meanwhile, allowing identifier syntax to shift with every
> version of Unicode creates the potential for version
> incompatibilities.

I quite agree, which is why I propose a fixed though over-inclusive syntax
along the lines of the "alternative identifiers" documented by Unicode.
Alternative identifiers allow whatever is not explicitly forbidden, while
still providing plenty of symbol characters for read-syntax extensions.
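
A rough sketch of the exclusion-based idea, in Python.  The reserved set
below is a hypothetical stand-in (whitespace plus a few Scheme-style
delimiters), not the set Unicode actually specifies, and is_identifier is
an illustrative name only.  The point is that the test consults no
character database, so its answer cannot change between Unicode versions.

```python
# Hypothetical fixed set of characters reserved for syntax.  Everything
# outside it is allowed in an identifier, whether or not the code point
# has been assigned yet.
RESERVED = frozenset(" \t\n\r\f\v()[]{}\";'`,|")


def is_identifier(token: str) -> bool:
    """Accept any non-empty token containing no reserved character."""
    return len(token) > 0 and not any(ch in RESERVED for ch in token)


examples = {
    "string-set!": is_identifier("string-set!"),
    "\u03bb": is_identifier("\u03bb"),            # Greek small letter lambda
    "x\u0378": is_identifier("x\u0378"),          # accepted without any table lookup
    "two words": is_identifier("two words"),      # contains reserved whitespace
    "(paren": is_identifier("(paren"),            # contains a reserved delimiter
}
```

Because the reserved set is frozen, a script added in a later Unicode
version is usable in identifiers without any change to the reader.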

John Cowan
Principles.  You can't say A is made of B or vice versa.
All mass is interaction.  --Richard Feynman