
Re: the "Unicode Background" section Thomas Lord (23 Jul 2005 20:18 UTC)
Re: the "Unicode Background" section John.Cowan (24 Jul 2005 05:35 UTC)

Re: the "Unicode Background" section John.Cowan 24 Jul 2005 05:35 UTC

Thomas Lord scripsit:

> This is a fairly radical proposal.  It means, for example,
> that READ-CHAR will never know squat about UTF-8:  READ-CHAR
> is doomed, under my suggestions, to remain forever a low-level
> procedure.

"Radical" is not the word.  It means that a conformant Scheme will be
*compelled* to interpret text files as Latin-1, even on systems where
that is not the native encoding, unless the user or the system interposes
an interpretive layer that cleans up the characters.

Why privilege Latin-1 in such a fashion?  It's not even the native
encoding of the majority of systems out there.  It merely happens to
be the encoding that contains the bottom 256 Unicode codepoints.
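
To make the objection concrete, here is a minimal sketch in Python (not Scheme) of what a reader that treats every byte as a Latin-1 character would do with UTF-8 input. The helper name read_chars_as_latin1 is hypothetical and only stands in for a byte-at-a-time READ-CHAR; it is not taken from the proposal.

```python
def read_chars_as_latin1(raw: bytes) -> str:
    """Hypothetical stand-in for a byte-at-a-time READ-CHAR.

    Latin-1 maps each byte value n directly to code point n,
    so every input byte becomes exactly one character.
    """
    return raw.decode("latin-1")


# "é" is one character, but UTF-8 stores it as two bytes: 0xC3 0xA9.
utf8_bytes = "é".encode("utf-8")

# Read as Latin-1, the two bytes come back as two separate characters.
as_latin1 = read_chars_as_latin1(utf8_bytes)
print([f"U+{ord(c):04X}" for c in as_latin1])

# Only a UTF-8 aware layer recovers the single intended character.
as_utf8 = utf8_bytes.decode("utf-8")
print(len(as_latin1), len(as_utf8))
```

Under this reading, any cleanup of multi-byte sequences has to happen in a separate layer above the character reader.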

> On the other hand, it's upward compatible and sets a stage
> for experimentation re I/O paradigms.  (Upward compat with
> the standard, not implementations -- the divergence being
> over how procedures are named, not what they do.)

It prescribes behavior that the standard did not; for example, it
compels a 0x80 byte to be interpreted as U+0080, though on most
Windows systems 0x80 encodes U+20AC, the Euro sign.
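
A short Python sketch of the 0x80 case follows. It assumes that "most Windows systems" means code page 1252, which Python exposes as the "cp1252" codec; the variable names are illustrative only.

```python
# The same single byte, decoded under two different encodings.
raw = bytes([0x80])

# Latin-1: byte value n is code point n, so 0x80 is the C1 control U+0080.
latin1_char = raw.decode("latin-1")

# Windows code page 1252 (assumed here): 0x80 is the Euro sign.
cp1252_char = raw.decode("cp1252")

print(f"latin-1: U+{ord(latin1_char):04X}")
print(f"cp1252:  U+{ord(cp1252_char):04X}")
```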

--
Ambassador Trentino: I've said enough. I'm a man of few words.
Rufus T. Firefly: I'm a man of one word: scram!
        --Duck Soup                     John Cowan <xxxxxx@reutershealth.com>