Re: the "Unicode Background" section
John.Cowan 24 Jul 2005 05:35 UTC
Thomas Lord scripsit:
> This is a fairly radical proposal. It means, for example,
> that READ-CHAR will never know squat about UTF-8: READ-CHAR
> is doomed, under my suggestions, to remain forever a low-level
> procedure.
"Radical" is not the word. It means that a conformant Scheme will be
*compelled* to interpret text files as Latin-1, even on systems where
that is not the native encoding, unless the user or the system interposes
an interpretive layer that cleans up the characters.
Why privilege Latin-1 in such a fashion? It's not even the native
encoding of the majority of systems out there. It merely happens to
be the encoding that contains the bottom 256 Unicode codepoints.
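
A minimal sketch of that last point, in Python purely for illustration (the
proposal says nothing about Python, and the variable names are mine):
decoding a byte as Latin-1 yields the code point with the same numeric value,
for all 256 byte values.

```python
# Decode every possible byte value as Latin-1 (ISO 8859-1).
all_bytes = bytes(range(256))
decoded = all_bytes.decode("latin-1")

# Each byte maps to the code point with the same numeric value,
# i.e. byte 0xNN becomes U+00NN.
identity = all(ord(ch) == b for ch, b in zip(decoded, all_bytes))
print(identity)
```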
> On the other hand, it's upward compatible and sets a stage
> for experimentation re I/O paradigms. (Upward compat with
> the standard, not implementations -- the divergence being
> over how procedures are named, not what they do.)
It prescribes behavior that the standard did not; for example, it
compels a 0x80 byte to be interpreted as U+0080, though on most
Windows systems 0x80 encodes U+20AC, the Euro sign.
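
To make the discrepancy concrete, here is a small Python sketch (again only
an illustration; I am assuming "most Windows systems" means code page 1252,
which Python calls "cp1252"). The same byte decodes to three different
outcomes depending on which encoding the reader assumes.

```python
raw = b"\x80"

as_latin1 = raw.decode("latin-1")  # a C1 control character
as_cp1252 = raw.decode("cp1252")   # the Euro sign

print(f"latin-1: U+{ord(as_latin1):04X}")
print(f"cp1252:  U+{ord(as_cp1252):04X}")

# As UTF-8, a lone 0x80 is a continuation byte with no lead byte,
# so it is not valid input at all.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print(f"valid UTF-8: {utf8_ok}")
```

A reader that is required to treat bytes as Latin-1 always takes the first
branch, whatever the platform's native encoding is.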
--
Ambassador Trentino: I've said enough. I'm a man of few words.
Rufus T. Firefly: I'm a man of one word: scram!
        --Duck Soup
John Cowan <xxxxxx@reutershealth.com>