Michael Sperber wrote:
> Per Bothner <xxxxxx@bothner.com> writes:
>
>
>>The benefit is for the implementors: If you specify ports that can
>>arbitrarily mix text and binary then implementors can no longer
>>use common abstractions and existing libraries.
>
> Well, if you don't allow this, you also prevent common abstractions.
The abstraction [or maybe "functionality"?] of "writing a character or
string to a binary file" is meaningless. The functionality of "writing
a character/string in a specific encoding" does make sense. I agree
the functionality of writing a binary file that can contain embedded
UTF-8 strings is meaningful and useful.
However, this functionality is different from a character-stream
API.
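
To make the distinction concrete, here is a minimal sketch in Python
(not Scheme, and not anything taken from SRFI 68) of a binary record
with an embedded UTF-8 string. The length-prefixed layout and the names
write_record and read_record are assumptions made up for illustration.
The text is converted to bytes in an explicitly named encoding before
it reaches the binary stream; no character is written to the binary
stream directly.

```python
import io
import struct

def write_record(out, ident, name):
    """Write a 32-bit id, then a 16-bit length and the UTF-8 bytes.

    The layout is hypothetical; it only illustrates the idea.
    """
    data = name.encode("utf-8")  # explicit text -> bytes step
    out.write(struct.pack(">IH", ident, len(data)))
    out.write(data)

def read_record(inp):
    """Read back one record written by write_record."""
    ident, length = struct.unpack(">IH", inp.read(6))
    return ident, inp.read(length).decode("utf-8")

buf = io.BytesIO()
write_record(buf, 7, "Grüße")
buf.seek(0)
record = read_record(buf)
```

The encoding appears only at the boundary between text and bytes,
which is why this is a different operation from writing to a
character stream.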
> As to the use of "existing libraries," you can use the reference
> implementation---that's why I wrote it.
> ...
> I'm not sure what you mean by "directly," but SRFI 68 specifies
> transcoders for Latin 1, UTF-16, and UTF-32.
Are you serious? These are all trivial, and few people
(no-one?) use UTF-16 or UTF-32 for file storage/transmission.
How do you plan to support all the other Latin-X character sets,
JIS, GB 2312-80, and ISO 2022?
> I'm unclear on what your point is---if you want to know how easy or
> hard it is to write custom transcoders, look in the reference
> implementation on how the UTF transcoders are done
I don't want to write transcoders. Other people have already written
them, and it's a lot of work, and a lot of large tables. I want to
use the existing Java transcoders or the existing C library
transcoders.
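
As a rough illustration of what "use the existing transcoders" looks
like from the programmer's side, here is a sketch using Python's
standard codecs rather than Java's or the C library's. The sample
strings and the dictionary names are my own assumptions. The tables
for a Latin-X set, ISO-2022-JP and GB 2312 all come from the platform
library; nothing is written by hand.

```python
import codecs

# Sample text per encoding; the choice of samples is illustrative only.
samples = {
    "iso8859_2": "Łódź",     # Latin-2
    "iso2022_jp": "日本語",   # a stateful ISO 2022 encoding
    "gb2312": "中文",
}

roundtrip = {}
for encoding, text in samples.items():
    codecs.lookup(encoding)            # raises LookupError if unsupported
    encoded = text.encode(encoding)    # library-provided encoder
    roundtrip[encoding] = encoded.decode(encoding)
```

Each encoding is selected by name and the library does the rest.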
>>The default encoding of a character port *must* be the
>>"native" encoding of the user's locale.
>
> No, it must not be, especially as the idea of a "native encoding"
> associated with the locale is shadowy at best. For example, your OS
> might have a notion of locale, but simply sport separate I/O
> procedures for different text encodings. That's the case on Windows,
> for example.
I'm missing your point, or vice versa. My point is: if I as a
computer user create a text file without doing anything special,
it will have a particular encoding that is presumably suitable
for my language and environment. This default encoding may
have been set up by my system administrator or the company
that sold me my computer.
If I then as a beginning Scheme programmer write a program
to read this file, I should not have to use any magic options
or commands to do so - it should just work, even if my
default encoding is different from UTF-8.
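
A small sketch of the behavior I mean, again in Python rather than
Scheme. The helper open_text_default is hypothetical; it stands in for
whatever a Scheme implementation's plain "open a text file" procedure
would do. It asks the platform for the locale's preferred encoding and
uses that, so the caller names no encoding at all. The file is written
as raw bytes first to simulate a file created by some other program.

```python
import locale
import os
import tempfile

def open_text_default(path, mode="r"):
    """Hypothetical helper: open a text file in the locale's encoding."""
    return open(path, mode, encoding=locale.getpreferredencoding(False))

native = locale.getpreferredencoding(False)

# Simulate a file produced elsewhere in the native encoding. Characters
# the native encoding cannot represent are replaced, so the sketch runs
# in any locale.
raw = "Grüße".encode(native, errors="replace")
expected = raw.decode(native)

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(raw)

# The "beginner" code path: no encoding is mentioned by the caller.
with open_text_default(path) as f:
    content = f.read()
os.remove(path)
```

The only decision the implementation makes is where the default comes
from; the user never sees it.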
>>I don't see how anything else can even be seriously considered: a
>>beginning Scheme programmer should be able to write a simple program
>>that reads or writes a file without having to set up translators, or
>>specify an encoding. SRFI 68 appears to contradict this
>>requirement.
>
>
> I don't think so---at least that wasn't my intention. Where do you
> think does SRFI 68 contradict you?
I can't comment specifically, since SRFI 68 seems to be unavailable:
http://srfi.schemers.org/srfi-68/srfi-68.html yields a file not found.
But perhaps you could comment on how you expect this to work? Assume
I'm Japanese and using some version of JIS. (Or a German using
Latin-1, for that matter.) How should my Scheme implementation handle
this?
--
--Per Bothner
xxxxxx@bothner.com http://per.bothner.com/