Re: binary vs non-binary ports Per Bothner 16 Sep 2004 06:54 UTC

Alex Shinn wrote:

> Ideally, as Bear mentioned earlier, I like to think of the byte-level
> operations as the only primitives on top of which character-level
> operations are defined, but that is an implementation detail.

Yes, but you don't want to force every Scheme implementor to have
to manage this char<->byte mapping in the Scheme run-time, as opposed
to being able to use existing C/C++/Java APIs which don't work the
way you want them to work.

> "Complicated" should not prevent us from adding language features, and
> I don't see this as any more complex than having additional primitive
> port types.

Byte<->Char conversion is complicated.  Not conceptually, but
there are big tables and a good chunk of code if you want to
support many languages.  Most operating systems and "core libraries"
these days can do the translation.  You really don't want to implement
this code in your Scheme runtime, but instead you want to build on
existing libraries and APIs.

Existing APIs (Java, C++, C) distinguish byte I/O from character I/O,
generally using different types.  They may not support easy on-the-fly
switching between binary mode and character mode.
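
To illustrate the split (a sketch in Python, chosen only as an example of
an existing API; it is not from the original message), the standard `io`
module uses one type for byte streams and a separate wrapper type for
character streams, and the wrapper owns the decoding:

```python
import io

# Binary stream: reads and writes bytes only.
raw = io.BytesIO("caf\u00e9\n".encode("utf-8"))

# Character stream: a distinct type layered on top of the binary one.
# newline="" asks the wrapper not to translate line endings on input.
text = io.TextIOWrapper(raw, encoding="utf-8", newline="")

line = text.readline()            # a str, decoded by the wrapper
raw_type = type(raw.getvalue())   # bytes, the underlying representation
```

Once the text wrapper has buffered input, reading from `raw` directly gives
no guarantee about where the byte position is, which is the kind of
on-the-fly mode switching such APIs make awkward.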

So the proposed model means Scheme run-times have to open ports in
binary mode and do their own byte<->char conversion.  That is not a
nice thing to ask of Scheme implementors.

>>It makes no sense to mix character and binary I/O on the same port.
>>Anyone who tries it is in a state of sin.
>
> I work very often with binary file formats, including Scheme libraries
> for handling ELF, TIFF, and the gettext .mo format among others.
> Every one of these mixes binary and character data.

I did not say character data - I said character I/O.  It is perfectly
feasible to read/write character and string data from/to a binary
stream - but then you have to define how they are encoded or do the
mapping before/after you write/read them.  If you're in a Japanese
locale, and write a string to an ELF file, what happens?  What happens
when I call (newline) in a Windows environment - should it write "\n"
or "\r\n"?
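
A small sketch of why the question matters (Python, as an illustration
only; the helper name `write_text` is hypothetical): the same string
reaches the byte level as different bytes depending on the encoding and
newline convention chosen for the character layer.

```python
import io

def write_text(s, encoding, newline):
    """Write s through a character layer; return the bytes it produced."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding=encoding, newline=newline)
    text.write(s)
    text.flush()              # push buffered characters down to raw
    return raw.getvalue()

# Same characters, two plausible "locale" choices.
unix_utf8 = write_text("\u65e5\u672c\n", "utf-8", "\n")
windows_sjis = write_text("\u65e5\u672c\n", "shift_jis", "\r\n")
```

Here `unix_utf8` and `windows_sjis` differ in both length and content, so
a binary format has to state which one it expects.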

> Apparently almost
> everyone who has ever designed a binary format is a sinner :)

Most of these formats don't support general characters.  Of course
you can have general characters encoded in an ELF section, but ELF
views that as just binary data.  ELF does know about labels and
section names, but there is no support for multiple encodings or
wide characters.

> 3) Extract character data in binary ports as binary first then convert
>    with utility procedures to character/string.

Yes, conceptually that is what should be going on.  But if you want to
be able to do binary I/O on an arbitrary port (that was opened in default
mode) then that constrains the implementation unacceptably.  Existing
code that implements ports may have to be extensively rewritten.
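
The "binary first, then convert" approach can be sketched as follows
(Python, for illustration only; the length-prefixed layout and the name
`read_string` are assumptions, not part of any format discussed here).
The port is opened as binary, and the encoding is named explicitly at the
point of conversion:

```python
import io
import struct

def read_string(port, encoding):
    """Read a 4-byte big-endian length, then that many bytes, then decode."""
    (n,) = struct.unpack(">I", port.read(4))
    return port.read(n).decode(encoding)

# Build a tiny binary record: length prefix followed by UTF-8 bytes.
payload = "na\u00efve".encode("utf-8")
port = io.BytesIO(struct.pack(">I", len(payload)) + payload)

result = read_string(port, "utf-8")
```

Nothing here needs a port that switches between binary and character
mode; the conversion is an ordinary utility applied to bytes already read.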
--
	--Per Bothner
xxxxxx@bothner.com   http://per.bothner.com/