Re: binary vs non-binary ports Alex Shinn 20 Sep 2004 02:06 UTC

At Sat, 18 Sep 2004 09:31:02 -0700, Per Bothner wrote:
>
> * Most file formats that mix text and binary i/o do *not* handle
> general strings: often they only support whatever character encoding
> the "creative" engineers are most familiar with.

I think relatively few formats assume a single encoding.  Either they
tend to treat strings agnostically as a sequence of bytes (leaving
encoding interpretation up to the programmer), or they allow a means
to specify the encoding.  Gettext and databases specify the encoding
within the file itself.  HTTP, MIME, and most internet standards also
provide a way to specify the encoding.  MIME allows multi-part
messages which may include files of multiple different encodings
within the same byte stream, and not just character encodings but
compression, encryption and other filters.  HTTP uses a chunked
encoding which requires you to switch back and forth between ASCII (to
read the chunk size) and the chunked data encoding within the same
byte stream, with chunks possibly splitting in the middle of a
character, or in the middle of a state in stateful encodings such as
ISO-2022.  These are everyday cases in the most widely used protocols.
Mixing encodings is a fact of life.
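
To make the chunked case concrete, here is a minimal sketch in Python
(not tied to any particular Scheme port API).  It assumes a seekable
binary stream holding an HTTP/1.1 chunked body, reads each chunk-size
line as ASCII hex, and feeds the raw chunk bytes to an incremental
decoder so that a character split across two chunks still decodes
correctly.  The name read_chunked and the sample payload are
hypothetical, and trailers and error handling are ignored.

```python
import codecs
import io


def read_chunked(stream, encoding="utf-8"):
    """Decode an HTTP/1.1 chunked body from a binary stream into text.

    Hypothetical helper for illustration only: trailers are skipped
    and malformed input is not handled.
    """
    # The incremental decoder buffers incomplete byte sequences between calls.
    decoder = codecs.getincrementaldecoder(encoding)()
    parts = []
    while True:
        # The chunk-size line is ASCII hex, optionally followed by ";ext".
        size_line = stream.readline()
        size = int(size_line.split(b";", 1)[0].strip().decode("ascii"), 16)
        if size == 0:
            break
        data = stream.read(size)   # raw bytes in the payload encoding
        stream.readline()          # consume the CRLF that ends the chunk
        parts.append(decoder.decode(data))
    # Flush the decoder; this raises if the body ended mid-character.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


# Assumed sample: a UTF-8 payload whose two-byte character is split
# between the first and second chunk.
body = "na\u00efve".encode("utf-8")          # 6 bytes
wire = (b"3\r\n" + body[:3] + b"\r\n" +
        b"3\r\n" + body[3:] + b"\r\n" +
        b"0\r\n\r\n")
text = read_chunked(io.BytesIO(wire))
```

The point of the sketch is that the reader alternates between an ASCII
view (the size lines) and the payload encoding on one byte stream, and
that decoder state has to survive across chunk boundaries.  The same
shape would apply to a stateful encoding, where the decoder would also
need to carry its shift state from one chunk to the next.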

Oleg has pointed out that the Haskell community is also looking into
binary I/O; the discussion is a good reference and point of comparison:

  http://www.haskell.org/pipermail/haskell-cafe/2004-September/006801.html

--
Alex