Re: Encodings. Paul Schlie (13 Feb 2004 02:18 UTC)
Re: Encodings. Bradd W. Szonye (13 Feb 2004 03:35 UTC)
Re: Encodings. Paul Schlie (13 Feb 2004 05:59 UTC)
Re: Encodings. Bradd W. Szonye (13 Feb 2004 06:36 UTC)
Re: Encodings. Paul Schlie (13 Feb 2004 08:00 UTC)
Re: Encodings. Robby Findler (13 Feb 2004 15:01 UTC)
Re: Encodings. Paul Schlie (13 Feb 2004 17:16 UTC)
Re: Encodings. Paul Schlie (13 Feb 2004 18:19 UTC)
Re: Encodings. Robby Findler (16 Feb 2004 01:03 UTC)
Re: Encodings. Paul Schlie (16 Feb 2004 03:21 UTC)
Re: Encodings. Paul Schlie (16 Feb 2004 04:18 UTC)
Re: Encodings. Robby Findler (16 Feb 2004 04:33 UTC)
Re: Encodings. bear (13 Feb 2004 17:40 UTC)
Re: Encodings. Per Bothner (13 Feb 2004 18:34 UTC)
Re: Encodings. Paul Schlie (13 Feb 2004 19:02 UTC)
Re: Encodings. Bradd W. Szonye (13 Feb 2004 19:05 UTC)
Re: Encodings. Paul Schlie (13 Feb 2004 19:48 UTC)
Re: Encodings. Per Bothner (13 Feb 2004 19:11 UTC)
Re: Encodings. Paul Schlie (13 Feb 2004 19:44 UTC)
Re: Encodings. bear (13 Feb 2004 21:42 UTC)
Re: Encodings. Bradd W. Szonye (13 Feb 2004 21:54 UTC)
Re: Encodings. Paul Schlie (13 Feb 2004 23:45 UTC)
Re: Encodings. Bradd W. Szonye (14 Feb 2004 00:04 UTC)
Re: Encodings. bear (14 Feb 2004 01:06 UTC)
Re: Encodings. Bradd W. Szonye (14 Feb 2004 01:08 UTC)
Re: Encodings. Paul Schlie (14 Feb 2004 02:35 UTC)
Re: Encodings. Bradd W. Szonye (14 Feb 2004 03:00 UTC)
Re: Encodings. Paul Schlie (14 Feb 2004 03:04 UTC)
Re: Encodings. Bradd W. Szonye (14 Feb 2004 03:08 UTC)
Re: Encodings. Paul Schlie (14 Feb 2004 03:29 UTC)
Re: Encodings. Paul Schlie (14 Feb 2004 02:19 UTC)
Re: Encodings. Bradd W. Szonye (14 Feb 2004 03:04 UTC)
Re: Encodings. Paul Schlie (14 Feb 2004 03:10 UTC)
Re: Encodings. Bradd W. Szonye (14 Feb 2004 03:12 UTC)
Re: Encodings. Paul Schlie (13 Feb 2004 22:41 UTC)
Re: Encodings. Bradd W. Szonye (13 Feb 2004 17:55 UTC)
Re: Encodings. Paul Schlie (13 Feb 2004 18:42 UTC)
Re: Encodings. Bradd W. Szonye (13 Feb 2004 18:53 UTC)
Re: Encodings. Ken Dickey (13 Feb 2004 21:53 UTC)
RESET [was Re: Encodings] Ken Dickey (14 Feb 2004 16:19 UTC)
Re: RESET [was Re: Encodings] bear (14 Feb 2004 18:02 UTC)
Re: RESET [was Re: Encodings] Bradd W. Szonye (14 Feb 2004 19:38 UTC)

Re: Encodings. bear 13 Feb 2004 17:40 UTC

It has long been my opinion that scheme needs binary I/O capabilities
to be standardized.  Character ports should be distinguished from
binary ports at a low level, and binary I/O operations should be
different operations from character I/O operations.

I'm entirely happy with (read-char) and (write-char) reading "a
character" (although I'll note that opinions vary on what a character
is, nuff said) or having (read) or (write) read or write a data object
in the external representation form which is made of characters.

Those are character operations, and when dealing strictly with
character operations, the appropriate place for concerns about
encoding, endianness, and external canonicalization are below the
level of the program's notice.  Fold all that stuff into the port code
for character ports and don't bother the programmer with it.  As far
as text processing code is concerned, a character is a character is a
character, or at least that's how it should be.

That leaves implementors the freedom to implement their character
ports in terms of whatever abstraction their particular platform uses
for characters, whether its utf8be or utf32le or ascii or ebcdic or
latin1 or ISO-foo or iscii or PETscii or the ZX81 charset or some
other embedded processor weirdness.  This is not a bug, it's a
feature.  If there are multiple encodings/canonicalizations/etc in use
on a system, let schemes on those systems implement multiple kinds of
character ports.

But it follows that there is NO WAY we should rely on I/O of
characters through character ports to read or write a particular
binary representation for "raw" data such as sound and image files.
Attempting to do so is bad design, because it breaks an abstraction
barrier and presumes things which are beyond the program's proper
control or knowledge.

The only reason programmers want to write characters that aren't in
the "normal" encoding/canonicalization/etc, is when they need really
close control of the exact format of I/O.  But when you need control
*that* close, you're not talking about a "character" port at all any
more; you're talking about binary I/O.  Rather than breaking the
abstraction barrier on character ports, you need a different kind of
port.  We need binary ports that support operations like (read-bytes)
and (write-bytes).

It may be needful to read and write "characters" on these ports; but
character fields inside binary data formats tend to be both very rigid
and diverse in their encoding/etc, so character operations on binary
ports, if supported at all, should IMO have mandatory arguments to
specify their encoding/etc.

				Bear