Re: Mixing characters and bytes Per Bothner 24 Aug 2005 18:42 UTC

Michael Sperber wrote:
 > Per Bothner <xxxxxx@bothner.com> writes:
 >
 >
 >>Note that neither use-case is supported by SRFI-68 unless
 >>you stick to UTF-8 or go down into the stream level.
 >
 >
 > The operative word is "stream level" here.  You've argued throughout
 > that its usefulness is questionable, but when its usefulness becomes
 > apparent, you seem reluctant to use it.

A couple of reasons:
* Performance:  Using streams is expensive - at the very least it puts
a lot of pressure on the garbage collector.  A program that uses
imperative ports should not have to use streams just because it
wants to use a non-utf-8 encoding.
* Complexity:  Having to deal with a very different abstraction
just because I want to use a non-utf-8 encoding complicates my
program and my mental model.
* Switching encodings: You can't switch encodings without switching
ports.

To use other encodings you have to use streams.  To mix encodings or
non-utf-8 text and binary you have to use streams exclusively.  People
should use streams if they want to use the stream (functional) model,
but I don't think requiring people to use streams just because they're
not using utf-8 will fly.
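
To make the alternative concrete, here is a rough sketch of an imperative port that mixes byte and character reads and can change its encoding without being replaced. It is written in Python rather than Scheme, and it is not the SRFI 68 API: the names MixedPort, set_encoding, read_byte and read_char are made up for illustration.

```python
import codecs
import io


class MixedPort:
    """Hypothetical imperative input port over a byte source.

    Supports byte reads and character reads on the same port, and lets
    the caller change the text encoding in place.
    """

    def __init__(self, data, encoding="utf-8"):
        self._raw = io.BytesIO(data)
        self.set_encoding(encoding)

    def set_encoding(self, encoding):
        # Swap the decoder only; the underlying byte source and its
        # current position stay the same.
        self._decoder = codecs.getincrementaldecoder(encoding)()

    def read_byte(self):
        # Return the next byte as an int, or None at end of input.
        b = self._raw.read(1)
        return b[0] if b else None

    def read_char(self):
        # Feed one byte at a time until the decoder yields a character.
        # Returns None at end of input (a truncated trailing character
        # is silently dropped in this sketch).
        while True:
            b = self._raw.read(1)
            if not b:
                return None
            ch = self._decoder.decode(b)
            if ch:
                return ch
```

For example, a port over one binary tag byte, then Latin-1 text, then UTF-8 text would be read with read_byte, then read_char, then set_encoding("utf-8") and more read_char calls, all on the same port object and with no stream abstraction in sight.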

 >>Second, some ports are text-only, in the sense that they cannot
 >>meaningfully support byte operations.  This includes the string ports
 >>specified by SRFI 6.
 >
 > The string ports specified in SRFI 6 can support byte operations
 > perfectly meaningfully.  I believe SRFI 68 contains a variation of it.

I guess so, by essentially converting the string to a blob and
vice versa.  I.e. a stream of characters becomes a stream of
the utf-8 encoding of the characters.  That makes string ports
slightly trickier to implement but probably not much so.
Ok, I'll buy that.
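
A minimal sketch of that idea, in Python rather than Scheme and with made-up names (StringPort, read_byte, read_char), assuming the string is simply encoded to UTF-8 up front:

```python
import io


class StringPort:
    """Hypothetical string input port that also supports byte reads.

    The string is stored as its UTF-8 encoding, so a stream of
    characters is also a stream of bytes.
    """

    def __init__(self, text):
        self._bytes = io.BytesIO(text.encode("utf-8"))

    def read_byte(self):
        # Return the next UTF-8 byte as an int, or None at end of input.
        b = self._bytes.read(1)
        return b[0] if b else None

    def read_char(self):
        # Read one complete UTF-8 sequence and decode it.
        # Assumes the port is positioned at a character boundary; if a
        # preceding read_byte stopped mid-character, decoding raises
        # UnicodeDecodeError.
        first = self._bytes.read(1)
        if not first:
            return None
        lead = first[0]
        if lead < 0x80:
            extra = 0          # 1-byte sequence (ASCII)
        elif lead >> 5 == 0b110:
            extra = 1          # 2-byte sequence
        elif lead >> 4 == 0b1110:
            extra = 2          # 3-byte sequence
        else:
            extra = 3          # 4-byte sequence (or invalid lead byte)
        return (first + self._bytes.read(extra)).decode("utf-8")
```

The "slightly trickier" part shows up in read_char: once byte reads are allowed, the port has to cope with a reader that stops in the middle of a multi-byte character, which this sketch leaves as an error.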
--
	--Per Bothner
xxxxxx@bothner.com   http://per.bothner.com/