READ-OCTET (Re: strings draft) Shiro Kawai 23 Jan 2004 06:00 UTC

From: Tom Lord <xxxxxx@emf.net>
Subject: Re: strings draft
Date: Thu, 22 Jan 2004 18:06:24 -0800 (PST)

>     > In fact, READ-OCTET and WRITE-OCTET would in that case become primitive,
>     > since READ-CHAR and DISPLAY could be implemented in terms of them but
>     > the reverse would not be true.
>
>     > This neatly sidesteps the issue of needing character mappings for
>     > every member of the range 128-255, and separates the ideas of octet
>     > and character at the lowest level.
>
> Hmm.  Well, an example of what it fails to sidestep is the issue of
> making the values representable by the C `char' type a subset of CHAR?
> It's also a fairly sorry approach to take for implementing many
> network protocols in a way that is simple, direct, "tolerant of what
> it receives".

Hm, I now see an advantage in Tom's approach.

I've written code for an email filter program (with Bayesian
spam filtering, of course :-).  It reads an RFC 2822 message header
and builds an assoc list of header fields, handling folded
header lines.  Although RFC 2822 specifies that the field bodies of
message headers should contain only US-ASCII characters
(except CR and LF), there are messages that have other octets
within the header.
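
To make the header-parsing step concrete, here is a minimal sketch in
Python (not the Scheme code described above, which is not shown in
this message).  It works on raw octets so that stray non-ASCII bytes
do not break the parse.  The function name parse_header_block and the
sample message are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def parse_header_block(raw: bytes) -> List[Tuple[bytes, bytes]]:
    """Split a raw header block into (name, body) pairs, unfolding folded lines."""
    fields: List[Tuple[bytes, bytes]] = []
    for line in raw.split(b"\r\n"):
        if not line:
            break  # an empty line ends the header block
        if line[:1] in (b" ", b"\t") and fields:
            # Folded line: append it to the body of the previous field.
            name, body = fields[-1]
            fields[-1] = (name, body + line)
        else:
            name, _, body = line.partition(b":")
            fields.append((name.strip(), body.strip()))
    return fields


# Hypothetical message with a folded Subject and one non-ASCII octet (0xE9).
sample = b"Subject: caf\xe9\r\n au lait\r\nFrom: a@example.org\r\n\r\nbody"
headers = parse_header_block(sample)
```

Keeping names and bodies as octets postpones any decision about the
character encoding until after the structure has been recovered.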

With Tom's approach, in which a character can also represent
an octet, one could probably set the input port's encoding mode
to "raw" or something similar (assuming the port supports character
set conversion), then let read-char retrieve each octet.
In that case, you need to do the "encoding conversion"
over the string afterwards (potentially performing "encoding guessing"
before that).  A string that contains characters in the range 128-255
might be unprintable as is, but the implementation can provide
some escaped format for them.
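
A rough analogue of this "raw" mode can be sketched in Python, on the
assumption that Latin-1 decoding stands in for a port that maps each
octet to the character with the same code.  The helper names read_raw,
convert and escaped are hypothetical and are not part of Tom's draft.

```python
def read_raw(octets: bytes) -> str:
    # Latin-1 maps every octet 0-255 to the code point with the same
    # value, so this never fails and loses no information.
    return octets.decode("latin-1")


def convert(raw_string: str, encoding: str) -> str:
    # The later "encoding conversion": recover the octets, then decode
    # them with the real (or guessed) encoding.
    return raw_string.encode("latin-1").decode(encoding)


def escaped(raw_string: str) -> str:
    # One possible escaped format for characters in the range 128-255.
    return "".join(
        ch if ord(ch) < 128 else "\\x%02x" % ord(ch) for ch in raw_string
    )


raw = read_raw(b"caf\xc3\xa9")   # five "characters", two of them above 127
text = convert(raw, "utf-8")     # four characters after conversion
```

Note that raw and text have the same type here, which is exactly the
state the programmer has to keep track of under this approach.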

The approach I'm taking is to read the header field as an
octet stream and construct an octet string, which is a special
type of string that can contain any octet sequence.  After
I do the necessary processing, I convert the octet string
into a valid string, which contains only legal characters.
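
For comparison, a sketch of the octet-string style in Python, assuming
that the bytes type plays the role of the octet string and str the
role of the valid string.  The function to_valid_string and its list
of candidate encodings are hypothetical, not taken from any Scheme
implementation.

```python
from typing import Optional, Sequence


def to_valid_string(
    octets: bytes, candidates: Sequence[str] = ("ascii", "utf-8")
) -> Optional[str]:
    """Return the first successful decoding, or None if no candidate fits."""
    for encoding in candidates:
        try:
            return octets.decode(encoding)
        except UnicodeDecodeError:
            continue  # not valid in this encoding; try the next one
    return None


field_body = b"caf\xc3\xa9"             # still an octet string
subject = to_valid_string(field_body)   # now a string of legal characters
broken = to_valid_string(b"caf\xe9")    # None: valid in neither candidate
```

Because the two kinds of value have different types, a program can
always check which one it holds before treating it as text.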

The benefits of an octet string are that (1) it is fast to convert
to the underlying byte string, and (2) you can always tell whether
the string is "safe and normal" or not.
(1) is important for some applications, for example a program
that does lots of UDP packet sending and returning.  Such "block"
reads and writes are done with either octet strings or uniform vectors
in my Scheme, and they are fast because they directly access the C
buffer.

However, I do feel that the presence of the octet string is ad hoc.
Tom's approach does have conceptual cleanness, although
the programmer probably has to be careful about the state of the string
object she is dealing with (i.e. whether it has been converted
or not).

--shiro