<applause>
In higher-level code I'll want to stick character encoding
information and endianness into the port objects, but you have
correctly identified the real primitives of I/O.
read-byte
write-byte
peek-byte
byte-ready?
are what we need, because at this moment in history there
are several competing "standard" ways to write characters (and
Unicode itself comes in several encodings, which means that
reading or writing "a character" can never again have a
fundamental, unambiguous meaning in terms of binary I/O).
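Just to make that shape concrete, here is a minimal sketch (the names
and the bytevector representation are made up purely for illustration):
a port-like record that carries its encoding and endianness alongside
the byte primitives.  Only the input side is shown; write-byte would
hang off an output buffer in the same way.

  (define-record-type <byte-port>
    (make-byte-port data index encoding endianness)
    byte-port?
    (data       byte-port-data)
    (index      byte-port-index set-byte-port-index!)
    (encoding   byte-port-encoding)     ; e.g. 'latin-1, 'utf-8
    (endianness byte-port-endianness))  ; e.g. 'big, 'little

  (define (byte-ready? p)
    (< (byte-port-index p) (bytevector-length (byte-port-data p))))

  (define (peek-byte p)
    (if (byte-ready? p)
        (bytevector-u8-ref (byte-port-data p) (byte-port-index p))
        (eof-object)))

  (define (read-byte p)
    (let ((b (peek-byte p)))
      (if (not (eof-object? b))
          (set-byte-port-index! p (+ 1 (byte-port-index p))))
      b))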
I would, in fact, advocate that read-char, write-char, etc.,
be defined in terms of these operations rather than the other
way 'round, so that they can be redefined for different character
environments by loading different libraries.
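For instance (purely a sketch, assuming a read-byte like the one above
or any equivalent), a Latin-1 library and a UTF-8 library could each
supply their own reader on top of the same byte primitive:

  ;; Latin-1: one byte is one character.
  (define (latin-1-read-char p)
    (let ((b (read-byte p)))
      (if (eof-object? b) b (integer->char b))))

  ;; UTF-8: the leading byte says how many continuation bytes follow.
  ;; (No checking for truncated or malformed sequences -- sketch only.)
  (define (utf-8-read-char p)
    (define (read-tail acc n)
      (if (zero? n)
          (integer->char acc)
          (read-tail (+ (* acc 64) (- (read-byte p) #x80)) (- n 1))))
    (let ((b (read-byte p)))
      (cond ((eof-object? b) b)
            ((< b #x80) (integer->char b))           ; 1-byte sequence
            ((< b #xE0) (read-tail (- b #xC0) 1))    ; 2-byte sequence
            ((< b #xF0) (read-tail (- b #xE0) 2))    ; 3-byte sequence
            (else       (read-tail (- b #xF0) 3))))) ; 4-byte sequence

Loading one library or the other is what decides which of these the
name read-char means.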
One issue: how much of a standard is BER? 12.5 percent
protocol lossage seems like a lot to me. I'd rather spend 1 bit
out of every 16 than 1 bit out of every 8 to carry the "continuing"
information.
Bits needed to encode integers of various "actual" lengths
(all figures are bits; "smaller" names the more compact encoding):

  actual length   8-bit BER   16-bit modified BER   smaller
      32-36           40              48             8-bit
      37-41           48              48             tie
      42-48           56              64             8-bit
      49-55           64              64             tie
      56-59           72              64             16-bit
      60-63           72              80             8-bit
      64-69           80              80             tie
      70-74           88              80             16-bit
      75-76           88              96             8-bit
      77-83           96              96             tie
      84-90          104              96             16-bit
      91-97          112             112             tie
As you can see, for small values 8-bit BER is more
efficient, but the difference between 8-bit and 16-bit BER breaks
even right around 64 bits, and once we hit 77 bits of real
length, 8-bit BER is never more efficient again.
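To put the comparison in code (a sketch only, using the simplest
possible accounting, which won't line up with every row of the table
above but shows the same crossover; write-byte is assumed to take a
value and a port):

  ;; Each unit spends one bit on "another unit follows", leaving
  ;; 7 payload bits per 8-bit unit and 15 per 16-bit unit.
  (define (units-needed payload-bits n-bits)
    (max 1 (quotient (+ n-bits payload-bits -1) payload-bits)))

  (define (ber-8-size n-bits)  (* 8  (units-needed 7  n-bits)))
  (define (ber-16-size n-bits) (* 16 (units-needed 15 n-bits)))

  ;; e.g. (ber-8-size 37) => 48    (ber-16-size 37) => 48
  ;;      (ber-8-size 64) => 80    (ber-16-size 64) => 80

  ;; Writer for the 8-bit case, assuming the common base-128 layout:
  ;; units most significant first, high bit set on all but the last.
  ;; n is assumed to be a nonnegative exact integer.
  (define (write-ber-8 n port)
    (let loop ((n (quotient n 128))
               (units (list (modulo n 128))))  ; last unit, high bit clear
      (if (zero? n)
          (for-each (lambda (u) (write-byte u port)) units)
          (loop (quotient n 128)
                (cons (+ 128 (modulo n 128)) units)))))

A 16-bit-unit writer would be the same shape, with 32768 in place of
128 and a two-byte write per unit.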
Since hardware increasingly supports 16-bit reads and
writes that are faster than 8-bit reads and writes, since
numeric formats up to around 64 bits are often supported by
special-purpose instructions, and since in the ranges where
we're forced to a BER-type representation we'll probably use
fewer bits with 16-bit BER, I think we should prefer a 16-bit
unit with a continuation bit over an 8-bit unit with a
continuation bit.
But the difference is of awfully small importance.
If there are existing tools out there that support the 8-bit
BER format, I'd say go with it.
Bear