Re: Names and primitives in SRFI 56 bear (18 Sep 2004 19:46 UTC)
On Sat, 18 Sep 2004, Hans Oesterholt-Dijkema wrote:

> Alex Shinn wrote:
>> Apart from further conflicting with possible binary/character port
>> distinctions,
>Hmm. I'm not sure I agree on that. Binary I/O simply means there's
>no interpretation given to the I/O; As I see it, the primitives
>to write and read provide the interpretation (see also my earlier
>e-mail about doors and what goes through them).

The thing is, "string", and even "character," is precisely what binary I/O does not do: an interpretation of binary data. I expect, from a proposal for binary I/O, to get the family of primitives I need to then go and *write* libraries that handle reading and writing strings and characters.

This is one of the points where there has been a 'castle built in the air' in R5RS; we assume the ability to read and write "characters", but have had no way of accessing what actual binary forms are read or written. As a result, the few binary-handling libraries we have (manipulating, for instance, executable files, graphics files, audio files, etc.) all rely on the implementation using particular encodings and character formats, which are in no way guaranteed by the standard.

Worse, as implementations move into the wide weird world of fully supporting Unicode, those vital utilities are becoming less portable, not more portable.

Worse still, Unicode has sufficiently complicated the representation of characters (by having lots of different encodings itself, as well as by increasing the number of standards that some application a Scheme program needs to interoperate with might be using) that the simple assumption that we can read and write "characters" without specifying their binary form has failed since R5RS was written.

Let me say that again, for emphasis. The conditions on which R5RS was predicated have broken. Our standard is now broken.

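To make the layering argument concrete, here is a minimal sketch, in Python rather than Scheme, purely as an illustration. It assumes a single byte-level primitive and builds a string reader on top of it. The names `read_u8` and `read_terminated_string` are hypothetical and are not taken from SRFI 56; the point is only that the same bytes yield different characters depending on which encoding the library layer chooses.

```python
import io

def read_u8(port):
    """Hypothetical binary primitive: next byte as an int, or None at EOF."""
    b = port.read(1)
    return b[0] if b else None

def read_terminated_string(port, encoding, terminator=0, limit=None):
    """Library-level routine built only on read_u8.

    Collects raw bytes up to a terminator byte (or an optional size limit),
    then applies an explicitly chosen character encoding.
    """
    raw = bytearray()
    while limit is None or len(raw) < limit:
        byte = read_u8(port)
        if byte is None or byte == terminator:
            break
        raw.append(byte)
    return raw.decode(encoding)

# The same five bytes before the NUL terminator, two interpretations.
data = b"caf\xc3\xa9\x00rest"
as_utf8 = read_terminated_string(io.BytesIO(data), "utf-8")
as_latin1 = read_terminated_string(io.BytesIO(data), "latin-1")
print(as_utf8)    # café
print(as_latin1)  # cafÃ©
```

The byte-level primitive never changes; only the interpretation layered on top of it does.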
R5RS makes sense, sort of, in a world where any environment could be assumed to have *some* character encoding so dominant as for character encoding to be a nonissue. By leaving such issues unspecified, R5RS left room for different choices to be made, which would produce sensible systems for "standalone" use in such environments. That is no longer the world in which we live.

By failing to specify any means of purely binary I/O, R5RS left portable Scheme unable to cope with a networked world rich in purely binary formats and a world where characters from different sources can be encountered in many different binary formats. And that is the world in which we live now.

This SRFI clearly aims to lay some foundation stones for building on that will hold in the current world. In order to do so, it *MUST* specify rigid, purely binary, I/O. Character I/O, being subject to interpretation and reinterpretation, is for a different layer, or, as Alex says,

>> this is beyond the scope of this SRFI. A general text
>> parsing library with procedures for reading delimited or terminated
>> strings with an optional size limit would be the right place for this.
>
>That's OK with me, but let's start with such an SRFI right away,
>because A binary i/o srfi without primitives for character strings
>seems to me a littlebit, ah how does one say that in english, "disabled?".

Character strings are not binary. They are characters. Binary is data without interpretation. Characters are an interpretation of binary data. These are different ideas.

We can sweep interpretations like FLOAT32 into "binary" I/O at this point, but only because of the efforts of the IEEE, which have provided an encoding for such interpretations that is so universal and dominant as to be a nonissue.
The encoding is sign-bit in the MSB, followed by 8 bits of exponent, followed by 23 bits of normalized mantissa; the exponent measures powers of two, not powers of ten or four or any of the other strange things we had to cope with fifteen years ago. The mantissa is normalized base-2, not BCD. All that craziness has (mostly) passed away and the world is a better place for its absence.

But the point is that if the IEEE standard were not now so universal as to be a nonissue, floating-point encoding would be a matter for a separate SRFI, too. And another point is that if we need to create a different binary floating-point encoding, for whatever reason, we're going to look to this SRFI's *BINARY* primitives to give us the basic tools to implement a way to read and write it. Encodings can pass out of use, however unlikely it seems; bits won't.

But character encoding is not, and never has been, universal. In fact, for the next couple of decades it's looking damned hairy. It's way too big and way too unstable to try to build into a foundation. Any character I/O SRFI is going to have to be built *ON* the routines that this SRFI is trying to provide. It is as futile to attempt one without first getting purely binary I/O as it is to attempt to build a castle without first laying foundation stones.

Bear
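As an illustration of the FLOAT32 layout described above (1 sign bit, 8 exponent bits, 23 mantissa bits), here is a hedged sketch, in Python rather than Scheme, of how such an interpretation could be built from raw bytes using integer operations alone. The function name `decode_float32_be` is hypothetical. Big-endian byte order is an assumption made here, since the message specifies only the bit layout. The exponent bias of 127 and the handling of subnormals, infinities and NaN follow IEEE 754 single precision.

```python
def decode_float32_be(raw):
    """Interpret 4 bytes (assumed big-endian) as an IEEE 754 single."""
    if len(raw) != 4:
        raise ValueError("need exactly 4 bytes")
    bits = int.from_bytes(raw, "big")
    sign = -1.0 if bits >> 31 else 1.0   # sign bit in the MSB
    exponent = (bits >> 23) & 0xFF       # next 8 bits, biased by 127
    mantissa = bits & 0x7FFFFF           # low 23 bits
    if exponent == 0xFF:
        # All-ones exponent encodes infinities and NaNs.
        return sign * float("inf") if mantissa == 0 else float("nan")
    if exponent == 0:
        # Zero and subnormals: no implicit leading 1 bit.
        return sign * mantissa * 2.0 ** -149
    # Normalized: implicit leading 1, exponent counts powers of two.
    return sign * (1 + mantissa / 2 ** 23) * 2.0 ** (exponent - 127)

print(decode_float32_be(bytes.fromhex("3f800000")))  # 1.0
print(decode_float32_be(bytes.fromhex("c0200000")))  # -2.5
```

A different floating-point encoding would need only a different function of this kind over the same byte-level reads.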