Splitting SRFI-10 into three? oleg@xxxxxx (13 Oct 1999 19:59 UTC)
Re: Splitting SRFI-10 into three? Richard Kelsey (13 Oct 1999 21:25 UTC)

I do agree with what you said in the last two messages; yet I want to push a bit farther. Let me try to restate your suggestions, to make sure I understand what you were saying.

Suppose a future SRFI proposes a new distinct type of Scheme values. For example, it could introduce "path values" or a "void value" (which is implicitly present in many Scheme systems). It is nice to have an external representation for such datatypes, so that they can be written and read. Therefore, the formal syntax of Scheme ("Lexical structure" and "External representations", R5RS 7.1.1 and 7.1.2) has to be extended correspondingly. The most common form of such an extension is

    "#" <discriminator> <other-char>*

where <discriminator> is one character, which should be different from the characters '(', '\', 'i', 'e', 'b', 'o', 'x', 'd' and possibly 'f', '!' and 't'. As we see, there isn't much choice left for the <discriminator>, especially if we want to make it mnemonic.

SRFI-10 proposes another, formal and generic, way of extending external representations of Scheme values, namely, via a

    #,(<symbol> <datum>*)

form. A particular SRFI-X that introduces a new datatype simply has to pick an appropriate <symbol> and decide upon the <datum> arguments. SRFI-X does not need to extend the external syntax any further, nor does it need to fight for the remaining characters that may be used as the <discriminator>. Of course, the implementation of SRFI-X should support writing values of the new datatype in the #,() form, and reading them back. The exact way of doing that is up to SRFI-X or its implementation; it is not SRFI-10's concern. SRFI-10 shouldn't even care how exactly a particular #,(<symbol> <datum>*) form is interpreted.

Am I right in representing your position?

I contend, however, that the #,() notation is useful not only for new datatypes. Existing Scheme datatypes can benefit from a #,() notation as well.
For example, consider the following external forms:

    #,(pi)  #,(epsilon)  #,(Infinity)  #,(NaN)

They all represent (inexact) numbers. They are quite useful for doing (IEEE floating point) arithmetic.

Furthermore, consider "variable constants", for example, #,(os-type). When the reader scans the code, it replaces #,(os-type) with a literal symbol, e.g., Solaris or HP-UX. Because this "binding" occurs very early, the corresponding symbol can be analyzed in various macros, syntax and other special forms. Another useful notation of the same kind is #,(srfi-features), which can be replaced by the list of the feature identifiers the current implementation supports.

You're saying that SRFI-10 has nothing to implement. Yet it extends the syntax of Scheme, and thus requests that a Scheme reader at least recognize the #,() form. Currently it does not.

> Every Scheme implementation already has this: it implements SRFI-10
> with an empty set of symbolic tags.

I'm afraid I don't see this. When a non-SRFI-10 compliant reader comes across a #,(foo) form, it immediately throws an error. When a SRFI-10 compliant reader sees the same form and does not know what to make of the 'foo' tag, the reader generally reports a different kind of error. Actually, there is no requirement to _report_ an error at all. The reader may simply replace the #,(foo) form with something (in)appropriate, for example, #f.

Speaking of errors, I think I have to make clearer what a reader should do when it encounters a #,(foo bar ...) form with an unknown tag 'foo'. I think the reader should read this form as a list (foo bar ...) (as if the #, characters were not present). If the #,(foo bar ...) form was a part of Scheme code, then foo might be bound to a syntax rule, which thus gets activated. Or 'foo' may be bound to a procedure. In any case, the reader simply shifts the burden of dealing with an unknown #,() form to its caller.
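To make the proposed behaviour concrete, here is a minimal sketch in Python (chosen only so the example is easy to run; it is not Scheme and not part of SRFI-10). It models an already-parsed #,(tag arg ...) form as a Python list and a Scheme symbol as a Python string. The table READ_TIME_TAGS, the function read_hash_comma, and the values returned for each tag are all hypothetical names and assumptions made for illustration.

```python
import math
import sys

# Hypothetical tag table. The tag names come from the message above;
# the values each tag produces are assumptions made for illustration.
READ_TIME_TAGS = {
    "pi": lambda: math.pi,
    "epsilon": lambda: sys.float_info.epsilon,
    "Infinity": lambda: math.inf,
    "NaN": lambda: math.nan,
    # Stand-in value: a real reader would ask the host system.
    "os-type": lambda: "Solaris",
}


def read_hash_comma(form, tags=READ_TIME_TAGS):
    """Resolve an already-parsed #,(tag arg ...) form, given as a list.

    Symbols are modeled as Python strings. A known tag is replaced by
    the value its handler returns; an unknown tag yields the plain list,
    as if the "#," characters had not been present.
    """
    if not form:
        return form
    tag, args = form[0], form[1:]
    handler = tags.get(tag) if isinstance(tag, str) else None
    if handler is None:
        # Unknown tag: shift the burden to the caller.
        return form
    return handler(*args)
```

With this sketch, read_hash_comma(["os-type"]) yields the stand-in symbol, while read_hash_comma(["foo", "bar"]) simply comes back as the list ["foo", "bar"], leaving it to the caller to decide whether foo is a macro, a procedure, or an error.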
> What is being proposed is that the following rule be added to the
> grammar for external representations (see section 7.1.2 in R5RS):
>   <compound datum> --> #,(<symbol> <datum>*)

As stated earlier, a #,() form may as well denote a simple <datum>. It appears therefore that the grammar for external representations should be changed as follows:

    <datum> ---> <simple datum> | <compound datum> | <hash-comma-datum>
    <hash-comma-datum> ---> "#,(" <symbol> <datum>* ")"

Furthermore, the production for <token> in R5RS Sec. 7.1.1 should be extended to read

    <token> ---> <identifier> | ... | #,(

The only problem with the above grammar is that it outlaws forms like #,(#,(foo) bar). Somehow I want to keep such forms legal. The only way to achieve that is to define <hash-comma-datum> as

    <hash-comma-datum> ---> "#,(" <datum> <datum>* ")"

A note has to be attached to this grammar stating that if a reader fails to make sense of a <hash-comma-datum> (because the first datum after the "#,(" token is not a recognizable identifier), the reader should return the <hash-comma-datum> as a compound datum (<datum> <datum>*) (that is, pretend that the "#,(" token was a "(" token).

I guess you were saying that SRFI-10 was way overloaded. It attempts to define a new extensible syntax, proposes an implementation that uses this syntax for read-time applications, and even mentions conditional compilation. It seems to make sense therefore to split SRFI-10 into three, one for each of these subjects. The first one should be a meta-SRFI, S(RF)2I-0 in your notation. The read-time application should be a SRFI-X that implements S(RF)2I-0 and proposes to interpret the #,() form as a read-time application. Every SRFI is required to have an implementation (thus an implementation of a meta-SRFI should be a regular SRFI).

I wonder what SRFI editors may say to all this...
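P.S. The revised grammar, with "#,(" as a token and any <datum> allowed in head position, can be sketched as a toy recursive-descent reader. This is again Python rather than Scheme, purely for illustration, and it handles only identifiers and parenthesized forms. The names read_text and read_datum and the tags argument are hypothetical. Letting the value of an inner #,() form serve as the outer tag is just one possible reading of nested forms, assumed here.

```python
import re

# Tokens: "#,(" (tried first), "(", ")", or a run of other characters.
TOKEN = re.compile(r"#,\(|\(|\)|[^\s()]+")


def read_datum(tokens, pos, tags):
    """Read one datum starting at tokens[pos]; return (datum, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == ")":
        raise SyntaxError("unexpected )")
    if tok not in ("(", "#,("):
        return tok, pos + 1          # an atom, kept as a string
    items = []
    pos += 1
    while pos < len(tokens) and tokens[pos] != ")":
        item, pos = read_datum(tokens, pos, tags)
        items.append(item)
    if pos >= len(tokens):
        raise SyntaxError("missing )")
    pos += 1                          # skip the closing ")"
    if tok == "#,(":
        if not items:
            raise SyntaxError("#,( needs at least one datum")
        head = items[0]
        if isinstance(head, str) and head in tags:
            return tags[head](*items[1:]), pos
        # Head is not a recognizable identifier: fall through and
        # behave as if the "#,(" token had been a "(" token.
    return items, pos


def read_text(text, tags):
    """Read exactly one datum from text using the given tag table."""
    tokens = TOKEN.findall(text)
    datum, pos = read_datum(tokens, 0, tags)
    if pos != len(tokens):
        raise SyntaxError("trailing tokens after datum")
    return datum
```

With an empty tag table, read_text("#,(#,(foo) bar)", {}) returns the nested list [["foo"], "bar"], which is the compound-datum fallback described in the note to the grammar.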