Splitting SRFI-10 into three?

Splitting SRFI-10 into three? oleg@xxxxxx 13 Oct 1999 19:58 UTC
	I indeed agree with what you said in the last two messages;
yet I want to push a tad bit farther.

	Let me try to restate your suggestions, to make sure I know
what you were saying.

	Suppose a future SRFI proposes a new distinct type of Scheme
values.  For example, it can introduce "path values" or a "void
value" (which is implicitly present in many Scheme systems). It's nice
to have an external representation for these datatypes, so they can be
written and read. Therefore, the formal syntax of Scheme ("Lexical
structure" and "External representations", R5RS 7.1.1 and 7.1.2) has
to be extended correspondingly. The most common form of such an
extension is
	"#" <discriminator> <other-char>*
where <discriminator> is one character, which must differ from the
characters already taken: '(', '\', 'i', 'e', 'b', 'o', 'x', 'd', 't'
and 'f', and possibly '!'. As we see, there isn't much choice left for
the <discriminator>, especially if we want to make it mnemonic.

	SRFI-10 proposes another, formal and generic, way of extending
external representations of Scheme values, namely, via a #,(<symbol>
<datum>*) form. A particular SRFI-X that introduces a new datatype
should simply pick an appropriate <symbol> and decide upon the
<datum> arguments. The SRFI-X does not need to extend the external
syntax any further, nor does it need to fight for the remaining
characters that may be used as the <discriminator>. Of course, the
implementation of SRFI-X should support writing values of the new
datatype in the #,() form, and reading them back. The exact way of doing
that is up to SRFI-X or its implementation; it is not SRFI-10's
concern. SRFI-10 shouldn't even care how exactly a particular
#,(<symbol> <datum>*) form is interpreted.

	Am I right in representing your position?

	I contend however that the #,() notation is useful not only for
new datatypes. Existing Scheme datatypes can benefit from
#,() specification as well. For example, consider the following
external forms:
	#,(pi) #,(epsilon) #,(Infinity) #,(NaN)

They all represent (inexact) numbers. They are quite useful for doing
(IEEE floating-point) arithmetic.

	Furthermore, consider "variable constants", for example,
#,(os-type). When the reader scans the code, it replaces #,(os-type)
with a literal symbol, e.g., Solaris or HP-UX. Because this "binding"
occurs very early, the corresponding symbol can be analyzed in various
macros, syntax and other special forms. Another useful notation of the
same kind is #,(srfi-features), which can be replaced by the list of
the feature identifiers the current implementation supports.
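
To make the idea of read-time constants concrete, here is a minimal
sketch in Python (not Scheme, and not part of SRFI-10). It assumes the
reader keeps a table from tag names to zero-or-more-argument
constructors; the table name READ_TIME_TAGS, the function expand, and
the use of sys.platform to stand in for os-type are all hypothetical.

```python
import math
import sys

# Hypothetical tag table; SRFI-10 does not prescribe these names or values.
READ_TIME_TAGS = {
    "pi": lambda: math.pi,
    "epsilon": lambda: sys.float_info.epsilon,
    "Infinity": lambda: math.inf,
    "NaN": lambda: math.nan,
    # Stand-in for #,(os-type): whatever the host reports at read time.
    "os-type": lambda: sys.platform,
}

def expand(tag, *args):
    """Return the value that would replace #,(tag args...) at read time."""
    return READ_TIME_TAGS[tag](*args)

# The value is fixed when the text is read, before any macro sees it.
area_of_unit_circle = expand("pi") * 1.0 ** 2
```

Because the substitution happens in the reader, later stages only ever
see the resulting literal, which is the property the paragraph above
relies on.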

	You're saying that SRFI-10 has nothing to implement. Yet it
extends the syntax of Scheme, and thus requires that a Scheme reader
at least recognize the #,() form. Current readers do not.

> Every Scheme implementation already has this: it implements SRFI-10
> with an empty set of symbolic tags.

I'm afraid I don't see this. When a non-SRFI-10-compliant reader comes
across a #,(foo) form, it immediately throws an error. When a
SRFI-10-compliant reader sees the same form and does not know what to
make of the 'foo' tag, the reader generally reports a different kind of
error. Actually, there is no requirement to _report_ an error at
all. The reader may simply replace the #,(foo) form with something
(in)appropriate, for example, #f.

	Speaking of errors, I think I have to make clearer what a
reader should do when it encounters a #,(foo bar...) form with an
unknown tag 'foo'. I think the reader should read this form as a list
(foo bar ...) (as if the #, characters were not present). If the
#,(foo bar...) was a part of Scheme code, then foo might be bound to a
syntax rule, which thus gets activated. Or 'foo' may be bound to a
procedure. In any case, the reader simply shifts the burden of dealing
with an unknown #,() form to its caller.
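
The choices for an unknown tag discussed so far (signal an error,
substitute #f, or hand back the plain list) can be compared in a small
Python sketch. It works on an already-parsed tag and argument list; the
function resolve, its on_unknown parameter, and the sample tag "point"
are hypothetical names used only for illustration.

```python
def resolve(tag, args, constructors, on_unknown="list"):
    """Interpret #,(tag args...) given a table of known constructors."""
    if on_unknown not in ("list", "false", "error"):
        raise ValueError("unknown policy: " + on_unknown)
    if tag in constructors:
        return constructors[tag](*args)
    if on_unknown == "error":
        raise LookupError("unknown #,() tag: " + tag)
    if on_unknown == "false":
        return False  # plays the role of #f
    # Default: behave as if the "#," prefix had not been there.
    return [tag, *args]

# Hypothetical table with a single known tag.
KNOWN = {"point": lambda x, y: (x, y)}

known_value = resolve("point", [1, 2], KNOWN)
unknown_value = resolve("foo", ["bar"], KNOWN)
```

With the default policy the caller receives the list (foo bar) and
decides for itself what to do with it, which is the behaviour argued
for above.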

> What is being proposed is that the following rule be added to the
> grammar for external representations (see section 7.1.2 in R5RS):
>  <compound datum>  -->  #,(<symbol> <datum>*)

As stated earlier, a #,() form may as well denote a simple <datum>.

It appears therefore that the grammar for external representations
should be changed as follows:

<datum> ---> <simple datum> | <compound datum> | <hash-comma-datum>

<hash-comma-datum> ---> "#,(" <symbol> <datum>* ")"

Furthermore, the production for <token> in R5RS Sec. 7.1.1 should be
extended to read
	<token> ---> <identifier> | ... | #,(

The only problem with the above grammar is that it outlaws forms like
#,(#,(foo) bar). Somehow I want to keep them. The only way to achieve
that is to define <hash-comma-datum> as
	<hash-comma-datum> ---> "#,(" <datum> <datum>* ")"

A note has to be attached to this grammar stating that if a reader fails to
make sense of <hash-comma-datum> (because the first datum after the
"#,(" token is not a recognizable identifier), the reader should
return <hash-comma-datum> as a compound datum (<datum> <datum>*)
(that is, pretend that the "#,(" token was a "(" token).
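
The relaxed grammar and its fallback note can be sketched as a toy
reader in Python. This is only an illustration under simplifying
assumptions: symbols are Python strings, only integers and plain
decimals are numbers, and the table CONSTRUCTORS with its tags "pi",
"add" and "pick-tag" is hypothetical.

```python
import re

# Tokens: the proposed "#,(" token, a parenthesis, or an atom.
TOKEN = re.compile(r"#,\(|[()]|[^\s()]+")

# Hypothetical tag table; none of these tags come from SRFI-10.
CONSTRUCTORS = {
    "pi": lambda: 3.141592653589793,
    "add": lambda *xs: sum(xs),
    "pick-tag": lambda: "add",  # yields a symbol usable as a tag
}

def parse_atom(tok):
    """Integers and decimals become numbers; the rest stay symbols."""
    if re.fullmatch(r"[+-]?\d+", tok):
        return int(tok)
    if re.fullmatch(r"[+-]?\d+\.\d*", tok):
        return float(tok)
    return tok

def read_datum(tokens, pos=0):
    """Read one datum at tokens[pos]; return (value, next position)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == ")":
        raise SyntaxError("unexpected )")
    if tok not in ("(", "#,("):
        return parse_atom(tok), pos + 1
    items, pos = [], pos + 1
    while pos < len(tokens) and tokens[pos] != ")":
        item, pos = read_datum(tokens, pos)
        items.append(item)
    if pos >= len(tokens):
        raise SyntaxError("missing )")
    pos += 1  # consume ")"
    if tok == "#,(" and items:
        head = items[0]  # any datum, already read (and expanded)
        if isinstance(head, str) and head in CONSTRUCTORS:
            return CONSTRUCTORS[head](*items[1:]), pos
    # Plain list, or "#,(" with an unrecognized head: treat as "(".
    return items, pos

def read(text):
    value, _ = read_datum(TOKEN.findall(text))
    return value
```

Here read("#,(#,(foo) bar)") falls back to the nested list ((foo) bar),
while read("#,(#,(pick-tag) 1 2)") first reads the inner form, obtains
the symbol add, and then applies it to 1 and 2.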

	I guess you were saying that SRFI-10 was way overloaded. It
attempts to define a new extensible syntax, propose an implementation
that uses this syntax for read-time-applications, and even mentions
conditional compilation. It therefore seems to make sense to split
SRFI-10 into three, one for each of these subjects. The first one should
be a meta-SRFI, S(RF)2I-0 in your notation. The read-time application
should be a SRFI-X that implements S(RF)2I-0 and proposes to interpret
the #,() form as a read-time application. Every SRFI is required
to have an implementation (thus an implementation of a meta-SRFI should
be a regular SRFI). I wonder what the SRFI editors may say to all this...