Re: More comments, and the ANTLR code is too complex

Alan Manuel Gloria 13 Jun 2013 04:14 UTC
Thanks for the reply, Mark!  We'll consider your suggestions.
Improving adoption for this syntax is an important goal for us; we
want others to at least try using it before judging the (de)merits of
the syntax, and having it implemented in a major Scheme implementation
(guile) would help tremendously.

Please wait a while; we'll think about how best to show a
"simple and easy way to add SRFI-110 on top of an SRFI-105/random Scheme
implementation".

On 6/13/13, Mark H Weaver <xxxxxx@netris.org> wrote:
> Hi David,
>
> "David A. Wheeler" <xxxxxx@dwheeler.com> writes:
>> Below is a first shot at breaking up it_expr, currently 1 long rule, into
>> 2 rules.
>> This could obviously be repeated to make more rules, each one simpler.
>> Not saying it's done, but would it help to break the current longer rules
>> into more but smaller rules?
>
> No.  This doesn't help at all, because it doesn't reduce the total
> complexity of the specification.  My concern is the amount of mental
> effort required to understand the precise specification.
>
> Part of the problem is that your specification is actually an
> _implementation_, which is made more complex by efficiency concerns.
> For example, constraining yourself to an LL(1) grammar probably rules
> out a more elegant presentation.
>
> Another big problem is the amount of redundancy in this grammar.  For
> example, the pattern "scomment hspace*" is repeated in many places.
> Sometimes it's a prefix wrapped in (...)*, and other times it's iterated
> over by tail recursion.  The pattern "COLLECTING hspace* collecting_tail
> hspace*" is also repeated in several places.  These redundancies make
> more work for the reader, and make me wonder "are all these actually the
> same, or are there slight differences?"
>
> I suspect that the key to simplifying this grammar (apart from moving
> away from ANTLR for purposes of the specification) is to choose a
> different set of non-terminals.
>
> Please take a look at section 7.1 of the R5RS (or the R7RS draft).
> Understanding that grammar is almost effortless, and there's almost no
> redundancy.  Now take a look at the specifications of SRFI-10, SRFI-30,
> and SRFI-38.  All of them are expressed as a list of modifications to
> the R5RS grammar.  That's the kind of thing I'd like to see in the
> SRFI-110 specification.
>
> One more nit while I'm on this subject: In the BNF conventions section,
> you write "a sweet-expression reader MUST act as if it preprocessed its
> input as follows", but as far as I can tell it's not actually possible
> to implement this as a preprocessor.  This "preprocessing" must be
> interleaved with the parser, because several syntactic elements affect the
> preprocessing.  For example, the <* and *> markers manipulate the
> preprocessor's stack, and yet you need a full parser to recognize those
> markers.  Also, if I understand correctly, indentation is only processed
> outside of n-expressions.
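
To illustrate the point above, here is a minimal sketch (in Python, purely for illustration; it is not taken from the SRFI-110 sample code) of an indentation pass that has to consult parser state. The name `indent_events` and the event names are hypothetical. Blank lines are simply skipped, and parenthesis depth is tracked by naive counting, which is wrong for strings and comments; that weakness is exactly why a real reader would need the full lexer here.

```python
def indent_events(lines):
    """Yield (kind, text) events, where kind is 'INDENT', 'DEDENT' or 'LINE'.

    Assumption for this sketch: indentation only matters while we are not
    inside an open parenthesis, so the indent stack is consulted only when
    the (naively counted) paren depth is zero.
    """
    stack = [0]   # indentation widths of the enclosing blocks
    depth = 0     # open parentheses carried over from earlier lines
    for line in lines:
        if not line.strip():
            continue  # simplification: blank lines are ignored
        if depth == 0:
            width = len(line) - len(line.lstrip(' '))
            if width > stack[-1]:
                stack.append(width)
                yield ('INDENT', '')
            while width < stack[-1]:
                stack.pop()
                yield ('DEDENT', '')
        yield ('LINE', line.strip())
        # Naive: a real reader must skip parens in strings and comments.
        depth += line.count('(') - line.count(')')
    while len(stack) > 1:
        stack.pop()
        yield ('DEDENT', '')
```

The lines inside `(if x ...)` in a multi-line parenthesized form produce no INDENT or DEDENT events, even though their leading whitespace changes. A pass that ran strictly before parsing could not know to skip them.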
>
> I also think that there needs to be a much simpler sample
> implementation: one which does not attempt to be fully featured
> (e.g. omit support for source location tracking), and which is not a
> fully self-contained reader, but is instead expressed in terms of
> existing procedures which are likely already present in an SRFI-105
> reader (or which could be easily created from existing code).
>
> In other words, you should help implementors understand how to add
> SRFI-110 to their existing readers with a minimal amount of code
> changes.  The resulting code needs to be as simple as reasonably
> possible.
>
> Here's one possible strategy: Assume the existence of an n-expression
> reader.  Now write a t-expression reader in terms of it, in the most
> elegant Scheme code possible.  It turns out this is not quite possible,
> but hopefully the problems can be patched up by assuming the existence
> of some other helpers, and/or by adding some functionality to the
> n-expression reader.
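
A rough sketch of the strategy described above, written in Python rather than Scheme only so that it is easy to run. Everything here is hypothetical: `read_n_exprs` is a toy stand-in for an existing n-expression reader (atoms and parenthesized lists only), and `read_t_exprs` layers indentation on top of it. It assumes well-formed, space-only indentation and leaves out the special markers, comments, and blank-line termination, which is where the extra helpers mentioned above would come in.

```python
def read_n_exprs(text):
    """Toy stand-in for an existing n-expression reader (hypothetical).

    Parses one line of whitespace-separated atoms and parenthesized lists
    into nested Python lists of strings.
    """
    tokens = text.replace('(', ' ( ').replace(')', ' ) ').split()

    def parse(i):
        items = []
        while i < len(tokens) and tokens[i] != ')':
            if tokens[i] == '(':
                sub, i = parse(i + 1)
                items.append(sub)
                i += 1  # skip the closing ')'
            else:
                items.append(tokens[i])
                i += 1
        return items, i

    items, _ = parse(0)
    return items


def read_t_exprs(lines, read_line=read_n_exprs):
    """Read indented lines into nested lists, delegating each line's
    contents to the supplied n-expression reader."""
    rows = [(len(l) - len(l.lstrip(' ')), l.strip()) for l in lines if l.strip()]

    def read_block(i, indent):
        result = []
        while i < len(rows) and rows[i][0] == indent:
            head = read_line(rows[i][1])
            i += 1
            children = []
            if i < len(rows) and rows[i][0] > indent:
                children, i = read_block(i, rows[i][0])
            if len(head) == 1 and not children:
                result.append(head[0])          # lone term stands for itself
            else:
                result.append(head + children)  # otherwise the line is a list
        return result, i

    if not rows:
        return []
    body, _ = read_block(0, rows[0][0])
    return body
```

For example, the four lines `define (fact n)`, `  if (zero? n)`, `    1`, `    * n (fact (- n 1))` read as a single nested list equivalent to `(define (fact n) (if (zero? n) 1 (* n (fact (- n 1)))))`. The indentation logic itself is about twenty lines; the open question is how much the omitted features add.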
>
> After our last email exchange, I spent some time thinking about this,
> and identified a few additional things you might need:
>
> * In order to recognize the special markers, you'll need either (1) a
>   way to "unread" characters, or (2) a way for the n-expression reader
>   to tell you that e.g. the symbol '<*' was "by itself" for purposes of
>   SRFI-110.
>
> * You might need a helper to read special comments without consuming the
>   following datum.
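
Option (1) above can be sketched as follows, again in Python for illustration only. `PushbackReader` and `try_marker` are hypothetical names, and "by itself" is assumed here to mean "followed by whitespace or end of input"; the SRFI-110 text may define it more precisely. The idea is to read ahead and push the characters back when the marker does not match, so the ordinary n-expression reader can then consume them as a normal symbol.

```python
class PushbackReader:
    """Character source with an unread() operation (hypothetical helper)."""

    def __init__(self, text):
        self._text = text
        self._pos = 0
        self._pushed = []  # stack of characters that were unread

    def read(self):
        """Return the next character, or '' at end of input."""
        if self._pushed:
            return self._pushed.pop()
        if self._pos >= len(self._text):
            return ''
        ch = self._text[self._pos]
        self._pos += 1
        return ch

    def unread(self, ch):
        """Push a character back; unreading '' (end of input) is a no-op."""
        if ch:
            self._pushed.append(ch)


def try_marker(reader, marker='<*'):
    """Consume `marker` and return True if it stands by itself, i.e. is
    followed by whitespace or end of input (an assumption of this sketch).
    Otherwise restore every character read and return False."""
    seen = []
    for expected in marker:
        ch = reader.read()
        seen.append(ch)
        if ch != expected:
            break
    else:
        nxt = reader.read()
        reader.unread(nxt)  # the delimiter is never consumed
        if nxt == '' or nxt.isspace():
            return True
    for ch in reversed(seen):
        reader.unread(ch)
    return False
```

With input `<* a` the marker is consumed and the next read returns the space; with `<*x` nothing is consumed and the reader is left positioned at `<`. Option (2) would move this check inside the n-expression reader instead.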
>
> I'm sorry that I cannot be more complete in my analysis of what needs to
> be done, but my time (and motivation) is limited.  Reformulating this
> code will be a lot of work, but I suspect that adoption will be very low
> unless you can show implementors how to add SRFI-110 easily and with a
> small amount of code.
>
>      Regards,
>        Mark