Re: More comments, and the ANTLR code is too complex
Alan Manuel Gloria
13 Jun 2013 04:14 UTC

Thanks for the reply Mark!

We'll consider your suggestions. Improving adoption for this syntax is
an important goal for us; we want others to at least try using it
before judging the (de)merits of the syntax, and having it implemented
in a major Scheme implementation (guile) would help tremendously.

Please wait a while, we'll think about how best to go about showing a
"simple and easy way to add SRFI-110 on top of SRFI-105/random Scheme
implementation".

On 6/13/13, Mark H Weaver <xxxxxx@netris.org> wrote:
> Hi David,
>
> "David A. Wheeler" <xxxxxx@dwheeler.com> writes:
>> Below is a first shot at breaking up it_expr, currently 1 long rule, into
>> 2 rules.
>> This could obviously be repeated to make more rules, each one simpler.
>> Not saying it's done, but would it help to break the current longer rules
>> into more but smaller rules?
>
> No. This doesn't help at all, because it doesn't reduce the total
> complexity of the specification. My concern is the amount of mental
> effort required to understand the precise specification.
>
> Part of the problem is that your specification is actually an
> _implementation_, which is made more complex by efficiency concerns.
> For example, constraining yourself to an LL(1) grammar probably rules
> out a more elegant presentation.
>
> Another big problem is the amount of redundancy in this grammar. For
> example, the pattern "scomment hspace*" is repeated in many places.
> Sometimes it's a prefix wrapped in (...)*, and other times it's iterated
> over by tail recursion. The pattern "COLLECTING hspace* collecting_tail
> hspace*" is also repeated in several places. These redundancies make
> more work for the reader, and make me wonder "are all these actually the
> same, or are there slight differences?"
>
> I suspect that the key to simplifying this grammar (apart from moving
> away from ANTLR for purposes of the specification) is to choose a
> different set of non-terminals.
>
> Please take a look at section 7.1 of the R5RS (or the R7RS draft).
> Understanding that grammar is almost effortless, and there's almost no
> redundancy. Now take a look at the specifications of SRFI-10, SRFI-30,
> and SRFI-38. All of them are expressed as a list of modifications to
> the R5RS grammar. That's the kind of thing I'd like to see in the
> SRFI-110 specification.
>
> One more nit while I'm on this subject: In the BNF conventions section,
> you write "a sweet-expression reader MUST act as if it preprocessed its
> input as follows", but as far as I can tell it's not actually possible
> to implement this as a preprocessor. This "preprocessing" must be
> interleaved with parser, because several syntactic elements affect the
> preprocessing. For example, the <* and *> markers manipulate the
> preprocessor's stack, and yet you need a full parser to recognize those
> markers. Also, if I understand correctly, indentation is only processed
> outside of n-expressions.
>
> I also think that there needs to be a much simpler sample
> implementation: one which does not attempt to be fully featured
> (e.g. omit support for source location tracking), and which is not a
> fully self-contained reader, but is instead expressed in terms of
> existing procedures which are likely already present in an SRFI-105
> reader (or which could be easily created from existing code).
>
> In other words, you should help implementors understand how to add
> SRFI-110 to their existing readers with a minimal amount of code
> changes. The resulting code needs to be as simple as reasonably
> possible.
>
> Here's one possible strategy: Assume the existence of an n-expression
> reader. Now write a t-expression reader in terms of it, in the most
> elegant Scheme code possible. It turns out this is not quite possible,
> but hopefully the problems can be patched up by assuming the existence
> of some other helpers, and/or by adding some functionality to the
> n-expression reader.
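
For readers following the thread, the strategy quoted above (assume an n-expression reader exists, then build the t-expression reader on top of it) might look roughly like the sketch below. This is not code from SRFI-110 or from anyone in this thread. It is written in Python rather than Scheme purely for brevity, `read_n_exprs` is a hypothetical stand-in that only splits on whitespace, and `read_t_exprs` is an illustrative name. The sketch handles plain space indentation only: no `<*` / `*>` markers, no comments, no tabs, and none of the other special cases a real SRFI-110 reader needs.

```python
def read_n_exprs(text):
    """Hypothetical stand-in for a real n-expression reader: split on whitespace."""
    return text.split()


def read_t_exprs(source, read_line=read_n_exprs):
    """Read indentation-structured text into nested lists (simplified).

    Assumed rule: the n-expressions on a line form the head of a list, and
    more-indented lines below it become further elements of that list. A
    line with a single n-expression and no children is that expression itself.
    """
    # Keep (indent, text) pairs for non-blank lines; indentation is spaces only.
    lines = [(len(l) - len(l.lstrip(" ")), l.strip())
             for l in source.splitlines() if l.strip()]

    def block(pos, indent):
        """Parse sibling lines at exactly `indent`, starting at index `pos`."""
        items = []
        while pos < len(lines) and lines[pos][0] == indent:
            head = read_line(lines[pos][1])
            pos += 1
            children = []
            if pos < len(lines) and lines[pos][0] > indent:
                children, pos = block(pos, lines[pos][0])
            if len(head) == 1 and not children:
                items.append(head[0])
            else:
                items.append(head + children)
        return items, pos

    if not lines:
        return []
    result, pos = block(0, lines[0][0])
    if pos != len(lines):
        raise ValueError("inconsistent indentation")
    return result
```

With the whitespace-splitting stand-in, the input "define x", followed by the indented lines "foo" and "bar baz", reads as the single nested list `['define', 'x', 'foo', ['bar', 'baz']]`. Everything this sketch leaves out is where the "not quite possible" caveat in the quoted paragraph applies.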
>
> After our last email exchange, I spent some time thinking about this,
> and identified a few additional things you might need:
>
> * In order to recognize the special markers, you'll need either (1) a
>   way to "unread" characters, or (2) a way for the n-expression reader
>   to tell you that e.g. the symbol '<*' was "by itself" for purposes of
>   SRFI-110.
>
> * You might need a helper to read special comments without consuming the
>   following datum.
>
> I'm sorry that I cannot be more complete in my analysis of what needs to
> be done, but my time (and motivation) is limited. Reformulating this
> code will be a lot of work, but I suspect that adoption will be very low
> unless you can show implementors how to add SRFI-110 easily and with a
> small amount of code.
>
> Regards,
> Mark
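
As an illustration of option (1) in the first bullet above, here is a minimal sketch of an "unread" facility and a marker check built on it. It is in Python rather than Scheme for brevity, and every name in it (`PushbackReader`, `read_char`, `unread`, `try_read_marker`) is hypothetical. "By itself" is taken here to mean "followed by whitespace or end of input", which is an assumption made for this sketch and not the SRFI-110 definition.

```python
import io


class PushbackReader:
    """Character source with arbitrary-depth unread, wrapping a text stream."""

    def __init__(self, stream):
        self._stream = stream
        self._pushed = []  # characters to return before reading the stream again

    def read_char(self):
        """Return the next character, or "" at end of input."""
        if self._pushed:
            return self._pushed.pop()
        return self._stream.read(1)

    def unread(self, chars):
        """Push `chars` back so its first character is the next one read."""
        self._pushed.extend(reversed(chars))


def try_read_marker(reader, marker="<*"):
    """Consume `marker` only if it stands alone (assumed: whitespace or EOF follows).

    Returns True if the marker was consumed; otherwise restores the input
    and returns False so the ordinary n-expression reader can take over.
    """
    seen = ""
    for expected in marker:
        ch = reader.read_char()
        seen += ch
        if ch != expected:
            reader.unread(seen)
            return False
    following = reader.read_char()
    if following == "" or following.isspace():
        reader.unread(following)  # leave the delimiter for the caller
        return True
    reader.unread(seen + following)
    return False
```

A reader would call `try_read_marker` before handing control to the n-expression reader. On failure the input is left exactly as it was, so a symbol such as `<*foo` is still read normally.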