More comments, and the ANTLR code is too complex Mark H Weaver (29 May 2013 07:04 UTC)
I've made another attempt to understand SRFI-110 clearly enough to
implement it, and once again I've failed to do so before losing patience.

This is the first time I've attempted to read and understand the ANTLR
grammar, and I'm sorry to say that I'm very unhappy with it. If it cannot
be made simpler and more easily comprehensible than it is now, then I'm
unlikely to implement SRFI-110 in Guile. I suspect other implementors
would feel similarly.

In the interest of encouraging implementors, I'd recommend making a
serious effort to rewrite the grammar to be as conceptually simple and
clear as possible.

Here are some specific comments about the ANTLR code:

* "BLOCK_COMMENT : '#|' // This is #| ... #|"
  That should be "#| ... |#"

* EOL_SEQUENCE is never used. EOL is used instead, even though it is
  not defined.

* APOSW, QUASIQUOTEW, UNQUOTEW, and UNQUOTE_SPLICEW are not defined.

* Inconsistent syntax is used within {} in the ANTLR. In most places
  standard Scheme syntax is used, but in 'collecting_tail', the syntax
  is more like C.

* Why are the action rules in 'n_expr' simply expressions that refer to
  values such as '$n1', but the action rules of 'collecting_tail' are
  instead assignment statements that refer to values such as '$more.v'?

* Why is there special handling of (FF | VT)+ EOL?

* What does 'isperiodp' do exactly? What if the datum really is "." or
  the symbol whose name is a single period (written #{.}# in Guile)?

* The non-terminals 'body' and 'it_expr' use the symbol 'same' even
  though the text implies that no extra symbol is generated by the
  preprocessing step in that case. Where does 'same' come from?

And here are some comments about the tutorial:

* "Scheme’s datum comments (#;datum) comment out the next neoteric
  expression, not the next sweet expression (and please don’t follow
  the semicolon with whitespace)."

  I often put "#;" on the preceding line, which you're now asking me
  not to do. What is the purpose of this request?

  Also, "#;" becomes much less useful if it cannot comment out an entire
  sweet expression. Perhaps "#;" should follow a rule similar to the one
  for the traditional abbreviations: if it is followed by whitespace,
  then the following /sweet expression/ is ignored, otherwise the
  following /neoteric expression/ is ignored. What do you think?

* I'd like to see a few more examples for improper lists, such as:

    f a . b

  and:

    f a b . c

* In the tutorial, I found the examples of $ (SUBLIST) a bit confusing:

    a b $ c d          ==>  (a b (c d))
    a b $ c d e f $ g  ==>  (a b (c d e f g))  ; Not (a b (c d e f (g)))

  This leaves me uncertain of whether the second case is somehow caused
  by two $'s on one line, or because there's only one item after the $.
  I'd like to see an example like "a b $ c" or "a b $ c d e $ f g" to
  clarify.

* "A sweet-expression reader MUST support three modes: indentation
  processing, enclosed, and initial indent."
  [...]
  "A marker MUST only have its special meaning when indentation
  processing is enabled,"

  This sounds as if "*>" MUST not be recognized, because the reader will
  be in "enclosed" mode at that point, no?

* "2. If top is the empty string and the indentation length is nonzero,
  symbol INITIAL_INDENT is generated and the reader changes to initial
  indent mode. When an end-of-line sequence is reached the mode changes
  back to indentation processing."

  If the reader was in "enclosed" mode, then presumably the mode should
  not change back to indentation processing, right?

* "1. If an end-of-line sequence immediately follows the indentation
  and the indentation length is nonzero:
    a. If the indentation contains “!”, it is ignored; an
       implementation MUST consume the end-of-line sequence and start
       applying these rules again, from the beginning, with the next
       line.
    b. If the indentation does not contain “!”, it is considered a line
       with no characters (thus indentation has zero length) and the
       rest of these rules are applied."

  I vaguely recall that the distinction here was going to be removed as
  a simplification of the rules. Was that idea scrapped?

* "A marker MUST only have its special meaning when indentation
  processing is enabled, it is preceded by indentation or hspace, it is
  followed by an hspace or end-of-line, and when it starts with the
  character shown (e.g., neither |$| nor '$ contains a marker)."

  The last clause here, "when it starts with the character shown", is
  poorly worded IMO, and redundant with the requirement that "it is
  preceded by indentation or hspace".

Regards,
Mark
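
The two $ (SUBLIST) results quoted from the tutorial are consistent with one simple reading: everything to the right of a $ is read as a line in its own right and becomes the last element of the list on the left, while a line consisting of a single item stands for that item rather than a one-element list. The sketch below encodes only that reading for a single line of whitespace-separated tokens. It is an illustration, not SRFI-110's grammar or its reference implementation; the names `read_line` and `show` are hypothetical, and indentation, child lines, and a $ at end of line are deliberately left out.

```python
def read_line(tokens):
    """Read one line of tokens into a nested list under the assumed $ rule."""
    if not tokens:
        # A '$' at end of line involves child lines, which this sketch ignores.
        raise ValueError("empty line or trailing '$' is outside this sketch")
    if "$" in tokens:
        i = tokens.index("$")
        # Assumption: the right-hand side is read as its own line and
        # becomes the final element of the left-hand list.
        return tokens[:i] + [read_line(tokens[i + 1:])]
    # Assumption: a line holding a single item stands for that item itself.
    return tokens[0] if len(tokens) == 1 else list(tokens)


def show(x):
    """Render a nested list in s-expression notation."""
    if isinstance(x, list):
        return "(" + " ".join(show(i) for i in x) + ")"
    return x


for line in ["a b $ c d", "a b $ c d e f $ g", "a b $ c", "a b $ c d e $ f g"]:
    print(line, "==>", show(read_line(line.split())))
```

Under these assumptions the first two lines reproduce the tutorial's results, and the two requested examples come out as (a b c) and (a b (c d e (f g))). That would suggest the second tutorial case is explained by the single item after the final $, not by having two $'s on one line.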