More comments, and the ANTLR code is too complex Mark H Weaver (29 May 2013 07:04 UTC)
I've made another attempt to understand SRFI-110 clearly enough to
implement it, and once again I've failed to do so before losing
patience.
This is the first time I've attempted to read and understand the ANTLR
grammar, and I'm sorry to say that I'm very unhappy with it. If it
cannot be made simpler and more easily comprehensible than it is now,
then I'm unlikely to implement SRFI-110 in Guile. I suspect other
implementors would feel similarly.
In the interest of encouraging implementors, I'd recommend making a
serious effort to rewrite the grammar to be as conceptually simple and
clear as possible.
Here are some specific comments about the ANTLR code:
* "BLOCK_COMMENT : '#|' // This is #| ... #|"
That should be "#| ... |#"
* EOL_SEQUENCE is never used. EOL is used instead, even though it is
not defined.
* APOSW, QUASIQUOTEW, UNQUOTEW, and UNQUOTE_SPLICEW are not defined.
* Inconsistent syntax is used within {} in the ANTLR. In most places
standard Scheme syntax is used, but in 'collecting_tail', the syntax
is more like C.
* Why are the action rules in 'n_expr' simply expressions that refer to
values such as '$n1', but the action rules of 'collecting_tail' are
instead assignment statements that refer to values such as '$more.v'?
* Why is there special handling of (FF | VT)+ EOL ?
* What does 'isperiodp' do exactly? What if the datum really is "." or
the symbol whose name is a single period (written #{.}# in Guile)?
* The non-terminals 'body' and 'it_expr' use the symbol 'same' even
though the text implies that no extra symbol is generated by the
preprocessing step in that case. Where does 'same' come from?
And here are some comments about the tutorial:
* "Scheme’s datum comments (#;datum) comment out the next neoteric
expression, not the next sweet expression (and please don’t follow the
semicolon with whitespace)."
I often put "#;" on the preceding line, which you're now asking me
not to do. What is the purpose of this request? Also, "#;" becomes
much less useful if it cannot comment out an entire sweet expression.
Perhaps "#;" should have a similar rule as the traditional
abbreviations: if it is followed by whitespace, then the following
/sweet expression/ is ignored, otherwise the following /neoteric
expression/ is ignored. What do you think?
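To make the proposal concrete, here is a rough Python sketch of the dispatch I have in mind. The function name and the two return labels are made up for illustration; nothing here comes from the SRFI-110 text or from any existing reader.

```python
def datum_comment_scope(following: str) -> str:
    """Decide what '#;' would skip under the proposed rule.

    `following` is the input text immediately after the '#;' token.
    This is a hypothetical helper, not part of SRFI-110.
    """
    # End of input, or whitespace (including a newline) right after '#;':
    # skip the whole following sweet-expression.
    if following == "" or following[0].isspace():
        return "sweet-expression"
    # Anything else directly after '#;': skip only the next neoteric-expression.
    return "neoteric-expression"
```

With this rule, "#;" alone on a line would comment out the entire indented form that follows, while "#;foo(x)" would drop only foo(x).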
* I'd like to see a few more examples for improper lists, such as:
f
a .
b
and:
f
a b
. c
* In the tutorial, I found the examples of $ (SUBLIST) a bit confusing:
a b $ c d ==> (a b (c d))
a b $ c d e f $ g ==> (a b (c d e f g))
; Not (a b (c d e f (g)))
This leaves me uncertain of whether the second case is somehow
caused by two $'s on one line, or because there's only one item
after the $. I'd like to see an example like "a b $ c" or
"a b $ c d e $ f g" to clarify.
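For what it's worth, here is my current reading of the single-line behaviour as a small Python sketch. It assumes that everything after a $ is read as one expression, and that an expression consisting of a single item is just that item rather than a one-element list. It handles only whitespace-separated atoms on one line, and the function name is my own. If this reading is wrong, that is all the more reason to add the extra examples.

```python
def read_line(tokens):
    """Read one line of atoms, treating '$' as SUBLIST (my interpretation)."""
    if "$" in tokens:
        i = tokens.index("$")          # split at the first '$'
        head = tokens[:i]
        rest = read_line(tokens[i + 1:])  # the right-hand side is one expression
        return head + [rest]
    if len(tokens) == 1:
        return tokens[0]               # a lone item stands for itself
    return list(tokens)


print(read_line("a b $ c d".split()))
print(read_line("a b $ c d e f $ g".split()))
print(read_line("a b $ c".split()))
print(read_line("a b $ c d e $ f g".split()))
```

Under these assumptions "a b $ c" reads as (a b c) and "a b $ c d e $ f g" reads as (a b (c d e (f g))), which suggests the second tutorial example comes out flat only because a single item follows the last $.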
* "A sweet-expression reader MUST support three modes: indentation
processing, enclosed, and initial indent."
[...]
"A marker MUST only have its special meaning when indentation
processing is enabled,"
This sounds as if "*>" MUST not be recognized, because the reader
will be in "enclosed" mode at that point, no?
* "2. If top is the empty string and the indentation length is nonzero,
symbol INITIAL_INDENT is generated and the reader changes to initial
indent mode. When an end-of-line sequence is reached the mode changes
back to indentation processing."
If the reader was in "enclosed" mode, then presumably the mode
should not change back to indentation processing, right?
* "1. If an end-of-line sequence immediately follows the indentation and
the indentation length is nonzero:
a. If the indentation contains “!”, it is ignored; an
implementation MUST consume the end-of-line sequence and start
applying these rules again, from the beginning, with the next
line.
b. If the indentation does not contain “!”, it is considered a
line with no characters (thus indentation has zero length) and
the rest of these rules are applied."
I vaguely recall that the distinction here was going to be removed
as a simplification of the rules. Was that idea scrapped?
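As I understand the quoted rule, it amounts to the following three-way split for a line that contains nothing but indentation. This is only a sketch of my reading; the function name and the result labels are invented for illustration.

```python
def classify_indent_only_line(indent: str) -> str:
    """Classify a line whose indentation is immediately followed by end-of-line."""
    if indent == "":
        # Rule 1 only covers nonzero-length indentation.
        return "rule-does-not-apply"
    if "!" in indent:
        # Case (a): consume the end-of-line and restart with the next line.
        return "ignored"
    # Case (b): treated as a line with no characters (zero-length indentation).
    return "empty-line"
```

So a line of spaces ends the current expression like a blank line would, while a line such as "  !  " is skipped entirely. That is the distinction I thought was going away.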
* "A marker MUST only have its special meaning when indentation
processing is enabled, it is preceded by indentation or hspace, it is
followed by an hspace or end-of-line, and when it starts with the
character shown (e.g., neither |$| nor '$ contains a marker)."
The last clause here, "when it starts with the character shown", is
poorly worded IMO, and redundant with the requirement that "it is
preceded by indentation or hspace".
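To illustrate the redundancy, here is a sketch that checks only the "preceded by" and "followed by" conditions. It assumes indentation processing is already enabled, that hspace means space or tab, that indentation may also contain "!", and that the end of the string stands in for end-of-line. The names are mine, not the specification's.

```python
HSPACE = " \t"          # assumed horizontal-space characters
INDENT_CHARS = " \t!"   # assumed indentation characters


def is_marker_at(line: str, start: int, marker: str = "$") -> bool:
    """Check whether `marker` at `line[start]` satisfies the two context conditions."""
    end = start + len(marker)
    if line[start:end] != marker:
        return False
    before = line[:start]
    # Preceded by indentation (possibly empty) or by an hspace character.
    preceded_ok = all(c in INDENT_CHARS for c in before) or before[-1] in HSPACE
    # Followed by an hspace character or by end-of-line.
    followed_ok = end == len(line) or line[end] in HSPACE
    return preceded_ok and followed_ok
```

With just these two conditions, the $ in |$| and in '$ is already rejected because the character before it is neither indentation nor hspace, so the "starts with the character shown" clause adds nothing.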
Regards,
Mark