Re: datum comments of sweet-expressions Alan Manuel Gloria 12 Jul 2013 02:14 UTC

On 7/11/13, David A. Wheeler <xxxxxx@dwheeler.com> wrote:
> On 29 May 2013 02:31:25 -0400, Mark H Weaver posted a long set of comments.
> One recommendation was to support datum comments of sweet-expressions
> (#; + whitespace).  The idea makes sense, and I did anticipate this.
> However, the obvious ways imply some additional trickiness in grammar
> and implementation.  Here's how I'm thinking about tackling this, but
> if anyone has a better idea, *please* speak up!!
>
>
> The current SRFI-110 says:
> "Scheme’s datum comments (#;datum) comment out the next neoteric
> expression, not the next sweet expression (and please don’t follow the
> semicolon with whitespace)."
>
> Mark H Weaver recommends:
> "I often put "#;" on the preceding line, which you're now asking me
> not to do. What is the purpose of this request? Also, "#;" becomes
> much less useful if it cannot comment out an entire sweet expression.
> Perhaps "#;" should have a similar rule as the traditional
> abbreviations: if it is followed by whitespace, then the following
> /sweet expression/ is ignored, otherwise the following /neoteric
> expression/ is ignored. What do you think?"
>
> I have *definitely* thought about this.  Indeed, I wrote the text
> "don't follow the semicolon with whitespace" so that supporting
> datum comments of sweet-expressions could be added as a future addition.
>
> But if we add this as a *requirement*
> to SRFI-110, then the grammar rules and sample implementation
> have to be modified to handle it.  For example:
> a b
>   c
>   #; e
>      f
>   g
> => (a b c g)
>
>
> THE CHALLENGE: Properly supporting this requires properly supporting
> datum comments of a sweet-expression if it is the *last* item, e.g.:
> fee fie
>   foe
>   fum
>   #; blood
>     Englishman
> => (fee fie foe fum)
>
> Handling *last* items turns out to be trickier to do, and I think
> that trickiness has nothing to do with whether or not the grammar is LL(1).
> Currently there isn't a good way to handle lines that produce no value.
> In particular, the "it_expr" rule *must* return a datum.
> In the case of lines that begin with "#!sweet", the grammar rules
> recurse so they can have something to return.  This recursion
> is why the GROUP_SPLIT rule is so complicated.  That approach
> won't work here, because the datum comment might be the last group
> at that indent level.
>
> So for the moment, let's say that we'll try to fix up the existing
> LL(1) rules instead of rewriting the grammar rules in a completely
> different notation.  Even if we do that, I want to do that as a separate
> stage, and I think we should explore simplification further first.
> So...  how could we do this?
>
> One approach would be to fiddle with all the grammar rules that
> invoke it_expr.  However, I think that would be really ugly and involve
> a lot of repetition in the rules.  The problem is that the calling
> rules each have to handle identification of the situation AND
> invoke a different action rule for that case.  Ugh.
>
> I think a better approach would be to modify the
> key production "it_expr" so that it can return an "EMPTY" value,
> distinct from a valid datum like (), that indicates
> "no value at all".  This would require some of the action rules
> to handle "EMPTY" values.  I think that could be handled by
> a few tweaked procedures, e.g., some "cons" can be replaced with "econs"
> (aka "empty-handling cons"):
> (define (econs x y)
>   (cond
>     ((eq? y EMPTY) x)
>     ((eq? x EMPTY) y)
>     (#t (cons x y))))
>
> If we do this, one side-effect is that the GROUP_SPLIT rules could
> probably become much simpler.  We'd no longer need to recurse deeply,
> because there'd be a way to signal that we saw an empty result.
>
> Thoughts?  Comments?  Is there a better way I'm not seeing?

Haha more nasty tagging values hahahaha!  We never seem to get rid of them!

; unique tagging value
(define EMPTY (cons '() '()))

Basically, our previous (before SRFI-110) implementations made use of
lots of these objects.  There's even a dangling comment, "special tag
to denote comment return from hash-processing", which no longer
describes anything at all, the special tag having been removed.

This is problematic for Scheme implementations that support some kind
of extension for dispatching on "#".  Although I guess that's the
problem of the implementation.  Such # extensions are cute but make
life hard for us indentation-formatting guys.

That said, our old implementations of sweet-expressions used such
unique nasty tagging values, so I don't see why we can't use them
again if it greatly simplifies our code.
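
As a rough sketch of how the tag and the quoted econs could fit
together (this is not the sample implementation; collect-items is a
made-up helper, used here only for illustration), something like the
following should drop a commented-out last item while leaving real
data such as () and (()) alone:

```scheme
; Sketch only.  EMPTY and econs are copied from the messages above;
; collect-items is hypothetical and not part of any SRFI-110 code.

; unique tagging value: a freshly allocated pair is eq? only to itself
(define EMPTY (cons '() '()))

; empty-handling cons, as proposed in the quoted message
(define (econs x y)
  (cond
    ((eq? y EMPTY) x)
    ((eq? x EMPTY) y)
    (#t (cons x y))))

; Hypothetical helper: build a list from child results,
; skipping any child that produced "no value at all".
(define (collect-items items)
  (if (null? items)
      '()
      (econs (car items) (collect-items (cdr items)))))

; The "#; blood Englishman" case: the last child yields EMPTY.
(collect-items (list 'fee 'fie 'foe 'fum EMPTY))
; => (fee fie foe fum)

; Real data that merely looks like the tag is kept, because the
; comparison is eq? (identity), not equal? (structure).
(collect-items (list 'a '() (list '()) EMPTY 'g))
; => (a () (()) g)
```

The comparison against the tag has to stay eq?, since the tag is
structurally just (()) and equal? would confuse it with a legitimate
datum.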

Sincerely,
AmkG