Re: SRFI 149 is inconsistent with Al Petrofsky's examples William D Clinger 14 Jul 2017 17:10 UTC

Thank you, Marc.

For reasons explained below, I think the SRFI 149 semantics is more
intuitive than the R6RS semantics with respect to nested ellipsis
patterns.  I think the R6RS semantics is more intuitive with respect
to reducing the rank of all pattern variables in a systematic way.
I think it's fairly likely that the incompatibility seen between the
R6RS semantics and the SRFI 149 semantics even when there aren't any
nested ellipsis patterns is an accidental artifact of how particular
macro expanders have been written, but I haven't confirmed that.

If the incompatibility between R6RS and SRFI 149 semantics is indeed
just an artifact of the SRFI 149 sample implementation, as I suspect,
then it should be possible for WG2 to adopt a semantics that differs
from both the R6RS and SRFI 149 semantics because it is superior to
both.  In
particular, I believe it should be possible to describe an intuitive
semantics for which both Petrofsky's foo2 example and your gen-extract*
example work as desired.

As seen below, all six of the R6RS systems I tested implement the R6RS
semantics.  At the moment, there don't appear to be any R7RS systems
that implement the SRFI 149 semantics for programs run from the command
line, but Kawa implements the SRFI 149 semantics in its R7RS REPL, and
several R7RS systems implement the SRFI 149 semantics for example 1 but
generate an error for example 2.  Larceny (in R7RS mode) implements the
R6RS semantics, and Foment implements the R6RS semantics for example 1
but generates an error for example 2.

None of the R5RS systems I tested fully supports either semantics.
Chicken and Larceny (in R5RS mode) implement the SRFI 149 semantics for
example 1 but generate an error for example 2; Gambit generates an error
for both.

Example 1:

(define-syntax foo2
  (syntax-rules ()
    ((foo2 (axis ...) (coordinate ...) ...)
     '(((axis coordinate) ...) ...))))

(foo2 (x y) (0 0) (0 3))

Example 2:

(define-syntax gen-extract*
  (syntax-rules ()
    ((gen-extract* (((x ...) generator) ...) . body)
     (let* ((x (generator)) ... ...) . body))))

(gen-extract* (((x y) +)
               ((p q) *))
  (list x y p q))

                Example 1                       Example 2
R7RS
  Chibi         (((x 0) (x 0)) ((y 0) (y 3)))   error
  Chicken       (((x 0) (x 0)) ((y 0) (y 3)))   error
  Cyclone                                       error
  Foment        (((x 0) (y 0)) ((x 0) (y 3)))   error
  Gauche        (((x 0) (x 0)) ((y 0) (y 3)))   error
  Kawa          error                           (0 0 1 1)
    at REPL     (((x 0) (x 0)) ((y 0) (y 3)))   (0 0 1 1)
  Larceny       (((x 0) (y 0)) ((x 0) (y 3)))   (0 1 0 1)
  Picrin        (((x 0) (x 0)) ((y 0) (y 3)))   error
  Sagittarius   error                           error
    at REPL     (((x 0) (x 0)) ((y 0) (y 3)))   error

R6RS
  Larceny       (((x 0) (y 0)) ((x 0) (y 3)))   (0 1 0 1)
  Petite Chez   (((x 0) (y 0)) ((x 0) (y 3)))   (0 1 0 1)
  Racket        (((x 0) (y 0)) ((x 0) (y 3)))   (0 1 0 1)
  Sagittarius   (((x 0) (y 0)) ((x 0) (y 3)))   (0 1 0 1)
  Vicare        (((x 0) (y 0)) ((x 0) (y 3)))   (0 1 0 1)
  Ypsilon       (((x 0) (y 0)) ((x 0) (y 3)))   (0 1 0 1)

R5RS
  Chicken       (((x 0) (x 0)) ((y 0) (y 3)))   error
  Gambit        error                           error
  Larceny       (((x 0) (x 0)) ((y 0) (y 3)))   error
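
The two distinct Example 1 results in the table can be reproduced on
ordinary lists.  The following is only a sketch: expand-foo2/r6rs and
expand-foo2/srfi-149 are hypothetical names, and the procedures model
the reported outputs, not the algorithm of any expander listed above.

```scheme
;; axes   stands for the matches of axis        (rank 1), e.g. (x y)
;; coords stands for the matches of coordinate  (rank 2), e.g. ((0 0) (0 3))

;; Models the result reported by the R6RS systems: the outer ellipsis
;; walks the rows of coords, and the inner ellipsis pairs each axis
;; with the corresponding element of the current row.
(define (expand-foo2/r6rs axes coords)
  (map (lambda (row)
         (map (lambda (a c) (list a c)) axes row))
       coords))

;; Models the result reported for SRFI 149: the outer ellipsis walks
;; axes and the rows of coords together, and the inner ellipsis walks
;; the current row while the current axis is held fixed.
(define (expand-foo2/srfi-149 axes coords)
  (map (lambda (a row)
         (map (lambda (c) (list a c)) row))
       axes coords))

(expand-foo2/r6rs     '(x y) '((0 0) (0 3)))
;; => (((x 0) (y 0)) ((x 0) (y 3)))
(expand-foo2/srfi-149 '(x y) '((0 0) (0 3)))
;; => (((x 0) (x 0)) ((y 0) (y 3)))
```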

Marc Nieper-Wißkirchen wrote:

> The gen-extract* example does not work with Larceny R7RS, but with Larceny
> R5RS.

According to my testing, the gen-extract* example works with Larceny in
R7RS mode (giving the result to be expected with the R6RS semantics) but
generates an error in R5RS mode:

    ERROR detected during macro expansion:
    Too few ellipses follow pattern variable in template

> This SRFI made a choice between the two semantics. The choice it made was
> in favor of the latter semantics (that is making the gen-extract* example
> work, not the foo2 example). It is spelled out in the SRFI as follows: "If
> a pattern variable in a subtemplate is followed by more instances of
> <ellipsis> than the subpattern in which it occurs is followed, the input
> elements by which it is replaced in the output are repeated for the
> innermost excess instances of <ellipsis>". In fact, this wording was
> suggested by Al Petrofsky in his post on this mailing list. In the
> gen-extract* example, the innermost occurrence of <ellipsis> in the
> template replicate a particular instance of (generator), while the outer
> occurrence iterates over the instances of (generator).
>
> SRFI 149 does not cite the gen-extract* macro to justify its choice (such
> a justification would have been arbitrary, anyway, because this SRFI could
> have cited the foo2 macro to justify the other choice). However, it gives
> an abstract reason in the rationale why the choice made by SRFI 149 was not
> an arbitrary one but one founded on theoretical considerations, namely
> functoriality:
>
> If ((_ . <pattern>) <template>) is a valid syntax rule of the
> syntax-rules-pattern language, the "lifted" rule ((_ <pattern> ...)
> (<template> ...)) becomes a valid and meaningful syntax-rule with the
> semantics of this SRFI.
>
> So, if we view a syntax rule as a function from, say, widgets to gadgets,
> I can easily lift this syntax rule to a function from list of widgets to
> list of gadgets with the semantics of the SRFI 149, that is the latter
> semantics from above.

Doesn't the R6RS semantics have the same lifting property?  The R6RS
semantics certainly delivers different results from the SRFI 149 semantics,
but I think both of those semantics support the lifted patterns, while
interpreting them differently.

In mathematics, there are often several distinct but useful functors
between two categories.  (I am not sure functors make sense here, because
you haven't described the categories that would be their domain and codomain,
but if functors do make sense I'm pretty sure more than one of them would be
useful.)

> So much for reviewing. As for options for the community to proceed, I do
> see the following:
>
> (1) Scheme implementers do not agree on one semantics, e.g. Larceny sticks
> to its semantics; Chibi, Kawa and Sagittarius stick to SRFI 149's semantics.
>
> This is probably the worst outcome because it would mean that one can
> neither write a portable "desirable" macro like foo2 nor gen-extract*. (Of
> course, it is possible to write macros with the same desired effect as foo2
> and gen-extract* using only the R7RS, but not as easy.)

This is the likely outcome of SRFI 149, at least for the short term.  Larceny
pretty much has to stick close to the R6RS semantics, because interoperability
between R6RS and R7RS code is one of Larceny's primary goals.

The R6RS semantics was prescribed by an RnRS report and is supported by all
viable implementations of the R6RS.  (That includes Sagittarius, whose R7RS
version of syntax-rules, exported by (scheme base), must be different from
its R6RS version, exported by (rnrs base).  More on this below.)

As a SRFI that proposes a new semantics that's incompatible with current
practice, SRFI 149 does not have the standing of a ratified standard that
has been implemented consistently by multiple implementations.  SRFIs are
optional.  As implementers, we all have to decide which SRFIs we want to
support.  No implementation supports all of them.  (Larceny probably comes
the closest; its next release will support at least 83 SRFIs in R7RS mode.)

> (2) Larceny abandons its semantics and joins the SRFI 149 camp. There is
> the risk of existing code that would break but such code can't be portable
> Scheme code.

The R6RS made it possible to write portable Scheme code, and I know for a
fact that portable Scheme code has been written to that standard.  You sound
as though you were unaware that R6RS section 11.19 requires syntax-rules to
allow patterns and templates with nested or repeated ellipses.

If the SRFI 149 semantics were to become a ratified and accepted standard,
Larceny would of course try to support it, even at the cost of making the
version of syntax-rules exported by (scheme base) incompatible with the
version exported by (rnrs base).  That is already true in Sagittarius, and
the world wouldn't come to an end if it were to happen in Larceny as well.

Indeed, the version of syntax-rules exported by both (scheme base) and by
(rnrs base) in Larceny is incompatible with the version used in Larceny's
R5RS mode.  We've logged about half a dozen bug reports concerning those
incompatibilities, but have mostly ignored them because we want to put our
effort into supporting the R7RS.  We are unlikely to enhance Larceny's R5RS
mode by adding syntax-rules features that go beyond the R5RS specification,
as SRFI 149 proposes.

> (3) The community agrees on that SRFI 149 made a suboptimal choice and
> that the former choice of semantics from above is a better one. In that
> case, a revision of SRFI 149 will have to be written. An argument against
> SRFI 149's choice should include a justification why the other choice is
> more naturally despite of SRFI 149's arguing on theoretical grounds and
> functoriality.

> There is one outermost ellipsis in the pattern of gen-extract*, following
> a subpattern containing both pattern variables x and generator, while there
> is no such outermost ellipsis in the pattern of foo2 that follows a
> subpattern containing both pattern variables axis and coordinate. For
> functoriality, only outermost ellipses as in gen-extract* matter.
>
> In gen-extract*, the patterns (x ...) and generator have a common ellipsis
> following them so it makes sense that generator is iterated whenever x is
> iterated twice. On the other hand, in foo2, the patterns axis and
> (coordinate ...) do not share a common ellipsis following them both, so it
> makes sense to choose a different ways for template expansion here,

It might help to state what you mean by "functoriality".  Both the R6RS
and SRFI 149 semantics lift, so lifting alone can't decide the issue.

From Saunders MacLane, Categories for the Working Mathematician, page 13:

    A functor is a morphism of categories.  In detail, for categories
    C and B a functor T: C -> B with domain C and codomain B consists
    of two suitably related functions: The object function T, which
    assigns to each object c of C an object Tc of B and the arrow
    function (also written T) which assigns to each arrow f: c -> c'
    of C an arrow Tf: Tc -> Tc' of B, in such a way that

        T(1_c) = 1_{Tc},     T(g∘f) = Tg ∘ Tf,              (1)

    the latter whenever the composite g∘f is defined in C.

Before I even try to guess what you mean by "functoriality", I'd want
to know what the categories C and B are in your use of that word.

Using specific examples and programming intuition instead of mathematical
concepts whose application to SRFI 149 is unclear to me, I'd say the
intuitive meaning of an ellipsis pattern of the form  <pat> ...  is that
it matches a part of the macro use iff that part of the use consists of
zero or more subparts, each of which matches <pat>.  I'd also say the
intuitive meaning of an ellipsis template of the form  <template> ...
is that it produces output obtained by replicating the <template> as
many times as the patterns had to be replicated to match those pattern
variables within <template> whose values were obtained by matching an
ellipsis pattern; there must be at least one such pattern variable.
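
For the uncontroversial single-level case, that intuition can be
checked with a small macro.  This is only an illustration; swap-pairs
is a hypothetical name that appears nowhere in SRFI 149 or this thread.

```scheme
;; The pattern (a b) ... matches zero or more two-element subforms.
;; The template (b a) ... is replicated once per match.
(define-syntax swap-pairs
  (syntax-rules ()
    ((_ (a b) ...)
     '((b a) ...))))

(swap-pairs (1 one) (2 two))   ; => ((one 1) (two 2))
(swap-pairs)                   ; => ()
```

Both a and b have rank 1 here and are followed by exactly one ellipsis
in the template, so the R6RS and SRFI 149 semantics have nothing to
disagree about.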

With that intuition, consider Example 1 above.  The "(axis ...)" pattern
matches (x y), and the "(coordinate ...) ..." must match the rest of the
use and does with "(coordinate ...)" matching first (0 0) and then (0 3).
The template is therefore to be rewritten with the axis pattern variable
ranging over (x y) and the coordinate pattern variable ranging first over
(0 0) and then over (0 3).  Thus

    '(((axis coordinate) ...) ...)

rewrites to

    '(((axis coordinate) ...)      ; with axis ranging over (x y) and
                                   ; with coordinate ranging over (0 0)
      ((axis coordinate) ...)      ; with axis ranging over (x y) and
                                   ; with coordinate ranging over (0 3)
      )

Rewriting axis first for no particular reason, that rewrites to

    '(((x coordinate) ...)      ; with coordinate ranging over (0 0)
      ((y coordinate) ...)      ; with coordinate ranging over (0 0)
      ((x coordinate) ...)      ; with coordinate ranging over (0 3)
      ((y coordinate) ...)      ; with coordinate ranging over (0 3)
      )

which rewrites to

    '(((x 0) (x 0))
      ((y 0) (y 0))
      ((x 0) (x 3))
      ((y 0) (y 3))
      )

If we instead rewrite coordinate first, we get

    '(((axis 0) ...)      ; with axis ranging over (x y)
      ((axis 0) ...)      ; with axis ranging over (x y)
      ((axis 0) ...)      ; with axis ranging over (x y)
      ((axis 3) ...)      ; with axis ranging over (x y)
      )

which rewrites to

    '(((x 0) (y 0))
      ((x 0) (y 0))
      ((x 0) (y 0))
      ((x 3) (y 3))
      )

Whether we get the R6RS result or the SRFI 149 result apparently
depends upon which of the two pattern variables is rewritten first.
We can imagine different rules for deciding that order, such as
"rewrite the leftmost pattern variable first" (which yields the
SRFI 149 result in this example) or "rewrite the highest ranking
pattern variable first" (which yields the R6RS result).  I would
not be terribly surprised if implementations are just making
arbitrary choices here and it's pure happenstance that all R6RS
systems deliver the R6RS result, but I think it's more likely that
the R6RS systems are rewriting the highest ranking pattern variable
first as part of an algorithm that systematically reduces the rank
of all pattern variables that appear within a subtemplate.

For Example 2, the "((x ...) generator) ..." subpattern matches
(((x y) +) ((p q) *)).  Intuitively, that means "((x ...) generator)"
matches ((x y) +) and then matches ((p q) *), so

    ((x (generator)) ... ...)

should rewrite to

    ((x (+)) ...    ; with x ranging over (x y)
     (x (*)) ...    ; with x ranging over (p q)
     )

which rewrites to

    ((x (+))
     (y (+))
     (p (*))
     (q (*))
     )

which yields the SRFI 149 result, not the R6RS result.
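
As Marc's quoted remark notes, the same effect can be had without
relying on either semantics.  One way, sketched here under the
hypothetical name gen-extract*/portable, is a recursive macro whose
template never follows a pattern variable with more ellipses than its
pattern does; each expansion step binds one group and recurs on the
rest.

```scheme
(define-syntax gen-extract*/portable
  (syntax-rules ()
    ;; No groups left: just evaluate the body.
    ((_ () . body)
     (let* () . body))
    ;; Bind every x of the first group to a fresh call of its
    ;; generator, then handle the remaining groups inside that scope.
    ;; generator has rank 0 here, so replicating it under one
    ;; ellipsis is ordinary syntax-rules behaviour.
    ((_ (((x ...) generator) . rest) . body)
     (let* ((x (generator)) ...)
       (gen-extract*/portable rest . body)))))

(gen-extract*/portable (((x y) +)
                        ((p q) *))
  (list x y p q))
;; => (0 0 1 1)
```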

Will