Re: SRFI 149 is inconsistent with Al Petrofsky's examples

Show/hide message thread
SRFI 149 is inconsistent with Al Petrofsky's examples William D Clinger (13 Jul 2017 12:42 UTC)
Re: SRFI 149 is inconsistent with Al Petrofsky's examples Marc Nieper-Wißkirchen (14 Jul 2017 09:44 UTC)
Re: SRFI 149 is inconsistent with Al Petrofsky's examples Marc Nieper-Wißkirchen (14 Jul 2017 10:20 UTC)
Re: SRFI 149 is inconsistent with Al Petrofsky's examples William D Clinger (14 Jul 2017 17:10 UTC)
Re: SRFI 149 is inconsistent with Al Petrofsky's examples Marc Nieper-Wißkirchen (14 Jul 2017 20:01 UTC)
Re: SRFI 149 is inconsistent with Al Petrofsky's examples William D Clinger (15 Jul 2017 00:55 UTC)
Re: SRFI 149 is inconsistent with Al Petrofsky's examples Marc Nieper-Wißkirchen (15 Jul 2017 10:23 UTC)
Re: SRFI 149 is inconsistent with Al Petrofsky's examples William D Clinger (15 Jul 2017 11:17 UTC)
Re: SRFI 149 is inconsistent with Al Petrofsky's examples Marc Nieper-Wißkirchen (15 Jul 2017 12:26 UTC)
Re: SRFI 149 is inconsistent with Al Petrofsky's examples Arthur A. Gleckler (15 Jul 2017 18:47 UTC)
Re: SRFI 149 is inconsistent with Al Petrofsky's examples Marc Nieper-Wißkirchen (16 Jul 2017 13:10 UTC)
Re: SRFI 149 is inconsistent with Al Petrofsky's examples John Cowan (16 Jul 2017 17:53 UTC)
Re: SRFI 149 is inconsistent with Al Petrofsky's examples Marc Nieper-WiÃkirchen 14 Jul 2017 20:01 UTC
Am 14.07.2017 um 19:10 schrieb William D Clinger:

> Thank you, Marc.
And thank you, Will, for bringing all this to my attention.

> For reasons explained below, I think the SRFI 149 semantics is more
> intuitive than the R6RS semantics with respect to nested ellipsis
> patterns.  I think the R6RS semantics is more intuitive with respect
> to reducing the rank of all pattern variables in a systematic way.
> I think it's fairly likely that the incompatibility seen between the
> R6RS semantics and the SRFI 149 semantics even when there aren't any
> nested ellipsis patterns is likely to be an accidental artifact of how
> particular macro expanders have been written, but I haven't confirmed
> that.
Thanks for referring to R6RS. This SRFI was based on what I found in
R7RS (and thus R5RS). Thus, so far, I was unfamiliar with the semantics
of R6RS insofar as it extends the R5RS semantics.

The R6RS says
(http://www.r6rs.org/final/html/r6rs-lib/r6rs-lib-Z-H-13.html#node_sec_12.4):
"If a pattern variable is followed by more ellipses in the subtemplate
than in the associated subpattern, the input form is replicated as
necessary. The subtemplate must contain at least one pattern variable
from a subpattern followed by an ellipsis, and for at least one such
pattern variable, the subtemplate must be followed by exactly as many
ellipses as the subpattern in which the pattern variable appears.
(Otherwise, the expander would not be able to determine how many times
the subform should be repeated in the output.)"

So far, I haven't found further specification of what it means that "the
input form is replicated". It seems to mean that the input form that is
matched by the subpattern is replicated as a whole thus "simulating"
extra ellipses following the subpattern.

In other words, the difference between the R6RS semantics and SRFI 149
semantics can be paraphrased in saying that the former semantics
replicates on the input side, while the latter semantics replicates on
the output side.

> If the incompatibility between R6RS and SRFI 149 semantics is indeed
> just an artifact of the SRFI 149 sample implementation, as I suspect,
> then it should be possible for WG2 to adopt a semantics that disagrees
> with both the R6RS and SRFI 149 because it is superior to both.  In
> particular, I believe it should be possible to describe an intuitive
> semantics for which both Petrofsky's foo2 example and your gen-extract*
> example work as desired.
That sounds interesting.
> As seen below, all six of the R6RS systems I tested implement the R6RS
> semantics.  At the moment, there don't appear to be any R7RS systems
> that implement the SRFI 149 semantics for programs run from the command
> line, but Kawa implements the SRFI 149 semantics in its R7RS REPL, and
> several R7RS systems implement the SRFI 149 semantics for example 1 but
> generate an error for example 2.  Larceny (in R7RS mode) implements the
> R6RS semantics, and Foment implements the R6RS semantics for example 1
> but generates an error for example 2.
>
> None of the R5RS systems I tested support either semantics.  Chicken
> and Larceny (in R5RS mode) implement the SRFI 149 semantics for example
> 1 but generate an error for example 2.
The problem with example 2 are the consecutive ellipses in the template
(the other extension of this SRFI). With a helper macro, gen-extract*
could be rewritten so that the output ellipses are nested.

(define-syntax gen-extract*
  (syntax-rules ()
    ((gen-extract* (((x ...) generator) ...) . body)
     (%let* (((x (generator)) ...) ...) () . body))))

(define-syntax %let*
  (syntax-rules ()
    ((%let* () bindings . body)
     (let bindings . body))
    ((%let* (bindings1 bindings2 ...) (binding ...) . body)
     (%let* (bindings2 ...) (binding ... . bindings1) . body))))

This should, hopefully, remove the errors below:
>
>                 Example 1                       Example 2
> R7RS
>   Chibi         (((x 0) (x 0)) ((y 0) (y 3)))   error
>   Chicken       (((x 0) (x 0)) ((y 0) (y 3)))   error
>   Cyclone                                       error
>   Foment        (((x 0) (y 0)) ((x 0) (y 3)))   error
>   Gauche        (((x 0) (x 0)) ((y 0) (y 3))) 	error
>   Kawa 	      	error                           (0 0 1 1)
>     at REPL   	(((x 0) (x 0)) ((y 0) (y 3))) 	(0 0 1 1)
>   Larceny     	(((x 0) (y 0)) ((x 0) (y 3))) 	(0 1 0 1)
>   Picrin      	(((x 0) (x 0)) ((y 0) (y 3))) 	error
>   Sagittarius 	error 	      	      	      	error
>     at REPL   	(((x 0) (x 0)) ((y 0) (y 3))) 	error
>
> R6RS
>   Larceny       (((x 0) (y 0)) ((x 0) (y 3)))   (0 1 0 1)
>   Petite Chez   (((x 0) (y 0)) ((x 0) (y 3)))   (0 1 0 1)
>   Racket        (((x 0) (y 0)) ((x 0) (y 3)))   (0 1 0 1)
>   Sagittarius   (((x 0) (y 0)) ((x 0) (y 3)))   (0 1 0 1)
>   Vicare        (((x 0) (y 0)) ((x 0) (y 3)))   (0 1 0 1)
>   Ypsilon       (((x 0) (y 0)) ((x 0) (y 3)))   (0 1 0 1)
>
> R5RS
>   Chicken       (((x 0) (x 0)) ((y 0) (y 3)))   error
>   Gambit        error                           error
>   Larceny       (((x 0) (x 0)) ((y 0) (y 3)))   error

I am wondering, however, why Chibi gives an error because I chose
Chibi's implementation as the sample implementation for this SRFI, and
my previous tests showed that Chibi implemented SRFI 149's semantics. I
will have to investigate further, but this is not relevant for the
current discussion.

>
> Marc Nieper-Wißkirchen wrote:
>
>> The gen-extract* example does not work with Larceny R7RS, but with Larceny
>> R5RS.
> According to my testing, the gen-extract* works with Larceny in R7RS
> mode (giving the result to be expected with the R6RS semantics) but
I meant: Does not give the wanted (e.g. SRFI 149-semantics) result.

> generates an error in R5RS mode:
>
>     ERROR detected during macro expansion:
>     Too few ellipses follow pattern variable in template
This is also due to the consecutive ellipses. The extended example with
the helper macro works.

I have also made a test with Scheme48. Scheme48 implements the semantics
of SRFI 149.
>> This SRFI made a choice between the two semantics. The choice it made was
>> in favor of the latter semantics (that is making the gen-extract* example
>> work, not the foo2 example). It is spelled out in the SRFI as follows: "If
>> a pattern variable in a subtemplate is followed by more instances of
>> <ellipsis> than the subpattern in which it occurs is followed, the input
>> elements by which it is replaced in the output are repeated for the
>> innermost excess instances of <ellipsis>". In fact, this wording was
>> suggested by Al Petrofsky in his post on this mailing list. In the
>> gen-extract* example, the innermost occurrence of <ellipsis> in the
>> template replicate a particular instance of (generator), while the outer
>> occurrence iterates over the instances of (generator).
>>
>> SRFI 149 does not cite the gen-extract* macro to justify its choice (such
>> a justification would have been arbitrary, anyway, because this SRFI could
>> have cited the foo2 macro to justify the other choice). However, it gives
>> an abstract reason in the rationale why the choice made by SRFI 149 was not
>> an arbitrary one but one founded on theoretical considerations, namely
>> functoriality:
>>
>> If ((_ . <pattern>) <template>) is a valid syntax rule of the
>> syntax-rules-pattern language, the "lifted" rule ((_ <pattern> ...)
>> (<template> ...)) becomes a valid and meaningful syntax-rule with the
>> semantics of this SRFI.
>>
>> So, if we view a syntax rule as a function from, say, widgets to gadgets,
>> I can easily lift this syntax rule to a function from list of widgets to
>> list of gadgets with the semantics of the SRFI 149, that is the latter
>> semantics from above.
> Doesn't the R6RS semantics have the same lifting property?  The R6RS
> semantics certainly delivers different results from the SRFI 149 semantics,
> but I think both of those semantics support the lifted patterns, while
> interpreting them differently.
>
> In mathematics, there are often several distinct but useful functors
> between two categories.  (I am not sure functors make sense here, because
> you haven't described the categories that would be their domain and codomain,
> but if functors do make sense I'm pretty sure more than one of them would be
> useful.)
I am going to say something about this below.
>> So much for reviewing. As for options for the community to proceed, I do
>> see the following:
>>
>> (1) Scheme implementers do not agree on one semantics, e.g. Larceny sticks
>> to its semantics; Chibi, Kawa and Sagittarius stick to SRFI 149's semantics.
>>
>> This is probably the worst outcome because it would mean that one can
>> neither write a portable "desirable" macro like foo2 nor gen-extract*. (Of
>> course, it is possible to write macros with the same desired effect as foo2
>> and gen-extract* using only the R7RS, but not as easy.)
> This is the likely outcome of SRFI 149, at least for the short term.  Larceny
> pretty much has to stick close to the R6RS semantics, because interoperability
> between R6RS and R7RS code is one of Larceny's primary goals.
>
> The R6RS semantics was prescribed by an RnRS report and is supported by all
> viable implementations of the R6RS.  (That includes Sagittarius, whose R7RS
> version of syntax-rules, exported by (scheme base), must be different from
> its R6RS version, exported by (rnrs base).  More on this below.)
>
> As a SRFI that proposes a new semantics that's incompatible with current
> practice, SRFI 149 does not have the standing of a ratified standard that
> has been implemented consistently by multiple implementations.  SRFIs are
> optional.  As implementers, we all have to decide which SRFIs we want to
> support.  No implementation supports all of them.  (Larceny probably comes
> the closest; its next release will support at least 83 SRFIs in R7RS mode.)
As I said above, I wasn't aware of the incompatibility with the R6RS. In
principle, coexisting could be possible because syntax-rules can behave
differently when imported from different libraries, e.g. Larceny could
make a difference between syntax-rules imported from (scheme base) or
(srfi 149) or (rnrs syntax-case (6)).

(That said, SRFI 149 says that a system supporting it should export the
same syntax-rules from (scheme base) and (srfi 149). But this still
leaves room for (rnrs syntax-case (6)).)

Whether this would be a good resolution, is another question.

>> (2) Larceny abandons its semantics and joins the SRFI 149 camp. There is
>> the risk of existing code that would break but such code can't be portable
>> Scheme code.
> The R6RS made it possible to write portable Scheme code, and I know for a
> fact that portable Scheme code has been written to that standard.  You sound
> as though you were unaware that R6RS section 11.19 requires syntax-rules to
> allow patterns and templates with nested or repeated ellipses.
Yes, this was my ignorance.

> If the SRFI 149 semantics were to become a ratified and accepted standard,
> Larceny would of course try to support it, even at the cost of making the
> version of syntax-rules exported by (scheme base) incompatible with the
> version exported by (rnrs base).  That is already true in Sagittarius, and
> the world wouldn't come to an end if it were to happen in Larceny as well.
>
> Indeed, the version of syntax-rules exported by both (scheme base) and by
> (rnrs base) in Larceny is incompatible with the version used in Larceny's
> R5RS mode.  We've logged about half a dozen bug reports concerning those
> incompatibilities, but have mostly ignored them because we want to put our
> effort into supporting the R7RS.  We are unlikely to enhance Larceny's R5RS
> mode by adding syntax-rules features that go beyond the R5RS specification,
> as SRFI 149 proposes.
This gives us at least a hint that it may confuse people or stress
implementers if the semantics of the R6RS and a future version of the
R7RS diverge too much unnecessarily.
>> (3) The community agrees on that SRFI 149 made a suboptimal choice and
>> that the former choice of semantics from above is a better one. In that
>> case, a revision of SRFI 149 will have to be written. An argument against
>> SRFI 149's choice should include a justification why the other choice is
>> more naturally despite of SRFI 149's arguing on theoretical grounds and
>> functoriality.
>> There is one outermost ellipsis in the pattern of gen-extract*, following
>> a subpattern containing both pattern variables x and generator, while there
>> is no such outermost ellipsis in the pattern of foo2 that follows a
>> subpattern containing both pattern variables axis and coordinate. For
>> functoriality, only outermost ellipses as in gen-extract* matter.
>>
>> In gen-extract*, the patterns (x ...) and generator have a common ellipsis
>> following them so it makes sense that generator is iterated whenever x is
>> iterated twice. On the other hand, in foo2, the patterns axis and
>> (coordinate ...) do not share a common ellipsis following them both, so it
>> makes sense to choose a different ways for template expansion here,
> It might help to state what you mean by "functoriality".  Both the R6RS
> and SRFI 149 semantics lift, so lifting alone can't decide the issue.
>
> From Saunders MacLane, Categories for the Working Mathematician, page 13:
>
>     A functor is a morphism of categories.  In detail, for categories
>     C and B a functor T: C -> B with domain C and codomain B consists
>     of two suitably related functions: The object function T, which
>     assigns to each object c of C an object Tc of B and the arrow
>     function (also written T) which assigns to each arrow f: c -> c'
>     of C an arrow Tf: Tc -> Tc' of B, in such a way that
>
>         T(1_c) = 1_{Tc},     T(g∘f) = Tg ∘ Tf,              (1)
>
>     the latter whenever the composite g∘f is defined in C.
>
> Before I even try to guess what you mean by "functoriality", I'd want
> to know what the categories C and B are in your use of that word.
My category C is the category whose objects are patterns P and whose
morphisms are macro transformers f: P -> Q. My functor T: C -> C is the
endofunctor that maps a pattern P to the pattern (P ...) and that wants
to map a macro transformer f: P -> Q to "the" macro transformer (f ...):
(P ...) => (Q ...) that (intuitively) transform each instance of P in
the list on the left into the corresponding instance of Q in the list on
the right.

With the SRFI 149 semantics, I can say what "the" macro transformer (f
...) is:

Say f: P -> Q is given (syntax-rules () ((_ . P) Q)), then (f ...): (P
...) => (Q ...) is given by (syntax-rules () ((_ P ...) (Q ...))).

I don't see how I can easily define such a functor T on the level of
macro transformers with the R6RS semantics.

In the SRFI document I wrote that this is how Scheme's map procedure
work. In that case, the category C is the category whose objects are all
Scheme types (where the exact definition of type is vague) and whose
morphisms are (unary) Scheme procedures. The endofunctor T: C -> C then
maps each type P to the type of lists of P's and a procedure f: P -> Q
to the procedure (lambda (p*) (map f p*)).

(While I wrote the (non exactly trivial) implementation of SRFI 148,
such a semantics would have helped me in one or two places.)

> Using specific examples and programming intuition instead of mathematical
> concepts whose application to SRFI 149 is unclear to me, I'd say the
> intuitive meaning of an ellipsis pattern of the form  <pat> ...  is that
> it matches a part of the macro use iff that part of the use consists of
> zero or more subparts, each of which match <pat>.  I'd also say the
> intuitive meaning of an ellipsis template for the form  <template> ...
> is that it produces output obtained by replicating the <template> as
> many times as was necessary when replicating the patterns used to match
> the pattern variables appearing within <template> whose values were
> obtained by matching an ellipsis pattern; there must be at least one
> such pattern variable.
>
> With that intuition, consider Example 1 above.  The "(axis ...)" pattern
> matches (x y), and the "(coordinate ...) ..." must match the rest of the
> use and does with "(coordinate ...)" matching first (0 0) and then (0 3).
> The template is therefore to be rewritten with the axis pattern variable
> ranging over (x y) and the coordinate pattern variable ranging first over
> (0 0) and then over (0 3).  Thus
>
>     '(((axis coordinate) ...) ...)
>
> rewrites to
>
>     '(((axis coordinate) ...)      ; with axis ranging over (x y) and
>                                    ; with coordinate ranging over (0 0)
>       ((axis coordinate) ...)      ; with axis ranging over (x y) and
>                                    ; with coordinate ranging over (0 3)
>       )
>
> Rewriting axis first for no particular reason, that rewrites to
>
>     '(((x coordinate) ...)      ; with coordinate ranging over (0 0)
>       ((y coordinate) ...)      ; with coordinate ranging over (0 0)
>       ((x coordinate) ...)      ; with coordinate ranging over (0 3)
>       ((y coordinate) ...)      ; with coordinate ranging over (0 3)
>       )
>
> which rewrites to
>
>     '(((x 0) (x 0))
>       ((y 0) (y 0))
>       ((x 0) (x 3))
>       ((y 0) (y 3))
>       )
>
> If we instead rewrite coordinate first, we get
>
>     '(((axis 0) ...)      ; with axis ranging over (x y)
>       ((axis 0) ...)      ; with axis ranging over (x y)
>       ((axis 0) ...)      ; with axis ranging over (x y)
>       ((axis 3) ...)      ; with axis ranging over (x y)
>       )
>
> which rewrites to
>
>     '(((x 0) (y 0))
>       ((x 0) (y 0))
>       ((x 0) (y 0))
>       ((x 3) (y 3))
>       )
>
> Whether we get the R6RS result or the SRFI 149 result apparently
> depends upon which of the two pattern variables is rewritten first.
> We can imagine different rules for deciding that order, such as
> "rewrite the leftmost pattern variable first" (which yields the
> SRFI 149 result in this example) or "rewrite the highest ranking
> pattern variable first" (which yields the R6RS result).  I would
> not be terribly surprised if implementations are just making
> arbitrary choices here and it's pure happenstance that all R6RS
> systems deliver the R6RS result, but I think it's more likely that
> the R6RS systems are rewriting the highest ranking pattern variable
> first as part of an algorithm that systematically reduces the rank
> of all pattern variables that appear within a subtemplate.
SRFI 149's semantics has nothing to do with the relative positions of
the pattern variables of different ranks. E.g. it would give the same
result if the pattern was (foo2 (coordinate ...) ... (axis ...)). The
way I read the R6RS, it either doesn't resolve the  ambiguity or it
intends to specify that the input forms of the lower-ranked pattern
variables are repeated if necessary.

E.g. axis is a lower-ranked pattern variable. Assume that (axis ...) is
matched by (x y) as in the example. For template expansion, it will then
silently be assumed that a pattern like ((axis ...) ...) was matched by
((x y) (x y) (x y)) with as many (x y) repeated as necessary. Thus, in
effect, axis will get the same rank as does coordinate. (But this is
just my reading of the R6RS.)
> For Example 2, the "((x ...) generator) ..." subpattern matches
> (((x y) +) ((p q) *)).  Intuitively, that means "(x ...) generator)"
> matches ((x y) +) and then matches ((p q) *), so
That intuition was my starting point when I began thinking about this SRFI.
>     ((x (generator)) ... ...)
>
> should rewrite to
>
>     ((x (+)) ...    ; with x ranging over (x y)
>      (x (*)) ...    ; with x ranging over (p q)
>      )
>
> which rewrites to
>
>     ((x (+))
>      (y (+))
>      (p (*))
>      (q (*))
>      )
>
> which yields the SRFI 149 result, not the R6RS result.
>
Things may become complicated when we have a pattern like (((x ...) y)
... ((z ...) ...) and the template is something like (x y z) ... ... or
(x y) ... ... or (x z) ... ... or (y z) ... .... Would the "intuition"
give consistent results? (For example, removing a variable from the
output template should not change the form of the output.)

Can we formalize the "intuition" so that it does not give arbitrary
results in the Example 1?

> Will

--

Marc