Shiro Kawai <xxxxxx@gmail.com> schrieb am Di., 17. Okt. 2017 um 20:00 Uhr:
I only know Python's generator as something that is created by a generator function (a function that has a yield expression in it) and is itself a generator object.

That's certainly the better terminology. When I used the term "generator" I was talking about a generator object like the one returned from invoking a generator function.
 
Since it is an object I see it can have more than one code path, but I'm unfamiliar with its usage beyond trivial tasks (generating one value at a time via the next method).  Could you point me to some references on advanced generator usage?  Or could you show an idea of how the API would look if we integrated generators and accumulators in Scheme?

Sure. Let us start with the following simple generator function:

def f1():
  yield 1
  yield 2
  yield 3

The object returned by invoking f1 is a generator object, say g. Then you can invoke next(g), next(g), next(g), which yields the values 1, 2, 3. The fourth invocation of next(g) raises a StopIteration, signalling that the generator object is exhausted (I have called this way of returning a second code path).
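Driving this by hand looks as follows (a minimal sketch):

```python
def f1():
  yield 1
  yield 2
  yield 3

g = f1()
values = [next(g), next(g), next(g)]   # the three yielded values
try:
  next(g)                # fourth call: the second code path
  exhausted = False
except StopIteration:
  exhausted = True
```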

The SRFI 158 equivalent is clear:

(define (f1)
  (make-coroutine-generator
    (lambda (yield)
      (yield 1)
      (yield 2)
      (yield 3))))

The only difference is that the Scheme version signals the end of the generated sequence by using a special sentinel object (the eof object) instead of raising an exception.
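To make the difference concrete, the sentinel-based protocol can be mimicked in Python; the EOF sentinel and the wrapper below are illustrative stand-ins, not part of SRFI 158 or of Python's own API:

```python
EOF = object()   # stand-in for Scheme's eof object

def make_srfi158_style(pygen):
  """Wrap a Python generator so that exhaustion is signalled by
  returning a sentinel value instead of raising StopIteration."""
  def g():
    try:
      return next(pygen)
    except StopIteration:
      return EOF
  return g

g = make_srfi158_style(x for x in (1, 2, 3))
out = [g(), g(), g(), g()]   # the fourth call returns the sentinel
```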

Now let's extend the Python version:

def f2():
  yield 1
  yield 2
  yield 3
  return 0

Generator objects created by this generator function differ from those returned by f1 in that the StopIteration raised at the end carries the value 0 as a payload. SRFI 158 does not yet allow this, but it could easily be added by attaching a payload to eof objects (implementing this portably would mean redefining eof-object? in the standard library).
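In Python, the payload can be retrieved by catching the exception explicitly (a small sketch):

```python
def f2():
  yield 1
  yield 2
  yield 3
  return 0

g = f2()
values = [next(g), next(g), next(g)]
try:
  next(g)
except StopIteration as e:
  payload = e.value   # the return value rides on the exception
```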

To be precise, I should mention that there is a third way in which a generator object in Python can deliver a value to the caller, namely by throwing an exception:

def f3():
  yield 1
  yield 2
  yield 3
  raise e  # e being some exception object defined elsewhere

But this is nothing that we can't do in Scheme. (In fact, we can do more in Scheme because we can capture any continuations.)

Dually, there are several ways to feed values into a Python generator object:

def f4():
  x = yield 1
  yield x

Let g be the generator object returned by invoking f4. Invoking next(g) yields 1 and execution of the generator is halted at the first yield expression. Invoking g.send(2) afterwards feeds the value 2 into the generator, assigns that value to x and finally yields it. Any further invocation of next(g) or g.send(...) will cause a StopIteration.
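Spelled out (a minimal sketch of the send protocol):

```python
def f4():
  x = yield 1
  yield x

g = f4()
first = next(g)      # run to the first yield; produces 1
second = g.send(2)   # resume with x = 2; the second yield produces 2
try:
  next(g)
  exhausted = False
except StopIteration:
  exhausted = True
```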

There is another way to feed information into the generator: invoking g.close(). In that case, a GeneratorExit exception is raised at the suspended expression of the generator. The generator may or may not handle the exception, but it must not yield values after the exception was raised. As g.close() does not return a value, it is not usable for accumulators, but it can be used to clean up state in generators that have not been exhausted yet (but are no longer needed). I think such functionality is still missing from SRFI 158 generators, namely stopping them prematurely to clean up state and free resources.
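A sketch of the clean-up behaviour of g.close() (the resource_gen name and the log list are illustrative only):

```python
log = []

def resource_gen():
  log.append("open")
  try:
    yield 1
    yield 2
  finally:
    log.append("close")   # also runs when g.close() is invoked

g = resource_gen()
next(g)      # generator is now suspended after the first yield
g.close()    # raises GeneratorExit inside the generator
```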

Finally, one can invoke g.throw(exc, value), which raises an exception of type exc with payload value inside the generator. The generator can handle this and yield or return a value, which will become the result of the invocation of throw (in the case of return, the value arrives as the payload of a StopIteration).

Now, this allows us to implement accumulators, say:

def sum():
  i = 0
  while True:
    try:
      i += yield None
    except MyException:
      return i

Let c be the result of sum(). It is a generator object suitable for summing (an accumulator in SRFI 158's terminology). Calling next(c) lets the generator proceed to the first yield. Afterwards, we can do c.send(1), c.send(2), c.send(3), and finally c.throw(MyException, None). This in turn raises a StopIteration exception whose value will be 6.
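A runnable version of this accumulator (renamed sum_ here, an illustrative choice, so as not to shadow Python's built-in sum):

```python
class MyException(Exception):
  pass

def sum_():
  i = 0
  while True:
    try:
      i += yield None
    except MyException:
      return i

c = sum_()
next(c)       # advance to the first yield
c.send(1)
c.send(2)
c.send(3)
try:
  c.throw(MyException)
  result = None
except StopIteration as e:
  result = e.value    # the accumulated sum
```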

There is even more to know about Python's generators, like the "yield from" from PEP 380 (see https://docs.python.org/3/whatsnew/3.3.html#pep-380-syntax-for-delegating-to-a-subgenerator), and it is probably better to take a look at that document before I try to explain it.

How can all this be applied to the generator protocol of SRFI 158? First of all, an exhausted generator has to be able to return a non-trivial value. This can be done by adding a payload to eof objects. The beauty of this approach is that it is compatible with SRFI 121. Secondly, generators should be able to receive values. This can be implemented by allowing a generator that wants to accept values to be called with one argument. E.g. if g is a generator (object), then the expression (g) is equivalent to Python's next(g), and the expression (g v) is equivalent to Python's g.send(v). The Scheme equivalent of stopping or closing a generator would then be to feed the eof object (possibly with a payload) into the generator. In that way, a SRFI 158 accumulator will just be a special type of generator, with one exception: as a (generalized) generator, the accumulator would return its final result packed into an eof object, while the current protocol for SRFI 158 accumulators returns the bare value.
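The proposed protocol can be modeled in Python to make it concrete; Eof, generalize, and summer below are hypothetical names sketching the idea, not an existing API:

```python
class Eof:
  """Hypothetical eof object carrying an optional payload."""
  def __init__(self, payload=None):
    self.payload = payload

def generalize(pygen):
  """Model the proposal: calling with no argument is next(g),
  calling with one argument is g.send(v); exhaustion is reported
  as an Eof carrying the generator's return value."""
  def g(*args):
    try:
      return pygen.send(args[0]) if args else next(pygen)
    except StopIteration as e:
      return Eof(e.value)
  return g

def summer():
  total = 0
  while True:
    v = yield
    if isinstance(v, Eof):   # stopping by feeding an eof object
      return total
    total += v

acc = generalize(summer())
acc()                  # prime: run to the first yield
acc(1); acc(2); acc(3)
final = acc(Eof())     # the final result comes back packed in an Eof
```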

Although it is a bit more complicated to retrieve the value when it is packed into another object, this allows for many more applications, for example when implementing something like a pipe, which acts as an accumulator and a generator at the same time and may want to return some accumulated value at the end of its life.

The only issue I still have with this proposed change (to the accumulators and to the general description of generators in SRFI 158) is that it still does not deal well with multiple values. (This makes it no worse than the draft's proposal because it has the same issues). It makes perfect sense to yield multiple values at each step of a generator's life. It makes even more sense to feed multiple values into an accumulator at each step. On the other hand, the eof object is just a single object, so an accumulator would have to be a procedure with a variable number of arguments (which is aesthetically not very pleasing and may have performance problems) and the consumer of a generator employing multiple values would need to be able to deal with different numbers of values coming out of the generator (same problems).

A more radical change (fitting more into Scheme's model of evaluation) would be to deliver an explicit continuation to each invocation of a generator. The generator would then invoke that continuation in tail context to signal that it is exhausted instead of returning normally. The explicit continuation could simply be (lambda args (eof-object)), giving back the behavior as specified in the draft. In general, however, the two different code paths (yielding a value and signalling the end of the sequence) would be clearly separated.

The same would have to be done for the feeding of values into a generalized generator (again, think of an accumulator): Whenever a generator is constructed, two procedures would have to be returned instead of one. Invoking the first procedure (with or without arguments) would simply send values into the generator. Invoking the second procedure would cause the generator to stop. By wrapping the first procedure into a procedure which calls the second whenever an eof object is received, we get back the protocol of SRFI 158.

Let me give two examples to illustrate my point where I use the prefix x- to denote the new versions of the SRFI 158 procedures:

(define g
  (x-make-coroutine-generator
    (lambda (yield)
      (yield 1 2)
      (yield 3 4)
      5)))

Then, invoking (g f) would yield the multiple values 1, 2. Afterwards, 3, 4 are yielded. By invoking (g f) for a third time, finally (f 5) would be called in tail context. We can define make-coroutine-generator in terms of this model:

(define (make-coroutine-generator c)
  (let ((g (x-make-coroutine-generator c)))
    (lambda ()
      (g (lambda args (eof-object))))))

(Here, due to the limitation of the SRFI 158 protocol, the return value 5 will have to be ignored if eof objects have no payload.)
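A Python model may make the explicit-continuation protocol clearer (x_make_coroutine_generator is a hypothetical sketch; tuples stand in for multiple values):

```python
def x_make_coroutine_generator(steps, final):
  """Sketch of the proposed protocol: each invocation receives an
  explicit continuation f; once exhausted, the generator calls f
  with its final value instead of returning an eof object."""
  it = iter(steps)
  def g(f):
    try:
      return next(it)
    except StopIteration:
      return f(final)   # tail call the explicit continuation
  return g

g = x_make_coroutine_generator([(1, 2), (3, 4)], 5)
first = g(lambda v: "eof")          # the "multiple values" 1, 2
second = g(lambda v: "eof")         # then 3, 4
third = g(lambda v: ("done", v))    # continuation receives 5
```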

An accumulator example would be:

(define-values (a s) (x-count-accumulator))

Invoking (a f 1), (a f 2), (a f 3) would feed the values 1, 2, 3 into the accumulator (the continuation f is there just in case the accumulator, which is a generalized generator in disguise, stops prematurely). Afterwards, we can invoke (s f). This would call f in tail context with the number of values fed in, that is 3.

We can get back (count-accumulator) as follows:

(define (count-accumulator)
  (let-values (((a s) (x-count-accumulator)))
    (lambda (v)
      (if (eof-object? v)
        (s values)
        (a values v)))))
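The same construction can be modeled in Python (x_count_accumulator, count_accumulator, and Eof are hypothetical names following the Scheme sketch above):

```python
class Eof:
  """Stand-in for Scheme's eof object."""

def x_count_accumulator():
  """Return two procedures: one feeds values (with an explicit
  continuation f, used only for premature stops), one stops the
  accumulator and tail-calls its continuation with the count."""
  count = 0
  def a(f, *values):
    nonlocal count
    count += len(values)
  def s(f):
    return f(count)
  return a, s

def count_accumulator():
  """Recover the SRFI 158 protocol from the two-procedure one."""
  a, s = x_count_accumulator()
  def acc(v):
    if isinstance(v, Eof):
      return s(lambda n: n)        # the Scheme (s values)
    return a(lambda *args: None, v)  # the Scheme (a values v)
  return acc

acc = count_accumulator()
acc(1); acc(2); acc(3)
result = acc(Eof())
```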

If (x-)make-coroutine-generator is extended so that yield can return values into the generator, one can even implement (x-)count-accumulator very cleanly using (x-)make-coroutine-generator.

Marc




On Tue, Oct 17, 2017 at 12:28 AM, Marc Nieper-Wißkirchen <xxxxxx@nieper-wisskirchen.de> wrote:
Shiro Kawai <xxxxxx@gmail.com> schrieb am Sa., 14. Okt. 2017 um 20:30 Uhr:
A generator g that accepts arguments to "add" items into it is still a generator.  The point of an accumulator is that you can get the result at once, and you don't need to know the type of the result to use it.

Python's generators allow one to throw an exception into them. This makes Python's generator protocol flexible enough to model accumulators. Let me put the two side by side:

Python                                           | SRFI 158
=====================================================================================================================
Generator                                        | Generator or accumulator or something strictly more powerful
Generators yield values                          | Generators yield values
Values can be sent into generators               | Values can be sent into accumulators
Exhausted generator can have a non-trivial value | Exhausted generator has only a trivial value (eof-object)
Generators can be stopped by non-trivial values  | Accumulators can be stopped only with a trivial value (eof-object)

In other words, Python's generators accept values through two code paths (sending or throwing into) and return values through two code paths (yielding and throwing out of). In contrast, SRFI 158's generators accept values only through one code path (the sending equivalent) and return values only through one (the throwing equivalent, but restricted to a unique value). SRFI 158's accumulators accept values through two code paths (the sending and the throwing equivalent, but the latter restricted to a unique value), and return values only through one (the throwing equivalent).

One way to make SRFI 158's generators more powerful could be to add an optional payload to eof objects, which does not seem to contradict their specification in the R7RS. Then accumulators can become a derived concept of the generator concept.

It remains to discuss how Scheme's concept of multiple values fits into the proposed protocol. Generators yielding multiple values make perfect sense (think of a generator that pairs two other generators in parallel), but the current protocol makes this a bit hard to implement efficiently: because the eof object will only be returned as a single value, the generator would not produce a constant number of values in each step.


The use case of accumulator I see is a generic gatherer---by passing an accumulator, one can parameterize sequence operations.  For example, from an infinite generator of integers (size-gen) and an infinite generator of random items, you can write an infinite generator of random sequences, whose return type can be customized by passing suitable accumulators.

I am not saying that accumulators (as those in the current draft of SRFI 158) are not useful, quite the contrary. It's just that they shouldn't be called "duals" of generators when they are both subsumed by the more powerful generator concept of Python or ECMAScript (which should make us think about whether we can specify something at least as powerful, now that SRFI 121 is being revised). Following the slogan: few unified concepts, but powerful ones.

Marc

(define (random-sequence make-acc size-gen item-gen)
  (lambda ()
    (let ((acc (make-acc)))
      (let loop ((n (size-gen)))
         (if (zero? n)
            (acc (eof-object))
            (begin (acc (item-gen)) (loop (- n 1))))))))

Suppose (integers-poisson$ L) returns an infinite generator that generates random integers according to a Poisson distribution with mean L, and (chars$ CHAR-SET) returns an infinite generator that returns random chars from the given char-set.

(random-sequence string-accumulator (integers-poisson$ 5) (chars$ char-set:ascii))

returns a generator that generates random strings.
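This gatherer pattern can be modeled in Python (illustrative names; deterministic lambdas stand in for the random size and item generators, and EOF stands in for Scheme's eof object):

```python
EOF = object()   # sentinel standing in for Scheme's eof object

def string_accumulator():
  chars = []
  def acc(v):
    if v is EOF:
      return "".join(chars)   # final call yields the result
    chars.append(v)
  return acc

def list_accumulator():
  items = []
  def acc(v):
    if v is EOF:
      return items
    items.append(v)
  return acc

def random_sequence(make_acc, size_gen, item_gen):
  """Return a generator of sequences whose concrete type is
  chosen by the accumulator constructor passed in."""
  def gen():
    acc = make_acc()
    for _ in range(size_gen()):
      acc(item_gen())
    return acc(EOF)
  return gen

g = random_sequence(string_accumulator, lambda: 3, lambda: "a")
s = g()    # a string of length 3
lst = random_sequence(list_accumulator, lambda: 2, lambda: 7)()
```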

Gauche's data.random library uses a similar approach, although it uses classes (the callee can derive an accumulator from a class object).  For example: http://practical-scheme.net/gauche/man/?p=sequences-of

On Sat, Oct 14, 2017 at 2:46 AM, Marc Nieper-Wißkirchen <xxxxxx@nieper-wisskirchen.de> wrote:
I am not so much concerned about the usability of the accumulator approach in this SRFI, but more about saying that the SRFI 158 concept of an accumulator is dual to the concept of a generator.

In fact, after some thinking, I have come to the belief that the SRFI 158 generators and accumulators are instances of the same, slightly more general concept.

Consider Python's (or ES6's) generators. Such a generator can both be fed values, and values can be retrieved from it. Also, such a generator has a return value. In other words, it can act as a SRFI 158 generator and as a SRFI 158 accumulator at the same time. Besides showing the applicability of Python's generator concept, this shows that SRFI 158's generator and accumulator concepts are not dual, but are shadows of the same unified concept.

Thus I would like to propose to extend SRFI 158's concept of a generator so that accumulators fit into the concept as well. Say, if "g" is a generator, invoking "g" with no argument will yield values as before, and invoking "g" with one or more arguments will feed these values into the generator (so that it can act as an accumulator). The only issue is about the return value. SRFI 158 dictates that generators return an eof-object when their states are exhausted. This leaves no room to return a meaningful value (as Python's generators can do so that they can act as accumulators). Another shortcoming is that the protocol of SRFI 158's generators and accumulators does not make it straightforward to feed multiple values into an accumulator or to yield multiple values from a generator. I am not yet sure how to fix it, but I do think that fixing it would make SRFI 158 better fit into Scheme's evaluation model (to which multiple values come naturally).

Marc

John Cowan <xxxxxx@ccil.org> schrieb am Fr., 13. Okt. 2017 um 01:52 Uhr:
Marc Nieper-Wißkirchen scripsit:

Adding duals of generators in form of accumulators is a great idea. However, the accumulator protocol does not look as if it is the right one:

The *protocol* is simply this:  If the argument is not an eof-object, do something with it and return an undefined value.  If the argument is an eof-object, return a value which depends in some way on the previously passed objects.  As I understand you, you are not objecting to this protocol, but to the way in which predefined accumulator constructors are specified.
 
In the current proposal, the accumulator now returns its state, meaning that the producer suddenly becomes concerned with the accumulator's state, not the constructor of the accumulator. This looks wrong.

In an accumulator constructed with make-accumulator, it returns some function of the state rather than the state itself, and the creator gets to specify the function (finalizer).    But in a prespecified constructor, the transformation done is a consequence of the way the constructor is implemented.  For example, the list-accumulator constructor's state is a *reversed* list of objects, and the finalizer is `reverse`.

Now I think your argument is that it is useful to be able to specify in advance, at construction time, whatever further finalization the user desires to have done. I think, however, that this is not necessary.  The continuation of an accumulator call with a normal object is inherently different from the continuation of a call with an eof-object.  In the first case, the continuation will normally discard the returned value, whereas in the second case the returned value is meaningful to the caller.

So just as the first thing to be done with a value returned from a generator is to test it with `eof-object?`, so any calls to an accumulator with a normal object will normally appear at a distinct place in the control flow from the final call with an eof-object.  As such, the last call can readily be wrapped in any finalizer that seems desirable without interfering with regular calls.  This eliminates the requirement for a double finalization, one inherent to the constructor and one supplied by the user.

I hope this helps clarify my line of thinking.

-- 
John Cowan          http://vrici.lojban.org/~cowan        xxxxxx@ccil.org
                if if = then then then = else else else = if;
