Re: [PATCH] add SRFI: srfi-121; generators
John Cowan
(21 Jan 2021 18:40 UTC)
|
Re: [PATCH] add SRFI: srfi-121; generators Mark H Weaver (23 Jan 2021 00:59 UTC)
|
Re: [PATCH] add SRFI: srfi-121; generators
Shiro Kawai
(23 Jan 2021 02:15 UTC)
|
Re: [PATCH] add SRFI: srfi-121; generators
Arthur A. Gleckler
(23 Jan 2021 02:18 UTC)
|
Re: [PATCH] add SRFI: srfi-121; generators
Mark H Weaver
(23 Jan 2021 06:38 UTC)
|
Re: [PATCH] add SRFI: srfi-121; generators
John Cowan
(26 Jan 2021 03:30 UTC)
|
Re: [PATCH] add SRFI: srfi-121; generators
Linus Björnstam
(26 Jan 2021 07:08 UTC)
|
Re: [PATCH] add SRFI: srfi-121; generators
Marc Nieper-Wißkirchen
(26 Jan 2021 07:15 UTC)
|
Re: [PATCH] add SRFI: srfi-121; generators
Linus Björnstam
(26 Jan 2021 16:03 UTC)
|
Re: [PATCH] add SRFI: srfi-121; generators
Arthur A. Gleckler
(08 Apr 2021 15:54 UTC)
|
Re: [PATCH] add SRFI: srfi-121; generators
Linus Björnstam
(11 Apr 2021 15:52 UTC)
|
Re: [PATCH] add SRFI: srfi-121; generators
Arthur A. Gleckler
(11 Apr 2021 16:17 UTC)
|
Re: [PATCH] add SRFI: srfi-121; generators
Linus Björnstam
(27 Jan 2021 15:33 UTC)
|
Hi John, John Cowan <xxxxxx@ccil.org> writes: > Mark: I'm interested to know if you have a sketch of ideas for a more > efficient implementation of SRFI 121/158. You say it requires specific > knowledge of Guile internals, but are you willing to sketch how you might > do it? Thanks. Here are a few suggestions off the top of my head, looking only at the latest SRFI-121 reference implementation: * In 'gcombine', 'generator-fold', 'generator-for-each', and possibly also 'generator-unfold', it would be helpful to use 'case-lambda' to provide specialized code for cases where the 'dotted' tail in the argument list consists of 1 or 2 arguments. When a procedure with a dotted tail is called, it forces allocation of a new list, and last I checked Guile does not include optimizations to avoid that allocation where possible. Worse, the general case requires calling 'map' and allocating a new list every time a new item is requested. It's a great help to avoid these expenses in the most common cases. For example, see the definitions of 'map', 'for-each', and 'fold' in Guile: https://git.savannah.gnu.org/cgit/guile.git/tree/module/ice-9/boot-9.scm?id=75b0db1a286f936a90683973efc2315a07c03b21#n214 https://git.savannah.gnu.org/cgit/guile.git/tree/module/srfi/srfi-1.scm?id=75b0db1a286f936a90683973efc2315a07c03b21#n451 * Avoid using 'define-values' in internal lexical contexts in Guile for now. Hopefully some day Guile's implementation of 'define-values' will be more efficient, but for now it's implemented as a simple macro that expands into code that mutates variables, which prevents several other optimizations that could otherwise be done by Guile's compiler. In particular, in 'gcombine', better use 'call-with-values' or 'receive'. * Several procedures are defined in terms of more general higher-order procedures, or create intermediate lists/generators unnecessarily, for the sake of making the code simpler. In most contexts I would applaud this practice, but there will be a significant price to pay in efficiency. For example, 'generator->reverse-list' with 1 argument is implemented in terms of 'generator-fold', and with 2 arguments by creating an intermediate generator using 'gtake'. * To make matters worse: 'gtake', as well as 'make-for-each-generator' and 'make-unfold-generator', are defined in terms of coroutine generators. It's good to have coroutine generators available, but they are quite expensive, and best avoided where efficiency is desirable. * 'generator->vector' is defined using 'generator->list' and 'list->vector'. That's bad enough, but digging deeper we see that 'generator->list' is implemented by reversing the result of 'generator->reverse-list', which as I've already pointed out is defined using 'generator-fold', which currently involves calling 'map' and 'append' and allocating temporary lists for each item generated. In general, it's helpful to go through the mental exercise of tracing an evaluation of each of these procedures, and more importantly to trace calls to the produced generators, to get some idea of the expense involved. This is just a sampling of the problems I noticed from a quick skim of the code, by no means comprehensive. I acknowledge that following the suggestions above will make the code much larger, uglier, and more difficult to understand. I would not recommend such practices except in cases where efficiency is paramount. In this case, I think efficiency *is* paramount. My rationale is that, like hash tables, generators inevitably force code that uses them to be written in an imperative style, and therefore they are best avoided in my opinion, *except* in cases where efficiency is paramount. To avoid being forced to write code in an imperative style, I suggest that in most cases streams are preferable to generators. Regards, Mark