Re: Make code safe for continuation capture, part the first. · gambiteer/srfi-231@2876863
Marc Feeley 04 Aug 2022 13:19 UTC
> On Aug 4, 2022, at 3:46 AM, Marc Nieper-Wißkirchen <xxxxxx@gmail.com> wrote:
>
> If I read the spec correctly, the following should be a correct implementation of vector-map (restricted to one input vector):
>
> (define vector-map
> (lambda (proc in)
> (define n (vector-length in))
> (define out (make-vector n))
> (do ((i 0 (fx+ i 1))
> ((fx=? i n) (vector-copy out))
> (vector-set! out (proc (vector-ref in i)) i))))
>
> As the algorithm is iterative, the stack does not grow. As one can see, compared with the "naive" algorithm not respecting call/cc, there is just one vector copy more (which could be eliminated in a native implementation using, e.g., the ideas I posted earlier).
>
Here is the non-tail recursive algorithm used by Gambit for (single vector) vector-map.
(define (vector-map-1 proc input-vect)
(define (vmap-1 proc input-vect i)
(if (< i (vector-length input-vect))
(let* ((result (proc (vector-ref input-vect i)))
(output-vect (vmap-1 proc input-vect (+ i 1))))
(vector-set! output-vect i result)
output-vect)
(make-vector i)))
(vmap-1 proc input-vect 0))
It uses the stack for temporary storage. It has the nice features of not adding pressure on the GC (no heap garbage generated) and being call/cc and thread “safe”. The naive version that preallocates the result vector and mutates it with vector-set! allows user-code to detect the mutation. The version which does a vector-copy allows user-code in a multithreaded Scheme implementation to detect the mutation.
The analysis of the peak stack/heap space usage gives an overhead of 3 words per element (the stack frame for the recursive call to vmap-1 contains a return address and the 2 live variables result and i). So if the input and output vectors are counted here’s how the algorithms compare:
- naive algorithm: peak space = 2 words per element, no garbage generated, call/cc exposes mutation
- vector-copy algorithm: peak space = 3 words per element, 1 word garbage generated per element, call/cc exposes mutation in a multithreaded Scheme
- non-tail recursive algorithm: peak space = 5 words per element, no garbage generated, call/cc does not expose mutation even in a multithreaded Scheme
If the input vector is not live after the call to vector-map, then 1 word per element of peak space usage should be subtracted from the vector-copy and non-tail recursive algorithms (because when the output vector is allocated the input vector is no longer live). So the non-tail recursive algorithm has a 1.66x to 2x higher peak usage than the vector-copy algorithm. Note that the algorithm can be tweaked so that vmap-1 returns 2 values: the output vector and i. This would remove one word from the stack frame (variable i would no longer be live). The non-tail recursive algorithm would then compare at 1.33x to 1.5x the peak space usage of the vector-copy version. This isn’t done in the Gambit vector-map because multiple-values cause heap allocation.
Marc