(missing)
Re: Make code safe for continuation capture, part the first. · gambiteer/srfi-231@2876863 Marc Feeley (04 Aug 2022 13:20 UTC)

Re: Make code safe for continuation capture, part the first. · gambiteer/srfi-231@2876863 Marc Feeley 04 Aug 2022 13:19 UTC

> On Aug 4, 2022, at 3:46 AM, Marc Nieper-Wißkirchen <xxxxxx@gmail.com> wrote:
>
> If I read the spec correctly, the following should be a correct implementation of vector-map (restricted to one input vector):
>
> (define vector-map
>   (lambda (proc in)
>     (define n (vector-length in))
>     (define out (make-vector n))
>     (do ((i 0 (fx+ i 1))
>         ((fx=? i n) (vector-copy out))
>       (vector-set! out (proc (vector-ref in i)) i))))
>
> As the algorithm is iterative, the stack does not grow.  As one can see, compared with the "naive" algorithm not respecting call/cc, there is just one vector copy more (which could be eliminated in a native implementation using, e.g., the ideas I posted earlier).
>

Here is the non-tail recursive algorithm used by Gambit for (single vector) vector-map.

 (define (vector-map-1 proc input-vect)

   (define (vmap-1 proc input-vect i)
     (if (< i (vector-length input-vect))
         (let* ((result (proc (vector-ref input-vect i)))
                (output-vect (vmap-1 proc input-vect (+ i 1))))
           (vector-set! output-vect i result)
           output-vect)
         (make-vector i)))

   (vmap-1 proc input-vect 0))

It uses the stack for temporary storage.  It has the nice features of not adding pressure on the GC (no heap garbage generated) and being call/cc and thread “safe”.  The naive version that preallocates the result vector and mutates it with vector-set! allows user-code to detect the mutation.  The version which does a vector-copy allows user-code in a multithreaded Scheme implementation to detect the mutation.

The analysis of the peak stack/heap space usage gives an overhead of 3 words per element (the stack frame for the recursive call to vmap-1 contains a return address and the 2 live variables result and i).  So if the input and output vectors are counted here’s how the algorithms compare:

- naive algorithm: peak space = 2 words per element, no garbage generated, call/cc exposes mutation

- vector-copy algorithm: peak space = 3 words per element, 1 word garbage generated per element, call/cc exposes mutation in a multithreaded Scheme

- non-tail recursive algorithm: peak space = 5 words per element, no garbage generated, call/cc does not expose mutation even in a multithreaded Scheme

If the input vector is not live after the call to vector-map, then 1 word per element of peak space usage should be subtracted from the vector-copy and non-tail recursive algorithms (because when the output vector is allocated the input vector is no longer live).  So the non-tail recursive algorithm has a 1.66x to 2x higher peak usage than the vector-copy algorithm.  Note that the algorithm can be tweaked so that vmap-1 returns 2 values: the output vector and i.  This would remove one word from the stack frame (variable i would no longer be live).  The non-tail recursive algorithm would then compare at 1.33x to 1.5x the peak space usage of the vector-copy version.  This isn’t done in the Gambit vector-map because multiple-values cause heap allocation.

Marc