array-ref and array-set! (was Re: Some reshape operators are affine, should they be in this SRFI?) Bradley Lucier 04 May 2020 18:39 UTC

On 5/4/20 10:14 AM, John Cowan wrote:
> On reflection, I think that it would be worthwhile, for the benefit of
> programmers or algorithms ported from other languages, to provide
> array-ref and array-set! directly rather than just documenting how to
> define them yourself.

This comment is CC'ed to the SRFI 122 mail list because the issues are
relevant there.

Here is some work in that direction.

Here are two implementations of array-ref, one that avoids "apply" when
the array dimension is no greater than 4 and a simpler one:

(define (array-ref A i0
                    #!optional
                    (i1 (macro-absent-obj))
                    (i2 (macro-absent-obj))
                    (i3 (macro-absent-obj))
                    #!rest
                    i-tail)
   (cond ((not (array? A))
          (error "array-ref: The first argument is not an array: " A))
         ((eq? i1 (macro-absent-obj))
          ((%%array-getter A) i0))
         ((eq? i2 (macro-absent-obj))
          ((%%array-getter A) i0 i1))
         ((eq? i3 (macro-absent-obj))
          ((%%array-getter A) i0 i1 i2))
         ((null? i-tail)
          ((%%array-getter A) i0 i1 i2 i3))
         (else
          (apply (%%array-getter A) i0 i1 i2 i3 i-tail))))

(define (array-ref-simple A i0 #!rest i-tail)
   (cond ((not (array? A))
          (error "array-ref: The first argument is not an array: " A))
         (else
          (apply (%%array-getter A) i0 i-tail))))

Here's a simple program for comparing access times using array-ref,
array-ref-simple, and array-getter, just accessing elements in an array
with 100,000,000 elements:

(include "generic-arrays.scm")

(define N 10000)

(define A (make-array (make-interval (vector N N))
                       (lambda (i j)
                         0)))

(define (test-array-ref)
   (let i-loop ((i 0)
                (result 0))
     (if (fx= i N)
         result
         (let j-loop ((j 0)
                      (result result))
           (if (fx= j N)
               (i-loop (fx+ i 1)
                       result)
               (j-loop (fx+ j 1)
                       (fx+ result (array-ref A i j))))))))

(define (test-array-ref-simple)
   (let i-loop ((i 0)
                (result 0))
     (if (fx= i N)
         result
         (let j-loop ((j 0)
                      (result result))
           (if (fx= j N)
               (i-loop (fx+ i 1)
                       result)
               (j-loop (fx+ j 1)
                       (fx+ result (array-ref-simple A i j))))))))

(define (test-getter)
   (let ((A_ (array-getter A)))
     (let i-loop ((i 0)
                  (result 0))
       (if (fx= i N)
           result
           (let j-loop ((j 0)
                        (result result))
             (if (fx= j N)
                 (i-loop (fx+ i 1)
                         result)
                 (j-loop (fx+ j 1)
                         (fx+ result (A_ i j)))))))))

(define (test-loops)
   (let i-loop ((i 0)
                (result 0))
     (if (fx= i N)
         result
         (let j-loop ((j 0)
                      (result result))
           (if (fx= j N)
               (i-loop (fx+ i 1)
                       result)
               (j-loop (fx+ j 1)
                       (fx+ result 0)))))))

And these are the results using Gambit on my Linux box:

 > (time (test-getter))
(time (test-getter))
     0.348451 secs real time
     0.348399 secs cpu time (0.346625 user, 0.001774 system)
     no collections
     no bytes allocated
     1 minor fault
     no major faults
0
 > (time (test-array-ref-simple))
(time (test-array-ref-simple))
     5.428409 secs real time
     5.428461 secs cpu time (5.421353 user, 0.007108 system)
     1985 collections accounting for 0.190582 secs real time (0.190827
user, 0.000217 system)
     9600127040 bytes allocated
     1113 minor faults
     no major faults
0
 > (time (test-array-ref))
(time (test-array-ref))
     0.896669 secs real time
     0.896687 secs cpu time (0.893922 user, 0.002765 system)
     no collections
     no bytes allocated
     no minor faults
     no major faults
0
 > (time (test-loops))
(time (test-loops))
     0.027988 secs real time
     0.027980 secs cpu time (0.027979 user, 0.000001 system)
     no collections
     no bytes allocated
     no minor faults
     no major faults
0

Now, Gambit does a particularly good job of compiling test-loops (short
of eliminating the loops entirely) and a particularly poor job of
compiling test-getter, etc.

So if the difference is 0.348399 versus 0.896669 for using

(let ((A_ (array-getter A))) ... (A_ i j))

versus
( ... (array-ref A i j))

then it's worth investigating.

But array-set! will be

(array-set! A v i0 ...)

I can't see how to make it efficient any other way.

> They are also convenient for arrays that need to
> be accessed and changed randomly, such as a multidimensional histogram.

I don't get this, it would seem that you'd have to choose randomly among
the arrays, and not among the indices, to benefit from array-ref and
array-set! over (let ((A_ (array-getter A)) (A! (array-setter A))) ...)

> Even the APL family of languages provide these.

When I was 15 I went to "math nerd" camp for a week at the University of
Waterloo, where one of the activities was programming in APL.  (We even
had terminals with the special character set.) I didn't "get" it, I used
the equivalents of array-ref and array-set! everywhere, I was thinking
too much in Fortran II terms (which I had just learned that year) and my
program (which was a simulation of some sort, I believe) was, in APL
terms, a failure.  (In other words, I think the exercise was to get me
to think in a different way---the APL way---and that didn't happen.)

Using array-ref and array-set! extensively in programs that use this
SRFI will result in slower, clumsier, code that will compare (more)
unfavorably to codes written in a similar way in other languages (for
which compile-time bounds checks, number of arguments checks, etc., can
be elided at run time using compile-time type declarations, etc.).

I believe that a high-performance implementation of this SRFI is
possible with enough work---see the new bulk copying of array-assign!,
which runs 50 times faster than element-by-element copying when moving
u32 elements between specialized arrays whose elements are in-order and
adjacent.  We can do a similar test with u8 elements (see the previous
email for code, change u32 to u8):

(time (array-assign! b ((array-getter curried-a) 0)))
     0.009943 secs real time
     0.009942 secs cpu time (0.009615 user, 0.000327 system)
     no collections
     896 bytes allocated
     no minor faults
     no major faults
(time (array-assign! b d))
     1.587081 secs real time
     1.587101 secs cpu time (1.583597 user, 0.003504 system)
     no collections
     64 bytes allocated
     no minor faults
     no major faults

That's about 160 times as fast using bulk move instead of
element-by-element move.

I don't believe that that kind of improvement is possible in programs
using array-ref and array-set! extensively for array operations.  (A
similar problem arises regularly in Matlab, too.)

Brad