Generalized arrays should not be broadcast Bradley Lucier 01 Nov 2025 02:49 UTC

I posted a version of this email to the Racket Discourse at

https://racket.discourse.group/t/arrays-proposal-array-broadcasting-and-adding-new-axes/3948/11

Here is an example of broadcasting that does not involve mutation and
illustrates a situation that has been troubling me.

TL;DR: I think broadcasting, either explicit or implicit, should be
applied only to specialized (strict) arrays, and the resulting arrays
should be immutable.

You start with an N x N array of 3-vectors, a vector-field.

The task is to subtract the average of the 3-vectors from the N x N
array of 3-vectors.

With broadcasting, one might write:

(define (recenter A)

   ;; A is an array of dimension N x N  x 3
   ;; representing a two-dimensional N x N field of three-vectors.

   ;; We want to subtract the mean of the three-vectors from all the
   ;; entries of A, which we could write as follows.

   ;; The "problem" is that the array-map marked by
   ;; <<<<<<<<<<<<<<<
   ;; returns a generalized array (and I don't want to change that)
   ;; so every time an element of the average of the three-vectors is
   ;; accessed, the internal array-reduce is recomputed.

   (array-copy
    (array-map fl- A (array-broadcast
                      (array-map  ;; <<<<<<<<<<<<<<<
                       ;; calculate the average of each coordinate
                       ;; of the three-vectors.
                       (lambda (slice)
                         (/ (array-reduce fl+ slice)
                            (interval-volume (array-domain slice))))
                       (array-curry
                        ;; write A as a three-dimensional array of NxN
arrays
                        (array-permute
                         ;; put the coordinates of the three-vectors first
                         A '#(2 0 1))
                        2))
                      (array-domain A)))))

Here I've explicitly called array-broadcast, but with implicit
broadcasting that explicit call would be elided.

I now have an efficient implementation of array broadcasting for
specialized (strict) arrays, but not for generalized (nonstrict) arrays
(which is not so easy, in general).  So here are two routines with as
efficient implementations of broadcasting generalized and specialized
arrays as I can imagine:

(define (recenter-generalized A)
   (array-copy
    (array-map - A (let* ((average
                           ;; the generalized array result is not copied
to a specialized array
                           (array-map
                            (lambda (slice)
                              (/ (array-reduce fl+ slice)
                                 (interval-volume (array-domain slice))))
                            (array-curry
                             (array-permute
                              A '#(2 0 1))
                             2)))
                          (average-getter ;; use internal unsafe getter
                           (%%array-unsafe-getter average)))
                     (make-array (array-domain A)
                                 ;; the most efficient way to broadcast
                                 (lambda (i j k)
                                   (average-getter k)))))))

(define (recenter-specialized A)
   (array-copy
    (array-map - A (array-broadcast
                    ;; copy the array to a specialized array first
                    (array-copy
                     (array-map
                      (lambda (slice)
                        (/ (array-reduce fl+ slice)
                           (interval-volume (array-domain slice))))
                      (array-curry
                       (array-permute
                        A '#(2 0 1))
                       2)))
                    (array-domain A)))))

And we can test it on very small and not quite so small arrays:

(define A (list->array (make-interval '#(10 10 3)) (map inexact (iota
300))))
(define B (time (recenter-specialized A)))
(define C (time (recenter-generalized A)))

(define A* (list->array (make-interval '#(100 100 3)) (map inexact (iota
30000))))
(define B* (time (recenter-specialized A*)))
(define C* (time (recenter-generalized A*)))

With a 10 x 10 array of 3-vectors, the timings are not sooooo different:

(time (recenter-specialized A))
     0.000084 secs real time
     0.000085 secs cpu time (0.000047 user, 0.000038 system)
     no collections
     37376 bytes allocated
     12 minor faults
     no major faults
     292467 cpu cycles
(time (recenter-generalized A))
     0.000894 secs real time
     0.000894 secs cpu time (0.000666 user, 0.000228 system)
     no collections
     574640 bytes allocated
     71 minor faults
     no major faults
     3202257 cpu cycles

But with a 100 x 100 array of 3-vectors, the timings differ dramatically:

(time (recenter-specialized A*))
     0.001860 secs real time
     0.001860 secs cpu time (0.001860 user, 0.000000 system)
     no collections
     783088 bytes allocated
     84 minor faults
     no major faults
     6671997 cpu cycles
(time (recenter-generalized A*))
     6.449153 secs real time
     6.433722 secs cpu time (6.428914 user, 0.004808 system)
     no collections
     55809952 bytes allocated
     6907 minor faults
     no major faults
     23162995368 cpu cycles

A library should not allow the programmer to fall into such a hole, even
inadvertently, so I conclude that generalized (nonstrict) arrays should
not be broadcast.

But if you have a call

(array-map f A B)

and B is supposed to be broadcast to the domain (shape) of A, then if B
is generalized one has two choices: raise an error and say that B is not
specialized; or silently copy B to a specialized B*, and then broadcast B*.

The thing is, in all other circumstances, the array elements of the
arguments to array-map are not evaluated by array-map---array-map is
supposed to set up work to do be done *later* by routines like
array-fold or array-every.  Copying B to B* would require array-map
evaluating B's getter on its domain, which would break this convention
of the library.

So I think the correct choice is to restrict array broadcasting to
specialized arrays.

Brad