Broadcasting and adding dimensions to arrays; array "slice" notation. Bradley Lucier 03 Feb 2025 23:19 UTC

TL;DR:

Add array broadcasting and insertion of new axes into arrays; adopt a
notation for specifying more complex affine transformations on array
domains.  See the list at the bottom of this email.

(A) Judging from NumPy documentation and examples, it appears that array
broadcasting is used extensively for certain applications.

In my opinion, if you have inner and outer products and array-map, all
with Scheme closures as operators, you can do the examples I've seen
almost as easily in SRFI 231, but we should provide the tools that
people are used to.

NumPy has the following functions:

numpy.broadcast_arrays:

https://numpy.org/doc/stable/reference/generated/numpy.broadcast_arrays.html

numpy.broadcast_shapes:

https://numpy.org/doc/stable/reference/generated/numpy.broadcast_shapes.html

numpy.broadcast_to:

https://numpy.org/doc/stable/reference/generated/numpy.broadcast_to.html

These are Bawden-style transformations that are, generally, not one to
one.  So one would have to make various decisions to add them to a
Scheme library:

1.  All axes in NumPy and Racket's math/array have lower bounds zero,
which simplifies the meaning of array broadcasting.  We'd need to
determine the semantics of array broadcasting with possibly nonzero
lower bounds of axes, as in SRFI 231.

2.  NumPy does not have "generalized arrays", and setting the result of
broadcast_arrays to writable is now deprecated; see

https://numpy.org/doc/stable/reference/generated/numpy.broadcast_arrays.html

I would suggest that

(a) broadcasting not be allowed with generalized arrays;

(b) the array results of broadcasting not be mutable.  (Previously I had
argued that the *arguments* to array broadcasting not be mutable, but
perhaps that's too restrictive.)

(c) broadcasting not be implicit, as in NumPy and Racket's math/array,
but require an explicit function call.
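
To make the "not one to one" point concrete, here is a minimal Python
sketch of explicit broadcasting viewed as an index mapping from the
result domain back to the source domain.  It assumes all lower bounds
are zero (as in NumPy) and follows NumPy's right-aligned shape rule;
the names broadcast_shapes and broadcast_index_map below are
hypothetical helpers written for this illustration, not calls into
NumPy or any SRFI 231 implementation.

```python
from itertools import zip_longest


def broadcast_shapes(*shapes):
    """Combine shapes aligned on the right; extents must agree or be 1."""
    result = []
    reversed_shapes = (reversed(s) for s in shapes)
    for extents in zip_longest(*reversed_shapes, fillvalue=1):
        non_unit = {e for e in extents if e != 1}
        if len(non_unit) > 1:
            raise ValueError(f"incompatible extents: {extents}")
        result.append(non_unit.pop() if non_unit else 1)
    return tuple(reversed(result))


def broadcast_index_map(source_shape, target_shape):
    """Return a function taking a target multi-index to a source multi-index.

    Assumption: zero lower bounds on every axis.
    """
    source_shape = tuple(source_shape)
    target_shape = tuple(target_shape)
    if broadcast_shapes(source_shape, target_shape) != target_shape:
        raise ValueError("source shape cannot be broadcast to target shape")
    # New axes are added on the left, so drop the leading target indices.
    offset = len(target_shape) - len(source_shape)

    def to_source(index):
        trailing = index[offset:]
        # An axis of extent 1 in the source is read at index 0 every time.
        return tuple(0 if extent == 1 else i
                     for i, extent in zip(trailing, source_shape))

    return to_source


to_source = broadcast_index_map((3, 1), (2, 3, 4))
# Two distinct result indices read the same source element, so writing
# through the result would be ambiguous.
print(to_source((0, 2, 0)), to_source((1, 2, 3)))
```

Because several result indices share one source element, the sketch
only ever reads through the mapping, which is in line with suggestion
(b) above.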

(B) Both NumPy and Racket's math/array allow adding new axes to existing
arrays, with an implicit broadcast of data in the argument array to the
larger array.  See, e.g.,

numpy.expand_dims

https://numpy.org/doc/stable/reference/generated/numpy.expand_dims.html

or math/array's array-axis-insert:

https://docs.racket-lang.org/math/array_transform.html#%28def._%28%28lib._math%2Farray..rkt%29._array-axis-insert%29%29

Broadcasting can implicitly add axes to the left of existing axes;
operations like expand_dims can place new axes between existing axes.
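
As a rough illustration of placing a new axis between existing axes,
the following Python sketch computes the new shape and the mapping
from result indices back to source indices.  It assumes zero lower
bounds; insert_axis is a hypothetical helper, and its extent parameter
is an assumption made for this example rather than the signature of
any of the functions cited above.

```python
def insert_axis(shape, axis, extent=1):
    """Insert a new axis of the given extent at position `axis`.

    Returns (new_shape, to_source), where to_source maps a multi-index
    in the result to a multi-index in the source array.
    Assumption: zero lower bounds on every axis.
    """
    shape = tuple(shape)
    if not 0 <= axis <= len(shape):
        raise ValueError("axis out of range")
    if extent < 0:
        raise ValueError("extent must be nonnegative")
    new_shape = shape[:axis] + (extent,) + shape[axis:]

    def to_source(index):
        # The index along the new axis is simply ignored.
        return tuple(index[:axis]) + tuple(index[axis + 1:])

    return new_shape, to_source


new_shape, to_source = insert_axis((2, 3), 1, extent=4)
print(new_shape)             # shape with the new axis in the middle
print(to_source((1, 0, 2)))  # same source element as (1, 3, 2)
```

With extent 1 the mapping is one to one; with a larger extent the
source data is implicitly repeated along the new axis, so the mapping
is not one to one.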

(C) Both NumPy and Racket's math/array package have powerful notations
for specifying what they call array slices, which specify certain types
of affine transformations to implement Bawden-style array
transformations (some of which are one to one, some of which are not).

These notations implicitly specify both the domain of a new result array
and the mapping from the result domain to the domain of the argument
array.  See, for example, Racket's math/array notation for slicing here:

https://docs.racket-lang.org/math/array_slicing.html#%28part._.Slice-.New-.Axis__add_an_axis%29

On that page, "Sequenceof Integer" does not specify a Bawden-style
affine array transformation and in general requires copying of an array,
so we do not consider it here.

As these array transformations may not be one to one, I'd recommend
restricting them to specialized arrays, with the result being immutable.

As the implementation of Racket's math/array makes clear, these
notations do not require any new syntax.

These slicing specifications assume in some cases that all array axes
have lower bounds zero, so it will be necessary to determine proper
semantics for arrays with nonzero lower bounds.
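
To show how a slice specification determines both a result domain and
an affine mapping, here is a small Python sketch for a single axis.  It
uses Python's built-in slice objects and assumes a zero lower bound on
the source axis; slice_as_affine is a hypothetical helper, and how this
should carry over to nonzero lower bounds is exactly the open question
above.

```python
def slice_as_affine(slc, extent):
    """Describe a slice of an axis [0, extent) as an affine map.

    Returns (count, start, step): the result axis is [0, count), and
    result index i reads source index start + step * i.
    """
    start, stop, step = slc.indices(extent)
    count = len(range(start, stop, step))
    return count, start, step


count, start, step = slice_as_affine(slice(1, 8, 2), 10)
source_indices = [start + step * i for i in range(count)]
# Agrees with ordinary Python slicing of the index sequence.
assert source_indices == list(range(10))[1:8:2]
print(count, start, step, source_indices)
```

A nonzero step always gives a one-to-one map on a single axis; the
maps that are not one to one come from combining slices with
broadcasting or new axes.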

I'm going to add these things to the list of things to change if a new
draft is needed.

Brad

Things to change in a future library:

1.  (srfi 231) is "safe", with an "unsafe" library (srfi 231 unsafe);
mixing safe and unsafe arrays and operations could be accomplished by
renaming routines.

2.  Get rid of array-freeze!

3.  Do not fix the order of evaluation of array elements in arguments to
the "bang" (!) procedures.

4.  Add (array-rebase array [lower-bounds]) to translate an array to
given lower bounds.

5.  Add broadcasting of arrays and adding new axes to arrays.

6.  Add a notation for quickly specifying Bawden-style array
transformations similar to that of NumPy or Racket's math/array.