I originally planned to write this email as a point-by-point comparison of the two array SRFIs currently on the table for the Orange Edition of R7RS-large.  However, it soon became apparent that most of the procedures available only in one SRFI could readily be implemented in the other.  So I decided to write about what they have in common and what some of the more fundamental differences are, with occasional mention of specifics.

Common ideas:

Both SRFIs provide multidimensional arrays with an arbitrary number of dimensions, each of which has bounds that are a range of exact integers.  So, for example, a 3-dimensional array might have bounds from 0 to 3 in the first dimension, 2 to 4 in the second, and -10 to 10 in the third.  As with all other ordered containers, the lower bound is inclusive and the upper bound exclusive.
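To make the bounds convention concrete, here is a minimal sketch in Python rather than Scheme. It follows neither SRFI's API; the names `bounds`, `in_bounds`, and `all_indices` are invented for illustration. It uses the 3-dimensional example above, with each dimension given as an inclusive lower bound and an exclusive upper bound.

```python
from itertools import product

# Hypothetical representation: one (lower, upper) pair per dimension,
# lower bound inclusive, upper bound exclusive.
bounds = [(0, 3), (2, 4), (-10, 10)]

def in_bounds(bounds, index):
    """True if index has one coordinate per dimension, each within its range."""
    return len(index) == len(bounds) and all(
        lo <= i < hi for (lo, hi), i in zip(bounds, index)
    )

def all_indices(bounds):
    """Iterate over every valid multi-index, last dimension varying fastest."""
    return product(*(range(lo, hi) for lo, hi in bounds))
```

Under this convention (2, 3, 9) is a valid index but (3, 2, 0) is not, since 3 is the exclusive upper bound of the first dimension.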

Furthermore, there are three kinds of arrays.  Arrays of the first kind are backed by a linearly addressed storage object similar to a vector: that is, indexed from 0 to the total number of elements in the array, less 1.  These arrays are always mutable, and are the kind known from Fortran, C, Common Lisp, Java, and many other programming languages.

Arrays of the second kind are defined by a procedure that accepts as many exact integer arguments as the array has dimensions, each argument lying within the bounds of the corresponding dimension.  The procedure may calculate the value of the array elements or may use some other kind of storage to provide them.  Arrays of this kind are immutable.

Arrays of the third kind are mutable and are defined by two procedures, one for obtaining the values of array elements just like for arrays of the second kind, and another for changing them.  It is typically but not necessarily the case that when the second procedure is used to set the value of an array element, the first procedure will get back the same value.
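The three kinds can be sketched roughly as follows.  This is an illustration in Python, not the API of either SRFI: the class names and the `get`/`set` methods are invented, and bounds checking is omitted for brevity.

```python
class StorageArray:
    """First kind: backed by a flat list addressed from 0 to size - 1."""
    def __init__(self, bounds, fill=0):
        self.bounds = bounds
        self.extents = [hi - lo for lo, hi in bounds]
        size = 1
        for e in self.extents:
            size *= e
        self.store = [fill] * size

    def _offset(self, index):
        # Row-major layout: the last dimension varies fastest.
        off = 0
        for (lo, _), extent, i in zip(self.bounds, self.extents, index):
            off = off * extent + (i - lo)
        return off

    def get(self, *index):
        return self.store[self._offset(index)]

    def set(self, value, *index):
        self.store[self._offset(index)] = value


class GetterArray:
    """Second kind: immutable, defined by a getter procedure alone."""
    def __init__(self, bounds, getter):
        self.bounds = bounds
        self.get = getter


class GetterSetterArray(GetterArray):
    """Third kind: mutable, defined by a getter and a setter."""
    def __init__(self, bounds, getter, setter):
        super().__init__(bounds, getter)
        self.set = setter


# First kind: a 3 x 2 array with bounds [0,3) and [2,4).
a = StorageArray([(0, 3), (2, 4)])
a.set(7, 1, 3)

# Second kind: a multiplication table computed on demand.
table = GetterArray([(1, 10), (1, 10)], lambda i, j: i * j)

# Third kind: a large array stored sparsely in a dictionary.
sparse = {}
s = GetterSetterArray(
    [(0, 1000), (0, 1000)],
    lambda i, j: sparse.get((i, j), 0),
    lambda v, i, j: sparse.__setitem__((i, j), v),
)
s.set(5, 10, 20)
```

In this sketch the getter and setter of `s` agree with each other, but nothing forces them to; that is the "typically but not necessarily" caveat above.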

Differences:

1) SRFI 122 does not support degenerate arrays.  An array is degenerate if it has zero dimensions (in which case it has one element) or has at least one dimension whose lower bound is not less than its upper bound (in which case it has no elements).  SRFI 164 fully supports such arrays; to make SRFI 122 do so would require a lot of shimming.

2) SRFI 122 provides a disjoint type of objects that specify the bounds of an array to be created.  SRFI 164 uses 2 x n arrays for this purpose, with a mechanism for breaking the circular dependency.

3) SRFI 122 specifies how the user can create storage objects that are restricted to contain only particular values such as bytes or 64-bit floats.  SRFI 164 leaves that up to the implementation.

4) SRFI 122 has no general-purpose setter procedures such as array-set! or array-copy!, so all array mutation must be done through the array's own setter procedure, which can be obtained via SRFI 122's extensive introspection procedures.  SRFI 164 has general-purpose setters but has less introspective ability.

5) SRFI 122 provides more convenience functions than SRFI 164, including in particular a lazy array-map operator that, given an array and a mapping function, provides an array of the second kind.

6) SRFI 122's transformation procedures are affine, which means they are very efficient.  SRFI 164 provides general transformations, which can of course be written on top of SRFI 122, but aren't provided by default.

7) SRFI 164's procedure names and other terminology are backward compatible with the original SRFI 25 and to some extent with Common Lisp; SRFI 122's terms are not.

8) SRFI 122 provides user control of safe vs. unsafe array accesses (the second choice leaves safety up to the underlying storage in arrays of the first kind); SRFI 164 does not.

9) Neither sample implementation is entirely portable.  SRFI 122 uses Gambit's non-hygienic `define-macro`, DSSSL-style keywords, and the non-portable `define-structure` macro.  SRFI 164 is written partly in Kawa and partly in Java.
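Points 1 and 6 above can be illustrated with a short sketch, again in Python and again using invented names (`element_count`, `affine`, `compose_affine`) rather than anything from either SRFI.  The first function shows why a zero-dimensional array has one element and an array with an empty dimension has none.  The other two show one plausible reason affine index transformations are cheap: composing two affine maps gives another affine map, so a chain of transformations can be collapsed into a single one.  That reading is my assumption; it is not a description of how the SRFI 122 sample implementation works.

```python
def element_count(bounds):
    """Number of elements for a list of (lower, upper) bounds."""
    count = 1                      # empty product: zero dimensions, one element
    for lo, hi in bounds:
        count *= max(0, hi - lo)   # lower >= upper: no elements at all
    return count

def affine(matrix, offset):
    """Return the index map x -> matrix * x + offset."""
    def transform(*index):
        return tuple(
            sum(m * i for m, i in zip(row, index)) + b
            for row, b in zip(matrix, offset)
        )
    return transform

def compose_affine(m1, b1, m2, b2):
    """Coefficients of x -> m1 * (m2 * x + b2) + b1, itself an affine map."""
    m = [[sum(m1[i][k] * m2[k][j] for k in range(len(m2)))
          for j in range(len(m2[0]))]
         for i in range(len(m1))]
    b = [sum(m1[i][k] * b2[k] for k in range(len(b2))) + b1[i]
         for i in range(len(m1))]
    return m, b

# A translation by (2, 3) followed by a transposition of the two indices.
translate = ([[1, 0], [0, 1]], [2, 3])
transpose = ([[0, 1], [1, 0]], [0, 0])
m, b = compose_affine(*transpose, *translate)
combined = affine(m, b)
```

Applying `combined` to an index gives the same result as applying the translation and then the transposition, but with one matrix multiplication instead of two.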

I hope that this will help people interested in arrays to choose which to vote for (there will also be the usual options of "abstain" and "no arrays").  I think arrays are an important capability for the large language to have, but a majority of legal votes cast (an abstention is not a vote) is required in order to include anything.  Corrections to this document are of course welcome.



John Cowan          http://vrici.lojban.org/~cowan        xxxxxx@ccil.org
Your worships will perhaps be thinking that it is an easy thing
to blow up a dog? [Or] to write a book?
    --Don Quixote, Introduction