Row-major order, column-major order, and currying arrays: an exegesis Bradley Lucier (18 Aug 2024 21:32 UTC)
|
Re: Row-major order, column-major order, and currying arrays: an exegesis
Arthur A. Gleckler
(18 Aug 2024 21:36 UTC)
|
Re: Row-major order, column-major order, and currying arrays: an exegesis
Bradley Lucier
(18 Aug 2024 22:08 UTC)
|
Re: Row-major order, column-major order, and currying arrays: an exegesis
Arthur A. Gleckler
(18 Aug 2024 22:20 UTC)
|
TL; DR: I was thinking about a version of SRFI 231 with array elements stored in column-major order, as in Fortran. The current version of SRFI 231 specifies row-major order of storing elements, as in C or Common Lisp (I believe). That makes some things easier, but also made array currying a bit tricky. In my opinion, array currying is important in algorithms, so I made this discussion before continuing with the idea of library version with arrays in column-major order. Background: One big reason to consider using column-major order is that LAPACK, the BLAS, etc., originated in Fortran, so the original algorithms, and many optimized versions, assume column-major storage order. (Many routines allow one to indicate whether to compute with a given matrix argument or the transpose of a given matrix argument, so if data is stored in row-major order, one can simply flip that switch to get a result. This may or may not incur additional costs.) There is a second, perhaps less significant, reason to consider moving to column-major order. In the SRFI 231 code for stepping through array elements, there is an inefficiency caused by row-major order for arrays with dimension so large that hand-written (or macro-expanded) specialized loops cannot be used, so that arguments are stored in lists and one calls, e.g., (apply A_ indices) where A_ is (array-getter A) and indices is a list of indices. The SRFI 231 sample implementation avoids mutating argument lists in order to be "call/cc-safe". Lists are most efficiently operated on from the left side (the "head"), so getting to the next multi-index in column-major order, where the left-most index varies most quickly, is simply (let ((new-indices (cons (+ (car indices) 1) (cdr indices)))) (apply A_ new-indices)) (while keeping track of whether you've reached the upper bound of that axis, of course). Whereas, with elements stored in row-major order, with the right-most index varying most quickly, avoiding mutation means that while stepping through array elements in order, one keeps track of the *reversed* argument list and computes something like (let ((new-reversed-indices (cons (+ (car reversed-indices) 1) (cdr reversed-indices)))) (apply A_ (reverse new-reversed-indices))) with an extra call to "reverse". In the sample implementation this is usually done when the dimension of the array is bigger than 8, so it's not of much practical concern, but it is a bit tricky. Currying: Currying in logic and programming turns, e.g., a function of three variables into a function of one variable that returns a function of two variables: (lambda (x y z) ...) => (lambda (x) (lambda (y z) ...)) I just did there what *everyone* does without comment or perhaps even much thought: The *leftmost* argument is stripped off for the outer lambda, leaving the two rightmost arguments for the inner lambda. And that seems very natural for arrays, if you want to look at two-dimensional slices of a three-dimensional array A, then the getter B_ of the two dimensional slice B for fixed i_0 would be B_ == (lambda (i_1 i_2) (A_ i_0 i_1 i_2)) (Of course, if you want to look at slices that are oriented differently one would permute the axes of A to give a new array A-bar and then curry that array.) Now let's talk about row-major versus column-major order. Assume that A is a newly allocated three-dimensional array in row-major order, so that the elements of A are stored contiguously and in increasing index order in the backing store (are "packed" in the terminology of SRFI 231). Then if B is a curried slice of A, then the elements of B are also "packed". But if A a newly allocated three-dimensional array in *column-major* order, the elements of the curried slice B are *not* stored contiguously and in increasing column-major order, i.e., they are not "packed". So some questions arise for arrays stored in column-major order: 1. Should one strip indices from the *right* when currying arrays in column-major order? So a two-dimensional slice B for fixed i_2 would have getter B_ with B_ == (lambda (i_0 i_1) (A_ i_0 i_1 i_2)) Then B would be "packed". 2. Question 1 is premised on the idea that canonically, the elements of curried subarray slices should be "packed" if the elements of the original array are "packed". This is an assumption that programmers might make. If programmers will *not* make this assumption, then it doesn't matter much that the elements of a curried subarray are not "packed" even when the elements of the original array are "packed". 3. Maybe it won't be an issue because, as stated earlier, if the array axes are permuted before currying, the curried array slices will not be "packed", anyway. So the answer to Question 2 might be "No, programmers won't assume that." Brad