Bulk copying is a lot faster than element-by-element copying in one example.
Bradley Lucier 03 May 2020 20:18 UTC
A little over a week ago I implemented block copying for specialized
arrays whose elements are in order and adjacent in memory.
The code uses the various @vector-copy! routines from R7RS, which for
Gambit I defined using xxxxxx@vector-move! routines, which, at bottom, use
memmove.
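As a rough illustration of the idea (this is not the code in generic-arrays.scm), the sketch below contrasts a per-element loop with a single call to the R7RS vector-copy! procedure. Plain vectors stand in for the homogeneous array bodies, and the helper names copy-elementwise! and copy-bulk! are made up for this example.

```scheme
(import (scheme base))

;; Hypothetical helper: copy one element per loop iteration.
(define (copy-elementwise! to from)
  (do ((i 0 (+ i 1)))
      ((= i (vector-length from)) to)
    (vector-set! to i (vector-ref from i))))

;; Hypothetical helper: hand the whole range to vector-copy! in one call.
;; vector-copy! has an unspecified return value, so return the target.
(define (copy-bulk! to from)
  (vector-copy! to 0 from)
  to)

(define src (vector 1 2 3 4 5))
(define dst-a (make-vector 5 0))
(define dst-b (make-vector 5 0))

(copy-elementwise! dst-a src)
(copy-bulk! dst-b src)
```

Both calls leave the target holding the same elements as src. The difference is that the bulk version gives the implementation a chance to move the whole block at once.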
So today I compiled generic-arrays.scm and the following test code:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(define a (make-specialized-array (make-interval '#(4 10000 10000))
                                  u32-storage-class
                                  #f))
(define b (make-specialized-array (make-interval '#(10000 10000))
                                  u32-storage-class
                                  #f))
;;; Access a through elements of dimension 10000 x 10000.
(define curried-a (array-curry a 2))
;;; Assign the first 10000 x 10000 subarray of a to b
;;; using bulk copy. (400,000,000 bytes)
(time (array-assign! b ((array-getter curried-a) 0)))
;;; Set d to a general array accessing the second 10000 x 10000
;;; subarray of a.
(define d (let ((d_ (array-getter ((array-getter curried-a) 1))))
            (make-array (make-interval '#(10000 10000))
                        d_)))
;;; Assign d to b using element-by-element copy
(time (array-assign! b d))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
The times are:
(load "assign-time-test")
(time (array-assign! b ((array-getter curried-a) 0)))
0.039606 secs real time
0.039588 secs cpu time (0.039588 user, 0.000000 system)
no collections
896 bytes allocated
1 minor fault
no major faults
(time (array-assign! b d))
1.947715 secs real time
1.947511 secs cpu time (1.947511 user, 0.000000 system)
no collections
64 bytes allocated
no minor faults
no major faults
As expected, the difference is significant: the block move code
transfers 10 GB/second on my seven-year-old Linux box, while the
element-by-element code transfers about 200 MB/second.
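The transfer rates quoted above follow from the byte count and the real times reported by the two runs. A small sketch of the arithmetic, with variable names chosen only for this example:

```scheme
(import (scheme base))

;; One 10000 x 10000 subarray of u32 elements, 4 bytes each.
(define bytes-copied (* 10000 10000 4))      ; 400,000,000 bytes

;; Real times reported by the two (time ...) runs, in seconds.
(define bulk-seconds 0.039606)
(define elementwise-seconds 1.947715)

;; Throughput in bytes per second.
(define bulk-rate (/ bytes-copied bulk-seconds))               ; about 1.01e10
(define elementwise-rate (/ bytes-copied elementwise-seconds)) ; about 2.05e8

;; How many times faster the bulk copy ran.
(define speedup (/ elementwise-seconds bulk-seconds))          ; about 49
```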
Brad