Generalized arrays in the presence of call/cc Bradley Lucier (27 Jul 2022 12:39 UTC)
Re: Generalized arrays in the presence of call/cc Marc Nieper-Wißkirchen (27 Jul 2022 13:15 UTC)
Re: Generalized arrays in the presence of call/cc Bradley Lucier (27 Jul 2022 15:11 UTC)
Re: Generalized arrays in the presence of call/cc John Cowan (27 Jul 2022 20:19 UTC)
Re: Generalized arrays in the presence of call/cc Marc Nieper-Wißkirchen (27 Jul 2022 21:04 UTC)
Re: Generalized arrays in the presence of call/cc Bradley Lucier (28 Jul 2022 21:47 UTC)
Re: Generalized arrays in the presence of call/cc John Cowan (30 Jul 2022 15:33 UTC)
Re: Generalized arrays in the presence of call/cc Marc Nieper-Wißkirchen (30 Jul 2022 17:06 UTC)
Re: Generalized arrays in the presence of call/cc Bradley Lucier (30 Jul 2022 19:48 UTC)
Re: Generalized arrays in the presence of call/cc Bradley Lucier (06 Aug 2022 17:58 UTC)
Re: Generalized arrays in the presence of call/cc John Cowan (06 Aug 2022 18:05 UTC)
Re: Generalized arrays in the presence of call/cc Bradley Lucier (10 Aug 2022 18:27 UTC)

Re: Generalized arrays in the presence of call/cc Bradley Lucier 28 Jul 2022 21:47 UTC
On 7/27/22 5:04 PM, Marc Nieper-Wißkirchen wrote:
> *The real difference would be writing the code in Fortran or C instead,
> so I think that maintaining Scheme's pureness is more important.

If we have "call/cc safe" routines in this Scheme library, then those
routines in Fortran or C would have different semantics to the Scheme
routines; if we think using the C or Fortran routines would be OK in
Scheme, then we could write this Scheme library to have the same
semantics as the C or Fortran library if we so choose.

An additional "difficulty" we have is that the setters of generalized
arrays are general procedures that themselves could capture and reinvoke
the continuation.  If the user wants to do that, go to! I can't see how
to deal with it.

But it seems that the costs of having call/cc safe array-copy, for
example, are not too onerous in Gambit.

I implemented a primitive call/cc-safe routine safe-array->reversed-list
for later use in list->array (so everything was call/cc-safe).

In Gambit, call/cc-safe or -unsafe procedures seem to make little
difference for either large or small arrays in practice.  Here are some
timing runs.

We work with NxN arrays with the following sizes and iterations:

(define bigsize 1000)
(define smallsize 4)

(define bigiterations 10)
(define smalliterations (* bigiterations (square (quotient bigsize
smallsize))))

We define three random fixnum and flonum specialized arrays with those
sizes, map a+b*c componentwise (using specialized fixnum or flonum
operations) on those arrays to get a generalized array, copy the
generalized array to specialized arrays with generic-storage-class and
f64-storage-class, respectively, and find the following results:

fixnum big arrays:

"Unsafe" copy:        0.383701 secs cpu time   80011360 bytes allocated
"Safe" copy:          0.545923 secs cpu time  400008320 bytes allocated

flonum big arrays:

"Unsafe" copy:        0.619352 secs cpu time 1280011264 bytes allocated
"Safe" copy:          0.848479 secs cpu time 1760008800 bytes allocated

Gambit boxes flonums a lot when crossing procedure boundaries.

fixnum small arrays:

"Unsafe" copy:        0.709278 secs cpu time  817999808 bytes allocated
"Safe" copy:          0.994238 secs cpu time 1200000000 bytes allocated

flonum small arrays:

"Unsafe" copy:        0.994272 secs cpu time 2180000000 bytes allocated
"Safe" copy:          1.010598 secs cpu time 2480000000 bytes allocated

We also built big and small specialized arrays with u1-storage-class, mapped

(lambda (a b c) (fxior a (fxand b c)))

and copied the result to a u1-storage-class array.  We found

u1 big arrays:

"Unsafe" copy:        0.470481 secs cpu time    1262960 bytes allocated
"Safe" copy:          0.564570 secs cpu time  478759120 bytes allocated

u1 small arrays:

"Unsafe" copy:        0.822633 secs cpu time  799874952 bytes allocated
"Safe" copy:          0.852193 secs cpu time 1100000000 bytes allocated

A "typical" report is at the bottom.  (We got the statistics above from
several different runs.)

The greatest slowdown we saw with "unsafe" copy was a factor of
1.4227823226939726 for fixnum operations, 1.369946331004017 for flonum
operations, and 1.1999846965127179 for bit operations (which seem to be
inherently slow because of how we access and write them).

These slowdowns are smaller for 4x4 arrays than for 1000x1000 arrays,
presumably because of the overhead of creating the many more resulting
arrays.

Memory usage increased by the greatest factor for large arrays, a factor
of 4.999394086039782 for fixnum arrays, 1.3749947750459797 for flonum
arrays, and 379.0770254006461 (!) for bit arrays (for which we store a
single bit in a 3-word, 192-bit, cons cell).

Just for giggles, I also timed assigning one specialized array to another:

(time
  (do ((i 0 (fx+ i 1)))
      ((fx= i bigiterations))
    (array-assign! bigfix-A bigfix-B)))

and similarly for small arrays and flonum arrays.  The times were

big array assignments:
fixnum:               0.006319 secs cpu time       1920 bytes allocated
flonum:               0.006837 secs cpu time       1920 bytes allocated

smll array assignments:
fixnum:               0.103023 secs cpu time  120000000 bytes allocated
flonum:               0.118339 secs cpu time  120000000 bytes allocated

So assigning a large fixnum array is quite a bit faster than setting up
a generalized array and copying the elements to a specialized array.

The output of an illustrative run is included below.  The test program
is also included.

I suspect I'll implement call/cc-safe procedures that evaluate all
elements of a generalized array for later transfer to a list, vector,
another array, etc.  This will take some time.

Brad

(load "time-arrays")
(time (do ((i 0 (fx+ i 1))) ((fx= i bigiterations)) (array-copy
(array-map (lambda (a b c) (fx+ a (fx* b c))) bigfix-A bigfix-B
bigfix-C) generic-storage-class)))
     0.383852 secs real time
     0.382333 secs cpu time (0.370329 user, 0.012004 system)
     no collections
     80011360 bytes allocated
     15865 minor faults
     no major faults
(time (do ((i 0 (fx+ i 1))) ((fx= i bigiterations)) (safe-array-copy
(array-map (lambda (a b c) (fx+ a (fx* b c))) bigfix-A bigfix-B
bigfix-C) generic-storage-class)))
     0.555263 secs real time
     0.553962 secs cpu time (0.513986 user, 0.039976 system)
     5 collections accounting for 0.093774 secs real time (0.077740
user, 0.015857 system)
     400008320 bytes allocated
     21534 minor faults
     no major faults
(time (do ((i 0 (fx+ i 1))) ((fx= i bigiterations)) (array-copy
(array-map (lambda (a b c) (fl+ a (fl* b c))) bigflo-A bigflo-B
bigflo-C) f64-storage-class)))
     0.610901 secs real time
     0.610795 secs cpu time (0.610771 user, 0.000024 system)
     12 collections accounting for 0.147686 secs real time (0.147672
user, 0.000002 system)
     1280011264 bytes allocated
     2881 minor faults
     no major faults
(time (do ((i 0 (fx+ i 1))) ((fx= i bigiterations)) (safe-array-copy
(array-map (lambda (a b c) (fl+ a (fl* b c))) bigflo-A bigflo-B
bigflo-C) f64-storage-class)))
     0.917943 secs real time
     0.917773 secs cpu time (0.845848 user, 0.071925 system)
     14 collections accounting for 0.365843 secs real time (0.353801
user, 0.011983 system)
     1760008800 bytes allocated
     61484 minor faults
     no major faults
(time (do ((i 0 (fx+ i 1))) ((fx= i smalliterations)) (array-copy
(array-map (lambda (a b c) (fx+ a (fx* b c))) smallfix-A smallfix-B
smallfix-C) generic-storage-class)))
     0.720171 secs real time
     0.720004 secs cpu time (0.707957 user, 0.012047 system)
     8 collections accounting for 0.100387 secs real time (0.096376
user, 0.003993 system)
     842000016 bytes allocated
     11476 minor faults
     no major faults
(time (do ((i 0 (fx+ i 1))) ((fx= i smalliterations)) (safe-array-copy
(array-map (lambda (a b c) (fx+ a (fx* b c))) smallfix-A smallfix-B
smallfix-C) generic-storage-class)))
     0.792686 secs real time
     0.792635 secs cpu time (0.792635 user, 0.000000 system)
     12 collections accounting for 0.147973 secs real time (0.147967
user, 0.000000 system)
     1200000000 bytes allocated
     no minor faults
     no major faults
(time (do ((i 0 (fx+ i 1))) ((fx= i smalliterations)) (array-copy
(array-map (lambda (a b c) (fl+ a (fl* b c))) smallflo-A smallflo-B
smallflo-C) f64-storage-class)))
     0.995574 secs real time
     0.995337 secs cpu time (0.992275 user, 0.003062 system)
     23 collections accounting for 0.279601 secs real time (0.279403
user, 0.000013 system)
     2180000000 bytes allocated
     no minor faults
     no major faults
(time (do ((i 0 (fx+ i 1))) ((fx= i smalliterations)) (safe-array-copy
(array-map (lambda (a b c) (fl+ a (fl* b c))) smallflo-A smallflo-B
smallflo-C) f64-storage-class)))
     1.012563 secs real time
     1.012227 secs cpu time (1.005246 user, 0.006981 system)
     25 collections accounting for 0.302828 secs real time (0.298791
user, 0.003929 system)
     2480000000 bytes allocated
     no minor faults
     no major faults
(time (do ((i 0 (fx+ i 1))) ((fx= i bigiterations)) (array-copy
(array-map (lambda (a b c) (fxior a (fxand b c))) bigu1-A bigu1-B
bigu1-C) u1-storage-class)))
     0.475493 secs real time
     0.475470 secs cpu time (0.475470 user, 0.000000 system)
     no collections
     1262960 bytes allocated
     no minor faults
     no major faults
(time (do ((i 0 (fx+ i 1))) ((fx= i bigiterations)) (safe-array-copy
(array-map (lambda (a b c) (fxior a (fxand b c))) bigu1-A bigu1-B
bigu1-C) u1-storage-class)))
     0.565081 secs real time
     0.564904 secs cpu time (0.564904 user, 0.000000 system)
     5 collections accounting for 0.077949 secs real time (0.077945
user, 0.000000 system)
     478759120 bytes allocated
     4645 minor faults
     no major faults
(time (do ((i 0 (fx+ i 1))) ((fx= i smalliterations)) (array-copy
(array-map (lambda (a b c) (fxior a (fxand b c))) smallu1-A smallu1-B
smallu1-C) u1-storage-class)))
     0.830780 secs real time
     0.830472 secs cpu time (0.826483 user, 0.003989 system)
     8 collections accounting for 0.097824 secs real time (0.093837
user, 0.003948 system)
     799874952 bytes allocated
     2606 minor faults
     no major faults
(time (do ((i 0 (fx+ i 1))) ((fx= i smalliterations)) (safe-array-copy
(array-map (lambda (a b c) (fxior a (fxand b c))) smallu1-A smallu1-B
smallu1-C) u1-storage-class)))
     0.848355 secs real time
     0.848194 secs cpu time (0.848189 user, 0.000005 system)
     11 collections accounting for 0.134465 secs real time (0.134465
user, 0.000000 system)
     1100000000 bytes allocated
     no minor faults
     no major faults
(time (do ((i 0 (fx+ i 1))) ((fx= i bigiterations)) (array-assign!
bigfix-A bigfix-B)))
     0.006319 secs real time
     0.006315 secs cpu time (0.006315 user, 0.000000 system)
     no collections
     1920 bytes allocated
     no minor faults
     no major faults
(time (do ((i 0 (fx+ i 1))) ((fx= i bigiterations)) (array-assign!
bigflo-A bigflo-B)))
     0.006840 secs real time
     0.006837 secs cpu time (0.006837 user, 0.000000 system)
     no collections
     1920 bytes allocated
     1 minor fault
     no major faults
(time (do ((i 0 (fx+ i 1))) ((fx= i smalliterations)) (array-assign!
smallfix-A smallfix-B)))
     0.103050 secs real time
     0.103023 secs cpu time (0.103023 user, 0.000000 system)
     1 collection accounting for 0.012488 secs real time (0.012488 user,
0.000000 system)
     120000000 bytes allocated
     no minor faults
     no major faults
(time (do ((i 0 (fx+ i 1))) ((fx= i smalliterations)) (array-assign!
smallflo-A smallflo-B)))
     0.118410 secs real time
     0.118339 secs cpu time (0.118339 user, 0.000000 system)
     2 collections accounting for 0.024541 secs real time (0.024538
user, 0.000000 system)
     120000000 bytes allocated
     no minor faults
     no major faults