Petite Chez Scheme is quite slow compared to Chez Scheme.
If you want to know how values and call-with-values compare
to mu in Chez Scheme, you have to compare using Chez Scheme,
not Petite Chez Scheme.
Unless you are benchmarking a slow interpreter (and what's
the point of that?), Joo ChurlSoo's original benchmark runs
too quickly to time accurately, so I changed the 100000 to
10000000 (ten million).
With that change, I measured the following on a SunBlade
1500 with no other users. All times are in seconds:
         Chez    Larceny   MzScheme
         v6.1    v0.90     v301

bm1:     10.9     6.5      25.4
bm2:     12.0    10.1      24.7
bm3:     11.2     6.4      22.6
bm4:     10.4    10.7      36.0
bm5:     10.5     6.0      17.3
bm6:     10.0     9.9      25.2
bm7:     10.8     6.8      25.5
bm8:     11.2    11.2      35.1
bm9:      1.6     3.6      12.3
bm10:     1.7     7.6      15.7
bm11:     2.1     4.7      17.2
bm12:     1.8     8.6      22.5
bm1: ten million uses of mu
bm2: ten million uses of values
bm3: ten million calls to a procedure that uses mu
bm4: ten million calls to a procedure that uses values
bm5: same as bm1, substituting 'one for 1, 'two for 2, and so on
bm6: same as bm2, substituting 'one for 1, 'two for 2, and so on
bm7: same as bm3, substituting 'one for 1, 'two for 2, and so on
bm8: same as bm4, substituting 'one for 1, 'two for 2, and so on
bm9: same as bm5, using a counter instead of for-each
bm10: same as bm6, using a counter instead of for-each
bm11: same as bm7, using a counter instead of for-each
bm12: same as bm8, using a counter instead of for-each
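For readers who don't have Joo's original code at hand, here is a
rough sketch of what a counter-driven pair of these benchmarks
(bm10/bm12-style) might look like. The definition of mu below is my
guess at Joo ChurlSoo's lambda-based encoding of multiple values; see
his original post for the real definition. The loop bodies just
produce and consume three symbols, as in bm5-bm12:

```scheme
;; Assumed definition of mu: (mu x ...) packages its values as a
;; procedure that applies a consumer to them. This is a guess at
;; Joo ChurlSoo's definition, not a quotation of it.
(define-syntax mu
  (syntax-rules ()
    ((mu x ...) (lambda (consumer) (consumer x ...)))))

;; bm10-style: n uses of values, driven by a counter
;; instead of for-each over a long list.
(define (bm-values n)
  (do ((i 0 (+ i 1)))
      ((= i n))
    (call-with-values
      (lambda () (values 'one 'two 'three))
      (lambda (a b c) c))))

;; bm12-style: the same loop using mu.
(define (bm-mu n)
  (do ((i 0 (+ i 1)))
      ((= i n))
    ((mu 'one 'two 'three)
     (lambda (a b c) c))))

;; Chez's time macro reports elapsed time for each loop.
(time (bm-values 10000000))
(time (bm-mu 10000000))
```

The counter variants matter because they take the list allocation and
for-each traversal out of the measurement, leaving mostly the cost of
values or mu itself.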
On this machine, I see little if any evidence that mu is
faster than values in Chez Scheme.
One could argue that mu is faster than values in Larceny
and in MzScheme, but Larceny v0.90 does not even attempt
to implement values efficiently, and the same appears to
be true of MzScheme. At any rate, it is clear that any
differences in the speed of mu and values are dwarfed by
differences in the speed of different implementations.
By the way, the effect of the minor changes between bm1
and bm5, and between bm5 and bm9, should warn us against
trying to draw firm conclusions from these micro-benchmarks.
With Joo's original benchmarks, most of the execution time
was spent in creating a long list and in traversing that
list via the for-each procedure, suffering many cache
misses along the way. Only a small portion of the run
time was spent in mu or values.
Will