"Branch Prediction and Interpreter Speed"
<http://swiss.csail.mit.edu/~jaffer/CNS/interpreter-branch>
Abstract
Modern CPUs can execute many instructions in the time it takes to
fetch one cache line. Thus, in a well-written Scheme (or other)
interpreter having immediate fixnums and boxed bignums and/or
flonums, generic operations on those numbers can run as fast as
type-restricted arithmetic operations.
To make generic arithmetic this fast, the branch predictions for
the type-dispatching code must default so that speculative fetches
are not initiated.
These principles are applied to the SCM Scheme interpreter compiled
by GCC, yielding 10% speed improvements on arithmetic and symbolic
operations running the JACAL symbolic mathematics program.
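
As a rough illustration of the kind of type dispatch the abstract
describes, here is a minimal C sketch. It is not SCM's actual code.
It assumes a hypothetical tagging scheme (low bit set for immediate
fixnums, clear for pointers to boxed numbers) and uses GCC's
__builtin_expect to mark the fixnum case as the expected one, so
that the boxed-number path becomes the unlikely branch. The names
value, boxed_flonum, LIKELY, generic_add and generic_add_slow are
invented for this example, and fixnum overflow (which a real
interpreter would promote to a bignum) is ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tagged value: low bit 1 = immediate fixnum,
   low bit 0 = pointer to a heap-allocated (boxed) number. */
typedef intptr_t value;

/* Hypothetical boxed representation; only flonums are modeled. */
typedef struct { double d; } boxed_flonum;

#define IS_FIXNUM(v)   (((v) & 1) != 0)
#define MAKE_FIXNUM(n) ((value)(((uintptr_t)(intptr_t)(n) << 1) | 1))
/* Assumes arithmetic right shift of negative values, as GCC does. */
#define FIXNUM_VAL(v)  ((intptr_t)(v) >> 1)

/* Tell GCC which way the branch is expected to go. */
#if defined(__GNUC__)
#define LIKELY(x) __builtin_expect(!!(x), 1)
#else
#define LIKELY(x) (x)
#endif

static value make_flonum(double d) {
    boxed_flonum *p = malloc(sizeof *p);
    if (p == NULL) abort();
    /* malloc alignment keeps the tag bit clear for pointers. */
    assert(((uintptr_t)p & 1) == 0);
    p->d = d;
    return (value)p;
}

static double to_double(value v) {
    return IS_FIXNUM(v) ? (double)FIXNUM_VAL(v)
                        : ((boxed_flonum *)v)->d;
}

/* Slow path, kept out of line: this is the only place where a
   boxed number is dereferenced. */
static value generic_add_slow(value a, value b) {
    return make_flonum(to_double(a) + to_double(b));
}

/* Generic addition: the fixnum/fixnum case is marked as expected
   and touches no memory. Overflow is ignored in this sketch. */
value generic_add(value a, value b) {
    if (LIKELY(IS_FIXNUM(a) && IS_FIXNUM(b)))
        return MAKE_FIXNUM(FIXNUM_VAL(a) + FIXNUM_VAL(b));
    return generic_add_slow(a, b);
}
```

Under these assumptions, generic_add(MAKE_FIXNUM(2), MAKE_FIXNUM(3))
returns the fixnum 5 without dereferencing anything, while adding a
boxed 1.5 to the fixnum 2 goes through generic_add_slow and yields a
boxed 3.5.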
...
Some of the conclusions reached:
For an interpreter, using branch prediction to prevent speculative
fetches can make generic arithmetic operations as fast as
type-restricted ones.
As a result, the type-specific duplicate arithmetic functions of
SRFI-77 are motivated only for Scheme compilers. It is incumbent
upon SRFI-77 to justify its approach over type declarations, an
alternative it doesn't address.