Dear authors of SRFI 77 and readers
of the discussion list,
> SRFI 77 issue #5:
> The fixnum operators wrap on overflow,
i.e., they perform modular
> arithmetic. For many purposes,
it would be better for them to
> signal an error in safe mode. Since
the modulus is implementation-
> dependent, portable code cannot
generally take advantage of the
> wrapping. Instead, applications
are likely to use fixnums simply
> to avoid the overhead of generic
arithmetic under the assumption
> that all quantities computed fit
in the fixnum range, and feedback
> if this turns out not to be the
case will be valuable. On the other
> hand, the wrapping semantics can
also be useful, i.e., in the coding
> of a practical implementation of
bignums in terms of fixnums. It
> may be that we should consider
including two sets of operators:
> fx operators that signal an error
upon overflow and wfx operators
> that wrap.
Unfortunately, there are more useful
arithmetic models for fixnums:
a) modular (aka wrapping) arithmetics,
b) subrange arithmetics, raising an
exception on overflow,
c) exact arithmetics returning several
fixnums (sum+carry), and
d) saturated arithmetics (clamping values
to range boundaries).
Clearly, a), b) and d) can easily be
emulated from c). C) can be
emulated inefficiently from b). Using
a) for emulating either b)
or c) is horribly inefficient because
the argument range must
essentially be sqrt'ed, losing about
an order of magnitude in speed
and a factor of two in storage consumption.
D) is also bad for
emulating the other models.
Model d) is used
in signal processing (RGB components,
audio samples)---often as
SIMD operations, e.g. Intel MMX has
8-way 8-bit signed or unsigned
saturated arithmetics.
Proposal: Provide operations
for b) and c). Do not include a) or d).
Rationale for dropping a): It is hardly
ever useful because the
precision is not a 'porting-time constant',
as you acknowledge
yourself (cited above). Wrapping semantics
is *not* useful for
implementing bignums---it is too inefficient.
Rationale for dropping d): Too specialized
application.
Saturated arithmetic is absolutely essential
in fixpoint
signal processing applications, e.g.
processing RGB components
or PCM audio samples. Usually the operations
are used with SIMD,
e.g. Intel's MMX has an 8-way 8-bit
signed or unsigned saturated
addition---and lots of other modes.
Rational for including b): Even if the
full tower is present
in a given system, as an implementor
you might want to use
FIXNUMs as a basis for implementing
BIGNUMs. Now for the workhorse
arithmetics the model of choice is c),
but for the rest of the
program (loops, counters, indices into
vectors etc.) you might
want to have b). Moreover, FIXNUMs are
often useful as efficient
alternatives for exact integer arithmetics,
provided they are
used with care (proof obligations are
taken seriously).
Concerning the set of operations for
b), the rationale suggests
not to be too 'minimal'---which is reflected
in the current
design in SRFI 77: full set of comparisons,
min/max, etc.
As matter of personal preference, a
few minor things I would modify:
Proposal:
* Add the following operations:
fxsign fxnot fxcompare
* Allow multiple arguments in the following
operations:
fx+ fx- fx* fxmin fxmax
* Extend the following predicates to
test chains:
fx= fx> fx< fx>=
fx<=
* Rename the following identifiers:
least-fixnum -> fx-min-value
greatest-fixnum ->
fx-max-value
fxbitwise-<op> ->
fx<op> for <op> in {and, ior, xor}
fxarithmetic-shift ->
fxshift
fxdiv+mod -> fxmod+div
[In fact,
'div+mod' -> 'mod+div' everywhere in the SRFI
in order
to have the least significant values come first,
see below.]
* Drop the following operations:
fxzero? fxpositive? fxnegative?
fxodd? fxeven?
fxquotient fxmodulo fxremainder
fxmodulo+remainder
fxgcd fxlcm
Furthermore I would not distribute the
fixnum ops over the
language and a library; I would rather
include them all in
the language or leave them out altogether.
Now for c), i.e. arithmetics with carry.
> SRFI 77 issue #4:
> The fixnum operations provide efficient
fixnums that "wrap."
> However, they do not give efficient
access to the hardware
> facilities for carry and overflow.
This would be desirable
> to implement efficient generic
arithmetic on fixnums portably.
> On the other hand, there isn't
much experience with formulating
> a portable interface to these facilities.
Yes there is. Consult for example pages
3 and 4 of
Henri Cohen: A course in computational
algebraic number theory.
Springer, 1993.
Cohen's solution is the following: Add
operations for +, -, *,
div+mod, shift, and bfffo (= x ->
ceil(log2(M/(2 x)))) where
arguments are in {0..M-1}, where M is
a word size modulus,
for example 2^32 or 2^31. The arithmetic
operations return the
(mod M)-part of the result and store
the (div M)-part of the
result in two global variables called
'overflow' (in {0,1}, for
+ and -) and 'remainder' (in {0..M-1}.
for *, div+mod, and shift).
This solution is a favorable compromise
between efficiency and
avoiding low level programming. My main
concern is the use of
global variables for the carry. Another
issue is the use of
an unsigned range, whereas the model
b) above (i.e. subrange
arithmetics) is certainly most useful
with a range including
negative integers (and there should
at least be -1).
Concerning the 'global variables' issue,
I propose to
replace it by procedures returning two
values, i.e.
sum+carry, difference+borrow, and lsb+msb
(for multiplication
and shift). The clumsiness of dealing
with multiple values
is largely gone with wider acceptance
of LET-VALUES or
SRFI-71's LET, i.e. (LET ((sum carry
(+carry x y c))) ...)).
The unsigned vs. signed issue is a little
trickier. The primary
purpose for arithmetics with carry is
for implementing (fixed
or arbitrary) multiprecision arithmetics,
e.g. for implementing
bignums, implementing crypto primitives
(say >=1024 bit mod N),
implementing pseudo-random number generators,
or emulating floats.
In all these cases, unsigned arithmetics
with carry is more useful
than signed arithmetics with carry,
as far as the workhorse ops
are concerned. This has two reasons:
a) after cascading signed
operations I always have to untie a
knot in my head, b) more
seriously the 2-complement interpretation
of a word depends on
the wordsize, i.e. #b1000 = -1 as a
4-bit word but = 4 as a 5-bit
word. In the unsigned interpretation
the wordsize does not matter.
Sadly, using signed arithmetics for
emulating unsigned operations,
i.e. only using the non-negative values
of the signed range, is
less efficient itself because the ordinary
signed carry word is
not adjacent in valuation to the sum
word. (Try adding 3-bit signed
integers, i.e. in {-4..3}, cascading
2-bit integers in {0..3}.)
There are several ways to deal with
the signed/unsigned issue:
* Define FIXNUM to be signed, only provide
signed arithmetic.
* Define FIXNUM to be unsigned, only
provide unsigned arithmetic.
* Define two FIXNUM types for signed
and unsigned.
* Define FIXNUM values as having two
interpretations: signed/unsigned.
* Define FIXNUM signed, add another
type for vectors of unsigned values.
* ...and lots more
In the following proposal a different
approach is taken: We define
FIXNUM to be a signed 2-complement range
and add operations that
use only the non-negative subrange---but
these operations return the
proper carry values for the *subrange*.
Limiting to 2-complement
representation and power-of-2 sized
range is necessary to make
the ranges fit together properly. This
is hardly a restriction.
Moreover, wasting the sign
bit is tolerable in terms of speed
and space---and FIXNUM are probably
not native hardware integers
anyhow due to one or more tag bits.
In practice, the following
proposal represents a reasonable
compromise between efficiency and avoiding
low-level programming.
It can be used for implementing multiprecision
arithmetics and
implementing efficient bitfield operations
in Scheme itself.
Proposal:
1. Define the FIXNUMs to represent w-bit
signed 2-complement
integers, i.e. the range is {-2^(w-1)..2^(w-1)-1}
for some
global integer constant w (read 'width'),
w >= 2, which is
also available as a symbolic constant
(e.g. fx-width).
2. Add operations with carry using only
the non-negative subrange,
i.e. {0..2^(w-1)-1} of FIXNUM. The behavior
of these operations
for negative or non-FIXNUM argument
values is left unspecified in
order to allow for 'nearly 1 instruction'
implementations. The operations
are not for the casual user, but for
the implementor of libraries---so they
fall into the 'all bets are off' category
(unlike the model b) operations).
The unsigned operations (define M =
2^(w-1)) are:
(fx-carry+ x y c)
the two values (mod+div
(+ x y c) M)
for fixnums x, y in {0..M-1},
c in {0,1}.
Note: Let s0 s1 denote the results of
(fx-carry+ x y c).
Then s0 + s1 M
= x + y + c, s0 in {0..M-1}, s1 in {0,1}.
(fx-carry- x y c)
the two values (mod+div
(- x y c) M)
for fixnums x, y in {0..M-1},
c in {0,1}.
Note: Let s0 s1 denote the results of
(fx-carry- x y c).
Then s0 - s1 M
= x - y - c, s0 in {0..M-1}, s1 in {0,1}.
(fx-carry* x y c)
compute (mod+div (+ (*
x y) c) M)
for fixnums x, y, c in
{0..M-1}.
Note: Let p0 p1 denote the results of
(fx-carry* x y c).
Then p0 + p1 2^(w-1)
= x y + c, p0, p1 in {0..M-1}.
(fx-mod+div x y)
the two values (mod+div
x y)
for fixnums x in {0..M-1},
y in {1..M-1}.
Note: Let r q denote the result of (fx-mod+div
x y).
Then r + q y =
x, r in {0..y-1}, q in {0..M-1}.
(fx-shift-left x0 x1 k)
the two values (mod+div
(* (+ x0 (* x1 M)) (expt 2 k)) M)
for fixnums x in {0..M-1}
and k in {0..w-2}.
(fx-shift-right x0 x1 k)
the two values (mod+div
(div (+ x0 (* x1 M)) (expt 2 k)) M)
for fixnums x0, x1 in
{0..M-1} and k in {0..w-2}.
(fx-lsb-index x)
k in {0..w-1} such that
the k-th bit is the least significant
bit set in the fixnum
x in {1..M-1}, or k = w-1 if x = 0.
(fx-msb-index x)
k in {-1..w-2} such that
the k-th bit is the most significant
bit set in the fixnum
x in {0..M-1}.
(fx-weight x)
number of bits set for
fixnum x in {0..M-1}.
Rationale: carry+/-/*/mod+div are the
basic arithmetic operations
on unsigned (w-1)-bit words. The shift
operations work on pairs of
words in order to be able to shift long
vectors efficiently. The
index finding operations and the weight
operation are critical to
implementing vectors of bits. Boolean
ops and comparison
are already included as part of the
model b).
Combined, the proposals in this email
allow for much more efficient
implementations of important data structures
(bignums, vectors of
bits, float emulations, many more...),
while hardly being more
complicated (remove 11 ops, add 12,
introduce some variable arity).
Sebastian.
----
Dr. Sebastian Egner
Senior Scientist
Philips Research Laboratories
Prof. Holstlaan 4 (WDC 1-051, 1st floor, room 51)
5656 AA Eindhoven
The Netherlands
tel: +31 40 27-43166
fax: +31 40 27-44004
email: xxxxxx@philips.com