Re: loss of abstraction

Re: loss of abstraction Alan Watson 24 Aug 2005 20:03 UTC
Marcin 'Qrczak' Kowalczyk wrote:
> Attaching information to numbers behind the scenes puts an
> unreasonable burden on the representation of runtime values.
> In particular small integers could not be represented by unboxed
> values.

(a) For my Scheme, I looked at the errors I wanted to signal and decided
that I only needed to deal with pairs, symbols, and nulls[*]. I would be
interested to hear why you might want to associate source location with
other types of object.

(b) You can distinguish small integers by having interned small integers
be unboxed[**] and uninterned small integers be boxed.

Most Schemes already have some boxed numbers (e.g., flonums, bignums,
ratnums, or complex numbers), so this would not introduce significant
additional complexity. Indeed, in many implementations you could
represent uninterned small integers as bignums with almost no additional
complexity.
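
To make (b) concrete, here is a minimal sketch in C of one way the two
representations could coexist. The layout is an assumption for
illustration only: the low bit of a word marks an unboxed fixnum, and
the names (value, boxed_int, make_interned_int, make_uninterned_int)
are hypothetical. The sketch uses a dedicated box rather than a bignum
to keep it short.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout: low bit 1 = unboxed fixnum, low bit 0 = pointer to a box. */
typedef uintptr_t value;

/* Hypothetical box for an uninterned small integer read from source. */
typedef struct {
    long n;            /* the integer itself */
    const char *file;  /* source location attached by the reader */
    int line;
} boxed_int;

/* Interned small integer: stored directly in the word, no allocation. */
value make_interned_int(long n) {
    return ((uintptr_t)n << 1) | 1u;
}

/* Uninterned small integer: heap box that also carries a source location. */
value make_uninterned_int(long n, const char *file, int line) {
    boxed_int *b = malloc(sizeof *b);
    assert(b != NULL);
    b->n = n;
    b->file = file;
    b->line = line;
    return (value)b;  /* malloc alignment leaves the low bit clear */
}

int is_unboxed(value v) { return (v & 1u) != 0; }

/* Numeric value, whichever representation is in use. */
long int_value(value v) {
    if (is_unboxed(v))
        return (long)(((intptr_t)v - 1) / 2);  /* undo (n << 1) | 1 */
    return ((boxed_int *)v)->n;
}

/* Source line, or -1 when the integer is interned and carries none. */
int source_line(value v) {
    return is_unboxed(v) ? -1 : ((boxed_int *)v)->line;
}
```

Arithmetic only ever needs int_value, so the results of computation
can stay unboxed; only integers produced by the reader pay for a box.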

Sure, some implementations have only unboxed small integers because they
want to have a small runtime, but tracking source location is unlikely
to be a priority for them for exactly the same reason.

Regards,

Alan

[*] Distinguishing nulls is easy: In a Scheme that divides a pointer
into tag bits and value bits, you simply use different value bits for
the different nulls. In a Scheme that does not use tag bits, you can use
one specific pointer value for the interned null and box the others. Or
you can simply box all of the nulls and accept the consequences for
performance. (These need not be so terrible because you do not have to
allocate anything when you use, or more precisely reuse, the interned null.)
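
The tag-bit variant can be sketched in C as follows. The tag width,
the tag value chosen for nulls, and the idea that the value bits hold
an index into a side table of source locations are all assumptions
made for illustration; the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t value;

/* Hypothetical tag assignment: 3 tag bits, tag 5 means "null". */
enum { TAG_BITS = 3, TAG_MASK = 7, TAG_NULL = 5 };

/* The interned null: null tag with all value bits zero. */
#define INTERNED_NULL ((value)TAG_NULL)

/* An uninterned null: same tag, value bits hold a nonzero location id
   (assumed here to index a side table of source locations). */
value make_located_null(uintptr_t location_id) {
    assert(location_id != 0);
    return (location_id << TAG_BITS) | TAG_NULL;
}

/* Every null satisfies this test, whatever its value bits are. */
int is_null(value v) { return (v & TAG_MASK) == TAG_NULL; }

/* 0 for the interned null, otherwise the location id. */
uintptr_t null_location(value v) { return v >> TAG_BITS; }
```

Neither kind of null allocates anything here; the cost is that null?
becomes a mask-and-compare on the tag rather than a comparison against
a single constant.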

[**] Boxed small integers are not necessarily a serious performance
problem if you use a cache of very small integers. The trick is to avoid
heap allocations. You achieve this to some degree by initializing a
vector with a range of very small integers that *are* allocated on the
heap. After this, if you need to generate a small integer that is within
this range, you return the corresponding object from the vector rather
than allocating a new object on the heap. This technique was used in the
MacLisp compiler in the 1970s.
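
A minimal sketch of such a cache in C, assuming a hypothetical cached
range of -16 to 255 and hypothetical names (cached_int, init_int_cache,
make_int); a real implementation would choose its own range and would
allocate from its garbage-collected heap rather than with malloc.

```c
#include <assert.h>
#include <stdlib.h>

/* A boxed small integer (header fields omitted for brevity). */
typedef struct { long n; } cached_int;

/* Hypothetical range of preallocated integers. */
enum { CACHE_MIN = -16, CACHE_MAX = 255 };

static cached_int *int_cache[CACHE_MAX - CACHE_MIN + 1];

static cached_int *alloc_int(long n) {
    cached_int *b = malloc(sizeof *b);
    assert(b != NULL);
    b->n = n;
    return b;
}

/* Call once at startup: allocate every integer in the range up front. */
void init_int_cache(void) {
    for (long n = CACHE_MIN; n <= CACHE_MAX; n++)
        int_cache[n - CACHE_MIN] = alloc_int(n);
}

/* Return the shared object for integers in range; allocate otherwise. */
cached_int *make_int(long n) {
    if (n >= CACHE_MIN && n <= CACHE_MAX)
        return int_cache[n - CACHE_MIN];
    return alloc_int(n);
}
```

After init_int_cache() has run, make_int(7) returns the same object on
every call, so loop counters and small indices generate no garbage;
only integers outside the range cost an allocation.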

--
Dr Alan Watson
Centro de Radioastronomía y Astrofísica
Universidad Nacional Autónoma de México