Re: performance Taylor R Campbell 18 Sep 2009 03:15 UTC

   Date: Thu, 17 Sep 2009 20:53:55 -0400
   From: David Van Horn <xxxxxx@cs.brandeis.edu>

   Taylor R Campbell wrote:
   > What space usage are random-access lists guaranteed to exhibit?
   > Growth order and constant factors are both interesting -- growth order
   > to specify in the SRFI, and constant factors to satisfy my curiosity.

   Representing a random-access list of n elements takes O(n) space, just
   as with sequential lists.

I ought to have been more precise: while of course it takes O(n) space
to store a sequence of n elements, I was interested more in the amount
of extra space.  E.g., in most systems, a vector of length n will use
a constant amount of extra space (a header with its length at the
beginning), while a list of length n will use O(n) extra space (one
cdr for each element, beyond the car that holds the element itself,
and sometimes another word for a header or similar).

   A random-access list is a forest of complete binary trees.  The forest
   contains at most log n trees, and the space of each tree is proportional
   to the number of elements it contains.  If you represent binary trees
   using pairs [*], then it takes m-1 pairs to represent a complete binary
   tree with m elements.  The forest can be represented as a list, and thus
   takes a pair per tree, which is at most log n.  So the overall space
   consumption need not be more than n + log n.  My implementation stores
   the size of each tree in the forest making it n + 2 * log n.

I'm a little puzzled by the formulae you've computed here.  If my
cursory examination of the reference implementation is right, the
trees have data associated with each internal node, not with the
leaves.  So you need one location for each element, one location for
each left branch, and one location for each right branch, for a total
of two extra locations per element, plus whatever constant overhead
each tree has: + 1 for the size, + 1 for the node, + 1 for the link to
the next tree?  This sounds like a total space usage of 3 n + 3 log n,
neglecting the overhead of records, which means an extra space usage
of 2 n + 3 log n = O(n).  I suppose I ought to have expected that,
though.
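
To make the arithmetic above concrete, here is a rough sketch in
Python (not the reference implementation's Scheme) that tallies
locations under the accounting I guessed at: three locations per
element (value, left, right) and three per tree (size, node, link).
The names tree_sizes and locations are made up for illustration, and
the per-node and per-tree costs are my assumptions, not measurements
of any real implementation.

```python
def tree_sizes(n):
    """Sizes of the trees in the forest after consing n elements
    onto an empty random-access list, smallest tree first."""
    sizes = []
    for _ in range(n):
        if len(sizes) >= 2 and sizes[0] == sizes[1]:
            # Two leading trees of equal size: the new element becomes
            # the root of a tree that joins them.
            sizes[:2] = [1 + sizes[0] + sizes[1]]
        else:
            # Otherwise the new element is a singleton tree.
            sizes.insert(0, 1)
    return sizes


def locations(n):
    """Total locations under the assumed accounting:
    3 per element (value, left, right) + 3 per tree (size, node, link)."""
    sizes = tree_sizes(n)
    return 3 * sum(sizes) + 3 * len(sizes)


if __name__ == "__main__":
    for n in (1, 7, 10, 1000):
        trees = len(tree_sizes(n))
        total = locations(n)
        print(n, trees, total, total - n)  # n, trees, total, extra
```

The last column is the extra space, 2 n + 3 t for a forest of t
trees; with t bounded by roughly log n this is the O(n) extra space
estimated above.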