Re: AW: Too much of a good thing? Per Bothner 11 Apr 2003 15:18 UTC

Michael Burschik wrote:
> Correct me if I'm wrong, but R5RS does not provide resizable vectors, and
> functions generating vector element by element have no way of knowing the
> length of the resulting vector beforehand.

First, the SRFI process is about extending R5RS, so I don't see
that as a restriction.

> If you want to write portable
> code, this means you can either create n vectors of length 1 .. n and
> discard all but the last, or create a list and then convert it to a vector
> before returning it, which is, in fact, what the reference implementation
> does. Are you sure that the first alternative is more efficient?

You have to use some data structure or programming convention that
stores the "active length" - i.e. Common Lisp's "fill pointer".  If
you're stuck with vanilla Scheme, then this would be a separate variable
or a convention to use the the first variable of the vector.  (The
former is awkward and the latter is worse, whihc is why an extension
is needed.)

But assuming you have a separate fill pointer, then yes: It really is
more efficient to allocate and re-allocate multiple vectors - as long
as you each time double the allocated size.  E.g. start with a small
vector of say 8 slots, then when that gets full use one with 16 lots,
then one with 32 slots etc.  This uses less memory than cons cells
(especially if you an object or malloc header), provides linear
performance, and much better data locality.

  > Only if you know how large your data structures are going to be. If you
> constantly have to resize your vectors, or create new vectors of an
> appropriate size, your performance might not be so stellar.

vectors are still faster in almost all applications - as long as
you have a fill pointer.

> But iterating over all elements of a sequence is an example of exploiting
> constant-time access to the elements of the sequence. Otherwise, one would
> recurse over the sequence.

I don' understand what you're saying here.

Using recusion as the preferred mechanism to sequence through the
elements of a sequence is I believe a mistake.  Almost all programmers
find it easier to work with iteration than with recursion, and with
"sequence comprehensions" iteration can be just as functional and
"clean" as recursion.

There is a lot to be said for array languages, such as APL.
I'm working on implementing XQuery, which is well worth taking a
look at.  (It's sequence model is different than vectors, since
XQuery sequences don't nest directly, but they can be nested
indirecty using "elements".)
	--Per Bothner