Email list hosting service & mailing list manager

Storage Efficiency of Vectors Alan Watson (16 Apr 2003 16:56 UTC)
Re: Storage Efficiency of Vectors Per Bothner (16 Apr 2003 18:40 UTC)
Re: Storage Efficiency of Vectors bear (16 Apr 2003 22:58 UTC)
Re: Storage Efficiency of Vectors Per Bothner (16 Apr 2003 23:08 UTC)
Re: Storage Efficiency of Vectors Taylor Campbell (17 Apr 2003 02:04 UTC)

Re: Storage Efficiency of Vectors bear 16 Apr 2003 22:58 UTC


On Wed, 16 Apr 2003, Per Bothner wrote:

>The tradeoff here depends a lot on your memory allocation and gc
>overheads.  If you have a memory allocator that is tuned for handling
>allocating and reclaiming cons cells really fast, then I suspect the
>best strategy is to use a temporary list.  On the other hand if
>object allocation or collection are relatively slow or don't
>special-case cons cells, then re-allocating will probably be faster.

I think there's a far more important tradeoff that is being ignored
here.  My problem with association lists is not storage efficiency,
it's access time.

If you are considering a vector solution, you have at least the
opportunity to make hash tables, which actually solves the main
problem of alists, instead of just reimplementing an inefficient
algorithm using vectors instead of lists.

I have not understood yet why you're not doing this.  Linear access
time is a very serious limitation on how big a system it is practical
to make with the association mechanism; storage space is less so.  I
don't want to waste fifty thousand CPU cycles accessing an association
in a natural language application, for example, just because the word
whose parse-procedure I'm looking up happens to have been the ten
thousandth word discovered.  If I wanted to waste that much CPU, I'd
be using alists to store my procedures.

I can stand doubling the memory size of the overhead for the sake of
cutting CPU time for an access from linear to constant.  That is a
worthwhile tradeoff.  I can't believe I'm hearing you argue about a
savings of 30 percent in storage space when the big bleeding problem
of alist access time isn't addressed.

Since association vectors, association lists, and association
hashtables will use overhead space on the same order (from 20-200% of
each other's size), why are we even talking about doing something that
doesn't offer the substantial advantage in access time that's
available?  What was the motivation, if we're not after that
advantage, in using vectors in the first place?

				Bear