Storage Efficiency of Vectors
Alan Watson
(16 Apr 2003 16:56 UTC)
|
Re: Storage Efficiency of Vectors
Per Bothner
(16 Apr 2003 18:40 UTC)
|
Re: Storage Efficiency of Vectors bear (16 Apr 2003 22:58 UTC)
|
Re: Storage Efficiency of Vectors
Per Bothner
(16 Apr 2003 23:08 UTC)
|
Re: Storage Efficiency of Vectors
Taylor Campbell
(17 Apr 2003 02:04 UTC)
|
Re: Storage Efficiency of Vectors bear 16 Apr 2003 22:58 UTC
On Wed, 16 Apr 2003, Per Bothner wrote: >The tradeoff here depends a lot on your memory allocation and gc >overheads. If you have a memory allocator that is tuned for handling >allocating and reclaiming cons cells really fast, then I suspect the >best strategy is to use a temporary list. On the other hand if >object allocation or collection are relatively slow or don't >special-case cons cells, then re-allocating will probably be faster. I think there's a far more important tradeoff that is being ignored here. My problem with association lists is not storage efficiency, it's access time. If you are considering a vector solution, you have at least the opportunity to make hash tables, which actually solves the main problem of alists, instead of just reimplementing an inefficient algorithm using vectors instead of lists. I have not understood yet why you're not doing this. Linear access time is a very serious limitation on how big a system it is practical to make with the association mechanism; storage space is less so. I don't want to waste fifty thousand CPU cycles accessing an association in a natural language application, for example, just because the word whose parse-procedure I'm looking up happens to have been the ten thousandth word discovered. If I wanted to waste that much CPU, I'd be using alists to store my procedures. I can stand doubling the memory size of the overhead for the sake of cutting CPU time for an access from linear to constant. That is a worthwhile tradeoff. I can't believe I'm hearing you argue about a savings of 30 percent in storage space when the big bleeding problem of alist access time isn't addressed. Since association vectors, association lists, and association hashtables will use overhead space on the same order (from 20-200% of each other's size), why are we even talking about doing something that doesn't offer the substantial advantage in access time that's available? What was the motivation, if we're not after that advantage, in using vectors in the first place? Bear