Storage Efficiency of Vectors | Alan Watson | 16 Apr 2003 15:55 UTC |

Re: Storage Efficiency of Vectors | Per Bothner | 16 Apr 2003 18:41 UTC |

Re: Storage Efficiency of Vectors | bear | 16 Apr 2003 22:58 UTC |

Re: Storage Efficiency of Vectors | Per Bothner | 16 Apr 2003 23:08 UTC |

Re: Storage Efficiency of Vectors | Taylor Campbell | 17 Apr 2003 02:03 UTC |

Storage Efficiency of Vectors

*Alan Watson*16 Apr 2003 15:55 UTCHello, Points have been made regarding the storage efficiency of vectors. I've got a couple of comments. First, I'm not sure that association vectors will save storage. Unless I'm very much mistaken, a typical implementation requires an overhead of two words for a pair and N+1 words for a vector of length N (with the additional word storing the length). Thus, an association list has four words of overhead per key, two in the spine and two in the rib, and an association vector also has (asymptotically) four words of overhead per key, one in the spine and three in the rib. If you're really bothered about the storage overhead, a "property vector", with alternating keys and values, or simply one vector for the keys and another for the values are probably better ideas. Both of these have (asymptotically) two words of overhead per key. I'm also not sure that "stretchy" vectors will help much in the common case that a vector of an initially unknown size is constructed and then its size is not changed. Let's consider constructing a vector of length M = 1.5 * 2^N. I've chosen this length as a median case appropriate to constructing a vector by creating a fresh vector of twice the previous size when the previous vector becomes full. If we construct a list first and then create an appropriate vector, we will allocate 3M words, 2M for the list and M for the vector. (If we make the mistake of using "(apply vector list)", we require 5M!) If we construct the vector by repeated doubling, we will allocate (1+1) + (2+1) + (4+1) + ... + (2^(N+1)+1) = 2^(N+2) + N + 1 = (8/3) M words (ignoring the logarithmic term). So, unless I'm making some mistake here, it would appear that in the median case stretchy vectors save us about 10%. For smaller than median cases, lists are slightly better and for larger than median cases, stretchy vectors are slightly better. However, even in the best case for stretchy vectors, they will allocate only 1/3 less storage than lists (2M versus 3M), and in the worst case they will allocate 1/3 more (4M versus 3M). So, to some degree which is better comes down to the distribution of vector lengths. However, we're talking about factors of less than two. Of course, stretchy vectors are an enormous convenience and will help if the vector is subsequently resized. Furthermore a clever allocator that can grow a vector in place will change my analysis. But in this common case and for a naive allocator, I would venture to paraphrase David Thornley and say that lists don't look any deader than usual to me. Regards, Alan Watson -- Dr Alan Watson Instituto de Astronom��a UNAM