Re: New draft (#2) and last call for comments on SRFI 229: Tagged Procedures Linas Vepstas (08 Nov 2021 19:59 UTC)

On Mon, Nov 8, 2021 at 12:53 PM John Cowan <xxxxxx@ccil.org> wrote:
>
> Tagged pairs would be Very Bad Indeed, requiring a 33% increase in space (from "type tag, car, cdr" to "type-tag, car, cdr, user tag") for one of the most commonly used Scheme data types.

You would be so lucky. In real-world applications, you need indexes,
and so the actual structure is ("type-tag, car, cdr, user tag, index
of all references"). I've measured this for several real-world apps; it
works out to about 1 KByte per pair for genomic/proteomic datasets,
and about 1.5 KBytes per pair for natural language datasets. This is
in contrast to (for example) 32-bit types and 64-bit car and cdr, for a
total of 4+8+8 = 20 bytes.  So the actual real-world increase in space
is almost 100x, not 33%.  Still, it allows about a million pairs in
1 GByte of RAM, so it is tolerable.  Who needs more than a million
pairs? :-)
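
For anyone who wants to check the arithmetic, here is a rough sketch in
Python. It assumes binary units (1 KByte = 1024 bytes, 1 GByte = 1024**3
bytes); the post does not say which units were meant. The helper names
are made up for illustration. The per-pair sizes are the approximate
measurements quoted above, not new data.

```python
# Back-of-the-envelope check of the per-pair sizes quoted above.
# Assumption (not stated in the post): binary units.
KBYTE = 1024
GBYTE = 1024 ** 3

# Bare pair: 32-bit type tag + 64-bit car + 64-bit cdr.
BARE_PAIR_BYTES = 4 + 8 + 8

# Approximate measured sizes per pair, including the index of references.
GENOMIC_PAIR_BYTES = 1 * KBYTE
LANGUAGE_PAIR_BYTES = KBYTE + KBYTE // 2  # 1.5 KBytes


def overhead_factor(indexed_bytes, bare_bytes=BARE_PAIR_BYTES):
    """How many times larger an indexed pair is than a bare pair."""
    return indexed_bytes / bare_bytes


def pairs_per_gbyte(bytes_per_pair):
    """Number of whole pairs that fit in 1 GByte of RAM."""
    return GBYTE // bytes_per_pair


if __name__ == "__main__":
    for name, size in [("genomic", GENOMIC_PAIR_BYTES),
                       ("language", LANGUAGE_PAIR_BYTES)]:
        print(f"{name}: {overhead_factor(size):.1f}x overhead, "
              f"{pairs_per_gbyte(size)} pairs per GByte")
```

Under these assumptions the overhead comes out at roughly 51x and 77x,
the same order of magnitude as the "almost 100x" above, and 1 GByte
holds about a million pairs at 1 KByte each (about 700 thousand at
1.5 KBytes each).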

--linas