Re: alternate implementation strategy Wolfgang Corcoran-Mathe 12 Dec 2020 19:22 UTC

On 2020-12-12 22:31 +0900, Alex Shinn wrote:
> (chibi iset) and the Chicken iset egg use a fairly
> simple but effective approach to compact isets,
> which is a binary tree whose leaves can be either
> bitsets or ranges.

Perhaps we could add this as another sample implementation, if it
could be extended to provide the rest of the SRFI 217 forms.

The original idea *was* to use (chibi iset) as the basis for the
sample implementation.  It's a tiny implementation and I admire its
simplicity.  However, as far as I could tell, the trees are not
self-balancing, resulting in dramatic slowdowns when sets are
constructed by sequential insert with large increments between
elements.  The provided optimization routines seem to improve the
situation greatly, but I was not familiar enough with the theory of
their use to begin pasting them into the other iset routines.  It's
also not immediately clear what the time complexities of the
operations of (chibi iset) are.  (The current sample implementation
could do with some more explicit statements here.)

> As an example, char-set:letter (using Unicode 13.0)
> is comprised of 132875 characters.  In (chibi iset)
> this takes up 3120 bytes of memory, whereas the
> current SRFI 217 sample impl requires 5314960 bytes.

Yes, the space efficiency of the current sample implementation is
a work in progress.  It does not yet provide bitmap compression for
dense sets.  On the plus side, dense and sparse sets use about the
same amount of space, and operations on all sets have the same
asymptotic time and space consumption.

--
Wolfgang Corcoran-Mathe  <xxxxxx@sigwinch.xyz>

"Therefore, 100 victories in 100 battles is not the most skillful.
Subduing the other's military without battle is the most skillful."
--_Sun Tzu_