Thanks, I will check it out. If you feel inspired to make a portable implementation, that would be valuable.

On Thu, Sep 19, 2019 at 7:56 PM Per Bothner wrote:
On 9/19/19 4:19 PM, John Cowan wrote:
> The Chibi implementation uses a tree of bitvectors whose lengths are between 128 and 512 bits each (16 to 64 bytes), so it will be as efficient (modulo a small constant factor) in space and time as a purpose-built ASCII-only implementation.

You might also want to take a look at the Kawa implementation of srfi-14.
The implementation was written by Jamison Hope and uses an interesting
data structure: inversion lists.  These are very compact and extremely cache-friendly
(a membership test is a binary search in a linear integer array).
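
For readers unfamiliar with inversion lists, here is a minimal sketch of the general idea in Python. It is not the Kawa code, and the names make_inversion_list and contains are made up for illustration. It assumes the set is stored as a sorted array of boundaries [start0, end0, start1, end1, ...], where each range is half-open, so a code point is a member exactly when an odd number of boundaries are less than or equal to it.

```python
from bisect import bisect_right

def make_inversion_list(ranges):
    """Build a sorted boundary array from inclusive (lo, hi) code point ranges.

    Hypothetical helper: overlapping or adjacent ranges are merged so the
    result is strictly increasing.
    """
    boundaries = []
    for lo, hi in sorted(ranges):
        if boundaries and lo <= boundaries[-1]:
            # Range overlaps or touches the previous one: extend its end.
            boundaries[-1] = max(boundaries[-1], hi + 1)
        else:
            # New disjoint range, stored as [start, exclusive end).
            boundaries.extend([lo, hi + 1])
    return boundaries

def contains(boundaries, code_point):
    """Membership test: one binary search over the flat integer array."""
    # An odd count of boundaries <= code_point means we are inside a range.
    return bisect_right(boundaries, code_point) % 2 == 1

# ASCII letters and digits as three ranges, six integers in total.
alnum = make_inversion_list([(ord("a"), ord("z")),
                             (ord("0"), ord("9")),
                             (ord("A"), ord("Z"))])
```

With this layout, contains(alnum, ord("q")) is true and contains(alnum, ord(":")) is false. The storage cost depends on the number of ranges rather than the number of characters, which is why large contiguous Unicode blocks stay cheap.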

The code is gnu/kawa/slib/srfi14.scm in the Kawa sources,
while the code to generate
the Unicode tables (at build time) is gnu/kawa/util/generate-charsets.scm.

The code is highly non-portable (because it uses Kawa classes and other features),
but it should be straightforward to convert it into something more portable.
        --Per Bothner