Re: SRFI 14 (character sets) needs replacement in R7RS-large Per Bothner 19 Sep 2019 23:56 UTC
On 9/19/19 4:19 PM, John Cowan wrote:
> The Chibi implementation uses a tree of bitvectors whose lengths are between 128 and 512 bits each (16 to 64 bytes), so it will be as efficient (modulo a small constant factor) in space and time as a purpose-built ASCII-only implementation.
You might also want to take a look at the Kawa implementation of srfi-14.
The implementation was written by Jamison Hope and uses an interesting
data structure: inversion lists. These are very compact and extremely cache-friendly
(a lookup is a binary search in a linear integer array).
The Kawa code is in gnu/kawa/slib/srfi14.scm in the Kawa sources
(https://gitlab.com/kashell/Kawa), while the code to generate
the Unicode tables (at build time) is in gnu/kawa/util/generate-charsets.scm.
The code is highly non-portable (because it uses Kawa classes and other features),
but it should be straightforward to convert it into something more portable.
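
For anyone unfamiliar with inversion lists, here is a rough sketch of the idea in Python. It is not the Kawa code and does not follow its API; the names make_inversion_list, contains and complement are made up for illustration. The set is stored as a sorted array of boundaries [start0, end0, start1, end1, ...] with each range half-open, so a code point is a member exactly when an odd number of boundaries are less than or equal to it.

```python
from bisect import bisect_right

UNICODE_LIMIT = 0x110000  # one past the highest Unicode code point


def make_inversion_list(ranges):
    """Build an inversion list from inclusive (lo, hi) code point ranges.

    Hypothetical helper for illustration; overlapping or adjacent
    ranges are merged so the boundary array stays strictly increasing.
    """
    boundaries = []
    for lo, hi in sorted(ranges):
        if boundaries and lo <= boundaries[-1]:
            # Overlaps or touches the previous range: extend its end.
            boundaries[-1] = max(boundaries[-1], hi + 1)
        else:
            boundaries.extend((lo, hi + 1))
    return boundaries


def contains(inv, cp):
    """Membership test: one binary search over a flat integer array."""
    return bisect_right(inv, cp) % 2 == 1


def complement(inv, limit=UNICODE_LIMIT):
    """Complement by toggling the boundaries at 0 and at limit."""
    out = list(inv)
    if out and out[0] == 0:
        out.pop(0)
    else:
        out.insert(0, 0)
    if out and out[-1] == limit:
        out.pop()
    else:
        out.append(limit)
    return out


# ASCII digits and upper-case letters need only four integers.
alnum = make_inversion_list([(ord("A"), ord("Z")), (ord("0"), ord("9"))])
print(alnum)
print(contains(alnum, ord("Q")), contains(alnum, ord(":")))
```

Because a contiguous range costs two integers regardless of its width, a set covering large blocks of Unicode stays small, and complement only touches the two ends of the array.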