Re: SRFI 14 (character sets) needs replacement in R7RS-large Per Bothner 19 Sep 2019 23:56 UTC

On 9/19/19 4:19 PM, John Cowan wrote:
> The Chibi implementation uses a tree of bitvectors whose lengths are between 128 and 512 bits each (16 to 64 bytes), so it will be as efficient (modulo a small constant factor) in space and time as a purpose-built ASCII-only implementation.

You might also want to take a look at the Kawa implementation of SRFI 14.
The implementation was written by Jamison Hope and uses an interesting
data structure: inversion lists.  An inversion list is very compact and extremely
cache-friendly (a membership test is a binary search in a linear integer array).
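
For readers who have not met inversion lists, here is a minimal sketch of the
idea in Python. It is not the Kawa code and does not reflect its API; the names
make_inversion_list and contains are made up for illustration. The list stores
only the sorted boundaries where membership flips, so a code point is in the
set exactly when an odd number of boundaries are at or below it.

```python
from bisect import bisect_right

def make_inversion_list(ranges):
    """Build an inversion list from (start, end_inclusive) code point ranges.

    Hypothetical helper for illustration; the result is a flat sorted list
    [s0, e0, s1, e1, ...] where each range covers s <= cp < e.
    """
    bounds = []
    for start, end in sorted(ranges):
        if bounds and start <= bounds[-1]:
            # Overlaps or touches the previous range: extend that range.
            bounds[-1] = max(bounds[-1], end + 1)
        else:
            bounds.extend((start, end + 1))
    return bounds

def contains(inv, code_point):
    """Membership test: binary search, then check the parity of the index."""
    # An odd count of boundaries <= code_point means we are inside a range.
    return bisect_right(inv, code_point) % 2 == 1

# ASCII letters and digits need only six integers.
alnum = make_inversion_list([(ord("0"), ord("9")),
                             (ord("A"), ord("Z")),
                             (ord("a"), ord("z"))])
print(alnum)
print(contains(alnum, ord("q")), contains(alnum, ord("-")))
```

The storage cost depends on the number of contiguous runs rather than the
number of characters, which is why the representation stays small even for
large Unicode categories.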

The code is gnu/kawa/slib/srfi14.scm in the Kawa sources
(https://gitlab.com/kashell/Kawa), while the code to generate
the Unicode tables (at build-time) is gnu/kawa/util/generate-charsets.scm.

The code is highly non-portable (because it uses Kawa classes and other features),
but it should be straightforward to convert it into something more portable.
--
	--Per Bothner
xxxxxx@bothner.com   http://per.bothner.com/