Re: Semantics of char-set-cursor-next
Marc Feeley 10 Jul 2023 07:21 UTC
From my reading of SRFI-14 I think it is allowed for char-set-cursor-next to mutate the cursor. However, a mutating implementation makes some coding patterns, such as backtracking searches, harder to use, in particular because there is no cursor “copy” operation. So a functional implementation would be best if the cost is reasonable.
Have you thought of packing the cursor information in a single fixnum? Gambit has >= 30 bit fixnums, so you could use 21 of those bits as a Unicode character code “limit” and the remaining 9 bits as a count. The char-set-cursor-next procedure checks the count and simply decrements it if the count is >= 1. If it is 0 it checks the char-set to determine the next cursor. This avoids allocating a new cursor, and the O(log N) time to access the char-set is amortized by a factor of (up to) 512.
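A minimal sketch of that packing, under assumptions that are not from the SRFI or from Gambit's sources: the char set is taken to be a vector of disjoint, sorted, inclusive code-point ranges (lo . hi), the end of the set is marked by the fixnum -1, and all procedure names (cursor-from, cursor-next, and so on) are hypothetical. The current character is limit minus count, so a step inside a run is a single decrement and the binary search runs only when the count reaches 0.

```scheme
(import (scheme base))

;; Assumption: a char set is a vector of disjoint inclusive code-point
;; ranges (lo . hi), sorted in increasing order.
(define count-limit 512)   ; 2^9: the low 9 bits of a cursor hold the count
(define end-cursor -1)     ; hypothetical end-of-set marker

(define (make-cursor limit count) (+ (* limit count-limit) count))
(define (cursor-limit cursor) (quotient cursor count-limit))
(define (cursor-count cursor) (remainder cursor count-limit))

;; Cursor for the first member >= code, or end-cursor if there is none.
;; Binary search for the first range whose upper bound is >= code.
(define (cursor-from ranges code)
  (let loop ((lo 0) (hi (vector-length ranges)))
    (if (< lo hi)
        (let ((mid (quotient (+ lo hi) 2)))
          (if (< (cdr (vector-ref ranges mid)) code)
              (loop (+ mid 1) hi)
              (loop lo mid)))
        (if (= lo (vector-length ranges))
            end-cursor
            (let* ((r (vector-ref ranges lo))
                   (start (max code (car r)))
                   ;; cover at most 512 consecutive members per lookup
                   (limit (min (cdr r) (+ start (- count-limit 1)))))
              (make-cursor limit (- limit start)))))))

(define (cursor-start ranges) (cursor-from ranges 0))
(define (cursor-end? cursor) (< cursor 0))

;; The current character is `limit - count`.
(define (cursor-ref cursor)
  (integer->char (- (cursor-limit cursor) (cursor-count cursor))))

;; Purely functional: returns a new fixnum, never mutates anything.
(define (cursor-next ranges cursor)
  (if (>= (cursor-count cursor) 1)
      (- cursor 1)                        ; count is in the low bits
      (cursor-from ranges (+ (cursor-limit cursor) 1))))

;; Usage: walk the whole set.
(define (cursor->list ranges)
  (let loop ((c (cursor-start ranges)) (acc '()))
    (if (cursor-end? c)
        (reverse acc)
        (loop (cursor-next ranges c) (cons (cursor-ref c) acc)))))
```

Because a cursor here is an immutable fixnum, saving one for backtracking is just keeping the old value in a variable; no “copy” operation is needed.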
> On Jul 10, 2023, at 4:19 AM, Bradley Lucier wrote:
> The signature is
> char-set-cursor-next cset cursor -> cursor
> The SRFI text says:
> ========================
> A cursor index is incremented with char-set-cursor-next; in this way, code can step through every character in a char set.
> ========================
> I don't have much experience with cursors. I want to know whether char-set-cursor-next is allowed to modify the "cursor" argument it is passed.
> I ask because in my design the cursor is a vector of three elements, but with very fast increment (on average) if I can modify the cursor argument.
> Here's what some web pages say about char-set-cursor-next.
> ========================
> char-set-cursor-next increments a cursor index and returns a new cursor indexing the next character in the set; in this way, code can step through every character in a char set.
> ========================
> I assume this means that the argument cursor is not modified. The code in lib/vicare/containers/char-sets.vicare.sls seems not to modify the cursor, but to build an entirely new one.
> ========================
> Scheme Procedure: char-set-cursor-next cs cursor
> C Function: scm_char_set_cursor_next (cs, cursor)
> Advance the character set cursor cursor to the next character in the character set cs. It is an error if the cursor given satisfies end-of-char-set?.
> ========================
> This sounds like the argument cursor is modified. The code in libguile/srfi-14.c seems to modify the argument.
> I don't understand the code in gnu/kawa/slib/srfi14.scm well enough to know what Kawa does (it appears that the cursor is an integer, though, so it wouldn't change the argument).
> Brad