Semantics of char-set-cursor-next Bradley Lucier (10 Jul 2023 02:20 UTC)
Re: Semantics of char-set-cursor-next Marc Feeley (10 Jul 2023 07:21 UTC)
Re: Semantics of char-set-cursor-next Marc Nieper-Wißkirchen (10 Jul 2023 09:40 UTC)
Re: Semantics of char-set-cursor-next Marc Nieper-Wißkirchen (10 Jul 2023 09:41 UTC)
Re: Semantics of char-set-cursor-next Bradley Lucier (10 Jul 2023 19:34 UTC)

Re: Semantics of char-set-cursor-next Marc Feeley 10 Jul 2023 07:21 UTC

From my reading of SRFI-14, I think char-set-cursor-next is allowed to mutate the cursor. However, this means some coding patterns, such as backtracking searches, will be harder to use, in particular because there is no cursor “copy” operation. So a functional implementation would be best if the cost is reasonable.
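To make the backtracking concern concrete, here is a minimal sketch in Python rather than Scheme. The names next_mutating and next_functional, and the representation of a char set as a sorted list with an index as cursor, are hypothetical and are not taken from SRFI-14 or from any implementation.

```python
# Hypothetical model: a char set is a sorted list of characters and a
# cursor is an index into it. Neither representation comes from SRFI-14.

def next_mutating(cset, cursor):
    """Advance in place; cursor is a one-element list used as a mutable box."""
    cursor[0] += 1
    return cursor  # the same object that was passed in


def next_functional(cset, cursor):
    """Return a fresh cursor value; the argument is left untouched."""
    return cursor + 1


cset = ["a", "b", "c"]

# Mutating style: the "saved" position moves together with the live cursor,
# so a backtracking search can no longer return to it.
saved_m = [0]
live_m = next_mutating(cset, saved_m)
mutating_saved_char = cset[saved_m[0]]

# Functional style: the saved position still designates the first character.
saved_f = 0
live_f = next_functional(cset, saved_f)
functional_saved_char = cset[saved_f]
```

In the mutating variant the saved cursor and the live cursor are the same object, so without a copy operation the earlier position is lost as soon as the cursor advances.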

Have you thought of packing the cursor information in a single fixnum? Gambit has fixnums of at least 30 bits, so you could use 21 of those bits as a Unicode character code “limit” and the remaining 9 bits as a count. The char-set-cursor-next procedure checks the count and simply decrements it if it is >= 1. If it is 0, it consults the char-set to determine the next cursor. This avoids allocating memory for a new cursor, and the O(log N) cost of accessing the char-set is amortized by a factor of up to 512.
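One possible reading of the packed-cursor idea, sketched in Python rather than Scheme. Several details are assumptions of this sketch and not part of the suggestion above: the char set is stored as two parallel sorted lists of inclusive range bounds, the current character is taken to be limit minus count, and the end-of-set cursor uses a limit of 0x110000 (one past the last Unicode code point, which still fits in 21 bits). All helper names are hypothetical.

```python
from bisect import bisect_left

COUNT_BITS = 9
COUNT_MASK = (1 << COUNT_BITS) - 1   # 511: largest count that fits in 9 bits
END = 0x110000 << COUNT_BITS         # assumed sentinel: limit past Unicode, count 0


def _cursor_from(cset, start):
    """Pack a cursor for the smallest member >= start, or return END."""
    los, his = cset                  # parallel sorted lists of inclusive bounds
    i = bisect_left(his, start)      # O(log N): first range whose upper bound >= start
    if i == len(his):
        return END
    first = max(los[i], start)
    limit = min(his[i], first + COUNT_MASK)   # cover at most 512 characters
    return (limit << COUNT_BITS) | (limit - first)


def char_set_cursor(cset):
    return _cursor_from(cset, 0)


def char_set_ref(cset, cursor):
    # Assumed encoding: current character = limit - count.
    return chr((cursor >> COUNT_BITS) - (cursor & COUNT_MASK))


def char_set_cursor_next(cset, cursor):
    if cursor & COUNT_MASK:          # count >= 1: stay in the run, no lookup
        return cursor - 1
    # count == 0: the run is exhausted, so consult the char set again.
    return _cursor_from(cset, (cursor >> COUNT_BITS) + 1)


def end_of_char_set(cursor):
    return cursor == END


def char_set_to_list(cset):
    """Usage example: walk the whole set with the cursor procedures."""
    out = []
    cur = char_set_cursor(cset)
    while not end_of_char_set(cur):
        out.append(char_set_ref(cset, cur))
        cur = char_set_cursor_next(cset, cur)
    return out
```

Because every cursor here is a plain integer, stepping never changes a previously saved cursor, so this variant stays functional while only touching the char set once per run of up to 512 characters.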

Marc

> On Jul 10, 2023, at 4:19 AM, Bradley Lucier <xxxxxx@purdue.edu> wrote:
>
> The signature is
>
> char-set-cursor-next cset cursor -> cursor
>
> The SRFI text says:
> ========================
> A cursor index is incremented with char-set-cursor-next; in this way, code can step through every character in a char set.
> ========================
> I don't have much experience with cursors.  I want to know whether char-set-cursor-next is allowed to modify the "cursor" argument it is passed.
>
> I ask because in my design the cursor is a vector of three elements, but with very fast increment (on average) if I can modify the cursor argument.
>
> Here's what some web pages say about char-set-cursor-next.
>
> http://marcomaggi.github.io/docs/vicare-libs.html/srfi-char_002dsets-spec-iter.html
> ========================
> char-set-cursor-next increments a cursor index and returns a new cursor indexing the next character in the set; in this way, code can step through every character in a char set.
> ========================
> I assume this means that the argument cursor is not modified.  The code in lib/vicare/containers/char-sets.vicare.sls seems not to modify the cursor, but to build an entirely new one.
>
> https://www.gnu.org/software/guile/manual/html_node/Iterating-Over-Character-Sets.html
> ========================
> Scheme Procedure: char-set-cursor-next cs cursor
> C Function: scm_char_set_cursor_next (cs, cursor)
>
>    Advance the character set cursor cursor to the next character in the character set cs. It is an error if the cursor given satisfies end-of-char-set?.
> ========================
> This sounds like the argument cursor is modified.  The code in libguile/srfi-14.c seems to modify the argument.
>
> I don't understand the code in gnu/kawa/slib/srfi14.scm well enough to know what Kawa does (it appears that the cursor is an integer, though, so it wouldn't change the argument).
>
> Brad
>