A few SRFI-115 post-finalization notes
Sergei Egorov 04 Aug 2025 15:13 UTC
Hi,
Some SRE post-finalization notes:
1) There is a typo in the <cset-sre> grammar:
| (- <difference> ...)
should read
| (difference <cset-sre> ...)
2) I think it should be explicitly specified that **? (and perhaps **, for
consistency) accepts #f for the 'to' parameter; otherwise, it is
impossible to express non-greedy half-open intervals such as x{1,}? (i.e.
x+?). The reference implementation seems to allow this.
3) There seems to be an ambiguity in multi-argument versions of the (w/xxxx
...) forms: should the arguments form a sequence as in <sre>, or a union,
as in <cset-sre>? The reference implementation seems to prohibit some
multi-argument forms in some contexts:
(regexp-matches? '(w/nocase #\a #\b) "aB") => #t ; a sequence
(regexp-matches? '(or (w/nocase #\a #\b) #\c) "a")
  => error: "w/nocase takes only one char-set"
(regexp-matches? '(: (w/nocase #\a #\b) #\c) "Abc") => #t ; a sequence
4) One has to guess what some argument-less regexps mean.
What is the meaning of (-)? Is it an error?
Is (and) an error, or the equivalent of the 'any' charset?
Is (or) an error, or an expression that fails without consuming any input?
Do (w/xxxx) forms follow rules for (or), (:), or differ? What about (/)?
5) I think it would be nice to have a use case for (word+ ...): why
restrict the word constituents instead of extending them by adding
constituents such as the minus sign or colon?
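
Regarding point 2, here is a minimal sketch of how one might probe an
implementation. It assumes an R7RS system that provides (srfi 115);
try-search is a hypothetical helper and not part of the SRFI, and reading
#f as "no upper bound" is the assumption under discussion, not something
the SRFI text states.

```scheme
(import (scheme base) (scheme write) (srfi 115))

;; try-search is a hypothetical helper, not part of SRFI 115.
;; It returns the matched substring, #f when there is no match,
;; or the symbol 'rejected when the implementation raises an error
;; (for example because it does not accept #f as an upper bound).
(define (try-search sre str)
  (guard (e (#t 'rejected))
    (let ((m (regexp-search sre str)))
      (and m (regexp-match-submatch m 0)))))

;; Greedy one-or-more, as specified by the SRFI.
(write (try-search '(+ #\x) "xxx"))
(newline)

;; Non-greedy one-or-more written as a half-open interval.
;; Assumption: #f as the 'to' argument means "no upper bound".
(write (try-search '(**? 1 #f #\x) "xxx"))
(newline)
```

If the second form is accepted and behaves as x+?, one would expect it to
report a shorter match than the greedy form; a result of rejected would
mean the implementation does not support #f there.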
-Sergei