A few SRFI-115 post-finalization notes

A few SRFI-115 post-finalization notes Sergei Egorov (04 Aug 2025 15:13 UTC)

Re: A few SRFI-115 post-finalization notes Alex Shinn (05 Aug 2025 01:04 UTC)

Re: A few SRFI-115 post-finalization notes Arthur A. Gleckler (14 Aug 2025 14:59 UTC)

A few SRFI-115 post-finalization notes Sergei Egorov 04 Aug 2025 15:13 UTC

Hi,

Some SRE post-finalization notes:

1) There is a typo in the <cset-sre> grammar:

    | (- <difference> ...)

    should read

    | (difference <cset-sre> ...)

2) I think it needs to be explicitly specified that **? (and perhaps ** for
consistency?) may allow #f for the 'to' parameter; otherwise, it is
impossible to express non-greedy half-open intervals such as x{1,}? (i.e.
x+?). The reference implementation seems to allow this.

3) There seems to be an ambiguity in multi-argument versions of the (w/xxxx
...) forms: should the arguments form a sequence as in <sre>, or a union,
as in <cset-sre>? The reference implementation seems to prohibit some
multi-argument forms in some contexts:

      (regexp-matches? '(w/nocase #\a #\b) "aB") => #t  ; a sequence
      (regexp-matches? '(or (w/nocase #\a #\b) #\c) "a") error: "w/nocase takes only one char-set"
      (regexp-matches? '(: (w/nocase #\a #\b) #\c) "Abc") => #t ; a sequence

4) One has to guess what some argument-less regexps mean.
What is the meaning of (-)? Is it an error?
Is (and) an error, or the equivalent of the 'any' charset?
Is (or) an error, or an expression that fails without consuming any input?
Do (w/xxxx) forms follow rules for (or), (:), or differ? What about (/)?

5) I think it would be nice to have a use case for (word+ ...): why
restrict the word constituents instead of extending them by adding more,
such as a minus sign or colon?
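
To make the distinction in note 2 concrete, here is a minimal sketch of what the PCRE-style notation x{1,}? (i.e. x+?) means. It uses Python's re module only because that module accepts this notation directly; it says nothing about how any SRFI-115 implementation treats (**? 1 #f ...). The variable names and the sample string are illustrative assumptions.

```python
import re

# Sample input (illustrative): three x's followed by a y.
text = "xxxy"

# Greedy half-open interval: one or more "x", taking as many as possible.
greedy_match = re.match(r"x{1,}", text).group()

# Non-greedy half-open interval, the form quoted in note 2.
lazy_match = re.match(r"x{1,}?", text).group()

# The shorthand that note 2 treats as equivalent to x{1,}?.
lazy_plus_match = re.match(r"x+?", text).group()

# A non-greedy repeat still extends when the rest of the pattern needs it.
lazy_forced_match = re.match(r"x{1,}?y", text).group()

print(greedy_match)       # xxx
print(lazy_match)         # x
print(lazy_plus_match)    # x
print(lazy_forced_match)  # xxxy
```

The non-greedy form has no finite upper bound to write down, which is why an SRE spelling of it needs some way to say "no upper limit" in the 'to' position.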

-Sergei