Complexity of seq & related patterns Wolfgang Corcoran-Mathe (05 Nov 2025 19:37 UTC)
Re: Complexity of seq & related patterns Daphne Preston-Kendal (05 Nov 2025 20:05 UTC)
Re: Complexity of seq & related patterns Daphne Preston-Kendal (06 Nov 2025 20:07 UTC)
Re: Complexity of seq & related patterns Wolfgang Corcoran-Mathe (07 Nov 2025 18:22 UTC)
Re: Complexity of seq & related patterns Daphne Preston-Kendal (25 Nov 2025 10:37 UTC)
Re: Complexity of seq & related patterns Wolfgang Corcoran-Mathe (07 Jan 2026 22:49 UTC)

Re: Complexity of seq & related patterns Daphne Preston-Kendal 05 Nov 2025 20:04 UTC

On 5 Nov 2025, at 20:36, Wolfgang Corcoran-Mathe <xxxxxx@sigwinch.xyz> wrote:

> Hi dpk & the SRFI 262 list,
>
> This SRFI spends quite a few words describing 'seq', 'seq*', and
> 'seq/unordered'.  The syntax & semantics of these patterns far
> exceed the other patterns in complexity.  To me they are also the
> least familiar parts of the language, since there seems to be nothing
> comparable in other pattern-matching systems.

See below on the motivations for introducing this construct.

> There is something Rube Goldbergian about a syntax ('seq') that
> combines a looping construct with a pattern that can match either
> a single value or a subset of the sequence produced by the loop.
> Pattern-matching is timeless, at least conceptually, but iteration
> is, well, iterative, happening through time.  Repeatedly generating a
> value & doing something with it is a familiar pattern, as is matching
> a whole sequence of values & doing something with the bindings.  But
> generating values, matching them, & then finally doing just one thing
> with the bindings is, at least to me, rather strange.

I don’t understand this objection; based on what follows I think it’s rooted in a misunderstanding. If I can improve the spec language to avoid others having the same misunderstanding(s) in future, I’d like to.

> An example non-intuitive use:
>
>    (match <vector>
>      ((seq vec ((i 0 (+ i 1)))
>                (>= i (vector-length vec))
>                (vector-ref vec i)
>         a)
>       <body>))
>
> Sorry this is a bit crude; I had to rely on the SRFI's examples.
> Anyway, this is apparently allowed, although the sample implementation
> raises an exception if <vector> contains more than one element.  What
> value should *a* be bound to in <body>?  The last element of <vector>?

It’s bound to the *only* element of <vector>. This pattern can only match a 1-element vector. It is basically what the pattern (vector a) expands into. Just as a constructive ‘vector’ would make a vector of one element which is the value of the variable ‘a’, this deconstructive ‘vector’ matches a vector of one element and binds that single element to the variable ‘a’.

I don’t see what’s meant to be non-intuitive or not allowed about this use; it more or less exactly mirrors the ‘vektor’ example in the spec, adding a single subpattern without any ellipses.
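
To make this concrete, here is the same pattern in both the matching and the non-matching case. (Illustrative sketch only – I haven’t run these against the sample implementation.)

    (match #(42)
      ((seq vec ((i 0 (+ i 1)))
                (>= i (vector-length vec))
                (vector-ref vec i)
         a)
       a))
    ;; a is bound to 42, the only element of the vector

    ;; #(1 2) would fail to match: the lone subpattern ‘a’ cannot
    ;; consume both elements of the sequence.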

Seq patterns implement regular expressions over Scheme sequences, although the Kleene star is restricted to operating on a single token at a time. (This is mostly for syntactic convenience.) The subpatterns as a whole must, using the Kleene star or otherwise, match the entire contents of the sequence. (In the terminology of string regular expressions, this is a match, not a search: the entire pattern is implicitly surrounded by ^ and $ in POSIX notation.)
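
To illustrate the Kleene-star reading (again a sketch, not tested against the sample implementation): with an ellipsis over a single variable subpattern, the pattern below plays the role of ^x*y$ over the elements of the vector.

    (match #(1 2 3)
      ((seq vec ((i 0 (+ i 1)))
                (>= i (vector-length vec))
                (vector-ref vec i)
         x ... y)
       (list x y)))
    ;; under the regex reading, x gathers the elements (1 2)
    ;; and y is bound to the final element, 3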

> Syntactically, the association with 'do' is strong.  Would it really
> take an eccentric mind to expect <body> to be evaluated once for each
> item, with a new binding of *a* on each step?

You mean the body of the whole match clause? I don’t think that’s a reasonable interpretation. No other pattern causes such a radical change in the evaluation semantics of the Scheme ‘match’ form itself.

> So much for the curiosities of 'seq'. 'seq*' is curiouser still, &
> as for 'seq/unordered', or 'seq/reorderable', or 'seq/interleaving'
> ... the naming difficulties give some idea of the semantic monsters
> dwelling there.

seq* is indeed not very satisfactory. It’s there to enable the ‘cons*’ pattern and things like it.

> Certainly, powerful primitives for sequence patterns make sense in
> an extensible system.  But these aren't the primitives I'd like.

Can you suggest anything better? :-)

You are correct in your implication that seq, seq*, and seq/unordered are not really intended for direct use but are meant to be expanded into by higher-level patterns.

In general, a pattern (my-spiffy-sequence-type <<subpatterns>>) should expand into (seq <<the state vars etc for a my-spiffy-sequence-type>> <<subpatterns>>) – exactly replicating the subpatterns and just telling the matcher how to iterate over the sequence. One of the infelicities of seq* is that this is currently not the case.
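
Concretely, the intent is that a pattern such as (vector a b ...) should be able to expand into the seq form from your example, with the subpatterns carried over verbatim (sketch only; the exact expansion machinery is elided here):

    (vector a b ...)
      ≈
    (seq v ((i 0 (+ i 1)))
           (>= i (vector-length v))
           (vector-ref v i)
       a b ...)

The state variables, termination test, and element expression describe only the iteration; the subpatterns a b ... are untouched.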

The challenge here is to enable this efficiently. Originally I had only ‘list’, and you had to convert your sequence into a list first with the => pattern before matching on it. Now, since the ellipsis gathers items as a list anyway regardless of the input type, this might have been a somewhat premature optimization. However, as a data point I would also point to how Racket’s ‘hash-table’ pattern was implemented by conversion of the input hash table to a list before feeding it to list-no-order; this was slow and caused bugs.

I was originally searching around for a general iteration interface that would give the pattern-match implementation (not only the sample implementation) flexibility in how it actually implements these regular-expression semantics – using a linear NFA or DFA as the current implementation of seq and seq* does, using a backtracking implementation, or whatever. I decided to model it on named let and do because those would already be familiar to Scheme programmers. However, the need to accommodate implementation strategies such as backtracking required imposing the restrictions on mutation of state variables, which I think is part of the complexity you’re criticizing.

Daphne