Leftmost-longest behavior John Cowan 27 Apr 2016 20:04 UTC

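The SRFI 115 sample implementation provides leftmost-longest behavior:
among the matches that start at the leftmost possible position, the
longest one wins.  The SRFI itself does not specify this.  With the
sample implementation,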

(regexp-partition '(or "a" "bcdef" "g" "ab" "c" "d" "e" "efg" "fg") "abcdefg")

produces

("" "ab" "" "c" "" "d" "" "efg")

whereas its Python analogue, which tries the alternatives in order and
keeps the first one that matches at each position,

re.findall(r"(a|bcdef|g|ab|c|d|e|efg|fg)", "abcdefg")

produces

['a', 'bcdef', 'g']

The sample implementation's behavior agrees with egrep, which follows
the POSIX leftmost-longest rule:

$ echo 'abcdefg' | egrep -o '(a|bcdef|g|ab|c|d|e|efg|fg)'
ab
c
d
efg

so it is not wrong, but it may be surprising.  (Example due to dpk.)
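
For comparison, here is a rough Python sketch of the two strategies
applied to the same alternatives.  It is only an illustration, not how
the sample implementation or egrep works internally: the function names
leftmost_first and leftmost_longest are made up for this example, and
the leftmost-longest scan assumes every alternative is a literal string
rather than a general sub-pattern.

```python
import re

ALTERNATIVES = ["a", "bcdef", "g", "ab", "c", "d", "e", "efg", "fg"]


def leftmost_first(alternatives, text):
    """Ordered alternation, as Python's re module does it."""
    # Escape each literal and join with "|"; the first alternative that
    # matches at a given position wins, regardless of length.
    pattern = "|".join(re.escape(alt) for alt in alternatives)
    return re.findall(pattern, text)


def leftmost_longest(alternatives, text):
    """Leftmost-longest scan over literal alternatives (hypothetical helper)."""
    matches = []
    pos = 0
    while pos < len(text):
        # Collect every non-empty alternative that matches at this position.
        candidates = [alt for alt in alternatives
                      if alt and text.startswith(alt, pos)]
        if candidates:
            # Keep the longest candidate and resume scanning after it.
            best = max(candidates, key=len)
            matches.append(best)
            pos += len(best)
        else:
            # Nothing matches here; move one character to the right.
            pos += 1
    return matches


if __name__ == "__main__":
    print(leftmost_first(ALTERNATIVES, "abcdefg"))
    print(leftmost_longest(ALTERNATIVES, "abcdefg"))
```

On "abcdefg" the first function gives the same matches as the
re.findall call above, and the second gives the same matches as egrep.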

--
John Cowan          http://www.ccil.org/~cowan        xxxxxx@ccil.org
Don't be so humble.  You're not that great.
        --Golda Meir