On Tue, Oct 10, 2017 at 6:17 AM, Shiro Kawai <xxxxxx@gmail.com> wrote:
It reads "Splits str into a list of strings separated by matches of re" but the current reference implementation seems to "eat up" any empty strings between matches.  Suppose I want to parse text fields separated by a comma or a semicolon:

  (regexp-split '(",;") "a,,b,")

I'd expect to have ("a" "" "b" "") rather than ("a" "b").

Thanks, Shiro.  I think the spec is a little ambiguous here, but I agree
with your interpretation.  I'll fix the reference implementation, and will
propose an erratum adding such an example for clarity.

Note: If we allow empty strings between matches, we had better address explicitly the case where the regexp matches an empty string, since one could argue that such a regexp splits an empty string into an infinite number of empty strings.

`regexp-extract' and `regexp-partition' explicitly only return non-empty
matches - even if the regexp can match the empty string, we're not
allowed to use it.  We can assume the same holds for `regexp-split'.
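
To make the intended behaviour concrete, here is a minimal sketch of a
splitter that keeps empty fields between matches but never splits on an
empty match.  It is not the reference implementation:
`split-keeping-empties' is a hypothetical name, and the sketch assumes an
R7RS system that provides (srfi 115) and integer string indices.

```scheme
(import (scheme base) (scheme write) (srfi 115))

;; Hypothetical helper, not part of SRFI 115.
;; Keeps empty fields between matches; ignores empty matches.
(define (split-keeping-empties re str)
  (let ((rx (regexp re))
        (end (string-length str)))
    ;; from: where to resume searching; last: where the current field starts
    (let lp ((from 0) (last 0) (acc '()))
      (let ((m (and (<= from end) (regexp-search rx str from))))
        (if (not m)
            ;; no more separators: the rest of the string is the last field
            (reverse (cons (substring str last end) acc))
            (let ((s (regexp-match-submatch-start m 0))
                  (e (regexp-match-submatch-end m 0)))
              (if (= s e)
                  ;; empty match: skip it and retry one character later
                  (lp (+ s 1) last acc)
                  ;; non-empty match: emit the field before it
                  (lp e e (cons (substring str last s) acc)))))))))

(write (split-keeping-empties '(",;") "a,,b,"))  ; expected: ("a" "" "b" "")
(newline)
```

Because an empty match only advances the search position by one
character, the loop always terminates, which sidesteps the "infinite
number of empty strings" reading.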

P.S. The link to the reference implementation seems obsolete: it points to code.google.com, which redirects to the GitHub project's top page rather than to the actual file.

The reference implementation is now at:

  https://github.com/ashinn/chibi-scheme/blob/master/lib/srfi/115.sld
  https://github.com/ashinn/chibi-scheme/blob/master/lib/chibi/regexp.sld
  https://github.com/ashinn/chibi-scheme/blob/master/lib/chibi/regexp.scm

Note I have not yet added support for look-around assertions (which
are an optional feature) but hope to do so soon.

-- 
Alex