Re: Ambiguity in regexp-replace, and maybe a bug Duy Nguyen 30 Dec 2019 08:40 UTC

I picked up the test in
https://github.com/ashinn/chibi-scheme/commit/6f28159667212fffc9df4383dc773ea129f106d0
in order to add it to Gauche, but it looks like Chibi and Gauche
behave a bit differently, so just to double check...

The test in the commit above uses the string

    abc def: ghi

as input and expects the output string

    bc pre: <<<bc >>> match1: <<<def>>> post: <<<gh>>>gh

The substitution part is correct, i.e. we get "<<<bc >>>" instead of
"<<<abc >>>". What tripped up Gauche is that the output string is
_truncated_: the "a" is missing at the beginning (and the "i" at the
end). In other words, Gauche expects this

    abc pre: <<<bc >>> match1: <<<def>>> post: <<<gh>>>ghi

I've read the SRFI again. I don't think the start and end parameters
are meant to truncate the string, are they? In the SRFI's examples that
pass a start parameter, the output string is not truncated either.
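
To make the two readings concrete, here is a minimal sketch. It assumes
the test is called with start 1 and end 11 and uses an SRE along the
lines shown; the pattern, the substitution list and both helper names
are reconstructed from the expected output above for illustration, not
copied from the commit.

```scheme
;; Sketch only: rx and subst are reconstructed from the expected
;; output quoted above, not copied from the Chibi commit.
(import (scheme base) (scheme write) (srfi 115))

(define str "abc def: ghi")
(define rx '(: ($ (+ alpha)) ":" (* space)))
(define subst
  '("pre: <<<" pre ">>> match1: <<<" 1 ">>> post: <<<" post ">>>"))

;; Reading 1 (hypothetical helper): same as replacing on
;; (substring str start end), so text outside [start, end) is dropped.
(define (replace-truncating str start end)
  (regexp-replace rx (substring str start end) subst))

;; Reading 2 (hypothetical helper): start/end restrict the match and
;; 'pre/'post, but text outside [start, end) is copied through.
(define (replace-preserving str start end)
  (string-append (substring str 0 start)
                 (replace-truncating str start end)
                 (substring str end (string-length str))))

(write (replace-truncating str 1 11)) (newline)
;; should print "bc pre: <<<bc >>> match1: <<<def>>> post: <<<gh>>>gh"
(write (replace-preserving str 1 11)) (newline)
;; should print "abc pre: <<<bc >>> match1: <<<def>>> post: <<<gh>>>ghi"
```

Both readings agree on 'pre and 'post; they differ only in whether the
"a" and the "i" outside the start/end range survive in the result.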

If what Chibi does now is correct, I'll fix up the Gauche implementation.

On Sat, Dec 28, 2019 at 9:49 PM Alex Shinn <xxxxxx@gmail.com> wrote:
>
> Fix pushed.
>
> On Fri, Dec 27, 2019 at 4:51 PM Duy Nguyen <xxxxxx@gmail.com> wrote:
>>
>> On Thu, Dec 26, 2019 at 11:10 PM Alex Shinn <xxxxxx@gmail.com> wrote:
>> >
>> > On Tue, Dec 3, 2019 at 7:24 AM Chris Hanson <xxxxxx@chris-hanson.org> wrote:
>> >>
>> >> The definition of regexp-replace specifies the meanings of 'pre and 'post as substitution arguments.
>> >>
>> >> What I expect is that 'pre/'post will transform into
>> >>
>> >> (substring string start (regexp-match-submatch-start match 0))
>> >> (substring string (regexp-match-submatch-end match 0) end)
>> >>
>> >> However, looking at the chibi implementation (the only one I’ve found after 10 minutes of searching), what these transform into is
>> >
>> >
>> > Were you hoping to find others or did you have trouble finding the reference implementation?  It is linked from the SRFI document.
>> >
>> > A complete regexp engine is quite a bit of work so it's unsurprising if there are no other impls, though borrowing sre->pcre and wrapping another impl is reasonable.
>>
>> For what it's worth, Gauche also supports this srfi (except grapheme
>> stuff) but it's in the second category, converting sre to gauche ast
>> syntax, not full regexp engine written in Scheme.
>>
>> >> (substring string 0 (regexp-match-submatch-start match 0))
>> >> (substring string (regexp-match-submatch-end match 0) (string-length string))
>> >>
>> >> I suspect that this is a bug, given this sentence in the description:
>> >>
>> >> The optional parameters start and end restrict both the matching and the substitution, to the given indices, such that the result is equivalent to omitting these parameters and replacing on (substring str start end).
>> >>
>> >>
>> >> But I’d appreciate confirmation of this. It might also be good to clarify what happens here.
>> >
>> >
>> > Yes, thanks for catching this, 'pre/'post should respect start and end.  I will fix this.
>>
>> Thanks. Gauche sre behaves this way, so I don't have to fix anything else :D
>> --
>> Duy

--
Duy