General response to Per Bothner

General response to Per Bothner John Cowan 16 Apr 2016 05:11 UTC

Per Bothner scripsit:

> I also suggest removing string-cursor-{forward|back}.  Kawa already has:
>
> (string-cursor-next/-prev s cursor [nchars])
> which subsumes string-cursor-{forward|back}.  You might consider
> that instead.

I don't see much difference.  If you want to allow -next and -prev
to accept a third argument in Kawa, that's no problem, and then the
implementation of -forward and -back is trivial.
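
To make "trivial" concrete, here is a minimal sketch. It assumes cursors can be modeled as plain character indexes, and cursor-next and cursor-prev are hypothetical stand-ins written for illustration, not Kawa's actual procedures. No bounds checking is attempted.

```scheme
(import (scheme base))

;; ASSUMPTION: cursors are modeled as plain character indexes, and
;; cursor-next / cursor-prev are hypothetical stand-ins, not Kawa's
;; actual procedures.  No bounds checking is done.
(define (cursor-next s cursor . maybe-nchars)
  (+ cursor (if (null? maybe-nchars) 1 (car maybe-nchars))))

(define (cursor-prev s cursor . maybe-nchars)
  (- cursor (if (null? maybe-nchars) 1 (car maybe-nchars))))

;; With the optional count available, -forward and -back are one-liners.
(define (cursor-forward s cursor nchars) (cursor-next s cursor nchars))
(define (cursor-back s cursor nchars) (cursor-prev s cursor nchars))

(cursor-forward "hello" 1 3)   ; => 4
(cursor-back "hello" 4 3)      ; => 1
```

An implementation whose cursors are not indexes would do real work inside the -next and -prev procedures, but the two wrappers would look the same.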

(I think you dropped a few words from your next email, which I've added
here in square brackets.)

> The mapping procedures [called by] string-tabulate and string-unfold
> return characters.  We should [not] be encouraging people to work at the
> character level - because [we are] in a Unicode world with grapheme
> clusters and what-not.  The mapping procedures should return strings -
> or characters, for compatibility.

I seriously considered flushing these two, on the grounds that they
don't have that much purpose: more for constructing character bags
than true strings containing text.  But they are very general and
may come in useful in some way I don't see right now, so I left
them in.

> Probably the same applies to string and make-string - the arguments
> should be strings or characters.

I think there's no need for that variability.  Use string if you have
characters, or string-append if you have strings.  If you really need
to run both together, then:

    (string-concatenate
      (map (lambda (c) (if (char? c) (string c) c))
           list-of-strings-or-chars))

should do the job.
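
As a self-contained illustration of the same idea, the sketch below wraps it in a procedure. The name concatenate-mixed is hypothetical, and (apply string-append ...) stands in for string-concatenate so that only R7RS (scheme base) is needed.

```scheme
(import (scheme base))

;; Hypothetical helper name.  Uses (apply string-append ...) so the
;; sketch needs only R7RS, in place of string-concatenate.
(define (concatenate-mixed items)
  (apply string-append
         ;; Promote each character to a one-character string;
         ;; strings pass through unchanged.
         (map (lambda (x) (if (char? x) (string x) x)) items)))

(concatenate-mixed (list "ab" #\c "de" #\f))   ; => "abcdef"
```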

> It probably makes sense to defer this issue for a successor SRFI.

I agree.

> strange string-for-each-cursor example

Already pointed out by Jim Rees and fixed in my draft.

> "Delimiter specifies a string whose characters are to be used as the word
> separator."
>
> Does this mean:
> (1) Delimiter specifies a char-set composed of the characters in it;
> if a character in 's' matches *any* character in 'delimiter' it's a
> word boundary.
> (2) A delimiter can match multiple characters, and the scan is as if
> done by string-contains?

(2) is intended.  I've added the following to my draft: "This will often
be a single character, but multiple characters are allowed for cases
like splitting on <code>"\r\n"</code>."
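
To show what interpretation (2) means in practice, here is a rough sketch of splitting on a whole delimiter string. Both find-substring and simple-split are hypothetical helpers written for this message, not the SRFI's string-split, and they assume a non-empty delimiter.

```scheme
(import (scheme base))

;; Hypothetical helper: index of the first occurrence of pat in s at or
;; after start, or #f if there is none.
(define (find-substring s pat start)
  (let ((slen (string-length s))
        (plen (string-length pat)))
    (let loop ((i start))
      (cond ((> (+ i plen) slen) #f)
            ((string=? (substring s i (+ i plen)) pat) i)
            (else (loop (+ i 1)))))))

;; Split s at each occurrence of the whole delimiter string.
;; ASSUMPTION: delimiter is non-empty.
(define (simple-split s delimiter)
  (let ((dlen (string-length delimiter)))
    (let loop ((start 0) (acc '()))
      (let ((hit (find-substring s delimiter start)))
        (if hit
            (loop (+ hit dlen) (cons (substring s start hit) acc))
            (reverse (cons (substring s start (string-length s)) acc)))))))

(simple-split "a\r\nb\r\nc" "\r\n")   ; => ("a" "b" "c")
(simple-split "a\rb\r\nc" "\r\n")     ; => ("a\rb" "c")
```

In the second call the lone "\r" stays inside the first field, because only the full two-character sequence counts as a separator. Under interpretation (1) it would have been treated as a boundary by itself.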

> The delimiter should probably be a regex, so maybe deferred to another
> library.

SRFI 115 already provides regexp-split for that purpose.  Here's the
example given there:

   (regexp-split '(+ space) " fee fi  fo\tfum\n")
     => ("fee" "fi" "fo" "fum")

String-split is just meant to handle simple fixed cases, the inverse
of string-join.

> If strings are allowed, should "abc" be interpreted as /[abc]/ or /abc/?
> I think the latter is cleaner.

In SRFI 115, "abc" is equivalent to /abc/ in traditional notation.
The equivalent for [abc] is ("abc"), which works by codepoints, or
you can use (or "a" "b" "c"), which can work by codepoints or clusters
or whatever.

--
John Cowan          http://www.ccil.org/~cowan        xxxxxx@ccil.org
                if if = then then then = else else else = if;