General response to Per Bothner
John Cowan 16 Apr 2016 05:11 UTC
Per Bothner scripsit:
> I also suggest removing string-cursor-{forward|back}. Kawa already has:
>
> (string-cursor-next/-prev s cursor [nchars])
> which subsumes string-cursor-{forward|back}. You might consider
> that instead.
I don't see much difference. If you want to allow -next and -prev
to accept a third argument in Kawa, that's no problem, and then the
implementation of -forward and -back is trivial.
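
A minimal sketch of that last point, under a loud assumption: cursors are
modelled here as plain exact integer indexes, which is not how every
implementation represents them, and the clamping at the string ends is
only an illustrative choice. The definitions below are stand-ins, not
Kawa's actual code; they use only standard procedures.

```scheme
;; ASSUMPTION (illustration only): a cursor is an exact integer index.
;; These are hypothetical stand-ins, not Kawa's real definitions.

;; -next with an optional third argument, defaulting to 1.
(define (string-cursor-next s cursor . maybe-nchars)
  (let ((nchars (if (null? maybe-nchars) 1 (car maybe-nchars))))
    ;; Clamp at the end of the string (an illustrative choice).
    (min (+ cursor nchars) (string-length s))))

;; -prev with an optional third argument, defaulting to 1.
(define (string-cursor-prev s cursor . maybe-nchars)
  (let ((nchars (if (null? maybe-nchars) 1 (car maybe-nchars))))
    ;; Clamp at the start of the string (an illustrative choice).
    (max (- cursor nchars) 0)))

;; With the optional argument available, -forward and -back
;; are one-line wrappers.
(define (string-cursor-forward s cursor nchars)
  (string-cursor-next s cursor nchars))

(define (string-cursor-back s cursor nchars)
  (string-cursor-prev s cursor nchars))
```

With these definitions, (string-cursor-forward "hello" 1 3) and
(string-cursor-next "hello" 1 3) give the same cursor.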
(I think you dropped a few words from your next email, which I've added
here in square brackets.)
> The mapping procedures [called by] string-tabulate and string-unfold
> return characters. We should [not] be encouraging people to work at the
> character level - because [we are] in a Unicode world with grapheme
> clusters and what-not. The mapping procedures should return strings -
> or characters, for compatibility.
I seriously considered flushing these two, on the grounds that they
don't have that much purpose: more for constructing character bags
than true strings containing text. But they are very general and
may come in useful in some way I don't see right now, so I left
them in.
> Probably the same applies to string and make-string - the arguments
> should be strings or characters.
I think there's no need for that variability. Use string if you have
characters, or string-append if you have strings. If you really need
to run both together, then:
    (string-concatenate
      (map (lambda (c) (if (char? c) (string c) c))
           list-of-strings-or-chars))
should do the job.
> It probably makes sense to defer this issue for a successor SRFI.
I agree.
> strange string-for-each-cursor example
Already pointed out by Jim Rees and fixed in my draft.
> "Delimiter specifies a string whose characters are to be used as the word
> separator."
>
> Does this mean:
> (1) Delimiter specifies a char-set composed of the characters in it;
> if a character in 's' matches *any* character in 'delimiter' it's a
> word boundary.
> (2) A delimiter can match multiple characters, and the scan is as if
> done by string-contains?
(2) is intended. I've added the following to my draft: "This will often
be a single character, but multiple characters are allowed for cases
like splitting on <code>"\r\n"</code>."
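
To make reading (2) concrete, here is a rough sketch of splitting on a
fixed multi-character delimiter. The names find-substring and split-on
are hypothetical helpers invented for this illustration; this is not the
SRFI's string-split, and it ignores the optional arguments a real
implementation would take. It uses only standard procedures and assumes a
non-empty delimiter.

```scheme
;; Hypothetical helper: index of the first occurrence of pat in s
;; at or after start, or #f if there is none.
(define (find-substring s pat start)
  (let ((slen (string-length s))
        (plen (string-length pat)))
    (let loop ((i start))
      (cond ((> (+ i plen) slen) #f)
            ((string=? (substring s i (+ i plen)) pat) i)
            (else (loop (+ i 1)))))))

;; Hypothetical helper: split s at every occurrence of the whole
;; delimiter string, as if scanning with string-contains.
(define (split-on s delimiter)
  (let ((dlen (string-length delimiter)))
    (if (= dlen 0)
        (error "split-on: delimiter must be non-empty"))
    (let loop ((start 0) (acc '()))
      (let ((hit (find-substring s delimiter start)))
        (if hit
            ;; Keep the piece before the delimiter, resume after it.
            (loop (+ hit dlen) (cons (substring s start hit) acc))
            ;; No more delimiters: the rest of s is the last piece.
            (reverse (cons (substring s start (string-length s)) acc)))))))
```

Under reading (2), (split-on "a\r\nb\r\nc" "\r\n") yields three pieces,
while (split-on "a\rb\nc" "\r\n") yields the whole string as one piece,
because neither "\r" nor "\n" alone is the delimiter. Under reading (1)
the second call would have produced three pieces as well.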
> The delimiter should probably be a regex, so maybe deferred to another
> library.
SRFI 115 already provides regexp-split for that purpose. Here's the
example given there:
    (regexp-split '(+ space) " fee fi fo\tfum\n")
    => ("fee" "fi" "fo" "fum")
String-split is just meant to handle simple fixed cases, the inverse
of string-join.
> If strings are allowed, should "abc" be interpreted as /[abc]/ or /abc/?
> I think the latter is cleaner.
In SRFI 115, "abc" is equivalent to /abc/ in traditional notation.
The equivalent for [abc] is ("abc"), which works by codepoints, or
you can use (or "a" "b" "c"), which can work by codepoints or clusters
or whatever.
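
The three forms can be compared directly with regexp-matches?, which
tests whether the whole string matches. This is only a sketch, and it
assumes an implementation that provides the (srfi 115) library under that
name; the variable names are made up for the illustration.

```scheme
(import (scheme base) (srfi 115))

;; "abc" is the literal sequence a, b, c: /abc/ in traditional notation.
(define sequence-matches-abc (regexp-matches? "abc" "abc"))         ; #t
(define sequence-matches-b   (regexp-matches? "abc" "b"))           ; #f

;; ("abc") is the set of characters a, b, c: /[abc]/.
(define set-matches-b        (regexp-matches? '("abc") "b"))        ; #t
(define set-matches-abc      (regexp-matches? '("abc") "abc"))      ; #f

;; (or "a" "b" "c") is an alternation of three one-character strings.
(define or-matches-b         (regexp-matches? '(or "a" "b" "c") "b")) ; #t
```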
--
John Cowan http://www.ccil.org/~cowan xxxxxx@ccil.org
if if = then then then = else else else = if;