String library d96-mst@xxxxxx 18 Nov 1999 20:54 UTC

The SRFI-13 string library seems like a good proposal, but I have a few
comments.

Overall, I think that the library is too big; it has too many
procedures. I think we should try to remove some
unnecessary, redundant, or seldom-useful procedures.

Some specific comments:

* I think that string-for-each should work from left to right and that
no other string-iter[are] procedure should be included.

* I don't see the point in [sub]string-compare[-ci]. Can anybody give a
sensible example of their use?

* I think that the string-trim and string-pad procedures should be
named string-trim-left and string-pad-left. I don't think that left is
an obvious default for trimming and padding (as it is for string-index,
string-skip, etc.).

* I think that the let-string-start+end macro should be left out of the
standard. The same thing is easy enough to do with the
string-parse-start+end procedure, and including a macro might create
problems.

I think that the KMP searching procedures should be generalized to
allow for algorithms other than KMP, but only for algorithms that have
the same property as KMP, i.e. scanning a stream without
backtracking. This excludes Boyer-Moore, but allows for some recent
algorithms that are said to be faster than KMP[1].

The interface could be something like this:

make-search-object c= s [start end] -> opaque "search object"
    Returns an opaque "search object" that can be used to search
    for S in a stream/string. The search object includes S itself
    if necessary (necessary for KMP, but not for [1]).

search-step search-object c= c search-state -> bool or opaque "search state"
    Performs a step in the search, taking an opaque "search state"
    object and returning a new "search state" object. Pass '() as the
    search state on the first call.
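
To make the first interface concrete, here is a rough sketch of how it
could be implemented with KMP. This is only an illustration under my own
assumptions, not part of SRFI-13: the search object is a pair of the
pattern and its failure table, the search state is the number of pattern
characters matched so far, search-step returns #t once the whole pattern
has been seen and a new state otherwise, and the optional [start end]
arguments are left out. The helpers make-failure-table and
string-search-index are hypothetical names used only for this example.

```scheme
;; Hypothetical sketch of the proposed interface; not part of SRFI-13.
;; The optional [start end] arguments are omitted for brevity.

;; Build the KMP failure table: (vector-ref fail i) is the length of the
;; longest proper prefix of pat[0..i] that is also a suffix of it.
(define (make-failure-table c= pat)
  (let* ((m (string-length pat))
         (fail (make-vector m 0)))
    (let loop ((i 1) (k 0))
      (cond ((>= i m) fail)
            ((c= (string-ref pat i) (string-ref pat k))
             (vector-set! fail i (+ k 1))
             (loop (+ i 1) (+ k 1)))
            ((> k 0)
             (loop i (vector-ref fail (- k 1))))
            (else
             (vector-set! fail i 0)
             (loop (+ i 1) 0))))))

;; Assumed representation: (pattern . failure-table).
(define (make-search-object c= s)
  (cons (string-copy s) (make-failure-table c= s)))

;; Feed one character C to the search. STATE is '() on the first call,
;; otherwise the value returned by the previous call. Returns #t when the
;; pattern has just been matched completely, else the new state.
(define (search-step search-object c= c state)
  (let* ((pat (car search-object))
         (fail (cdr search-object))
         (m (string-length pat)))
    (if (= m 0)
        #t                              ; an empty pattern matches at once
        (let loop ((k (if (null? state) 0 state)))
          (cond ((c= c (string-ref pat k))
                 (if (= (+ k 1) m) #t (+ k 1)))
                ((> k 0) (loop (vector-ref fail (- k 1))))
                (else 0))))))

;; Example driver: index of the first match of PATTERN in TEXT, or #f.
;; The text is read strictly left to right, one character at a time.
(define (string-search-index pattern text)
  (let ((so (make-search-object char=? pattern))
        (n (string-length text)))
    (let loop ((i 0) (state '()))
      (if (>= i n)
          #f
          (let ((next (search-step so char=? (string-ref text i) state)))
            (if (eq? next #t)
                (+ (- i (string-length pattern)) 1)
                (loop (+ i 1) next)))))))
```

With these definitions, (string-search-index "abab" "xxababab") gives 2.
Since the caller holds the state, one search object can be shared by
several searches running at the same time.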

Or why not like this:

make-search-proc c= s [start end] -> search procedure
    Returns a search procedure that can be used to search
    for S in a stream/string.

<search-proc> c -> bool
    Performs a step in the search, updating an internal search state.

<search-proc> '() -> '()
    Resets the internal state of the search procedure.
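
The second variant could be sketched as a closure that keeps the state
itself. Again this is only an illustration under my own assumptions, not
part of SRFI-13: it uses KMP, assumes S is non-empty, omits the optional
[start end] arguments, and after a complete match it falls back through
the failure table so that overlapping matches are still reported.
count-matches is a hypothetical helper used only for the example.

```scheme
;; Hypothetical sketch of the closure-based variant; not part of SRFI-13.
;; The optional [start end] arguments are omitted; S is assumed non-empty.
(define (make-search-proc c= s)
  (let* ((pat (string-copy s))
         (m (string-length pat))
         (fail (make-vector m 0))       ; KMP failure table, filled in below
         (k 0))                         ; internal state: characters matched
    ;; Fill in the failure table for PAT.
    (let fill ((i 1) (j 0))
      (cond ((>= i m) 'done)
            ((c= (string-ref pat i) (string-ref pat j))
             (vector-set! fail i (+ j 1))
             (fill (+ i 1) (+ j 1)))
            ((> j 0) (fill i (vector-ref fail (- j 1))))
            (else (fill (+ i 1) 0))))
    (lambda (c)
      (if (null? c)
          (begin (set! k 0) '())        ; '() resets the internal state
          (let step ()
            (cond ((c= c (string-ref pat k))
                   (set! k (+ k 1))
                   (if (= k m)
                       ;; Full match: keep the longest border as the new
                       ;; state so overlapping matches are not lost.
                       (begin (set! k (vector-ref fail (- m 1))) #t)
                       #f))
                  ((> k 0)
                   (set! k (vector-ref fail (- k 1)))
                   (step))
                  (else #f)))))))

;; Example: count the (possibly overlapping) matches of PATTERN in TEXT.
(define (count-matches pattern text)
  (let ((search (make-search-proc char=? pattern)))
    (let loop ((i 0) (count 0))
      (if (>= i (string-length text))
          count
          (loop (+ i 1)
                (if (search (string-ref text i)) (+ count 1) count))))))
```

With these definitions, (count-matches "aa" "aaaa") gives 3. The price
of the simpler calling convention is that each search procedure carries
mutable state, so it cannot be shared between concurrent searches.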

[1] Sun Wu, Udi Manber: "Fast text searching allowing errors",
Communications of the ACM, October 1992.