Substring indices everywhere? Eating a cake without destroying it
oleg@xxxxxx
31 Dec 1999 21:26 UTC
As the recent discussion (in particular, Tom Lord's message) indicated, the pervasive use of substring indices in SRFI-13 is controversial. There appears however to be a way to have the best of both approaches.

Consider procedures

        string=          SMTH1 SMTH2
        string-pad       SMTH k [char]
        string-prefix?   SMTH1 SMTH2
        string-tokenize  SMTH [token-set]
        string->number   SMTH [base]

etc.

SMTH may be a string value. In that case, string= is equivalent to R5RS string=?; the meaning of other procedures is obvious. The argument list is simple and concise.

However, SMTH may also be a form:

        (XS>< STR BEG-INDEX)
or
        (XS>< STR BEG-INDEX END-INDEX)

where END-INDEX is assumed to be (string-length STR) if omitted.

Thus we can write:

        (let ((str "foobar") (foo "foo"))
          (display (string= str foo))
          (display (string= (XS>< str 0 3) foo))
          (display (string= (XS>< str 3) (XS>< foo 0))))

        (string->number (XS>< "$12345.99" 1))

etc.

What exactly is the XS>< form? It's up to an implementation.

One Scheme system may choose to implement (XS>< str ind1 ind2) as (substring str ind1 ind2). This is the easiest (albeit not very efficient) approach. In this case, (XS>< str ind1 ind2) is a real string, so we can use R5RS string->number, string=?, etc. procedures as they are.

(XS>< str ind1 ind2) may also be a shared substring, should a particular Scheme system support such things.

(XS>< str ind1 ind2) may also be a lazy substring, implemented as (vector 'lazy-subst-tag str ind1 ind2) or with records, or even as a distinct datatype.

Note that I have NOT said that (XS>< str ind1 ind2) is a procedure and its result is a first-class value. I don't want to commit that (XS>< str ind1 ind2) may meaningfully be used outside of string functions. The only promise I'd like to make is that (XS>< str ind1 ind2) _denotes_ a substring when used within a string function. It appears that such a limited promise makes even a shared substring implementation of the XS>< form transparent to the user.
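To make the lazy-substring option concrete, here is a minimal sketch of one way it might look. It assumes XS>< is realized as an ordinary procedure that builds the tagged vector mentioned above, which is only one of the permitted choices. The helper names xs? and with-span are hypothetical and introduced purely for illustration; string= here is a toy definition, not the SRFI-13 one.

```scheme
;; Sketch only: XS>< as a lazy substring descriptor (a tagged vector).
;; The post deliberately leaves the representation to the implementation;
;; this is one possible choice, not a proposed definition.
(define lazy-subst-tag 'lazy-subst-tag)

;; (XS>< str beg) or (XS>< str beg end); end defaults to (string-length str).
(define (XS>< str beg . maybe-end)
  (let ((end (if (null? maybe-end) (string-length str) (car maybe-end))))
    (vector lazy-subst-tag str beg end)))

;; Hypothetical helper: is x a descriptor built by XS>< ?
(define (xs? x)
  (and (vector? x)
       (= (vector-length x) 4)
       (eq? (vector-ref x 0) lazy-subst-tag)))

;; Hypothetical helper: pass the underlying string and its [beg, end)
;; bounds to k, whether smth is a plain string or a descriptor.
(define (with-span smth k)
  (if (xs? smth)
      (k (vector-ref smth 1) (vector-ref smth 2) (vector-ref smth 3))
      (k smth 0 (string-length smth))))

;; Toy string= accepting a string or an XS>< descriptor in either position.
;; It compares characters in place and never copies a substring.
(define (string= smth1 smth2)
  (with-span smth1
    (lambda (s1 b1 e1)
      (with-span smth2
        (lambda (s2 b2 e2)
          (and (= (- e1 b1) (- e2 b2))
               (let loop ((i b1) (j b2))
                 (or (= i e1)
                     (and (char=? (string-ref s1 i) (string-ref s2 j))
                          (loop (+ i 1) (+ j 1)))))))))))

(string= "foobar" "foo")                          ; => #f
(string= (XS>< "foobar" 0 3) "foo")               ; => #t
(string= (XS>< "foobar" 3) (XS>< "foo" 0))        ; => #f
```

Only string= knows about the descriptor here, so the representation could be swapped for a shared substring or a plain (substring ...) call without changing any caller.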
The XS>< form seems to answer all the concerns Tom Lord had about pervasiveness of substring indices. At the same time it preserves the spirit of Olin's library.

Happy Y2K!

PS. It may make sense to allow indices in the XS>< form to take negative values as well. If an index is negative, the length of the string should be added to it implicitly. For example, (XS>< str -3) would mean the last three characters of str. This convention is supported in Perl and Python, and appears rather useful.
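The negative-index convention can be sketched as follows. This assumes the simplest strategy from above, where XS>< just calls substring, and it assumes XS>< is a procedure. The helper normalize-index is a hypothetical name, and no bounds checking is attempted.

```scheme
;; Sketch only: XS>< via substring, with Perl/Python-style negative indices.
;; normalize-index is a hypothetical helper, not part of any proposal.
(define (normalize-index str idx)
  ;; A negative index counts from the end: add the string length to it.
  (if (negative? idx)
      (+ (string-length str) idx)
      idx))

;; (XS>< str beg) or (XS>< str beg end); end defaults to (string-length str).
(define (XS>< str beg . maybe-end)
  (let ((b (normalize-index str beg))
        (e (if (null? maybe-end)
               (string-length str)
               (normalize-index str (car maybe-end)))))
    (substring str b e)))

(XS>< "foobar" -3)      ; => "bar"    (the last three characters)
(XS>< "foobar" 0 -1)    ; => "fooba"  (everything but the last character)
(XS>< "foobar" 3)       ; => "bar"    (non-negative indices are unchanged)
```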