Re: Substring indices everywhere? and the XS>< form oleg@xxxxxx 03 Jan 2000 19:41 UTC

There are two major differences between 'substring' and the proposed
XS>< _notation_. For one thing, 'substring' function's interface is
written in stone: (substring str i j). All three arguments are
required, furthermore, (<= 0 i j (string-length str)) must be #t. XS><
is not bound by this interface; it can accept fewer arguments and
negative indices.

For another thing, substring is codified to be a procedure whose
result is a first-class value. No such promises have been made about
the XS>< notation.

One can envision the following three implementations:

System A:
	(define (XS>< str i . j-opt)
	    ((null? j-opt)
	      (substring str
		(if (negative? i) (+ i (string-length str)) i)
		(string-length str)))

with substring and string->number as defined in R5RS.

System B:
	the same as system A but with substring/shared

System C:
	(define-syntax string->number
	   (syntax-rules (XS><)
	     ((string->number (XS>< arg ...))
		(substring->number arg ...))
	     ((string->number arg ...) (r5rs-string->number arg ...))))

(this is just the outline. A Scheme compiler may notice the XS><
notation and perform the above conversion without actually requiring
string->number be a syntax. string->number may remain a normal

Any of these three systems permit an expression
	(string->number (XS>< "$12345.99" 1))

which would return exactly the same result. The user thus does not
care what XS>< really is. A particular Scheme system is therefore free
to implement XS>< in any way it thinks fit -- with inlining, shared
substrings, lazy substrings, etc. User's code however remains
portable across the three implementations.

Again, the code with XS>< notation can be easily moved to a system
that implements XS>< via shared substrings. This IMHO addresses the
following Tom Lord's concern:
> This makes it awkward to incorporate the code such programmers write
> into systems which support shared substrings.

The XS>< notation also effectively removes substring indices from all
string library procedures. Again this addresses Tom Lord's and other
people's concern.

Furthermore, as XS>< notation is not guaranteed to be useful outside
string functions, implementing it through shared substrings appears
safer (for example, an implementation does not have to add code to do
a copy-on-write when a parent string is modified). Hence XS>< may be
implemented via a lighter version of substring/shared.

The purpose of the XS>< notation is to raise the level of abstraction,
by introducing a _notational_ indirection. The hope is to guarantee
portability of user code and to impose as few restraints on
implementations as possible. The decision as to whether a SRFI-13-
compliant system should support shared substrings is IMHO best left to
implementors. BTW, I too consider the 'XS><' name to be ugly, but I
couldn't think up anything cuter.