Email list hosting service & mailing list manager

Comments on SRFI-13 reference implementation Brad Lucier (16 Nov 2000 20:05 UTC)
Re: Comments on SRFI-13 reference implementation shivers@xxxxxx (16 Dec 2000 02:42 UTC)
Re: Comments on SRFI-13 reference implementation Brad Lucier (16 Dec 2000 03:16 UTC)
Re: Comments on SRFI-13 reference implementation Brad Lucier (16 Dec 2000 04:55 UTC)
list of alphas shivers@xxxxxx (16 Dec 2000 16:05 UTC)

Re: Comments on SRFI-13 reference implementation Brad Lucier 16 Dec 2000 03:16 UTC

>    string-index, string-index-right, string-skip, string-skip-right, string-count:
>    Add internal routines that do the work but no error checking.  Call the
>    string arg s for consistency.
>
> There are no module-internal calls to STRING-COUNT.
>
> Adding internal routines for string-index & string-index-right would only
> speed up string-titlecase & string-tokenize... but would slow down other
> code's use of string-index by the cost of the procedure call from string-index
> to the internal %string-index

Well, it's a tail call from string-index to %string-index with the same arguments,
so in Gambit it's a jump.   Doing all the the optional argument parsing and
checking that the arguments are reasonable takes a lot longer than this.
(Although in Gambit, optional arguments are quite fast.)  For example, my
checking code is

(##define-macro (macro-check-string-start-end s start end form expr)
  (define (flat x) (if (pair? x) (cons (car x) (flat (cdr x))) (list x)))
  `(if (##string? ,s)
       (let ((,end (if (absent-object? ,end) (string-length ,s) ,end)))
         (if (##fixnum? ,start)
             (if (##fixnum.>= ,start 0)
                 (if (##fixnum? ,end)
                     (if (and (##fixnum.<= ,start ,end)
                              (##fixnum.<= ,end (string-length ,s)))
                         ,expr
                         ,(if (list? form)
                              `(##trap-check-range ',(car form) ,@(cdr form))
                              `(##trap-check-range* ',(car form) ,@(flat (cdr form)))))
                     (if (##bignum? ,end)
                         ,(if (list? form)
                              `(##trap-check-range ',(car form) ,@(cdr form))
                              `(##trap-check-range* ',(car form) ,@(flat (cdr form))))
                         ,(if (list? form)
                              `(##trap-check-exact-int ',(car form) ,@(cdr form))
                              `(##trap-check-exact-int* ',(car form) ,@(flat (cdr form))))))
                 ,(if (list? form)
                      `(##trap-check-range ',(car form) ,@(cdr form))
                      `(##trap-check-range* ',(car form) ,@(flat (cdr form)))))
             (if (##bignum? ,start)
                 ,(if (list? form)
                      `(##trap-check-range ',(car form) ,@(cdr form))
                      `(##trap-check-range* ',(car form) ,@(flat (cdr form))))
                 ,(if (list? form)
                      `(##trap-check-exact-int ',(car form) ,@(cdr form))
                      `(##trap-check-exact-int* ',(car form) ,@(flat (cdr form)))))))
       ,(if (list? form)
            `(##trap-check-string ',(car form) ,@(cdr form))
            `(##trap-check-string* ',(car form) ,@(flat (cdr form))))))

in

(define (string-index s criterion #!optional (start 0) (end (macro-absent-object)))
  (macro-check-string-start-end s start end (string-index criterion s start end)
   (##string-index s criterion start end)))

Each individual check is fast (a few hardware instructions at most), but there are
7 of them, 8 if you have to get end from (string-length s).  So I'd rather do the
extra jump for string-index than the extra checks for the routines that use it.

>    string-concatenate/shared, string-concatenate, string-join:
>    Need to check strings.
>
> There's no cheap way to check an alpha list to ensure that all elements
> are alphas. So I don't check these.

Well, my version reads

(##define-macro (##every? proc lst)
  `(let ((proc ,proc))
     (let loop ((lst ,lst))
       (or (null? lst)
           (and (proc (car lst))
                (loop (cdr lst)))))))

(define (string-concatenate/shared strings)
  (if (list? strings)
      (if (##every? string? strings)
          (let lp ((strings strings) (nchars 0) (first #f))
            (cond ((pair? strings)                      ; Scan the args, add up total
                   (let* ((string  (car strings))       ; length, remember 1st
                          (tail (cdr strings))          ; non-empty string.
                          (slen (string-length string)))
                     (if (zero? slen)
                         (lp tail nchars first)
                         (lp tail (+ nchars slen) (or first strings)))))

                  ((zero? nchars) "")

                  ;; Just one non-empty string! Return it.
                  ((= nchars (string-length (car first))) (car first))

                  (else (let ((ans (make-string nchars)))
                          (let lp ((strings first) (i 0))
                            (if (pair? strings)
                                (let* ((s (car strings))
                                       (slen (string-length s)))
                                  (##string-copy! ans i s 0 slen)
                                  (lp (cdr strings) (+ i slen)))))
                          ans))))
          (error "Not every element of strings is a string: (string-concatenate/shared " strings ")"))
      (error "strings is not a list: (string-concatenate/shared " strings ")")))

Considering the amount of stuff you actually do with the contents of these strings,
and how the routine is going to blow up if it isn't a list or if all the
entries are not strings, I'd check it.  I suppose since you're walking the list
anyway in lp, you could check it inside the cond and get rid of the extra two
traversals of the list (for list? and ##every?).

Brad