numbers; efficiency Aluminum Petrofsky 18 Dec 2002 09:48 UTC

> From: bear <xxxxxx@sonic.net>

> But at third glance, (eq?) objects behave just like (eqv?) objects in
> the absence of assignment - and there is no way in R5RS to assign to a
> number or a character without changing its identity and thereby
> terminating the (eq?) relationship -- without a side effect on
> whatever it formerly had an (eq?) relationship with.

Uh, there is no way in r5rs to assign to a number or character at
all.  You can assign a new value to a location that previously held a
number, but that's neither here nor there.

> Well, my intent was to recover (eq?) relationships when reading
> anything that had been written.

R5rs numerical eq?-ity is much too ephemeral to survive a write-read
cycle.  Note this example from the standard:

     (let ((n (+ 2 3))) (eq? n n))  =>  unspecified

It's also true that:

     (let ((n 1)) (eq? (eq? n n) (eq? n n)))  =>  unspecified

I think what you really want is just for shared locations to be
shared.  Other than procedures (for which external representation is
quite problematic), the three types of r5rs data that represent
locations are pairs, nonempty vectors, and nonempty strings.

> So it appears that there may be situations in which numbers may be
> (eqv?) without being (eq?) in some schemes.  (What scheme are you
> using, marc?  I've never had this happen and didn't know it was
> possible!)

It happens frequently with bignums, and with floating-point if
floating-point numbers don't fit in a word.

  (define p (cons (expt 10 100) (expt 10 100)))

  (eqv? (car p) (cdr p))  =>  #t
  (eq?  (car p) (cdr p))  =>  unspecified, usually #f.

> From: Sven Hartrumpf <xxxxxx@FernUni-Hagen.de>
>
> I would like to see the reference implementation to be improved for efficiency.
> Yes, I know that efficiency is not a requirement for SRFI reference
> implementions.  But many Scheme implementors just copy the reference
> implementation without modifications in order to save time - which is
> understandable but bad for many users.

The big efficiency issue here is that any portable implementation will
take quadratic time, but if you are the system implementor you should
be able to provide an O(n log n) version.  If all the locations are
mutable, you can write an O(n) version that destructively marks
locations as it goes and then restores them all when done.  If
locations are immutable, then there should be some fixed ordering of
them that you can use to make an O(log n) lookup table, giving you an
O(n log n) write-showing-shared.  R5rs does not give the programmer
access to mutability information nor to comparison of constant data's
addresses, but both of these are trivial operations if you have access
to the system's internals.

(Hmm, that destructive marking idea will actually only work for pairs
and vectors, not strings ... but I think if you have access to the
allocator it shouldn't be too hard to work something out.)

We can certainly improve the portable implementation, however.  As
Bear said, the code in the reference implementation was written to
implement a different function -- one that was purely functional and
returned a string.  It has to do extra work to accommodate the lack of
write-to-string functions in r5rs.  To implement the imperative
write-showing-shared procedure of the SRFI is more straightforward:

  ;; Warning: only very lightly tested.
  (define (write-showing-shared obj)
    (define (acons key val alist)
      (cons (cons key val) alist))
    ;; We only track duplicates of pairs, vectors, and strings.  We
    ;; ignore zero-length vectors and strings because r5rs doesn't
    ;; guarantee that eq? treats them sanely (and they aren't very
    ;; interesting anyway).
    (define (interesting? obj)
      (or (pair? obj)
	  (and (vector? obj) (not (zero? (vector-length obj))))
	  (and (string? obj) (not (zero? (string-length obj))))))
    ;; (write-obj OBJ ALIST):
    ;; ALIST has an entry for each interesting part of OBJ.  The
    ;; associated value will be:
    ;;  -- a number if the part has been given one,
    ;;  -- #t if the part will need to be assigned a number but has not been yet,
    ;;  -- #f if the part will not need a number.
    ;; The cdr of ALIST's first element should be the most recently
    ;; assigned number.
    ;; Returns an alist with new shadowing entries for any parts that
    ;; had numbers assigned.
    (define (write-obj obj alist)
      (define (write-interesting alist)
	(cond ((pair? obj)
	       (display "(")
	       (let write-cdr ((obj (cdr obj)) (alist (write-obj (car obj) alist)))
		 (cond ((and (pair? obj) (not (cdr (assq obj alist))))
			(display " ")
			(write-cdr (cdr obj) (write-obj (car obj) alist)))
		       ((null? obj)
			(display ")")
			alist)
		       (else
			(display " . ")
			(let ((alist (write-obj obj alist)))
			  (display ")")
			  alist)))))
	      ((vector? obj)
	       (display "#(")
	       (let ((len (vector-length obj)))
		 (let write-vec ((i 1) (alist (write-obj (vector-ref obj 0) alist)))
		   (cond ((= i len) (display ")") alist)
			 (else (display " ")
			       (write-vec (+ i 1)
					  (write-obj (vector-ref obj i) alist)))))))
	      ;; else it's a string
	      (else (write obj) alist)))
      (cond ((interesting? obj)
	     (let ((val (cdr (assq obj alist))))
	       (cond ((not val) (write-interesting alist))
		     ((number? val) (display "#") (write val) (display "#") alist)
		     (else
		      (let ((n (+ 1 (cdar alist))))
			(display "#") (write n) (display "=")
			(write-interesting (acons obj n alist)))))))
	    (else (write obj) alist)))
    ;; Scan computes the initial value of the alist, which maps each
    ;; interesting part of the object to #t if it occurs multiple times,
    ;; #f if only once.
    (define (scan obj alist)
      (cond ((not (interesting? obj)) alist)
	    ((assq obj alist)
             => (lambda (p) (if (cdr p) alist (acons obj #t alist))))
	    (else
	     (let ((alist (acons obj #f alist)))
	       (cond ((pair? obj) (scan (car obj) (scan (cdr obj) alist)))
		     ((vector? obj)
		      (let ((len (vector-length obj)))
			(do ((i 0 (+ 1 i))
			     (alist alist (scan (vector-ref obj i) alist)))
			    ((= i len) alist))))
		     (else alist))))))
    (write-obj obj (acons 'dummy 0 (scan obj '())))
    ;; We don't want to return the big alist that write-obj just returned.
    (if #f #f))

-al