Re: LIST-LENGTH & circular lists
Noah Friedman 22 Apr 2001 23:27 UTC
Sorry to be bringing up such an old subject, but I ran across this thread
as the result of a web search and couldn't help replying.
Before quoting the following, let me remind you that Harvey Stein proposed
that the "generalized length" of a list should be defined as the number of
elements, not the number of times you could apply cdr to the list.
On Feb 22 1999, Doug Currie wrote the following:
>At 9:35 PM +0200 2/22/99, Harvey J. Stein wrote:
>>...
>>However, I don't think a generalized length is such a good idea. It
>>implies a substantial amount of additional overhead on computing the
>>length of a list. This overhead remains no matter what you return for
>>circular lists - it's the act of detecting them which adds the
>>overhead.
>
>Not necessarily. The length function must compute a count. At some point
>this count will exceed a (implementation dependent) limit, e.g., the size
>of the heap, when it becomes clear that the list is circular. For
>non-circular lists this adds little overhead, just one integer compare per
>iteration (in a clever low level implementation, it may add no overhead if
>it's combined with the detection of fixnum overflow to switch to bignum
>addition). For circular lists, this may take a while to detect, but on
>today's fast processors with lists that fit in cache it can be quite
>competitive with more complex approaches that allocate memory or use N*N*N
>algorithms. Using these counting techniques a list-length can be provided
>competitive in performance with a non-circular-list-detecting version.
>However, this approach doesn't provide a result for circular list more
>useful than 0 or #f.
I think you can compute the number of distinct elements in a circular list
in linear time and constant space without resorting to any
implemented-defined behavior of the runtime environment.
I defined the function below in MIT Scheme and evaluated the following:
(define l '(1 2 3 4 5 6 7))
(set-cdr! (list-tail l 6) (list-tail l 3))
(circular-list-length l)
;Value: 7
(circular-list-length '(1 2 3))
;Value: 3
(circular-list-length '())
;Value: 0
Here's the function definition, which works for non-circular lists as well.
(define (circular-list-length lst)
"Return the number of distinct elements of circular list LST."
(let ((tortoise lst)
(hare lst)
(tortoise-advance #t)
(len 0))
;; Find a member of the list guaranteed to be within the cycle, and
;; compute length if list turns out to be non-circular.
(do ()
((null? hare))
(set! hare (cdr hare))
(set! len (1+ len))
(set! tortoise-advance (not tortoise-advance))
(if tortoise-advance
(set! tortoise (cdr tortoise)))
(if (eq? hare tortoise)
(begin
(set! hare '())
(set! len 0))))
(if (and (not (null? lst))
(zero? len))
(begin
;; Find period of cycle.
(set! hare (cdr tortoise))
(set! len 1)
(do ()
((eq? hare tortoise))
(set! hare (cdr hare))
(set! len (1+ len)))
;; Give hare a head start from the start of the list equal to the
;; loop size. If both move at the same speed they must meet at
;; the nexus because they are in phase, i.e. when tortoise enters
;; the loop, hare must still be exactly one loop period
;; ahead--but that means it will be pointing at the same list
;; element.
(set! tortoise lst)
(set! hare (list-tail lst len))
(do ()
((eq? tortoise hare))
(set! hare (cdr hare))
(set! tortoise (cdr tortoise))
(set! len (1+ len)))))
len))