Re: LIST-LENGTH & circular lists

Re: LIST-LENGTH & circular lists Noah Friedman 22 Apr 2001 23:27 UTC

Sorry to be bringing up such an old subject, but I ran across this thread
as the result of a web search and couldn't help replying.

Before quoting the following, let me remind you that Harvey Stein proposed
that the "generalized length" of a list should be defined as the number of
elements, not the number of times you could apply cdr to the list.

On Feb 22 1999, Doug Currie wrote the following:
>At 9:35 PM +0200 2/22/99, Harvey J. Stein wrote:
>>...
>>However, I don't think a generalized length is such a good idea.  It
>>implies a substantial amount of additional overhead on computing the
>>length of a list.  This overhead remains no matter what you return for
>>circular lists - it's the act of detecting them which adds the
>>overhead.
>
>Not necessarily. The length function must compute a count. At some point
>this count will exceed a (implementation dependent) limit, e.g., the size
>of the heap, when it becomes clear that the list is circular. For
>non-circular lists this adds little overhead, just one integer compare per
>iteration (in a clever low level implementation, it may add no overhead if
>it's combined with the detection of fixnum overflow to switch to bignum
>addition). For circular lists, this may take a while to detect, but on
>today's fast processors with lists that fit in cache it can be quite
>competitive with more complex approaches that allocate memory or use N*N*N
>algorithms. Using these counting techniques a list-length can be provided
>competitive in performance with a non-circular-list-detecting version.
>However, this approach doesn't provide a result for circular list more
>useful than 0 or #f.

I think you can compute the number of distinct elements in a circular list
in linear time and constant space without resorting to any
implemented-defined behavior of the runtime environment.

I defined the function below in MIT Scheme and evaluated the following:

    (define l '(1 2 3 4 5 6 7))
    (set-cdr! (list-tail l 6) (list-tail l 3))

    (circular-list-length l)
    ;Value: 7

    (circular-list-length '(1 2 3))
    ;Value: 3

    (circular-list-length '())
    ;Value: 0

Here's the function definition, which works for non-circular lists as well.

(define (circular-list-length lst)
  "Return the number of distinct elements of circular list LST."
  (let ((tortoise lst)
        (hare lst)
        (tortoise-advance #t)
        (len 0))
    ;; Find a member of the list guaranteed to be within the cycle, and
    ;; compute length if list turns out to be non-circular.
    (do ()
        ((null? hare))
      (set! hare (cdr hare))
      (set! len  (1+ len))
      (set! tortoise-advance (not tortoise-advance))
      (if tortoise-advance
           (set! tortoise (cdr tortoise)))
      (if (eq? hare tortoise)
          (begin
            (set! hare '())
            (set! len  0))))

    (if (and (not (null? lst))
             (zero? len))
        (begin
          ;; Find period of cycle.
          (set! hare (cdr tortoise))
          (set! len 1)
          (do ()
              ((eq? hare tortoise))
            (set! hare (cdr hare))
            (set! len (1+ len)))

          ;; Give hare a head start from the start of the list equal to the
          ;; loop size.  If both move at the same speed they must meet at
          ;; the nexus because they are in phase, i.e. when tortoise enters
          ;; the loop, hare must still be exactly one loop period
          ;; ahead--but that means it will be pointing at the same list
          ;; element.
          (set! tortoise lst)
          (set! hare (list-tail lst len))
          (do ()
              ((eq? tortoise hare))
            (set! hare (cdr hare))
            (set! tortoise (cdr tortoise))
            (set! len (1+ len)))))
    len))