Shortcomings of the API Marc Feeley (14 Oct 2023 19:05 UTC)
Re: Shortcomings of the API John Cowan (14 Oct 2023 21:14 UTC)
Re: Shortcomings of the API Marc Nieper-Wißkirchen (14 Oct 2023 21:41 UTC)
Re: Shortcomings of the API Marc Feeley (15 Oct 2023 01:09 UTC)

Shortcomings of the API Marc Feeley 14 Oct 2023 19:04 UTC

I’m not particularly fond of the proposed API.  Abstractly, a guardian is an interface to the garbage collector.  At any point in time the guardian can be queried to determine if some object registered with the guardian has become “unreachable” (side note: I prefer the term “not strongly reachable”, which makes clear what kind of reachability we are talking about, because the object is still “weakly reachable” through the guardian).

The main problems with guardians are:

1) They use a custom protocol for iterating over the registered objects that are currently not strongly reachable.  This protocol is yet another way to iterate over a sequence (there are already lists, generators, ports, etc.).  It would be nicer to use one of the existing protocols so that familiar iteration patterns can be reused with guardians.

2) They are passive, which creates a scalability issue.  To implement a finalization mechanism, some thread of execution needs to “poll” the guardian once in a while to process the newly not strongly reachable objects.  An alternative is to use a mechanism (not proposed by this SRFI but available with Chez Scheme) to be notified of a garbage collection in order to check the new state of the guardian(s) and proceed with the required finalizations.  Note that this still requires polling, but only when a GC notification is received.  Either way, this does not scale to large numbers of guardians.
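To make the polling pattern in point 2 concrete, here is a minimal sketch.  It assumes a procedure-style guardian that returns #f when nothing is ready (as Chez Scheme guardians do); `drain-guardian!` and `finalize!` are hypothetical names used only for illustration, and the Chez Scheme wiring is left in comments because it is implementation specific.

```scheme
;; Drain a procedure-style guardian: call it until it reports that no
;; more objects are ready, finalizing each one.  Returns how many
;; objects were finalized.
;; Assumption: (g) returns #f when nothing is ready, so an object that
;; is itself #f cannot be distinguished from "nothing ready".
(define (drain-guardian! g finalize!)
  (let loop ((count 0))
    (let ((obj (g)))
      (if obj
          (begin
            (finalize! obj)
            (loop (+ count 1)))
          count))))

;; Possible Chez Scheme wiring (kept as a comment, not portable):
;;
;; (define g (make-guardian))
;; (collect-request-handler
;;   (lambda ()
;;     (collect)                         ; perform the collection itself
;;     (drain-guardian! g finalize!)))   ; then poll the guardian
```

With many guardians the handler would have to call `drain-guardian!` on each of them after every collection, whether or not anything became available, which is the cost described above.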

A better API would be to view a guardian as a stream of objects which represents the order in which the registered objects have been discovered to be not strongly reachable by the GC.  For simplicity, let’s just say a guardian is a port.  Then a thread could be given the responsibility of reading the next object from the guardian, in a loop, to process the required finalization of these objects.  Essentially:

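  (define g (make-guardian))

  (thread-start! ;; start a “finalization” thread
    (make-thread ;; SRFI 18: thread-start! expects a thread object
      (lambda ()
        (let loop ()
          (let ((obj (read g))) ;; get next not strongly reachable object from guardian (block if there is none)
            (finalize! obj)     ;; finalize it (finalize! is application-defined)
            (loop))))))         ;; then wait for the next one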

The important point here is that the garbage collector uses the guardian as a mechanism to notify the finalization thread, that the collector and the finalization thread operate asynchronously, and that polling is no longer needed.

To me this is a cleaner API because it corresponds with the reality that the GC and the main program are separate threads (indeed this is consistent with the “collector process” and “mutator process” vocabulary used when talking about garbage collection algorithms).

A variation on this API would be to add an operation on the guardian that blocks the calling thread until there is at least one available not strongly reachable object for that guardian.  Unfortunately the proposed representation of guardians as procedures does not make it easy or elegant to do this.  For this reason guardians should be their own type such that (make-guardian) returns this type and the operations on a guardian are done with specific procedures:

(guardian-register guardian obj [rep])  ;; equivalent to the proposed (guardian obj [rep]) operation
(guardian-unregister guardian)          ;; equivalent to the proposed (unregister-guardian guardian)
(guardian-next guardian)                ;; equivalent to the proposed (guardian) operation
(guardian-wait guardian)                ;; new operation that blocks the calling thread until there is at least one not strongly reachable object

This makes it easier to add operations if needed in the future.

This set of operations allows a program to use a polling approach if that’s appropriate (for example, with a small number of guardians), or one based on notification (where the finalization thread blocks when it has nothing to do).
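As a rough illustration of how the two styles could coexist, here is a sketch of the user-visible side of such a guardian type.  It assumes R7RS records and SRFI 18 threads.  `guardian-deliver!` is a hypothetical stand-in for the collector handing over an object, and `start-finalizer!` and `finalize!` are likewise illustrative names.  `guardian-register` is omitted because real registration needs weak references and cooperation from the GC.

```scheme
;; Sketch only.  The import spelling varies between implementations.
(import (scheme base) (srfi 18))

(define-record-type <guardian>
  (%make-guardian ready mutex condvar)
  guardian?
  (ready guardian-ready set-guardian-ready!) ; delivered objects, oldest first
  (mutex guardian-mutex)
  (condvar guardian-condvar))

(define (make-guardian)
  (%make-guardian '() (make-mutex) (make-condition-variable)))

;; Hypothetical: what the collector would do when it discovers that a
;; registered object is no longer strongly reachable.
(define (guardian-deliver! g obj)
  (mutex-lock! (guardian-mutex g))
  (set-guardian-ready! g (append (guardian-ready g) (list obj)))
  (condition-variable-broadcast! (guardian-condvar g)) ; wake all waiters
  (mutex-unlock! (guardian-mutex g)))

;; Polling operation: return the next ready object, or #f if none.
(define (guardian-next g)
  (mutex-lock! (guardian-mutex g))
  (let* ((ready (guardian-ready g))
         (obj (if (null? ready) #f (car ready))))
    (if (pair? ready)
        (set-guardian-ready! g (cdr ready)))
    (mutex-unlock! (guardian-mutex g))
    obj))

;; Notification operation: block until at least one object is ready.
(define (guardian-wait g)
  (mutex-lock! (guardian-mutex g))
  (let loop ()
    (if (null? (guardian-ready g))
        (begin
          ;; Atomically release the mutex and sleep on the condition
          ;; variable; the mutex is not held on wakeup, so relock it.
          (mutex-unlock! (guardian-mutex g) (guardian-condvar g))
          (mutex-lock! (guardian-mutex g))
          (loop))
        (mutex-unlock! (guardian-mutex g)))))

;; Notification style: a finalization thread that sleeps when idle.
(define (start-finalizer! g finalize!)
  (thread-start!
    (make-thread
      (lambda ()
        (let loop ()
          (guardian-wait g)
          (let ((obj (guardian-next g)))
            (if obj (finalize! obj))) ; another thread may have taken it
          (loop))))))
```

A polling client would simply call `guardian-next` whenever convenient and ignore `guardian-wait`; a notification-based client would call `(start-finalizer! g finalize!)` once per guardian and do no polling at all.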