> From: Jim Blandy <xxxxxx@redhat.com>
> Yes, the EXTRACT issues aren't critical. But the thread-related
> problems with GCPRO that I don't see how to solve are those
> created by the user's compiler rearranging code that operates
> directly on heap references. The compiler is free to make
> copies of heap references in registers where a copying GC can't
> find them to update them.
We agree about the problems caused by C semantics and, in broad strokes,
about the compiler-taming tricks needed to solve them -- my solution
differs from yours (and JNI's) in that it doesn't require reference
objects to be explicitly heap allocated and freed. Loosely speaking,
they are instead stack allocated. (Yes, I suspect you are thinking of
making a tiny stack on the heap to allocate the local mn_refs of a
given call -- but keep reading.)
In the GCPRO system I'm proposing, all parameters are passed by
reference (by a `scheme_value *'); all return values are returned via
output parameters (again a `scheme_value *'); and local assignment is
done via a macro that can impose a write barrier, rather than with C's
assignment operator. In those regards, it very much resembles the
mn_refs idea. (Though changing the value of an mn_ref seems to be
something one is less likely to do in your system.)
One difference between our proposals concerns the lifetimes of
variables. Local mn_refs seem to live until some outermost call
returns, unless code explicitly creates and then destroys a new
mn_call. My approach controls variable lifetimes with GCPRO-style
calls, giving them lifetimes that coincide with C stack frame
lifetimes.
Note that I haven't made any proposal about what you call `global
mn_refs'. I am planning on having an interface for allocating arrays
of GC roots to which C data structures can refer.
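For illustration, such a root-array interface might look like the
following minimal sketch. All of the names here (scm_alloc_roots,
scm_free_roots, the registry) are assumptions of mine, not part of the
proposal, and the real interface would presumably also take the
`scheme_instance' arena:

```c
#include <stddef.h>
#include <stdlib.h>

typedef void *scheme_value;

/* Hypothetical registry entry: a block of scheme_value slots that the
   collector treats as roots for as long as the block is registered.  */
struct scm_root_block
{
  struct scm_root_block *next;
  size_t n;
  scheme_value values[];
};

static struct scm_root_block *scm_root_blocks = NULL;

/* Allocate and register N root slots, initialized to null.  */
scheme_value *
scm_alloc_roots (size_t n)
{
  struct scm_root_block *b =
    calloc (1, sizeof *b + n * sizeof (scheme_value));
  b->n = n;
  b->next = scm_root_blocks;
  scm_root_blocks = b;
  return b->values;
}

/* Unregister and free the block whose slots start at ROOTS.  */
void
scm_free_roots (scheme_value *roots)
{
  for (struct scm_root_block **p = &scm_root_blocks; *p; p = &(*p)->next)
    if ((*p)->values == roots)
      {
        struct scm_root_block *dead = *p;
        *p = dead->next;
        free (dead);
        return;
      }
}

/* A collector's root scan would visit every registered block's slots.  */
size_t
scm_count_root_slots (void)
{
  size_t n = 0;
  for (struct scm_root_block *b = scm_root_blocks; b; b = b->next)
    n += b->n;
  return n;
}
```

C data structures would then store scheme_values only in registered
slots, so the collector can always find (and update) them.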
A simple example:

    /* scm_cons1 (result, arena, a)
     *
     * Return a new pair of the form ( *a . () )
     *
     * Equivalent to (lambda (a) (cons a '()))
     */
    void
    scm_cons1 (scheme_value *result, scheme_instance arena, scheme_value *a)
    {
      struct cons1_locals
      {
        SCM_FRAME;
        scheme_value nil;
      } l;

      SCM_PROTECT_FRAME (l);

      SCHEME_MAKE_NIL (&l.nil, arena);
      SCHEME_CONS (result, arena, a, &l.nil);

      SCM_UNPROTECT_FRAME (l);
    }
The parameter, `*a', is protected by the caller. `l.nil' is protected
because SCM_PROTECT_FRAME has made it visible to GC and because its
address, not its value, is passed to the primitives `scm_make_nil' and
`scm_cons'. The value stored in `l.nil' is protected by `scm_cons1'
before `scm_make_nil' returns. The value stored in `*result' is
protected by the caller of `scm_cons1' before `scm_cons' returns to
`scm_cons1'. (If interprocedural optimization is allowed to screw
this up, I'd like to know exactly how and why....)
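For what it's worth, here is one way SCM_FRAME / SCM_PROTECT_FRAME /
SCM_UNPROTECT_FRAME could be implemented, as a minimal single-threaded
sketch. The layout -- a per-thread chain of frame headers with the
scheme_value slots sitting right after each header -- is my assumption,
not a spec:

```c
#include <stddef.h>

typedef void *scheme_value;

/* Hypothetical frame header: a link in a chain of protected frames
   plus a count of the scheme_value slots that sit immediately after
   it in the locals struct.  */
struct scm_frame
{
  struct scm_frame *prev;
  size_t n_values;
};

/* Per-thread (here simply global) head of the chain of live frames;
   a collector walks this chain to find every protected slot.  */
static struct scm_frame *scm_frame_chain = NULL;

#define SCM_FRAME struct scm_frame frame_header

/* Push this frame onto the chain.  Assumes no padding between the
   header and the scheme_value slots that follow it.  */
#define SCM_PROTECT_FRAME(locals)                                       \
  do                                                                    \
    {                                                                   \
      (locals).frame_header.prev = scm_frame_chain;                     \
      (locals).frame_header.n_values =                                  \
        (sizeof (locals) - sizeof (struct scm_frame))                   \
        / sizeof (scheme_value);                                        \
      scm_frame_chain = &(locals).frame_header;                         \
    }                                                                   \
  while (0)

/* Pop it again; lifetimes therefore nest with C stack frames.  */
#define SCM_UNPROTECT_FRAME(locals)                                     \
  do { scm_frame_chain = (locals).frame_header.prev; } while (0)

/* What a collector's root scan would see.  */
static size_t
count_protected_values (void)
{
  size_t n = 0;
  for (struct scm_frame *f = scm_frame_chain; f != NULL; f = f->prev)
    n += f->n_values;
  return n;
}
```

Since the collector reaches the slots through the chain, and the
mutator reaches them only through their addresses, nothing ever lives
solely in a register.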
> The general view is like this: the GCPRO'd variables are inescapably a
> data structure that is shared between the mutator thread that owns the
> stack frame and some other collecting thread out there. But there's
> no opportunity for the API implementation to do the needed
> synchronization.
Yes there is. The only way the GCPROtected variables are ever
modified is in the primitives provided by the FFI. This includes
assignment between two locals:
    SCM_LSET (&l.a, arena, &l.b);   /* l.a = l.b */
and as tb pointed out, other C operators on scheme_values are also
prohibited:
    scm_is_nil (arena, &l.a)   /* rather than l.a == scm_nil */
A function using the FFI has no reason to ever land a raw
`scheme_value' in a register or compiler-created temporary variable.
> The only way I can see to save GCPRO is to forbid collection except
> when every thread is at a "safe point". In other words, you
> reintroduce the restriction that "collection may only happen in calls
> that do allocation", by saying that *every* thread must be in a
> specially designated call.
Not at all. For example, in `scm_cons1' above, GC can safely happen
at any point at all during its execution, from prologue to epilogue
(even if, for some strange reason, nil is newly heap allocated by
scm_make_nil).
The biggest issue in choosing between the two approaches, as far as I
know, is the question of efficiency. The approach above has a few
advantages in that regard, I think:
a) Assuming that you plan to build little stacks on the heap to
allocate the local mn_refs for a given call, the allocation
overheads are probably close to a wash.
I might get some advantage in allocation times by not having to
do a separate overflow check and by getting the space for them
when the C stack frame is allocated. I get some advantage by
allocating a bunch of variables at once with GCPRO. I get some
advantage by not having to separately stack allocate room for
`mn_ref *' values. I get some disadvantage (speed-wise, not
precision-wise) from the greater number of GCUNPRO calls.
b) In a single-threaded environment, I can inline some primitives and
(at least my hunch is) get much better code. For example,
SCM_LSET can come out to just an ordinary C assignment (`=');
SCHEME_IS_NIL can come out to an `==' check on a local variable
that may very well be in a handy register.
c) You may have a good answer for this but I don't see it in your
post to the list. Don't local mn_refs leak like a sieve?
For example, `mn_cdr' returns a new `mn_ref *', right?
It's not freed until some outermost call, the one associated
with the `mn_call *' I got, returns.
So now what if I'm traversing a long list with K elements?
Won't that allocate K local mn_refs which aren't freed until
I return? Won't they, until then, be protecting the values
they refer to?
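To make point (b) concrete, in a single-threaded build the primitives
could plausibly reduce to macros like these. These definitions are
illustrative only, assuming a null-pointer representation of nil; the
arena argument is simply ignored:

```c
#include <stddef.h>

typedef void *scheme_value;
typedef void *scheme_instance;

/* Single-threaded expansions: with no concurrent collector there is
   no write barrier, so these are just C operators in disguise.  */
#define SCM_NIL                     ((scheme_value) NULL) /* assumed rep. */
#define SCM_LSET(dst, arena, src)   (*(dst) = *(src))
#define SCHEME_IS_NIL(arena, x)     (*(x) == SCM_NIL)
#define SCHEME_MAKE_NIL(dst, arena) (*(dst) = SCM_NIL)
```

The compiler then sees SCM_LSET as a plain assignment between two
slots it already knows about, and SCHEME_IS_NIL as a pointer
comparison, both of which it is free to keep in registers.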
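And to make the contrast in (c) concrete: under the frame scheme,
walking a K-element list reuses one protected slot, so the set of
protected values stays O(1) instead of growing to K. A toy model
follows; the pair layout, `scheme_cdr' signature, and allocation are
stand-ins of mine, and the frame machinery is elided to comments:

```c
#include <stddef.h>
#include <stdlib.h>

typedef void *scheme_value;

/* Toy cons cell, purely for illustration.  */
struct pair { scheme_value car, cdr; };

/* cdr in the output-parameter style: it writes through RESULT and
   never returns a raw value the compiler could stash in a register.  */
static void
scheme_cdr (scheme_value *result, scheme_value *p)
{
  *result = ((struct pair *) *p)->cdr;
}

/* Build a K-element list of nils (toy allocation, never freed).  */
static scheme_value
make_list (size_t k)
{
  scheme_value head = NULL;
  while (k--)
    {
      struct pair *p = malloc (sizeof *p);
      p->car = NULL;
      p->cdr = head;
      head = p;
    }
  return head;
}

/* Walk the list through a single cursor slot.  In the real system
   `l.cursor' would sit in a GCPRO'd frame (SCM_PROTECT_FRAME /
   SCM_UNPROTECT_FRAME around the body); either way, only one value
   is protected at a time, not K.  */
static size_t
list_length (scheme_value *lst)
{
  struct { scheme_value cursor; } l;  /* frame header elided */
  size_t n = 0;
  l.cursor = *lst;
  while (l.cursor != NULL)
    {
      n++;
      scheme_cdr (&l.cursor, &l.cursor);
    }
  return n;
}
```

With per-call mn_refs, by contrast, each `mn_cdr' would seem to pin
one more cell until the outermost call returns.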
-t