> From: Jim Blandy <xxxxxx@redhat.com>
> Yes, the EXTRACT issues aren't critical. But the thread-related
> problems with GCPRO that I don't see how to solve are those
> created by the user's compiler rearranging code that operates
> directly on heap references. The compiler is free to make
> copies of heap references in registers where a copying GC can't
> find them to update them.
We agree about the problems caused by C semantics and, in broad strokes,
about the compiler-taming tricks needed to solve them -- my solution
differs from yours (and JNI's) in that it doesn't require reference
objects to be explicitly heap allocated and freed. Loosely speaking,
they are instead stack allocated. (Yes, I suspect you are thinking of
making a tiny stack on the heap to allocate the local mn_refs of a
given call -- but keep reading.)
In the GCPRO system I'm proposing, all parameters are passed by
reference (by a `scheme_value *'); all return values are returned via
output parameters (again a `scheme_value *'); and local assignment is
done via a macro that can impose a write barrier, rather than with C's
assignment operator. In those regards, it very much resembles the
mn_refs idea. (Though changing the value of an mn_ref seems to be
something one is less likely to do in your system.)
One difference between our proposals concerns the lifetimes of
variables. Local mn_refs seem to live until some outermost call
returns, unless code explicitly creates and then destroys a new
mn_call. My approach controls variable lifetimes with GCPRO-style
calls, giving them lifetimes that coincide with C stack frame
lifetimes.
Note that I haven't made any proposal about what you call `global
mn_refs'. I am planning on having an interface for allocating arrays
of GC roots to which C data structures can refer.
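For illustration, such a root-array interface might look like the
following minimal sketch. All of the names here (scm_alloc_roots,
scm_free_roots, the registry) are assumptions of mine, not part of the
proposal, and the real interface would presumably also take the
`scheme_instance' arena:

```c
#include <stddef.h>
#include <stdlib.h>

typedef void *scheme_value;

/* Hypothetical registry entry: a block of scheme_value slots that the
   collector treats as roots for as long as the block is registered.  */
struct scm_root_block
{
  struct scm_root_block *next;
  size_t n;
  scheme_value values[];
};

static struct scm_root_block *scm_root_blocks = NULL;

/* Allocate and register N root slots, initialized to null.  */
scheme_value *
scm_alloc_roots (size_t n)
{
  struct scm_root_block *b =
    calloc (1, sizeof *b + n * sizeof (scheme_value));
  b->n = n;
  b->next = scm_root_blocks;
  scm_root_blocks = b;
  return b->values;
}

/* Unregister and free the block whose slots start at ROOTS.  */
void
scm_free_roots (scheme_value *roots)
{
  for (struct scm_root_block **p = &scm_root_blocks; *p; p = &(*p)->next)
    if ((*p)->values == roots)
      {
        struct scm_root_block *dead = *p;
        *p = dead->next;
        free (dead);
        return;
      }
}

/* A collector's root scan would visit every registered block's slots.  */
size_t
scm_count_root_slots (void)
{
  size_t n = 0;
  for (struct scm_root_block *b = scm_root_blocks; b; b = b->next)
    n += b->n;
  return n;
}
```

C data structures would then store scheme_values only in registered
slots, so the collector can always find (and update) them.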
A simple example:

    /* scm_cons1 (result, arena, a)
     *
     * Return a new pair of the form ( *a . () )
     *
     * Equivalent to (lambda (a) (cons a '()))
     */
    void
    scm_cons1 (scheme_value *result, scheme_instance arena, scheme_value *a)
    {
      struct cons1_locals
      {
        SCM_FRAME;
        scheme_value nil;
      } l;

      SCM_PROTECT_FRAME (l);

      SCHEME_MAKE_NIL (&l.nil, arena);
      SCHEME_CONS (result, arena, a, &l.nil);

      SCM_UNPROTECT_FRAME (l);
    }
The parameter, `*a', is protected by the caller. `l.nil' is protected
because SCM_PROTECT_FRAME has made it visible to GC and because its
address, not its value, is passed to the primitives `scm_make_nil' and
`scm_cons'. The value stored in `l.nil' is protected by `scm_cons1'
before `scm_make_nil' returns. The value stored in `*result' is
protected by the caller of `scm_cons1' before `scm_cons' returns to
`scm_cons1'. (If interprocedural optimization is allowed to screw
this up, I'd like to know exactly how and why....)
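For what it's worth, here is one way SCM_FRAME / SCM_PROTECT_FRAME /
SCM_UNPROTECT_FRAME could be implemented, as a minimal single-threaded
sketch. The layout -- a per-thread chain of frame headers with the
scheme_value slots sitting right after each header -- is my assumption,
not a spec:

```c
#include <stddef.h>

typedef void *scheme_value;

/* Hypothetical frame header: a link in a chain of protected frames
   plus a count of the scheme_value slots that sit immediately after
   it in the locals struct.  */
struct scm_frame
{
  struct scm_frame *prev;
  size_t n_values;
};

/* Per-thread (here simply global) head of the chain of live frames;
   a collector walks this chain to find every protected slot.  */
static struct scm_frame *scm_frame_chain = NULL;

#define SCM_FRAME struct scm_frame frame_header

/* Push this frame onto the chain.  Assumes no padding between the
   header and the scheme_value slots that follow it.  */
#define SCM_PROTECT_FRAME(locals)                                       \
  do                                                                    \
    {                                                                   \
      (locals).frame_header.prev = scm_frame_chain;                     \
      (locals).frame_header.n_values =                                  \
        (sizeof (locals) - sizeof (struct scm_frame))                   \
        / sizeof (scheme_value);                                        \
      scm_frame_chain = &(locals).frame_header;                         \
    }                                                                   \
  while (0)

/* Pop it again; lifetimes therefore nest with C stack frames.  */
#define SCM_UNPROTECT_FRAME(locals)                                     \
  do { scm_frame_chain = (locals).frame_header.prev; } while (0)

/* What a collector's root scan would see.  */
static size_t
count_protected_values (void)
{
  size_t n = 0;
  for (struct scm_frame *f = scm_frame_chain; f != NULL; f = f->prev)
    n += f->n_values;
  return n;
}
```

Since the collector reaches the slots through the chain, and the
mutator reaches them only through their addresses, nothing ever lives
solely in a register.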
> The general view is like this: the GCPRO'd variables are inescapably a
> data structure that is shared between the mutator thread that owns the
> stack frame and some other collecting thread out there. But there's
> no opportunity for the API implementation to do the needed
> synchronization.
Yes there is. The only way the GCPROtected variables are ever
modified is in the primitives provided by the FFI. This includes
assignment between two locals:
    SCM_LSET (&l.a, arena, &l.b);   /* l.a = l.b */
and as tb pointed out, other C operators on scheme_values are also
prohibited:
    scm_is_nil (arena, &l.a)   /* rather than l.a == scm_nil */
A function using the FFI has no reason to ever land a raw
`scheme_value' in a register or compiler-created temporary variable.
> The only way I can see to save GCPRO is to forbid collection except
> when every thread is at a "safe point". In other words, you
> reintroduce the restriction that "collection may only happen in calls
> that do allocation", by saying that *every* thread must be in a
> specially designated call.
Not at all. For example, in `scm_cons1' above, GC can safely happen
at any point at all during its execution, from prologue to epilogue
(even if, for some strange reason, nil is newly heap allocated by
scm_make_nil).
The biggest issue in choosing between the two approaches, as far as I
know, is the question of efficiency. The approach above has a few
advantages in that regard, I think:
a) Assuming that you plan to build little stacks on the heap to
allocate the local mn_refs for a given call, the allocation
overheads are probably close to a wash.
I might get some advantage in allocation times by not having to
do a separate overflow check and by getting the space for them
when the C stack frame is allocated. I get some advantage by
allocating a bunch of variables at once with GCPRO. I get some
advantage by not having to separately stack allocate room for
`mn_ref *' values. I get some disadvantage (speed-wise, not
precision-wise) from the greater number of GCUNPRO calls.
b) In a single-threaded environment, I can inline some primitives and
(at least my hunch is) get much better code. For example,
SCM_LSET can come out to just an ordinary C assignment (`=');
SCHEME_IS_NIL can come out to an `==' check on a local variable
that may very well be in a handy register.
c) You may have a good answer for this but I don't see it in your
post to the list. Don't local mn_refs leak like a sieve?
For example, `mn_cdr' returns a new `mn_ref *', right?
It's not freed until some outermost call, the one associated
with the `mn_call *' I got, returns.
So now what if I'm traversing a long list with K elements?
Won't that allocate K local mn_refs which aren't freed until
I return? Won't they, until then, be protecting the values
they refer to?
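To make point (b) concrete, in a single-threaded build the primitives
could plausibly reduce to macros like these. These definitions are
illustrative only, assuming a null-pointer representation of nil; the
arena argument is simply ignored:

```c
#include <stddef.h>

typedef void *scheme_value;
typedef void *scheme_instance;

/* Single-threaded expansions: with no concurrent collector there is
   no write barrier, so these are just C operators in disguise.  */
#define SCM_NIL                     ((scheme_value) NULL) /* assumed rep. */
#define SCM_LSET(dst, arena, src)   (*(dst) = *(src))
#define SCHEME_IS_NIL(arena, x)     (*(x) == SCM_NIL)
#define SCHEME_MAKE_NIL(dst, arena) (*(dst) = SCM_NIL)
```

The compiler then sees SCM_LSET as a plain assignment between two
slots it already knows about, and SCHEME_IS_NIL as a pointer
comparison, both of which it is free to keep in registers.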
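And to make the contrast in (c) concrete: under the frame scheme,
walking a K-element list reuses one protected slot, so the set of
protected values stays O(1) instead of growing to K. A toy model
follows; the pair layout, `scheme_cdr' signature, and allocation are
stand-ins of mine, and the frame machinery is elided to comments:

```c
#include <stddef.h>
#include <stdlib.h>

typedef void *scheme_value;

/* Toy cons cell, purely for illustration.  */
struct pair { scheme_value car, cdr; };

/* cdr in the output-parameter style: it writes through RESULT and
   never returns a raw value the compiler could stash in a register.  */
static void
scheme_cdr (scheme_value *result, scheme_value *p)
{
  *result = ((struct pair *) *p)->cdr;
}

/* Build a K-element list of nils (toy allocation, never freed).  */
static scheme_value
make_list (size_t k)
{
  scheme_value head = NULL;
  while (k--)
    {
      struct pair *p = malloc (sizeof *p);
      p->car = NULL;
      p->cdr = head;
      head = p;
    }
  return head;
}

/* Walk the list through a single cursor slot.  In the real system
   `l.cursor' would sit in a GCPRO'd frame (SCM_PROTECT_FRAME /
   SCM_UNPROTECT_FRAME around the body); either way, only one value
   is protected at a time, not K.  */
static size_t
list_length (scheme_value *lst)
{
  struct { scheme_value cursor; } l;  /* frame header elided */
  size_t n = 0;
  l.cursor = *lst;
  while (l.cursor != NULL)
    {
      n++;
      scheme_cdr (&l.cursor, &l.cursor);
    }
  return n;
}
```

With per-call mn_refs, by contrast, each `mn_cdr' would seem to pin
one more cell until the outermost call returns.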
-t