> From: Jim Blandy <xxxxxx@redhat.com>

> Yes, the EXTRACT issues aren't critical.  But the thread-related
> problems with GCPRO that I don't see how to solve are those
> created by the user's compiler rearranging code that operates
> directly on heap references.  The compiler is free to make
> copies of heap references in registers where a copying GC can't
> find them to update them.

We agree about the problems caused by C semantics, and in broad strokes
about the compiler-taming tricks needed to solve them -- my solution
differs from yours (and JNI's) in that it doesn't require reference
objects to be explicitly heap allocated and freed.  Loosely speaking,
they are instead stack allocated.  (Yes, I suspect you are thinking of
making a tiny stack on the heap on which to allocate the local mn_refs
of a given call, but keep reading.)

In the GCPRO system I'm proposing, all parameters are passed by
reference (by a `scheme_value *'); all return values are returned by
output parameters (again a `scheme_value *'); and local assignment is
via a macro that can impose a write barrier, rather than with C's
assignment operator.  In those regards, it very much resembles the
mn_refs idea.  (Though changing the value of an mn_ref seems like
something one is less likely to do in your system.)

One difference between our proposals concerns the lifetimes of
variables.  Local mn_refs seem to live until some outermost call
returns, unless code explicitly creates and then destroys a new
mn_call.  My approach controls variable lifetimes with GCPRO-style
calls, giving them lifetimes that coincide with C stack frame
lifetimes.

Note that I haven't made any proposal about what you call `global
mn_refs'.  I am planning on having an interface for allocating arrays
of GC roots to which C data structures can refer.

A simple example:

    /* scm_cons1 (result, arena, a)
     *
     * Return a new pair of the form ( *a . () )
     *
     * Equivalent to (lambda (a) (cons a '()))
     */
    void
    scm_cons1 (scheme_value * result, scheme_instance arena, scheme_value * a)
    {
      struct cons1_locals
      {
        SCM_FRAME;
        scheme_value nil;
      } l;

      SCM_PROTECT_FRAME (l);

      SCHEME_MAKE_NIL (&l.nil, arena);
      SCHEME_CONS (result, arena, a, &l.nil);

      SCM_UNPROTECT_FRAME (l);
    }

The parameter, `*a', is protected by the caller.  `l.nil' is protected
because SCM_PROTECT_FRAME has made it visible to GC, and because its
address, not its value, is passed to the primitives `scm_make_nil' and
`scm_cons'.  The value stored in `l.nil' is protected by `scm_cons1'
before `scm_make_nil' returns.  The value stored in `*result' is
protected by the caller of `scm_cons1' before `scm_cons' returns to
`scm_cons1'.  (If interprocedural optimization is allowed to screw this
up, I'd like to know exactly how and why....)

> The general view is like this: the GCPRO'd variables are inescapably a
> data structure that is shared between the mutator thread that owns the
> stack frame and some other collecting thread out there.  But there's
> no opportunity for the API implementation to do the needed
> synchronization.

Yes there is.  The only way the GCPROtected variables are ever modified
is via the primitives provided by the FFI.  This includes assignment
between two locals:

    SCM_LSET (&l.a, arena, &l.b);    /* l.a = l.b */

and, as tb pointed out, other C operators on scheme_values are also
prohibited:

    scm_is_nil (arena, &l.a)         /* rather than l.a == scm_nil */

A function using the FFI has no reason ever to land a raw
`scheme_value' in a register or compiler-created temporary variable.

> The only way I can see to save GCPRO is to forbid collection except
> when every thread is at a "safe point".  In other words, you
> reintroduce the restriction that "collection may only happen in calls
> that do allocation", by saying that *every* thread must be in a
> specially designated call.

Not at all.
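For concreteness, here is one way the frame-protection macros could be
implemented.  This is only a hedged sketch, not the actual proposal's
implementation: the chained-frame layout, the `frame_' member, and
`scm_count_roots' are my assumptions.  The point it illustrates is that
SCM_PROTECT_FRAME can link the locals struct into a chain that the
collector walks, so every protected `scheme_value' slot stays visible
to (and updatable by) GC:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the FFI's opaque heap-reference type. */
typedef void *scheme_value;

/* Assumed layout: SCM_FRAME puts a link pointer and a count of the
 * scheme_value slots that follow it at the top of the locals struct. */
struct scm_frame
{
  struct scm_frame *prev;
  size_t n_slots;
};
#define SCM_FRAME struct scm_frame frame_

/* One chain per thread in a real system; a single global suffices
 * for this sketch. */
static struct scm_frame *scm_frame_chain = NULL;

/* Push the frame onto the chain; the slot count is derived from the
 * size of the locals struct minus the frame header. */
#define SCM_PROTECT_FRAME(l)                                            \
  do {                                                                  \
    (l).frame_.prev = scm_frame_chain;                                  \
    (l).frame_.n_slots =                                                \
      (sizeof (l) - sizeof (struct scm_frame)) / sizeof (scheme_value); \
    scm_frame_chain = &(l).frame_;                                      \
  } while (0)

/* Pop the frame; its slots are no longer roots. */
#define SCM_UNPROTECT_FRAME(l)                                          \
  do { scm_frame_chain = (l).frame_.prev; } while (0)

/* A collector would walk the chain like this to find every root
 * (here it merely counts them). */
static size_t
scm_count_roots (void)
{
  size_t n = 0;
  for (struct scm_frame *f = scm_frame_chain; f != NULL; f = f->prev)
    n += f->n_slots;
  return n;
}
```

Since the collector reaches the slots through the chain, it can both
scan and rewrite them during a copying collection; the synchronization
point is the chain itself.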
For example, in `scm_cons1' above, GC can safely happen at any point at
all during its execution, from prolog to postlog (even if, for some
strange reason, nil is newly heap allocated by scm_make_nil).

The biggest issue in choosing between the two approaches, as far as I
know, is the question of efficiency.  The approach above has a few
advantages in that regard, I think:

a) Assuming that you plan to build little stacks on the heap on which
   to allocate the local mn_refs for a given call, the allocation
   overheads are probably close to a wash.  I might get some advantage
   in allocation times by not having to do a separate overflow check,
   and by getting the space for them when the C stack frame is
   allocated.  I get some advantage by allocating a bunch of variables
   at once with GCPRO.  I get some advantage by not having to
   separately stack allocate room for `mn_ref *' values.  I get some
   disadvantage (speed-wise, not precision-wise) from the greater
   number of GCUNPRO calls.

b) In a single-threaded environment, I can inline some primitives and
   (at least my hunch is) get much better code.  For example, SCM_LSET
   can come out to just an ordinary C assignment (`='); SCHEME_IS_NIL
   can come out to an == check on a local variable that may very well
   be in a handy register.

c) You may have a good answer for this, but I don't see it in your
   post to the list.  Don't local mn_refs leak like a sieve?  For
   example, `mn_cdr' returns a new `mn_ref *', right?  It isn't freed
   until some outermost call, associated with the `mn_call *' I got,
   returns.  So now what if I'm traversing a long list with K
   elements?  Won't that allocate K local mn_refs which aren't freed
   until I return?  Won't they, until then, be protecting the values
   they refer to?

-t