Jim Blandy <xxxxxx@redhat.com> writes:
> Well, if SRFI-50 turned out not to be what I was hoping, and I didn't
> come to my senses quickly enough, I was going to turn <minor/minor.h>,
> into a .texi file, start a SRFI from that, and see what people said.
In light of that, I'm curious to know how people generally feel about
the Pika vs. JNI issue. If there's a near consensus on one or the
other, then that could save a lot of trouble.
Here are links to the specs:
  Pika:  http://arch.quackerhead.com/~lord/releases/pika
         http://regexps.srparish.net/src/pika/
  Minor: http://svn.red-bean.com/repos/minor/trunk/include/minor/minor.h
Here's how I see it:
Commonalities:
- Both work by having C code manipulate only references to Scheme
values, not Scheme values themselves.
- Both impose few restrictions on the representation of Scheme objects.
- Both allow GC to occur at any time.
- Both can be implemented in a way that interacts nicely with threads.
In Pika:
- Leaks are impossible, since references are stack-allocated.
- References are freed upon exit from the lexical block that owns
them --- finer-grained than JNI-style.
- Probably less overhead than JNI-style.
But:
- Forgetting an UNGCPRO corrupts the GC's data structures, and may
fail only intermittently. Irregular exits (return; goto; break;
continue) require attention. longjmp is even harder.
- Functions may only return Scheme values by reference; they may not
provide them as their (syntactic) return values. Instead of writing
"f (g (x))", you must write:
    g (&frame.x, &frame.temp);
    f (&frame.temp, &frame.temp2);
In other words, you must write your code as a linear series of
operations that work by side effects.
- Since the API functions all expect pointers to t_scm_word values,
this discourages people from passing them around directly, but it
can still be done --- e.g. "frame.x = frame.y;" --- and doing so
will usually work. But doing so is a bug.
- Variable declarations are cluttered with enclosing structs and GCPRO
/ UNGCPRO calls.
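To make the Pika-style points concrete, here is a toy model of the protected-frame pattern. It is only a sketch, not the real Pika API: every toy_ name, the TOY_GCPRO / TOY_UNGCPRO macros, and the object layout are made up for illustration, and there is no actual collector behind them. What it does show is the linearized "cadr", the input-then-result argument order used in the example above, and the fact that every early exit needs its own UNGCPRO.

```c
#include <assert.h>
#include <stddef.h>

/* Toy heap value: a pair when is_pair is nonzero, otherwise a number. */
typedef struct toy_obj {
  int is_pair;
  long num;
  struct toy_obj *car, *cdr;
} toy_obj;

typedef toy_obj *toy_word;      /* hypothetical stand-in for t_scm_word */

/* One record per protected frame; the records form a chain through
   the C stack that a collector would walk to find roots.  */
typedef struct toy_gc_rec {
  struct toy_gc_rec *next;
  void *base;                   /* start of the protected struct */
  size_t n_words;               /* number of toy_word slots in it */
} toy_gc_rec;

toy_gc_rec *toy_gc_roots = NULL;

#define TOY_GCPRO(rec_, frame_)                                  \
  do {                                                           \
    (rec_).base = &(frame_);                                     \
    (rec_).n_words = sizeof (frame_) / sizeof (toy_word);        \
    (rec_).next = toy_gc_roots;                                  \
    toy_gc_roots = &(rec_);                                      \
  } while (0)

/* Must run on every exit path, in reverse order of TOY_GCPRO.  */
#define TOY_UNGCPRO(rec_)                                        \
  do {                                                           \
    assert (toy_gc_roots == &(rec_));                            \
    toy_gc_roots = (rec_).next;                                  \
  } while (0)

/* Input first, result second, following the order of the
   "g (&frame.x, &frame.temp)" example.  */
void toy_car (toy_word *pair, toy_word *result) { *result = (*pair)->car; }
void toy_cdr (toy_word *pair, toy_word *result) { *result = (*pair)->cdr; }

/* "car (cdr (x))" linearized.  Returns 0 on success, -1 if x is
   not a list of at least two elements.  */
int
toy_cadr (toy_word *x, toy_word *result)
{
  struct { toy_word temp; } frame = { NULL };
  toy_gc_rec rec;

  TOY_GCPRO (rec, frame);
  if (*x == NULL || !(*x)->is_pair)
    {
      TOY_UNGCPRO (rec);        /* early exit: easy to forget */
      return -1;
    }
  toy_cdr (x, &frame.temp);
  if (frame.temp == NULL || !frame.temp->is_pair)
    {
      TOY_UNGCPRO (rec);        /* and again here */
      return -1;
    }
  toy_car (&frame.temp, result);
  TOY_UNGCPRO (rec);
  return 0;
}

/* Builds the list (1 2 3) from stack objects and returns its cadr,
   or -1 if toy_cadr reports an error.  */
long
toy_pika_demo (void)
{
  toy_obj one = { 0, 1, NULL, NULL }, two = { 0, 2, NULL, NULL };
  toy_obj three = { 0, 3, NULL, NULL };
  toy_obj p3 = { 1, 0, &three, NULL };
  toy_obj p2 = { 1, 0, &two, &p3 };
  toy_obj p1 = { 1, 0, &one, &p2 };
  struct { toy_word list; toy_word result; } frame = { &p1, NULL };
  toy_gc_rec rec;
  long n = -1;

  TOY_GCPRO (rec, frame);
  if (toy_cadr (&frame.list, &frame.result) == 0)
    n = frame.result->num;
  TOY_UNGCPRO (rec);
  return n;
}
```

Dropping either of the early-exit TOY_UNGCPRO calls would leave toy_gc_roots pointing at a dead stack frame, which is the kind of intermittent corruption described above; in this toy, the assert in the caller's TOY_UNGCPRO would catch it.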
In JNI-style:
- Functions can return references directly, so code need not be
linearized. You can write "f (call, g (call, x))" --- if you know
that "call" will return and free g's return value soon enough.
- Local references are freed automatically when the Scheme->C call to
which they belong returns. Leaks due to unfreed local references
(which will probably be the most common sort of error) have a
bounded and often (though not always) short lifetime.
- No GC data structures live on the C stack, so careless control flow
and longjmps will not corrupt the GC's data structures.
- The "explicit free" model is familiar to C programmers.
- Variables are declared normally, and their values used directly.
- Since mn_ref is an incomplete type, it can't be dereferenced, so
people can't be sloppy and operate on the heap values directly.
But:
- The "explicit free" model is still error-prone. The fact that leaks
are bounded by their owning call's lifetime may not always help.
- Probably more overhead than Pika-style.
- Code will be cluttered with explicit-free crap.
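As a rough illustration of the "bounded leak" behaviour, here is a toy model of per-call local references. Again this is only a sketch under my own assumptions: the toy_ names, the linked-list bookkeeping, and toy_end_call are hypothetical and are not taken from minor.h. It shows a loop that forgets one free per iteration, how the number of live references grows with the loop, and how everything is reclaimed when the owning call ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* In a public header only these two typedefs would be visible, so
   client code could not look inside a reference.  */
typedef struct toy_ref toy_ref;
typedef struct toy_call toy_call;

struct toy_ref {
  long value;                   /* stands in for a heap value */
  toy_ref *prev, *next;         /* links in the owning call's list */
};

struct toy_call {
  toy_ref *head;                /* all local refs owned by this call */
  size_t live;                  /* how many are currently live */
};

/* Allocates a local reference owned by call C.  */
toy_ref *
toy_make_local_ref (toy_call *c, long value)
{
  toy_ref *r = malloc (sizeof *r);
  if (r == NULL)
    abort ();
  r->value = value;
  r->prev = NULL;
  r->next = c->head;
  if (c->head != NULL)
    c->head->prev = r;
  c->head = r;
  c->live++;
  return r;
}

/* Explicit free: unlinks R from C's list and releases it.  */
void
toy_free_local_ref (toy_call *c, toy_ref *r)
{
  assert (c->live > 0);
  if (r->prev != NULL)
    r->prev->next = r->next;
  else
    c->head = r->next;
  if (r->next != NULL)
    r->next->prev = r->prev;
  c->live--;
  free (r);
}

/* What the runtime would do when the Scheme->C call returns:
   reclaim every reference the C code forgot.  Returns the count.  */
size_t
toy_end_call (toy_call *c)
{
  size_t reclaimed = 0;
  while (c->head != NULL)
    {
      toy_free_local_ref (c, c->head);
      reclaimed++;
    }
  return reclaimed;
}

/* Creates two references per iteration but frees only one.
   Returns the number of references still live afterwards.  */
size_t
toy_leaky_loop (toy_call *c, int iterations)
{
  for (int i = 0; i < iterations; i++)
    {
      toy_ref *a = toy_make_local_ref (c, i);
      toy_ref *b = toy_make_local_ref (c, i + 1);
      toy_free_local_ref (c, a);
      (void) b;                 /* forgotten: leaks until the call ends */
    }
  return c->live;
}
```

The leak is bounded by the call's lifetime, but within the call it grows linearly with the iteration count, which is why the bound may not help in a long-running loop.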
Is this fair? What have I missed? What do people think?
It would be nice to see sample code in each style. C implementations
of "cadr" and "assq" would be nice. As far as I know, error checking
is similar under both interfaces, so that can be left out.
Here they are in JNI-style:

mn_ref *
cadr (mn_call *c, mn_ref *obj)
{
  return mn_car (c, mn_cdr (c, obj));
}

mn_ref *
assq (mn_call *c, mn_ref *key, mn_ref *alist)
{
  while (mn_pair_p (c, alist))
    {
      mn_ref *pair = mn_car (c, alist);
      mn_ref *pair_key = mn_car (c, pair);

      if (mn_ref_eq (c, key, pair_key))
        {
          /* Don't leak the key reference on the success path.  */
          mn_free_local_ref (c, pair_key);
          return pair;
        }

      mn_free_local_ref (c, pair);
      mn_free_local_ref (c, pair_key);
      alist = mn_to_cdr (c, alist);
    }

  return mn_false (c);
}