Comparing Pika-style and JNI-style Jim Blandy 14 Jan 2004 09:06 UTC
Jim Blandy <xxxxxx@redhat.com> writes:
> Well, if SRFI-50 turned out not to be what I was hoping, and I didn't
> come to my senses quickly enough, I was going to turn <minor/minor.h>,
> into a .texi file, start a SRFI from that, and see what people said.

In light of that, I'm curious to know how people generally feel about
the Pika vs. JNI issue. If there's a near consensus on one or the
other, then that could save a lot of trouble.

Here are links to the specs:

Pika:  http://arch.quackerhead.com/~lord/releases/pika
       http://regexps.srparish.net/src/pika/
Minor: http://svn.red-bean.com/repos/minor/trunk/include/minor/minor.h

Here's how I see it:

Commonalities:

- Both work by having C code manipulate only references to Scheme
  values, not Scheme values themselves.
- Both impose few restrictions on the representation of Scheme objects.
- Both allow GC to occur at any time.
- Both can be implemented in a way that interacts nicely with threads.

In Pika:

- Leaks are impossible, since references are stack-allocated.
- References are freed upon exit from the lexical block that owns
  them --- finer-grained than JNI-style.
- Probably less overhead than JNI-style.

But:

- Forgetting an UNGCPRO corrupts the GC's data structures, and may
  fail only intermittently. Irregular exits (return; goto; break;
  continue) require attention. longjmp is even harder.
- Functions may only return Scheme values by reference; they may not
  provide them as their (syntactic) return values. Instead of writing
  "f (g (x))", you must write:

      g (&frame.x, &frame.temp);
      f (&frame.temp, &frame.temp2);

  In other words, you must write your code as a linear series of
  operations which work by side-effects.
- Since the API functions all expect pointers to t_scm_word values,
  this discourages people from passing them around directly, but it
  can still be done --- e.g. "frame.x = frame.y;" --- and doing so
  will usually work. But doing so is a bug.
- Variable declarations are cluttered with enclosing structs and
  GCPRO / UNGCPRO calls.

In JNI-style:

- Functions can return references directly, so code need not be
  linearized. You can write "f (call, g (call, x))" --- if you know
  that "call" will return and free g's return value soon enough.
- Local references are freed automatically when the Scheme->C call to
  which they belong returns. Leaks due to unfreed local references
  (which will probably be the most common sort of error) have a
  bounded and often (though not always) short lifetime.
- No GC data structures live on the C stack, so careless control flow
  and longjmps will not corrupt the GC's data structures.
- The "explicit free" model is familiar to C programmers.
- Variables are declared normally, and their values used directly.
- Since mn_ref is an incomplete type, it can't be dereferenced, so
  people can't be sloppy and operate on the heap values directly.

But:

- The "explicit free" model is still error-prone. The fact that leaks
  are bounded by their owning call's lifetime may not always help.
- Probably more overhead than Pika-style.
- Code will be cluttered with explicit-free crap.

Is this fair? What have I missed? What do people think?

It would be nice to see sample code in each style. C implementations
of "cadr" and "assq" would be nice. As far as I know, error checking
is similar under both interfaces, so that can be left out.

    mn_ref *
    cadr (mn_call *c, mn_ref *obj)
    {
      /* The reference returned by mn_cdr is never freed explicitly;
         it lives until the call 'c' returns.  */
      return mn_car (c, mn_cdr (c, obj));
    }

    mn_ref *
    assq (mn_call *c, mn_ref *key, mn_ref *alist)
    {
      while (mn_pair_p (c, alist))
        {
          mn_ref *pair = mn_car (c, alist);
          mn_ref *pair_key = mn_car (c, pair);

          /* On a match, pair_key is left for the call's return to
             reclaim.  */
          if (mn_ref_eq (c, key, pair_key))
            return pair;

          mn_free_local_ref (c, pair);
          mn_free_local_ref (c, pair_key);
          alist = mn_to_cdr (c, alist);
        }

      return mn_false (c);
    }
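
To make the "bounded leak" point concrete, here is a toy model of the local-reference bookkeeping described above. It is only a sketch: every toy_* name is hypothetical and comes from neither Minor nor JNI, and a plain int stands in for a Scheme value. It shows a nested call leaving one anonymous reference behind, which the end of the owning call then reclaims.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: every toy_* name is hypothetical, not part of Minor or JNI. */

typedef struct toy_ref {
  int value;              /* stands in for a handle to a Scheme value */
  struct toy_ref *next;   /* next local reference owned by the same call */
} toy_ref;

typedef struct toy_call {
  toy_ref *locals;        /* local references created during this call */
} toy_call;

/* Create a local reference and record it in the call's list. */
static toy_ref *
toy_new_ref (toy_call *c, int value)
{
  toy_ref *r = malloc (sizeof *r);
  if (r == NULL)
    abort ();
  r->value = value;
  r->next = c->locals;
  c->locals = r;
  return r;
}

/* Explicitly free one local reference before the call returns. */
static void
toy_free_local_ref (toy_call *c, toy_ref *r)
{
  toy_ref **p;
  for (p = &c->locals; *p != NULL; p = &(*p)->next)
    if (*p == r)
      {
        *p = r->next;
        free (r);
        return;
      }
}

/* Count the local references the call still owns. */
static int
toy_live_refs (const toy_call *c)
{
  int n = 0;
  const toy_ref *r;
  for (r = c->locals; r != NULL; r = r->next)
    n++;
  return n;
}

/* Returning to Scheme: reclaim whatever the C code forgot to free,
   and report how many references that was. */
static int
toy_end_call (toy_call *c)
{
  int reclaimed = 0;
  while (c->locals != NULL)
    {
      toy_ref *r = c->locals;
      c->locals = r->next;
      free (r);
      reclaimed++;
    }
  return reclaimed;
}

/* An operation that returns a fresh local reference, so calls can nest. */
static toy_ref *
toy_add1 (toy_call *c, toy_ref *x)
{
  return toy_new_ref (c, x->value + 1);
}

/* Usage: the inner toy_add1 result is never named, so it cannot be
   freed explicitly; it survives until toy_end_call. */
static int
toy_demo (void)
{
  toy_call c = { NULL };
  toy_ref *x = toy_new_ref (&c, 40);
  toy_ref *y = toy_add1 (&c, toy_add1 (&c, x));

  toy_free_local_ref (&c, x);
  toy_free_local_ref (&c, y);
  assert (toy_live_refs (&c) == 1);   /* only the anonymous reference remains */
  return toy_end_call (&c);           /* number of leaked references reclaimed */
}
```

In this model toy_demo reclaims exactly one reference at the end of the call, which is the same situation as the unfreed mn_cdr result in cadr. A real implementation would presumably use something cheaper than a linked-list walk for explicit frees; the list is just the simplest thing that shows the ownership.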