Re: Comparing Pika-style and JNI-style Tom Lord 14 Jan 2004 22:09 UTC


    > From: Jim Blandy <xxxxxx@redhat.com>

    > > http://srfi.schemers.org/srfi-50/mail-archive/msg00241.html

    > See, that is interesting, though --- it shows that you don't
    > have to always set up frames in Pika, if you can do all your
    > computation in references owned by the calling frame.

Ok and Yup.

    > > Again, it comes down to the three classes of functions, (a), (b), and
    > > (c).  My proposal allows people to write FFI-using code in any of
    > > those three classes.  If the Pika FFI includes a standard interface to
    > > that auxiliary stack, then people writing FFI-using code in class (c)
    > > get full interoperability with one another.  But in the far more
    > > common case (how many libraries do you know that permit longjmping
    > > past them?), users writing FFI-using code in class (b) don't have to
    > > pay for auxiliary stacks.

    > But I don't even want to have to think about whether I'm going to be
    > longjmped past.  It's a non-local property, involving code I probably
    > haven't even read, and certainly can't afford to audit each time it's
    > revised.  I'm willing to tolerate less-than-optimal behavior when a
    > longjmp occurs, as long as it's still correct, in exchange for not
    > having to think about it.  And I've suggested a way to fix the
    > problem: have SCM_PROTECT malloc frames, and SCM_UNPROTECT free them.

Well, you're saying (I think) that you always want all of your
calling-out functions to be in class (c) ("safe to longjmp past").
That's fine and Pika-style permits you to write that way.  Pika-style
extended with a standard for the "auxiliary stack" lets you both
write that way and mix and match your code with others who do the
same.

But should the FFI spec force _all_ FFI-using functions to be in class
(c)?  I hope not.
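
To make the class (c) idea concrete, here is a minimal sketch of one
reading of the "malloc the frames" suggestion quoted above.  Every name
in it (protect, unprotect, frame_mark, frame_unwind_to, frame_depth) is
hypothetical and is not taken from Pika, Minor, or the SRFI-50 draft.
The assumption is that whoever catches the longjmp unwinds the frame
list back to a saved mark before any GC can run; until then the list
still points at a dead stack slot.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

/* Hypothetical stand-in for whatever the FFI calls a Scheme value. */
typedef void *scheme_value;

/* One GC root registration, kept on the heap rather than the C stack,
   so a longjmp cannot corrupt the list links themselves. */
struct frame {
  struct frame *next;   /* next older frame */
  scheme_value *slot;   /* location the GC should treat as a root */
};

static struct frame *frame_top = NULL;  /* would be per-thread in practice */

/* Register *slot as a root.  Returns 0 on success, -1 if malloc fails. */
static int protect(scheme_value *slot)
{
  struct frame *f = malloc(sizeof *f);
  if (f == NULL)
    return -1;
  f->slot = slot;
  f->next = frame_top;
  frame_top = f;
  return 0;
}

/* Drop the most recent registration. */
static void unprotect(void)
{
  struct frame *f = frame_top;
  assert(f != NULL);
  frame_top = f->next;
  free(f);
}

/* Remember the current top of the frame list. */
static struct frame *frame_mark(void)
{
  return frame_top;
}

/* Free every frame pushed since `mark' was taken. */
static void frame_unwind_to(struct frame *mark)
{
  while (frame_top != mark)
    unprotect();
}

/* Count live frames (only used to check the sketch). */
static int frame_depth(void)
{
  int n = 0;
  for (struct frame *f = frame_top; f != NULL; f = f->next)
    n++;
  return n;
}

/* A callee that protects a local and is then jumped past, so its
   matching unprotect() never runs. */
static void callee(jmp_buf env)
{
  scheme_value local = NULL;
  if (protect(&local) != 0)
    return;
  longjmp(env, 1);
}

/* The catcher: take a mark, call out, and unwind on the way back.
   Returns the frame depth seen after recovery. */
static int demo(void)
{
  jmp_buf env;
  struct frame *mark = frame_mark();
  if (setjmp(env) == 0)
    callee(env);
  frame_unwind_to(mark);
  return frame_depth();
}
```

The cost is a malloc and free per protected variable, which is the
price class (b) code would rather not pay when nobody longjmps past it.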

About the following: one of us is confused.  Not sure which.

    > >     >     mn_ref *
    > >     >     mn_to_car (mn_call *call, mn_ref *ref)
    > >     >     {
    > >     >       mn__begin_incoherent (call);
    > >     >       {
    > >     >         ref->obj = check_pair (ref)->car;
    > >     >       }
    > >     >       mn__end_incoherent (call);
    > >     >
    > >     >       return ref;
    > >     >     }

    > > Isn't that code incorrect in a threaded system?   While `ref' is,
    > > indeed, about to be freed, the pair that it refers to is live.
    > > Assuming that the `incoherent' calls exclude only GC but not other
    > > mutators (which is the benefit you seem to be claiming), then the
    > > `->car' risks producing garbage.

    > This is what that comment is going on about.  References are
    > immutable: there is no operation that changes a reference's referent.
    > mn_to_car looks like a counter-example, but it isn't: officially, it
    > frees REF, so it would be incorrect to call it if any other thread
    > were referring to it.  But since it's freeing a reference and then
    > immediately allocating a new one, it might as well just reuse the
    > reference.

That's not what I mean by "incorrect in a threaded system".

Am I correct that `check_pair (ref)' returns a pointer to something
like:

        struct pair
        {
          scheme_value car;
          scheme_value cdr;
        };

?

And am I correct that mn__begin_incoherent excludes GC but not other
mutators?  If it excludes _all_ mutators then I don't think you really
have much savings in `to_car' compared to `car': you're saving the
allocation of a new handle, sure, but that should be very, very cheap
compared to `mn__begin_incoherent' and `mn__end_incoherent'.

If both assumptions are true then the code is incorrect because
`->car' is not necessarily going to return a legitimate scheme value
(it may return a "half written" one).

If the second assumption (about excluding only GC) is false -- then I
don't see a substantial savings here: you need to acquire a highly
contentious lock.

If the first assumption is false (about returning a pointer to a pair)
-- then I don't know what the heck that code means.
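
For illustration only, here is a sketch of what a `->car' read would
need if other mutators can run concurrently and a "half written" value
must never be observed.  It assumes a scheme value fits in one machine
word and uses C11 atomics; pair_init, pair_set_car, and pair_car are
hypothetical names, and nothing here claims to describe how Minor's
mn__begin_incoherent actually behaves.  It addresses only tearing of a
single field, not any larger consistency question.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a Scheme value is a single tagged machine word. */
typedef uintptr_t scheme_value;

struct pair {
  _Atomic scheme_value car;
  _Atomic scheme_value cdr;
};

/* Set up a fresh pair before any other thread can see it. */
static void pair_init(struct pair *p, scheme_value car, scheme_value cdr)
{
  assert(p != NULL);
  atomic_init(&p->car, car);
  atomic_init(&p->cdr, cdr);
}

/* Mutator side: publish a new car as one indivisible store. */
static void pair_set_car(struct pair *p, scheme_value v)
{
  assert(p != NULL);
  atomic_store_explicit(&p->car, v, memory_order_release);
}

/* Reader side: sees either the old car or the new one, never a mix. */
static scheme_value pair_car(struct pair *p)
{
  assert(p != NULL);
  return atomic_load_explicit(&p->car, memory_order_acquire);
}
```

A plain `p->car' on a non-atomic field carries no such guarantee in C
once another thread may be writing the field at the same time.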

    > Well, if you'll grant that this is fuzzy talk:

    > Pika's ooze is that you appear to be operating on local variables, but
    > you can only use them as lvalues, never rvalues.  And they're actually
    > data structures owned by the GC; rather than being managed by the
    > compiler, as local variables are, they have to be explicitly
    > registered and unregistered.  That's the source of the longjmp
    > problems, too.

Ok.

-t