Re: Pika-style from first principles Marc Feeley 06 Jan 2004 15:36 UTC

> In other words -- FFI implementations should be granted complete
> liberty for implementing the primitive operations on the big data
> structure that is the universe of Scheme data.  The interfaces to
> assignment, reference, and procedural transformation should all grant
> the FFI implementor complete opportunity to seize the flow of control,
> subject only to the constraint that that flow is sanely returned to C
> once the operation is committed.
> Moreover, the FFI should be a procedural interface which makes no
> presumptions at all about the representation of Scheme values or of
> Scheme locations -- it need only, and therefore should only, impose
> constraints on the C reification of references to Scheme locations.
> Pika and JNI/minor satisfy these abstraction requirements.  Pika does
> so with less overhead and overhead at least comparable to the draft.
> The draft soundly fails to satisfy these abstraction requirements for
> reasons listed above.
> Later, I think we can make similar evaluations of the requirements for
> abstracting over Scheme's flow of control.

I haven't followed the whole discussion (from lack of time, not from
lack of interest), but what you propose above makes a lot of sense.

In the Gambit FFI there is no (documented) way to access Scheme
objects directly in C because of all the GC intricacies involved and
because I want to have the liberty to change the object
representation, GC, etc. in the future.  The Gambit FFI always converts
parameters and return values between the Scheme representation and the
C representation.  This is sufficient for most "normal" uses of the
FFI.  The Gambit FFI also allows the Scheme program to introduce new
types, and the associated Scheme-to-C and C-to-Scheme conversion
functions (note that these functions typically need to access the
Scheme representation).  In the Gambit system, the main cases where C
code must access Scheme objects directly are the implementation of
internal operations (for example to get at the encoding of an object,
the type tag, etc.) and cases where speed and minimal code bloat matter
(for example when it is known that a parameter or return value is an
integer that fits in a fixnum).

One interesting case I have come across where the ability to access
Scheme objects directly seems useful is when the concrete type depends
on some C compile-time switch and/or depends on the platform/OS.
Here's a simplified example from the Gambit runtime system.  To
implement file I/O the Gambit runtime relies on an "open_path" routine
written in C that calls the appropriate OS routine, i.e. "open" under
Unix, "CreateFile" under Windows, etc.  Under Unix, the file name is
of type "char*".  Under Windows (depending on a compile-time switch)
the file name is of type "unsigned char*" or "wchar_t*" (to support
Unicode file names).  Because the target platform/OS is not known when
the Scheme code is compiled (the generated C code must stay portable),
the Scheme compiler cannot generate C code that directly calls the
appropriate conversion function.  To get around this
problem, the file name parameter can be defined to be of type
"scheme-object" (rather than "nonnull-char-string",
"nonnull-ucs2-string", ...) and it is the C function "open_path" that
calls the appropriate conversion function.  This is not an elegant
solution because it is tedious to write the conversion code (in
"open_path") by hand.  A more elegant solution is to use the FFI's
type definition form.  The idea is to define a "FILENAME" type and use
that as the file name type in "open_path" and other I/O operations,
something like:

(define-c-type FILENAME "FILENAME") ; Scheme type name and C type name
(define-c-type STREAM "STREAM")     ; Scheme type name and C type name
(define-c-function (open-path FILENAME) STREAM "open_path")

To avoid the tedium of writing by hand the associated Scheme-to-C and
C-to-Scheme conversion functions (i.e. "FILENAME_FROM_SCHEME" and
"FILENAME_TO_SCHEME", "STREAM_FROM_SCHEME", ...), a way to define type
aliases on the C side of the FFI is useful so that one can write in a
C header file something like:

/* file: "my_types.h" */
#include "basic_types.h"
#ifdef UNICODE

[If someone knows of a shorter way to do the same with some fancy C
macros please let me know!]

and the function open_path is then written:

#include "my_types.h"
STREAM open_path (FILENAME fn) { ... }
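
As a rough illustration of the alias idea, here is a minimal
self-contained sketch in C.  It is not Gambit's or SRFI-50's actual
interface.  Everything in it is an assumption made up for the example:
the toy "scheme_string" representation, the basic conversions
"CHARSTRING_FROM_SCHEME" and "WCHARSTRING_FROM_SCHEME", the
"FILENAME_CHAR" helper type, the "open_path_glue" wrapper standing in
for FFI-generated code, and the stub body of "open_path".  Only the
names FILENAME, STREAM, FILENAME_FROM_SCHEME, open_path and the UNICODE
switch come from the text above.

```c
/* Hypothetical sketch: none of this is real Gambit or SRFI-50 API. */
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* ---- stand-in for "basic_types.h" ---- */

/* Toy Scheme string: a length plus an array of code points. */
typedef struct {
    size_t len;
    const unsigned int *chars;
} scheme_string;

/* Copy a Scheme string into a narrow C string; 0 on success, -1 on failure. */
int CHARSTRING_FROM_SCHEME(const scheme_string *s, char *out, size_t cap)
{
    if (s->len >= cap) return -1;            /* no room for the terminator */
    for (size_t i = 0; i < s->len; i++) {
        if (s->chars[i] > 0x7f) return -1;   /* toy encoding: ASCII only */
        out[i] = (char)s->chars[i];
    }
    out[s->len] = '\0';
    return 0;
}

/* Same conversion, targeting a wide C string. */
int WCHARSTRING_FROM_SCHEME(const scheme_string *s, wchar_t *out, size_t cap)
{
    if (s->len >= cap) return -1;
    for (size_t i = 0; i < s->len; i++) {
        if (s->chars[i] > (unsigned int)WCHAR_MAX) return -1;
        out[i] = (wchar_t)s->chars[i];
    }
    out[s->len] = L'\0';
    return 0;
}

/* ---- stand-in for "my_types.h": aliases chosen at C compile time ---- */
#ifdef UNICODE
typedef wchar_t FILENAME_CHAR;
#define FILENAME_FROM_SCHEME WCHARSTRING_FROM_SCHEME
#else
typedef char FILENAME_CHAR;
#define FILENAME_FROM_SCHEME CHARSTRING_FROM_SCHEME
#endif
typedef const FILENAME_CHAR *FILENAME;
typedef int STREAM;                          /* toy stream handle */

/* ---- user code, written once against the aliases ---- */
STREAM open_path(FILENAME fn)
{
    assert(fn != NULL);
    /* Stub: a real version would call open, CreateFile, etc. */
    return fn[0] ? 3 : -1;                   /* reject empty names */
}

/* Roughly what FFI-generated glue might do: convert, then call. */
STREAM open_path_glue(const scheme_string *name)
{
    FILENAME_CHAR buf[256];
    if (FILENAME_FROM_SCHEME(name, buf, sizeof buf / sizeof buf[0]) != 0)
        return -1;                           /* conversion failed */
    return open_path(buf);
}
```

The point of the sketch is that neither "open_path" nor the glue names
a concrete string type: compiling with or without -DUNICODE picks both
the C type and the conversion function, so the Scheme side never needs
to know the target platform.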

I believe something like this should be supported by SRFI-50.