> From: Eric Knauel <xxxxxx@informatik.uni-tuebingen.de>

> On Thu 15 Jan 2004 00:16, Tom Lord <xxxxxx@emf.net> writes:

>> I sampled some of the C code in a version of SCSH that I have on hand
>> (0.5.2 -- sorry, a download for a more recent version was taking _way_
>> too long so I'll risk being embarrassed that everything has changed
>> since then).

> Actually, that changed completely in the 0.6-series, it's almost
> exactly the FFI Scheme48 is using.  That's why migration of the
> existing bindings for scsh *0.6-series* and Scheme48 is easy.

Ok, then -- the motivation of the authors is now somewhat clearer to
me.

I have 0.6.5 now.  I'm looking at the revamped posix_regexp_match (in
regexp1.c) and notice that:

~ it doesn't GC protect its parameters as required by the SRFI

  (BTW: this appears to _not_ be a bug in the context of s48 because
  of assumptions the code makes about what can and can't cause
  collection.  However, a simplistic conversion of this code to the
  analogous draft-FFI functions would, indeed, have a bug in this
  regard.)

~ it assumes that STRING_LENGTH returns an integer (SRFI says long)

~ it uses s48_raise_range_error which the SRFI doesn't provide

~ it contains the code:

      s48_raise_range_error (sch_start,
                             s48_enter_fixnum (0),
                             s48_enter_fixnum (len))

  There is no _enter_fixnum in the draft and, properly, there is no
  number-constructing function in the draft which is not in the
  "(may GC)" category.  Yet that code is not GC-safe if
  s48_enter_fixnum is replaced by a possibly GC-causing function.

In syscalls1.c:

~ an instance of comparing to S48_FALSE using !=, an instance of
  comparing to S48_TRUE using !=, and two instances of comparing to
  S48_TRUE using ==

~ general assumption that s48_extract_string is not in the "may GC"
  class

  Of course the draft agrees with that but I point it out here to
  emphasize that the draft is fragile in this sense.
  If the primary motivation is to be able to publish a few 10K LOC
  from SCSH under a SRFI FFI then either the draft _can_not_ change
  extract_string to "may GC" or all of that code must be reviewed and
  fixed.

~ more use of error signalling functions not provided by the draft

~ this code, which is incorrect under the current interpretation of
  the draft (because it is incompatible with copying collection):

      s48_cons (sch_result_cutime,
                s48_cons (sch_result_cstime, S48_NULL))

> > > - most of scsh
> > > - bindings for ODBC (also for scsh)
> > > - bindings for NIS and LDAP (also for scsh)

> > I'd appreciate it if you could say more about this: quantities of
> > code, filenames and distributions containing them, and what you think
> > the effort of migration from native-scsh to draft-ffi would involve.

Thank you for replying to that, btw.

> The scsh CVS repository at sourceforge.net contains ODBC and LDAP
> bindings in the modules scsh-ldap[1] and the directory
> scsh/scsh/odbc[2].  The LDAP bindings are almost complete and about
> 1200 LOC C-code and 1100 LOC Scheme-code (about 300 LOC automatically
> generated).  The ODBC bindings consist of about 3000 LOC C-code
> (partially tricky) and about 2000 LOC Scheme-code.

> Currently, I'm busy cleaning up the ODBC bindings and changing them to
> use the SRFI 34/35 exception system.  Building the c-stub as a shared
> module that can be dlopen()'ed by scsh and Scheme48 is also on my
> list.

> I'm very confident that migrating those bindings to the SRFI-FFI is
> not much work.  Checking whether the GC annotations are (still)
> correct and a few search/replace-operations should be enough.

    (1200 - 300) + 3000 * trickiness_bonus ~= 7000

I'm confident too that migrating s48 bindings to the draft is not, in
some sense, much work.  That isn't my point.
I have two points, actually:

1) The kinds of bugs I found in syscalls1.c and regexp1.c are a big
   deal in at least three respects:

   a) They suggest that, to the degree that rapidly releasing this
      code under the draft FFI is a priority for the authors, the
      draft is constrained _by_this_code_ to not change in what would
      otherwise be some fairly minor ways.  (For example, that
      _extract_string might GC.)

      In other words, the degree of value the authors place on
      getting this particular code out easily is the degree to which
      they face a conflict of interest when it comes to modifying the
      draft.

   b) These bugs include some that _will_ be bugs under the draft FFI,
      such as the pervasive assumption that enter_fixnum cannot GC
      and the occasional vestige of C == and != comparisons to certain
      "constants".  The nested calls to s48_cons are another example.

   c) The style of the code in posix_regexp_match -- in particular
      that it is written with very strong assumptions (stronger than
      the draft's, in fact) about when GC can occur -- suggests to me
      that (i) the proposed FFI is fairly hard to use and (ii) it's
      very fragile and constraining of implementors.

      The trickiness that (in s48, not in the draft) permits
      parameters to go unprotected in posix_regexp_match is an
      example of why the proposed interface is hard to use well.
      That this same code becomes wrong under the fairly minor
      differences between the s48 FFI and the draft illustrates how
      fragile the draft is.

2) I don't mean to diminish the work that has gone into this stuff
   but we seem to be talking about, what, 20K LOC all told?  That's
   20K LOC that, to be correct under the draft, will have to be
   reviewed for the kinds of errors I found in syscalls1.c and
   regexp1.c.

   Meanwhile -- what happens if (a) the draft is finalized; (b) a
   bunch of implementors provide it; and (c) by hook or by crook a
   certain amount of the SCSH code winds up being widely used?  Then
   we have a superficially credible Scheme FFI contradicted only by
   the discussions on this list.
   Will it then be considered a success if, a few months later,
   instead of 20K LOC depending on it we have, scattered in various
   projects, 200K LOC depending on it?

-t