Re: Comparing Pika-syle and JNI-style Tom Lord 15 Jan 2004 01:59 UTC
    > From: Per Bothner <xxxxxx@bothner.com>

    > The problem Tom is referring to is (I assume) misidentifying a pointer
    > as a non-pointer.  That can happen if:
    > (a) You didn't tell the collector to scan the area containing the
    > pointer (most common problem).
    > (b) the pointer is "mangled", either through "clever" coding (such as
    > the xor-trick for double-linked lists) or an optimizing compiler
    > being too clever.  The former is a "don't do that".  The latter is
    > very rare, but can conceivably happen if the compiler generates an
    > offsetted pointer without leaving any reference to the actual
    > object start.  Boehm GC can be configured to also check "interior
    > pointers"; this reduces the problem, but this hurts performance.
    > See http://www.hpl.hp.com/personal/Hans_Boehm/gc/issues.html
    > especially the "Safety" section.

It was not quite either of those.  In this case, the optimizer wasn't
even being very clever -- it was just reusing a register.

Picture a Scheme string representation like this:

    scheme_value
    |
    V
    ---------------------------------------------
    | length, tag bits, gc bits | (char *)   o  |
    -----------------------------------------|---
                                             |
                                             V
                                      malloced string

That the string is separately allocated makes little difference here.
Were it inline with the GC-controlled object, the same kind of bug would
be just as likely to occur.  You could just as well picture:

    scheme_value
    |
    V
    --------------------------------------------------------
    | length, tag bits, gc bits | the string itself ....   |
    --------------------------------------------------------

Given the scheme_value, I read the address of the malloced (or inline)
string and operate on that.  At this point the scheme_value, if I'm not
otherwise using it, is a dead variable as far as C is concerned, so the
compiler is free to reuse its register and a conservative scan may no
longer find any reference to the object.  Should the scheme_value be
collected while I'm working on the string, the string data will be freed
out from under me.  I must take additional steps to keep the
scheme_value live.

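As a rough illustration only, the first layout above and the "keep it live" step might be sketched in C as follows. This is a sketch under assumptions, not any implementation's real code: every name in it (toy_string, toy_value, TOY_STRING_DATA, toy_make_string, toy_free_string, toy_keep_alive, toy_count_char) is hypothetical, and plain malloc/free stands in for a collected heap. The keep-alive helper relies on the fact that a store to a volatile object is an observable side effect, so the compiler must still hold the value at that point; in practice that keeps it in a register or stack slot where a conservative scan can find it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout matching the first diagram: one header word plus a
   pointer to separately malloc'ed character data. */
struct toy_string {
    size_t header;  /* stands in for "length, tag bits, gc bits" */
    char  *data;    /* malloc'ed characters, owned by this object */
};

typedef struct toy_string *toy_value;    /* stand-in for SCM */

#define TOY_STRING_DATA(v) ((v)->data)   /* stand-in for SCM_STRING_DATA */

/* Allocate a toy string object.  A real system would allocate it from the
   garbage-collected heap; malloc is used here only to stay self-contained. */
toy_value toy_make_string(const char *s)
{
    size_t n = strlen(s);
    toy_value v = malloc(sizeof *v);
    assert(v != NULL);
    v->header = n;              /* only the length is stored in this sketch */
    v->data = malloc(n + 1);
    assert(v->data != NULL);
    memcpy(v->data, s, n + 1);
    return v;
}

/* What a collector would do once it decides the object is garbage:
   the character data goes away together with the object. */
void toy_free_string(toy_value v)
{
    free(v->data);
    free(v);
}

/* Keep-alive sketch (assumption: one common way to do it, not the only
   one).  The volatile store cannot be optimized away, so the argument
   must still be available to the compiler when this call is reached. */
const void *volatile toy_keep_alive_sink;

void toy_keep_alive(const void *p)
{
    toy_keep_alive_sink = p;
}

/* The hazardous pattern: after the first line, only `data` is used, so
   without the final call `v` would be dead for the whole loop. */
size_t toy_count_char(toy_value v, char c)
{
    const char *data = TOY_STRING_DATA(v);
    size_t n = 0;

    for (; *data != '\0'; data++)
        if (*data == c)
            n++;

    toy_keep_alive(v);          /* v stays live up to this point */
    return n;
}
```

A caller would build a value with toy_make_string, pass it to toy_count_char, and release it with toy_free_string. The point of the sketch is only where toy_keep_alive sits: after the last use of the derived pointer, not before it.
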
Picture (buggy) code like:

    {
      SCM scheme_string = some_init ();
      char * data = SCM_STRING_DATA (scheme_string);

      /* ... do stuff that can cause GC but doesn't directly
         use scheme_string ... */

      return SCM_BOOL_F;
    }

It needs to be corrected at least to something like:

    {
      SCM scheme_string = some_init ();
      char * data = SCM_STRING_DATA (scheme_string);

      /* ... do stuff that can cause GC but doesn't directly
         use scheme_string ... */

      scm_remember (scheme_string);
      return SCM_BOOL_F;
    }

or even (depending on the details of the object representations and the
situation with async execution or threads):

    {
      SCM scheme_string = some_init ();
      char * data;

      scm_remember_pointer (&scheme_string);
      data = SCM_STRING_DATA (scheme_string);

      /* ... do stuff that can cause GC but doesn't directly
         use scheme_string ... */

      scm_remember_stuff ();
      return SCM_BOOL_F;
    }

at which point I stop and ask myself "Why is it, again, that I'm not
just using precise GC instead?"

    > Tom Lord wrote:
    >
    > > On a hunch, you review some of
    > > the functions that you think your program is exercising to an unusual
    > > degree and, sure enough -- find a conservative GC bug.
    >
    > What kind of "conservative GC bug"?  Is this with the Boehm GC?  Are
    > these C functions, Scheme functions, or what?  Is it an optimizer bug?
    > --

It was not with Boehm, but this and similar problems apply to Boehm as
far as I know.

The even worse problem, as far as I'm concerned, is that conservative
collectors (including Boehm) admit subtle malicious attacks that
programmers simply cannot protect themselves from (more in the
direction of failing to free values than of freeing them early).

I don't think that Boehm himself disagrees with any of my factual
claims -- we differ only in our subjective assessments of how serious
those are and of how one should promote conservative GC as a result.
And, to be sure, he's got the empirical edge on me if you measure
conservative GC by its economic value minus its economic costs -- so
far (and so far as we know).
My bet is that it's just a matter of time (within our lifetime) before
the balance shifts in my favor due to a malicious exploit (unless
conservative techniques simply fall out of favor).  Since conservative
GC is ultimately no easier to use than precise GC: why take that bet;
why accept that risk?  Why not just eliminate the issue by barring
conservative GC from all critical systems?

That conservative GC is vulnerable to malicious attack greatly skews
any attempt you might make to estimate the probability of a critical
failure: really, it's a function of the value to the attacker of that
failure.

-t