Re: must SRFI-128 mustard be taken seriously?

Re: must SRFI-128 mustard be taken seriously? William D Clinger 11 May 2016 00:44 UTC
John Cowan wrote:
> > so h0 must be a fixed point of f1.  In like manner, all of the hash
> > values that can be computed for all other circular structures would
> > have to be fixed points of f1.
>
> I follow your reasoning up to the last point.

I'm basically saying the same argument holds for circular lists
as well as for vectors.  The argument would also hold for any
other structures that might be circular, such as procedures
whose representations involve circularity (as is often the case
because that's a standard way to implement recursion and scope).

> I think the requirements
> are satisfied if hashing a vector with one element produces the same hash
> value as the element.  Is this extraordinarily expensive?

It precludes the standard technique for implementing terminating
hash functions that are compatible with equal?.  That implies it
precludes the standard technique for implementing a hash function
that's compatible with SRFI 126.  As consequence, the SRFI 128
default-hash is likely to be both more dangerous (because it may
not terminate) and substantially slower than SRFI 126's equal-hash.
Those two significant disadvantages of SRFI 128's default-hash do
not appear to have any countervailing advantages.

It is of course possible to implement an always-terminating version
of default-hash that conforms to the SRFI 128 mustard, but the only
practical way I know to do it would use hash tables, making it even
slower and more memory-intensive than it is now.

To give you an idea of how serious this is, the sample implementation
of SRFI 128 defines a default-hash procedure that's more than 500
times as slow as the equal-hash procedure of SRFI 126, without
delivering any significant improvement in hash quality (7.02 vectors
per hash bucket instead of 7.04).  Those measurements were made over
all of the non-circular vectors present within Larceny's R7RS heap.
(The restriction to non-circular vectors emphasizes another significant
advantage of SRFI 126 equal-hash over SRFI 128 default-hash.)

I can't quantify the extent to which SRFI 126 equal-hash outperforms
SRFI 128 default-hash on non-circular lists, because the default-hash
procedure defined by the sample implementation of SRFI 128 uses so
much memory that it exceeds Larceny's two-gigabyte limit.  SRFI 126's
equal-hash requires very little memory when it runs that same benchmark
to completion.

> The intention
> is simply to assure that the hash of an object containing more than one
> element depends on the hashes of the elements.

If that's the intent, then the SRFI 128 mustard was intended to
preclude efficient implementation of an always-terminating version
of default-hash, because the standard way to achieve an efficient
always-terminating hash function looks at only a bounded portion
of its argument.

> > Can anyone even conceive of writing a program that wouldn't work if
> > default-hash and SRFI 126 equal-hash were the same procedure?
>
> It seems to me that the chief difference is that default-hash must respect
> any registered comparators, whereas equal-hash has implementation-defined
> behavior when applied to records or other non-standard Scheme objects.

If it weren't for the SRFI 128 mustard that effectively insists
upon having default-hash examine the entire structure it's hashing,
it would be easy to add that feature.

From what you're saying, it sounds as though I'll have to deprecate
SRFI 128 (which implies deprecation of SRFI 125 as well).

Will