Email list hosting service & mailing list manager

Comments on SRFI 69 David Van Horn (11 Aug 2005 14:38 UTC)
Re: Comments on SRFI 69 Panu Kalliokoski (12 Aug 2005 09:28 UTC)
Re: Comments on SRFI 69 Panu Kalliokoski (12 Aug 2005 09:53 UTC)
Re: Comments on SRFI 69 felix winkelmann (12 Aug 2005 12:06 UTC)
Re: Comments on SRFI 69 David Van Horn (12 Aug 2005 19:09 UTC)
Re: Comments on SRFI 69 felix winkelmann (12 Aug 2005 20:12 UTC)
Re: Comments on SRFI 69 David Van Horn (12 Aug 2005 20:29 UTC)
Re: Comments on SRFI 69 felix winkelmann (12 Aug 2005 20:35 UTC)

Comments on SRFI 69 David Van Horn 11 Aug 2005 14:23 UTC

The SRFI document states the following in the Abstract:

    This SRFI specifies an API for basic hash tables. Hash tables are data
    structures that provide a mapping from some set of keys to some set of
    values associated to those keys.

 From your description of what hash tables are, the name "hash table" seems
too specific of a term.  This issue was raised earlier by Marc Feeley, and as
far as I can tell was never addressed in the document or discussion list.
Perhaps, I'm just missing it.  However, the more important issue here is that
what follows in the document is far from a basic API for structures that
provide a mapping from some set of keys to some set of values associated to
those keys.

One of the most common ways SRFIs go wrong is that their purpose is not
clearly articulated.  Without a clear thesis, it is impossible to evaluate
design choices and rationales, or even provide helpful suggestions.  Luckily
this SRFI does state its aims, however there are conflicts with what is
stated.  Further, design choices have been made that violate these aims, and
rationales rarely appeal to the aims in a consistent manner.

The SRFI document states the following in the Rationale:

    The primary aim of this SRFI is to provide a simple and generic hash table
    API that will answer most of users' needs for basic usage of hash tables.

This conflates two disparate and competing aims into one; that of providing a
simple and generic API for a data structure which "provides a mapping from
some set of keys to some set of values associated to those keys," and that of
covering the general usage patterns of hash tables (whatever those patterns
may be).  The abstract makes no mention of most common hash table usage, so
this second aim seems out of the scope of this SRFI.  But several choices are
made contrary to the aim of simplicity and generality, such as the ad-hoc
collection of type-specialized hash table procedures and their hash function
counterparts.  Appeals for generality in the API have been discounted by the
author saying such things as, "I'd rather define these routines to account for
the most common situation(s) and be done with it."

On the other hand, there is widespread use of immutable hash tables, tables
with weakly held keys, concurrency, and GC-sensitive tables, but this SRFI
addresses none of those common usages or the issues that arise in their
presence, and is therefore deficient on this second stated aim.

By conflating these aims, the design choices lack a clear purpose and often
seem to reflect the authors personal preferences rather than a reasoned
rationale.  Indeed, it is difficult if not impossible to evaluate a design
choice when there is not a clear and consistent aim for the SRFI.  I think
this document would greatly benefit a more explicit statement of its purpose,
resolving the conflict of the current aims.  As it exists now, the SRFI
neither provides a basic API for a key-value mapping datastructure, nor
provides an API covering most, or even common, users' hash table needs.

Also, I think a key aim that this SRFI should have, but does not, is the
selection of names that represent consistent conventions with existing Scheme
practices.

The SRFI document states the following in the Rationale:

    Hash tables are widely recognized as a fundamental data structure for many
    kinds of computational tasks. Almost every non-minimal Scheme
    implementation provides some kind of hash table functionality.

This is certainly true.  The majority of Scheme's I'm familiar with include a
hash table datastructure and their common operations, however the names vary
highly.  This highlights what should be a primary concern in the design of
this SRFI, but which has been neglected; to identify a consistent, and
portable set of names and parameter conventions.

The author has chosen the name and parameter conventions that run counter to
several existing Scheme conventions, such as previous SRFIs, RnRS, and
numerous Scheme implementations.  Some choices have no precedent whatsoever.
To choose such conventions is perfectly allowable, but the advantage of these
new conventions must be thoroughly articulated and compelling.  I don't think
that is the case here.  Further, if the aim of this goal is to cover common
hash table usage, unprecedented names and conventions run counter to this aim;
something which has never been used before is not common.

My preference for this SRFI is the following.  Drop the aim of covering common
hash table usage.  Writing such a SRFI is a very ambitious and difficult thing
to do, and it requires an extensive amount of surveying common use.  I would
expect such a SRFI to discuss the design choices taken by most Scheme
implementations, the several related SRFIs, as well as similar libraries from
related languages such as ML and Lisp.  SRFI 1 is a good example of such a
"common use" SRFI.  Shivers surveyed R4RS/R5RS Scheme, MIT Scheme, Gambit,
RScheme, MzScheme, slib, Common Lisp, Bigloo, guile, T, APL and the SML
standard basis in designing that library.  A good common use hash table SRFI
would need to do likewise.

Instead, this SRFI should focus on providing a simple and generic API for data
structures that provide a mapping from some set of keys to some set of values
associated to those keys.  All parts of this SRFI that do not contribute to
that aim should be dropped.  All rationales that do not appeal this aim,
should be abandoned.  I would like to see this API be consistent with the
existing datastructure API conventions that exist in Scheme.  Most notably,
this SRFI should be consistent in its choice of names and parameter
conventions with SRFI 44 [1].  If the author chooses against these
conventions, this SRFI then conflicts and competes with SRFI 44 (and others)
and as such the "rationale should explain why the present proposal is a
substantial improvement" over these existing conventions, as required by the
process document.

This SRFI will be an important one.  People will turn to it regardless of how
well or poorly constructed it is.  Without a clear and consistent aim, which I
believe is the case now, such a SRFI can do a great deal of harm.

David

[1] This issue of SRFI 44 names was raised as the first comment during the
discussion period by Scott Miller, to which Bear voiced criticism over SRFI 44
on implementation and usability grounds.  This is irrelevant.  SRFI 44
included a great deal of work on identifying the proper names for precisely
this kind of datastructure.  The choices include rationales, some of which
have been appealed to in deciding names in this SRFI.  If this SRFI is not
going to use the names of SRFI 44, it *must* include compelling rationales for
these names over the names (and rationales) identified in SRFI 44.  Many of
the choices made thus far in the SRFI directly conflict with SRFI 44,
sometimes in very confusing ways, such as hash-table-equivalence-function,
which a reader of SRFI 44 would expect to return an equivalence over the items
in the hash table collection, i.e. key value pairs, whereas
hash-table-key-equivalence-function would return what SRFI 69 returns for
hash-table-equivalence-function.