Hash procedures Marc Feeley 26 Apr 2005 14:39 UTC

SRFI-69 proposes the following hash procedures:

     hash            consistent with equal?
     string-hash     consistent with string=?
     string-ci-hash  consistent with string-ci=?
     symbol-hash     consistent with eq? on symbols only

These names seem to have been chosen to highlight the type
of object being hashed.  Better would be
a naming convention that makes explicit the link between the
hash procedure and the comparison procedure (after all there
are many application-specific ways to define the equality of
two values of a given type).  I would suggest the following
(taken from Gambit):

     Proposed          SRFI-69
     equal?-hash       hash
     eqv?-hash         no equivalent
     eq?-hash          no equivalent, but subsumes symbol-hash
     string=?-hash     string-hash
     string-ci=?-hash  string-ci-hash

This consistent naming scheme reduces the programmer's
"intellectual clutter".
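
As a rough illustration, the renaming can be tried on top of an
existing SRFI-69 implementation with plain aliases.  This is only a
sketch: it assumes an R7RS system that exposes SRFI-69 as the library
(srfi 69), and the variable name "table" is made up for the example.
On a system that already provides these names (Gambit, for instance),
the definitions are unnecessary.

```scheme
(import (scheme base) (srfi 69))

;; Aliases that give the SRFI-69 procedures the comparison-based names.
(define equal?-hash      hash)
(define string=?-hash    string-hash)
(define string-ci=?-hash string-ci-hash)

;; The hash procedure's name now mirrors the equivalence predicate
;; passed right next to it.
(define table (make-hash-table string=? string=?-hash))
(hash-table-set! table "apple" 1)
(hash-table-ref/default table "apple" #f)
```

With this convention, a mismatched pair such as string-ci=? with
string=?-hash is visible at the call site.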

I suggest dropping the symbol-hash procedure in favor of the more
general eq?-hash procedure.  Moreover, eqv?-hash can be used
when the keys are numeric (re: make-integer-hash-table).

I also suggest adding the constraint that hashing a string
x with string=?-hash and a symbol y with equal?-hash, eqv?-hash,
or eq?-hash, must yield the same hash number when
(string=? x (symbol->string y)).  This is useful
when dealing with textual data in a mixed string/symbol
representation.  If you need a key comparison procedure
such as

    (define (text-equal? x y)
      (string=?
        (if (string? x) x (symbol->string x))
        (if (string? y) y (symbol->string y))))

then you can still use the default equal?-hash.
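
To make the constraint concrete, here is a minimal sketch in portable
R7RS Scheme.  The procedures sketch-string=?-hash and
sketch-equal?-hash are hypothetical stand-ins written for this
example; they are not the algorithms used by Gambit or by any SRFI-69
implementation, and the polynomial hash is an arbitrary choice.

```scheme
(import (scheme base))

;; Hypothetical stand-in for string=?-hash: a simple polynomial hash,
;; kept bounded with modulo.
(define (sketch-string=?-hash s)
  (let loop ((i 0) (h 0))
    (if (= i (string-length s))
        h
        (loop (+ i 1)
              (modulo (+ (* 31 h) (char->integer (string-ref s i)))
                      16777216)))))

;; Hypothetical stand-in for equal?-hash, restricted to strings and
;; symbols: a symbol hashes like its name, which is the constraint
;; suggested above.
(define (sketch-equal?-hash x)
  (cond ((string? x) (sketch-string=?-hash x))
        ((symbol? x) (sketch-string=?-hash (symbol->string x)))
        (else (error "sketch-equal?-hash: unsupported key" x))))

;; The key comparison procedure from the message.
(define (text-equal? x y)
  (string=?
    (if (string? x) x (symbol->string x))
    (if (string? y) y (symbol->string y))))

;; Keys that text-equal? treats as equal receive the same hash value.
(define same-hash?
  (= (sketch-equal?-hash "foo") (sketch-equal?-hash 'foo)))
```

Because "foo" and 'foo hash identically here, a table keyed with
text-equal? can use this one hash procedure for both representations.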

Marc