Email list hosting service & mailing list manager

seeds Alex Shinn (08 Oct 2015 05:03 UTC)
Re: seeds taylanbayirli@xxxxxx (08 Oct 2015 07:55 UTC)
Re: seeds Arthur A. Gleckler (08 Oct 2015 17:46 UTC)
Re: seeds taylanbayirli@xxxxxx (08 Oct 2015 19:38 UTC)
Re: seeds Arthur A. Gleckler (08 Oct 2015 21:47 UTC)
Re: seeds taylanbayirli@xxxxxx (09 Oct 2015 08:09 UTC)
Re: seeds Kevin Wortman (09 Oct 2015 17:18 UTC)
Re: seeds taylanbayirli@xxxxxx (09 Oct 2015 18:33 UTC)
Re: seeds Kevin Wortman (13 Oct 2015 17:36 UTC)
Re: seeds taylanbayirli@xxxxxx (13 Oct 2015 18:28 UTC)
Re: seeds Alex Shinn (14 Oct 2015 01:59 UTC)
Re: seeds taylanbayirli@xxxxxx (14 Oct 2015 09:17 UTC)
Re: seeds taylanbayirli@xxxxxx (14 Oct 2015 10:06 UTC)
Re: seeds Alex Shinn (16 Oct 2015 01:23 UTC)
Re: seeds taylanbayirli@xxxxxx (16 Oct 2015 13:34 UTC)
Re: seeds Alex Shinn (16 Oct 2015 23:48 UTC)
Re: seeds taylanbayirli@xxxxxx (17 Oct 2015 12:08 UTC)
Re: seeds Alex Shinn (17 Oct 2015 13:12 UTC)
Re: seeds taylanbayirli@xxxxxx (17 Oct 2015 14:09 UTC)
What SRFIs are for John Cowan (17 Oct 2015 14:41 UTC)
Re: What SRFIs are for taylanbayirli@xxxxxx (17 Oct 2015 15:56 UTC)
Re: What SRFIs are for John Cowan (17 Oct 2015 16:55 UTC)
Re: What SRFIs are for taylanbayirli@xxxxxx (17 Oct 2015 18:08 UTC)
Re: What SRFIs are for John Cowan (17 Oct 2015 18:51 UTC)
Re: seeds John Cowan (15 Oct 2015 17:49 UTC)
Re: seeds Alex Shinn (09 Oct 2015 02:54 UTC)
Re: seeds taylanbayirli@xxxxxx (09 Oct 2015 07:59 UTC)
Re: seeds John Cowan (15 Oct 2015 17:51 UTC)
Re: seeds taylanbayirli@xxxxxx (15 Oct 2015 23:08 UTC)
Re: seeds John Cowan (16 Oct 2015 13:09 UTC)
Re: seeds taylanbayirli@xxxxxx (16 Oct 2015 14:01 UTC)

Re: seeds taylanbayirli@xxxxxx 16 Oct 2015 13:34 UTC

Alex Shinn <xxxxxx@gmail.com> writes:

> On Wed, Oct 14, 2015 at 6:09 PM, Taylan Ulrich Bayırlı/Kammer
> <xxxxxx@gmail.com> wrote:
>
>     It's not that existing implementations of hash functions don't
>     take seed arguments. It's that existing implementations of hash
>     tables don't pass seed arguments to hash functions.
>
> Ah, ok. You can still provide wrapper hash-tables to get around
> this, but I'm not particularly moved by implementation compatibility
> arguments. Existing library code matters more, but I'm willing to
> break that in favor of clean design, and anyway the two proposals
> are roughly equally compatible in this respect.

I think implementation compatibility is very important, because there is
already a huge disappointment in the RnRS among some parties, and making
incompatible changes for questionable benefit won't improve that case.
Likewise, backwards compatibility or even just familiarity with existing
RnRS and SRFIs puts us on a safe lane so long as we know they have no
serious flaws.

So far I'm afraid I'm still not convinced that these novel semantics for
hash functions solve any serious problem in SRFI-69 or R6RS hashtables.
The fact that pretty much no existing Scheme implementation does
anything like this should hint us that it's of questionable benefit.

>     If I'm not mistaken, most custom hash functions ultimately
>     delegate the true hashing calculation to one of the built-in hash
>     functions.
>
> One counter-example is string-nfc=? which compares strings with
> Unicode Normalization Form C.

I'm sure there's a handful of counter-examples, but they are rare
use-cases, can be easily elevated to "first class" status by an
implementation, or implemented portably without any real problem.

Compare

    (define (string-nfc-hash string bound salt)
      ...)

with

    (define (make-string-nfc-hash salt)
      (lambda (string bound)
        ...))

    (define string-nfc-hash
      (make-string-nfc-hash (hash-salt)))
    (define string-nfc-hash/double
      (make-string-nfc-hash (+ 1 (hash-salt))))

and

    (define table (make-hashtable string-nfc-hash string-nfc=?))

with

    (define table (make-hashtable (cons string-nfc-hash
                                        string-nfc-hash/double)
                                  string-nfc=?))

That's a small, constant amount of boilerplate for a rare use-case,
traded for improved compatibility and familiarity with all existing
implementations and specifications of hash tables in Scheme.

By the way, your proposal forces one to use the same algorithm for both
functions of double hashing, just with a different salt.  Might one not
want to choose a "cheaper" algorithm for the second hash function, to
make a certain correctness/performance trade-off?

>     Wouldn't you agree that the small theoretical benefit to be gained
>     in those very rare use-cases is no justification to expect every
>     Scheme implementation to make a change to the core of their
>     implementation?
>
> Absolutely not. "SRFI" stands for "Scheme Request For Implementation."
> The whole point is to provide new, better, standard implementations.

Emphasis on "small theoretical benefit in very rare use-cases," which is
not what we want to request being implemented, I hope.

>     From your explanation it sounds like SRFI_126_HASH_SEED is
>     sufficient for the described use-case, or am I missing something?
>
> Yes, you missed:
>
>> [...] An env variable is a poor way to handle this, since it places
>> additional restrictions on the test running framework (e.g. there is
>> currently no way in Snow to specify env vars to be set when running
>> tests).

I see.  I would say that's a limitation of Snow though, or rather R7RS,
since there's no 'set-environment-variable' and no 'system*'.

I'm also not sure how changing the hash function signatures helps with
this problem.  You still need a way to tell the hash table API to pass
deterministic salts to the functions.  I suppose you could create
wrapper hash functions which pass constant salts to the built-in hash
functions.  That's not a lot of work, but definitely more so than an
environment variable, command line switch, or similar.  And it doesn't
cover eq? and eqv? hashtables.  Did you have another solution in mind?

> I think tests should be self-explanatory, not depending on external
> factors, so that they can be run automatically by tools and editors
> and such.

Agreed, except in this case I see no way to achieve that which doesn't
have any of the other issues we went over (e.g. with parameters).

>     > And all of this is a minor point compared to the fact
>     > that the current SRFI-126 proposal is ugly, makes
>     > more work for everybody, and makes custom hashes
>     > second class citizens. I have no interest in this API.
>
>     "Ugly" on its own is purely subjective. Please clarify.
>
> Apart from the other points listed, I'd add the constructor
> argument which can be either a function or pair of functions.
> This is an unintuitive polymorphism, and arbitrarily limits to
> either 1 or 2 but not more (when more can be useful).

It might be that the notion of a "pair" is slightly abused here, but
still, it doesn't get much more intuitive than passing a pair of hash
functions to an API that needs a pair of hash functions.

I think it's OK to discriminate against the zero Scheme implementations
which require more than two hash functions for their hash table API. :-)

I'd like to reiterate some my priorities (or maybe I didn't make them
clear before):

- Write standards that won't stir up even more controversy than we
  already have, and instead have a good chance of adoption.

- Write standards that fulfill common use-cases in straightforward ways,
  instead of optimizing for hypothetical use-cases, so long as we don't
  get locked up in a local maximum.

(These are derived from the higher-level goal of "make standard Scheme
stop being a joke.")

If I find time I'll try to figure out what some other people would think
of changing the hash function signatures.  While I'm not convinced they
bring realistic improvements, they are also not as incompatible as I had
thought, so I'd like to let the majority of implementors decide.

Taylan