seeds
Alex Shinn
(08 Oct 2015 05:03 UTC)
|
Re: seeds
taylanbayirli@xxxxxx
(08 Oct 2015 07:55 UTC)
|
Re: seeds
Arthur A. Gleckler
(08 Oct 2015 17:46 UTC)
|
Re: seeds
taylanbayirli@xxxxxx
(08 Oct 2015 19:38 UTC)
|
Re: seeds
Arthur A. Gleckler
(08 Oct 2015 21:47 UTC)
|
Re: seeds
taylanbayirli@xxxxxx
(09 Oct 2015 08:09 UTC)
|
Re: seeds
Kevin Wortman
(09 Oct 2015 17:18 UTC)
|
Re: seeds
taylanbayirli@xxxxxx
(09 Oct 2015 18:33 UTC)
|
Re: seeds
Kevin Wortman
(13 Oct 2015 17:36 UTC)
|
Re: seeds
taylanbayirli@xxxxxx
(13 Oct 2015 18:28 UTC)
|
Re: seeds
Alex Shinn
(14 Oct 2015 01:59 UTC)
|
Re: seeds taylanbayirli@xxxxxx (14 Oct 2015 09:17 UTC)
|
Re: seeds
taylanbayirli@xxxxxx
(14 Oct 2015 10:06 UTC)
|
Re: seeds
Alex Shinn
(16 Oct 2015 01:23 UTC)
|
Re: seeds
taylanbayirli@xxxxxx
(16 Oct 2015 13:34 UTC)
|
Re: seeds
Alex Shinn
(16 Oct 2015 23:48 UTC)
|
Re: seeds
taylanbayirli@xxxxxx
(17 Oct 2015 12:08 UTC)
|
Re: seeds
Alex Shinn
(17 Oct 2015 13:12 UTC)
|
Re: seeds
taylanbayirli@xxxxxx
(17 Oct 2015 14:09 UTC)
|
What SRFIs are for
John Cowan
(17 Oct 2015 14:41 UTC)
|
Re: What SRFIs are for
taylanbayirli@xxxxxx
(17 Oct 2015 15:56 UTC)
|
Re: What SRFIs are for
John Cowan
(17 Oct 2015 16:55 UTC)
|
Re: What SRFIs are for
taylanbayirli@xxxxxx
(17 Oct 2015 18:08 UTC)
|
Re: What SRFIs are for
John Cowan
(17 Oct 2015 18:51 UTC)
|
Re: seeds
John Cowan
(15 Oct 2015 17:49 UTC)
|
Re: seeds
Alex Shinn
(09 Oct 2015 02:54 UTC)
|
Re: seeds
taylanbayirli@xxxxxx
(09 Oct 2015 07:59 UTC)
|
Re: seeds
John Cowan
(15 Oct 2015 17:51 UTC)
|
Re: seeds
taylanbayirli@xxxxxx
(15 Oct 2015 23:08 UTC)
|
Re: seeds
John Cowan
(16 Oct 2015 13:09 UTC)
|
Re: seeds
taylanbayirli@xxxxxx
(16 Oct 2015 14:01 UTC)
|
Alex Shinn <xxxxxx@gmail.com> writes: > On Wed, Oct 14, 2015 at 3:28 AM, Taylan Ulrich Bayırlı/Kammer > <xxxxxx@gmail.com> wrote: > > If you mean the proposal of changing the signature of hash > functions to hash(object, bound, seed), I explained some issues > with that in previous mails on the list. (Please correct me if any > of my explanations was wrong.) > > The explanation as I understand is simply that existing > implementations don't take seed arguments. This seems > a non-issue to me since as I explained, the lazy implementor > can simply provide a wrapper which ignores the seed. Once > you add the seed argument internally, however, you get all > the benefits of security and multi-hashing which you didn't > have before. It's not that existing implementations of hash functions don't take seed arguments. It's that existing implementations of hash tables don't pass seed arguments to hash functions. There is no lazy "surface-only" fix to that, AFAICT; it will require changes to the core of an implementation's hash table implementation, e.g. scm_hash_fn_get_handle in hashtab.c of libguile. Compare that to my SRFI-126 implementation for Guile which is purely a Scheme module. > > And the only complaint lodged against using a parameter object > > seems to be that referencing it costs one procedure call, which > > seems perfectly acceptable for an infrequent operation in a > > functional-leaning language. > > I would currently summarize my main complaints as: > > - Things will crash and burn when a hash table is operated on from > two dynamic extents with different values for the hash salt. > > John Cowan previously suggested a parameter, and I > consider that idea broken for this very reason. The seed > should be hidden from the user. However, your proposal > of making every custom hash function author maintain > their own seed handling logic is similarly a recipe for bugs. If I'm not mistaken, most custom hash functions ultimately delegate the true hashing calculation to one of the built-in hash functions. For instance my application might have a record type to represent certain objects in the real world (or at least outside of Scheme semantics); one such real object could be represented in my program by any number of instances of that record type; an instance only needs to hold a unique identity value in an "id" field to be considered a Scheme representation of the real object identified by that identity value. (For instance, a <student> record type where the id field holds the unique student ID.) In that case I want equality on my records to be defined in terms of their ID value and not their eq?/eqv? equality. So I will define (same-student? a b) as (equal? (student-id a) (student-id b)), and (student-hash student bound) as (equal-hash (student-id student) bound). In my experience this is the most common use-case for custom equality and hashing. I've never encountered any other use-case. Of course my experience is pretty limited. I would be interested in learning of other use-cases. It's just that so far I'm assuming that any use-case which involves using *truly* custom hashing is going to be fairly obscure and involved, such that expecting the programmer to take care of salting without any API help is acceptable. In fact, I don't really see much of a difference in first place, between passing a salt to a custom hash function, and expecting it to derive one from SRFI_126_HASH_SEED. The latter is merely a bit of annoyance to get an integer from a string. A user so security-unconscious as to skip that is likely also going to plainly ignore the salt value passed to the custom hash function. Wouldn't you agree that the small theoretical benefit to be gained in those very rare use-cases is no justification to expect every Scheme implementation to make a change to the core of their implementation? > - Since the only real use-case is determinism for unit tests > > No, unit tests are a minor use case. Any tests involving > hash tables should be written to not depend on the order > of elements in the table. The only exception is testing the > hash functions themselves, for which you can't do much > better than: > > (test <expected> (my-hash obj seed bound)) > > for several values. Without an explicit seed you need > some way to specify the global value. An env variable > is a poor way to handle this, since it places additional > restrictions on the test running framework (e.g. there > is currently no way in Snow to specify env vars to be > set when running tests). > > But the primary motivation for determinism is in scientific > computing, where you want resumable and/or verifiable > results. One important case is mapreduce, where when > running a long computation on thousands of machines > at once, hardware failure is the norm rather than the > exception. This can be addressed by running every > computation twice, and rerunning when the results differ. > As in tests, you can take effort to sort outputs and > explicitly remove dependence on hash order, but you > may be using third-party libraries which you can't change, > or may not be able to afford the extra computation. > This is where deterministic hashing can be useful. From your explanation it sounds like SRFI_126_HASH_SEED is sufficient for the described use-case, or am I missing something? > And all of this is a minor point compared to the fact > that the current SRFI-126 proposal is ugly, makes > more work for everybody, and makes custom hashes > second class citizens. I have no interest in this API. "Ugly" on its own is purely subjective. Please clarify. I've explained that SRFI-126 makes little extra work (especially compared to your proposal) for Scheme implementations, and showcased this on one implementation so far. There is a big difference between being able to support an SRFI purely as a library, and through changes to the core of an implementation. I'm looking forward to hearing an explanation of how it is a problem to get a salt from SRFI_126_HASH_SEED instead of receiving one as an argument, in the truly-custom-hashing use-cases you have in mind which I don't know of at all yet. We can even specify (get-hash-salt) if you want, then the difference will be truly trivial; a matter of wrapping your hash function body in (let ((salt (get-hash-salt))) ...) and voila. Custom hash functions for double-hashing implementations are still "second-class" then in the sense that a user needs to provide two procedures, but *even* then, the difference is from (define (foo-hash obj bound salt) ...) to (define (make-foo-hash salt) (lambda (obj bound) ...)) (define foo-hash1 (make-foo-hash x)) (define foo-hash2 (make-foo-hash y)) and the second is backwards compatible with R6RS, compatible with all existing Scheme implementations, compatible even with SRFI-69, and otherwise familiar to programmers. Taylan