A proposed new approach to bounds vs. salt John Cowan (26 Dec 2015 02:12 UTC)
|
Re: A proposed new approach to bounds vs. salt
taylanbayirli@xxxxxx
(26 Dec 2015 14:00 UTC)
|
Re: A proposed new approach to bounds vs. salt
Alex Shinn
(30 Dec 2015 05:37 UTC)
|
Re: A proposed new approach to bounds vs. salt
John Cowan
(03 Jan 2016 22:26 UTC)
|
Re: A proposed new approach to bounds vs. salt
Shiro Kawai
(04 Jan 2016 02:08 UTC)
|
Re: A proposed new approach to bounds vs. salt
Alex Shinn
(07 Jan 2016 04:26 UTC)
|
Re: A proposed new approach to bounds vs. salt
Kevin Wortman
(15 Jan 2016 00:45 UTC)
|
Well, I hope this doesn't get drowned in the flood of holiday-related Scheme mail, but I've been thinking about this for a while and this is my chance to get it written down. This is about hash functions, which are important for both hash tables and for independent use, which is why I'm cross-posting this to the lists for both flavors of hash tables currently being discussed as well as to the comparator list, since hash functions are an element of comparators. To recap the existing situation with bounds: R6RS and SRFI 126 have no notion of it, and SRFI 69 does not actually require that a hash function accepts it, though it does require that the four standard hash functions that a SRFI 69 implementation must provide must accept it and respect it. The SRFI 69 *sample* implementation always provides it and errors out if a hash function does not accept it and respect it, but this is not true of all other SRFI 69 implementations. Consequently, a reusable hash function needs to accept a mandatory object argument and a second optional argument. I'm going to claim that bounds aren't really necessary for hash functions; a hash function should simply strive to return a number in the range from 0 to the largest conceivable number of elements in a data structure, which is what SRFI 69 says hash functions should do if they don't have a bound argument. Obviously, that's implementation-dependent, and so I suggest that it be made available as the value of the function (hash-default-bound). This eliminates any need to pass it to an individual hash function. Salt is another matter: it's not provided for in any existing Scheme standard, but many other languages provide it as a defense against denial of service attacks. However, those languages typically don't expose their hash functions, nor do they permit tables with arbitrary hash functions, so some innovation is necessary. The hash function must be aware of the salt when it starts working, so that the salt value gets hashed by the function along with the object. Now the simplest approach is simply to pass the salt as an argument to the hash function, but it easily gets mixed up with passing the bounds. I'm hopeful that as bounds go out, salt will come in, but I want to avoid misinterpretation. So I kicked around various ideas and came up with this: Bounds are always positive. So let's make salt always negative, or else not an exact integer at all. When a hash function compliant to SRFI 125/6/8 receives a positive exact integer as its second argument, it should treat it as bounds, but is not obliged to do so unless the programmer wants to make it compatible with the SRFI 69 reference implementation. But otherwise the second argument is interpreted as salt. If there is no second argument, then the bound is the value of (hash-default-bound) and the salt is the value of (hash-default-salt), which is also implementation-dependent but should be changed by the implementation on each run. For debugging purposes, the implementation may also provide an implementation-dependent way to set the value returned by (hash-default-salt), such as examining a particular environment variable. Comments? -- John Cowan http://www.ccil.org/~cowan xxxxxx@ccil.org Fundamental thinking is ha-ard. Let's go ideology-shopping. --Philosopher Barbie