Re: hash function bounds summarized John Cowan 30 Nov 2015 04:26 UTC
Alex Shinn scripsit: > This assumption is plainly wrong as it depends on the actual range > of fixnums. However, R6RS mandates at least 24-bit fixnums, and looking at FixnumInfo, we see that all the widely used Schemes have at least 29-bit fixnums. 2^28-1 is a *lot* of buckets on a 32-bit machine. There are a few Schemes that have the same size fixnums on both 32-bit and 64-bit builds (Elk, IronScheme, Larceny, Mosh, RScheme), but I imagine it wouldn't be horrendously hard to change that, except perrhaps for IronScheme because of its CLR dependency. > What makes hash functions special is that they form a contract > interacting with the hash table size limit, a contract which is exposed > to third parties who can write their own hash functions. The concern > then is that we could inadvertently make it _impossible_ to write a > hash table implementation which supported certain hash tables which > would otherwise be possible, i.e. we could be forcing an artificial > upper bound. Well, yes. If a user-written hash function never delivers a result larger than 2^20, then hash tables that would naturally require more than 2^20 buckets aren't going to work properly. I don't see how you can avoid this. > but a majority of Schemes use 2 bits, (Technically only a plurality: 14 out of 39 in 32-bit mode, and in 64-bit only 8 out of 35.) > But I'm serious about the seed arguments. I am too, but I'm not convinced that an argument is the correct strategy. I still incline to the idea of a global seed that is set randomly before any user code executes. This disallows hyper-paranoid implementations with a distinct seed per hash table, but makes management of hash functions outside of hash tables much simpler. -- John Cowan http://www.ccil.org/~cowan firstname.lastname@example.org Híggledy-pìggledy / XML programmers Try to escape those / I-eighteen-N woes; Incontrovertibly / What we need more of is Unicode weenies and / François Yergeaus.