Re: Comments on SRFI 69
Panu Kalliokoski (12 Aug 2005)
On Thu, Aug 11, 2005 at 10:23:47AM -0400, David Van Horn wrote:
> The SRFI document states the following in the Abstract:
> This SRFI specifies an API for basic hash tables. Hash tables are data
> structures that provide a mapping from some set of keys to some set of
> values associated to those keys.

I can't help feeling a little bit perplexed. (1) The sentence following your citation was part of the definition of what hash tables are. (2) I thought item (1) was obvious. (3) If I had to write every definition in one sentence only, think what that would make the definition of "appropriate hash function" look like.

> From your description of what hash tables are, the name "hash table" seems
> too specific of a term. This issue was raised earlier by Marc Feeley, and

Rather on the contrary, it's too broad. As you point out, not all hash table types allow destructive update, for instance. However, my claim of "broad applicability" still holds, because one can use SRFI-69 hash tables as if they were immutable.

> as far as I can tell was never addressed in the document or discussion

It was addressed in http://srfi.schemers.org/srfi-69/mail-archive/msg00021.html (part ...naming)

> list. Perhaps, I'm just missing it. However, the more important issue here
> is that what follows in the document is far from a basic API for structures
> that provide a mapping from some set of keys to some set of values
> associated to those keys.

Yes, yes. The sentence you quoted is just an elaboration of this claim: "hash tables are mappings". But that's not all hash tables are. SRFI 69 does not attempt to define a basic API for mappings, but a basic API for hash tables. The constraints mentioned in the SRFI -- the complexity constraints and the requirement for enumeration and updating functions -- constrain any implementation to be pretty close to what I think users expect when they think they are using "hash tables". The third sentence of the abstract summarises it quite cleanly, IMO. Examples of implementation strategies that are ruled out by the API/complexity requirements include alists, search trees, copy-on-write hash tables, chained functions, and the like. So SRFI 69 definitely is not a definition of an API for mappings.

> Luckily this SRFI does state its aims, however there are conflicts with
> what is stated. Further, design choices have been made that violate these
> aims, and rationales rarely appeal to the aims in a consistent manner.

I can't understand what you're referring to, unless you really take "hash table" in the SRFI to refer to "mapping", ignoring the third sentence of the abstract, which I find hard to believe.

> The primary aim of this SRFI is to provide a simple and generic
> hash table API that will answer most of users' needs for basic
> usage of hash tables.
>
> This conflates two disparate and competing aims into one; that of providing

You know, a wording like this means that the API will try to be all of: simple, generic, and broadly applicable. It does not claim that these goals are the same; rather, it means that the SRFI aims to fulfil all of them simultaneously (with compromises, if need be).

> patterns may be). The abstract makes no mention of most common hash table
> usage, so this second aim seems out of the scope of this SRFI. But several

If you want to give examples of typical hash table usage, I can add them to the abstract, no problem.
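Something along these lines, say -- just a sketch of what I would call typical usage, using only procedures already in the draft:

    ;; count word occurrences with a SRFI 69 hash table
    (define counts (make-hash-table equal?))

    (define (count-word! word)
      ;; add 1 to the old count, treating a missing key as 0
      (hash-table-update!/default counts word (lambda (n) (+ n 1)) 0))

    (for-each count-word! '("a" "b" "a" "c" "a"))

    (hash-table-ref/default counts "a" 0)  ; => 3
    (hash-table-ref/default counts "z" 0)  ; => 0
    (hash-table->alist counts)             ; => key/count pairs, in some order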
But did somebody perhaps complain that what lists are is not articulated clearly enough in SRFI 1 because the abstract does not have examples of the most common list usage? I think most common hash table usage is approximately as difficult to define as most common list usage.

> choices are made contrary to the aim of simplicity and generality, such as
> the ad-hoc collection of type-specialized hash table procedures and their
> hash function counterparts. Appeals for generality in the API have been
> discounted by the author saying such things as, "I'd rather define these
> routines to account for the most common situation(s) and be done with it."

And now I can't help feeling a little bit offended. You ignore almost half of the abstract, then proceed to claim that one clause in my purpose statement is void because it does not have a supporting definition in the abstract, _then_ use this as a rationale for why the API does not meet the purpose statement. _And_ you ignore the fact that simplicity and generality are still conflicting goals, and say that appeals for generality have been "discounted" because I sometimes choose simplicity over generality. Moreover, you seem to be ignoring that I'm trying to make a generic hash table API, not a generic mapping API. Might you have a suggestion for how the SRFI could be improved? Add a clause to the abstract that more precisely specifies what hash tables are? But it's there already...

> On the other hand, there is widespread use of immutable hash tables, tables
> with weakly held keys, concurrency, and GC-sensitive tables, but this SRFI
> addresses none of those common usages or the issues that arise in their
> presence, and is therefore deficient on this second stated aim.

Now, this is the first piece of concrete criticism that I see in this posting. So, why were these not addressed?

Immutable hash tables: because immutable hash tables are IMO adequately addressed with mutable hash tables (see the sketch below).

Weak-key (or weak-value) hash tables: I think the issues of weak references and hash tables are quite orthogonal. Besides, there is no SRFI for weak references to standardise on.

Concurrency: because SRFI 18 says: "It is the responsibility of the application to avoid write/read and write/write races through appropriate use of synchronisation primitives." IOW, concurrency in Scheme is not (and IMO should not be) connected to general-purpose data structures.

GC-sensitive hash tables: I don't know what they are. Care to clarify?

Shortly put, I think immutable hash tables are covered, while the other issues are orthogonal to hash tables. IOW, the latter are not "common uses of hash tables", but common uses of separate language constructs with hash tables.

> As it exists now, the
> SRFI neither provides a basic API for a key-value mapping datastructure,
> nor provides an API covering most, or even common, users' hash table needs.

Summa summarum, I think your first statement is based on a bad assumption (caused by ignoring the third sentence of the abstract), while your second statement is false.
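To be concrete about the immutability point: nothing forces a program to touch the mutating procedures once a table has been built, and hash-table-copy lets you hand out a table that the receiver may mutate without affecting yours. A sketch (the names are made up for the example):

    ;; built once, thereafter only read from -- effectively immutable
    (define color-codes
      (alist->hash-table '((red . 1) (green . 2) (blue . 3))))

    (define (color->code c)
      (hash-table-ref/default color-codes c #f))

    ;; if a caller must not be able to change our table, give it a copy
    (define (color-code-snapshot)
      (hash-table-copy color-codes))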
> Also, I think a key aim that this SRFI should have, but does not, is the
> selection of names that represent consistent conventions with existing
> Scheme practices.

What kind of Scheme practices are you referring to? Standard Scheme with its vector and string types? Do you think they are not accounted for? SRFI conventions? Which SRFIs am I particularly at odds with? SRFI 44? Is there _any_ mapping API that follows SRFI 44? Existing implementations? Can you show me a way to match the naming in existing implementations more closely? Existing user code? Which portion of the world's Scheme code have you seen? Your criticism is full of grave concerns which are, however, not stated clearly enough for me to do anything about them.

> The SRFI document states the following in the Rationale:
> Hash tables are widely recognized as a fundamental data structure
> for many kinds of computational tasks. Almost every non-minimal
> Scheme implementation provides some kind of hash table
> functionality.
>
> This is certainly true. The majority of Scheme's I'm familiar with include
> a hash table datastructure and their common operations, however the names
> vary highly. This highlights what should be a primary concern in the
> design of this SRFI, but which has been neglected; to identify a
> consistent, and portable set of names and parameter conventions.

You see half of the point. (BTW, what you say here is also written in the SRFI, right after the paragraph you quote.) The primary concern of this SRFI is to unify the different hash table concepts into a portable whole. But while it's mostly a naming issue, it's also an issue of partially differing views on what a hash table really is. As a result of those differing views, different implementations expose the implementation in the API to varying degrees. For instance, guile does not have an equivalent of (make-hash-table); its documentation defines a hash table to be a vector /tout court/. My view of a hash table is external, governed by the API together with its complexity constraints. If you can show how this is not clear from the SRFI, please tell me. Or if you think this is a flawed picture, tell me what hash tables should be in your opinion. But it's not an option to _only_ define names for the various operations in the SRFI, leaving e.g. the creation of hash tables as an implementation-dependent detail: one could not write portable programs with such a hash table SRFI (see the example further below).

> The author has chosen the name and parameter conventions that run counter
> to several existing Scheme conventions, such as previous SRFIs, RnRS, and
> numerous Scheme implementations. Some choices have no precedent

Now, please point these out and make suggestions for improvements.

> My preference for this SRFI is the following. Drop the aim of covering
> common hash table usage. Writing such a SRFI is a very ambitious and
> difficult thing to do, and it requires an extensive amount of surveying
> common use. I would expect such a SRFI to discuss the design choices taken
> by most Scheme implementations, the several related SRFIs, as well as
> similar libraries from related languages such as ML and Lisp. SRFI 1 is a
> good example of such a "common use" SRFI. Shivers surveyed R4RS/R5RS
> Scheme, MIT Scheme, Gambit, RScheme, MzScheme, slib, Common Lisp, Bigloo,
> guile, T, APL and the SML standard basis in designing that library. A good
> common use hash table SRFI would need to do likewise.

But what else does Shivers say in SRFI 1 than that he surveyed these systems? I dug the API bits out of Shiro's cross reference; I've used Gauche, Guile, PLT and Scheme 48 hash tables, as well as those of Python, Tcl, Java, awk and OCaml; am I to be blamed because I didn't boast about it?
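To come back to the portability point above, here is the kind of portable utility I have in mind; it works across implementations only because creation, the reflective accessors and the enumeration procedure are all part of the API. (The name hash-table-map-values is mine, not part of the SRFI; this is just a sketch.)

    ;; build a new table with the same key equivalence and hash function,
    ;; but with every value passed through f
    (define (hash-table-map-values f table)
      (let ((result (make-hash-table (hash-table-equivalence-function table)
                                     (hash-table-hash-function table))))
        (hash-table-walk table
                         (lambda (key value)
                           (hash-table-set! result key (f value))))
        result))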
> data structures that provide a mapping from some set of keys to some set of
> values associated to those keys. All parts of this SRFI that do not
> contribute to that aim should be dropped. All rationales that do not

Do you really think that would make this SRFI _easier_? _You_ try to write a SRFI for generic mappings, and I'll show you how it is not general enough in 20 ways. Not only would it be harder, but also quite useless. I want standardised hash tables in Scheme implementations; I'm not here to improve my charisma in scientific writing or street-wiseness.

> Scheme. Most notably, this SRFI should be consistent in its choice of
> names and parameter conventions with SRFI 44 [1]. If the author chooses
> against these conventions, this SRFI then conflicts and competes with SRFI
> 44 (and others) and as such the "rationale should explain why the present
> proposal is a substantial improvement" over these existing conventions, as
> required by the process document.

Would you like this explanation to be added to the SRFI?
http://srfi.schemers.org/srfi-69/list-archive/msg00024.html

There is also the problem that I was not active in the Scheme community when SRFI 44 was finalised. Yes, I think parts of SRFI 44 are flawed (if not too flawed); I can write a "competing SRFI" for SRFI 44 if I find the time (I have more important SRFIs in mind). If you want to continue the discussion on this subject, please respond to the points in the message I give a pointer to above. As for this specific issue:

> hash-table-equivalence-function, which a reader of SRFI 44 would expect to
> return an equivalence over the items in the hash table collection, i.e. key
> value pairs, whereas hash-table-key-equivalence-function would return what
> SRFI 69 returns for hash-table-equivalence-function.

Hash tables have no use for item equivalence functions, and SRFI 44 specifies the item equivalence function of a mapping to default to its key equivalence function, so hash-table-equivalence-function returns the correct value, and hash-table-key-equivalence-function can be added by a compatibility layer.

On a side note, I reread SRFI 44 and its rationales for the naming of the mapping functions -- but actually, I could not find any. Nor could I find any in the discussion archives. Would you like to point them out to me? While SRFI 44 has probably been a lot of work and some serious thought has gone into it, the part about mappings is IMNSHO not very well honed and does not follow any existing practice.

Panu

--
personal contact: xxxxxx@iki.fi, +35841 5323835, +3589 85619369
work contact: xxxxxx@helsinki.fi, +35850 3678003
kotisivu (henkkoht): http://www.iki.fi/atehwa/
homepage (technical): http://sange.fi/~atehwa/
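P.S. By a compatibility layer I mean something as small as the following sketch; the name hash-table-key-equivalence-function is SRFI 44 flavoured and is of course not part of SRFI 69 itself:

    ;; SRFI 44 style accessor on top of SRFI 69: for a mapping, the key
    ;; equivalence is simply the table's equivalence function
    (define (hash-table-key-equivalence-function table)
      (hash-table-equivalence-function table))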