> Bradd W. Szonye wrote: >> However, there's a significant class of collections -- any tree-based >> dictionary, for example -- where you can improve from O(N) >> performance to O(lg N + K) performance for sequential access to keys. >> That's both efficient and generic. xxxxxx@freenetproject.org wrote: > I agree, and I expect a tree implementation of dictionaries to specify > exactly those operators. But they don't necessarily extend to > hashtables, disk file databases, or any number of other dictionary > mappable datastructures... a set which is larger than the set of Tree > dictionaries. That's when you ask, "Are these interfaces actually inefficient for other kinds of dictionaries? They're still convenient, so is there a good reason not to provide them?" Consider procedures that operate on K sequential elements, starting with a given key. It's a common operation for dictionaries that permit duplicate keys; you often need to enumerate all elements that match a given key. You can always use the FOLD primitive to implement these operations, but it runs in O(N) time. (If you need to mutate a collection with an unstable enumerator, that increases to O(N-squared) time.) If you provide the interface as an atomic operation, however, a typical tree dictionary can cut that to O(lg N + K), and a typical hash table dictionary can cut it to O(K). Alists don't give you anything special, but they do still match the enumerator's O(N), and it may be able to the avoid O(N-squared) penalty for unstable mutations, if you implement it atomically. Do you know of any common dictionary types that would actually do worse with the atomic operation? (Given that you can always implement it with a full enumeration, I doubt it.) >> And Bear is trying to establish that some of these functions are not >> only common but important extensions. For most dictionary-type >> collections, primitives aren't sufficient. There's a convenience >> argument *and* a performance argument for extending the set. > But only if those operators generalize to all dictionaries. They'd better, or it'll be very difficult to deal with duplicate keys. IIRC, I mentioned this in my last review of dictionaries; the interface doesn't even have GET-ANY to fetch all matching elements. That's an important interface for dictionaries with non-unique keys. > Otherwise they impair the elegance and utility of the API that is > general to all dictionaries. Pardon my vulgarity, but screw elegance. It means different things to everyone, and it's the doom of many a design. The overzealous desire for so-called elegance is one of the things that leads to what Bear called APIs for stupid programs. I see this desire for elegance in designers of all kinds, and it almost always reflects the author's unreasoning infatuation with some design, whether it's actually useful or not. In my opinion, "elegance" should never, ever be a design goal; it's a primrose path that leads to Hell. Oleg wrote: >>> The intent is not to make it unnecessary to add more API in future >>> SRFIs. Rather, the intent is to make it unnecessary to remove >>> SRFI-44 features in future SRFIs. >> Why would you need to remove a function that is more convenient and more >> efficient? > Because they may not apply to a hypothetical collection. I'm sorry, what sort of dictionary does "fetch K elements from key" not apply to? This is a perfect example of being too attached to "elegance" and too willing to throw away useful concepts. >> Also, speaking of things which might need to be removed, I'm >> concerned about the object-oriented nature of the containers .... For >> example, the hierarchy looks like it's based on a traditional >> "objects as references" class-based system, when an "objects as >> values" prototype-based system may fit better into Scheme. >> (Especially given the way such a system incorporates primitive data >> structures according to content rather than type). > Hardly. The SRFI was defined before there was even an object system > selected for the reference implementation. We were very careful not > to evey specify polymorphism. The SRFI only requires that the more > general functions work for the 'subtypes' of the SRFI. It never says > that the collections actually be subtypes in some object system, its > just a useful metaphor. But the metaphor reeks of class-based, objects-as-references inheritance. I asked before, and I'll ask again: What's the OO design principle behind the classes? Should the Liskov Substitution Principle hold? Why isn't set a subtype of bag, or vice versa? Why aren't dictionaries and sequences more closely related? Is the bag type justified at all? Right now, I don't know why the hierarchy looks the way it does. For all I know, the design was, "These collection types seem kinda related. Let's make one a subtype of the other." What are the actual principles, and how will they hold up in a prototype-based system or with parametric polymorphism? Does the hierarchy constrain implementors in ways that you don't expect? With no explicit design goals, no complete implementation, and no examples of the collections in actual use, how are we supposed to trust that these won't be problems? > You seem to portray this as if I'm hiding some dark secret about > limitations to the API. I'm quite confident in the API in fact, and > just don't believe that I need 'prove' its good to you by writing a > bunch of code that won't wind up in the SRFI. That's nice that you're confident in your own design. But isn't that the whole point of the reference implementation? We're not *supposed* to take your implementation on faith. *You're* supposed to demonstrate that it's actually implementable. Extra points if you can show that it stands up well under real use. If it seems like I'm being a jerk about this, it's because my day job is 100% about quality -- code reviews, design reviews, testing, verification, making sure that we have more than just the developer's word that a system is good. This "I don't need to prove that my design is good" attitude doesn't fly in a code review. And isn't that exactly what the SRFI draft process is? > I have limited time for this SRFI among many many projects, so this is > strictly pragmatic. Again, that's nice. It's not going to convince me to lower my standards for code reviews, though. >> So while I agree that there is value in a stable API, I don't think >> there's any value in rushing an immature API to finalization just so >> that we can call it "stable." It'd be much better to propose a >> concrete collection or three, get some experience with their use, and >> it they're successful, factor out the interface and publish that as a >> SRFI. > This is arguably bad design. Writing many collections and then culling > the common operators is a great way to wind up with a poorly thought > out system. Not if those collections are themselves well-designed. Design a good system, implement it, determine what can be reused or factored out for future projects. That's how reuse works. If you haven't already, I strongly recommend that you study it. I can dig out some good titles if you like, although they're all for C++ rather than Scheme. Write some real, useful collections to your spec. Use them in a project or two -- doesn't need to be anything huge, just something bigger than a toy. Change the interfaces that are too cumbersome, maybe add a few. Review the design again. Factor out everything that worked well and publish that. Even better, just publish the useful collections and let the design speak for itself. That would be *much* more useful than an untested design doc. > Especially when the stated goal here is to create a general, > *interoperable* framework for collections. Once again: It isn't general or interoperable until you've seen it in use, until you've tried to use it and reuse it. Until then, it's just an interface that you *hope* is reusable. Many projects start out by assuming that interfaces like these will be reusable. They typically fail. Again, spend some time studying reusability if you haven't already; getting it right is a LOT harder than it seems. >> Actually, if they're successful, there's less need to publish the >> interface separately. Future developers will imitate it like they >> already imitate SRFI-1. That's a much more desirable outcome: We get >> useful collection types *and* a de facto standard interface that way. > Name one language with a useful generic collections library that got > it this way? You have heard of C++, right? -- Bradd W. Szonye http://www.szonye.com/bradd