Re: Interface view of dictionaries
Bradd W. Szonye (26 Oct 2003 08:56 UTC)
bear wrote:
>>> (*-get-n dict num)
>>> (*-get-n-deleting! dict num)
>
> xxxxxx@freenetproject.org wrote:
>> Trivially defined with enumeration.
>
> And with the same problem. On a lot of dictionaries, if I go in
> knowing I want the first N elements, I can get them at one swell foop
> and not waste time traversing the structure again and again for each
> one.

Enumeration isn't quite that bad. The smart way to do it is to traverse
the collection only once, with a filter function that halts the
enumeration as soon as it gets what it needs. Pseudocode:

    (cfold f (cons n '()) dict)

where n is the number to fetch and f is

    (lambda (nr datum)
      (let ([n (car nr)] [result (cdr nr)])
        (cond [(= n 0) (<halt enumeration and return (reverse result)>)]
              [(<match filter>) (cons (- n 1) (cons datum result))]
              [else nr])))

This will return the elements in O(N) time, which isn't horrible.
However, it's also far from the best possible way to do it. For a
typical tree-based dictionary, you can do it in O((lg N) + K) time:
finding the first element takes O(lg N), and traversing the structure
to get all K elements takes O(K) time.

There's a big difference between those performance characteristics.
Unless K is very large (close to N), it's effectively O(lg N) for the
"native" approach vs O(N) for the filtering enumeration.

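For concreteness, here's a self-contained sketch of that single-pass
approach in plain R5RS Scheme. It uses an association list to stand in
for whatever enumeration order the dictionary provides, and a named let
in place of the hypothetical cfold; take-n-matching and match? are
names I made up for illustration, not anything from the SRFI:

    ;; Collect the first n entries satisfying match?, in one traversal.
    (define (take-n-matching match? n alist)
      (let loop ((n n) (rest alist) (result '()))
        (cond ((or (= n 0) (null? rest))   ; got n matches, or ran out
               (reverse result))
              ((match? (car rest))         ; keep this entry, one fewer to go
               (loop (- n 1) (cdr rest) (cons (car rest) result)))
              (else                        ; skip it, keep scanning
               (loop n (cdr rest) result)))))

    ;; Example: the first two entries with an even value.
    (take-n-matching (lambda (entry) (even? (cdr entry)))
                     2
                     '((a . 1) (b . 2) (c . 3) (d . 4) (e . 6)))
    ;; => ((b . 2) (d . 4))

The point is the same as in the pseudocode: one traversal, halting as
soon as the requested number of elements has been found.
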
That's how it works for element fetching, anyway. For consecutive
deletions, it's a bit more complicated. Unless the enumeration is
stable, consecutive deletions are O(N-squared), because you need to
enumerate the whole thing once for each iteration. A sophisticated
native function may still be able to do it in O(lg N) time.

This is why I complained that the SRFI didn't take performance into
consideration. Sure, it's a bad idea to attempt construction-time
optimizations too early in the design process. However, these are not
construction-time optimizations. They're significant algorithm choices,
and they make a huge difference. In the worst case, it degrades
performance from O(lg N) to O(N-squared). That kind of major
algorithmic degradation *is* an appropriate consideration for a design
doc.

Tangent: In the example above, I used the traditional argument
convention for higher-order functions: (proc f seed collections ...).
One of the things that annoys me about SRFI-44 is that it abandons good
conventions like these. Collection-fold permits an arbitrary number of
seeds but only one collection. That gives you something useless (if you
want multiple seeds, use a list like I did) while taking away something
useful (folding multiple collections).

Tangent off the tangent: Why the heck did SRFI-1 define the folding
function to accept the seed argument *last*? There's only one seed but
many lists, which naturally suggests a (f seed . data) signature for
folding functions that can work on any number of lists. Instead, it
specifies (f datum0 datum1 ... seed). Why?!

> Failure to provide a name for this function practically demands an
> implementation that sucks, which will then be applied to all
> dictionaries, whether the implementation of this function for that
> dictionary had to suck or not.

OK, I guess it's your turn to be hostile!

> My API allows efficient implementations. The one proposed in the SRFI
> does not. Because of that, they are not even remotely similar. Yes,
> you could use the SRFI API to provide whacking slow versions of most
> of the functionality. But every coder who has to live with them is
> going to hate them, either because they will waste CPU time that does
> not have to be wasted, or because redefining R5RS functions won't play
> nice with the existing code and module system, or both.

I generally agree, but with one reservation: while there are good
reasons for the PLT module system to work that way, it's still more
cumbersome than it needs to be. I've been discussing this on the PLT
mailing list, and I hope to develop some methods to ease the pain a
bit. I still think gratuitous incompatibilities are a bad idea, but it
shouldn't be so hard to use well-designed extensions to existing
bindings.

--
Bradd W. Szonye
http://www.szonye.com/bradd