Much simpler leak-free implementation possible?
Andre van Tonder
(14 Aug 2003 22:38 UTC)
|
Re: Much simpler leak-free implementation possible?
Andre van Tonder
(16 Aug 2003 14:07 UTC)
|
Re: Much simpler leak-free implementation possible? Phil Bewig (18 Aug 2003 14:53 UTC)
|
On Thursday, August 14, 2003 5:38 PM, Andre van Tonder [SMTP:xxxxxx@het.brown.edu] wrote: > Although late in the process and a newcomer to this group, the > following may be of some interest: Many thanks for an insightful and thoughtful post on implementing streams via continuation-passing style. I implemented my streams via promises, and it is highly interesting to see an implementation so totally different. I encourage you to extend your proof-of-concept to a full reference implementation. Having a second reference implementation would be useful for a variety of reasons. It would let the srfi discussion more easily separate concepts from implementation. Implementers would doubtless like to have two reference implementations, as one or the other might fit better with their scheme system. And it would be useful for testing, especially to be sure that libraries of stream functions are portable across the two implementations. I have several suggestions if you decide to extend your proof-of- concept. I've even appended some code that might help. First, you will need to implement streams in terms of SRFI-9 records or some other system that provides disjointness of types, so you can write type-checking functions like stream?, stream-null?, and stream-pair?. This isn't hard, and gives you an opportunity to hide some of the underlying mechanism. Second, you will need to decide on and implement some abstraction that hides the underlying cps mechanism. In my implementation, that abstraction is stream-delay, which is syntax that takes an expression and returns a stream. As an example of what your implementation would need to do, consider stream-map, in the style of the srfi reference implementation, on the left, and andre-map, the equivalent function using your cps-style streams, on the right (error checking has been removed for clarity, only one stream may be input, this code is untested use at your own risk): (define (stream-map f s) (define (andre-map f s) (stream-delay (codelay (lambda (k) (andre-map* f s k)))) (define (andre-map* f s k) (coforce s (lambda (s*) (if (stream-null? s) (if (null? s*) stream-null (k '()) (stream-cons (k (cons (func (stream-car s)) (func (car s)) (stream-map f (andre-map f (stream-cdr s))))))) (cdr s)))))))) The difficulties for a user of your stream library are that they have to understand all of the codelay/coforce and lambda/define stuff that forms the cps mechanism, and they have to understand why they should use car and cdr rather than stream-car and stream-cdr. That makes their life hard, needlessly; it is also a likely source of error. Better your life as the implementer should be hard, since you only have to do the hard work once. Third, you will have to extend your implementation of stream-unfold. Although I appreciate the insight behind your cps implementation, I belive your characterization of your stream-unfold as "trivial" and "vastly simpler" than the one in the srfi reference implementation is unfair, because you solve a considerably simpler problem. Here is an implementation of a stream-unfold1 that is similar to your unfold but based on promises as in the srfi (again, this is untested code): (define (stream-unfold1 gen seed) (stream-delay (let unfold1 ((seed seed)) (call-with-values (lambda () (gen seed)) (lambda (next result) (cond ((pair? result) (stream-cons (car result) (unfold1 next))) ((null? result) stream-null) ((not result) (unfold1 next)))))))) That is no more difficult or complicated than your code, just different. Extending this code to handle multiple output streams is hard. The problem is that each "turn of the crank" of the generator function may produce output for some of the streams, but not all of them, and keeping the various output streams in sync requires some tedious bookkeeping. I can testify from bitter experience that it is harder than it seems. The other source of complexity in the srfi implementation of stream-unfold is the desire for it to be fully lazy across all implementations. A major problem with the first reference implementation was that the degree of laziness of the various stream functions was highly dependent on the underlying scheme implementation. The current reference implementation solves this problem by doing everything it possibly can -- a private low-level delay and force, lambda lifting, and explicit tail recursion elimination -- to be lazy everywhere, on all scheme systems. Some of the effort may not be needed everywhere, and may amount to useless complication. Fourth, you will need to complete some missing functions. I have provided some of them for you in an attachment to this note. You suggested in your post that the cps implementation is more natural and more easily extended to other data types than the promise implementation. Perhaps you could talk more about these topics. I implemented my streams using promises, because that is traditional. I see your implementation using continuation-passing style as different, but neither more nor less natural, and not inherently simpler, or clearer, or more efficient, or easier to extend. Certainly it is not hard to use promises to implement other lazy data types besides streams, as Okasaki demonstrates. You said in a companion posting: By the way, the SRFI-40 reference implementation of stream-filter unfortunately seems to leak memory quite rapidly in MzScheme and even more rapidly in Chez Scheme. On my machine the times3 example using the SRFI stream-filter will run out of memory in <15 minutes in Mz and <5 minutes in Chez. I can tell you that times3 runs to completion and correctly evaluates (times3 1000000) on my Windows 98 machine, using both MzScheme and Petite Chez Scheme, without thrashing the hard disk looking for memory, although it does indeed run quite slowly (several minutes for each, I never bothered to time it). Additionally, I have heard no other reports of space leaks regarding the current implementation. Please don't take anything I have said as being critical; as I said earlier, yours is an interesting and insightful post. I am intrigued by your suggestions, and am hopeful you will provide a complete reference implementation. Phil ;;; andre-streams (define-syntax codelay (syntax-rules () [(codelay thunk-cps) (let ([memo-pair (cons #f #f)]) (lambda (k*) (if (car memo-pair) (k* (cdr memo-pair)) (thunk-cps (make-memoizer memo-pair k*)))))])) (define (make-memoizer memo-pair k) (lambda (x) (set-car! memo-pair #t) (set-cdr! memo-pair x) (k x))) (define (coforce promise k) (promise k)) (define-syntax match (syntax-rules () [(match lst [() exp1] [(h . t) exp2]) (cond [(null? lst) exp1] [(pair? lst) (let ([h (car lst)] [t (cdr lst)]) exp2)] [else 'match-error])])) (define andre-nil (codelay (lambda (k) (k '())))) (define (andre-nil? s) (coforce s (lambda (x) (null? x)))) (define-syntax andre-cons (syntax-rules () ((stream-cons x s) (codelay (lambda (k) (k (cons x s))))))) (define (andre-car s) (coforce s (lambda (x) (car x)))) (define (andre-cdr s) (coforce s (lambda (x) (cdr x)))) (define (andre-cutoff n s) (cond ((zero? n) '()) ((andre-nil? s) '()) (else (cons (andre-car s) (andre-cutoff (- n 1) (andre-cdr s)))))) (define (andre-map f s) (codelay (lambda (k) (andre-map* f s k)))) (define (andre-map* f s k) (coforce s (lambda (s*) (match s* (() (k '())) ((h .t) (k (cons (f h) (andre-map f t)))))))) (define (andre-countdown n) (codelay (lambda (k) (k (cons n (andre-countdown (- n 1))))))) (define (andre-ref n s) (coforce s (lambda (s*) (match s* (() (error 'andre-ref)) ((h . t) (if (zero? n) h (andre-ref (- n 1) t))))))) (define (andre-filter p? s) (codelay (lambda (k) (andre-filter* p? s k)))) (define (andre-filter* p? s k) (coforce s (lambda (s*) (match s* (() (k '())) ((h . t) (if (p? h) (k (cons h (andre-filter p? t))) (andre-filter* p? t k))))))) ;; andre-unfold1 : (b -> (#f | (cons (a | 'drop) b)) b -> stream a (define (andre-unfold1 f seed) (codelay (lambda (k) (andre-unfold1* f seed k)))) (define (andre-unfold1* f seed k) (cond ((f seed) => (lambda (res) (match res (() (k 'error)) ((h . t) (if (eq? h 'drop) (andre-unfold1* f t k) (k (cons h (andre-unfold1 f t)))))))) (else (k '()))))