Suggested change to the sample implementation, & other comments Göran Weinholt (21 Oct 2022 18:08 UTC)
|
||
(missing)
|
||
Fwd: Suggested change to the sample implementation, & other comments
Marc Nieper-Wißkirchen
(23 Oct 2022 11:34 UTC)
|
||
Re: Suggested change to the sample implementation, & other comments
Marc Nieper-Wißkirchen
(23 Oct 2022 11:36 UTC)
|
||
Re: Suggested change to the sample implementation, & other comments
Lassi Kortela
(21 Oct 2022 22:04 UTC)
|
||
Re: Suggested change to the sample implementation, & other comments
Marc Nieper-Wißkirchen
(22 Oct 2022 08:17 UTC)
|
||
Re: Suggested change to the sample implementation, & other comments
Marc Nieper-Wißkirchen
(22 Oct 2022 14:17 UTC)
|
Hello Marc, Thanks for another interesting SRFI. I noticed that the sample implementation uses a construction that neither Chez nor Loko optimizes: (expand/optimize '(define set-car+cdr! (lambda (p x y) (assert (pair? p)) (perform (set-car! p x) (set-cdr! p y))))) => (begin (set! set-car+cdr! (lambda (p x y) (let ([t (pair? p)]) (if t (void) (assertion-violation #f "failed assertion (pair? p)"))) (call-with-values (lambda () (set-car! p x)) (lambda g2 (call-with-values (lambda () (set-cdr! p y)) (lambda g3 (values))))))) (void)) This code allocates closures and is less efficient than just using begin. The sample implementation also has an unreachable syntax-case clause. The problem can be fixed by not using let-values: (define-syntax perform (lambda (stx) (syntax-case stx () [(_ expr ...) (with-syntax ([(tmp ...) (generate-temporaries #'(expr ...))]) #'(let ([tmp (begin expr #f)] ...) (values)))]))) The new code: (expand/optimize '(define set-car+cdr! (lambda (p x y) (assert (pair? p)) (perform (set-car! p x) (set-cdr! p y))))) => (begin (set! set-car+cdr! (lambda (p x y) (let ([t (pair? p)]) (if t (void) (assertion-violation #f "failed assertion (pair? p)"))) (set-cdr! p y) (set-car! p x) (values))) (void)) Now I'd like to address the usefulness of the perform construct with a real world example. The <https://gitlab.com/weinholt/struct-pack/> library has a pack form that expands to several bytevector-...-set! calls where the order doesn't matter. An example: (expand/optimize '(define (shm:put-image c drawable gc total-width total-height src-x src-y src-width src-height dst-x dst-y depth format send-event shmseg offset) (send-request c #f extension-name XCB_SHM_PUT_IMAGE (pack "LL6S2sCCCxLL" (xid-id drawable) (xid-id gc) total-width total-height src-x src-y src-width src-height dst-x dst-y depth format (if send-event 1 0) (xid-id shmseg) offset)))) => (begin (set! shm:put-image (lambda (c drawable gc total-width total-height src-x src-y src-width src-height dst-x dst-y depth format send-event shmseg offset) (send-request c #f extension-name XCB_SHM_PUT_IMAGE (let ([bv (make-bytevector 36)]) (bytevector-u32-native-set! bv 32 offset) (bytevector-u32-native-set! bv 28 (xid-id shmseg)) (bytevector-u8-set! bv 27 0) (bytevector-u8-set! bv 26 (if send-event 1 0)) (bytevector-u8-set! bv 25 format) (bytevector-u8-set! bv 24 depth) (bytevector-s16-native-set! bv 22 dst-y) (bytevector-s16-native-set! bv 20 dst-x) (bytevector-u16-native-set! bv 18 src-height) (bytevector-u16-native-set! bv 16 src-width) (bytevector-u16-native-set! bv 14 src-y) (bytevector-u16-native-set! bv 12 src-x) (bytevector-u16-native-set! bv 10 total-height) (bytevector-u16-native-set! bv 8 total-width) (bytevector-u32-native-set! bv 4 (xid-id gc)) (bytevector-u32-native-set! bv 0 (xid-id drawable)) bv)))) (void)) As can be seen, the expanded code is an overspecification of the original pack form. If the ...-set! calls were wrapped in perform it would allow the compiler to perform several optimizations: - It would be possible to do evaluation order determination on the whole block. In this example it would allow the compiler to first emit the code for writing the constants and the variables, and last emit the calls to xid-id when the register pressure is lower. - The compiler can also emit a single range check for the whole block, but still be free to order the writes any way it likes. - The compiler can emit a single type check for the bytevector. But I don't think perform really helps the compiler's analysis here. I'm thinking about the algorithm in [1] and I'm not sure it could be usefully combined with perform. Of course the ability to just say "the order doesn't matter here" is useful in itself. BTW, I think this SRFI might be misinterpreted as allowing for parallel evaluation order. You could use language similar to this from R6RS: Although the order of evaluation is otherwise unspecified, the effect of any concurrent evaluation of the operator and operand expressions is constrained to be consistent with some sequential order of evaluation. The order of evaluation may be chosen differently for each procedure call. [1] Michael D. Adams, Andrew W. Keep, Jan Midtgaard, Matthew Might, Arun Chauhan, and R. Kent Dybvig. Flow-sensitive type recovery in linear-log time. In Proceedings of the 2011 ACM International Conference on Object Oriented Programming Systems Languages and Applications, OOPSLA ’11, pages 483–498. ACM, New York, NY, USA, October 2011. ISBN 978-1-4503-0940-0. doi: 10.1145/2048066.2048105. Best regards, -- Göran Weinholt | https://weinholt.se/ Debian Developer | 73 de SA6CJK