Suggested change to the sample implementation, & other comments

Show/hide message thread

Suggested change to the sample implementation, & other comments Göran Weinholt (21 Oct 2022 18:08 UTC)
(missing)
Fwd: Suggested change to the sample implementation, & other comments Marc Nieper-Wißkirchen (23 Oct 2022 11:34 UTC)
Re: Suggested change to the sample implementation, & other comments Marc Nieper-Wißkirchen (23 Oct 2022 11:36 UTC)
Re: Suggested change to the sample implementation, & other comments Lassi Kortela (21 Oct 2022 22:04 UTC)
Re: Suggested change to the sample implementation, & other comments Marc Nieper-Wißkirchen (22 Oct 2022 08:17 UTC)
Re: Suggested change to the sample implementation, & other comments Marc Nieper-Wißkirchen (22 Oct 2022 14:17 UTC)

Suggested change to the sample implementation, & other comments GÃ¶ran Weinholt 21 Oct 2022 18:08 UTC

Hello Marc,

Thanks for another interesting SRFI. I noticed that the sample
implementation uses a construction that neither Chez nor Loko optimizes:

  (expand/optimize
    '(define set-car+cdr!
       (lambda (p x y)
         (assert (pair? p))
         (perform
           (set-car! p x)
           (set-cdr! p y)))))
  =>
  (begin
    (set! set-car+cdr!
      (lambda (p x y)
        (let ([t (pair? p)])
          (if t
              (void)
              (assertion-violation #f "failed assertion (pair? p)")))
        (call-with-values
          (lambda () (set-car! p x))
          (lambda g2
            (call-with-values
              (lambda () (set-cdr! p y))
              (lambda g3 (values)))))))
    (void))

This code allocates closures and is less efficient than just using
begin. The sample implementation also has an unreachable syntax-case
clause.

The problem can be fixed by not using let-values:

  (define-syntax perform
    (lambda (stx)
      (syntax-case stx ()
        [(_ expr ...)
         (with-syntax ([(tmp ...) (generate-temporaries #'(expr ...))])
           #'(let ([tmp (begin expr #f)] ...) (values)))])))

The new code:

  (expand/optimize
    '(define set-car+cdr!
       (lambda (p x y)
         (assert (pair? p))
         (perform
           (set-car! p x)
           (set-cdr! p y)))))
  =>
  (begin
    (set! set-car+cdr!
      (lambda (p x y)
        (let ([t (pair? p)])
          (if t
              (void)
              (assertion-violation #f "failed assertion (pair? p)")))
        (set-cdr! p y)
        (set-car! p x)
        (values)))
    (void))

Now I'd like to address the usefulness of the perform construct with a
real world example. The <https://gitlab.com/weinholt/struct-pack/>
library has a pack form that expands to several bytevector-...-set!
calls where the order doesn't matter. An example:

  (expand/optimize
    '(define (shm:put-image c drawable gc
                            total-width total-height
                            src-x src-y src-width src-height
                            dst-x dst-y depth format send-event
                            shmseg offset)
       (send-request c #f extension-name XCB_SHM_PUT_IMAGE
         (pack "LL6S2sCCCxLL" (xid-id drawable) (xid-id gc)
                total-width total-height src-x src-y
                src-width src-height
                dst-x dst-y depth format
                (if send-event 1 0) (xid-id shmseg)
                offset))))
  =>
  (begin
    (set! shm:put-image
      (lambda (c drawable gc total-width total-height src-x src-y src-width
               src-height dst-x dst-y depth format send-event shmseg
               offset)
        (send-request c #f extension-name XCB_SHM_PUT_IMAGE
          (let ([bv (make-bytevector 36)])
            (bytevector-u32-native-set! bv 32 offset)
            (bytevector-u32-native-set! bv 28 (xid-id shmseg))
            (bytevector-u8-set! bv 27 0)
            (bytevector-u8-set! bv 26 (if send-event 1 0))
            (bytevector-u8-set! bv 25 format)
            (bytevector-u8-set! bv 24 depth)
            (bytevector-s16-native-set! bv 22 dst-y)
            (bytevector-s16-native-set! bv 20 dst-x)
            (bytevector-u16-native-set! bv 18 src-height)
            (bytevector-u16-native-set! bv 16 src-width)
            (bytevector-u16-native-set! bv 14 src-y)
            (bytevector-u16-native-set! bv 12 src-x)
            (bytevector-u16-native-set! bv 10 total-height)
            (bytevector-u16-native-set! bv 8 total-width)
            (bytevector-u32-native-set! bv 4 (xid-id gc))
            (bytevector-u32-native-set! bv 0 (xid-id drawable))
            bv))))
    (void))

As can be seen, the expanded code is an overspecification of the
original pack form. If the ...-set! calls were wrapped in perform it
would allow the compiler to perform several optimizations:

 - It would be possible to do evaluation order determination on the
   whole block. In this example it would allow the compiler to first
   emit the code for writing the constants and the variables, and last
   emit the calls to xid-id when the register pressure is lower.

 - The compiler can also emit a single range check for the whole block,
   but still be free to order the writes any way it likes.

 - The compiler can emit a single type check for the bytevector. But I
   don't think perform really helps the compiler's analysis here. I'm
   thinking about the algorithm in [1] and I'm not sure it could be
   usefully combined with perform.

Of course the ability to just say "the order doesn't matter here" is
useful in itself.

BTW, I think this SRFI might be misinterpreted as allowing for parallel
evaluation order. You could use language similar to this from R6RS:

  Although the order of evaluation is otherwise unspecified, the effect
  of any concurrent evaluation of the operator and operand expressions
  is constrained to be consistent with some sequential order of
  evaluation. The order of evaluation may be chosen differently for each
  procedure call.

[1] Michael D. Adams, Andrew W. Keep, Jan Midtgaard, Matthew Might, Arun
    Chauhan, and R. Kent Dybvig. Flow-sensitive type recovery in
    linear-log time. In Proceedings of the 2011 ACM International
    Conference on Object Oriented Programming Systems Languages and
    Applications, OOPSLA ’11, pages 483–498. ACM, New York, NY, USA,
    October 2011. ISBN 978-1-4503-0940-0. doi: 10.1145/2048066.2048105.

Best regards,

--
Göran Weinholt   | https://weinholt.se/
Debian Developer | 73 de SA6CJK