Jim Rees <xxxxxx@gmail.com> schrieb am Mi., 6. Juni 2018 um 21:31 Uhr:
Let me amend what I have said....

So, I wrote the maximum-memory-consumed test alluded to in the last email.    That 200MB/second was still O(N) consumption relative to the number of steps in xtake, just with a much higher scalar constant!

The reason is because my Scheme is still immature in design -- all continuations & lexical environments are heap objects that are shared and never copied.   So, when we say the continuation of H[n+1] is "bigger" than that of H[n], it's true, but all of H[n]'s space is shared by H[n+1], so the price per step is only the cost of the extra amount, which is fixed, so we get O(N) behavior.

I also tested my code with (srfi 154) back to normal, but without the mark-key-filtering optimization in (srfi 155) mentioned in the last email.   As expected, I get O(N) behavior there too, since the cost at each step is fixed.  Of course, I also get O(N) behavior with the canonical (force (loop)) test, for which I really prefer O(1) space behavior ;-)   Hence, the mark-key-filtering achieves that.

Your comment about the O(1) space behavior strikes a nerve. Correct me if I am wrong, but in the following modified example, (x+ (srfi-155-integers) N) has O(1) space behavior with R7RS-small's (scheme lazy), but O(N) space behavior with any currently discussed implementation of (srfi 155). (Here, I am assuming a "correctly" implemented parameterize as in SRFI 157.)

(define dummy (make-parameter #f))

(define (srfi-155-integers)
  (let next ((n 0))
    (parameterize ((dummy 1))
      (delay (cons n (next (+ n 1)))))))

(define (x+ stream n)
  (let loop ((s stream) (r 0) (n n))
    (if (= n 0)
      (loop (cdr (force s))
            (+ (car (force s)) r)
            (- n 1)))))

So (srfi 155) would be regression to (scheme lazy) in this regard.
To get O(N**2) space behavior on xtake, using (current-dynamic-extent) that uses call/cc, call/cc would have to make a copy of the continuation that scales in size with its depth.   I understand some Schemes do just that, treating all continuations as "one-shot", in order to achieve superior performance when call/cc is not used.   As an experiment, I simulated this copying, and then xtake clearly showed O(N**2) consumption relative to n.

This is not a critique of call/cc's which copy the stack!    You just have to implement (srfi 154) correctly (ie. without storing the result of call/cc).

Thanks for emphasizing the role of the implementation of call/cc. While I think that good Scheme implementations should not discourage the use of call/cc and that an O(N) behaviour is possible if the captured continuations all share their common tails, it is important that by properly implementing (srfi 154) (necessarily implementation-dependent), the exact implementation of call/cc or the spaghetti stack does not matter.

On Wed, Jun 6, 2018 at 1:03 PM Jim Rees <xxxxxx@gmail.com> wrote:
Yes, the continuation of H[n+1] is larger than that of H[n] -- in part because with-dynamic-extent needs to do something to switch back to a different dynamic-extent, and also because it has to perform a cons.    And I agree that if these extra steps have the same "cost" for each step, we expect O(N) behavior.   If successive steps have greater cost than the prior ones, then we expect higher-order complexity.

I see where the problem lies.   Shiro refers to dynamic-extent capturing the current continuation which is "strictly larger" than the prior one.   That's the problem.    His (srfi 154) is using the "portable" implementation which uses call/cc to effectively capture the dynamic extent.

Semantically, the continuation is overkill w.r.t. dynamic extent.  Dynamic extent contains whatever information you need to know whether dynamic-wind in/out thunks need running, and your current continuation marks.

Here's my record type internal to my (srfi 154) code:

      (define-record-type <dynamic-extent>
        (make-dynamic-extent wind cmarks)
        (wind dynamic-extent-wind)
        (cmarks dynamic-extent-cmarks))

Just the wind and the marks.

So, back to the example, you see neither wind nor marks changes from H[n] to H[n+1] -- so every new extent is referring to the same wind and marks, so the price at each step is the same and O(N) behavior is achieved.

I ran some tests -- I don't have an easy way to test for maximum heap consumed by a form that terminates right now, but I was able to extract some rates of consumption when I try to do xtake with large N.     My own code consumes 5-ish megabytes per second, but if I change my (srfi 154) and add the current continuation to the record, the rate changes to 200MB/second.

***CAVEAT -- there actually IS a change to cmarks that was omitted in the expansion, perhaps it's not relevant to an xtake example -- but the tail-call-detection hack (for force of of a force) -- it adds a mark to the dynamic extent.     However, in both places in the (srfi 155) code where (current-dynamic-extent) is called, the tail-call-mark is irrelevant -- a resurrected extent can never trigger the (force (force -- collapse.    So, I wrote an extra API for my (srfi 154) which given an extent, produces a new extent with all marks associated with a given key removed.  I remove the tail-call-key marks, so then I'm back to fixed-size extents at each step for this example -- and O(N) behavior.

On Wed, Jun 6, 2018 at 9:14 AM Marc Nieper-Wißkirchen <xxxxxx@nieper-wisskirchen.de> wrote:
Dear Jim,

thank you a lot for your shared thoughts and your experiments with SRFI 154, 155 and 157.

I have a question about the behavior of your code in the absence of current-forcing-extent. Shiro Kawai gave the following example code:

(define srfi-155-integers
  (let next ((n 0))
    (delay (cons n (next (+ n 1))))))

(define (xtake stream n)
  (let loop ((s stream) (r '()) (n n))
    (if (= n 0)
      (reverse r)
      (loop (cdr (force s))
            (cons (car (force s)) r)
            (- n 1)))))

He observed an O(n^2) behavior with the naive implementation of SRFI 155. Neglecting current-forcing-extent (which is just a bonus), the expression H[n] given by

(delay (cons n (next (+ n 1))))

is expanded to:

(let ((dynamic-extent (current-dynamic-extent)))
     (with-dynamic-extent dynamic-extent
        (lambda ()
           (cons n H[n + 1]))))))

(Here, scheme-delay is imported from (scheme lazy).) 

The continuation of the expression H[n + 1] is strictly larger than the continuation of the expression H[n]. Therefore, the dynamic extent (which includes, in principle, the current continuation) captured in the evaluation of H[n + 1] is strictly larger than the dynamic extent captured in the evaluation of H[n]. This causes an O(n^2)-behavior.

How does your implementation circumvent this problem?

-- Marc
Jim Rees <xxxxxx@gmail.com> schrieb am Fr., 1. Juni 2018 um 06:14 Uhr:
I've spent a ton of time in the past week on the 157/154/155 triumvirate in my private scheme implementation.    As of tonight, I've had the last of my eurekas to solve all the space blow-up problems, and have clean interfaces which pass all tests now.

For reference, I copied all the bounded space tests from the (srfi 45) document and run those automatically.

Maybe you already have this all sorted out, but I thought I'd chime rather than wait for your "final" code updates and then raise more issues.

Referencing the "srfi 157 version" of the (srfi 155) sample code...

Using a parameter for current-forcing-extent is not ideal.   It's clear as you already know, that if parameters are based on, say, dynamic-wind, the bounded space tests will fail.   So, presumably, you're depending on parameters being based on continuation marks.  In that case, what you effectively have are two marks getting set in succession -- never otherwise apart.  For  my own code, I simplified this by simply cons'ing c and forcing-extent together into one object and one mark setting.    By doing this, I eliminated any dependency on how parameterize works.

Next up -- setting that continuation mark still blows up the space tests -- because delay captures the current extent, which includes this mark, so each pass of, say, (delay-force (loop)) as the simplest example --  keeps capturing a deeper and deeper list of marks.

I realized that when delay captures the current extent, and when force captures the current extent (for forcing-extent), any values associated with key in those captured extents will never be used, nor can they be observed outside (srfi 155) since all the (srfi 157) interfaces requires that you know the keys you're interested in looking at.    So, I added another API function to (srfi 154) extent-remove-marks which filters a given extent to remove all marks associated with a given key, and used this in those two places where (current-dynamic-extent) was called.

Boom.   All tests pass now, including bounded space tests.


To unsubscribe from this list please go to http://archives.simplelists.com

To unsubscribe from this list please go to http://www.simplelists.com/confirm.php?u=AR67FkMcUwd3BgWrS2KvXlvtKOubWJXD