Re: Remaining changes

Show/hide message thread

Remaining changes Wolfgang Corcoran-Mathe (04 Sep 2020 17:12 UTC)
Re: Remaining changes John Cowan (05 Sep 2020 03:41 UTC)
(missing)
(missing)
(missing)
Fwd: Remaining changes Marc Nieper-Wißkirchen (06 Sep 2020 07:43 UTC)
Re: Remaining changes Marc Nieper-Wißkirchen (06 Sep 2020 09:33 UTC)
Re: Remaining changes Wolfgang Corcoran-Mathe (06 Sep 2020 17:24 UTC)
Re: Remaining changes Marc Nieper-Wißkirchen (06 Sep 2020 17:30 UTC)
Re: Remaining changes Wolfgang Corcoran-Mathe (06 Sep 2020 17:40 UTC)
Re: Remaining changes John Cowan (06 Sep 2020 20:04 UTC)
Re: Remaining changes Marc Nieper-Wißkirchen (06 Sep 2020 20:40 UTC)
Re: Remaining changes John Cowan (07 Sep 2020 00:03 UTC)
Re: Remaining changes Marc Nieper-Wißkirchen (07 Sep 2020 06:31 UTC)
Re: Remaining changes Wolfgang Corcoran-Mathe (07 Sep 2020 15:46 UTC)
Re: Remaining changes Wolfgang Corcoran-Mathe (07 Sep 2020 20:56 UTC)
Re: Remaining changes John Cowan (07 Sep 2020 21:16 UTC)
Re: Remaining changes Wolfgang Corcoran-Mathe (07 Sep 2020 21:57 UTC)
Re: Remaining changes Marc Nieper-Wißkirchen (08 Sep 2020 14:25 UTC)
Re: Remaining changes John Cowan (08 Sep 2020 15:26 UTC)
Fwd: Remaining changes John Cowan (05 Sep 2020 17:48 UTC)
Fwd: Remaining changes Marc Nieper-Wißkirchen (05 Sep 2020 12:59 UTC)
Re: Remaining changes Marc Nieper-Wißkirchen (05 Sep 2020 13:07 UTC)

Re: Remaining changes Marc Nieper-WiÃkirchen 06 Sep 2020 20:40 UTC

(Am So., 6. Sept. 2020 um 22:04 Uhr schrieb John Cowan <xxxxxx@ccil.org>:

Thank you for your detailed feedback on my proposal.

> I appreciate very much the work you've put into this patch, and I am incorporating most of it unchanged (correcting typos and English infelicities, of course).  I do have some concerns:

> 0) As a general matter, do not make formatting changes and substantive changes in the same patch or PR.  My labor was greatly increased by having to check many lines that were quite unchanged except that a few words were moved to adjacent lines.  It is always good practice to avoid this, even if some lines become uncomfortably long or short.

I have to admit that I haven't taken care of these things. (Moreover,
my Emacs may have done some local reformatting in the background when
I split lines too long.) The next patch I send you will be more
careful. (In my defense I should add that, initially, the patch was
created as a basis for discussion.)

> 1) Although the notion of "range-vector" is gone, it is replaced by the undefined notions of "size of a range" and "size of an indexer".  The latter is deeply obscure to me: I can't find any notion of "size" that would be relevant to a procedure of any kind.  I suppose the idea is this: if an indexer closes over a vector or similar object, its size includes that vector, and by the same token the size of a range is asymptotically equal to the size of its indexer.  But how are we to distinguish between a vector closed over as a source of the range elements and a vector closed over merely to access certain elements of it?

I took the notion "size of a range" from the previous drafts of this
SRFI ("The size of a range object is independent of the number of
elements it contains, ..."). It was already an undefined notion (in
the strict sense) but I tried make a conservative change here.

And indeed, giving "size" a strict meaning is hard as your example of
the indexer procedure shows. And when we think about it, we see that
we cannot distinguish between a vector closed over as a source of the
range elements and a vector closed over merely to access certain
elements of it. Initially, I thought one could make a difference
between vector->range and vector-range in this regard. The "size" of
the range returned by vector->range clearly has to include the vector
copy. But it is not true that one can skip unconditionally the vector
when calculating the size for a range returned by vector-range.
Namely, the answer depends on whether the vector will still be shared
by the rest of the code or will only be accessible through the
constructed range as the example (vector-range #(1 2 3)) shows.

> What's important to the user is to know that a range containing the first million natural numbers, if created by `range` in the obvious way or by `numeric-range`, occupies only a token amount of space.  But if it is transformed by `range-map`, you will need a million times as much space as before.  That notion is obscured by the current language.  Please try to work through this and come up with something better.

The current proposal says:

a) The size of a range constructed by numeric-range is O(1), which
means limited by some (implementation-defined) constant.
b) The size of a range constructed by range-map is O(n) where n is the
number of elements.

I don't think this is obscure (given some understanding of what "size"
means). Or where do you see the danger of misinterpretation?

> 2) `subrange`: I do not see why the size of the result needs to be asymptotically equal to the size of the source range.  That would imply that it might be necessary to copy elements stored inside the source range, but it never is; it can be shared.  The exception is if the indexer supplied to `range` is closed over, which gets us into trouble with "size of an indexer" again.

Again, the problem is: What do we mean by the size of an object? (And
again: that question was already there before my proposal.) The
subrange shares data with the original range but when the original
range gets garbage-collected, the shared data remains with the
subrange. That's why I have included it into the size bound.

> 3) `range=?` lacks a running time guarantee.

Indeed. It was on my TODO list but I missed it nevertheless. The
running time should be the same as for range-every (but see below).

> 4) In range-append, "O(n) + O(k) time" should I think just be "O(n) time", since k is bounded above by n.  The running time with k ranges of 1 element is O(n), since n = k, and the running time with 1 range of n elements is also O(n).

The problem are ranges with zero elements. I can have zillions of
empty ranges. So, actually, procedures like range-every (that have to
inspect all ranges at least once) also need addition of O(k).

> Terminological change: I have converted "O(n) where n is the sum ..." to be "O(s) where s is the sum" so that O(n) always implies the number of elements.

+1

So the remaining construction site [is this an idiom one can use in
English as well?] seems to be to make sense of "the size of an
object".

-- Marc

> On Sun, Sep 6, 2020 at 5:33 AM Marc Nieper-Wißkirchen <xxxxxx@nieper-wisskirchen.de> wrote:
>>
>> So here is some constructive proposal to (hopefully finally) resolve
>> the problem with the running time guarantees. Instead of listing all
>> proposed changes, I applied them to the latest draft in John's
>> repository and you'll find the corresponding patch file attached to
>> this email.  Please go over the proposed changes.
>>
>> Besides filling in some running time and size guarantees that were
>> missing and removing/rewriting some not absolutely correct statements,
>> the major changes were: Add the notion of an average accessing time,
>> on which everything else is built. For each procedure returning a
>> range, the average accessing time is given. Drop the notion of a
>> vector-style range, which had no clear meaning.
>>
>> Please ask if you don't understand my rationale behind one or the other thing.
>>
>> Thanks,
>>
>> Marc