Re: range->vector - Simplelists

Show/hide message thread

range->vector Marc Nieper-Wißkirchen (01 Sep 2020 11:29 UTC)

Re: range->vector John Cowan (01 Sep 2020 15:29 UTC)

Re: range->vector Marc Nieper-Wißkirchen (01 Sep 2020 15:45 UTC)

Re: range->vector Wolfgang Corcoran-Mathe (01 Sep 2020 16:33 UTC)

Re: range->vector John Cowan (01 Sep 2020 17:12 UTC)

Re: range->vector Marc Nieper-Wißkirchen (01 Sep 2020 17:27 UTC)

Re: range->vector Wolfgang Corcoran-Mathe (01 Sep 2020 17:34 UTC)

Re: range->vector Marc Nieper-Wißkirchen (01 Sep 2020 17:36 UTC)

Re: range->vector Arthur A. Gleckler (01 Sep 2020 17:37 UTC)

Re: range->vector Marc Nieper-Wißkirchen (01 Sep 2020 17:38 UTC)

Re: range->vector Wolfgang Corcoran-Mathe (01 Sep 2020 17:46 UTC)

Re: range->vector John Cowan (01 Sep 2020 18:23 UTC)

Re: range->vector Arthur A. Gleckler (01 Sep 2020 18:40 UTC)

Re: range->vector John Cowan (01 Sep 2020 18:42 UTC)

Re: range->vector Marc Nieper-Wißkirchen (01 Sep 2020 18:52 UTC)

Re: range->vector Marc Nieper-Wißkirchen (01 Sep 2020 19:22 UTC)

Re: range->vector John Cowan (03 Sep 2020 00:15 UTC)

Re: range->vector Wolfgang Corcoran-Mathe (03 Sep 2020 02:27 UTC)

Re: range->vector John Cowan (03 Sep 2020 03:35 UTC)

Re: range->vector Marc Nieper-Wißkirchen (03 Sep 2020 06:47 UTC)

Re: range->vector Wolfgang Corcoran-Mathe (03 Sep 2020 18:04 UTC)

Re: range->vector Wolfgang Corcoran-Mathe (03 Sep 2020 18:27 UTC)

Re: range->vector Marc Nieper-Wißkirchen (03 Sep 2020 19:10 UTC)

Re: range->vector Wolfgang Corcoran-Mathe (03 Sep 2020 20:32 UTC)

Re: range->vector Wolfgang Corcoran-Mathe (03 Sep 2020 07:11 UTC)

Re: range->vector Marc Nieper-Wißkirchen (03 Sep 2020 07:14 UTC)

Re: range->vector Marc Nieper-WiÃkirchen 03 Sep 2020 06:47 UTC

With all due respect, John's analysis below is not correct. This is
probably mostly because the SRFI still contains some non-meaningful
statements about running times.

Saying that something runs in O(g(n)) makes sense if you have a
clearly-defined function of running times in terms of a natural number
n. This is, for example, the case in the situation of "string-range".
The number n here is clearly the number of characters in the string.
If I double the number of characters, "string-range" will (at most)
roughly run twice as long.

An instance, where the stated running time is rather meaningless is,
for example, the case of "range->list". It claims that the running
time is O(n) where n is the number of elements in the range. However,
as ranges aren't uniform, I can easily create a sequence of ranges r1,
r2, ... with rn of length n such that the sequence of running times of
(range->list r1), (range->list r2), ... grows quadratically or worse.

Finally, we have this statement in the specification section:
"Statements about running time exclude time spent in the indexer
procedure. However, this SRFI recommends that indexers run in O(1)
time." The first of the two sentences assumes that there is a definite
indexer procedure for each range. But the only general indexer
function we have is (lambda (i) (range-ref i range)), but this cannot
be meant. And the second sentence about that the running time of
indexers is recommended to be O(1) does not make sense unless the
independent variables are known.

To solve all this, we can either try hard and make everything sound in
the specification section. Or, alternatively, I would strip the
specification of all logically meaningless things and put it into a
detailed section about algorithmic complexity in the rationale.

But let us now come back to the issue raised with range-append and
range-reverse. All running time guarantees in the current document are
functions in the number n, which is meant as the number of elements in
the range. Now the number of arguments of range-append is a number
completely unrelated, so let us call it k. So given the original
implementation of range-append, it fulfills all prescribed algorithmic
complexities (insofar they can be given a meaning). If range-append
copied all elements that would just be unnecessary practically and,
moreover, would make it rather useless.

Luckily, the analysis above that takes care to distinguish between n
and k helps us here to find the correct solution (that will also apply
to range-reverse). In the implementation (not necessarily exposed to
the user), range objects receive another field, namely a "complexity
number" k. Ranges created by range, numeric-range, vector-range or
string-range have complexity 0. When "range-append" appends k ranges,
the resulting complexity is the sum of the complexities. Likewise,
"range-reverse" increases the complexity number by one. When, and only
when, the complexity number reaches an implementation-defined
threshold (or the logarithm of the length of the resulting range), the
constructor unpacks all values in a newly allocated vector and resets
the complexity to 0. To make it even more efficient, the unpacking
happens lazily, so not before the first access through range-ref.

I think this is a very satisfying solution. It fulfills all intended
running guarantees, it makes range-append fast, and the concept is
easily extendable to other future range combinators.

Happy coding!

Marc

Am Do., 3. Sept. 2020 um 05:35 Uhr schrieb John Cowan <xxxxxx@ccil.org>:
>
> Range-reverse is safe unless you reverse the range over and over (which is unlikely), because the reverser part is trivially O(1).
>
> On Wed, Sep 2, 2020 at 10:27 PM Wolfgang Corcoran-Mathe <xxxxxx@sigwinch.xyz> wrote:
>>
>> On 2020-09-02 20:15 -0400, John Cowan wrote:
>> > So the safe implementation is to unpack all the elements of all the ranges
>> > into a single vector-style range, which makes string-append O(n) but
>> > preserves the guarantee of an O(1) range-ref in all cases.
>>
>> ACK.  I'll update the sample implementation of range-append shortly.
>>
>> > I added a note that the SRFI recommends that all indexers are O(1).
>>
>> This means that range-reverse should also return a vector-style range.
>> Since the earliest days of SRFI 196, the sample implementation of this
>> function has used a composed indexer (which was the only option,
>> pre-vector-range).
>>
>> --
>> Wolfgang Corcoran-Mathe  <xxxxxx@sigwinch.xyz>
>>
>> "The intellectual advancement of man depends on how often he can
>> exchange an old superstition for a new truth." --Robert G. Ingersoll