Email list hosting service & mailing list manager


Re: vector compare Jens Axel Søgaard 09 Jun 2005 15:43 UTC

Per Bothner wrote:
> Sebastian Egner wrote:

>>  > The default compare for vectors is unusual, and more critically it is
>>  > incompatible with the default compare for lists and strings.
>>  > The latter are both lexicographic compare.
>>
>> Correct. In this SRFI, vectors are compared first by length and then
>> lexicographically by default.

> Do you have any examples or use cases or algorithms that are simpler
> when using this "natural order"?  What prior art is there for
> using this order?

As Sebastian argues, there not 1 natural order of vectors, but the
ordering in the srfi is /a/ natural order. A concrete
example is the sorting of polynomials:

      x^3 > x^2 + 1 > x^2 > 42

A concrete representation in terms of vectors yields:

   #(3 0 0 0) > #(2 0 1) > #(2 0 0) > #(42)

The choice of default order should also be seen in the light of:

   "The purpose of default-compare is providing some well-defined way
   of comparing two arbitrary Scheme values. This can be used in all
   situations in which the user is unwilling to define a compare
   procedure explicitly, for example because the actual details of
   the total order do not really matter."

Consider now the case where you have a list of vectors of numbers,
and wants to convert it to a set of of vectors of numbers. An
obvious plan is to sort the list, and then make a single traversal
of the sorted list skipping duplicates. In this scenario the
actual order is unimportant.

Suppose now that we have the following list:

   (list #10000(0 ... 0 1)
         #10001(0 ... 0 2)
         #10002(0 ... 0 3)
         #10003(0 ... 0 4))

In this scenerio the order used in the srfi is better
efficiencywise than the lexicographical order -- and not
only by a constant factor.

It is also possible to find a list where the srfi-ordering
is slower than the lexicograpic, but the slow down
will be bounded by a small constant factor. E.g.

   (list #10000(0 ... 0 1)
         #10000(0 ... 0 2)
         #10000(0 ... 0 3)
         #10000(0 ... 0 4))

--
Jens Axel Søgaard