SRFI-234: Various comments Maxime Devos 03 Oct 2022 14:17 UTC
Hi,

(I'm new to the SRFI process)

From: Shawn Wagner
> Got a kick out of this bit from the reference implementation comments:
>
>>I guess lots of Scheme programmers have already written similar code.

Here's another one:
<https://git.savannah.gnu.org/cgit/guix.git/tree/guix/import/utils.scm#n518>.
  However, it has a bug <https://issues.guix.gnu.org/58198>, so I looked
for another implementation to replace it.  I recommend the test case
from that bug report, if the test suite hasn't such a thing already.

(Cyclic graphs):
> Please replace raising an exception in the case of a cyclic graph with a non-exception-based return method, e.g. by returning `#f`.
>
> The rationale behind this is that exceptions should be raised in error situations (violations, in terms of R6RS, in case of programmer errors, and errors in case of user/environment errors) and not for conveying a different-from-normal return value.

I recommend an exception -- the procedure is for topological sorting,
which is by definition on a non-cyclic graph (cyclic graphs don't have a
topological ordering (except for self-cycles I suppose)).  Hence,
passing a cyclic graph is a type error.

Additionally, in Guix (a Scheme program of some size), we have had
several cases where procedures returned #false to signal problems but
where the users assumed the procedure was used correctly, even though it
was regularly used incorrectly, leading to mostly silent incorrect
behaviour.  Raising an exception instead would have prevented that.

In fact, we are moving away from procedures returning #false to
procedures raising an exception instead.

> On Thu, Aug 11, 2022 at 2:04 AM Marc Nieper-Wißkirchen <xxxxxx@nieper-wisskirchen.de> wrote:
>
>     Am Do., 11. Aug. 2022 um 10:54 Uhr schrieb Shawn Wagner <xxxxxx@gmail.com>:
>
>         Or even have it allow cycles. tsort(1) can handle them, so there's got to be at least one algorithm that can deal with them in a graph....
>
>
>     What is the output of `tsort` in the case of cycles?  I think it is important to be able to detect when not all dependencies can be satisfied because of cycles and not be silent about it.  (But an exception would be too noisy for general code.)
>
> The GNU version at least prints out a warning to stderr about it: [...]

I think an exception to indicate there is no topological ordering would
be rather clear 'warning'.

> Another option for error handling if cycles aren't allowed is to take an optional third argument - if it's a value, return that instead, if it's a thunk, run it. Default can be #f or a lambda that raises an error, whatever more people prefer. That way you can customize the behavior at point of use.

That's already possible:

   (or (topological-sort ...) (error "oops"))

or

   (guard (c ((some-exception? c) #false))
     (topological-sort ...)).

Also, procedures are values, sometimes you really want to return a
thunk.  To avoid this ambiguity, if you go for an additional argument,
you can tell people to do (const #false) (or (lambda () #false) if your
Scheme doesn't have 'const').

Now I think of it, returning #false by default or running appropriate
thunk argument/returning some argument has a benefit above exceptions:
in some odd cases, '=' might call topological-sort, an exception raised
by the inner topological-sort could easily confused for an exception
raised outside.  An argument _for_ returning #false this time?

 > It is an error if the same node (in the sense of =) appears in more
 > than one connection.

You mean: (0 1 1), right?

Perhaps the following should be an error too (to avoid complicating the
implementation):

   ((0 1)
    (0 2))

-- AFAICT such graphs are currently allowed by the specification.

Also, you haven't specified what the default '=' is -- is it eqv?, which
in typical situations can't be used meaningfully on strings, is it =
which can't be used on symbols and numbers, or is it equal? which can be
used on almost everything?  I recommend 'equal?' as a default, to not
limit ourselves to numbers.  (IIRC, I have used such a procedure on
strings or records.)

> Mentioned by Marc in the other thread,
> https://en.wikipedia.org/wiki/Tarjan's_strongly_connected_components_algorithm

I suppose this is an option, but the current title of the SRFI is
'Topological sorting', not 'strongly connected components', so it
extends the scope of this SRFI (not necessarily a bad thing though).

Also, a proposal for another interface -- in Guix, the API is

(define (topological-sort nodes
                           node-dependencies
                           node-name)
   "Perform a breadth-first traversal of the graph rooted at NODES, a
list of
nodes, and return the list of nodes sorted in topological order.  Call
NODE-DEPENDENCIES to obtain the dependencies of a node, and NODE-NAME to
obtain a node's uniquely identifying \"key\"."
   ...)

(please disregard the 'graph rooted at', that's not part of my proposal)
-- there, 'nodes' is an initial list of nodes, and the remaining nodes
are discovered automatically with 'node-dependencies' (I don't recommend
the 'node-name' though, I would keep the '=' argument).

I don't have any arguments beyond 'looks nice' though.

Greetings,
Maxime.