Tieing up loose ends Dr. Arne Babenhauserheide 12 Jun 2024 13:24 UTC
Hi,

I implemented the changes I see as the result of the discussion for draft #3.

Here I’ll first line out the changes and then note how the open threads
are addressed.

Open question: should the edgelist->graph and edgelist/inverted->graph
procedures also take a nodes argument, so complex graph algorithms which
retrieve the edges via the node values from some backing store could use
the same API? In that case asc would be the accessor and the edgelist
would be #f.

Changes:

- remove exception throwing version

- add an optional nodes vector to topological sort which can be used when
  operating on indizes. It is what typical graph algorithms call
  nodelist.

- edgelist->graph and edgelist/inverted->graph take an optional asc
  argument to retrieve a dependency element from the graph. When using
  eq? for =, it must be assq for asc.

- note in SRFI that for non-hashable or non-hashable data the
  sorting should be done on index-values to the nodes vector
  - rationale: moving graph algorithms into abstract access risks
    theoretically optimal but actually extremely slow algorithms. I’ve
    seen this play out in decentralized networks: either it needs
    complex caching, or access becomes a bottleneck. For such cases,
    the algorithm should instead operate on minimized data that gets
    dereferenced to the nodes after sorting.

Open threads:

> * Alternative topological sorting implementation

I did not address this. It would also rely on implementation-specific
features, so I don’t think it’s the best version for this SRFI.

> * Cyclic graphs
> * Tarjan's SCC

I added procedures to find connected components to address that; but a
simpler version than Tarjan’s SCC.

Also I switched to returning #f in case of cycles (with additional
information as additional values).

> * SRFI 234: Topological Sorting

Adding algorithmic bounds could be useful. These need to be decided,
though, because they should not restrict alternate implementations too
much.
The current implementation seems to be worst case O(n^2), due to using
assoc to get dependents.

For taking a comparator, I’m not sure:
https://srfi-email.schemers.org/srfi-234/msg/20488616/

That would make usage much more complex, because while we have 4
equality-predicates (eq?, eqv?, equal?, =), each data type has different
comparators (string<, <, whatever for lists, …) and accessors.

When using a more advanced graph datastructure, such an accessor could
be stored in the graph created by the matching edgelist->graph
implementation while staying compatible to the API. Then the retrieved
values will have to be double-checked with the equality proc (=).

> * SRFI-234: Various comments

The decision-process about exception or #false is done and implemented
locally (#false), I don’t think there are other parts that need
addressing.

> * Topological Sorting in the Cookbook

That’s for after finalization.
⇒ update https://cookbook.scheme.org/topological-sort/

Best wishes,
Arne
--
Unpolitisch sein
heißt politisch sein,
ohne es zu merken.
draketo.de