Thanks for having take a look at my write-up.

Am Do., 24. Juni 2021 um 21:05 Uhr schrieb Shiro Kawai <xxxxxx@gmail.com>:

Thanks for the write-up. So, this is to allow sharing substructures in functional interface, yet you can safely pass partially-shared object to linear-updating procedure, because such sharing will be automatically un-shared, right? It certainly allows an efficient implementation with the current functional/linear-updating API.

Yes, indeed. And one can make the linear update procedures look like normal imperative update procedures because it will be mandatory to update the argument.

Multi-threaded implementation may have difficulty to adopt it, though. However, it can have internal persistent/transient flag and prohibit sharing between transient objects; that is, linear updating procedure deep-copies input if it receives a functional one, while substructures of functional objects can be freely shared among other functional objects. It is less efficient than your share-counting scheme, but we can keep the API as is (without "functional interface always return a new object" restriction).

In my C Hamt implementation, I have atomic shared counters when the program is multi-threaded. Note that this is all that is needed. Any substructure is only modified in place if the shared counter is one. (It is just an error to modify the very (!) same object in two threads concurrently, but this is a normal restriction.)

So, multi-threaded implementations do not have to be treated specially.

-- Marc

It's better if we don't expose persistent/transient distinction in the specification level but leave it to the implementation. Then the required change of existing spec would be minimal. Some sort of a guide for possible implementation strategies would help.

On Wed, Jun 23, 2021 at 11:11 AM Marc Nieper-Wißkirchen <xxxxxx@nieper-wisskirchen.de> wrote:
I would like to discuss a different approach to the idea of destructive updates with functional data structures.

To recall, the current approach in SRFI 113 and SRFI 146 is that functional update procedures and linear update procedures get along well because the functional update procedures have to return newly allocated structures when non-trivial linear update procedures are present. While this works and is easy for the user, it makes the functional update procedures far too costly.

To remedy this, Shiro and I have discussed another approach, which would make it an error to reuse a value that has been an argument of a linear update procedure. By providing conversion functions between "persistent" and "transient" versions of the data structure at hand, the implementation can even signal an error, which will also detect the typical error of not using the result of a linear update procedure but its argument after the call. The disadvantage of this approach is that the user has to write more code (converting between transient and persistent versions).

The third approach I would like to discuss here is the one I have used in my Hamt module for Gnulib. There, I have functional update procedures and destructive procedures. Each node in my data structure is shared between different data structures and the amount of sharing bookmarked with a shared counter. When a destructive update happens, a node is copied first when the shared counter is greater than 1.

The idea would be to use the same for SRFI 113, SRFI 146, SRFI 224, etc. For maximal efficiency, it needs some support from the GC (e.g. by transport guardians or finalizers) to decrease the shared counters. A fully portable implementation cannot provide this, of course, so destructive updates would become more or less equivalent to functional ones (because copying is taking place).

The good thing with this approach is that destructive update procedures would no longer be linear (with the problem that the result and not the argument has to be used) but would look like ordinary imperative update procedures.

For example, we would have (mapping-adjoin m k v), returning a new mapping and leaving M intact, and (mapping-adjoin! m k v), modifying M in place but keeping earlier results derived from M intact.

Marc