From: John Cowan <xxxxxx@ccil.org>
Date: Saturday, July 06, 2019 1:22 PM

On Sat, Jul 6, 2019 at 11:21 AM <xxxxxx@ancell-ent.com> wrote:

And 'no-transaction is intended to be something of a lie, in that for databases that support transactions, the single procedure using it as its transaction handle must still execute the operation in a transaction.  The idea is to give the library code the information it needs to execute a single operation with the highest efficiency and safety.  And I'm pretty sure we don't want 'no-transaction to skip triggers.

I think it's better to call it 'auto-commit' for that reason, and to represent it by passing a database instead of a transaction to the various operators.  The effect is as if the transaction is begun, executed, and committed all at once.  How many network round-trips this takes is completely dependent on the implementation, of course.
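Under that convention, calls might look like this.  This is only a sketch: okvs-set! and okvs-in-transaction are the names from the SRFI-167 draft, while db, key, and the other bindings are invented for the example.

```scheme
;; Auto-commit: passing the database itself behaves as if a transaction
;; were begun, the operation executed, and the transaction committed.
(okvs-set! db key value)

;; Explicit transaction: the same operation grouped with others, all
;; committed (or rolled back) together.
(okvs-in-transaction db
  (lambda (tx)
    (okvs-set! tx key value)
    (okvs-set! tx other-key other-value)))
```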

Amirouche and I agree, and this has been done to the work in progress, except for calling it "auto-commit", and some suggested cleanup.

[...]
 
For maintenance purposes, we need a way to skip trigger procedures, but allow transactions. 

I think "constraints" is a better name than "triggers", even on the server side, particularly if they are going to be pure and functional (as I think they really must be, for safety's sake).

This class of databases is not pure and functional, and is not as ambitious as Datomic, which adds the dimension of time and conceptually never deletes anything (it has an out for legally required deletions, and pays for this with, for example, a relatively low insertion rate).

That said, requiring the procedures to be pure and functional is absolutely the right thing, rather than trying to automagically fix up goofs, such as requiring cascading deletes to maintain integrity (and come to think of it, I never designed a database that implemented them; it's too easy to accidentally lose important data).  And we will have a tool in SRFI-172, Two Safer Subsets of R7RS ... though that would restrict the number of Scheme implementations that can do this properly and Officially.  I can't speak to how easy it would be to do roughly the same thing in earlier revisions of Scheme, but it's something to aspire to.

And SQL's CONSTRAINTs are indeed functional, except for the special case of DEFAULT, which supplies a fixed value for a column when none is given; that is very handy for adding a column to a table without updating all existing rows.  If that sort of behavior is desired, it would have to be at the level of SRFI-168; for SRFI-167 the values of keys are completely opaque.

I still really don't know how this should work.  I think they have to be done in Scheme, and most probably on the client side, in the library that provides this SRFI.

Those are certainly fine ways of doing it.  The SRFI-167 library talks directly to the database, which probably doesn't have such concepts, let alone allowing you to easily implement them in Scheme.

A constraint-registering procedure should accept a prefix and a procedure (which gets the actual key, the value, and an insert/delete flag)

I have a feeling it would be better if there were separate procedures for insertion and deletion; what code could be easily shared if there was just one?

and returns a pass/fail indication.  An open question is whether failure should be a boolean or an exception (in which case passing means returning an unspecified value).

A boolean would imply that recovery is possible, but normally a constraint failure signals a data integrity or validation error that shouldn't have happened, or an error in coding the constraint, so it should raise an exception, perhaps a non-continuable one except for validation failures (like "is age > 18").  That fits this SRFI's interaction style, which does not use error returns.
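A minimal sketch of the registration interface discussed above, with failure signaled by an exception.  The names okvs-constraint-register! and *constraints* are invented for illustration and are not part of SRFI-167.

```scheme
;; Hypothetical registry: an alist of (prefix . procedure) pairs.
(define *constraints* '())

(define (okvs-constraint-register! prefix proc)
  ;; PROC receives the actual key, the value, and an 'insert/'delete flag,
  ;; and raises an exception on failure rather than returning a boolean.
  (set! *constraints* (cons (cons prefix proc) *constraints*)))

;; Example validation constraint: entries under prefix #u8(1) hold an
;; age that must be at least 18 on insertion.
(okvs-constraint-register!
 (bytevector 1)
 (lambda (key value operation)
   (when (and (eq? operation 'insert) (< value 18))
     (error "validation failed: age must be >= 18" key value))))
```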

But the proper scope in time and space of a registered constraint (global, per-machine, per-process, per-thread, per-connection, per-transaction) is still a complete mystery to me.

It needs to be centralized to maintain data integrity, so "per database", whatever that means: at a low enough level that all mutating transactions must go through it, i.e. implemented in this SRFI.  The scope in time I'm thinking of runs from the initialization of the higher-level library to program exit, unless you disable one or more constraints for maintenance, and then hopefully remember to re-enable them.  That suggests a wrapper rather than explicit disable and enable procedures, but a wrapper is not ideal for maintenance.

So it would apply to every properly initialized process on every machine, thread, connection, and transaction.

There should be no escaping this, other than failing to initialize the library above this SRFI, or explicitly disabling them afterward.  And there's the idea that an implementation would store the procedures in the database, and initialize them during okvs-open.  The only reasons not to make that mandatory are the added implementation-specific complexity, like Chicken requiring a trip to a C compiler to make them fast, and security: locking down your databases so people can't change them out of band.

[...]
 
That allows two approaches for deciding which triggers to call: either only the one that matches the largest fraction of the key prefix, or all that match some fraction of the prefix, called from the greatest to the least match.  If a no-prefix trigger exists, it would be called for every okvs-set!, okvs-delete!, or okvs-range-remove! mutation operation (for okvs-range-remove!, once for each key-value pair deleted).

How about extending the interface above so that a registered constraint procedure gets a continuation argument which it tail-calls to let higher-level constraints decide?  Best of both worlds and very Schemey.

But that would require me learning continuations for real....

More seriously, how much do we care about Scheme implementations that implement continuations slowly, imperfectly, or not at all?
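For what it's worth, the "continuation" in this proposal can be an ordinary procedure passed as an argument and tail-called, so call/cc performance doesn't enter into it.  A minimal sketch, with every name invented for the example:

```scheme
;; Each constraint receives a NEXT thunk and tail-calls it to hand
;; control to the less specific (shorter-prefix) constraints.  This is
;; plain continuation-passing style with ordinary procedures.
(define (run-constraints constraints key value operation)
  ;; CONSTRAINTS is already sorted from longest to shortest prefix match.
  (let loop ((cs constraints))
    (if (null? cs)
        #t  ; every constraint passed, or deferred to the end
        ((car cs) key value operation
                  (lambda () (loop (cdr cs)))))))

;; A column-level constraint that checks its own concern, then lets the
;; table- and group-level constraints have their say:
(define (age-column-constraint key value operation next)
  (when (and (eq? operation 'insert) (< value 18))
    (error "age must be >= 18" key value))
  (next))
```

A constraint that wants to be final simply returns without calling next, which gives both behaviors (most specific only, or the whole chain) with one interface.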

This sounds like it would require running every registered constraint for every key to "decide" whether it should fire, vs. using an index that would let the implementation efficiently decide which to call, in the desired order.  Any suggestions for such an index type?

Does a hierarchy of prefixes make sense?

I'd say it makes every kind of sense.  Consider group-of-tables, table, and column specified with longer and longer prefixes.

I think so as well, but I'd like to hear Amirouche's opinion.  For now, I'm going with a hierarchy.
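The group-of-tables / table / column idea can be illustrated concretely.  This sketch assumes SRFI-1 filter and SRFI-132 list-sort, and the key encoding is invented for the example:

```scheme
(import (scheme base) (srfi 1) (srfi 132))

(define (bytevector-prefix? prefix key)
  ;; #t when KEY begins with PREFIX.
  (let ((n (bytevector-length prefix)))
    (and (<= n (bytevector-length key))
         (let loop ((i 0))
           (or (= i n)
               (and (= (bytevector-u8-ref prefix i)
                       (bytevector-u8-ref key i))
                    (loop (+ i 1))))))))

;; Group-of-tables, table, and column as longer and longer prefixes:
(define group-prefix  (bytevector 7))       ; all accounting tables
(define table-prefix  (bytevector 7 1))     ; the employees table
(define column-prefix (bytevector 7 1 42))  ; the age column

(define (matching-prefixes key prefixes)
  ;; The prefixes that match KEY, most specific (longest) first.
  (list-sort (lambda (a b)
               (> (bytevector-length a) (bytevector-length b)))
             (filter (lambda (p) (bytevector-prefix? p key)) prefixes)))

;; A key in the age column matches all three levels, column first:
;; (matching-prefixes (bytevector 7 1 42 0 0 3)
;;                    (list group-prefix table-prefix column-prefix))
;; => (#u8(7 1 42) #u8(7 1) #u8(7))
```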

[ If these are constraints rather than triggers, there's no need for "before" and "after" ones, or for multiple per prefix sorted by name, which is very unSchemey.  And that issue is irrelevant if we allow multiple constraints per key. ]
 
We need to make this non-local implicit behavior discoverable, with something like (okvs-trigger-print 'set! prefix), which prints the procedure or procedures that get called; if there are multiple, it prints them in order, along with when and where the mutation is done.
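A sketch of what that helper could look like; okvs-trigger-print is the name suggested above, while the registry representation (*constraint-registry*, an alist of prefix/procedure pairs) is assumed for the example:

```scheme
;; Hypothetical registry: an alist of (prefix . procedure) pairs.
(define *constraint-registry* '())

(define (okvs-trigger-print operation prefix)
  ;; Print, in firing order, the procedures that OPERATION (e.g. 'set!)
  ;; on a key under PREFIX would invoke.
  (for-each (lambda (entry)
              (when (equal? (car entry) prefix)
                (display operation)
                (display " under ")
                (write prefix)
                (display " calls ")
                (write (cdr entry))
                (newline)))
            *constraint-registry*))
```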

If it really must be non-local, then yes.

The concept is inherently non-local; its purpose is to catch the mistakes you make in your higher-level library code.

 [ I suspect we're all agreed that it's beyond the scope of this SRFI to lock down access and mutation of the database. ]

- Harold