Re: Harold steps forward (John steps back)

Re: Harold steps forward (John steps back) hga@xxxxxx 23 Sep 2019 16:26 UTC
> From: Lassi Kortela <xxxxxx@lassi.io>
> Date: Monday, September 23, 2019 8:23 AM

>> Harold has kindly volunteered to lead the effort to create a SRFI
>> specifying a Scheme API for talking to relational-style databases.
>> I am very grateful to him for this: as chief cook and bottle-washer
>> for the ongoing R7RS-large effort, I have more than enough on my
>> plate without the database API too, and I was starting to feel
>> "thin and stretched", as Bilbo says.  I have been impressed with
>> Harold's effort and dedication on the SRFI 170 (Posix) design and
>> implementation, and I am confident that he can produce a good
>> result that will unify this very complex and difficult problem.
>
> Excellent. John, your indefatigable efforts with the larger R7RS are
> much appreciated.

Indeed, and that's why I'm stepping up to the plate for this.

>> By "relational-style" I mean that the query language is not limited
>> to SQL or even the general family of relational algebras.  It can
>> be anything at all, provided the result is organized as rows and
>> named columns. Some graph query languages provide this sort of
>> output, and will be under the umbrella: others don't, and will
>> require their own API.  (Perhaps "rectangular" would be a better
>> word than "relational" in "RDBMS".)
>
> Bright idea to limit the problem domain based on result shape.

That was the key to precisely scoping which databases to support.

Even for relatively alien data relationships like graphs, the greater
database community has created, and as a formal standard at the level
of SQL is creating, query languages with rectangular results.  As for
a query language being weird, well, that's not exactly unheard of with
SQL itself, or with things like the PostgreSQL query language
extensions for JSON data.
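
To make the "rectangular" criterion concrete, here is a minimal sketch
in Python (not Scheme, and not a proposed API) using the standard
sqlite3 module.  The helper name fetch_rectangular and the table and
column names are made up for illustration; the point is only that any
backend able to hand back column names plus a list of rows fits the
scope.

```python
import sqlite3

def fetch_rectangular(conn, query, params=()):
    """Run a query and return (column_names, rows).

    Hypothetical helper: a result is "rectangular" if it can be
    described this way, whatever query language produced it.
    """
    cur = conn.execute(query, params)
    # cursor.description holds one 7-tuple per column; item 0 is the name.
    columns = [d[0] for d in cur.description]
    rows = cur.fetchall()
    return columns, rows

# Illustrative in-memory database; the schema is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE srfi (number INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO srfi VALUES (?, ?)",
    [(106, "Basic socket interface"), (170, "POSIX API")],
)
columns, rows = fetch_rectangular(
    conn, "SELECT number, title FROM srfi ORDER BY number"
)
print(columns)
print(rows)
```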

>> Harold tells me that he hopes to provide drivers for at least
>> PostgreSQL, SQLite, and MySQL/MariaDB (the Widenius sisters), the
>> most widely used open-source RDBMS engines.  He will also be taking
>> Oracle and Microsoft SQL Server, the most widely used proprietary
>> engines, into account as far as possible, along with whatever else
>> makes sense.
>
> Harold, we could decide how to split the implementation workload (of
> course assisting each other as needed, but development will probably
> go easier and produce better code if one person is the main
> developer / code reviewer of a particular backend).

That's a very good point, although database backends will consist of a
number of parts, including, I think, some (mostly?) shared
database-specific type round-tripping libraries, as well as multiple
Scheme-implementation-specific libraries where supervision is needed.

> - I could finish the subprocess backends of SQLite and Postgres.

Let's call that concept a "connector" or "connection method".

That would be fine, but I feel I must personally get my hands dirty
with both of those databases and several connection methods.  Based in
part on my experience creating a SRFI 170 sample implementation for
Chibi Scheme, I plan for Chibi to be my primary development platform.

An FFI is natural for SQLite3 as an embedded database, and there's a
(chibi sqlite3) Snow library by Alex Shinn to get me started there.

There's an R7RS-portable (postgresql) Snow library by Takashi Kato
that uses the SRFI 106 Basic socket interface for that database's wire
protocol.  SRFI 106 isn't a part of Chibi Scheme, but likely most or
all of its hard parts are in the (chibi net) package, which is getting
some serious attention right now.

Your subprocess concept will likely be of particular value in
supporting other Schemes, and we should definitely provide official
good, if not first-class, support for at least one other
implementation.

If you're far along on a PostgreSQL one, perhaps I should make that
my primary development connector for Chibi; I have yet to really scope
the effort required for the above plan.

But I think the single most leveraged thing you could do immediately
would be a subprocess (and maybe TCP/IP) connector to JDBC using Java,
which would also exercise serialization with something in addition to
C, Java of course being awfully popular.
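
As a rough illustration of the serialization plumbing such a connector
needs, here is a minimal sketch in Python of rendering a rectangular
result as S-expression text that a Scheme reader could consume.
Everything in it is an assumption for illustration only: the function
names to_sexp and result_to_sexp, the bare symbol null for SQL NULL,
and the layout (column names first, then one list per row) are
hypothetical and not a proposed wire format.

```python
import math

def to_sexp(value):
    """Render a Python value as S-expression text (hypothetical encoding)."""
    if value is None:
        return "null"  # assumption: SQL NULL becomes the symbol null
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "#t" if value else "#f"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, float) and math.isfinite(value):
        return repr(value)
    if isinstance(value, str):
        # Escape backslashes first, then double quotes.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    if isinstance(value, (list, tuple)):
        return "(" + " ".join(to_sexp(v) for v in value) + ")"
    raise TypeError(f"no S-expression encoding for {value!r}")

def result_to_sexp(columns, rows):
    """Encode column names followed by each row as one nested list."""
    return to_sexp([list(columns)] + [list(row) for row in rows])

text = result_to_sexp(["id", "name"], [(1, 'say "hi"'), (2, None)])
print(text)
```

Database-specific types (dates, blobs, decimals) are deliberately left
out; the sketch raises TypeError for anything it does not recognize
rather than guessing at an encoding.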

> - Do you want to do C FFI backends of those databases?

See above.  A C FFI PostgreSQL backend for Chibi Scheme would be nice,
but probably not necessary.  Any client/server database with a stable
enough wire protocol is a lower priority for C FFIs; perhaps they are
more needed for newer non-SQL databases if those don't yet make
promises about the stability of their wire protocols.  Plus, the two
of those I'm personally targeting have C libraries that are highly
thought of.

> - What about the JDBC subprocess backends you expressed interest in?
> I looked into JDBC cursorily; comes standard with Java (no
> dependencies other than JVM) and seems nice enough.

See above, and as we've previously discussed, JDBC seems to be the
only practical generic FOSS connector for supporting a host of
databases.

> - We could also explore a Python backend, but DB support would
> mostly overlap Java and C. I can't find a list of databases
> supported by Python. So far the C libraries have not been hard to
> do; the effort went into spending a day writing the S-expression
> support for C.

I don't know Python, and as you note, I don't see a need for it as of
now.  Java (and/or Kawa) should suffice for JDBC, Chibi Scheme C FFIs
for the C library route (other Scheme implementation fans, please
speak up!), and any Scheme implementation that can use sockets for
connectors to server databases with a stable wire protocol.  And I
suppose in theory we ought to standardize the socket use on SRFI 106.

For the most important commercial databases, which are officially on
our "provide very good, at minimum second class, support" list
(anything adding a third process, like JDBC, is "second class"), John
previously noted:

> I seriously doubt that the wire protocol for either Oracle or
> Microsoft is unstable.  The wire protocol for Oracle is the same
> back to 8.1.7 (2000), and for SQL Server back to 2005, IIUC.

And thus they should at some point get connectors that are mostly
independent of Scheme implementations.

> - I'd be happy if more people want to write backends for something;
> SQLite and Postgres is enough for me :)

I've tentatively signed up to do Apache Cassandra and Neo4j; see more
in my following posting on the entire project.

>> I look forward to seeing what he comes up with.
>
> John, do you still want to be actively involved with developing the
> text/binary data encodings despite leaving the database stuff to Harold?

I don't consider the serialization effort to be in the *design* remit
of my Scheme DBI/DBD; rather, it'll be necessary plumbing for things
like the JDBC connector, and the subprocess ones you are already
working on.

- Harold