Re: Constructing master lists of data types hga@xxxxxx 30 Sep 2019 17:59 UTC

> From: Lassi Kortela <xxxxxx@lassi.io>
> Re: Constructing master lists of data types
> Date: Monday, September 30, 2019 10:09 AM
>
>> What are the various levels of Scheme data types we want to support,
>> and how, for databases?
>
> In my experience this kind of task is usually best to approach by
> making a survey and putting the results in a huge table. It gives
> confidence that we're making the right choices, and the survey usually
> finds things that are surprisingly easy or surprisingly hard. That can
> change the entire design.

I've come up with a better idea, or rather, a better form of table.

Let's put this in ... a database ^_^!

Seriously: especially if we stick with my idea of using import
namespacing (see below), we'll want to programmatically create a
variety of artifacts that look at this type data in different ways.

We'll want our packaging/building code to ask questions like: what are
the type quirks of PostgreSQL?  The same, but narrowed to a specific
major version?  And what bugs must we work around?

At another level: what data types does *this* Scheme implementation
need supported, and what are its quirks and bugs?
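To make that concrete, here's a minimal sketch of what the schema might
look like, expressed as the SQL our application would feed to the
database; every table and column name below is a placeholder I'm making
up for illustration, not a proposal:

;; Hypothetical schema for the master type-info database.
(define type-info-schema
  '("CREATE TABLE scheme_type (
       name        TEXT PRIMARY KEY,  -- e.g. 'exact-integer'
       description TEXT)"
    "CREATE TABLE db_type (
       db          TEXT,              -- e.g. 'postgresql'
       db_version  TEXT,              -- e.g. '11'; '*' = all versions
       name        TEXT,              -- e.g. 'numeric'
       scheme_type TEXT REFERENCES scheme_type(name),
       quirk       TEXT,              -- free-form quirk/bug note
       PRIMARY KEY (db, db_version, name))"))

"What are the type quirks of PostgreSQL?" then becomes a one-line
query: SELECT name, quirk FROM db_type WHERE db = 'postgresql'.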

Another advantage would be eating our own dogfood: this would be a
simple enough, but not quite trivial, database application.  I'll still
need to find one or more synthetic applications that e.g. exercise all
of SQL-92 that's relevant to sdbi, but this will be something "real".

> [ "Text is messy" for a later reply. ]
>
>> At the database side, I need to construct something that contains all
>> the types a column (or whatever for e.g. graph databases) can have,
>> which are supported by which databases, and with what quirks. This
>> will eventually become architecture and code to Do The Right Thing in
>> the middle of the sdbi stack, automagically, or at the direction of
>> the user.
>
> Auto-conversion is very nice when it works. I suspect the problem domain
> is complex enough that users should be able to configure what
> auto-converter (if any) to use for each data type. It can help in
> scenarios that require high performance on big data sets to turn off
> some unused conversions. Or if you're interfacing to a legacy DB that
> has particularly weird data, some standard conversions may not work right.

That's what I've been thinking.  If you just want John's simple style of
Scheme persistence for a greenfield application, or the automagic
conversions are "obvious", you can use the API's defaults.  If you need
conversion specificity, you supply the necessary procedure(s), which may
come from general sdbi libraries.
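As a rough sketch of how that might look at the API level (sdbi-connect
and the type-conversions key are names I'm inventing here, not settled
API):

;; Hypothetical: open a connection with the default automagic
;; conversions, overriding only the ones we care about.
(define conn
  (sdbi-connect '(sdbi connector net postgresql)
                `((database . "inventory")
                  ;; user-supplied converters, keyed by database type;
                  ;; #f turns a conversion off entirely, e.g. for speed.
                  (type-conversions
                   . ((numeric . ,string->number)  ; exactness matters
                      (bytea   . #f))))))          ; keep raw bytes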

> From: Lassi Kortela <xxxxxx@lassi.io>
> Subject: Re: sdbi design in detail and MariaDB CONNECT
> Date: Monday, September 30, 2019 11:28 AM
>
> [...]

>> This is a very good point.  To use the current (iffy) hierarchy of sdbi
>> for this, I have, for example, running in bare direct prototype form:
>>
>> (sdbi connector ffi sqlite3)
>>
>> And:
>>
>> (sdbi connector net postgresql)
>>
>> To which Lassi is proposing to add something like:
>>
>> (sdbi connector subprocess-jdbc)
>>
>> Which would be used as:
>>
>> (sdbi connector subprocess-jdbc [obscure database])
>
> The ffi/net/subprocess hierarchy is very nice!

Thanks, although this discussion revealed a hole in it, which you have
filled below.

WRT my comments on using a database to store type info: a database
will also be wanted for this hierarchy, to supply the details from
which we construct general boilerplate code for the parts below sdbi.
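I.e., something along these lines, where query and emit-connector-stub
are stand-ins for code we'd have to write:

;; Hypothetical: walk the hierarchy table in that database and emit
;; one stub library per (route, database) pair.
(for-each
 (lambda (row)                      ; row is e.g. ("net" "postgresql")
   (emit-connector-stub (car row) (cadr row)))
 (query db "SELECT route, db FROM connector"))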

> I thought it'd be just (sdbi connector subprocess). The JDBC
> subprocess would speak the same generic protocol as all other
> subprocesses, and the specific database to which JDBC should connect
> can be given parameters in the "connection string" when opening the
> connection. At least it's been my impression of JDBC that it's generic
> enough that this kind of thing would work :)

Yeah, from my impression of JDBC, that'll work.  It would require a bit
of magic to dynamically load the specific database driver into the JVM.

That would appear to duplicate information, but there's a big
difference between the last element of (sdbi connector subprocess
postgresql) and JDBC being told to use the org.postgresql/postgresql
42.1.4 library.  It implies a subprocess capable of accepting all of
that, vs. simply being told to run the SQLite3 library, but if you say
it can....

In any case, we need to think hard about what's in the import
hierarchy vs. what's in the "connection string" alist (or the
equivalent optional configuration argument that follows), and make
sure we don't duplicate information between the two.
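For instance (all names below invented for illustration), the import
form would name only the route, while the alist carries the JDBC
specifics, so the driver coordinates live in exactly one place:

;; Import names only the route: a subprocess speaking the generic
;; protocol.
(import (sdbi connector subprocess))

;; Hypothetical connection alist carrying the JDBC-specific details.
(define conn
  (connect '((bridge  . jdbc)
             (driver  . "org.postgresql/postgresql")
             (version . "42.1.4")
             (url     . "jdbc:postgresql://localhost/testdb"))))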

>> Perforce, there will be a generic protocol between the connector
>> layer and the one below it, although in John's favor I need to look
>> at the PostgreSQL and Widenius wire protocols for ideas.
>
> If I have understood JDBC correctly, it can use the generic subprocess
> protocol with no changes.

We'll figure it out in due course.

> [ CONNECT, except for: ]
>
> Probably :) Here too I'd say the main point is to design the Scheme
> side of the database framework so that any net/ffi/subprocess is easy
> to add.  Then if CONNECT turns out to be dodgy we can implement
> another route.

As I see it, based on our understanding of what John said, by virtue of
having its own SQL engine that the user writes queries for, and
connecting to e.g. CSV files, CONNECT would count as its own database.
It would fit like this:

(sdbi connector [ffi | net | subprocess] mariadb-connect)

And/or it would be dynamically told which CSV files to open, for which
we need a mechanism; noted.
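Something like this, perhaps, with csv-tables standing in for whatever
mechanism we settle on:

;; Hypothetical: tell mariadb-connect which CSV files to expose as
;; tables when the connection is opened.
(define conn
  (connect '((csv-tables . (("expenses" . "/var/data/expenses.csv")
                            ("payroll"  . "/var/data/payroll.csv"))))))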

Which brings up the question: why did I insert "connector" in this?
The answer is human-comprehensible namespacing: underneath connector is
a large set of system- and database-specific things, whereas under,
say, (sdbi type) might be
"do-not-leave-home-without-these-utf8-procedures".

If something is generic but within the remit of sdbi, it wouldn't live
under connector.  Each database, though, would probably have a
"final-type-conversion" library under it, like:

(sdbi connector net mariadb-connect type-conversion)

The user would not need to know about it.  The PostgreSQL Snow library
I've adopted already does a lot of that to good effect.
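The core of such a library might be little more than a pair of
procedures; here's a minimal sketch of one direction, with the type
names and mappings purely illustrative:

;; Convert a wire-format string to a Scheme value, keyed on the
;; column's declared database type; unknown types pass through.
(define (db-value->scheme column-type raw)
  (case column-type
    ((integer bigint decimal) (string->number raw))
    ((boolean)                (string=? raw "1"))
    (else                     raw)))

(db-value->scheme 'integer "42")  ; => 42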

- Harold