Re: Constructing master lists of data types Lassi Kortela 30 Sep 2019 15:09 UTC
> What are the various levels of Scheme data types we want to support, and how, for databases?

In my experience this kind of task is usually best approached by making a survey and putting the results in a huge table. It gives confidence that we're making the right choices, and the survey usually turns up things that are surprisingly easy or surprisingly hard. That can change the entire design.

> Text is *messy*

Yes. We may have to delegate most of the messes to DB engine configuration or leave them as the user's problem :)

> databases tend to be set to one encoding

Yes, but the percentage of data in a large database that actually uses the claimed encoding is another matter :) I've dealt with huge databases that had continual character encoding problems. Likewise, people who administer long-lived web forums can be heard complaining about this.

> but SQLite3 is more flexible. Suggestions on precisely what to do here are solicited, this is not a thing I know very well.

If the DB engine's encoding is known, and we're running in a Scheme implementation like Gauche that supports many character encodings, we can construct strings with the correct encoding right off the bat. In Schemes that use only one character encoding internally, when the query results use a different encoding, we should probably return the results as bytevectors instead of strings. Then the user can hook up a charset conversion library if they want strings (or reconfigure their database to emit the encoding the Scheme uses). DB engines are big pieces of software, so many of them may have charset conversion engines built in, letting us pick the encoding we want to receive in the connection string (https://www.connectionstrings.com/).

> At the database side, I need to construct something that contains all the types a column (or whatever for e.g. graph databases) can have, which are supported by which databases, and with what quirks.
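The strings-vs-bytevectors decision above can be sketched roughly as follows. This is only an illustration of the idea, written in Python for concreteness (the function name and API are my own, not sdbi's): try to decode the raw column bytes with the database's declared encoding, and fall back to handing back the raw bytes (the bytevector case) when the data doesn't actually match the claimed encoding.

```python
def convert_text(raw: bytes, db_encoding: str):
    """Decode a raw column value using the DB's declared encoding.

    Returns a string when the bytes really are valid in that encoding;
    otherwise returns the raw bytes unchanged (the "bytevector" case),
    so the caller can apply its own charset conversion library.
    """
    try:
        return raw.decode(db_encoding)
    except (UnicodeDecodeError, LookupError):
        return raw

# Data that matches the claimed encoding decodes to a string...
assert convert_text("naïve".encode("utf-8"), "utf-8") == "naïve"
# ...but bytes that lie about their encoding come back untouched.
assert convert_text(b"\xff\xfe", "utf-8") == b"\xff\xfe"
```

The same shape works per-column rather than per-connection, which matters for engines like SQLite3 where the declared encoding is advisory.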
This will eventually become architecture and code to Do The Right Thing in the middle of the sdbi stack, automagically, or at the direction of the user. Auto-conversion is very nice when it works. I suspect the problem domain is complex enough that users should be able to configure which auto-converter (if any) to use for each data type. In scenarios that require high performance on big data sets, it can help to turn off unused conversions. And if you're interfacing to a legacy DB with particularly weird data, some standard conversions may not work right.
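A per-type converter registry like the one described above could look something like this. Again a hypothetical sketch in Python, not sdbi's actual design: a table maps column type names to converter functions, unknown types pass through unchanged, and the user can delete an entry to turn a conversion off for performance or for weird legacy data.

```python
# Hypothetical per-type auto-converter registry; names and API are
# assumptions for illustration, not part of sdbi.
from datetime import date

converters = {
    "DATE": date.fromisoformat,  # auto-convert ISO date strings
    "INTEGER": int,
}

def convert_column(type_name, value, converters):
    """Apply the configured converter for a column type, if any.

    Types with no registered converter pass through unchanged, which
    is also what happens after the user disables a conversion.
    """
    conv = converters.get(type_name)
    return conv(value) if conv else value

assert convert_column("INTEGER", "42", converters) == 42
assert convert_column("DATE", "2019-09-30", converters) == date(2019, 9, 30)

# Turning a conversion off: the raw value now passes through.
del converters["DATE"]
assert convert_column("DATE", "2019-09-30", converters) == "2019-09-30"
```

The survey table proposed earlier would essentially become the default contents of this registry, with one table per DB engine to capture the quirks.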