3 databases in 3 days hga@xxxxxx (30 Sep 2019 00:36 UTC)
Support for Scheme standards and implementations Lassi Kortela (30 Sep 2019 08:11 UTC)
Re: Support for Scheme standards and implementations hga@xxxxxx (30 Sep 2019 11:25 UTC)
Scheme implementations and portability Lassi Kortela (30 Sep 2019 13:14 UTC)
Re: Scheme implementations and portability John Cowan (30 Sep 2019 19:27 UTC)
Scheme implementations, portability, FFIs Lassi Kortela (30 Sep 2019 21:16 UTC)
Re: Scheme implementations, portability, FFIs John Cowan (30 Sep 2019 22:10 UTC)
JDBC Lassi Kortela (30 Sep 2019 13:15 UTC)
Re: JDBC hga@xxxxxx (30 Sep 2019 13:24 UTC)
Re: JDBC and subprocess protocol Lassi Kortela (30 Sep 2019 14:29 UTC)
Re: JDBC and subprocess protocol hga@xxxxxx (30 Sep 2019 15:16 UTC)
Re: JDBC and subprocess protocol Lassi Kortela (30 Sep 2019 15:47 UTC)
Re: JDBC and subprocess protocol Lassi Kortela (30 Sep 2019 15:55 UTC)
Re: JDBC John Cowan (30 Sep 2019 15:10 UTC)
Re: JDBC Lassi Kortela (30 Sep 2019 15:26 UTC)
Re: JDBC Lassi Kortela (30 Sep 2019 15:34 UTC)
sdbi design in detail and MariaDB CONNECT hga@xxxxxx (30 Sep 2019 16:14 UTC)
Re: sdbi design in detail and MariaDB CONNECT Lassi Kortela (30 Sep 2019 16:28 UTC)
Re: sdbi design in detail and MariaDB CONNECT John Cowan (30 Sep 2019 20:25 UTC)
Re: JDBC John Cowan (30 Sep 2019 16:44 UTC)
Re: JDBC Lassi Kortela (30 Sep 2019 20:52 UTC)
Re: JDBC Alaric Snell-Pym (01 Oct 2019 09:26 UTC)
Re: JDBC hga@xxxxxx (01 Oct 2019 09:55 UTC)
Re: JDBC Alaric Snell-Pym (01 Oct 2019 11:09 UTC)

Re: JDBC Alaric Snell-Pym 01 Oct 2019 09:26 UTC
On 30/09/2019 16:25, Lassi Kortela wrote:
>> That is not at all clear to me.  As I pointed out, both designs involve
>> three processes: the client, the server, and an intermediate program
>> involving Java.
>
> I expect the solution without MariaDB will be much lighter. DB engines
> are quite big pieces of software and non-trivial to install.

Let me get this straight (I'm catching up on these emails, having had
not enough sleep): We're wondering how to connect to the rich world of
JDBC and the options on the table are:

1) Use a Java subprocess to use the JDBC libraries directly, sending
actual SQL query strings into the JDBC implementation that get sent to
the actual backend database and executed there.

2) Connect to a MariaDB server using a native-scheme protocol client,
and ask the MariaDB server to use CONNECT to proxy individual tables
from the JDBC backend.

...is that right?

In which case, I'm afraid to say, option (2) offers only rather poor
access to JDBC databases.

Say you have a massive database accessible via some JDBC driver (perhaps
it's a large Oracle cluster, the kind that big companies spend millions
of $ on).

With option (1) we could send a query like "SELECT
user_addresses.region, COUNT(*) FROM users INNER JOIN user_addresses ON
users.id = users_addresses.user_id WHERE user.type = 5 GROUP BY
user_addresses.region", and the Oracle cluster would:

1) Use an index on "user.type" to find the ten thousand user records
with "type = 5" out of the millions of user records.

2) Do fancy distributed algorithms to implement the INNER JOIN and the
GROUP BY, using the massive amounts of RAM in the database cluster to
buffer intermediate results and lots of cool tech like that.

3) Send back the result - perhaps ten records in total - over the JDBC
connection when it's ready.

With option (2), however, we'd use CONNECT to proxy the user and
user_addresses tables into a MariaDB server, and we'd send the same
query to the MariaDB server, which would:

1) Issue the following two queries via JDBC to the Oracle server:

   - "SELECT * FROM users WHERE type = 5"
   - "SELECT * FROM user_addresses"

...bringing back ten thousand user records from the first query, and
millions of user_addresses records from the second query. The Oracle
server gets to use the index on "user.type" but is otherwise just a file
server for tables.

2) Do the INNER JOIN between those and GROUP BY on the MariaDB server
set up just to do this proxying, while the expensive oracle server sits
idle.

3) Send the ten records of result back via the MariaDB protocol.

My point is, although these "proxy external tables" things are really
handy for getting some kind of access to the data, they're a pain to set
up (you need to configure each table separately) and they disallow using
many of the features of the underlying database. If the Oracle server
had GIS extensions installed, for instance, they probably wouldn't be
usable from MariaDB, unless perhaps you configured special CONNECT
tables just for that query that used them in the SRCDEF queries - you
wouldn't be able to just send a query using Oracle extensions into
MariaDB and expect it to work.

--
Alaric Snell-Pym   (M7KIT)
http://www.snell-pym.org.uk/alaric/