Re: Encoding projects to kick off this year Alaric Snell-Pym (10 Jul 2020 16:37 UTC)
On 08/07/2020 15:13, Lassi Kortela wrote:
> Things are converging such that I need to start putting more time into
> encodings again.
>
>
> == Subprocess protocol
> [...]
> Basically, a program `parent` would run another program `child` as a
> subprocess, with a binary pipe to and from the child's stdin/stdout. The
> pipe would speak a standard, very lightweight messaging/RPC protocol
> (not yet decided which one).

Yeah! I did exactly this for my content-addressable storage system,
Ugarit. The reasoning there was that (a) you get a "plugin" system for
third-party storage backends without needing to mess with shared
libraries portably, and (b) you can make pipelines with ssh to access
backends on remote systems.

To clarify the latter: the configuration lets you specify a command line
for the storage backend process, which might be:

"ugarit-foo-backend /path/to/directory"

or it might be:

"ssh xxxxxx@server 'ugarit-foo-backend /path/to/directory/on/server'"

I used a basic protocol I hacked together for the task at hand. Requests
and responses are (depending on type) either just an s-expression, or an
s-expression including a length followed by that many bytes of raw data
(as Ugarit's all about throwing around large blocks of data), which is
clearly very tailored to the task I had at hand.

However, I want to expand it to support TCP sockets and UNIX-domain
sockets as well as just subprocesses (mainly because using ssh as a way
to access remote servers introduces some messy latency at times). So I
intend to extend a separate project of mine, "bokbok", which provides RPC
services over TCP or UNIX-domain sockets, to also support subprocesses in
the same framework, and then port Ugarit to use Bokbok.

This can also allow me to make Ugarit more efficient, as Bokbok's
protocol allows for multiple requests in progress at once; the old Ugarit
protocol was strictly "send a request, wait for the response", and Ugarit
could benefit from increased parallelism in its operations.
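
To make the framing concrete: the "header carrying a length, then that
many bytes of raw data" shape described above can be sketched roughly as
follows. This is a Python illustration only, not Ugarit's actual wire
format. The `(name length)` header and the names `write_message` and
`read_message` are made up for the example, and an in-memory buffer
stands in for the pipe to the child's stdin/stdout.

```python
import io


def write_message(stream, name, payload=b""):
    """Write one message: a header line, then len(payload) raw bytes.

    The "(name length)" header is a hypothetical shape chosen for this
    sketch; it is not the format Ugarit really uses.
    """
    header = "({} {})\n".format(name, len(payload)).encode("ascii")
    stream.write(header)
    stream.write(payload)


def read_message(stream):
    """Read one message; return (name, payload), or None at end of stream."""
    line = stream.readline()
    if not line:
        return None  # the other side closed the pipe
    text = line.decode("ascii").strip()
    if not (text.startswith("(") and text.endswith(")")):
        raise ValueError("malformed header: %r" % text)
    name, length_text = text[1:-1].split()
    length = int(length_text)
    # The payload is never parsed; we just take exactly `length` bytes.
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("stream ended in the middle of a payload")
    return name, payload


# An in-memory buffer plays the role of the pipe here.
pipe = io.BytesIO()
write_message(pipe, "put", b"\x00\x01raw)\n")
write_message(pipe, "flush")
pipe.seek(0)
first = read_message(pipe)
second = read_message(pipe)
```

Because the header announces the length, the payload can contain
anything, including newlines and parentheses, without any escaping.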
The high latency of ssh means that, when backing up to a remote backend,
a lot of time is spent with the Ugarit frontend and backend processes
doing nothing other than waiting for ssh/the network to handle a small
request.

So, yeah, this is cool, but I recommend NOT making it specific to
subprocesses, and instead sharing the work of defining
request/response/error formats with a more general RPC system!

Relatedly, while working on the binary encoding of Scheme values used by
bokbok on the wire, I implemented a subset of John's ASN.1-based binary
sexpr thing, which I'd like to spin off as a separate project that
becomes a full implementation one day! (I didn't want to just use
read/write as the Ugarit protocol does, because I want to use bokbok in
environments with less trust, and read/write are too complicated to
secure, especially with implementations including reader extensions that
can execute arbitrary code.)

> Binary S-expressions need to be done. John is rooting for ASN.1; still
> an open question whether it is the best foundation to build on.

I think that actual compatibility with ASN.1 as specified isn't a very
interesting goal (who, exactly, wants to interoperate between Scheme and
existing ASN.1 things using this?), but using ASN.1 BER as an
inspiration to draw upon (and to copy the useful parts of when there's
no downside to doing so) is... pragmatic. Some thought was put into the
basic tagged value structure of BER, so why not use it?

I have a few issues with specifics of John's proposal, though, which I
can expound upon when the time comes!

ABS

--
Alaric Snell-Pym   (M7KIT)
http://www.snell-pym.org.uk/alaric/
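
For reference, the "basic tagged value structure of BER" mentioned above
is tag, then length, then value. Below is a minimal Python sketch,
assuming single-byte tags and definite lengths only. It is neither
John's proposal nor the bokbok encoding, and the function names are made
up for illustration.

```python
def encode_length(n):
    """BER definite length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    # Long form: first octet is 0x80 plus the number of length octets.
    return bytes([0x80 | len(body)]) + body


def encode_tlv(tag, value):
    """Encode one value as tag octet, length octets, then the contents."""
    return bytes([tag]) + encode_length(len(value)) + value


def decode_tlv(data, offset=0):
    """Decode one value at `offset`; return (tag, contents, next_offset).

    Sketch limits: no multi-byte tags and no indefinite lengths.
    """
    if offset + 2 > len(data):
        raise ValueError("truncated header")
    tag = data[offset]
    first = data[offset + 1]
    offset += 2
    if first < 0x80:
        length = first
    else:
        count = first & 0x7F
        if count == 0:
            raise ValueError("indefinite length not handled in this sketch")
        if offset + count > len(data):
            raise ValueError("truncated length")
        length = int.from_bytes(data[offset:offset + count], "big")
        offset += count
    end = offset + length
    if end > len(data):
        raise ValueError("truncated contents")
    return tag, data[offset:end], end


# Universal tags from ASN.1: 0x02 INTEGER, 0x04 OCTET STRING, 0x30 SEQUENCE.
# A two-element sequence nests by putting encoded values inside the contents.
inner = encode_tlv(0x02, b"\x05") + encode_tlv(0x04, b"hi")
seq = encode_tlv(0x30, inner)
```

Decoding only ever copies bounded byte ranges and never evaluates
anything, which is the property that matters in the less-trusted
environments mentioned above.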