Encoding projects to kick off this year Lassi Kortela (08 Jul 2020 14:13 UTC)
Things are converging such that I need to start putting more time into encodings again.

== Subprocess protocol

Based on that database subprocess thing we did, I have a general idea to establish a protocol for subprocess servers of all kinds. It would be programming language agnostic: there's no need to tie it to Lisp/Scheme.

Basically, a program `parent` would run another program `child` as a subprocess, with a binary pipe to and from the child's stdin/stdout. The pipe would speak a standard, very lightweight messaging/RPC protocol (not yet decided which one). The protocol would have standard data types (approximately the same set that JSON has); standard ways to mark messages as command, answer, and error; and a standard facility for reflection (i.e. finding out which messages are supported). Of course, since it's just a pipe, one can transparently switch to a socket (perhaps with TLS) instead.

Subprocesses like this could be made for file systems, file formats, databases, query engines, data sources, and anything else one can think of. Most acutely we'd need the databases for Schemepersist, but this feels like the kind of thing that would turn into a thriving cottage industry as long as the protocol is simple and we seed it with a set of useful programs.

There are a zillion encodings that could be used for the protocol: JSON-RPC, MessagePack, S-expressions, ASN.1, etc. Suggestions gratefully accepted. Two things I don't like about most of the alternatives:

* Too many data types that are only marginally useful.
* Arbitrary limits or complexities in integer encoding.

One main question about the protocol is how to represent messages: Lisp-style lists (message-type . args) or JSON-style objects {"message-type": "foo", "args": ...}.

== Cataloguing S-expression variants

I want to start an "encyclopedia" of every variant of S-expressions ever. It can be seeded with the syntaxes of the major Lisp dialects.
I'd also like to accumulate a library capable of reading and writing all of them (not that hard, since there are so many commonalities that it can be made out of reusable building blocks).

Once we have that library, it can be plugged into a universal Lisp pretty-printer. Then we can make a code formatter that can format every Lisp dialect, with customized indentation for macros. The hard parts are parsing (preserving comments), figuring out where to put line breaks, and customized indentation for macros. It makes sense to do the job right once.

We can start with Marc Feeley's popular pretty-printer as well as these papers:

* Strictly Pretty (Lindig 2000) -- Arthur has implemented this one. The paper translates some Haskell work to OCaml, which should in turn be translatable to Scheme. The Haskell folks discovered a small set of generic combinators that can act as the backbone of pretty-printers for arbitrary languages.
* AI Memo 279: Pretty-Printing (Goldstein 1973) -- Describes the classic GRIND algorithm from the MIT Lisp community.

== Defining portable variants of S-expressions

We should formally define some known-portable variants that people can use when they want assurance of cross-Lisp interoperability. LOSE (line-oriented s-expressions) is a start, as is Core S-expressions, which John drafted based on discussions on the schemepersist list last year.

Binary S-expressions need to be done. John is rooting for ASN.1; it is still an open question whether that is the best foundation to build on.

== Schemas and translators

There needs to be a generic language to easily write schemas for S-expression-based formats.

Discovering generic rules to translate between S-expression-like and JSON-like formats would be very useful, as S-expressions are quite fringe.
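
To make the translation idea concrete, here is a rough Python sketch of one possible rule set, applied to the two message shapes mentioned in the subprocess protocol section. Nothing here is a decided design: the names `Symbol`, `tokenize`, `parse`, `write`, `to_json_message` and `from_json_message` are hypothetical, the reader handles only a tiny subset (lists, integers, strings without escapes, symbols), and the mapping "head symbol becomes `message-type`, the rest becomes `args`" is an assumption made for illustration.

```python
import json


class Symbol(str):
    """A Lisp symbol, kept distinct from a plain string while in Python."""


def tokenize(text):
    """Split a tiny S-expression subset into parens, strings, and atoms."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in "()":
            tokens.append(ch)
            i += 1
        elif ch == '"':
            # Assumption: no escape sequences inside strings in this sketch.
            j = text.index('"', i + 1)
            tokens.append(text[i:j + 1])
            i = j + 1
        else:
            j = i
            while j < len(text) and not text[j].isspace() and text[j] not in '()"':
                j += 1
            tokens.append(text[i:j])
            i = j
    return tokens


def parse(tokens):
    """Parse one datum from the front of the token list (consumed in place)."""
    tok = tokens.pop(0)
    if tok == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # drop the closing paren
        return items
    if tok.startswith('"'):
        return tok[1:-1]
    try:
        return int(tok)
    except ValueError:
        return Symbol(tok)


def write(datum):
    """Write a datum back out as S-expression text."""
    if isinstance(datum, Symbol):
        return str(datum)
    if isinstance(datum, str):
        return '"' + datum + '"'
    if isinstance(datum, list):
        return "(" + " ".join(write(d) for d in datum) + ")"
    return str(datum)


def to_json_message(datum):
    """Lisp-style (message-type . args) -> JSON-style object."""
    head, *rest = datum
    if not isinstance(head, Symbol):
        raise ValueError("message must start with a symbol")
    return {"message-type": str(head), "args": rest}


def from_json_message(obj):
    """JSON-style object -> Lisp-style list. Symbols inside args are not
    recovered: they were flattened to strings on the way out."""
    return [Symbol(obj["message-type"])] + list(obj["args"])


if __name__ == "__main__":
    message = parse(tokenize('(get "users" 42)'))
    as_object = to_json_message(message)
    print(json.dumps(as_object))
    print(write(from_json_message(as_object)))
```

Even this small case shows where generic rules get hard: JSON has no symbol type, so symbols that appear among the arguments come back as strings unless the rules add some tagging convention.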