Re: Portable S-expressions elf (20 Apr 2021 12:59 UTC)
As a thought... define a compat-symbol/number as the first entry in the file, which will determine specific variations. Or define a new symbol for things like booleans, one that is not used by any implementation or language and that can be trivially parsed into a local representation.

In other words, it may be useful to define a sub-or-super-language which can be trivially parsed by all extant languages, instead of trying to find a subset of all languages that can be parsed identically by all languages. I think that any such subset will be too small to be useful, and will eventually result in either of the two options above, if the idea continues at all.

-elf

On 20/04/2021 15:48, Lassi Kortela wrote:
> Thank you for the comments!
>
> [Henceforth, POSE is the working name for this notation :)]
>
> John wrote:
>> In order to evaluate this idea, we need a list of Lisps that the
>> syntax is to be portable to. For example:
>>
>> 1) Vertical bars are not portable to R6RS systems
>
> Yes, they are not that portable. Not even Emacs Lisp has them. Common
> Lisp, R7RS, and Racket do.
>
>> 2) #(...) vector syntax is portable to all Lisps I know of except
>> Elisp (square brackets) and Interlisp (no syntax).
>
> I would leave out vectors from a portable syntax since lists can
> represent all data that vectors can. Interop with languages like Python,
> Ruby and JavaScript is important, and those have only one
> list/vector/array type.
>
> It would also be nice to leave open the possibility of future Lisp
> and/or Scheme dialects where dotted pairs don't exist and lists are
> actually vectors (heresy, yes, but AFAICT most of the good stuff from
> Scheme would work fine without a pair datatype).
>
>> 3) R5RS requires support only for \\ and \" in strings.
>
> Yes, R5RS string escapes are delightfully minimalist; Common Lisp is the
> same. Thankfully both standardized on backslash as the escape character.
>
> For the string and symbol syntax, we could be ultra-minimalist, drop
> vertical-bar symbols, and drop string escapes other than \\ and \". But
> then we can't represent a significant amount of real text. It would be
> nice if a procedure that writes POSE can be confident that any (Unicode)
> string or symbol it is told to write can in fact be encoded somehow. It
> can already be confident that it can encode any (proper) list; it would
> be nice to have the same guarantee for the other supported data types.
>
> elf wrote:
>> Don't forget #t/'() for certain Lisps, if we're talking
>> boolean-equivalents.
>
> Indeed, booleans are difficult to do in a syntax spanning all Lisp
> dialects since there is broad disagreement. Scheme has #t/#f and a
> disjoint empty list; CL has T and NIL with NIL being the same as the
> empty list, etc.
>
> It's clear that in a portable notation, () has to stand for the empty
> list for symmetry with the rest of the list syntax.
>
> IMHO a portable syntax shouldn't assign special meaning to symbols in
> the style of CL's NIL (which is actually an abbreviation of
> COMMON-LISP:NIL -- a NIL symbol in a package other than COMMON-LISP does
> not necessarily mean the empty list).
>
> IMHO the following would be best:
>
> - Don't specify a boolean datatype.
>
> - Suggest the use of the symbols true/false or t/f as a convention, but
>   do not require this or specify any special handling for it.
>
> - nil and t are not read in as special symbols.
>
> [This is off topic for Scheme, but in Common Lisp implementations it
> might make sense to have (defvar f nil) so people can use T and F for
> booleans. CL programmers can rebind global variables for the duration of
> a dynamic extent, so CL libraries that read formats like JSON or POSE
> could consult the dynamic value of F to figure out how to convert
> boolean false from the external notation to a Lisp value. For example,
> (let ((f 'false)) (json-read)) would convert it to the symbol 'false.]
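
A rough Python analogue of the dynamic-binding idea in the bracketed note above, for readers outside CL. This is only a sketch under assumptions: the names FALSE_VALUE, TRUE_VALUE, binding and convert_symbol are hypothetical and belong to no existing POSE or JSON library, and the conversion is applied by the application after reading, in line with the suggestion that the notation itself give these symbols no special handling.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical settings: what the conventional symbols convert to.
FALSE_VALUE = contextvars.ContextVar("FALSE_VALUE", default=False)
TRUE_VALUE = contextvars.ContextVar("TRUE_VALUE", default=True)


@contextmanager
def binding(var, value):
    """Rebind a context variable for the duration of a with-block."""
    token = var.set(value)
    try:
        yield
    finally:
        var.reset(token)  # restore the previous value on exit


def convert_symbol(name):
    """Apply the true/false (or t/f) convention to a symbol name.

    Any other name, including nil, is returned unchanged.
    """
    if name in ("false", "f"):
        return FALSE_VALUE.get()
    if name in ("true", "t"):
        return TRUE_VALUE.get()
    return name
```

Inside `with binding(FALSE_VALUE, "false"):` a call to convert_symbol("f") yields the string "false" instead of Python's False, and the default comes back when the block exits, which is roughly what the (let ((f 'false)) ...) form does in CL.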
>
>> Is the goal here _all_ lisps, all schemes, all CLs... ?
>
> All Lisp dialects (as well as non-Lisp languages whose users want to
> read and write S-expressions).
>
>> If the goal is all lisps and lisp variants, where is the line going to
>> be drawn? Arguably, Ruby, for example, is a lisp variant, as it was
>> first implemented in CL, and Python started out, at least, as being a
>> lisp with tabs instead of ().
>
> I already wrote a Python library, and will write a Ruby one as well.
>
>> If I may be so bold as to suggest, if the goal is _all_ lisps, someone
>> should present at ELS in a few weeks to get buy-in from the CL
>> community. (Even if it's just all schemes, it still may be a useful
>> thing to do, as they may have thought of something the scheme
>> community hasn't.)
>
> I avoid conferences but if someone else wants to present, no objection.
>
> The idea of a portable syntax is to strongly favor stuff that is already
> standard in existing syntaxes, so there ought not to be that much
> controversy, but feedback is always welcome. In case of controversy,
> it's probably best to err on the side of leaving things out.
>
> One thing I'd like to leave out is dotted pairs. Those are difficult to
> represent in languages where lists are vectors, which is most languages
> outside the Lisp family. Another point against dotted pairs is that they
> are most useful in situations where other idiosyncratic syntax from a
> particular Lisp dialect is called for as well. For example, a
> substantial amount of Scheme code is not directly representable in POSE
> due to the lack of booleans, vectors, and datum comments, which loses
> that use case of dotted pairs.
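
To make the minimalist subset discussed in the quoted message concrete, here is a small Python sketch of a reader that maps lists onto Python lists, accepts only the \\ and \" string escapes, and rejects dotted pairs. It is not the Python library mentioned in the quoted message; the names Symbol, tokenize, parse and read_pose are hypothetical, and treating integer-looking atoms as numbers is an assumption, since the thread has not settled number syntax.

```python
class Symbol(str):
    """A str subclass, so type checks can tell symbols from strings."""


def tokenize(text):
    """Split text into "(", ")", ("string", s) and ("atom", s) tokens."""
    tokens, i, n = [], 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in "()":
            tokens.append(ch)
            i += 1
        elif ch == '"':
            i += 1
            buf = []
            while True:
                if i >= n:
                    raise ValueError("unterminated string")
                ch = text[i]
                if ch == '"':
                    i += 1
                    break
                if ch == "\\":
                    # Only backslash and double quote may be escaped.
                    i += 1
                    if i >= n or text[i] not in '\\"':
                        raise ValueError("unsupported string escape")
                    ch = text[i]
                buf.append(ch)
                i += 1
            tokens.append(("string", "".join(buf)))
        else:
            start = i
            while i < n and not text[i].isspace() and text[i] not in '()"':
                i += 1
            tokens.append(("atom", text[start:i]))
    return tokens


def parse(tokens, pos=0):
    """Parse one datum starting at pos; return (datum, next_pos)."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        items = []
        pos += 1
        while True:
            if pos >= len(tokens):
                raise ValueError("missing )")
            if tokens[pos] == ")":
                return items, pos + 1  # () becomes the empty list
            item, pos = parse(tokens, pos)
            items.append(item)
    if tok == ")":
        raise ValueError("unexpected )")
    kind, value = tok
    if kind == "string":
        return value, pos + 1
    if value == ".":
        raise ValueError("dotted pairs are not supported")
    try:
        return int(value), pos + 1  # assumption: integer atoms are numbers
    except ValueError:
        return Symbol(value), pos + 1


def read_pose(text):
    """Read exactly one datum from text."""
    tokens = tokenize(text)
    datum, pos = parse(tokens)
    if pos != len(tokens):
        raise ValueError("trailing data after datum")
    return datum
```

Because every list becomes a plain Python list, there is no natural place to put the tail of a dotted pair, which is the difficulty the last quoted paragraph describes for languages outside the Lisp family.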