Re: Attempt at a stack of data formats to make everyone happy Alaric Snell-Pym 24 Sep 2019 09:02 UTC
On 20/09/2019 21:59, John Cowan wrote:

> https://bitbucket.org/cowan/r7rs-wg1-infra/src/default/CoreSexps.md is the
> next stab at core S-expressions.

I'd be inclined to remove the provision that numbers outside the stated
ranges may not interoperate.

1. How SHOULD one represent arbitrary numbers when they crop up in the
problem domain, then? Define a bignum format as a list of 64-bit
integers and have code to convert between them and proper numbers? Ugh!

2. People will forget about the restriction when using systems that
support bignums, which will work happily in their testing, but break in
undefined ways when interoperated with arbitrary third-party systems. Ugh!
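
To make point 2 concrete, here is a minimal Python sketch. The two "peers" are hypothetical stand-ins, not anything CoreSexps specifies: one stores every number as an IEEE 754 double, the other as a signed 64-bit integer. A value that works fine on a bignum system is silently altered by the first and rejected by the second.

```python
# A value that any bignum-capable system handles without complaint.
big = 2**64 + 1

# Hypothetical peer A: stores every number as an IEEE 754 double.
# The conversion succeeds, but the low bits are silently lost.
as_double = int(float(big))

# Hypothetical peer B: stores every integer in 8 bytes, signed.
def fits_int64(n):
    try:
        n.to_bytes(8, "little", signed=True)
        return True
    except OverflowError:
        return False

print(as_double == big)        # False: the value changed in transit
print(fits_int64(big))         # False: this peer cannot store it at all
print(fits_int64(2**63 - 1))   # True: the largest value it can hold
```

Neither failure shows up when both ends of a test happen to support bignums.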

Now, given that CoreSexps adds a new syntax #{ ... }, I don't think
there's any point in trying to make it "compatible" with (read) on any
existing Lisp by avoiding syntax that "might cause problems"; arbitrary
data shouldn't be fed into (read) in most cases due to syntax in very
many Lisp implementations that can execute arbitrary code!

So I think it should be "compatible with s-expressions" for *human*
purposes (not needing to learn a new language), and perhaps to allow the
s-expression syntax of RnRS to become a superset of it in time (we can't
back-fill a written syntax for hash tables into R7RS now, alas) so that
CoreSexp literals can be written as-is in RnRS programs. But trying to
find a lowest common denominator of s-expression syntaxes is, I think, a
flawed approach, even if we then didn't leap straight out of that subset
by extending it with #{ ... }!

So my suggestion would be:

1) Take the s-expression syntax from R7RS, which IIRC has no remote code
execution defined in the standard (as opposed to CL's); but remove the '
` , ,@ syntactic sugars that just expand into (quote ...) and friends
anyway.

1a) I'm not sure if we should remove improper lists from the syntax...
It would be nice to be able to have non-Lisp implementations of this
model able to assume that lists are proper lists and can map to their
own list types.

2) Add syntax for arbitrary types, perhaps of the form #NAME{ ... };
where NAME is a registry extended via SRFIs... hash tables are
important/common enough to claim the empty NAME and be written as #{key
val key val}, time objects can get #time{TYPE SECONDS NANOSECONDS}, etc.

3) Define an SRFI with "safe" read and write procedures that read and
write exactly this language, and also with a procedure to register
arbitrary type readers/writers so the arbitrary type list can be
extended by portable SRFI implementations.

Other languages can have their own implementations like that SRFI, doing
their best to map from our types into theirs.
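
As a rough illustration of what such an implementation in another language might look like, here is a Python sketch of a "safe" reader with a type registry. Everything named here (`safe_read`, `register_type_reader`, `Symbol`, the `time` tag mapping to a tuple) is an assumption made for the example, not part of any SRFI or of CoreSexps. It handles only proper lists, integers, strings without escape processing, symbols, and `#NAME{ ... }`; it rejects the quote sugars and dotted pairs, and never evaluates anything.

```python
import re

class Symbol(str):
    """A symbol, kept distinct from a string literal."""

# Hypothetical registry: NAME -> function taking the parsed items inside { ... }.
TYPE_READERS = {}

def register_type_reader(name, reader):
    TYPE_READERS[name] = reader

# Tokens: "#NAME{", single delimiters, string literals, or bare atoms.
TOKEN = re.compile(r'\s*(#[A-Za-z0-9-]*\{|[(){}]|"(?:[^"\\]|\\.)*"|[^\s(){}"]+)')

def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"bad syntax at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse_until(tokens, i, closer):
    """Parse data until the closing delimiter; return (items, next_index)."""
    items = []
    while True:
        if i >= len(tokens):
            raise ValueError(f"missing {closer!r}")
        if tokens[i] == closer:
            return items, i + 1
        value, i = parse(tokens, i)
        items.append(value)

def parse(tokens, i=0):
    """Parse one datum starting at tokens[i]; return (value, next_index)."""
    if i >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[i]
    if tok == "(":
        return parse_until(tokens, i + 1, ")")
    if tok.startswith("#") and tok.endswith("{"):
        name = tok[1:-1]  # empty for plain #{ ... }
        items, i = parse_until(tokens, i + 1, "}")
        if name not in TYPE_READERS:
            raise ValueError(f"unregistered type tag: {name!r}")
        return TYPE_READERS[name](items), i
    if tok in (")", "}", "{"):
        raise ValueError(f"unexpected {tok!r}")
    if tok == "." or tok[0] in "'`,":
        raise ValueError(f"syntax not allowed in this subset: {tok!r}")
    if tok.startswith('"'):
        return tok[1:-1], i + 1  # escapes are not processed in this sketch
    if re.fullmatch(r"[+-]?[0-9]+", tok):
        return int(tok), i + 1
    return Symbol(tok), i + 1

def safe_read(text):
    tokens = tokenize(text)
    value, i = parse(tokens)
    if i != len(tokens):
        raise ValueError("trailing data after datum")
    return value

def read_hash_table(items):
    if len(items) % 2:
        raise ValueError("hash table needs an even number of items")
    return dict(zip(items[0::2], items[1::2]))

# The empty NAME is claimed by hash tables; "time" is an illustrative extension.
register_type_reader("", read_hash_table)
register_type_reader("time", tuple)  # (TYPE, SECONDS, NANOSECONDS)

print(safe_read("#{a 1 b (2 3)}"))
print(safe_read("#time{utc 1568970120 0}"))
```

An unregistered tag such as `#foo{1}` raises `ValueError` rather than being guessed at, so the set of accepted types is exactly what has been registered. One limitation of mapping hash tables onto Python dicts is that keys must be hashable, so a list used as a key would fail here.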

>> Here are my suggestions for rock-bottom S-expressions:
>>
>> Proper lists as we know them.  They might turn into vectors in non-Lisp
>> systems.
>>
>> Alists as we know them.  They might turn into hashtables or dictionaries
>> in non-Lisp systems.  We always format an alist element (1 2 3) as (1 . (2
>> 3)).

How can we tell if an alist is an alist when writing? It's all just cons
cells and atoms... I'd prefer to use hash tables here, which can be
unambiguously detected and written as #{ ... } under CoreSexps syntax.
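
A small Python sketch of the writer's side of this, under the assumption that nested Python lists stand in for cons-cell structure and dicts stand in for hash tables (the `write` function is hypothetical): the writer can dispatch on the dict type, but it has no way to know whether a list of lists was meant as an alist.

```python
def write(value):
    """Serialize a value; only dicts can be recognized as key/value mappings."""
    if isinstance(value, dict):
        body = " ".join(f"{write(k)} {write(v)}" for k, v in value.items())
        return "#{" + body + "}"
    if isinstance(value, (list, tuple)):
        return "(" + " ".join(write(v) for v in value) + ")"
    if isinstance(value, bool):  # checked before int, since bool is an int subtype
        return "#t" if value else "#f"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        return value  # treated as a symbol in this sketch
    raise TypeError(f"cannot write {type(value).__name__}")

# Is this an alist, or just a list of two-element lists? The writer cannot tell.
pairs = [["a", 1], ["b", 2]]
print(write(pairs))             # ((a 1) (b 2))

# A hash table carries its own type, so the intent is unambiguous.
print(write({"a": 1, "b": 2}))  # #{a 1 b 2}
```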

--
Alaric Snell-Pym   (M7KIT)
http://www.snell-pym.org.uk/alaric/