Re: Core lexical syntax

Re: Core lexical syntax Alaric Snell-Pym 25 Sep 2019 15:44 UTC

On 25/09/2019 15:09, John Cowan wrote:

> The hyphen and underscore situation is a bit different.  Lisp programmers
> like hyphens in identifiers, as do Cobol programmers (even though Cobol has
> infix minus, which means you have to use spaces around it), because our
> languages date back to early punch card systems where underscore did not
> exist.  (In early Fortran, identifiers were limited to six characters and
> you didn't waste any of them on internal delimiters, or rather IDLTRS.)
> The younger languages, if they allow internal delimiters at all, use
> underscore, thus clearly separating it from infix minus.
>
> Nobody really needs _two_ internal delimiters, so I suggest that we either
> allow only "-" and leave it up to non-Lisp systems to change it to "_" if
> they are happier with that, or allow both but warn sternly against using
> both foo-bar and foo_bar for different purposes.

When will symbols be used as identifiers in other programming languages?
Lisp happens to use s-expressions to represent source code so symbols
are used to represent Lisp identifiers, but that relationship doesn't
hold for other languages!

>
>> Improper list   (exprs . expr)
>>
>
> *Nobody* outside the Lisp community knows what this is.  Even languages
> with linked-list support internally almost always allow only a pair or ()
> in the cdr slot.  To make it interoperate would require non-Lisp
> implementations to wrap their native array, vector, or list (in the case of
> Python) type in an opaque record type wrapper, which would block the use of
> native operations on it.  It's nothing but a nuisance to them.  And for
> data interchange as opposed to serialization, who cares about the
> difference between (a b) and (a . b) anyway, even in Lisp?  The extra pair
> is essentially free.  (I have added an improper-list type to ASN.1 LER,
> primarily for serialization.)

Although I find it desirable to support as many Scheme values as
possible in the serialisation format, I am willing to be swayed on
improper lists, as they're almost deprecated in Scheme itself in many
respects!

>
>> Vector          #{exprs}
>> Character       #\a #\newline #\x1234
>>
>
> These also, to a lesser degree, are distinctions without a difference.
> Lists vs. vectors?  Characters vs. strings of length 1?  Who cares?  One
> type is enough for each: general-purpose sequence and
> sequence-of-Unicode-codepoints.

Well, Scheme makes a distinction, and not being able to round-trip
Scheme values as simple as vectors and characters seems a shame...

Meta Question: Are we defining a serialisation format for Scheme values
(modulo some impractical things, like closures and ports) - in which
case we want to be able to cleanly round-trip as much as possible, in
general - or a new data format/model that just happens to have a
correspondence to a subset of Scheme?

Such talk as "not distinguishing vectors from lists" makes sense in the
latter case, but not in the former.
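
To illustrate the round-trip concern, here is a minimal Python sketch.
It assumes a toy writer in which Python lists stand in for Scheme lists
and Python tuples stand in for Scheme vectors; the function names
write_distinct and write_merged are invented for this example, and the
#{...} vector syntax is the one quoted above.

```python
def write_distinct(value):
    """Write lists as (...) and vectors (tuples here) as #{...}."""
    if isinstance(value, tuple):
        return "#{" + " ".join(write_distinct(v) for v in value) + "}"
    if isinstance(value, list):
        return "(" + " ".join(write_distinct(v) for v in value) + ")"
    return str(value)


def write_merged(value):
    """Write every sequence as (...), as a one-sequence-type model would."""
    if isinstance(value, (tuple, list)):
        return "(" + " ".join(write_merged(v) for v in value) + ")"
    return str(value)


a_list = [1, 2, 3]     # stands in for the Scheme list (1 2 3)
a_vector = (1, 2, 3)   # stands in for the Scheme vector #(1 2 3)
```

With write_distinct the two values produce different text, so a reader
can rebuild the original type. With write_merged they produce identical
text, so the list/vector distinction cannot survive a round trip.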

>> Special value   #!any-plain-symbol
>> Special type    #any-plain-symbol{exprs}
>>
>
> These are good on human readability and simplicity, but not so good for
> stability.  Are they really safe to ignore if you don't understand them?
> (I wish ASN.1 had such a must-understand flag.)

There are two levels of "safe to ignore" we need to think about here.

1. In an editor that lets you navigate subexpressions in a smart way,
like emacs+paredit. It would benefit from knowing that #SYMBOL{EXPR...}
is a subexpression without needing to special-case certain values of
SYMBOL. Sure, it might well special-case some things that it has special
behaviour for, but it can ignore any unknown type tags and fall back on
a simpler "I can just find the delimiters" mode.

2. In some application that's processing this data with some kind of
higher-level semantic knowledge. In this case, when it encounters an
unknown object, it needs to decide whether to skip it or to raise an
error, and if it skips, HOW to skip. For instance, HTML renderers that
don't understand an element will usually interpret its contents as if
the element wasn't there but skip its attributes. S-expressions lack
that distinction, so do we recurse into any sub-expressions of the
skipped expression in our processing, or just skip the whole
sub-expression? File formats like PNG assign a header bit to indicate
chunks that can't be skipped, and have no notion of nesting, so there
are no recursion issues.
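
The "I can just find the delimiters" fallback from point 1 could look
roughly like the Python sketch below. It is only an illustration under
some assumptions: the function name skip_tagged is invented, only ( )
and { } are treated as delimiters, strings use double quotes with
backslash escapes, and character literals such as #\{ are not handled.

```python
OPEN_TO_CLOSE = {"(": ")", "{": "}"}


def skip_tagged(text, start):
    """Return the index just past the #SYMBOL{...} expression at text[start].

    Only delimiters and string literals are inspected, so the caller
    does not need to know what SYMBOL means.
    """
    if text[start] != "#":
        raise ValueError("expected '#' at start of tagged expression")
    i = start + 1
    while i < len(text) and text[i] != "{":
        i += 1                      # step over the type symbol
    if i == len(text):
        raise ValueError("no opening brace found")
    stack = []                      # closing delimiters still expected
    in_string = False
    while i < len(text):
        ch = text[i]
        if in_string:
            if ch == "\\":
                i += 1              # skip the escaped character
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in OPEN_TO_CLOSE:
            stack.append(OPEN_TO_CLOSE[ch])
        elif ch in OPEN_TO_CLOSE.values():
            if not stack or stack.pop() != ch:
                raise ValueError("mismatched delimiter")
            if not stack:
                return i + 1        # the outermost brace just closed
        i += 1
    raise ValueError("unterminated expression")


text = '(a #point{1 (2 "}") 3} b)'
start = text.index("#")
end = skip_tagged(text, start)
```

Here text[start:end] is the whole #point{...} expression, found without
any knowledge of what "point" means. Note that the brace inside the
string literal does not end the expression.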

We can approach this in one of two ways:

1. Push it to higher levels; the notion of what it means to "skip
unknown things" is somewhat domain-dependent, so a general mechanism
built into the data representation isn't necessarily universal.

2. Put safe-to-skip-subexpression markers and safe-to-recurse-through
markers (the latter with some indication of which subexpressions of the
expression to recurse into versus skip) into the syntax...
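
As a rough sketch of option 2, assume a purely hypothetical marker
convention that nobody in this thread has proposed: a tag symbol
starting with "?" is safe to skip entirely, one starting with "*" is
safe to recurse through, and any other unknown tag must be understood.
The Tagged class, the KNOWN set and collect_points are likewise invented
for the example. For simplicity the sketch recurses into all
subexpressions of a "*" tag rather than a chosen subset.

```python
from dataclasses import dataclass, field


@dataclass
class Tagged:
    """A parsed #SYMBOL{...} expression (hypothetical in-memory form)."""
    symbol: str
    children: list = field(default_factory=list)


KNOWN = {"point"}  # tags this particular application understands


def _walk(node, out):
    if isinstance(node, list):
        for child in node:
            _walk(child, out)
    elif isinstance(node, Tagged):
        if node.symbol in KNOWN:
            out.append(tuple(node.children))
        elif node.symbol.startswith("?"):
            pass                        # safe to skip: ignore whole subtree
        elif node.symbol.startswith("*"):
            _walk(node.children, out)   # safe to recurse: look inside
        else:
            raise ValueError(f"must-understand tag not known: {node.symbol}")
    # plain atoms (numbers, strings, ...) carry no points and are ignored


def collect_points(document):
    """Gather the contents of every reachable #point{...} expression."""
    out = []
    _walk(document, out)
    return out


document = [
    Tagged("point", [1, 2]),
    Tagged("?note", [Tagged("point", [9, 9])]),   # skipped entirely
    Tagged("*group", [Tagged("point", [3, 4])]),  # recursed through
]
points = collect_points(document)
```

The point nested under ?note is never seen, while the one under *group
is. An unmarked unknown tag raises an error instead of being guessed at.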

> In practice, people will
> use these to refer to concepts that other programmers don't get and will
> misunderstand or misuse.  Worse, one group may use #hash and another #dict,
> and who will know that they are the same concept?

This is a human, not technological, problem :-)

> However, I think JSON has shown that dictionaries are important enough that
> they should be first-class.  I currently recommend {key value ...}.
>
> Line comments should be provided: they are one of the things often asked
> for in JSON.

Yeah, probably.

ABS

--
Alaric Snell-Pym   (M7KIT)
http://www.snell-pym.org.uk/alaric/