Re: SRFI 220: Line directives

Re: SRFI 220: Line directives Lassi Kortela 10 Feb 2021 08:46 UTC

Thank you for the astute comments!

>     (1) What is the reader supposed to do with the parsed line
>     directives? Can you explain this in the context of the `read` procedure?

The line directive

#! foo bar baz

is supposed to be read in as if implicitly wrapped in a quoted list:

'(foo bar baz)

The line directives are not evaluated like ordinary Scheme code (just
like existing directives such as #!r6rs are not so evaluated). The SRFI
doesn't specify what Scheme implementations should do with them because
I couldn't think of a good general-purpose solution and it's not really
crucial: the main benefit of the syntax is that external,
language-agnostic tools such as Emacs, Vim, Unix kernels (#! line),
license checkers, etc. can read the syntax while we retain the
possibility of parsing it in the manner of ordinary Scheme syntax.
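
To illustrate the external-tool side, here is a minimal Python sketch of how a language-agnostic tool might pick out one-line directives by plain text matching. The names `DIRECTIVE` and `scan_directives` are hypothetical and not part of the SRFI; whitespace-split tokens merely stand in for the symbols of the quoted list, and multi-line datums are not handled.

```python
import re

# Hypothetical helper, not part of SRFI 220: match lines that begin with "#!".
# Optional whitespace after the marker is skipped, so "#!r6rs" and "#! r6rs"
# yield the same tokens.
DIRECTIVE = re.compile(r"^#!\s*(.*?)\s*$")

def scan_directives(text):
    """Return one token list per line that starts with '#!'."""
    found = []
    for line in text.splitlines():
        match = DIRECTIVE.match(line)
        if match:
            # Whitespace-split tokens stand in for the symbols of '(foo bar baz).
            found.append(match.group(1).split())
    return found

sample = "#! foo bar baz\n(display 1)\n#!r6rs\n"
print(scan_directives(sample))
```

A real Scheme reader would use its own datum parser instead of splitting on whitespace; the point is only that the same line stays readable to tools that know nothing about Scheme.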

If line directives are implemented, the most likely things that Scheme
implementations would do with them in the near future are:

(1) Detect character encoding from something like "#! coding: euc-jp".

(2) Offer a hook to let users attach custom parser procedures.

But since the set of directives is arbitrarily extensible (just coin new
identifiers), it's likely that more uses would be discovered over the
years. Here are all the uses I know of that Scheme implementations have
for directives so far: <https://registry.scheme.org/#hash-bang-syntax>.
Quite a few, and our existing directives can't even take "arguments".
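
As a rough sketch of uses (1) and (2) above, assuming a directive has already been split into tokens, a hook mechanism could be as small as a table from the first identifier to a handler. Everything here (`handlers`, `register`, `dispatch`, `set_coding`, the `state` dictionary) is made up for illustration; the SRFI does not specify any such API.

```python
# Hypothetical registry: maps a directive's first token to a handler procedure.
handlers = {}

def register(name, handler):
    """Attach a custom handler to directives whose first token is `name`."""
    handlers[name] = handler

def dispatch(tokens, state):
    """Run the handler for tokens[0] if one exists; ignore unknown directives."""
    if tokens and tokens[0] in handlers:
        handlers[tokens[0]](tokens[1:], state)
    return state

def set_coding(args, state):
    # "#! coding: euc-jp" arrives here as args == ["euc-jp"].
    if args:
        state["encoding"] = args[0]

register("coding:", set_coding)

state = dispatch(["coding:", "euc-jp"], {"encoding": "utf-8"})
print(state["encoding"])
```

Unknown directives fall through untouched, which matches the idea that new identifiers can be coined freely without breaking existing readers.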

One could argue there is a preference to use directives only for things
that affect the settings of the Scheme reader, but as the registry
shows, that rule is often broken, including by Scheme standards (R2RS
and DSSSL) and prominent implementations such as Chez and Gambit. That,
and the fact that Scheme's directive marker "#!" matches the Unix
interpreter marker "#! /usr/bin/env foo", makes it attractive to use
"#!" as a general-purpose metadata marker.

>     (2) IMO, it is a design error that the datums following `#!` mustn't
>     contain newlines. Don't change the datum system of single datums.
>     This is important for consistency, for simplicity of reader
>     implementations, and for compatibility with various pretty printers
>     that may insert line endings for readability. We don't have a
>     problem with line comments because line comments do not contain
>     datums. For anything containing datums, which are newline agnostic
>     (in contrast to Python syntax), choosing the newline as a delimiter
>     is a bad idea.

Indeed, I was dithering over whether newlines should be
allowed inside the datums. I agree with you that it's the right thing to
allow newlines, and thereby let implementations re-use Scheme's `(read)`
as it is. However, it seems the draft ended up internally inconsistent.
Sorry about that.

The thing with newlines is that most or all of the magic comments from
other tools tend to fit on one line:

#! /usr/bin/env gosh

#! vim: tw=60 ts=2 expandtab fileencoding=euc-jp :

#! SPDX-License-Identifier: GPL-3.0-or-later

If our more principled syntax for these "magic comments" permits
multi-line data, unlike the other languages, that can be confusing.

I now think the best compromise is to read datums using Scheme `(read)`,
but only as long as the line number does not change. The trick is that
since `(read)` always reads a complete datum, and Scheme datums can
contain embedded newlines, it's possible to write things like multi-line
lists to express complex things. That's potentially very useful, and
since it makes no sense to have partial lists cut off at the middle
without matching closing parentheses, people are unlikely to be confused
about the meaning of a multi-line "#!" directive. (Even though
"multi-line line directive" sounds amusing; is there a better name?)

So things like this would be legal:

#! (foo
      (bar 123)
      (baz (2 3
              4)))

and even this:

#! foo bar (baz
              (... (...)))  ; tail comment

and even crazier:

#! foo #| block comment starts at this line
           and keeps going
           for another two lines. ends here |# (final
                                                list) ; and tail comment
(but this line is no longer part of the directive)

Even with all these arcane corner cases, at least to me the rule seems
easy enough to follow: while on the same line as the original "#!", keep
reading another datum.
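
The rule can be sketched in Python as a toy reader, under the assumption that "same line" means no newline has been seen between tokens: newlines inside a list or a block comment do not end the directive, but a newline (or a `;` comment) after a complete datum does. The functions `read_directive`, `read_datum` and `skip_space` are hypothetical, and only atoms, parenthesized lists, `;` comments at the top level and non-nested `#| |#` comments are handled; strings and comments inside lists are left out.

```python
def skip_space(text, pos):
    """Skip spaces, tabs and newlines (used inside lists only)."""
    while pos < len(text) and text[pos] in " \t\n":
        pos += 1
    return pos

def read_datum(text, pos):
    """Read one atom or list; newlines inside a list are plain whitespace."""
    if text[pos] == "(":
        items = []
        pos += 1
        while True:
            pos = skip_space(text, pos)
            if pos >= len(text):
                raise ValueError("unterminated list")
            if text[pos] == ")":
                return items, pos + 1
            item, pos = read_datum(text, pos)
            items.append(item)
    end = pos
    while end < len(text) and text[end] not in " \t\n();":
        end += 1
    if end == pos:
        raise ValueError("unexpected %r" % text[pos])
    return text[pos:end], end

def read_directive(text, start=0):
    """Read datums after '#!' until a newline occurs between tokens.

    Returns (datums, index just past the directive).
    """
    assert text.startswith("#!", start)
    pos = start + 2
    datums = []
    while pos < len(text):
        ch = text[pos]
        if ch in " \t":
            pos += 1
        elif ch == "\n":
            pos += 1
            break  # a newline between tokens ends the directive
        elif ch == ";":
            end = text.find("\n", pos)
            pos = len(text) if end < 0 else end + 1
            break  # a line comment runs up to that same newline
        elif text.startswith("#|", pos):
            end = text.find("|#", pos + 2)
            pos = len(text) if end < 0 else end + 2  # newlines inside are skipped
        else:
            datum, pos = read_datum(text, pos)
            datums.append(datum)
    return datums, pos

source = "#! foo #| spans\n two lines |# (final\n  list) ; tail comment\n(not part of it)\n"
datums, end = read_directive(source)
print(datums)
print(repr(source[end:]))
```

With the sample input, the directive yields `foo` and the list `(final list)`, and the last line is left for the ordinary reader.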

> (3) Why do you special-case `#!r6rs`/`#! r6rs`? What about
> `#!fold-case`, etc.?

That's more bad wording on my part. #!r6rs is not meant to be
special-cased, but is merely used to illustrate what is done to all
existing directives: #!r6rs, #!fold-case, #!chezscheme, #!bwp, etc.

We should think about whether "#!foo" and "#! foo" should be required
to mean the same thing. That probably makes sense; it seems confusing if
they mean something different.