One issue with S-expression metadata is that an S-expression reader is more involved than a simple finite automaton.

If you look for a special character sequence without fully parsing the S-expression, you can just as well use a magic comment or a #!identifier token.

The #!identifier token, as Marc brought up, is nice because we already have it. The drawbacks I see are:

- We probably want something like #!coding=<value> rather than just #!<identifier>, e.g. #!coding=utf-8 instead of #!utf-8. The latter can eat up the #! namespace quickly.

- Recognizing coding[=:]<value> (without #!) can work with editors. In Emacs, adding -*- coding: <coding> -*- switches the buffer encoding automatically. Having "#!" would lose that.
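
To make the coding[=:]<value> idea concrete, here is a rough sketch in Python (used only for brevity, not a proposal for any API). The pattern has the same shape as the one PEP 263 uses for Python source files. The name find_coding and the two-line limit are my own assumptions for illustration.

```python
import re

# "coding", then ':' or '=', optional blanks, then the encoding name.
# With a bytes pattern, \w matches ASCII letters, digits and '_' only.
CODING_RE = re.compile(rb"coding[:=][ \t]*([-\w.]+)")

def find_coding(head: bytes, max_lines: int = 2):
    """Return the declared encoding name, or None if none is found.

    Only the first `max_lines` lines are searched (an assumed limit,
    borrowed from the PEP 263 convention).
    """
    for line in head.split(b"\n")[:max_lines]:
        m = CODING_RE.search(line)
        if m:
            return m.group(1).decode("ascii")
    return None
```

The same scan accepts both a magic comment such as ";; -*- coding: utf-8 -*-" and a "#!coding=utf-8" token, since it never looks at what precedes "coding".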

But I agree that it's becoming moot as UTF-8 dominates. The only concern is that, since R7RS-small doesn't require full Unicode support, an R7RS implementation that supports only ASCII needs a way to gracefully reject source code that uses, say, a Greek lambda.
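
As a minimal sketch of what "reject gracefully" could look like (again in Python, with a made-up function name), an ASCII-only implementation could scan the raw bytes before reading and report where the first non-ASCII byte sits, rather than failing somewhere inside the reader:

```python
def check_ascii_source(data: bytes):
    """Return None if `data` is pure ASCII.

    Otherwise return (line, column, byte) for the first byte above 0x7F.
    Lines are 1-based; columns are 0-based byte offsets within the line.
    """
    line, col = 1, 0
    for b in data:
        if b > 0x7F:
            return (line, col, b)
        if b == 0x0A:  # newline: move to the start of the next line
            line, col = line + 1, 0
        else:
            col += 1
    return None
```

An implementation could turn the returned position into an error message like "non-ASCII byte at line 2, column 3", which is all a user needs to find the offending lambda.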

On Sun, May 12, 2019 at 3:23 AM Lassi Kortela <xxxxxx@lassi.io> wrote:

> The 'read' procedure that looks for the encoding declaration should be a
> special reader that's much simpler than the normal Scheme reader and
> handles encoding errors gracefully [...]

Here's a quick sketch of the idea implemented in 100 lines of R7RS
Scheme: <https://github.com/lassik/scheme-encoding-declaration>

It reads up to 1000 bytes from the file into a bytevector and reads the
first Scheme form from that bytevector, interpreting the bytes as an
ASCII superset. If there are any errors, it simply returns #f.