One issue with S-expression metadata is that an S-expression reader is more involved than a simple finite automaton.
If you only look for a special character sequence without fully parsing the S-expression, you can use a magic comment
or an #!identifier token just as well.

The #!identifier token, as Marc brought up, is nice because we already have it.  The drawbacks I see are:

- We probably want something like #!coding=<value> rather than just #!<identifier>, e.g. #!coding=utf-8 instead of #!utf-8.
  The latter can eat up the #! namespace quickly.
- Recognizing coding[=:]<value> (without #!) lets the declaration work with the editor too.  In Emacs, adding
  -*- coding: <coding> -*- makes it pick the buffer encoding automatically.  Requiring the "#!" prefix would lose that.
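
To make the coding[=:]<value> idea concrete, here is a minimal sketch in Python of scanning raw bytes for such a declaration without parsing any S-expressions. The name find_coding, the exact pattern, and the choice to look only at the first two lines are my assumptions for illustration, not part of any proposal in this thread.

```python
import re

# Hypothetical pattern: "coding", then "=" or ":", optional blanks, then a name.
# In a bytes pattern, \w matches only ASCII letters, digits and underscore.
CODING_RE = re.compile(rb"coding[=:][ \t]*([-\w.]+)")

def find_coding(prefix: bytes):
    """Return the declared encoding name as a str, or None if none is found."""
    # Assumption: the declaration must appear on one of the first two lines.
    for line in prefix.split(b"\n")[:2]:
        match = CODING_RE.search(line)
        if match:
            return match.group(1).decode("ascii")
    return None
```

The same pattern matches both an Emacs-style cookie such as ";; -*- coding: utf-8 -*-" and a "#!coding=utf-8" token, which is the point of not tying the syntax to "#!".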

But I agree that this is becoming moot as UTF-8 dominates.  The only concern is that, since R7RS-small doesn't require full Unicode
support, an R7RS implementation that only supports ASCII needs a way to gracefully reject source code that uses, say, a Greek lambda.
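
As an illustration of what "reject gracefully" could mean, here is a small Python sketch of a check an ASCII-only implementation might run before handing the source to its reader. The function name check_ascii_source and the message format are hypothetical; the only point is to report the first offending byte instead of failing somewhere deep in the reader.

```python
def check_ascii_source(data: bytes):
    """Return None if data is pure ASCII, else a message locating the problem."""
    for offset, byte in enumerate(data):
        if byte > 0x7F:
            # Count the newlines before the offending byte to get a line number.
            line = data.count(b"\n", 0, offset) + 1
            return f"non-ASCII byte 0x{byte:02x} at line {line}, offset {offset}"
    return None
```

A source file containing a Greek lambda encoded as UTF-8 would be reported at its first byte above 0x7F, while a pure ASCII file passes with None.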




On Sun, May 12, 2019 at 3:23 AM Lassi Kortela <xxxxxx@lassi.io> wrote:
> The 'read' procedure that looks for the encoding declaration should be a
> special reader that's much simpler than the normal Scheme reader and
> handles encoding errors gracefully [...]

Here's a quick sketch of the idea implemented in 100 lines of R7RS
Scheme: <https://github.com/lassik/scheme-encoding-declaration>

It reads up to 1000 bytes from the file into a bytevector and reads the
first Scheme form from that bytevector, interpreting the bytes as an
ASCII superset. If there are any errors, it simply returns #f.
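
The approach described above can be sketched roughly as follows, in Python rather than Scheme. This is not the code from the linked repository: the declaration form (coding <name>), the function names, and the toy reader (nested lists of symbols only, no comments or strings) are all assumptions made for illustration. Latin-1 stands in for "an ASCII superset" because it maps every byte to a character, so decoding cannot fail.

```python
import io

MAX_PREFIX = 1000  # byte limit taken from the description above

def read_first_form(text):
    """Parse the first form in text. Raises ValueError on malformed input."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while True:
                if pos >= len(tokens):
                    raise ValueError("unterminated list")
                if tokens[pos] == ")":
                    pos += 1
                    return items
                items.append(parse())
        if tok == ")":
            raise ValueError("unexpected )")
        return tok  # a bare symbol

    return parse()

def declared_encoding(stream):
    """Return the name in a leading (coding <name>) form, or None."""
    try:
        prefix = stream.read(MAX_PREFIX)
        form = read_first_form(prefix.decode("latin-1"))
    except (OSError, ValueError):
        return None  # any error simply means "no declaration"
    # Hypothetical declaration shape: a two-element list headed by "coding".
    if (isinstance(form, list) and len(form) == 2
            and form[0] == "coding" and isinstance(form[1], str)):
        return form[1]
    return None
```

For example, declared_encoding(io.BytesIO(b"(coding utf-8)\n(display 1)\n")) yields "utf-8", while a truncated or unrelated first form yields None.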