> have problems that do not occur with comment parsing (e.g. when using
> non-ASCII-superset multi-byte encodings such as UTF-16 or Shift-JIS)?
I had that wrong -- from skimming Wikipedia, Shift_JIS, EUC-* and Big5
are almost strictly ASCII-compatible in the 0..127 range (almost the
only exception is that the trail byte of a multi-byte character can
coincide with the code of an ASCII character).
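
To make the trail-byte overlap concrete, here is a small Python sketch
(Python is only standing in here; the helper name ascii_trail_bytes is
made up for illustration). It relies on Python's built-in codecs and
lists the non-leading bytes of multi-byte characters that fall below 128:

```python
def ascii_trail_bytes(text, encoding):
    """Return the bytes below 0x80 that occur as a non-leading byte
    of a multi-byte character when `text` is encoded with `encoding`."""
    found = []
    for ch in text:
        encoded = ch.encode(encoding)
        if len(encoded) > 1:
            # Skip the lead byte; keep any later byte in the ASCII range.
            found.extend(b for b in encoded[1:] if b < 0x80)
    return found


if __name__ == "__main__":
    # Same character, two encodings, plus one Big5 sample.
    for encoding, sample in [("shift_jis", "表"), ("euc_jp", "表"), ("big5", "許")]:
        overlaps = [hex(b) for b in ascii_trail_bytes(sample, encoding)]
        print(encoding, sample, overlaps)
```

A byte-oriented reader that treats 0x5C as a backslash would be confused
by such a character inside a string literal, which is the case the
parenthetical above is about.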
Would it be a problem to read the first S-expression of a Shift_JIS
/ EUC-* / Big5 encoded source file as if it were (extended) ASCII?
Then if that form is something like:
(declare-file
  (coding "Shift_JIS")
  ...possibly other declarations here...)
would the file be read again from the start as Shift_JIS?
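
A rough sketch of that two-pass idea, in Python rather than in the
language under discussion. Everything here is an assumption made for
illustration: the names first_form and read_source, the regular
expression for the coding clause, the UTF-8 fallback, and the premise
that the declaration form itself contains only ASCII bytes. Comments
before the first form are not handled.

```python
import codecs
import re

# Hypothetical shape of the coding clause: (coding "NAME")
CODING_RE = re.compile(rb'\(\s*coding\s+"([A-Za-z0-9._-]+)"\s*\)')


def first_form(data: bytes) -> bytes:
    """Return the first parenthesised form, scanning byte by byte as if
    the input were extended ASCII. String literals are tracked so that
    parentheses inside them are ignored."""
    depth = 0
    start = 0
    in_string = False
    escaped = False
    for i, b in enumerate(data):
        c = chr(b)
        if in_string:
            if escaped:
                escaped = False
            elif c == "\\":
                escaped = True
            elif c == '"':
                in_string = False
        elif c == '"':
            in_string = True
        elif c == "(":
            if depth == 0:
                start = i
            depth += 1
        elif c == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                return data[start:i + 1]
    return b""


def read_source(data: bytes, default: str = "utf-8"):
    """Pass 1: find a declared coding in the first form.
    Pass 2: decode the whole input again with that coding."""
    encoding = default
    form = first_form(data)
    if form[1:].lstrip().startswith(b"declare-file"):
        match = CODING_RE.search(form)
        if match:
            encoding = match.group(1).decode("ascii")
    codecs.lookup(encoding)  # raises LookupError for an unknown name
    return encoding, data.decode(encoding)
```

For example, read_source(open(path, "rb").read()) would return the
declared encoding name together with the decoded text. The first pass
is only safe as long as no byte inside the declaration form is
misread, which is where the trail-byte caveat matters.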