Re: Simplifying SRFI 109, part 1: entities John Cowan 30 Mar 2013 23:27 UTC

Per Bothner scripsit:

> The reader *could* return a literal string in cases where there are no
> enclosed expressions, but I feel uncomfortable with that - it seems
> a bit hacky and inconsistent.  For read/write round-tripping we have
> the traditional string literals, so I think it is cleaner to have the
> &{...} always return a ($string$ ...) form.

+1, given that you want to include enclosed expressions.

> Not hard-wiring in entity names is especially important for SRFI 107,
> since the XML/SGML model does allow user-defined entity names.

But only if they are declared in a DTD, either externally or internally.
SRFI 107 makes no provision for doing so, so I consider that argument to
be unsound.

If SRFI 107 is to allow arbitrary entity names, getting their values
from corresponding Scheme variables is a sensible approach.  However,
in that case the SRFI 107 model must be extended to include at least
a mechanism for declaring an external subset as part of a new "XML
document" constructor.  The SRFI 107 reader need not read this subset:
it can be a programmer responsibility to keep it in sync with the
definitions of entity names at the Scheme level.  Alternatively, only
the standard five entities should be allowed.
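
To make the two alternatives concrete, here is a minimal sketch in Python
(chosen only for illustration; it is not Scheme and not part of either
SRFI).  The names STANDARD_ENTITIES, resolve_entity, bindings, and
allow_user_defined are all hypothetical.  It assumes the five predefined
XML entities are hard-wired and that any other name is looked up, if at
all, in a table standing in for Scheme variable bindings.

```python
# Hypothetical sketch: none of these names come from SRFI 107 or SRFI 109.

# The five entities predefined by XML 1.0.
STANDARD_ENTITIES = {
    "lt": "<",
    "gt": ">",
    "amp": "&",
    "quot": '"',
    "apos": "'",
}


def resolve_entity(name, bindings, allow_user_defined=True):
    """Return the replacement text for the entity reference &name;.

    `bindings` stands in for the Scheme-level variable environment.
    With allow_user_defined=False only the five standard entities resolve.
    """
    if name in STANDARD_ENTITIES:
        # Standard names are wired in and cannot be shadowed.
        return STANDARD_ENTITIES[name]
    if allow_user_defined and name in bindings:
        return bindings[name]
    raise KeyError(f"undeclared entity: &{name};")
```

With allow_user_defined=False this models the "standard five only" option;
with it set to True, keeping `bindings` consistent with any declared
external subset is left to the programmer, as described above.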

I have added the SRFI 107 mailing list as a recipient.

> Having these be hard-wired into the reader is not IMO in the spirit of
> XML.  Even if the reader uses a user-programmable table it would be
> information-losing for the reader to expand the entity names.

Having the five standard XML names wired into a SRFI 107 processor is
definitely in the spirit of XML.

> Even then using a programmable read-time lookup table is clearly
> less "Schemey" than using regular expand-time name-lookup.

I agree that programmability at read time is a Bad Thing because of the
phasing issues (the same as expand-time vs. read-time phasing).

> I should probably also state that an implementation MUST support the
> standard Scheme character names.

You don't say which list of standard names you are referring to.
R5RS mandates only "space" and "newline"; R6RS mandates "nul", "alarm",
"backspace", "tab", "linefeed", "newline", "vtab", "page", "return",
"esc", "space", and "delete"; R7RS-small mandates "null", "alarm",
"backspace", "tab", "newline", "return", "escape", "space", and "delete".

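The overlap between those three lists can be checked mechanically.  The
following is a small Python sketch (illustrative only; the variable names
are mine) that encodes the lists exactly as given above and computes which
names are common to all three reports and which are shared by R6RS and
R7RS-small.

```python
# Character names mandated by each report, as listed above.
R5RS = {"space", "newline"}
R6RS = {
    "nul", "alarm", "backspace", "tab", "linefeed", "newline",
    "vtab", "page", "return", "esc", "space", "delete",
}
R7RS_SMALL = {
    "null", "alarm", "backspace", "tab", "newline",
    "return", "escape", "space", "delete",
}

# Names every one of the three reports requires.
common_to_all = R5RS & R6RS & R7RS_SMALL

# Names required by both R6RS and R7RS-small.
common_r6_r7 = R6RS & R7RS_SMALL

# Names required by only one of the two later reports.
only_r6 = R6RS - R7RS_SMALL
only_r7 = R7RS_SMALL - R6RS

print(sorted(common_to_all))
print(sorted(common_r6_r7))
print(sorted(only_r6))
print(sorted(only_r7))
```

Note that "nul"/"null" and "esc"/"escape" differ between R6RS and
R7RS-small, so neither spelling appears in the shared set.
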
--
I could dance with you till the cows            John Cowan
come home.  On second thought, I'd              http://www.ccil.org/~cowan
rather dance with the cows when you             xxxxxx@ccil.org
come home.  --Rufus T. Firefly