Re: Simplifying SRFI 109, part 1: entities Per Bothner 31 Mar 2013 02:18 UTC

On 03/30/2013 04:27 PM, John Cowan wrote:
> Per Bothner scripsit:
>> Not hard-wiring in entity names is especially important for SRFI-107,
>> since the XML/SGML model does allow user-defined entity names.
> But only if they are declared in a DTD, either externally or internally.
> SRFI 107 makes no provision for doing so, so I consider that argument to
> be unsound.

DTDs and specifically entity declarations are tied to the specific
syntactic form of XML.  SRFI 107 supports that at the "local syntax"
level, but not the "file syntax" level.  It is more important to
preserve the XML conceptual "information model", and in that sense
an entity reference corresponds to a variable reference, and an
entity definition corresponds to a variable definition.
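
To make that correspondence concrete, here is a minimal sketch in Python (not Scheme, and not how SRFI 107 or any implementation does it). The helper name expand_entities, the env dictionary, and the "product" binding are all hypothetical, chosen only for illustration. It treats each &name; reference as a lookup in an environment of bindings, with the five predefined XML entities as ordinary entries.

```python
import re

# Hypothetical pattern for an entity reference of the form &name;
ENTITY_REF = re.compile(r"&([A-Za-z_][A-Za-z0-9_.-]*);")

# The five entities predefined by XML, modelled as ordinary bindings.
PREDEFINED = {"lt": "<", "gt": ">", "amp": "&", "quot": '"', "apos": "'"}

def expand_entities(text, env):
    """Replace each &name; by the value bound to name in env.

    An unbound name raises NameError, just as an unbound variable
    reference would. Replacement text is not rescanned.
    """
    def lookup(match):
        name = match.group(1)
        if name not in env:
            raise NameError(f"unbound entity name: {name}")
        return str(env[name])
    return ENTITY_REF.sub(lookup, text)

env = dict(PREDEFINED)
env["product"] = "Kawa"  # an "entity definition" is just a new binding
result = expand_entities("&product; &lt;3 Scheme &amp; XML", env)
print(result)
```

Nothing here needs a DTD: adding a user-defined entity name is the same operation as adding a variable binding.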

Also, some XML APIs preserve an EntityRef as an unexpanded Node.
For example, Scala:
I wouldn't want to preclude that kind of API.

> If SRFI 107 is to allow arbitrary entity names, getting their values
> from corresponding Scheme variables is a sensible approach.  However,
> in that case the SRFI 107 model must be extended to include at least
> a mechanism for declaring an external subset as part of a new "XML
> document" constructor.

SRFI 107 as currently written does not support the concept of an
XML document - whether we mean:
(1) XML document as a file format.
(2) DOM Document as a data type for representing the "significant"
SRFI-107 doesn't directly support either.  I think APIs supporting
both are desirable - and SRFI-107 should hopefully work well with
such APIs.  However, this process has dragged on long enough, and
working with documents seems like new functionality that I think
should be saved for future work.

> The SRFI 107 reader need not read this subset:
> it can be a programmer responsibility to keep it in sync with the
> definitions of entity names at the Scheme level.  Alternatively, only
> the standard five entities should be allowed.

For SRFI-108/-109 I think we should also have entities for { and } -
obviously they're not needed for SRFI-107.

>> I should probably also state that an implementation MUST support the
>> standard Scheme character names.
> You don't say which list of standard names you are referring to.
> R5RS mandates only "space" and "newline"; R6RS mandates "nul", "alarm",
> "backspace", "tab", "linefeed", "newline", "vtab", "page", "return",
> "esc", "space", and "delete"; R7RS-small mandates "null", "alarm",
> "backspace", "tab", "newline", "return", "escape", "space", and "delete".

Indeed, that is rather vague - and raises these questions:
(1) What are "standard Scheme character names"? I suggest going with the
R7RS names.
(2) When it comes to an implementation supporting "standard Scheme
character names", is this a "must" or a "should"?  I could go either way.
(3) Do we want a different answer for SRFI-107 and SRFI-108/-109?
I'd prefer not.
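
For reference, a small Python sketch comparing the name lists quoted above. The variable names are mine, and the sets are copied from John's message rather than checked against the reports themselves. It shows what choosing the R7RS names would keep and drop relative to R6RS.

```python
# Character names as listed in the quoted message (not re-verified
# against the reports themselves).
R5RS_NAMES = {"space", "newline"}
R6RS_NAMES = {"nul", "alarm", "backspace", "tab", "linefeed", "newline",
              "vtab", "page", "return", "esc", "space", "delete"}
R7RS_NAMES = {"null", "alarm", "backspace", "tab", "newline", "return",
              "escape", "space", "delete"}

common = sorted(R6RS_NAMES & R7RS_NAMES)     # names both reports mandate
only_r6rs = sorted(R6RS_NAMES - R7RS_NAMES)  # lost by choosing R7RS
only_r7rs = sorted(R7RS_NAMES - R6RS_NAMES)  # spelled differently in R7RS

print("common:   ", common)
print("R6RS only:", only_r6rs)
print("R7RS only:", only_r7rs)
```

By these lists, seven names are shared, R6RS alone has "esc", "linefeed", "nul", "page" and "vtab", and R7RS alone has "escape" and "null".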
	--Per Bothner
xxxxxx@bothner.com   http://per.bothner.com/