Re: Simplifying SRFI 109, part 1: entities

Per Bothner 31 Mar 2013 07:12 UTC
On 03/30/2013 10:57 PM, John Cowan wrote:
> Per Bothner scripsit:
>
>> It is more important to preserve the XML conceptual "information
>> model",
>
> Absolutely, but SRFI 107 models only a part of the XML Infoset
> <http://www.w3.org/TR/xml-infoset>, of which (ahem) I was the principal
> editor.

Yes, I actually consulted that today and noticed the "coincidence"
of the name.  Glad to confirm my assumption they're one and the same.
(I also know a John Cowan socially, but he doesn't appear to be you ...)

>> SRFI 107 as currently written does not support the concept of an
>> XML document - whether we mean:  (1) XML document as a file format.
>> (2) DOM Document as a data type for representing the "significant"
>> information
>
> It's the second concept I mean.
>
>> SRFI-107 doesn't directly support either.  I think APIs supporting
>> both are desirable - and SRFI-107 should hopefully work well with such
>> APIs.  However, this process has dragged on long enough, and working
>> with documents seems like new functionality that I think should be
>> saved for future work.
>
> In that case, perhaps this SRFI should be renamed "XML element reader
> syntax".

That name would be too narrow: this SRFI also has reader syntax
for PI (processing-instruction) nodes, comment nodes, and CDATA
nodes.

There is no reader syntax for (top-level) attribute nodes, though
you can write them with a $xml-attribute$ constructor.

My assumption is that a Scheme API for XML would have standard
functions for creating DOM nodes, and that the SRFI-107
syntax would essentially be syntactic sugar for calls to them.
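
To make the "syntactic sugar" view concrete, here is a minimal sketch
(in Python, purely as illustration) that turns XML element text into
the kind of constructor expression a reader might produce.  Only
$xml-attribute$ is named in this message; the $xml-element$ name, the
helper names, and the overall shape of the output are assumptions for
illustration, not the translation the SRFI specifies.  Namespaces are
ignored.

```python
import xml.etree.ElementTree as ET


def scheme_string(text):
    """Quote text as a Scheme string literal (escapes backslash and double quote only)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def desugar(elem):
    """Render a parsed element as a nested constructor-call expression."""
    # $xml-element$ is a hypothetical constructor name used for illustration.
    parts = ["$xml-element$", elem.tag]
    for name, value in elem.attrib.items():
        parts.append("($xml-attribute$ %s %s)" % (name, scheme_string(value)))
    if elem.text:
        parts.append(scheme_string(elem.text))
    for child in elem:
        parts.append(desugar(child))
        if child.tail:
            # Text that follows a child element belongs to the parent's content.
            parts.append(scheme_string(child.tail))
    return "(" + " ".join(parts) + ")"


def desugar_literal(source):
    """Translate the XML text written after '#' into constructor form."""
    return desugar(ET.fromstring(source))
```

For example, desugar_literal('<a href="x">hi</a>') yields
($xml-element$ a ($xml-attribute$ href "x") "hi"): the literal and the
constructor expression denote the same node.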

Some other languages also provide similar XML "literals".
My goal is not to embed XML documents inside a program,
but to provide a more familiar syntax equivalent to node
constructor expressions.  XQuery has functionality equivalent
to this proposal: XML-style syntax for elements, PIs, and
comments.  For attributes and documents you need to use
"computed constructors".  "ECMAScript for XML" (E4X) is similar.
Visual Basic does support "XML Document literals", and
I guess we could add support for them.  Perhaps we can allow:

   #<?xml optional-stuff?><root>...</root>

and/or:

   #<!DOCTYPE root optional-stuff><root>...</root>

And of course we can use SRFI-108 in some way.
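
As a rough sketch of how a reader could tell these forms apart, the
Python function below classifies the text that follows "#<" by its
leading characters.  The function name and the dispatch rules are
assumptions for illustration; the two "document" cases correspond to
the forms floated above, which are only a possibility, not part of the
current draft.

```python
def classify_literal(text):
    """Classify the text following '#<' by its leading characters."""
    if text.startswith("!--"):
        return "comment"
    if text.startswith("![CDATA["):
        return "cdata"
    if text.startswith("!DOCTYPE"):
        return "document"  # proposed form only
    if text.startswith("?"):
        # '<?xml ...?>' is the XML declaration; any other target is an ordinary PI.
        words = text[1:].split("?", 1)[0].split(None, 1)
        if words and words[0] == "xml":
            return "document"  # proposed form only
        return "processing-instruction"
    return "element"
```

A reader built along these lines would presumably hand the "document"
cases to a document constructor and everything else to the existing
node constructors.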

However, doing this "right" adds a fair bit of extra
work and complexity.  It also pushes the limits of my
expertise.  The obvious reader translation is to an
$xml-document$ constructor.  Beyond that things are less
obvious.  Because the infoset for an XML document is rather
complex, it seems cleaner to use keywords - but alas keywords
are far from standard: We get into the question of
whether to use SRFI-88-style keywords or plain symbols.
I like SRFI-88-style keywords of course (Kawa has them),
but obviously this limits portability.  (Though if we
add them to R7RS-large I'd have less reluctance ...)

If you think the name "XML reader syntax" is misleading,
we could change the name to "Basic XML reader syntax" by
analogy with SRFI-28.

>> Indeed, that is rather vague - and raises these questions:  (1) What are
>> "standard Scheme character names"? I suggest going with the R7RS names.
>
> I'm happy with that, of course.
>
>> (2) When it comes to an implementation supporting "standard Scheme
>> character names", is this a "must" or a "should"?  I could go either
>> way.  (3) Do we want a different answer for SRFI-107 and SRFI-108/-109?
>> I'd prefer not.
>
> I'd prefer a SHOULD for all three SRFIs.

Updated in: http://per.bothner.com/tmp/srfi-109/srfi-109.html
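
For reference, a small Python sketch of what supporting the R7RS
character names as entity names could look like.  The table lists the
nine names R7RS-small defines for characters; the &name; form, the
function name, and the choice to leave unknown names untouched are
assumptions for illustration, not what the SRFI text requires.

```python
import re

# Character names defined by R7RS-small, with their code points.
R7RS_CHAR_NAMES = {
    "alarm": 0x07,
    "backspace": 0x08,
    "delete": 0x7F,
    "escape": 0x1B,
    "newline": 0x0A,
    "null": 0x00,
    "return": 0x0D,
    "space": 0x20,
    "tab": 0x09,
}

# Matches '&name;' where name starts with a letter.
ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9-]*);")


def expand_entities(text, names=R7RS_CHAR_NAMES):
    """Replace each '&name;' whose name is known; leave other entities untouched."""
    def replace(match):
        code = names.get(match.group(1))
        # Compare against None explicitly: code point 0 ('null') is a valid hit.
        return match.group(0) if code is None else chr(code)
    return ENTITY.sub(replace, text)
```

With a SHOULD rather than a MUST, an implementation that lacks this
table would still conform; it would simply not recognise these names.
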
--
	--Per Bothner
xxxxxx@bothner.com   http://per.bothner.com/