Re: Simplifying SRFI 109, part 1: entities Per Bothner 31 Mar 2013 07:12 UTC
On 03/30/2013 10:57 PM, John Cowan wrote:
> Per Bothner scripsit:
>
>> It is more important to preserve the XML conceptual "information
>> model",
>
> Absolutely, but SRFI 107 models only a part of the XML Infoset
> <http://www.w3.org/TR/xml-infoset>, of which (ahem) I was the principal
> editor.

Yes, I actually consulted that today and noticed the "coincidence"
of the name. Glad to confirm my assumption that they're one and the same.
(I also know a John Cowan socially, but he doesn't appear to be you ...)

>> SRFI 107 as currently written does not support the concept of an
>> XML document - whether we mean: (1) XML document as a file format.
>> (2) DOM Document as a data type for representing the "significant"
>> information
>
> It's the second concept I mean.
>
>> SRFI-107 doesn't directly support either. I think APIs supporting
>> both are desirable - and SRFI-107 should hopefully work well with such
>> APIs. However, this process has dragged on long enough, and working
>> with documents seems like new functionality that I think should be
>> saved for future work.
>
> In that case, perhaps this SRFI should be renamed "XML element reader
> syntax".

First, this SRFI also has a reader syntax for PI nodes, comment nodes,
and CDATA nodes. There is no support for (top-level) attribute nodes,
though you can write them with a $xml-attribute$ constructor.

My assumption is that a Scheme API for XML would have standard function
calls for creating DOM Nodes, and the SRFI-107 syntax would be basically
syntactic sugar. Some other languages also provide similar XML "literals".
My goal is not to embed XML documents inside a program, but to provide
a more familiar syntax equivalent to node constructor expressions.

XQuery has functionality equivalent to this proposal: XML-style syntax
for elements, PIs, and comments. For attributes and documents you need
to use "computed constructors". "EcmaScript for XML" is similar.
Visual Basic does support "XML Document literals", and I guess we can
add support for them. Perhaps we can allow:

    #<?xml optional-stuff?><root>...</root>

and/or:

    #<!DOCTYPE root optional-stuff><root>...</root>

And of course we can use SRFI-108 in some way.

However, doing this "right" adds a fair bit of extra work and complexity.
It also pushes the limits of my expertise.

The obvious reader translation is to an $xml-document$ constructor.
Beyond that things are less obvious. Because the infoset for an XML
document is rather complex, it seems cleaner to use keywords - but alas
keywords are far from standard: we get into the question of whether to
use SRFI-88-style keywords or plain symbols. I like SRFI-88-style
keywords of course (Kawa has them), but obviously this limits portability.
(Though if we add them to R7RS-large I'd have less reluctance ...)

If you think the name "XML reader syntax" is misleading, we could change
the name to "Basic XML reader syntax" by analogy with SRFI-28.

>> Indeed, that is rather vague - and raises these questions: (1) What are
>> "standard Scheme character names"? I suggest going with the R7RS names.
>
> I'm happy with that, of course.
>
>> (2) When it comes to an implementation supporting "standard Scheme
>> character names", is this a "must" or a "should"? I could go either
>> way. (3) Do we want a different answer for SRFI-107 and SRFI-108/-109?
>> I'd prefer not.
>
> I'd prefer a SHOULD for all three SRFIs.

Updated in:
http://per.bothner.com/tmp/srfi-109/srfi-109.html
--
	--Per Bothner
xxxxxx@bothner.com   http://per.bothner.com/