finishing SRFI-107 - representation of namespace declarations Per Bothner (27 Oct 2013 03:33 UTC)
SRFI-107 has been languishing for a while, but I'm going to try to finish it up. My latest edit is at http://per.bothner.com/tmp/srfi-107/srfi-107.html

(NOTE: This isn't quite ready for review yet. I think the syntax and the translations (into core S-expressions) are close to done, but the overall structure needs more work.)

First, I'd like to nail down the translation of namespace declarations. The current translation (as in the above URL, which is different from the older version at srfi.schemers.org) is:

    <prefix2:a xmlns:prefix1="URI1" xmlns:prefix2="URI&foo;2" xmlns="DURI">...</prefix2:a>
    ==>
    ($xml-element$ ((prefix1 "URI1") (prefix2 "URI" $entity$:foo "2") (|| "DURI"))
      ($resolve-qname$ a prefix2) ...)

I.e.:
- The set of namespace declarations is translated to a list, with one element for each declaration.
- Each namespace declaration is a (sub-)list that starts with the prefix being defined, and continues with the URI being bound to the prefix.
- Each prefix is a symbol. A default (element) namespace declaration is represented with an empty symbol, or equivalently the "reserved" prefix $default-element-namespace$.
- Normally the namespace URI is a literal string (possibly with entity references), but the format allows for evaluated expressions, using the same format as attributes.

Note that the prefix names in both $resolve-qname$ and the namespace-declaration list are unquoted symbols. These forms are conceptually a kind of variable reference and variable declaration, respectively, in a kind of lexical scoping. Therefore, using either strings or quoted symbols would be IMO wrong.

The implication is that neither $xml-element$ nor $resolve-qname$ can be bound to a function: they must be syntax. John Cowan has expressed a preference that it be possible for these to be functions. I don't think that is feasible while also supporting namespaces, unless you use a rather clumsy encoding.
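To make the lexical-scoping view concrete, here is a minimal sketch of how the two forms could be written as R7RS syntax-rules macros. It is only an illustration, not the SRFI's reference implementation. It assumes that each prefix can be bound as an ordinary variable with let (a real implementation would want a separate namespace so that prefixes do not shadow normal variables), that an element is represented as a plain list, and that a resolved name is a (uri . local-name) pair. The default namespace and $entity$ references are not handled.

```scheme
(import (scheme base))

;; Sketch only: each declaration (prefix uri-part ...) becomes a let binding,
;; so the tag and child expressions are evaluated inside its scope.
(define-syntax $xml-element$
  (syntax-rules ()
    ((_ ((prefix uri-part ...) ...) tag child ...)
     (let ((prefix (string-append uri-part ...)) ...)
       (list tag child ...)))))

;; Sketch only: a prefixed name is a reference to the variable bound above.
;; The prefix-less case ignores any default namespace declaration.
(define-syntax $resolve-qname$
  (syntax-rules ()
    ((_ local) 'local)
    ((_ local prefix) (cons prefix 'local))))

;; Hypothetical usage, roughly what <p:a xmlns:p="...">text</p:a> could read as.
(define example
  ($xml-element$ ((p "http://example.org/" "ns"))
    ($resolve-qname$ a p)
    "text"))
;; example => (("http://example.org/ns" . a) "text")
```

Because the prefix p is an unquoted identifier in both places, the reference inside $resolve-qname$ is resolved by the binding that $xml-element$ introduces, which is only possible when both are syntax.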
Specifically, evaluating the tag-expression and the child-expressions must be done *after* the binding is created, so they're evaluated in the "scope" of the namespace declaration. This can be done by wrapping the sub-expressions as lambda-expressions, and resolving prefixes using a hash-table. However, the use of lambda expressions makes for an unacceptably verbose and ugly encoding.

It is easy for $xml-element$ to be a function if namespace support is not needed. But maybe we can tweak the encoding so that $xml-element$ can be a function when namespaces aren't needed, while still supporting namespaces (which requires $xml-element$ to be a macro)?

The first problem is that the above encoding translates an empty namespace-declaration list to an empty list, which is not self-evaluating (at least in portable Scheme). We can fix this by using a vector instead of a list:

    <a xmlns:prefix1="URI1">...</>
    ==>
    ($xml-element$ #((prefix1 "URI1")) ...)

The advantage is that vectors are self-evaluating, and specifically an empty vector is. A modest disadvantage is that empty vectors aren't necessarily shared.

The other problem is that even a prefix-less element tag a is translated to ($resolve-qname$ a). This is needed in case there is a default namespace declaration. A non-namespace-supporting implementation can easily define a macro that translates ($resolve-qname$ a) to (quote a). If it is important that $resolve-qname$ be implementable as a function, we can change the reader-mapping to ($resolve-qname$ 'a), but that makes for a less efficient mapping.

The original namespace-using example would be:

    <prefix2:a xmlns:prefix1="URI1" xmlns:prefix2="URI&foo;2" xmlns="DURI">...</prefix2:a>
    ==>
    ($xml-element$ #((prefix1 "URI1") (prefix2 "URI" $entity$:foo "2") (|| "DURI"))
      ($resolve-qname$ (quote a) prefix2) ...)

That's not terrible, but it is clumsier and less efficient than the original mapping.
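For comparison, here is a rough sketch of what a function-only, namespace-ignoring implementation of the vector-and-quote mapping could look like. The list representation of an element is an assumption made for illustration. Note that this only covers prefix-less tags: a call such as ($resolve-qname$ 'a prefix2) would still try to evaluate prefix2 as a variable before the function is entered.

```scheme
(import (scheme base))

;; Sketch only: with the reader emitting ($resolve-qname$ 'a), the local name
;; arrives as an already-quoted symbol and can simply be returned.
(define ($resolve-qname$ local-name)
  local-name)

;; Sketch only: the namespace declarations arrive as a literal vector,
;; which this namespace-ignoring version accepts and discards.
(define ($xml-element$ namespace-declarations tag . children)
  (cons tag children))

;; Hypothetical usage, roughly what <a>text</a> could read as.
(define example
  ($xml-element$ #() ($resolve-qname$ 'a) "text"))
;; example => (a "text")
```

The vector argument matters here: with the list encoding, the empty declaration list () would be evaluated as an expression before the call, which is not portable.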
Personally, I don't see the value of supporting a function-only implementation. What is the use case? Therefore, my preference is the mapping at the start of this message, but if there is a strongly expressed preference for the mapping just above, that is acceptable too.

--
	--Per Bothner
xxxxxx@bothner.com   http://per.bothner.com/