Re: Final comments, mostly editorial

Re: Final comments, mostly editorial Per Bothner 07 Dec 2013 01:24 UTC

Still mulling how to handle "]]>".

>> I'm inclined to think you're right, but I don't see any benefit in
>> adding a restriction to prohibit "SRFI 109 constructs" in attribute values
>> - it would seem to make the rules and syntax more complicated, just to
>> reduce flexibility, without any obvious benefit.
>
> I'm primarily concerned that, when translated into actual XML, it won't
> have the effects that people think it will have.

I don't understand what you mean by "when translated into actual XML".
If an xml-constructor containing "]]>" in element content is evaluated,
it would (assuming some DOM-like implementation) create a text node
containing "]]>", which is perfectly valid.  If that node is then
"serialized" (e.g. written out to an XML file), it is the job
of the serializer to replace "]]>" by "]]&gt;".
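
To make the serializer step concrete, here is a minimal sketch in Python.
It is not Kawa's actual code: the function name escape_text is made up for
this illustration, and it assumes the serializer escapes only what XML
requires in element content.

```python
def escape_text(text: str) -> str:
    """Escape a text node's character data for XML element content.

    Hypothetical helper for illustration only; not taken from Kawa.
    """
    # '&' goes first so the entities produced below are not re-escaped.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    # A bare '>' is allowed in element content except as the end of ']]>',
    # so only that sequence is rewritten here.
    return text.replace("]]>", "]]&gt;")


print(escape_text("Weird: ]]>!"))  # prints: Weird: ]]&gt;!
```

A serializer may equally well escape every ">" as "&gt;"; the point is only
that the text node itself holds the literal "]]>" and the escaping happens
on output.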

Note that the syntax is intended not only for XML content, but also for
HTML and related syntaxes.  "]]>" is not strictly valid, but it is "valid
in practice" - browsers (at least Chrome and Firefox) don't complain.

Furthermore, it appears to be valid HTML5, according to my reading of
the spec and the validators I've tried.

I did implement an experimental warning in Kawa:

#|kawa:1|# #<p>Weird: ]]>!</p>
/dev/stdin:1:12: warning - literal ']]>' is only valid following '<![CDATA['
<p>Weird: ]]&gt;!</p>
#|kawa:2|#

Note the ">" is serialized as "&gt;" in the Print part of the REPL.

Perhaps a warning is a reasonable compromise.  In the SRFI, perhaps we
could add:

The XML and HTML standards (up through HTML 4.x) do not allow the
literal text "]]>" in element content - instead it should be escaped
as in "]]&gt;".  This is for historical reasons of SGML-compatibility.
An implementation SHOULD warn if literal "]]>" is seen.
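
As a rough illustration of the check such a warning needs, here is a
minimal Python sketch that reports each "]]>" not closing a "<![CDATA["
section.  This is an assumption-laden toy, not the Kawa implementation:
the name stray_cdata_ends is hypothetical, and it scans plain text only,
ignoring comments, attribute values, and enclosed expressions.

```python
from typing import List, Tuple

CDATA_OPEN = "<![CDATA["
CDATA_CLOSE = "]]>"


def stray_cdata_ends(source: str) -> List[Tuple[int, int]]:
    """Return 1-based (line, column) pairs for each ']]>' outside CDATA.

    Hypothetical helper for illustration only.
    """
    hits = []
    i = 0
    while True:
        close = source.find(CDATA_CLOSE, i)
        if close == -1:
            return hits
        opened = source.find(CDATA_OPEN, i)
        if opened == -1 or opened > close:
            # No CDATA section starts before this ']]>', so it is stray.
            line = source.count("\n", 0, close) + 1
            column = close - (source.rfind("\n", 0, close) + 1) + 1
            hits.append((line, column))
        # Otherwise this ']]>' terminates the CDATA section; either way,
        # resume scanning just past it.
        i = close + len(CDATA_CLOSE)


print(stray_cdata_ends("#<p>Weird: ]]>!</p>"))  # prints: [(1, 12)]
```

The reported position is the start of the "]]>" sequence, which lines up
with the 1:12 in the warning shown above for the same input.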
--
	--Per Bothner
xxxxxx@bothner.com   http://per.bothner.com/