optional user-specified end-delimiters Per Bothner 16 Apr 2013 19:31 UTC

This is a proposed syntax for optional user-specified
end tokens for both SRFI-108 and SRFI-109.
First the proposal, then a discussion comparing alternatives.


A string quasi-literal may have an optional end-label,
specified by a '!' followed by a label.  For example:



label ::= tagname-subsequent+

It seems reasonable to allow an integer label, so we allow
starting with a digit.  Thus for simplicity we also allow
hyphen, underscore, or period.
(In an unrelated change, I think tagname-subsequent should
allow periods.  XML compatibility is one reason.)
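
As a rough illustration of the label rule above, here is a minimal
Python sketch of a label recognizer.  It assumes tagname-subsequent can
be approximated by ASCII letters, digits, hyphen, underscore, and
period; the actual production may admit more characters, and the names
LABEL_RE and is_label are hypothetical.

```python
import re

# Assumption: tagname-subsequent is approximated as ASCII letters, digits,
# hyphen, underscore, or period. The real production may allow more.
LABEL_RE = re.compile(r"[A-Za-z0-9._-]+")


def is_label(text: str) -> bool:
    """Return True if all of `text` is one or more label characters."""
    return LABEL_RE.fullmatch(text) is not None
```

Under this approximation an integer such as "1" and a name such as
"a-b_c.d" are both accepted, while the empty string is not.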

extended-string-literal ::= "&" string-starttag?
    "{" initial-ignored? string-literal-part* "}" string-endtag?
string-starttag ::= "!" label
string-endtag ::= "!" label

The string-endtag is required if the string-starttag is specified,
and of course the labels must match.
Note this change means that an extended-string-literal must be
followed by a delimiter or end of input.
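
The checks described above might be sketched roughly as follows.  This
is only an illustration under several assumptions: the end tag is
written as "!" plus the label directly after the right brace, the body
contains no nested braces or escapes, an end tag without a start tag is
treated as an error (the proposal does not say), and the delimiter set
is a rough stand-in for Scheme's.  The name read_string_literal is
hypothetical.

```python
import re

# Assumption: label characters are approximated as ASCII letters, digits,
# period, underscore, and hyphen.
_LABEL = r"[A-Za-z0-9._-]+"

# Hypothetical simplified shape: "&" ("!" label)? "{" body "}" ("!" label)?
# The body may not contain braces or escapes, unlike the real syntax.
_LITERAL = re.compile(
    r"&(?:!(?P<start>%s))?\{(?P<body>[^{}]*)\}(?:!(?P<end>%s))?" % (_LABEL, _LABEL)
)

# Assumption: a rough stand-in for the Scheme delimiter set.
_DELIMITERS = " \t\r\n()\";"


def read_string_literal(src, pos=0):
    """Return (body, next_pos) for a literal at `pos`, or raise ValueError."""
    m = _LITERAL.match(src, pos)
    if m is None:
        raise ValueError("not an extended string literal")
    start, end = m.group("start"), m.group("end")
    if start is not None and end is None:
        raise ValueError("end tag required when a start tag is given")
    if start != end:
        # Covers mismatched labels and (by assumption) an end tag alone.
        raise ValueError("start and end labels do not match")
    nxt = m.end()
    if nxt < len(src) and src[nxt] not in _DELIMITERS:
        raise ValueError("literal must be followed by a delimiter")
    return m.group("body"), nxt
```

The final check reflects the note about delimiters: without it, text
such as "&{a}b" would leave it unclear whether "b" belongs to the
literal's tail or to a following token.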

For a named quasi-literal, we can use the constructor-name as an end-tag:


DISCUSSION: Should we require a final "!" if there is no
explicit label?  Make it optional?  I.e.:


You can add an explicit label:




(It's not clear all these options are all that useful.)

extended-datum-body ::=
     "&" cname datum-start-label? "{" initial-ignored?
         named-literal-part* "}" datum-end-label?
   | "&" cname datum-start-label? "[" expression* "]{" initial-ignored?
         named-literal-part* "}" datum-end-label?

datum-start-label ::= "!" label
datum-end-label ::=
     cname "!"?
   | cname? "!" label
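
To make the accepted end-label forms concrete, here is a small Python
sketch that validates the text found between the right brace and the
next delimiter.  It is an assumption, by analogy with strings, that a
start label requires a matching end label; it is also assumed that
cname itself contains no "!".  The name check_datum_end_label is
hypothetical.

```python
import re

# Assumption: label characters approximated as ASCII letters, digits,
# period, underscore, and hyphen.
_LABEL_RE = re.compile(r"[A-Za-z0-9._-]+")


def check_datum_end_label(end_text, cname, start_label=None):
    """Check `end_text` (the text after "}") against the datum-end-label forms.

    Accepted forms: empty, cname, cname "!", "!" label, cname "!" label.
    """
    if end_text == "":
        # No end label at all: only allowed (by assumption) without a start label.
        return start_label is None
    head, _bang, label = end_text.partition("!")
    if head not in ("", cname):
        return False
    if label == "":
        # Forms `cname` and `cname!`; a lone "!" is not in the grammar.
        return head == cname and start_label is None
    # Forms `!label` and `cname!label`; the label must match the start label.
    return _LABEL_RE.fullmatch(label) is not None and label == start_label
```

For instance, with cname "cname" and start label "1", both "!1" and
"cname!1" are accepted, while "cname!2" and a bare "cname" are not.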

It may occasionally be useful to make labels available for
semantic information.  One example is as implicit "id" attributes.
To do that we modify the translation:

&cname!label[init-args...]{content}cname!label ==>
   ($construct$:cname init-args... $>>$ "content"
                      ($construct-label$ "label"))

I.e. we add ($construct-label$ "label") as the final operand
of the $construct$:cname form.  We also add a standard definition:
   (define ($construct-label$ label) "")
This way the $construct-label$ by default becomes a no-op,
including when using define-simple-constructor.
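
The modified translation can be sketched as a small Python function
that builds the output form as text.  This is illustrative only: it
covers just the bracketed form with init-args, it assumes the
$construct-label$ operand is omitted when there is no label, and the
function name translate is hypothetical.

```python
def translate(cname, init_args, content, label=None):
    """Build the translated form for &cname[init-args...]{content} as text.

    `init_args` is a list of already-printed expressions. Only the
    bracketed form is handled, so "$>>$" is always emitted.
    """
    # Quote the content as a string literal, escaping backslash and quote.
    quoted = '"' + content.replace("\\", "\\\\").replace('"', '\\"') + '"'
    operands = list(init_args) + ["$>>$", quoted]
    if label is not None:
        # The label becomes the final operand, wrapped in $construct-label$.
        operands.append('($construct-label$ "%s")' % label)
    return "($construct$:%s %s)" % (cname, " ".join(operands))
```

Because the default $construct-label$ returns an empty string, a
constructor that simply concatenates its operands would see no change
from the extra operand.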


There are some plausible alternatives.  For example we can put the
end-tag just before the right brace.
For named constructors we could use:


This works and parses unambiguously.  However, it's difficult
to come up with a "compatible" syntax for strings,
as discussed below.  Also, the start-of-literal and the
end-of-literal both start with the same prefix "&cname", which
makes visual scanning slower.

We could add the "!" as also used for strings:




The former appears incompatible with optional explicit
labels (i.e. "cname!tag"), while the latter doesn't
help much with the visual scanning problem.

For strings we don't need the "&" since the end-delimiter
is required if specified:


This is a little tricky to parse, since you don't know that !END is
an end-tag until you see the right-brace.  (For the same reason
it's not very human-readable.)  It also doesn't feel very
consistent with the named-constructor case.

We could avoid this by requiring a '&' before the end-tag, either:




The former feels inconsistent with named constructors (it seems
to preclude explicit labels); the latter seems excessive.
	--Per Bothner
xxxxxx@bothner.com   http://per.bothner.com/