Re: Proposal to add HTML class attributes to SRFIs to aid machine-parsing Ciprian Dorin Craciun 05 Mar 2019 15:33 UTC

On Tue, Mar 5, 2019 at 4:01 PM Lassi Kortela <xxxxxx@lassi.io> wrote:
> This is a proposal to gradually add a standardized set of HTML class
> attributes to the SRFI source documents. The classes would encode
> metadata that can be used to index all symbols defined in SRFIs.
> General information about the SRFI (date, author, abstract, status,
> license, etc.) could also be encoded.

I've quickly read through your HTML "tag" proposal, and although I
would have preferred something more "structured" (like my
S-expression-based proposal), I think it's still better than the
current approach of almost no meta-data.

However, I would envisage something a little more ambitious:
* documents should be XHTML 1.0 Strict (i.e.
https://www.w3.org/TR/xhtml1/) and validated with
https://validator.w3.org/;  this would ensure they are proper XML
files and use only "standard" elements, thus allowing easier
parsing;
* identify a set of "best practices" on how the various HTML elements
should be used (together with any needed class and ID attributes);
* as part of the "best practices", identify a set of standard required
elements to contain the author, title, etc.;
* come up with a "standard" CSS that can be applied to all these documents.
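
To illustrate the "easier parsing" point, here is a minimal sketch in
Python using only the standard library's XML parser.  The sample
document, the `title` and `author` class names, and the helper function
names are all made up for illustration; only `def-proc` comes from this
thread, and none of this is an agreed-upon format.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical, made-up document; a real SRFI would be much longer.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>SRFI N: Example</title></head>
  <body>
    <h1 class="title">Example</h1>
    <p class="author">Jane Doe</p>
    <p>See <code class="def-proc">frobnicate</code>.</p>
  </body>
</html>
"""


def elements_with_class(root, class_name):
    """Yield elements whose class attribute contains class_name as a whole token."""
    for element in root.iter():
        if class_name in element.get("class", "").split():
            yield element


def text_of(element):
    """Return the concatenated text content of an element, stripped."""
    return "".join(element.itertext()).strip()


def extract_metadata(xhtml_source):
    """Parse well-formed XHTML and collect a few (assumed) metadata items."""
    # Raises ET.ParseError if the input is not well-formed XML.
    root = ET.fromstring(xhtml_source)
    title = root.find(f".//{{{XHTML_NS}}}title")
    return {
        "title": text_of(title) if title is not None else None,
        "authors": [text_of(e) for e in elements_with_class(root, "author")],
        "procedures": [text_of(e) for e in elements_with_class(root, "def-proc")],
    }


if __name__ == "__main__":
    print(extract_metadata(SAMPLE))
```

Because the input is required to be well-formed XML, a document with
an unclosed tag is rejected outright by the parser, so no lenient HTML
parser is needed.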

Thus one still uses an HTML(-like) language, and is free (within the
constraints of the "best practices") to use `span`, `code` or `pre` as
one wishes.  We also get to "extract" meta-data from these documents
as you've described.

But most importantly one is able to take these documents and integrate
them into one's Scheme implementation's documentation as "reference"
material.

> The approach described here would be complementary. Ciprian has been
> working on an S-expression-based layout for the metadata: the
> S-expressions could be generated automatically from the HTML markup
> proposed here. In fact, Arthur Gleckler and Per Bothner already hinted
> at an HTML-based approach in the earlier thread.

In the end I managed to do a little bit more.  Starting from the
initially proposed S-expression format, I've also introduced
Markdown-based documentation support, and with a Rust-based generator
I've generated the HTML pages that can be viewed at:

  https://vonuvoli.volution.ro/documentation/libraries-html/_libraries.html

I have spent quite some time finding the best structure and layout.

Thus, if you would like, I would happily participate in such an effort
to define the XHTML-based document format and to work on the other
steps I've mentioned.

> What's elegant about the HTML class system is that it's based on
> "composition, not inheritance". So you can freely combine classes in
> ways that feel natural. For example, I made a "hidden" class as a
> shorthand for the CSS "display: none" that makes things invisible. So
> the visible name of a definition is:

I would try to keep these classes as mutually exclusive as possible...
Especially if we want to be able to extract anything useful out of
these documents.

Regarding the `display: none`, I think it is a very bad idea...  If it
is not visible, it will not be reviewed, and thus it will bit-rot, and
errors will creep into that element.  I am certain that one can
come up with a sensible HTML structure that allows such "meta-data"
items to be included alongside the actual text.

> Similarly we can use the "def" class to find all definitions in the
> SRFI, but "def proc" to find only procedure definitions, or "def var"
> to find only variable definitions, etc.

As highlighted above, I would use `def-proc` (i.e. one CSS class)
instead of `def proc` (i.e. two CSS classes).  (Because what does
`proc` mean by itself?  Or just `def`?)
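
A small Python sketch of the difference, from the point of view of a
tool that extracts meta-data.  The function names, the `def-var` class
and the "kind" labels are hypothetical, chosen only for illustration;
the composed scheme is my reading of the `def proc` / `def var`
examples quoted above.

```python
# One class per meaning: a plain lookup table is enough.
SINGLE_CLASS_KINDS = {
    "def-proc": "procedure definition",
    "def-var": "variable definition",
}


def kind_from_single_class(class_attr):
    """Return the kind named by exactly one known class, else None."""
    matches = [
        SINGLE_CLASS_KINDS[token]
        for token in class_attr.split()
        if token in SINGLE_CLASS_KINDS
    ]
    return matches[0] if len(matches) == 1 else None


def kind_from_composed_classes(class_attr):
    """Interpret `def` combined with `proc` or `var`; other mixes are unclear."""
    tokens = set(class_attr.split())
    if tokens >= {"def", "proc"} and "var" not in tokens:
        return "procedure definition"
    if tokens >= {"def", "var"} and "proc" not in tokens:
        return "variable definition"
    # Bare `def`, bare `proc`, or `def proc var`: no single obvious meaning.
    return None


if __name__ == "__main__":
    for attr in ["def-proc", "def proc", "proc", "def", "def proc var"]:
        print(repr(attr), kind_from_single_class(attr), kind_from_composed_classes(attr))
```

With composed classes the extractor has to decide what every
combination means, including the ones nobody intended; with one class
per meaning, an element either carries a known class or it does not.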

(I have a feeling that you've got "caught" in the CSS classes extravaganza.)  :)

Ciprian.