Re: Proposal to add HTML class attributes to SRFIs to aid machine-parsing Marc Nieper-Wißkirchen (06 Mar 2019 10:12 UTC)
Re: Proposal to add HTML class attributes to SRFIs to aid machine-parsing Lassi Kortela (11 Mar 2019 11:43 UTC)

 > If the goal is, as it has been, human review and understanding, then
 > perhaps it would make sense if it was easier on the authors and
 > readers, otherwise it should be relegated to a personal project and
 > not a group discussion.

Like all complex projects with several human participants, this one
doesn't have one clear goal. Everybody has their own overlapping goals
and we're trying to arrive at a compromise that adds value for
everybody without significant detriment to anybody. Easy authoring,
easy review, easy maintenance work and easy machine parsing are all
goals to some degree. The precise degrees are a subject of constant
thought and debate, as can be seen all over the thread :)

We don't ultimately wish to work alone on our personal projects --
we'd like to have a standardized process and infrastructure to share
the workload as much as possible. Current personal projects are mainly
exploration and bootstrapping for what we hope to be a larger effort
if we can find common ground (seems very promising to me right now).

Later on some tools may be personal projects, but it would be futile
to keep making tools without relying on a parseable format upstream.

 > This may be a stupid suggestion, but perhaps it would be easier to
 > write the srfis in a specialised markup format, and then generate
 > both the docs and html/xhtml from that?

 > I would imagine that a very simple and clear semi-markup language
 > would actually make it easier to submit, edit, and review srfis

These would be very reasonable suggestions if this were only a
technical problem :) However, if you skim most of this thread it will
become apparent how difficult it would be to move away from HTML. A
new markup language would be all well and good if we weren't dealing
with human beings and 160+ SRFIs over 20 years.

The fundamental problem here is respecting people's time and energy.
That means asking SRFI authors what kind of format they are willing to
put up with. The technical aspect is just coming up with tricks to get
as much machine-parseable info from that format as we can.

Requiring all authors to submit strict markup seems like an
insurmountable hurdle right now, so we're exploring whether we could
have volunteers transform the markup of finalized SRFIs into a
standard form that is easy to parse. The time and energy of these
volunteers would also have to be respected, and the backlog of old
SRFIs is large, so the devil is mainly in the details. That's why this
discussion is so long and nuanced. Simple answers like "just switch to
language X" are unlikely to work for all stakeholders.

 > Aren't we down a rat hole atm about what elements, tags, features,
 > etc, and the best way to go about handling existing and future docs? :)

If you ask me, one formatting rat hole every 10-20 years means the
process is very healthy :)

The HTML tags/classes can be figured out one way or another. It's just a
question of what percentage of the effort is manual human effort vs
what percentage is automated machine parsing. Any problem with it can
ultimately be solved with some more manual effort.
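
To make the "automated" half of that trade-off concrete, here is a
minimal sketch in Python, using only the standard library's
html.parser. It collects the text of every element that carries a
given class. The class name "proc-def" and the sample markup are
made up for illustration; they are not names anyone in this thread
has agreed on.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect the text content of every element carrying a given class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.depth = 0       # > 0 while we are inside a matching element
        self.current = []    # text pieces of the element being collected
        self.found = []      # finished results, in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.wanted_class in classes:
            self.depth = 1
            self.current = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.found.append("".join(self.current).strip())

    def handle_data(self, data):
        if self.depth:
            self.current.append(data)


def extract(html, wanted_class):
    """Return the text of each element in `html` that has `wanted_class`."""
    parser = ClassTextExtractor(wanted_class)
    parser.feed(html)
    parser.close()
    return parser.found


# Hypothetical SRFI fragment; "proc-def" is an invented class name.
SAMPLE = """
<p>Some prose.</p>
<dl>
  <dt class="proc-def"><code>(foo <var>x</var>)</code></dt>
  <dd>Returns something.</dd>
  <dt class="proc-def"><code>(bar)</code></dt>
</dl>
"""

if __name__ == "__main__":
    for signature in extract(SAMPLE, "proc-def"):
        print(signature)
```

Note that the nesting counter assumes every non-void element is
explicitly closed. Loose HTML with omitted end tags (an unclosed <p>
inside a marked element, say) would confuse it, and that is exactly
the kind of case where the manual effort comes in.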

Each new markup language considered would basically duplicate all
the tags/classes concerns for that language. Plus people have
different and often opposing ideas about what they like in a markup
language, as can be seen if you skim the whole thread. I mean, by now
people have suggested:

* Loose HTML
* Strict HTML
* Strict XHTML
* LaTeX
* Scribe
* Scribble
* CommonMark
* reStructuredText
* Anything else I forgot?

All of these choices would come back to that language's flavor of the
tags/classes problem.

At the end of the day, many authors can be expected to want to use
their own favorite markup language anyway. They'll just convert to the
submission format, as they have done until now.