Re: Proposal to add HTML class attributes to SRFIs to aid machine-parsing Marc Nieper-Wißkirchen (06 Mar 2019 10:12 UTC)

As the author of a number of SRFIs, I agree that a more standardized
format for writing SRFIs is a great idea. It would also make it easier
to incorporate SRFIs into future RnRS standard documents.

However, XML, HTML, and XHTML are not the best formats to write by
hand. If we are going to change anything, I would like to propose a
subset of TeX, which is much more convenient for us authors. Unlike
HTML, TeX can be extended with macros. Someone would have to write a
small set of macros to serve as the basis for SRFI documents. Authors
would then have to use these macros, so that software can easily
convert the TeX source into other formats and index it.
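
To illustrate the indexing side, here is a minimal sketch. It assumes a
hypothetical macro \procedure{NAME}{ARGS} that starts each procedure
entry; the macro name, its argument layout, and the helper
list_procedures are invented for this example and are not part of any
existing SRFI convention. Under that assumption, plain sed is enough to
list the procedures a document defines, much like the pup query quoted
below:

```shell
#!/bin/sh
# Sketch only: \procedure{NAME}{ARGS} is a hypothetical macro, assumed
# to appear at the start of a line in the TeX source of a SRFI.

# Read TeX source on standard input and print the NAME of every
# \procedure{NAME}{ARGS} entry, one per line.
list_procedures() {
  sed -n 's/^\\procedure{\([^}]*\)}.*/\1/p'
}

# A tiny sample document using the hypothetical macro.
sample='\procedure{make-array}{shape obj}
Returns a newly allocated array.

\procedure{array-ref}{array index ...}
Returns the element at the given index.'

printf '%s\n' "$sample" | list_procedures
```

A real macro set would have to cope with entries that span several
lines or nest braces, which this line-oriented sketch does not attempt.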

Marc

On Wed, 6 Mar 2019 at 07:19, Arthur A. Gleckler
<xxxxxx@speechcode.com> wrote:
>
> I continue to be impressed with pup.  Per Bothner's SRFI 164 already has a bunch of appropriate markup in it, so I've been experimenting with it and pup.  For example, here I extract the names of all the procedures defined in that SRFI:
>
> > SRFI=164; cat $ss/srfi-$SRFI/srfi-$SRFI.html | pup '[kind="Procedure"] .proc-def text{}'
> -&gt;shape
> shape
> array-shape
> array-rank
> array-start
> array-end
> array-size
> array
> make-array
> make-array
> build-array
> index-array
> array-ref
> array-ref
> array-index-ref
> array-set!
> array-set!
> array-copy!
> array-fill!
> array-transform
> array-index-share
> array-reshape
> share-array
> array-flatten
> array-&gt;vector
> >
>
> Using pup with jq, a command-line JSON querying tool, you can extract even more information.  Here's a crude query that produces the first few procedures that are defined, along with their arguments:
>
> > SRFI=164; cat $ss/srfi-$SRFI/srfi-$SRFI.html | pup '.synopsis json{}'|jq 'map(.children[]|{name:select(.class=="function").text}[],{argument:select(.tag=="var").text})[0:13]'
> [
>   "(array?",
>   {
>     "argument": "obj"
>   },
>   "(range-from",
>   {
>     "argument": "start"
>   },
>   {
>     "argument": "step"
>   },
>   "(range&lt;",
>   {
>     "argument": "end"
>   },
>   {
>     "argument": "start"
>   },
>   {
>     "argument": "step"
>   },
>   "(range&lt;=",
>   {
>     "argument": "end"
>   },
>   {
>     "argument": "start"
>   },
>   {
>     "argument": "step"
>   }
> ]
> >
>
> This is just a proof of concept.  I'm not arguing that we should use pup or jq, or that Per's specific markup is the correct one — only that, even with a simple convention like Per is using in his SRFI, it's already possible to extract useful information.  We really shouldn't have to do much to encode useful information even in basic HTML.

--
Prof. Dr. Marc Nieper-Wißkirchen

Universität Augsburg
Institut für Mathematik
Universitätsstraße 14
86159 Augsburg

Tel: 0821/598-2146
Fax: 0821/598-2090

E-Mail: xxxxxx@math.uni-augsburg.de
Web: www.math.uni-augsburg.de/alg/mitarbeiter/mnieper/