Re: Proposal to add HTML class attributes to SRFIs to aid machine-parsing Ciprian Dorin Craciun (05 Mar 2019 19:46 UTC)
Re: Proposal to add HTML class attributes to SRFIs to aid machine-parsing Marc Nieper-Wißkirchen (06 Mar 2019 10:12 UTC)

Re: Proposal to add HTML class attributes to SRFIs to aid machine-parsing Ciprian Dorin Craciun 05 Mar 2019 19:45 UTC

A note about HTML (4 or 5) vs XHTML.

My suggestion for XHTML is purely pragmatic:  XHTML is XML, so one
can just use any XML library to parse the document.
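
As a minimal sketch of this point, assuming a well-formed XHTML document, the snippet below uses only Python's standard `xml.etree.ElementTree` to collect the text of elements carrying a given class. The sample markup, the class names (`srfi-title`, `author`) and the helper `texts_by_class` are hypothetical and are not part of any SRFI or of the proposal.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment; the class names are illustrative only.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>SRFI sample</title></head>
  <body>
    <h1 class="srfi-title">Sample title</h1>
    <p class="author">A. N. Author</p>
    <p class="intro note">Not an author.</p>
  </body>
</html>"""


def texts_by_class(xhtml_source, class_name):
    """Return the text of every element whose class attribute lists class_name."""
    root = ET.fromstring(xhtml_source)
    found = []
    for element in root.iter():
        # The class attribute is a space-separated list of names.
        classes = element.get("class", "").split()
        if class_name in classes:
            found.append("".join(element.itertext()).strip())
    return found


print(texts_by_class(SAMPLE, "author"))  # prints ['A. N. Author']
```

One caveat with this approach: a plain XML parser does not know HTML named entities such as `&nbsp;`, so a document using them would need numeric character references (or a DTD-aware parser) to be parsed this way.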

Now, I know it "seems" that there are many HTML parsers out there;
unfortunately, this is not true...  There are a few, at least for the
most popular programming languages, but they are "bloated" and
full of issues...

I know this because I once tried to use such tools:  I tried
"Beautiful Soup" for Python and failed...  Then I settled on
https://github.com/ericchiang/pup, exported the whole thing as JSON,
and moved on from there...

Ciprian.

BTW, the tool I've mentioned, `pup`, can be used for HTML metadata extraction.