Re: SRFI-metadata-syncing SRFI?

SRFI-metadata-syncing SRFI? noosphere@xxxxxx (08 Nov 2020 21:50 UTC)
Re: SRFI-metadata-syncing SRFI? Vladimir Nikishkin (09 Nov 2020 01:00 UTC)
Re: SRFI-metadata-syncing SRFI? Lassi Kortela (09 Nov 2020 09:41 UTC)
Re: SRFI-metadata-syncing SRFI? noosphere@xxxxxx (09 Nov 2020 16:15 UTC)
Re: SRFI-metadata-syncing SRFI? Lassi Kortela (09 Nov 2020 16:36 UTC)
(missing)
Re: SRFI-metadata-syncing SRFI? noosphere@xxxxxx (09 Nov 2020 20:35 UTC)
Re: SRFI-metadata-syncing SRFI? Lassi Kortela (09 Nov 2020 20:57 UTC)
Re: SRFI-metadata-syncing SRFI? Lassi Kortela (09 Nov 2020 21:05 UTC)
Re: SRFI-metadata-syncing SRFI? noosphere@xxxxxx (09 Nov 2020 23:41 UTC)
Re: SRFI-metadata-syncing SRFI? Lassi Kortela (10 Nov 2020 07:53 UTC)
Re: SRFI-metadata-syncing SRFI? noosphere@xxxxxx (09 Nov 2020 23:45 UTC)
Re: SRFI-metadata-syncing SRFI? noosphere@xxxxxx (09 Nov 2020 20:50 UTC)
Re: SRFI-metadata-syncing SRFI? Lassi Kortela (09 Nov 2020 21:12 UTC)
The funnel pattern Lassi Kortela (09 Nov 2020 21:30 UTC)

Re: SRFI-metadata-syncing SRFI? Lassi Kortela 09 Nov 2020 20:57 UTC

>> FWIW I'm experimenting with scripts that directly query the GitHub API.
>> See `external.rkt' in the repo. It might or might not be a cleaner
>> solution. We'll see!

Using <https://api.github.com/repos/arcfide/chez-srfi/contents> to get
the file list in the repo is very nice! Can it get branches or tags
other than the default branch (master)?

To get the contents of a particular file in the repo, URLs of the form
<https://raw.githubusercontent.com/arcfide/chez-srfi/master/%253a0.sls>
let you do it without any JSON or Base64. "master" can be any other
branch or tag as well.
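
A rough sketch of building both kinds of URL, in Python. The helper
names are made up for illustration. As far as I know, the contents
endpoint accepts a `ref` query parameter naming a branch, tag, or
commit, which would cover the non-default-branch case. Note that the
file on disk is literally named `%3a0.sls`, so its percent sign has to
be escaped again when it goes into a URL.

```python
from urllib.parse import quote, urlencode


def contents_url(owner, repo, path="", ref=None):
    """URL of the GitHub contents API listing for a repo (hypothetical helper)."""
    url = f"https://api.github.com/repos/{owner}/{repo}/contents"
    if path:
        url += "/" + quote(path)
    if ref is not None:
        # `ref` selects a branch, tag, or commit instead of the default branch.
        url += "?" + urlencode({"ref": ref})
    return url


def raw_url(owner, repo, ref, path):
    """URL of the raw file contents, with no JSON or Base64 wrapping."""
    # quote() escapes the literal "%" in the file name as "%25".
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{quote(path)}"


print(contents_url("arcfide", "chez-srfi", ref="master"))
print(raw_url("arcfide", "chez-srfi", "master", "%3a0.sls"))
```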

>> I'm all in favour of a standardised, minimal way of expressing SRFI
>> support, or maybe even general features (eg 'this implementation has
>> native threads') for each implementation.

This is a good idea. We have started the work at
<https://github.com/schemedoc/implementation-metadata/tree/master/schemes>;
additions welcome! The schema is not stable yet but it's getting there.
None of this data is scraped because it is all so idiosyncratic that
there isn't much of anything we could scrape.

The return values of `(features)` could be scraped; the Docker
containers at <https://github.com/scheme-containers> would make for a
good foundation for that.
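
Once the printed result of `(features)` has been captured from a
container, turning it into data is simple. A minimal sketch in Python,
assuming the output is one flat list of plain symbols (symbols written
with `|...|` that contain whitespace would need a real S-expression
reader); `parse_features` is a made-up name:

```python
def parse_features(printed):
    """Split the printed form of a flat feature list into symbol names."""
    text = printed.strip()
    if not (text.startswith("(") and text.endswith(")")):
        raise ValueError("expected a parenthesized list")
    # Assumption: no nested lists and no symbols containing whitespace.
    return text[1:-1].split()


print(parse_features("(r7rs exact-closed ratios full-unicode)\n"))
```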

>> It's worth noting that this
>> won't be possible for unmaintained implementations, but fortunately their
>> respective feature sets aren't likely to change any time soon. :-)
>
> But it could be possible if such information is not gleaned from their
> tar files but from a URL.
>
> The metadata consumer could be pointed to such a URL, which could be
> hosted anywhere, and does not need the intervention of an absent maintainer.
>
> Even if it was gleaned from their tar files, unmaintained
> implementations could be unarchived and then rearchived with the
> requisite metadata, so future queries of it could succeed, and so that
> ways of getting metadata could be consistent across implementations.

Sure - for old releases, we could create files with the missing
information and store them in one place (for example, as a repo under
schemedoc). Erkin's work on the srfi-metadata repo has already covered a
good distance toward this.

I'd advise against re-archiving old releases of Scheme implementations.
It's clearer policy for their authors' reputation as well as users'
convenience if we always use the pristine source releases made by the
authors; i.e. identical tar files. Nowadays security is also an
increasing concern, and we routinely take hashes of packages.
Repackaging changes the hash.
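
To illustrate the hash point, here is a small Python sketch that builds
two tar archives in memory: one standing in for a pristine release and
one with a metadata file added. The file names and contents are
invented for the example; only the effect on the SHA-256 digest matters.

```python
import hashlib
import io
import tarfile


def make_tar(files):
    """Build an uncompressed tar archive in memory from {name: bytes}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = 0  # fixed timestamp so the bytes are reproducible
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


source = {"scheme-1.0/main.c": b"int main(void) { return 0; }\n"}
pristine = make_tar(source)

# Hypothetical repackaging: same sources plus an added metadata file.
repacked = make_tar({**source, "scheme-1.0/metadata.scm": b"(srfi 0 1 2)\n"})

print(sha256_hex(pristine))
print(sha256_hex(repacked))
print(sha256_hex(pristine) == sha256_hex(repacked))
```

Anyone who recorded the hash of the author's original tarball would see
a mismatch on the repackaged one, even though the sources are unchanged.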

I'd also continue to advise storing the information for new releases in
the Scheme implementation's own git repo. There's a fairly strong
convention that one Scheme implementation = one repo; the same repo
contains the C source files, the Scheme source files, the boot image,
documentation, and everything else. It is also watched over constantly by
that Scheme implementation's author and most active users. So there's a
good basis for getting the information right and keeping it well
maintained. We have quite a few places around Scheme where information
is stored rather far from its origin, and it tends to get out of date
quite easily, or its correctness is hard to verify.

A further benefit of storing in the Scheme implementation's repo is that
future source releases (tar files) will automatically contain info that
is up to date for that release. So even if we lose the version control
history, anyone having the tarball for a release can retrieve the
metadata. We've found and resurrected quite a few old Scheme
implementations for the Docker containers, and empirically, the tarballs
are by far the most resilient format for long-term archival.

If the standardized metadata files take off, a future aggregator could
read them and then merge in the file containing missing information from
old implementations. Does this sound reasonable?
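
A minimal sketch of that merge step in Python. The data layout, the
implementation names, and the rule that metadata shipped with a release
overrides the fallback file are all assumptions made for the example,
not a proposed schema:

```python
def merge_metadata(from_releases, fallback):
    """Combine per-implementation metadata, preferring what releases ship."""
    # Start from the hand-maintained file for old implementations.
    merged = {name: dict(fields) for name, fields in fallback.items()}
    # Metadata read from an implementation's own release wins on conflict.
    for name, fields in from_releases.items():
        merged.setdefault(name, {}).update(fields)
    return merged


# Hypothetical inputs: "scheme-a" ships its own metadata file,
# "scheme-b" is unmaintained and only appears in the fallback file.
from_releases = {"scheme-a": {"srfi": [0, 1, 9]}}
fallback = {
    "scheme-a": {"srfi": [0], "threads": "native"},
    "scheme-b": {"srfi": [0, 6]},
}

print(merge_metadata(from_releases, fallback))
```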