Grand unified schema for the metadata and API Lassi Kortela 12 Jul 2019 11:05 UTC

Conceptually, I think we should collect all Schemedoc data into one
giant S-expression. (It doesn't need to be stored in one file, but we
should treat everything as one expression when thinking about design.
How we break it into files is a detail that can change over time. It's
easy to write code to split/merge files as needed.)

I'd like to eventually have portable Scheme libraries for the metadata
generators. (I'd prefer to standardize on using R7RS, but it's not a big
deal. We can e.g. use a subset of R7RS that machine-translates to R6RS.)

If we think of the giant S-expression as a tree, each library would add
information to one or more branches of the tree. For example, there
would be a "srfi" branch, and a library that downloads SRFI metadata
from the srfi-admin GitHub repo, translates it into the Schemedoc
schema, and puts it under the "srfi" branch.
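As a rough illustration of the tree idea, here is a small sketch that
models the giant S-expression as nested data and prints it. It is
written in Python only to keep the example short; the real generators
would be Scheme libraries. The branch names, the field names ("number",
"title") and the to_sexp helper are assumptions made for this example,
not part of any agreed schema.

```python
import json


def to_sexp(node):
    """Render nested Python data as S-expression text.

    dict  -> one (key value) list per entry
    list  -> a parenthesized sequence
    str   -> a double-quoted string
    other -> its plain printed form (e.g. integers)
    """
    if isinstance(node, dict):
        return " ".join(f"({key} {to_sexp(value)})" for key, value in node.items())
    if isinstance(node, list):
        return "(" + " ".join(to_sexp(item) for item in node) + ")"
    if isinstance(node, str):
        return json.dumps(node)
    return str(node)


# Sample data only: one "srfi" branch with one entry, plus an
# "implementation" branch that no generator has filled in yet.
tree = {
    "schemedoc": {
        "srfi": [{"number": 1, "title": "List Library"}],
        "implementation": [],
    }
}

print(to_sexp(tree))
```

Each top-level key under "schemedoc" is one branch, so a generator
library only needs to know which branch (or branches) it owns.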

We should have one orchestration framework. That framework offers a
standard interface, and the generator libraries conform to that
interface. To start with, the framework could basically run a "fold"
function over the generators to produce the giant S-expression. I have
a preliminary design for this.
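One possible shape for the "fold" idea, sketched in Python for brevity
(the real framework and generators would be Scheme). This is not the
preliminary design mentioned above. It assumes a hypothetical interface
in which every generator is a function that takes the tree built so far
and returns a new tree with its own branch added. All names and sample
values here are made up for illustration.

```python
from functools import reduce


def srfi_generator(tree):
    # Hypothetical generator: adds a "srfi" branch with sample data.
    return {**tree, "srfi": [{"number": 1, "title": "List Library"}]}


def implementation_generator(tree):
    # Hypothetical generator: adds an "implementation" branch.
    return {**tree, "implementation": [{"name": "example-scheme"}]}


def build_tree(generators, initial=None):
    """Fold the generators over an initial tree, left to right."""
    start = {} if initial is None else initial
    return reduce(lambda tree, generate: generate(tree), generators, start)


tree = build_tree([srfi_generator, implementation_generator])
print(sorted(tree))
```

Because each generator returns a new tree instead of mutating its
argument, the framework is free to reorder generators, or rerun just
one of them, without hidden side effects.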

I think the generator libraries should not download or cache files
directly. The framework should take care of all the downloading and
caching so it can be done in a uniform way in one place.
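To make the division of labor concrete, here is a minimal sketch in
Python (again only for illustration; the real code would be Scheme).
The framework owns a Fetcher that downloads and caches, and the
generator only receives a fetch function. The Fetcher class, the URL
and the tab-separated input format are all assumptions invented for
this example.

```python
class Fetcher:
    """Framework-owned downloader with a simple in-memory cache."""

    def __init__(self, download):
        # `download` is any function mapping a URL to its text content.
        self._download = download
        self._cache = {}

    def fetch(self, url):
        # Only call the downloader the first time a URL is requested.
        if url not in self._cache:
            self._cache[url] = self._download(url)
        return self._cache[url]


# Hypothetical URL; the real location of the SRFI metadata may differ.
SRFI_INDEX_URL = "https://example.org/srfi-index.txt"


def srfi_generator(fetch):
    """Turn 'number<TAB>title' lines into entries for the "srfi" branch."""
    entries = []
    for line in fetch(SRFI_INDEX_URL).splitlines():
        number, title = line.split("\t", 1)
        entries.append({"number": int(number), "title": title})
    return entries


# Usage with a fake downloader, so the sketch runs without network access.
download_log = []


def fake_download(url):
    download_log.append(url)
    return "1\tList Library\n9\tDefining Record Types"


fetcher = Fetcher(fake_download)
first = srfi_generator(fetcher.fetch)
second = srfi_generator(fetcher.fetch)  # served from the cache
```

Since the generator never touches the network itself, swapping the
in-memory cache for an on-disk one would be a change in one place.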

The framework should also have a Unix cron-like facility so the
generators can poll their data sources (for example, check for new SRFIs
once a day, check for new releases of an implementation on GitHub, etc.).
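A very small sketch of what such a facility could look like, in Python
for illustration (the real framework would be Scheme). It uses fixed
intervals instead of full cron syntax, and the Scheduler class and job
names are hypothetical. The current time is passed in explicitly so
the behavior is easy to test.

```python
from datetime import datetime, timedelta


class Scheduler:
    """Tracks which generators are due to poll their data sources."""

    def __init__(self):
        self._jobs = []

    def register(self, name, interval):
        # A job that has never run is considered due immediately.
        self._jobs.append({"name": name, "interval": interval, "last_run": None})

    def due(self, now):
        """Return the names of jobs due at `now` and mark them as run."""
        names = []
        for job in self._jobs:
            last_run = job["last_run"]
            if last_run is None or now - last_run >= job["interval"]:
                job["last_run"] = now
                names.append(job["name"])
        return names


scheduler = Scheduler()
scheduler.register("srfi", timedelta(days=1))       # check for new SRFIs daily
scheduler.register("releases", timedelta(hours=1))  # check for releases hourly

start = datetime(2019, 7, 12, 11, 5)
at_start = scheduler.due(start)
one_hour_later = scheduler.due(start + timedelta(hours=1))
one_day_later = scheduler.due(start + timedelta(days=1))
```

The framework would call due() periodically and run only the
generators it returns, feeding each one through the shared downloader.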

The giant S-expression should have a well-defined schema. Currently, I
find it easiest to design the schema first in GraphQL (which already has
a simple type system), then write the equivalent S-expressions. I'm
working on a schema language for S-expressions, and John expressed
interest in being kept in the loop, but it will take several months to
finish it. Additional help is welcome.
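Until a proper schema language exists, even a crude check can catch
drift between generators. The sketch below is in Python for
illustration and is not the schema language mentioned above. It assumes
a hypothetical schema format that maps field names to expected types,
and the "number" and "title" fields are invented for the example.

```python
# Hypothetical schema for one entry in the "srfi" branch.
SRFI_SCHEMA = {"number": int, "title": str}


def validate(entry, schema):
    """Return a list of human-readable problems; empty means valid."""
    errors = []
    for field, expected in schema.items():
        if field not in entry:
            errors.append(f"missing field: {field}")
        elif not isinstance(entry[field], expected):
            actual = type(entry[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {actual}")
    for field in entry:
        if field not in schema:
            errors.append(f"unknown field: {field}")
    return errors


good = validate({"number": 1, "title": "List Library"}, SRFI_SCHEMA)
bad = validate({"number": "1", "author": "someone"}, SRFI_SCHEMA)
```

Rejecting unknown fields is a deliberate choice in this sketch: it
forces every new field to be added to the schema first.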