> Agreed. We should just be careful not to do anything that precludes
> later aggregation. For example, if a common standard for indexing
> becomes available, we should use that.

That would be neat. If we auto-generate all the metadata using our tool
(with only the categories and "see also" linkage coming from manual
additions), then it will be easy to adapt the tool to output a standard
format when one emerges.

This might be a reason not to commit the auto-generated metadata files
to Git yet. They could just go into the final tar file. Once a standard
emerges, we can commit them into the repos.

Of course, we can try to hasten the emergence of such a standard by
auto-converting RnRS arglists, and perhaps by writing a SRFI to specify
a standard documentation dump format for implementations (after
surveying the existing implementations to find out what kind of dump
format would be easiest for them, and whether any implementors would be
interested in the project in the first place). All that would be left
then would be libraries. With luck they could use the same dump format
as implementations (but frankly I have no idea what I'm talking about
re: library documentation).

> First we have to decide which place is blessed as the "point of truth"
> where the latest sources are collected. Is it the GitHub origin repos
> or the Git clones on the SRFI editor's personal computer? The
> release-making tool will poll this place only.
>
> Is this really a distinction worth making? Keeping Git repos in sync
> is trivial -- that's one of Git's major benefits. If I run the
> metadata-extraction tool locally, then commit the results and push
> them, we can be confident that the GitHub copies are identical.

I've been under the impression that it's difficult to pull so many
repos at once. If it's that easy, then I'm very surprised. I was sure
GitHub would throttle it somehow to make it impractical.

As an experiment, I tried to make the mega-repo with all SRFIs from
scratch using 'git subtree'. GitHub throttled things quite heavily, but
the cloning eventually finished without errors and they didn't ban me
yet :D Took about an hour.

I also tried git-pulling all 160+ SRFIs, and it took about 2-3 minutes
for me. This would indeed be easy enough.

> Because running a command is trivial and doesn't introduce
> dependencies on other tools.

This is fine if consistency is not a problem. It appears mass git pulls
are way easier than I thought, which invalidates most of my concerns in
the last email.

> That's still a benefit after finalization. We still publish errata,
> fixes to sample implementations, etc.
>
> I don't see the difference in workflow at all. Even working in
> batches, when I've done it, hasn't been impeded by having separate
> repos.
>
> No, I just don't see the advantages. Since there's manual work
> involved, we're still extracting metadata one SRFI at a time. And
> keeping a consistent version control log across the entire history of
> each SRFI, without a break at finalization, is important. There's no
> need to make things more complex.

OK, so the mega-repo is a no-go :) If you are opposed to both it and
the Racket server, I would just leave out CI entirely (see below).

> Can you tell me what, specifically, Travis could automatically check
> for us? Why couldn't our metadata-extraction tool run that same
> check?

Nothing -- it would run the exact tool we'd run on our own computers.
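The whole .travis.yml would amount to something like this (a minimal
sketch; check-metadata is a made-up name for whatever script ends up
wrapping our tool):

    # Hypothetical .travis.yml: no language toolchain needed, just run
    # the metadata check and report its exit status to GitHub.
    language: generic
    script: ./check-metadata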
If all the documents were in one repo, it would be a no-brainer to also
run the tool in Travis, since we'd get status checks in GitHub
essentially for free. If we have 160 repos, we might still be able to
set it up, but I wouldn't be surprised if there's some kind of trouble
from Travis due to the scale of it (throttling, or lackluster APIs for
mass-administering the repos). And in the future, each time we created
a repo for a new SRFI, we'd have to remember to enable Travis for it.
All this would add up to too much of a hassle for my tastes too.

IMHO the only tenable CI solutions are:

1. Each SRFI in its own repo, CI using a webhook server.
2. All SRFIs in one repo, CI using Travis.

If both of those are a no-go, I would just leave out CI and only run
the tool manually as you suggest.
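The manual fallback is hardly any work anyway. A minimal sketch in
Python, assuming the clones live side by side in directories named
srfi-N and each contains the hypothetical check-metadata script from
above (both names are assumptions, not settled decisions):

    #!/usr/bin/env python3
    # Pull every SRFI clone, then run the metadata check inside it.
    # The srfi-N directory layout and the check-metadata script name
    # are placeholders; adjust to whatever we actually settle on.
    import glob
    import subprocess

    for repo in sorted(glob.glob("srfi-*")):
        # --ff-only makes a diverged local branch fail loudly
        # instead of silently creating a merge commit.
        subprocess.run(["git", "-C", repo, "pull", "--ff-only"], check=True)
        subprocess.run(["./check-metadata"], cwd=repo, check=True)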