Re: Adding another Scheme.org server + better compartmentalization Lassi Kortela (28 Nov 2023 19:10 UTC)

Thanks for the detailed comments.

> Dedicated file system? Like what? You could ignore 404 errors, which should
> save some space, or redirect 404s to a dedicated log that is rotated and
> deleted more regularly, so you still have some logs about the 404s.

That may be a good idea.
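
To illustrate the idea of keeping 404s apart from the main log, here is a
minimal Python sketch. It assumes nginx's default "combined" log format; the
function name `split_404` and the sample lines are hypothetical, not taken
from our servers.

```python
import re

# Matches the quoted request followed by the status code, as in nginx's
# default "combined" format:  ... "GET /path HTTP/1.1" 404 153 ...
STATUS_RE = re.compile(r'"[^"]*" (\d{3}) ')

def split_404(lines):
    """Return (kept, not_found); lines with status 404 go to not_found."""
    kept, not_found = [], []
    for line in lines:
        m = STATUS_RE.search(line)
        if m and m.group(1) == "404":
            not_found.append(line)
        else:
            kept.append(line)
    return kept, not_found

# Hypothetical sample entries (documentation-range IP addresses).
sample = [
    '203.0.113.5 - - [28/Nov/2023:19:10:00 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"',
    '203.0.113.9 - - [28/Nov/2023:19:10:01 +0000] "GET /wp-login.php HTTP/1.1" 404 153 "-" "bot"',
]
kept, not_found = split_404(sample)
```

The 404 list could then go to a separate file with a shorter retention
period than the main log.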

> Or you can use fail2ban to monitor nginx logs, and ban disrupting ips.
>
> hint: https://www.digitalocean.com/community/tutorials/how-to-protect-an-nginx-server-with-fail2ban-on-ubuntu-20-04#step-2-configuring-fail2ban-to-monitor-nginx-logs

I've used fail2ban in the past. Some people say it's a good idea, others
say it's not useful. You're right that it will probably reduce noise in
the logs. On the other hand, it's one more thing to set up, and we have
to ensure the site has active maintainers at all times.

Using fail2ban with its default settings would be quite easy, but if we
have to write a custom config, that's one more thing that people will
have to learn.
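
For anyone unfamiliar with the general approach, the core idea (count bad
requests per IP and flag the ones above a limit) can be sketched in a few
lines of Python. This is not fail2ban's implementation or configuration: the
names `noisy_ips` and `fake_line`, the threshold, and the sample data are all
assumptions for illustration. Unlike fail2ban, the sketch ignores time
windows and does not ban anything.

```python
import re
from collections import Counter

# Client IP at the start of the line, then the status code that follows the
# quoted request (nginx default "combined" format is assumed).
LINE_RE = re.compile(r'^(\S+) .*?"[^"]*" (\d{3}) ')

def noisy_ips(lines, threshold=3):
    """Return the sorted IPs that produced at least `threshold` 404s."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group(2) == "404":
            counts[m.group(1)] += 1
    return sorted(ip for ip, n in counts.items() if n >= threshold)

def fake_line(ip, path, status):
    """Build a hypothetical log entry for the example below."""
    return (f'{ip} - - [28/Nov/2023:19:10:00 +0000] '
            f'"GET {path} HTTP/1.1" {status} 153 "-" "-"')

sample = (
    [fake_line("203.0.113.9", f"/probe{i}", 404) for i in range(3)]
    + [fake_line("203.0.113.5", "/", 200),
       fake_line("203.0.113.5", "/missing", 404)]
)
print(noisy_ips(sample))
```

A real deployment would still need the banning and expiry logic, which is
the part that adds maintenance work.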

>> Even if the above is taken care of, the main server still isn't beefy
>> enough to host everything. I'd like to take this opportunity to advance
>> the "microkernelization" of Scheme.org (as explained in the original
>> announcement) by keeping the front page and some administrivia on the
>> current server, and moving the community subdomains to a new server.

> Scaling that way is very painful depending on the growth of the community.

What causes the pain?

I think it's extremely important that we can guarantee committed
top-level admins at all times. We can only achieve that if their duties
are simple and clear. Giving them their own server is the clearest
arrangement.

> My favorite approach, given the team is:
>
> - two containers running in different hosts for gitea, and a third host for the database;

Why does Gitea need two hosts instead of one?

What is the benefit of having the database on a different host than the
web app?

Generally, it's best to avoid distributed systems if possible. I'd
rather rent a more powerful VPS than split one service across two VPS.

> - the static assets should be on their own hosts

I agree in principle, but static hosting takes almost no CPU/RAM. If we
dedicate a server to static sites, most of the money spent renting it
will be wasted. With a $10/year VPS this is not necessarily an issue,
but with a $5/month VPS it's a big deal.

Using a CDN to serve static files would probably make scheme.org less
independent. Currently we do everything on stock Linux, so we can easily
move between dozens of neutral VPS providers. New providers appear all
the time.

> FWIW, I should have showed up earlier, but sub.scheme.org should be dedicated
> to community website like https://js.org/, and small-web.org/. Everything that
> the core scheme.org community maintains, unlike:
>
>    https://community.scheme.org/
>
> Should be in scheme.org/path/to/perfect/community

The main reason for our subdomains is Conway's Law.
(https://en.wikipedia.org/wiki/Conway%27s_law)

"Any organization that designs a system will produce a design whose
structure is a copy of the organization's communication structure."

IMHO our domain can only have long-term success if we plan for Conway's
Law ahead of time. Since we are volunteer-driven, we cannot rely on
reaching particular individuals at short notice. (Other than the
top-level admins. For this reason, I think it's crucial to have a
minimalist top-level site with clearly defined bounds.)

> The historical reason is that SEO treats subdomains as different orgs, possibly
> with a worse reputation than the top-level domain; circa 2000 that was the case
> with Blogger and WordPress blogs. And that hurts indexing. Also, recently, there
> are a couple of websites that avoid subdomains completely; most GAFAM do that.

People bring up SEO occasionally, but that seems like an unwise focus
for a long-term site.

For starters, everything on Scheme.org should have excellent search
engine rankings purely because of the domain name. Even now, when I
search for `scheme` or `scheme community`, scheme.org comes up first (at
least on my computer). The Register just linked to us. (Search for
"scheme" on
https://www.theregister.com/2023/11/23/medley_interlisp_revival/)

If we do a non-terrible job, we will have high rank by default.

Another reason to avoid focusing on SEO is that good strategies tend to
change from year to year. It also adds complexity when we already have
too few volunteers.

> Another case is sr.ht, which brings up the tilde, e.g. https://git.sr.ht/~rabbits/

Tildes are confusing, difficult to type on many keyboards, and easily
mistaken for dashes.

What does a tilde represent (other than user)? We have almost no user
pages under scheme.org. (IMHO a good idea. At a time like this, user
pages will inevitably cause political/religious disputes.)

For the user pages on Gitea and the like, the URL layout is decided by
the upstream software.

>> Scheme.org is fundamentally DNS-based, git-sourced, and automated using
>> Ansible, so it is quite easy to experiment with different servers and
>> move subdomains between them with little disruption.

> It is unclear what the current setup is.

Two servers (well, as of yesterday, three) running Debian and Docker.

There's been some talk that we should use Guix. That's probably a good
idea, but then someone has to do the work to set up a Guix server :)