UpScheme: A new entry-level Scheme implementation

Lassi Kortela 12 Aug 2019 17:52 UTC
Please welcome UpScheme: <https://github.com/lassik/upscheme>

I spent a day hacking on the FemtoLisp interpreter by Jeff Bezanson (now
known for the Julia language), porting and cleaning up its C codebase.
FL implements a Lisp dialect close to Scheme. I'm gradually nudging it
toward R7RS. FL's combination of simplicity and speed is impressive and
a big credit to what Lisp can do. The code is also very easy to hack.
Without a foundation like this I wouldn't have started at all.

The main purpose of this implementation is to be a maximally portable
replacement for things like shell and awk so that you can leverage
Scheme's clean, expressive and reliable design where you'd otherwise
have to contend with Unix tools. Basically like an extremely compact
version of Gauche with fewer features and less performance, but extremely
portable. So that if you find yourself on a weird system with nothing
but a C compiler, you can confidently just run build.sh and
get a useful, maintained Scheme in seconds. As a programmer, this will
also satisfy my curiosity to find out how much can be done with a
portable and minimal program.

A central design tenet is to play well with other Schemes. I'll
write a formal API spec that can be implemented for any R7RS Scheme so
that users who outgrow the simple interpreter can seamlessly switch to a
bigger Scheme. The API will incorporate many SRFIs, with custom APIs
only for the missing parts. I'll design any custom APIs with a view to
submitting them as SRFIs. The group discussion on these lists is really
great for finding better ways to do things.

A stable spec will be published once a year but people can keep using
prior years' APIs as long as they want to. A script written now would
start with (import (upscheme 2019)) and that will continue to work
indefinitely. Any bad design in the 2019 API can be rectified next year
while the old API keeps being supported.
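
To make the yearly-API idea concrete, here is a minimal sketch in C of how
an interpreter could keep one frozen binding table per year and resolve
names against the year a script imported. This is an illustration under
stated assumptions, not UpScheme's actual code: the names api_lookup,
struct api_year and the "count-digits" procedure are all hypothetical, and
the 2019 behavior (zero has no digits) is an invented example of a design
mistake that a later year corrects.

```c
/* Sketch only: every name below is hypothetical, not UpScheme's real code. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*prim_fn)(int);

/* Imagined 2019 design mistake: 0 is reported as having 0 digits. */
static int count_digits_2019(int n)
{
    int d = 0;
    while (n != 0) { n /= 10; d++; }
    return d;
}

/* Imagined 2020 fix: 0 has 1 digit. The 2019 version is left untouched. */
static int count_digits_2020(int n)
{
    int d = 1;
    while (n / 10 != 0) { n /= 10; d++; }
    return d;
}

struct binding { const char *name; prim_fn fn; };
struct api_year { int year; const struct binding *bindings; size_t count; };

/* One frozen table per published year; old tables are never edited. */
static const struct binding api_2019[] = { { "count-digits", count_digits_2019 } };
static const struct binding api_2020[] = { { "count-digits", count_digits_2020 } };

static const struct api_year api_years[] = {
    { 2019, api_2019, sizeof api_2019 / sizeof api_2019[0] },
    { 2020, api_2020, sizeof api_2020 / sizeof api_2020[0] },
};

/* Resolve `name` in the API of `year`; NULL if the year or name is unknown. */
prim_fn api_lookup(int year, const char *name)
{
    size_t y, b;
    assert(name != NULL);
    for (y = 0; y < sizeof api_years / sizeof api_years[0]; y++) {
        if (api_years[y].year != year)
            continue;
        for (b = 0; b < api_years[y].count; b++)
            if (strcmp(api_years[y].bindings[b].name, name) == 0)
                return api_years[y].bindings[b].fn;
    }
    return NULL;
}
```

In this sketch a script that imported the 2019 API keeps getting
count_digits_2019 forever, while a script that imports 2020 gets the
corrected procedure. Publishing a new year only appends a table.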

This arrangement will provide an explicit API stability guarantee (no
deprecation warnings) and a guaranteed upgrade path for users. I think
these things are very important in order for people to trust a small
entry-level implementation. They are also important for any Unix shell
replacement (/bin/sh has many drawbacks but it persists because it has a
portability guarantee via POSIX and de facto standardization).

The most immediate use of this implementation will be dog-fooding SRFI
170 and other upcoming SRFIs to expose our specifications to harsher
conditions. (The FemtoLisp build has a simple bootstrapping step but I
plan to remove it, rewriting everything in C. This is tolerable: it's
amazing how little code there is. To my knowledge, this will be the only
Scheme implementation written only in C, which is a dubious honor :)

For the standard library I'll throw in as many goodies as can be
implemented easily in portable C. E.g. several S-expression variants,
JSON, XML/HTML output, deflate compression (zip, tar/gzip, png), some
variant of regexps, Unicode stuff, dependency sort for make-like things,
sockets and an HTTP client. Call/cc is not planned currently. The
implementation is single-threaded (no OS threads, no green threads) and
will stay that way. Unix signals and subprocesses will be subsumed into
a centralized poll() framework (i.e. instead of installing signal
handling procedures, you inquire about signals by reading a port). More
generally, the guiding principle is "no magic". I've been systematically
eliminating what little magic there is in the FemtoLisp codebase.
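
One common way to get "signals as something you read" in portable C is the
self-pipe technique: the signal handler does nothing but write the signal
number into a pipe, and ordinary code waits on the read end with poll().
The sketch below shows that technique under stated assumptions. It is not
taken from the UpScheme or FemtoLisp sources, and the helper names
signal_port_watch and signal_port_read are hypothetical.

```c
/* Sketch of the self-pipe technique; helper names are hypothetical. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static int sig_pipe[2] = { -1, -1 }; /* [0] = read end, [1] = write end */

static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Handler uses only async-signal-safe calls: one write() of one byte. */
static void on_signal(int signo)
{
    int saved_errno = errno;
    unsigned char byte = (unsigned char)signo;
    if (write(sig_pipe[1], &byte, 1) == -1) {
        /* Pipe full or closed: this sketch simply drops the signal. */
    }
    errno = saved_errno;
}

/* Start routing `signo` into the pipe. Returns 0 on success, -1 on error. */
int signal_port_watch(int signo)
{
    struct sigaction sa;
    if (sig_pipe[0] == -1) {
        if (pipe(sig_pipe) == -1)
            return -1;
        if (set_nonblocking(sig_pipe[0]) == -1 ||
            set_nonblocking(sig_pipe[1]) == -1)
            return -1;
    }
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, NULL);
}

/* Wait up to timeout_ms for a signal.
   Returns the signal number, 0 on timeout, or -1 on error. */
int signal_port_read(int timeout_ms)
{
    struct pollfd pfd;
    unsigned char byte;
    int n;
    assert(sig_pipe[0] != -1); /* signal_port_watch() must be called first */
    pfd.fd = sig_pipe[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    do {
        n = poll(&pfd, 1, timeout_ms);
    } while (n == -1 && errno == EINTR); /* a signal interrupted poll(): retry */
    if (n <= 0)
        return n;
    if (read(sig_pipe[0], &byte, 1) != 1)
        return -1;
    return byte;
}
```

Because the read end is an ordinary file descriptor, it can sit in the same
poll() set as sockets and subprocess pipes, which is what makes a single
centralized event loop possible. A caller would do something like
signal_port_watch(SIGINT) once at startup and then call
signal_port_read(timeout) from its main loop.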

The current code builds and bootstraps from scratch in 6 seconds with no
warnings using only the system C compiler on macOS, Linux, OpenBSD,
FreeBSD and DragonFly BSD. Metrics like this are central in guiding the
design: faster builds, fewer dependencies, more platforms. Non-Unix
operating systems are coming soon: it'll be fun (pain is weakness
leaving the body :)

I have a pretty coherent vision of what the initial spec and release
should be like, but it'll be more efficient to build it than to write
about it. It'll be done in tandem with the OS SRFIs which will really
put both the implementation and the SRFIs to the test. Happy to hear any
positive/negative comments once things have stabilized a bit. I
guesstimate it will be ready for release in December or January, just in
time to finalize the (import (upscheme 2020)) API specification.
Development will be concurrent with other Scheme work and will hopefully
enrich it.

(The Up in UpScheme originally came from "ultra-portable". I kept the
name because it felt natural to type "upscheme" in a shell. It's also
nice to promote Scheme in the command name; I'll try my best to promote
and not embarrass :)