Pathname representations Lassi Kortela 07 Feb 2020 22:19 UTC

Arthur:
> Speaking of pathname-manipulating APIs, have you seen MIT Scheme's pathname API?  It's well developed, and supports all kinds of filesystems going all the way back to the civilized days of versioned files in TOPS-20.

Per:
> Please also look at the Kawa path API:
> https://www.gnu.org/software/kawa/Paths.html
> I put quite a bit of thought into it, specifically
> to integrate filenames, URIs, and URLs into a coherent framework.

Interesting. According to that page, Kawa also has a per-thread working
directory. Per, do you feel as strongly as Marc does that a per-thread
rather than per-process CWD is the right way to go?

Re: pathname representation, it seems clear that we have two camps:

* Record-like abstract data type (MIT Scheme/Kawa/Racket/Common Lisp)
* Strings (most other Schemes and Python, Go, etc.)

Personally, I have to side with John. I have extensive experience with
both styles and have to say that I find the strings-only style far
superior. I've used the pathname ADT in Common Lisp, and the pathname
and URL ones in Racket. In principle, the ADT sounds much better but in
practice it has consistently led to verbose and hard-to-understand code
for me. I haven't used the Kawa or MIT Scheme pathname APIs; they may be
substantially different in practice.

A large part of the problem is that Unix-style pathnames are basically
directory-only, so most of the 6 or so components in Common Lisp
pathnames sit empty all the time. The difference between directory and
filename is often ambiguous, and with wildcards it's especially so.
Pathnames are also often passed to and received from syscalls and other
programs that use strings.
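
To make the "mostly empty" point concrete, here is a rough Python sketch.
PathRecord and parse_unix are hypothetical names invented for this
illustration. The six fields mirror the Common Lisp pathname components
(host, device, directory, name, type, version), but the parsing rule is
a simplification and is not what any Lisp implementation actually does.

```python
import posixpath
from typing import NamedTuple, Optional


class PathRecord(NamedTuple):
    """Hypothetical record with the six Common Lisp pathname fields."""
    host: Optional[str]
    device: Optional[str]
    directory: str
    name: Optional[str]
    type: Optional[str]
    version: Optional[str]


def parse_unix(path: str) -> PathRecord:
    """Naively split a Unix path string into the record above."""
    # Everything before the last slash is the directory part.
    directory, base = posixpath.split(path)
    # The last dot in the final component separates name from type.
    stem, ext = posixpath.splitext(base)
    # Unix paths carry no host, device, or version, so those stay None.
    return PathRecord(None, None, directory, stem or None, ext[1:] or None, None)


# Only a trailing slash tells the parser that "src" is a directory.
print(parse_unix("/usr/local/src"))
print(parse_unix("/usr/local/src/"))
```

In the first call "src" lands in the name field, in the second it
becomes part of the directory, and in both cases host, device, and
version stay empty.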

In Racket, I've found the URL functions really hard to understand.

By contrast, I can only say good things about the minimalist Python and
Go pathname APIs. They present a lightweight level of abstraction that
consistently does just what I want.
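
For readers who haven't used it, this is roughly what the string-based
style looks like in Python. The example paths are made up. posixpath is
the POSIX flavor of the standard os.path module, used here so the
results don't depend on the host platform.

```python
import posixpath

# Every function takes plain strings and returns plain strings.
joined = posixpath.join("/home/user", "docs", "notes.txt")
parent = posixpath.dirname(joined)    # everything before the last slash
leaf = posixpath.basename(joined)     # everything after the last slash
# normpath collapses ".", "..", and doubled slashes purely textually.
tidy = posixpath.normpath("/home/user/../user/./docs//notes.txt")

print(joined)  # /home/user/docs/notes.txt
print(parent)  # /home/user/docs
print(leaf)    # notes.txt
print(tidy)    # /home/user/docs/notes.txt
```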

If we have a clear lightweight API camp and a clear heavyweight API
camp, I don't want to push my minimalist views on others but would
rather honor both and think about interoperability between them. Scheme
is not well served by further divisions. At the end of the day, the
pathname/URL ADT always has to resolve to a string representation, so if
functions that take pathnames can take either a string or a pathname
object, things should be mostly OK. I've consciously taken care to write
all my pre-SRFIs dealing with pathnames so that they can take either one.
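
As an illustration of the "take either one" idea, Python does this with
os.fspath, which returns a string unchanged and asks a path object for
its string form. The sketch below only shows the general pattern and is
not how the pre-SRFIs are specified. extension_of is a hypothetical
helper written for this example.

```python
import os
import posixpath
from pathlib import PurePosixPath


def extension_of(path) -> str:
    """Return the file extension of a path given as a string or path object."""
    # os.fspath passes str through and converts os.PathLike objects,
    # so the rest of the function only ever deals with a plain string.
    text = os.fspath(path)
    return posixpath.splitext(text)[1]


# The caller can use whichever representation it prefers.
print(extension_of("reports/summary.txt"))
print(extension_of(PurePosixPath("reports/summary.txt")))
```

Both calls give the same result, so code written in the ADT style and
code written in the string style can share the same procedures.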