Re: The case for glob in SRFI 170
Lassi Kortela 11 Dec 2019 18:03 UTC
> Yes, it's messy. Yes, it has a lot of goofy options. Yes, its regex
> language is like nothing else anywhere (except its ancestors in the DEC
> operating systems, though [...] comes from Unix regex).
>
> BUT: it's in C, and that makes it fast even on slow Schemes. I think
> that alone makes it worth considering.
As stated, I'd vote against it in this SRFI, but would probably like it
in a dedicated higher-level SRFI.
- As mentioned, it's significantly higher-level than the other stuff in
this SRFI.
- It's easy to implement in portable Scheme on top of a directory walker.
- Not sure interpreter speed is the bottleneck even on slow
interpreters; is interpretation really slower than syscalls and disk I/O?
- Probably not available as a C library function on Windows. The
Windows API's FindFirstFile() has its own weird glob syntax that we need
to deliberately avoid in any case.
- There is no agreement on a particular glob syntax. In particular, "**"
globs are useful but less standardized. How are international characters
handled? Etc.
- It'd be nice to use S-expression regexps instead of using string
regexps and worrying about escaping. Probably would be nice to have the
traditional string regexps as well.
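
To make the "directory walker" point concrete, here is a rough sketch.
It is written in C rather than Scheme only to keep it short and
testable; the same logic ports directly. The names wild_match and
count_matches are made up for this example, only "*" and "?" are
handled (no "[...]", no "**"), and only a single directory is scanned.

```c
#include <assert.h>
#include <dirent.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: match NAME against PAT, where '*' matches any
   run of characters (including none) and '?' matches exactly one.
   Simple backtracking; fine for a sketch, not tuned for speed. */
bool wild_match(const char *pat, const char *name)
{
    for (; *pat != '\0'; pat++, name++) {
        if (*pat == '*') {
            /* Let '*' absorb 0..n characters and retry the rest. */
            for (const char *rest = name; ; rest++) {
                if (wild_match(pat + 1, rest))
                    return true;
                if (*rest == '\0')
                    return false;
            }
        }
        if (*name == '\0')
            return false;
        if (*pat != '?' && *pat != *name)
            return false;
    }
    return *name == '\0';
}

/* Hypothetical helper: count entries of DIR whose names match PAT.
   Returns -1 if the directory cannot be opened. As in shell globs,
   dotfiles are skipped unless the pattern itself starts with '.'. */
long count_matches(const char *dir, const char *pat)
{
    assert(dir != NULL && pat != NULL);
    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;
    long n = 0;
    struct dirent *e;
    while ((e = readdir(d)) != NULL) {
        if (e->d_name[0] == '.' && pat[0] != '.')
            continue;
        if (wild_match(pat, e->d_name))
            n++;
    }
    closedir(d);
    return n;
}
```

The matcher is the only glob-specific part; everything else is the
directory walker a Scheme implementation would already need.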
> Specifics: I think GLOB_MARK (terminate directory paths with slashes)
> should always be on, GLOB_ERR (raise an error if opening or reading a
> directory fails, otherwise carry on) should be exposed, and GLOB_NOSORT
> (don't sort results) should be exposed in the reverse sense.
>
> So: (glob pattern report-error? sorted?). The result is a list of
> strings, possibly an empty list (if you want the bash behavior of
> returning the pattern as a result, do it yourself).
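
For reference, a rough sketch of how the three proposed arguments could
map onto POSIX glob() flags. The name glob_print and its return
convention are invented for illustration; only glob(), globfree() and
the GLOB_* constants come from POSIX.

```c
#include <assert.h>
#include <glob.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper for (glob pattern report-error? sorted?).
   Prints each match to OUT, one per line. Returns the number of
   matches (0 when nothing matches), or -1 if glob() reports an error. */
long glob_print(const char *pattern, bool report_error, bool sorted,
                FILE *out)
{
    assert(pattern != NULL && out != NULL);

    int flags = GLOB_MARK;          /* always mark directories with '/' */
    if (report_error)
        flags |= GLOB_ERR;          /* stop on unreadable directories */
    if (!sorted)
        flags |= GLOB_NOSORT;       /* exposed in the reverse sense */

    glob_t g;
    memset(&g, 0, sizeof g);

    long count = -1;
    int rc = glob(pattern, flags, NULL, &g);
    if (rc == 0) {
        for (size_t i = 0; i < g.gl_pathc; i++)
            fprintf(out, "%s\n", g.gl_pathv[i]);
        count = (long)g.gl_pathc;
    } else if (rc == GLOB_NOMATCH) {
        count = 0;                  /* empty list, not the pattern itself */
    }
    globfree(&g);
    return count;
}
```

A call such as glob_print("*.scm", true, true, stdout) would correspond
to (glob "*.scm" #t #t) in the proposed signature.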
Another procedure that begs for keyword arguments ;-) Or some other way
to pass named options.
No strong opinion on those particular options; error reporting should
probably be on by default.
> Glibc exposes a lot more flags than Posix, but I don't think any of them
> are killers.
We also need to support musl libc and the like. Do those have glob()?