Re: The case for glob in SRFI 170 Lassi Kortela 11 Dec 2019 19:36 UTC

>     - As mentioned, it's significantly higher-level than the other stuff in
>     this SRFI.
>
> In what sense?  It's glob(3) and much like other things at that level.

In the sense that it adds a lot of semantics on top of syscalls. While
something like read-directory is an abstraction, it's a fairly obvious
mapping of getdirentries(). Glob isn't an obvious mapping of any
syscall. If kernels provided a glob, it would be a different matter.

libc is both a blessing and a curse. It has a lot of goodies but few of
them are implemented optimally for garbage-collected languages, and many
are not even optimal for C. I hope it will be widely deprecated at some
point. That's why I think the syscall API is a better target to aim for.
Of course, many concessions need to be made.

>     - It's easy to implement in portable Scheme on top of a directory
>     walker.
>
> The implementation on top of open/read/closedir is certainly not a
> trivial effort; especially if you want to minimize the number of
> directories you open.  There's no reason not to use the libc
> implementation since it is there.

IMHO if we're going all the way up to this level of abstraction, the
libc glob syntax is no longer the best one to use. We might as well
implement something like bash or zsh globs, and there is no universally
available C library for that.
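
For reference, a minimal sketch of the "glob on top of a directory walker" idea, written in Python rather than Scheme. The names has_magic and simple_glob are hypothetical. It assumes relative patterns, supports only the Posix wildcards *, ? and [...], and does not treat a backslash as an escape. A directory is listed only when a pattern component contains a wildcard; literal components cost one existence check.

```python
import fnmatch
import os


def has_magic(component):
    """True if the component contains any of the Posix wildcards *, ?, [."""
    return any(ch in component for ch in "*?[")


def simple_glob(pattern, root="."):
    """Expand a relative pattern such as 'src/*.c' below root.

    Hypothetical helper: handles only *, ? and [...] and does not
    treat a backslash as an escape character.
    """
    paths = [root]
    for component in pattern.split("/"):
        next_paths = []
        for base in paths:
            if not has_magic(component):
                # Literal component: one existence check, no directory scan.
                candidate = os.path.join(base, component)
                if os.path.lexists(candidate):
                    next_paths.append(candidate)
                continue
            try:
                names = sorted(os.listdir(base))
            except OSError:
                continue  # not a directory, or unreadable: skip quietly
            for name in names:
                # As in the shell, wildcards do not match a leading dot
                # unless the pattern component itself starts with one.
                if name.startswith(".") and not component.startswith("."):
                    continue
                if fnmatch.fnmatchcase(name, component):
                    next_paths.append(os.path.join(base, name))
        paths = next_paths
    return paths
```

Skipping unreadable directories quietly is a choice made for brevity here; a real interface would need a way to report those errors.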

> It's not available in Win32, that's true; but lots of Posix things
> aren't.  Here's a self-described minimal implementation:
> https://github.com/oetiker/rrdtool-1.x/blob/master/win32/win32-glob.c

That is a neat glob implementation. I don't quite understand the
parsing; it hardly seems to do any preprocessing of the pattern string.

Again thinking of the syscall surface, WinAPI DLLs expose almost-direct
equivalents for most of the essential syscalls / syscall combinations.

> Posix 2008 and later specifies *, ?, and [...] only, and that's what we
> should provide too.

Several times I've found Posix globs wanting for real work. Simple
jobs are not too bad to do by manually filtering and merging directory
listings; for complex jobs, Posix globs are not feature-rich enough.
Hence, based on my experience, I'd advocate for something more complex.
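
To illustrate the "manual filtering and merging" approach, here is a small Python sketch. The helper name list_matching is hypothetical. It collects the names matching any of several patterns across several directories, which a single Posix glob cannot express without brace expansion.

```python
import fnmatch
import os


def list_matching(directories, patterns):
    """Return sorted paths in the given directories whose names match any pattern."""
    found = set()
    for directory in directories:
        for name in os.listdir(directory):
            # Keep the entry if at least one Posix-style pattern matches it.
            if any(fnmatch.fnmatchcase(name, p) for p in patterns):
                found.add(os.path.join(directory, name))
    return sorted(found)
```

For example, list_matching(["src", "include"], ["*.c", "*.h"]) would gather C sources and headers from both directories into one sorted list, assuming both directories exist.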

>     - It'd be nice to use S-expression regexps instead of using string
>     regexps and worrying about escaping. Probably would be nice to have the
>     traditional string regexps as well.
>
> The whole point of this function is to trade off performance, certainly
> in scsh, for convenience.  It does what it does.

The main point of S-expression regexps is correctness and composability.
Performance ought to be slightly poorer than with strings unless macros
are used. But again, disk I/O and syscalls probably take more time.
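
A rough illustration of the correctness and composability point, in Python with nested tuples standing in for S-expressions. The notation ("seq", "or", "*", "+", "?", "any") and the function sre_to_string are made up for this sketch and are only loosely modelled on Scheme SRE syntax. Strings are always literal text, so no caller ever has to think about escaping.

```python
import re


def sre_to_string(node):
    """Translate a nested-tuple pattern into Python regexp source.

    Hypothetical mini-notation: a str is literal text; a tuple is
    (operator, operand, ...).
    """
    if isinstance(node, str):
        return re.escape(node)  # literals are escaped in exactly one place
    op, *args = node
    parts = [sre_to_string(arg) for arg in args]
    if op == "any":
        return "."  # matches any single character
    if op == "seq":
        return "(?:" + "".join(parts) + ")"
    if op == "or":
        return "(?:" + "|".join(parts) + ")"
    if op in ("*", "+", "?"):
        return "(?:" + "".join(parts) + ")" + op
    raise ValueError("unknown operator: %r" % (op,))


# Patterns are ordinary data, so they compose without string pasting.
extension = ("or", "c", "h")
stem = ("+", ("any",))
c_source_or_header = ("seq", stem, ".", extension)
```

Here re.fullmatch(sre_to_string(c_source_or_header), "main.c") succeeds, while "mainxc" does not match, because the "." literal was escaped automatically. The translation happens at run time in this sketch, which is the small overhead mentioned above.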

>     errors should probably be on by default.
>
> Makes sense.  Change the argument name to carry-on? then.

https://sd.keepcalms.com/i/keep-calm-and-ignore-warnings.png

>     We also need to support musl libc and the like. Do those have glob()?
>
> Anything that supports Posix 2008 has glob().  In particular both musl
> and newlib have it.

That's good to know.