Re: The case for glob in SRFI 170 Lassi Kortela 11 Dec 2019 19:36 UTC
> - As mentioned, it's significantly higher-level than the other stuff in
> this SRFI.
>
> In what sense?  It's glob(3) and much like other things at that level.

In the sense that it adds a lot of semantics on top of syscalls. While
something like read-directory is an abstraction, it's a fairly obvious
mapping of getdirentries(). Glob isn't an obvious mapping of any
syscall. If there was a glob in kernels, it would be a different matter.

libc is both a blessing and a curse. It has a lot of goodies but few of
them are implemented optimally for garbage-collected languages, and many
are not even optimal for C. I hope it will be widely deprecated at some
point. That's why I think the syscall API is a better target to aim for.
Of course, many concessions need to be made.

> - It's easy to implement in portable Scheme on top of a directory
> walker.
>
> The implementation on top of open/read/closedir is certainly not a
> trivial effort; especially if you want to minimize the number of
> directories you open.  There's no reason not to use the libc
> implementation since it is there.

IMHO if we're going all the way up to this level of abstraction, the
libc glob syntax is no longer the best one to use. We might as well
implement something like bash or zsh globs, and there is no universally
available C library for that.

> It's not available in Win32, that's true; but lots of Posix things
> aren't.  Here's a self-described minimal implementation:
> https://github.com/oetiker/rrdtool-1.x/blob/master/win32/win32-glob.c

That is a neat glob implementation. I don't quite understand the
parsing; it hardly seems to do any preprocessing to the pattern string.

Again thinking of the syscall surface, WinAPI DLLs expose almost-direct
equivalents for most of the essential syscalls / syscall combinations.

> Posix 2008 and later specifies *, ?, and [...] only, and that's what we
> should provide too.

I've several times found the Posix globs wanting for real work.
Simple jobs are not too bad to do by manually filtering and merging
directory listings; for complex jobs, Posix globs are not feature-rich
enough. Hence based on my experience I'd advocate for something more
complex.

> - It'd be nice to use S-expression regexps instead of using string
> regexps and worrying about escaping. Probably would be nice to have the
> traditional string regexps as well.
>
> The whole point of this function is to trade off performance, certainly
> in scsh, for convenience.  It does what it does.

The main point of S-expression regexps is correctness and composability.
Performance ought to be slightly poorer than with strings unless macros
are used. But again, disk I/O and syscalls probably take more time.

> errors should probably be on by default.
>
> Makes sense.  Change the argument name to carry-on? then.

https://sd.keepcalms.com/i/keep-calm-and-ignore-warnings.png

> We also need to support musl libc and the like. Do those have glob()?
>
> Anything that supports Posix 2008 has glob().  In particular both musl
> and newlib have it.

That's good to know.
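
To make the "glob on top of a directory walker" point concrete, here is
a minimal sketch in Python (not Scheme, and not the interface proposed
for SRFI 170). It handles only the Posix *, ? and [...] syntax, one path
component at a time, and opens a directory only when a component
actually contains a wildcard. The names simple_glob, has_magic and
carry_on are hypothetical; carry_on merely borrows the argument name
floated above, so errors are raised unless the caller opts out.

```python
import fnmatch
import os

MAGIC = set("*?[")


def has_magic(component):
    """Return True if the path component contains a glob metacharacter."""
    return any(ch in MAGIC for ch in component)


def simple_glob(root, pattern, carry_on=False):
    """Expand pattern (components separated by '/') relative to root.

    Illustrative only: with carry_on=False, an error while reading a
    directory is raised; with carry_on=True, that directory is skipped.
    """
    candidates = [root]
    for component in pattern.split("/"):
        matched = []
        for base in candidates:
            if not has_magic(component):
                # Literal component: no directory listing is needed.
                path = os.path.join(base, component)
                if os.path.lexists(path):
                    matched.append(path)
                continue
            if not os.path.isdir(base):
                # A regular file cannot contain further components.
                continue
            try:
                names = os.listdir(base)
            except OSError:
                if carry_on:
                    continue
                raise
            for name in sorted(names):
                # As in the shell, wildcards do not match a leading dot
                # unless the pattern component itself starts with one.
                if name.startswith(".") and not component.startswith("."):
                    continue
                if fnmatch.fnmatchcase(name, component):
                    matched.append(os.path.join(base, name))
        candidates = matched
    return candidates
```

For example, simple_glob("/some/project", "src/*/*.scm") lists only the
src directory and its subdirectories; the literal component "src" is
resolved without opening the root directory at all.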