The only thing is, the implementation is going to make FFI calls to
readdir() or getdirentries() in a loop. Going back and forth between the
FFI and the user-supplied fold function might be about as costly as just
returning a list/vector of everything and filtering it after the fact.
There's no getting away from that. If performance matters, use a compiler like Chicken, which compiles Scheme to C and so can integrate FFI calls directly.
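For anyone who wants to measure rather than guess, here is a rough sketch of the two shapes being compared, written in Python purely as an illustration and not as the proposed Scheme API. The names fold_directory, list_then_filter and demo are made up for this example. os.scandir stands in for the readdir() loop, and the callback passed to fold_directory stands in for the user-supplied fold function.

```python
import os
import tempfile


def fold_directory(path, fn, seed):
    """Fold fn over the entries of path, calling back once per entry.

    Hypothetical helper: each iteration crosses from the native
    directory-reading code into the user-supplied function.
    """
    acc = seed
    with os.scandir(path) as entries:
        for entry in entries:
            acc = fn(entry.name, acc)
    return acc


def list_then_filter(path, keep):
    """Fetch every entry name in one call, then filter after the fact."""
    return [name for name in os.listdir(path) if keep(name)]


def demo():
    """Run both approaches on a scratch directory and return both results."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.txt", "b.txt", "c.log"):
            open(os.path.join(tmp, name), "w").close()

        def is_txt(name):
            return name.endswith(".txt")

        def collect_txt(name, acc):
            # Fold step: keep the name only if it passes the filter.
            return acc + [name] if is_txt(name) else acc

        folded = fold_directory(tmp, collect_txt, [])
        filtered = list_then_filter(tmp, is_txt)
        # Directory order is unspecified, so sort before comparing.
        return sorted(folded), sorted(filtered)
```

Both calls produce the same set of names; the only difference is whether the user function runs between directory reads or after all of them, which is the cost in question.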
"Never worry about how long it takes to initialize matrices. If the matrices are small, the initialization time is trivial; if the matrices are large, the initialization time is still trivial relative to the amount of time spent manipulating the matrices." --The Elements of Programming Style (paraphrased from memory)