Hello Lassi,

Good idea for a SRFI! I have some comments, listed in order from
practical to esoteric.

* (os-command-line) is defined to return strings. I suggest returning
the OS's native data as bytevectors.
Linux file systems don't require valid UTF-8 and Windows doesn't
require valid UTF-16 (https://simonsapin.github.io/wtf-8/). We can
already get the friendly strings from (command-line), but don't have
any way to get the bytevectors. Having the command line in bytevector
format is one prerequisite for being able to open filenames that are
not valid UTF-8/UTF-16. More APIs would need to handle bytevectors,
but it's a start.
* The (command-line) procedure has converted the arguments to strings,
which is fine. But what happens to invalid bytes? Are they replaced
with the replacement character (U+FFFD)?
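
To make the concern in the two points above concrete, here is a small
sketch in Python (chosen only for illustration; it is not Scheme and
not part of the SRFI). The argument value is hypothetical. It shows
that replacing invalid bytes with U+FFFD loses the original name,
while keeping the bytes (or a reversible escape) does not.

```python
# Hypothetical raw argument as the OS might deliver it: "café.txt" encoded
# in Latin-1, which is not valid UTF-8 because of the lone 0xE9 byte.
raw_arg = b"caf\xe9.txt"

# Lossy conversion: the invalid byte becomes U+FFFD, so the original
# file name can no longer be reconstructed from the string.
lossy = raw_arg.decode("utf-8", errors="replace")

# Reversible conversion: Python's "surrogateescape" handler smuggles the
# invalid byte through the string, so encoding gives the bytes back.
reversible = raw_arg.decode("utf-8", errors="surrogateescape")
round_tripped = reversible.encode("utf-8", errors="surrogateescape")

print(repr(lossy))
print(round_tripped == raw_arg)
```

A bytevector-returning (os-command-line) would correspond to handing
the program raw_arg directly and leaving the choice of decoding to it.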
* You mention that argv[0] is not reliable, but even worse is that
execve() will let you pass NULL in argv[0]. The consequence would be
(os-command-line) => (). A lot of C programs out there segfault when
you start them that way. Do we want to keep trapping programs into
writing such bugs or should argv[0]==NULL give (os-command-line) =>
(#vu8())? (Assuming bytevectors are used, of course).
* I concur with Sebastien Marie and believe that (os-executable-file) is
probably not very usable in practice. Finding the first argument given
to execve(), even in the face of a brutal parent process, is not very
useful to the program author. Programs that need to load other files
from the file system tend to incorporate their paths into the binary
during a configuration step before compilation. If a program wants to
know the name of its own executable it can simply build that string
into the binary. If a program wants to have different personalities
based on how it was started (e.g. Chez Scheme's scheme-script
binary) then argv[0] is all it needs, because it can assume a
friendly parent process. The parent process can always do something to
mess up the execution of the child anyway.
* readlink("/proc/self/exe") on Linux is not 100% reliable. If the
binary is deleted then the symlink points to e.g. "/bin/bash
(deleted)". Programs can also be executed from a memfd and the symlink
then says "/memfd: (deleted)". It is also not certain that /proc is
mounted.
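
As a rough sketch of what a caller would have to do to cope with these
cases (Python for illustration only; the function names are
hypothetical), the result of the readlink has to be treated as
untrusted: the link may be missing, and a " (deleted)" suffix makes
the answer ambiguous.

```python
import os

DELETED_SUFFIX = " (deleted)"


def classify_exe_target(target):
    """Return (path, reliable) for a /proc/self/exe link target."""
    if target.endswith(DELETED_SUFFIX):
        # Either the binary was unlinked (or is a memfd), or the file name
        # really ends in " (deleted)". The two cases cannot be told apart
        # from the string alone, so report the result as unreliable.
        return target[: -len(DELETED_SUFFIX)], False
    return target, True


def executable_path_from_proc(link="/proc/self/exe"):
    """Return (path, reliable), or (None, False) if the link is unreadable."""
    try:
        target = os.readlink(link)
    except OSError:
        # /proc is not mounted, or this is not Linux.
        return None, False
    return classify_exe_target(target)
```

Even the "reliable" result only says that the string looked normal;
the file may have been replaced since the program was started.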
* The ELF auxiliary vector has the executable filename.
I checked "info auxv" in gdb on FreeBSD, NetBSD and Linux (respectively):
  15    AT_EXECPATH      Executable path                          0x7fffffffefd8 "/bin/ls"
  2014  AT_SUN_EXECNAME  Canonicalized file name given to execve  0x7f7fffcb74e0 "/bin/ls"
  31    AT_EXECFN        File name of executable                  0x7fffffffeff0 "/bin/ls"
I think only NetBSD canonicalizes it. OpenBSD omits this useful information.
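
For reference, a minimal sketch of how the auxiliary vector can be
decoded (Python for illustration only; parse_auxv is a hypothetical
helper). It assumes the vector is available as raw bytes, as Linux
exposes it in /proc/self/auxv, and that it is a sequence of
(type, value) pairs of native words terminated by AT_NULL. The tag
numbers are the ones from the gdb output above.

```python
import struct

AT_NULL = 0
AT_EXECPATH = 15        # FreeBSD
AT_EXECFN = 31          # Linux
AT_SUN_EXECNAME = 2014  # NetBSD


def parse_auxv(data, word_size=8, byteorder="little"):
    """Parse raw auxv bytes into a {type: value} dict, stopping at AT_NULL."""
    word = {4: "I", 8: "Q"}[word_size]
    fmt = ("<" if byteorder == "little" else ">") + word * 2
    pair_size = struct.calcsize(fmt)
    # Ignore any trailing partial entry so iter_unpack does not raise.
    usable = data[: len(data) - len(data) % pair_size]
    entries = {}
    for a_type, a_val in struct.iter_unpack(fmt, usable):
        if a_type == AT_NULL:
            break
        entries[a_type] = a_val
    return entries
```

Note that the value stored for these tags is an address in the
process's own memory, not the string itself, so a second step is
needed to read the file name (in C on Linux, getauxval(AT_EXECFN)
returns that pointer directly).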
Regards,
--
Göran Weinholt | https://weinholt.se/
Debian developer | 73 de SA6CJK