Provide two versions of os-executable-file Sebastien Marie (20 Apr 2020 11:10 UTC)
Re: Provide two versions of os-executable-file Lassi Kortela (21 Apr 2020 21:11 UTC)
Re: Provide two versions of os-executable-file John Cowan (23 Apr 2020 02:15 UTC)

Thank you for the detailed comments (and welcome to the SRFI lists in case you are new here :)

> First, I think the srfi covers relatively well the problem with os-executable-file
> about "how returning absolute pathname pointing to the executable file running
> the Scheme program" on differents OS.
>
> But I would add some elements.
>
> First, why a programmer would want to use such function, and which would not be
> possible with argv[0] ?

argv[0] is often just the basename of the command, which means you'd have to replicate a PATH search. That's complex and unreliable.

> Usually it is because it needs accuracy and somehow secure information (to not
> rely on user provided argv[0]).

The OS-dependent executable filename can be expected to be more consistently accurate than the basename (since the basename can be in many formats, whereas the executable filename is in one format per OS). I wouldn't consider the executable filename safe in security-sensitive contexts (as Göran detailed in his mail) - that was never the goal. The SRFI should emphasize that more clearly; I'll edit the prose.

> From my experience, it could be to simply re-execute the program (a way to
> "reload", with eventually new parameters, for example), or to execute a subpart
> of the program as separated system processus (an alternative to simply calling
> fork(2) to take advantage of freshness provided by OS - see fork+exec inside
> https://www.openbsd.org/innovations.html for details)
>
> or to open the file to read it, for example to parse ELF and get debug symbols
> name to provide fancy backtraces.

Yes, that's one possible use for it.

> In both cases, the accuracy of the result is very important, else it could
> introduce subtile security issues: if at time of use, the pathname points to
> something else (file removed and replaced by different program) the program will
> execute unexpected code, or it will read and try to parse unexpected data.

You're exactly right.
However, this security problem exists no matter which filename you execute: even /bin/ls can be replaced by a malicious program if an attacker has root access. fexecve() is probably a little safer. <https://pubs.opengroup.org/onlinepubs/9699919799/functions/fexecve.html>

SRFI 193 doesn't have a provision to open a file handle to the running executable, and I'm not sure how portable such a feature would be.

Perhaps a future SRFI should collect Unix security features (fexecve(), issetugid(), pledge(), etc.). SRFI 170 is still open, but the security APIs are less fundamental than most stuff in it and security is more of a moving target.

> The SRFI-169 shows OS dependent methods to retreive the pathname information,
> and I think it is the responsability to the OS to provide accurate information
> (it is why OpenBSD doesn't provide it).
>
> But I would add that (os-executable-file) should provide such information
> *as-it*, and only provide an uniform way to access to the information.
> Particulary, it should not try to resolve the path provided by the system. Else
> it will introduce a inherent TOCTOU.

Very good point - thank you for making it :) I'll tone down the prose so that the OS APIs are not advertised as "reliable" like they are in the current draft.

The point of resolving the pathname to an absolute one is that the program can chdir() later, which would make the relative path wrong. But your approach may be better.

> In order to illustrate my concern, I will take the Linux example. The kernel
> provides a pathname "/proc/self/exe" which points to the right file at anytime
> (the kernel itself deals with file removing or renaming, if I recall correctly).
>
> If (os-executable-file) returns the result of readlink("/proc/self/exe"), at
> soon the function returns, the result could be already wrong. Someone could
> remove the file and replace it with something else, in the time between
> (os-executable-file) returns and the effective use of the function result. Only
> the kernel itself could provide the required atomicity.
>
> But now, if (os-executable-file) returns only "/proc/self/exe", the information
> is valid at anytime for any use. The program could use it to re-execute (I
> assume, I don't have checked in depth) or to read the file. The kernel itself
> will ensure the operation (execve(2) or open(2)) to operate on the right file.
>
> So, as long the function returns a path which isn't generic, it is already
> flawed for any real usage. It is why I think (os-executable-file) shouldn't
> return a resolved path.

Aha, you're thinking of returning a path to a file/symlink from which the real path can be read. That's a different concern still. The trouble is that there are many non-procfs based approaches (sysctl and custom C APIs as listed in the draft). I'm not sure what those APIs do in case the executable is moved from the old path (i.e. whether they update the internal information to point to the new path). I'd guess they don't all update it.

> For this reason I would introduce two differents functions:
> - one to retreive "accurate" pathname (generic one) or #f
>
> - another to retreive a "supposed right" pathname, which could be always
> implemented (even on OpenBSD) by duplicating the actions of the shell in
> searching for the executable file (using PATH environment, or confstr(_CS_PATH)
> [posix function to retreive default PATH] and iterating on the directories to
> find the program name or argv[0])

IMHO the path-search-for-self sounds a bit hacky to have in a SRFI. However, a path-search-for-arbitrary-command procedure would probably be useful for many things, and people could trivially combine it with (car (os-command-line)) to find self.

There's a plan to write a process spawning SRFI as a continuation of SRFI 170. (Subprocesses were left out of 170 since 170 was already so big and we wouldn't have time to do justice to the many details.) Perhaps the process SRFI should include a path search procedure.

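As a rough illustration of what such a path search procedure might do, here is a minimal sketch in Python (not Scheme, and not proposed SRFI text). The names find_in_path and _is_executable are made up for this example. It assumes the usual shell convention: a command containing a slash is used as given rather than searched for, and an empty PATH entry means the current directory. The result is only a best guess; the file can still be replaced between the search and its use, as discussed above.

```python
import os


def _is_executable(filename):
    # Hypothetical helper: a regular file that the current user may execute.
    return os.path.isfile(filename) and os.access(filename, os.X_OK)


def find_in_path(command, path=None):
    """Return the first executable match for `command`, or None.

    `path` is a search string in PATH format; when omitted, the PATH
    environment variable is used, falling back to os.defpath.
    """
    # A name containing a directory separator is not searched for.
    if os.sep in command:
        return command if _is_executable(command) else None
    if path is None:
        path = os.environ.get("PATH", os.defpath)
    for directory in path.split(os.pathsep):
        # An empty entry conventionally stands for the current directory.
        candidate = os.path.join(directory or os.curdir, command)
        if _is_executable(candidate):
            return candidate
    return None
```

In Scheme terms, the "supposed right" lookup for self would amount to calling the equivalent of this procedure on (car (os-command-line)). Python's standard library already offers shutil.which for the same job, which suggests the operation is useful enough to standardize on its own.
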
A procedure that promises to get an accurate pathname is a bit problematic as well - we should find out which OSes give an accurate pathname and Scheme implementations should hardcode that knowledge. But what if the internals of those operating systems are later changed so that the information they return is no longer as accurate?

In light of these thoughts, it would probably be best to have a "here is the raw executable filename from the OS - take it or leave it" procedure. The `os-executable-file` in the current draft is like that, but you are right that it shouldn't be advertised as reliable.

> From the SRFI-169 list of OS, only few OS will be able to provide the "accurate"
> version of (os-executable-file). But all will be able to provide the
> "supposed-right" version.
>
> And by providing two versions, it makes the developer aware about the fact that
> (os-executable-file/assumed-right) could return a pathname to something else
> that the executable itself, and so should not be used without care.
>
> Thanks.

Wonderful comments all - thank you for making them.