Re: Bytevectors instead of strings in SRFI 170
Marc Nieper-Wißkirchen 03 Aug 2020 20:17 UTC
Would using strings assume that the underlying character encoding of
the OS is UTF-8? Can we assume this in 2020? Or do we have to convert
from whatever local encoding to Unicode?
Strings instead of bytevectors make some sense because the basic
R7RS-small procedures dealing with file names all take (Unicode)
strings as arguments.
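
To make the conversion question concrete, here is a minimal sketch of one way a string-based API can cope with a file name that is not valid UTF-8: the "surrogateescape" error handler that Python (the language cited in the quoted message below) uses for OS data. The file name is made up for illustration, and nothing here is proposed SRFI 170 behaviour.

```python
# Hypothetical file name encoded in Latin-1; the byte 0xE9 is not valid UTF-8 here.
raw_name = b"caf\xe9.txt"

# Strict UTF-8 decoding rejects the name outright.
try:
    raw_name.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# "surrogateescape" maps each undecodable byte to a lone surrogate code point,
# so the resulting string can be encoded back to exactly the same bytes.
as_string = raw_name.decode("utf-8", "surrogateescape")
round_trip = as_string.encode("utf-8", "surrogateescape")

print(strict_ok)                # strict decoding failed
print(ascii(as_string))         # escaped form; the raw string is not printable as UTF-8
print(round_trip == raw_name)   # the original bytes are recovered
```

The price of this approach is that such strings contain lone surrogates and are not well-formed Unicode, which is arguably part of the case for accepting bytevectors directly.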
On Sun, 2 Aug 2020 at 21:30, Lassi Kortela <xxxxxx@lassi.io> wrote:
>
> The following procedures take and return strings:
>
> (read-symlink fname)
> (directory-files dir [dotfiles?])
> (make-directory-files-generator dir [dotfiles?])
> (open-directory dir [dot-files?])
> (real-path path)
> (create-temp-file [prefix])
> (call-with-temporary-filename maker [prefix])
> (user-info uid/name)
> (group-info gid/name)
>
> I think we should define them so that when bytevectors are given in
> place of strings as arguments, strings will be returned as bytevectors
> as well. This is what the Python OS APIs do nowadays: for example,
> compare os.listdir("/") and os.listdir(b"/"). IIRC they started with
> strings only but ran into trouble and had to add the bytes support.
>
> In SRFI 170 (current-directory) is idiosyncratic in that it returns a
> string but doesn't take any string argument that we could decide whether
> to return a string or a bytevector. Also, Schemes like Gambit and Kawa
> have a per-thread current directory which I presume is a string.
>
> I'll provide an `os-current-directory-as-bytevector` procedure in the
> upcoming SRFI
> <https://misc.lassi.io/2020/srfi-submit/process-state-as-bytevectors.html>.
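
For reference, the Python behaviour mentioned in the quoted message can be checked with a short sketch like the following. It uses a temporary directory rather than "/" so that the result is predictable; the file name `example.txt` is made up for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a single empty file so the listing has known contents.
    open(os.path.join(tmp, "example.txt"), "w").close()

    # A str argument yields str names.
    str_names = os.listdir(tmp)

    # A bytes argument yields bytes names; os.fsencode converts the path
    # using the file system encoding.
    bytes_names = os.listdir(os.fsencode(tmp))

print(str_names)
print(bytes_names)
```

The argument type alone selects the result type, which is the convention proposed above for the SRFI 170 procedures that take a file name.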