Re: remaining issues - Simplelists

Re: remaining issues Derick Eddington 14 Mar 2010 19:21 UTC

On Sat, 2010-03-13 at 23:12 -0500, R. Kent Dybvig wrote:

> > In conclusion, I don't see why considering library names as
> > representations of contemporary file names is proper, and I don't see
> > why it outweighs the importance of abstract unrestricted library names.
>
> All of what you say follows from your perspective but is uncompelling to
> me from my perspective.  So perhaps we just have to agree to disagree.

Agreed.

(I'd like to make a point I didn't make earlier.  With file-name
extensions automatically added, you can't name every possible file via
library names, because you can't get a file name without a "." and
extension.
Perhaps this could be solved by interpreting an empty-string extension
as meaning "don't add an extension", but that might conflict with the
interpretation of an empty-string extension as meaning "include
system-default extensions", unless you add more special rules about
where the empty-string extension occurs.)

> But in your second-to-last paragraph:
>
> > If you want to override the SRFI to catch such library names, you can
> > also map them to a file name without the SRFI's encoding (maybe just the
> > "absolute path" first symbol).  Same for ~.
>
> lies a possible compromise I might be able to justify implementing:
> require a system to look first for the unencoded version of the
> constructed pathname (ignoring search-path prefixes if the pathname is
> absolute in the host filesystem) and, if that fails, then for the encoded
> version.  For example, say the list of library directories includes "lib1"
> and "lib2" and the list of library extensions includes only "sls".  Then
> the system looks for (/ foo) first in /foo.sls, then in lib1/%2f%/foo.sls,
> then in lib2/%2f%/foo.sls.  It looks for (srfi :1) first in
> lib1/srfi/:1.sls, then in lib2/srfi/:1.sls, then in lib1/srfi/%3a%1.sls,
> then in lib2/srfi/%3a%1.sls.  This should allow one to name any file that
> exists in the filesystem and to specify absolute pathnames for convenience
> or security, without inhibiting the sharing of filenames with funny
> characters via the %scalar-value% encoding.  Obviously, a system could
> choose not to bother trying the unencoded version of the path name if it
> is clearly not valid for the underlying filesystem.

I think such unencoded-then-encoded lookup would work.  I'm going to
explore adopting it, and I'll report back here about that.
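
A minimal sketch of the lookup order described in the quoted proposal,
to make the candidate sequence concrete.  The function names are
hypothetical, the set of encoded characters is an assumption (only "/"
and ":" appear in the examples above), and the ordering across multiple
extensions is a guess, since the example uses only "sls".

```python
import posixpath

# Assumption: only these characters get the %scalar-value% encoding.
ENCODED_CHARS = {"/", ":"}


def encode_component(component):
    """Encode each listed character as %<hex scalar value>%."""
    return "".join(
        "%{:x}%".format(ord(ch)) if ch in ENCODED_CHARS else ch
        for ch in component
    )


def candidate_paths(library_name, directories, extensions):
    """Return candidate file names, unencoded candidates first."""
    # Note: posixpath.join restarts at any component beginning with "/".
    raw = posixpath.join(*library_name)
    encoded = posixpath.join(*(encode_component(c) for c in library_name))
    candidates = []
    for ext in extensions:
        if posixpath.isabs(raw):
            # An absolute unencoded path ignores the search directories.
            candidates.append(raw + "." + ext)
        else:
            candidates.extend(
                posixpath.join(d, raw) + "." + ext for d in directories
            )
    if encoded != raw:
        # Fall back to the encoded form only when it actually differs.
        for ext in extensions:
            candidates.extend(
                posixpath.join(d, encoded) + "." + ext for d in directories
            )
    return candidates
```

With directories ["lib1", "lib2"] and extensions ["sls"], the name
("/", "foo") yields /foo.sls, lib1/%2f%/foo.sls, lib2/%2f%/foo.sls, and
("srfi", ":1") yields lib1/srfi/:1.sls, lib2/srfi/:1.sls,
lib1/srfi/%3a%1.sls, lib2/srfi/%3a%1.sls, matching the order in the
quoted example.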

However, it breaks mapping library-file names to library names, which
has been one of my original goals for this SRFI, because, e.g.,
acme/a%3A%b.ext might be (acme a%3A%b) or (acme a:b), and
foo/bar/zab.ext might be (foo/bar/zab), (foo bar/zab), (foo/bar zab), or
(foo bar zab).  Being able to programmatically manage/analyze
collections of library files (as far as what's possible from knowing
only library names) from only file-name listings has been important to
me.
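
To illustrate the ambiguity, here is a rough sketch that enumerates the
library names a relative file name (without its extension) could stand
for under unencoded-then-encoded lookup.  The function names are
hypothetical, and the decoding rule (any %hex% run is a scalar value) is
an assumption about the encoding, not taken from the draft.

```python
import re
from itertools import product


def decode_component(segment):
    """Decode every %<hex>% run back to the character it names."""
    return re.sub(
        r"%([0-9a-fA-F]+)%",
        lambda m: chr(int(m.group(1), 16)),
        segment,
    )


def possible_names(stem):
    """Return the set of library names that could map to this stem."""
    segments = stem.split("/")
    names = set()
    # Unencoded reading: each "/" is either a component boundary or a
    # literal slash inside a symbol.
    for choices in product((True, False), repeat=len(segments) - 1):
        components = [segments[0]]
        for is_boundary, segment in zip(choices, segments[1:]):
            if is_boundary:
                components.append(segment)
            else:
                components[-1] += "/" + segment
        names.add(tuple(components))
    # Encoded reading: every "/" is a boundary and segments are decoded.
    names.add(tuple(decode_component(s) for s in segments))
    return names
```

For "foo/bar/zab" this gives exactly the four names listed above.  For
"acme/a%3A%b" it gives (acme a%3A%b) and (acme a:b), plus, under this
sketch's assumptions, the literal-slash reading (acme/a%3A%b).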

I'm not sure it's the SRFI's place to involve special handling of
(~ ---) or (/ ---), (c:/ ---), etc.

> Incidentally, is there a reason to choose "r6rs-lib" as the extension for
> R6RS libraries rather than the shorter "sls" as recommended in the R6RS
> non-normative appendices?  We selected sls because it did not (to our
> knowledge) conflict with existing extensions for Scheme source code, so I
> assume that's not your reason.  If there is no particular reason why you
> chose r6rs-lib, please change it to sls.

I can't find "sls" in the Non-Normative Appendices nor any of the other
R6RS documents.

I do have particular reasons for choosing "r6rs-lib".

"sls" is currently being used for single-library and multiple-library
files.  If "sls" (or any extension) is used for different types/formats,
we can't programmatically process files based on their name extensions
because they could be something unknown.  "r6rs-lib" says precisely what
a file is: one R6RS library.  "thing-libs" means a file containing
multiple libraries of dialect Thing.

What if a Scheme system wants to support multiple types/formats of
library files and use the extension to decide how to handle each one?  What if
multiple implementations of a library for multiple Scheme dialects (i.e.
the files have the same name modulo extension) should be in the same
directory?  They're all "Scheme library sources" (i.e. "sls"), but they
can't all have the same extension, so why should this SRFI's format for
R6RS library files get "sls"?

For only the tiny price of 5 more characters, a number of benefits are
gained.  If actually understanding the type/format of files requires my
human analysis or analyzing the file contents, I don't see the point of
having barely-useful wannabe types jury-rigged into file names.  I see
file-name extensions as global names which need to be precisely
unambiguous, not as local names which are okay to be super truncated and
rely on context for disambiguation.  (I wish prevalent file systems had
metadata outside the file name and contents and which could be used for
file typing (and other purposes), maybe in the future...)

I tried to succinctly address all this in the new section "Dialects,
Formats, and Extensions" in the current draft.

> Also, Chez Scheme treats a trailing separator character (":" under
> Unix-based systems, ";" under Windows) in its variants of SCHEME_LIB_PATH
> and SCHEME_LIB_EXTENSIONS as an indication that the system should look in
> the system-specific libraries/extensions if the library isn't found in
> the user-specified set.  (This mirrors similar behavior for the
> LD_LIBRARY_PATH variable used by some Unix dynamic loaders.)  Perhaps SRFI
> 103 should do the same.  For example, if SCHEME_LIB_PATH is set to
> "foo:bar" on a Unix-based system, the system should look in foo and bar
> only, but if SCHEME_LIB_PATH is set to "foo:bar:", the system should look
> in foo, bar, and any system-specific library directories.  It is useful to
> prevent the system from looking in system-specific libraries if you want
> to make sure you know exactly where each library is coming from, and it is
> useful for user directories to be searched first if you have a library
> you want used in preference to one shipped with the implementation.

I know about that, and I considered whether the SRFI should involve it.
I decided against it because it's system-specific.  The purpose of the
SRFI's environment variables is to tell what directories/extensions to
use portably across arbitrary multiple systems, and some systems may not
have the notion of system-default directories/extensions, so something
which has system-specific effects doesn't seem appropriate to involve.
What should it mean to a system which doesn't have the notion?

However, I intended that systems still be free to interpret a trailing
separator character as you described above.  The current draft says the
environment variables' values are "a string containing a sequence of
[things] separated by [a separator]" and it says "Scheme systems may
initialize the [abstract sequence of things] to include additional
[things]".  Since an empty-string directory-name/extension is otherwise
improper, systems may interpret a trailing empty-string thing as meaning
something special.  If users know that all the systems they're using
interpret it as you described above, it'll work for them; but if it
doesn't make sense for some system, then it doesn't make sense to do it
in the for-portable-cross-system-use environment variables.  I've been
imagining that systems will still have their own environment variables
for system-specific configuration.
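
As a rough sketch of the trailing-separator interpretation discussed
above, a system that does have system-default directories might parse
the variable's value like this.  The function name, its parameters, and
the choice to silently drop other empty entries are assumptions for
illustration, not something the draft specifies.

```python
def parse_search_path(value, separator=":", system_defaults=()):
    """Split a search-path string into a list of directories.

    A trailing separator means "also search the system defaults".
    """
    use_defaults = value.endswith(separator)
    # Assumption: empty entries elsewhere are improper and are skipped.
    entries = [entry for entry in value.split(separator) if entry]
    if use_defaults:
        entries.extend(system_defaults)
    return entries
```

With system_defaults=["/usr/lib/scheme"] (a made-up directory),
"foo:bar" gives ["foo", "bar"], and "foo:bar:" gives
["foo", "bar", "/usr/lib/scheme"].  A system without the notion of
defaults would simply pass no system_defaults, so the trailing
separator has no effect there.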

--
: Derick