Re: five problems with this draft SRFI

Re: five problems with this draft SRFI Derick Eddington 26 Sep 2009 15:42 UTC
I haven't yet read the newer posts since the one responded to below,
which might influence my views, and I could probably spend more time
editing what I've written below, but I think it's good enough to give an
accurate conveyance of the reasons I've had for the decisions I've made
which shaped the current draft.

On Fri, 2009-09-25 at 21:20 -0400, William D Clinger wrote:

> I haven't had time to read this draft SRFI carefully,
> but I want to report several problems that I perceived
> during my first quick reading:
>
> 1. Treatment of versioning.
> 2. Ignoring all contents after the first datum.
> 3. Failure to specify which characters are encoded.
> 4. Specification of ordering but not matching.
> 5. Implicit file names.

[------------------------------------------------------------------------------]

> 1. Treatment of versioning.
>
> The R6RS does not specify any portable semantics for
> versions, which was indubitably silly but had the
> virtue of allowing implementations to ignore versions
> altogether.

I wish R6RS's versioning were more clear also, but my current
perspective makes me understand why it wasn't specified in more detail.
One of my intents for this SRFI is to try to make some progress towards
supporting versioning.  Given the complications involved (IIUC), which
implementations need to handle in differing ways appropriate to their
particular natures and ideals, SRFI 103 wants to support both people who
want versioning and people who don't.

> Implementations that support this draft SRFI would not
> be able to ignore versions altogether.  They would be
> required to implement *some* semantics for versions.

That's true, and I think if enough people want versioning in library
names and references, then it's worth requiring implementations which
want to support portability to at least understand a file naming scheme
which involves versions and to at least prevent libraries which do not
meet version constraints from being used.  My intent for this SRFI is to
require only two things of implementations.  First, for the
first-processed import clause of a library, load a library which
matches that clause's version reference (with matching extended in the
one additional way this SRFI describes), whatever version it may be.
Second, fail if other import clauses reference the same library with
version constraints which are not compatible with it.  If a
subsequently-processed (in whatever implementation-dependent order)
import clause of the same library has a version reference which the
already-chosen version does not satisfy, then implementations are free
to just immediately fail and say "incompatible version references for
library (foo bar) ...".  However, if whatever implementation-specific
loading order was used did not fail, then the versions used do meet all
the given version constraints and so are supposedly okay according to
the given version references.
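
As a rough illustration of that minimum, here is a Python sketch.  The
names `resolve`, `IncompatibleVersions`, and `available`, and the
modelling of a version constraint as a predicate over a tuple of
integers, are assumptions made for this example; they are not taken from
the draft.  The candidate versions are assumed to be already sorted by
whatever precedence the implementation uses, and the draft's one
additional matching rule is not modelled.

```python
# Hypothetical model: a version is a tuple of ints, and a version
# constraint is any predicate over such a tuple.

class IncompatibleVersions(Exception):
    """Raised when a chosen or available version violates a constraint."""


def resolve(imports, available):
    """Choose one version per library, failing on later conflicts.

    imports:   list of (library_name, constraint) in processing order.
    available: dict mapping library_name to candidate versions, assumed
               to be sorted in the implementation's precedence order.
    """
    chosen = {}
    for name, accepts in imports:
        if name in chosen:
            # A version was already chosen: only check it, never re-choose.
            if not accepts(chosen[name]):
                raise IncompatibleVersions(
                    f"incompatible version references for library {name}")
            continue
        for version in available.get(name, []):
            if accepts(version):
                chosen[name] = version
                break
        else:
            raise IncompatibleVersions(
                f"no matching version of library {name}")
    return chosen


available = {("foo", "bar"): [(2, 0), (1, 3)]}
at_least_1 = lambda v: v >= (1,)      # roughly "version 1 or later"
major_1 = lambda v: v[:1] == (1,)     # roughly "major version exactly 1"

try:
    # The first clause picks (2, 0); the second clause then rejects it.
    resolve([(("foo", "bar"), at_least_1), (("foo", "bar"), major_1)],
            available)
except IncompatibleVersions as err:
    print(err)
```

Note that this sketch fails even though version (1 3) would satisfy both
clauses; it never backtracks.  That is the "free to just immediately
fail" behaviour, and an implementation which likes versioning more could
do better.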

Having at least this degree of portable version handling allows the file
naming scheme to be portable for users on opposite sides of this issue,
and it ensures programs will not run with versions of libraries which do
not match the version constraints.

Implementations are free to just bail and say "My implementors don't
like versioning so I won't do anything more to help you deal with using
all these libraries with their version reference constraints".  And
other implementations are free to provide additional
external-from-library-references mechanisms of specifying which exact
version of a library to use to deal with the inherent problem of how to
satisfy all the combined version constraints.  My current understanding
is that this is as good as it can get if we simultaneously want to
support portability for people who want versioning and to support
implementors who don't want to deal with the complexities of versioning.

I agree it's not a beautifully simple nature, but if people want
enforced versioning in library references, and I can understand why,
then why can't we go so far as to at least require that implementations
understand a file naming scheme supporting versions (which I think is
simple) and at least honor the version reference constraints by not
using a library which violates them (which I think is relatively
simple)?

> In particular, they would be required to implement a
> semantics that is arguably inconsistent with the
> intent of the R6RS editors.

What do you mean exactly?  I don't know all the details of the history.
Do you mean that their intent was to allow implementations to be
unsupporting and oblivious to R6RS's versioning?  If some people want
versioning, how is being unsupporting and oblivious to their versioned
libraries helping portability?

> Yet this draft SRFI still
> stops short of requiring any portable semantics for
> versions.

It intends to support portability to the extent of requiring that
libraries which do not match all the combined version constraints will
not be used.  How incompatible imports arise and how they are dealt with
is implementation-specific.

> That sounds like the worst of all possible worlds.

To me, it sounds better than excluding the people who want to use
versioning.

I have a few libraries in whose names I've put versions, and which I
reference with version constraints, because not all the versions of the
libraries, created and originally distributed by others, will work.  I
want to make clear that I can see why some people desire version
constraints to be in import clauses' library references and why some
people desire being able to have multiple versions available for use by
different programs which require different versions.

> In addition, the inclusion of versions changes the
> mapping from library names from a one-to-many mapping
> to a one-to-infinite mapping.  I understand that it's
> implementable, but it changes the nature of the beast.

If you consider the version to be part of a library name and require
that library file paths exactly represent the library name, and
therefore require the files for libraries with versions to have the
versions in their file names, then it is not a one-to-infinite mapping;
it is a one-to-many mapping just as for libraries without versions.

> In particular, it would be harder for humans to map
> library names to file names, which goes against one
> of the alleged goals of this draft SRFI;

Why is that?  A library name with a version maps to a file name with the
version just as easily as a library without a version maps to a file
name without a version.
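
To make that concrete, here is a small Python sketch of such a mapping
and its inverse.  The path syntax used (name symbols joined by "/", then
".N" for each version number) is a hypothetical stand-in chosen for the
example and may differ from the draft's actual syntax.  The functions
`library_to_path` and `path_to_library` are likewise invented here, and
symbols are assumed to contain no "/" or "." so that no encoding is
needed.

```python
# Hypothetical naming scheme, for illustration only; the real SRFI 103
# path syntax may differ.  Assumes library-name symbols contain no "/"
# or "." and therefore need no encoding.

def library_to_path(symbols, version=()):
    """Map a library name plus version to a relative file path stem."""
    stem = "/".join(symbols)
    return stem + "".join(f".{n}" for n in version)


def path_to_library(path):
    """Invert library_to_path: recover (symbols, version) from the stem."""
    *dirs, last = path.split("/")
    symbol, *numbers = last.split(".")
    return (*dirs, symbol), tuple(int(n) for n in numbers)


for name, version in [(("foo", "bar"), ()), (("foo", "bar"), (1, 2))]:
    path = library_to_path(name, version)
    # The round trip recovers exactly the name and version we started with.
    print(path, path_to_library(path) == (name, version))
```

Under these assumptions a versioned name is no harder to map than an
unversioned one: the version is just more of the name.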

> in
> particular, it goes against the alleged rationale
> for discarding all but the first library in a file
> (see below).

I don't understand why this is, even after analyzing your below
comments.  My rationale for single-library files is primarily to be able
to have one-to-one mapping of library file paths to library names.

[------------------------------------------------------------------------------]

> 2. Ignoring all contents after the first datum.
>
> Taken literally, that is a recipe for disaster.  For
> example, the R6RS permits implementations to extend
> the lexical syntax of Scheme with a datum of the form
> #!fold-case or #!larceny or similar, and many systems
> have added such extensions.  Requiring all contents
> that follow a #!fold-case datum to be discarded is
> silly.

I thought #!identifier lexical tokens were considered comments at a level
below "syntactic datums".  I, of course, meant the first thing the
standard read procedure would return.  I thought calling that a "datum"
was the normal terminology.  I'll change it to whatever the most
recognized way to say that is.

> Requiring all contents to be discarded following the
> first library is silly as well.  As demonstrated by
> Larceny, allowing multiple libraries within a single
> file reduces clutter.

I agree it can reduce clutter and be convenient.  But it sacrifices the
ability to have a one-to-one mapping of library file path names to
library names.  Having that ability is my main reason for promoting
single-library files (more below).  The other reasons I tried to
succinctly describe in the current draft are secondary, and, IMO,
important additional benefits.

> I am not going to argue that
> this SRFI should require implementations to support
> multiple libraries within a file, but *requiring*
> implementations to discard all but the first library
> within a file serves no purpose other than to ensure
> that Larceny will not support this SRFI.

It wasn't my intent to prevent Scheme implementations from supporting
multiple libraries in a file.  The intent of this SRFI having only
single-library files is so path names can be mapped to library names,
and, secondarily, so finding a library's file when you don't know where
it's located does not involve as much hunting down.  I think the simpler,
one-to-one mapping nature of single-library files is the better option
to make the portable requirement.

> This SRFI should state that files conforming to this
> SRFI must have only one library per file.  This SRFI
> should not require implementations to ignore all but
> the first library in a file.

That is how I wanted it to be.  I thought my saying:

        Library files are files which contain one library form as the
        first syntactic datum, and they are files whose path exactly
        represents the name of the contained library. Any additional
        contents after the first datum are ignored by this SRFI.

was clear enough to mean that the ignoring of additional
things-`read'-would-see is only applicable in the context of conformance
to this SRFI.  I'll gladly reword it in whatever way best conveys
that.

[------------------------------------------------------------------------------]

> 3. Failure to specify which characters are encoded.
>
> If you don't specify which characters are encoded,
> then one of two things will happen.  (I don't yet
> understand the draft SRFI well enough to understand
> which of the following disasters it would require,
> but it would require one of them.)
>
> One possibility is that different implementations
> will require different sets of characters to be
> encoded, so moving files of portable libraries from
> one implementation to another will continue to
> involve wholesale renaming of files.  Removing that
> necessity, it seems to me, is the main thing this
> SRFI should accomplish.

Because nuisance characters are prevalent in certain OSs and shells, I
agree this SRFI should encode the characters necessary to allow files to
be exchanged between the contemporary platforms without transcoding
their path names.  That bothers my notions of idealistic design, but I'm
willing to compromise.

Note, however, that transcoding (renaming) path names might still be
necessary when exchanging between platforms other than the prevalent
ones.  I tried to make this SRFI, and its companion, SRFI 104, able to
easily transcode path names in case someone wants to use platforms which
have incompatible sets of allowable path name characters.  However, one
might say that non-prevalent platforms might have a notion of naming
persistent objects so different from this SRFI's assumptions that trying
to take them into consideration is too speculative, and that it is
better to just go with the current flow.

> The other possibility is that every implementation
> must support all possible sets of rules for encoding
> characters, so the mapping from library names to
> file names becomes one-to-infinite instead of the
> one-to-many mapping that most implementations must
> already support.  That makes implementations more
> complex and more fragile.  It also makes collections
> of portable code more complex and more fragile.  In
> particular, it would be harder for humans to map
> library names to file names, which goes against one
> of the alleged purposes of this draft SRFI; in
> particular, it goes against the alleged rationale
> for discarding all but the first library in a file.

I've been convinced that this SRFI should specify a set of characters
which is encoded and therefore a set which is not encoded.  Just to
mention it, I had been assuming it was better to leave this to a more
informal convention or to another SRFI than to require a set which
might (hopefully) not be applicable in the future, or which might not be
applicable to some platforms now.  I also assumed that, because
transcoding path names of the nature of current SRFI 103, assisted by
current SRFI 104, is relatively easy, it was okay to leave it to other
conventions.

[------------------------------------------------------------------------------]

> 4. Specification of ordering but not matching.
>
> I don't even pretend to understand this issue, but
> what is the point of specifying a detailed ordering
> "as the precedence for choosing a match" if the actual
> matching is going to be implementation-dependent?

The point is that there's an ordering which implementations, which want
the freedom to deal with the versioning in different ways, can refer to
when they might say something like: "This implementation has such and
such opinion of versioning support, and when it processes imports, it
will attempt to handle version constraints by choosing the first-ordered
matching library file, according to SRFI 103's ordering, which matches
the first-processed import clause, according to the particular way this
implementation processes import clauses and deals with
separately-compiled libraries' assumptions about what versions they were
compiled against."  Or another implementation might say: "This
implementation likes supporting versioning more than others, so it goes
to greater efforts to help you use versioning, and it does this in such
and such ways which involve the ordering SRFI 103 offers."

Yeah, that's a mouth- and head-full to consider when trying to use all
implementations which differ in multiple ways even though they refer to
SRFI 103's ordering -- but I'm trying to support both sides: people who
don't like the versioning and people who do.  I personally don't care
very much about this SRFI's ordering or versions support.  I thought it
was a step towards my goal of supporting portability for people who want
versioning and people who don't.  I
personally have no problem ditching it.  But I'll be left wondering why
people who do not like versioning can't work with people who do.

> One thing I *do* understand is that the R6RS
> pseudo-semantics for versions is part of the problem
> here.  This SRFI would do better to drop versions
> altogether, as was explicitly urged by six voters
> as one of the well-informed reasons they gave for
> voting against ratification of the R6RS in the form
> that was, unfortunately, approved.

I'm individually okay with dropping versions.  But I want to make clear
I'm still not entirely convinced it should be dropped and that I can see
why some people desire versioning to be in import clauses' library
references.  As I said above, I have a few libraries in whose names I've
put versions, and which I reference with version constraints, because
not all the versions of the libraries will work.  If implementations can
at least guarantee they won't use libraries which violate the version
constraints, then why shouldn't some people be able to use that
portably?  Additionally, multiple versions of libraries exist, and I
don't think we should ignore that reality: there might be a
program A which cannot use the same version of a library as program B,
and you want to be able to run both programs on the same computer.
Isn't that a reasonable thing to move towards supporting?

You can accomplish that by making the search paths used by each program
be differently configured, but that's a big hassle because it involves
additional actions to be managed.  SRFI 103 is designed to support both
a one-to-one mapping of library file paths to library names and to
support having multiple versions of a library be located in the
already-configured-don't-have-to-reconfigure search paths, which means
probably in the same directory.  Both these are accomplished by
requiring library files' paths to exactly represent their contained
library, and therefore be single-library-only and always contain a
(possible) version in the file name.

[------------------------------------------------------------------------------]

> 5. Implicit file names.
>
> If the reduction of clutter to be gained from
> implicit file names were truly worth increasing the
> number of files that must be examined in order to
> locate a library, then this SRFI would allow more
> than one library per file.  Allowing more than one
> library per file would actually reduce the number
> of files, instead of just allowing a small number
> of special cases to be moved into a different
> directory.

In some cases what I believe is your desired scheme for multiple
libraries per file could reduce the number of files, but at the expense
of losing the ability to one-to-one map a library file path to a library
name, and at the expense of losing the ability to place "main" files in
the same directory as their related files, when you definitely want the
related files separated from each other and all in the same directory.
Consider a library whose name is only the shared name prefix of related
sub-libraries with longer names, where these sub-libraries are to be
organized as separate files in a directory.  With multiple libraries
possible per file and without the implicit file name, the file for that
library must unavoidably be placed outside the directory which holds
the other files in its conceptual collection.

It's my perspective that the ability to one-to-one map a library file
path to a library name, and the ability to place "main" files in the
same directory as their related files, is worth not supporting
multiple-library files as the portable conformance to this SRFI.

I recognize that implicit file names increase the number of possible
file locations which might be examined in order to locate a library, but
they increase it differently, and I think no worse, than the
multiple-libraries-per-file scheme; and this scheme also keeps the
ability to one-to-one map a library file path to a library name, and the
ability to have implicit file names.  Multiple libraries per file has an
I-don't-see-why-it's-better possibilities-growth nature, and it loses
that mapping ability.

With the current draft of SRFI 103, the number of paths from which a
library reference might be satisfied is:

(* 4 number-of-search-paths)

because for each search path, a library's file might be located at four
possible sub-paths: 1) no implicit file name and no
implementation-specific file name extension, 2) no implicit file name
and an implementation-specific file name extension, 3) an implicit file
name and no implementation-specific file name extension, and 4) an
implicit file name and an implementation-specific file name extension.

With multiple libraries per file as per Larceny's current support
(IIUC), with versioning totally ignored, with an optional
implementation-specific file name extension, and without implicit file
names, the number of paths from which a library reference might be
satisfied is:

(* 2 number-of-library-name-symbols number-of-search-paths)

because for each search path, a library's file might be located at as
many sub-paths as there are symbols in the named library, multiplied by
the possibility of having an implementation-specific file name extension
or not.  For libraries with more than 2 symbols in their name, that is
larger than the number of possibilities for SRFI 103 as it currently is.

If the multiple-libraries-per-file scheme didn't also have
implementation-specific file name extensions, it would be:

(* number-of-library-name-symbols number-of-search-paths)

which for libraries with more than 4 symbols in their name, is larger
than the number of possibilities for SRFI 103 as it currently is.
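
The three counts above can be tabulated with a short Python sketch.  The
function names and the choice of three search paths are made up for the
example; the formulas are the ones given above.

```python
def srfi103_candidates(search_paths):
    # Four sub-paths per search path: implicit file name or not, times
    # implementation-specific file name extension or not.
    return 4 * search_paths


def multi_library_candidates(name_symbols, search_paths, extensions=True):
    # One sub-path per symbol in the library name, doubled when an
    # implementation-specific file name extension is also possible.
    return (2 if extensions else 1) * name_symbols * search_paths


search_paths = 3  # arbitrary example value
print("symbols  srfi-103  multi+ext  multi")
for symbols in range(1, 7):
    print(symbols,
          srfi103_candidates(search_paths),
          multi_library_candidates(symbols, search_paths),
          multi_library_candidates(symbols, search_paths, extensions=False))
```

The SRFI 103 count does not depend on the length of the library name,
whereas the other two grow with it and cross the SRFI 103 count at 2 and
4 symbols respectively.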

You might say: but in practice, libraries with names that long or longer
will not be common, and I'd mostly agree.  But then I'll say that, also
in practice, the difference between the numbers of possibilities of the
two schemes is quite small (because of the common lengths of library
names), and so if there are additional useful properties (i.e., the
one-to-one mapping and the files-in-the-same-directory) which one has
and the other does not, then the one with those properties should be the
portable convention.

[------------------------------------------------------------------------------]

> I'm not arguing that this SRFI should require
> implementations to allow more than one library
> per file so much as I'm arguing against implicit
> file names.  If you won't consider the more useful
> feature (which this draft SRFI would outlaw), then
> it's silly to consider a much less useful and less
> general special case.

I hope I've explained why I think implicit file names aren't that bad
and explained that I think sacrificing (in the context of current
portability) multiple-library files is better in order to support a
one-to-one mapping and to support files-in-the-same-directory.  I hope
my previous posts also help explain why I think implicit file names are
desirable enough to support.

I do not want the implications of SRFI 103 to outlaw multiple libraries
per file, but to make its standard for portability involve only library
file paths which exactly represent library names and so involve only
single-library files.

I imagine one of the main points of your response might be: why is a
one-to-one mapping of library file paths to library names so important?
I'm ready to engage that topic, and whatever else you think is
appropriate, next, after I'm recharged.

--
: Derick
----------------------------------------------------------------