The politics and other realities of names.... Thomas Lord 26 Jan 2006 18:39 UTC

Thank you for writing this SRFI, Andrew.  Good job.  I especially admire
your call for a formal proof that your grammar cannot generate the
same string in two different ways, assigning two owners to the same
name: that's a good exercise and would be a valuable improvement to
the SRFI.  Identifying issues like that and clearly stating what
problem needs to be solved is often 3/4s of the battle.  I'll bet the
proof is easier than you think and you should give it a try -- maybe
someone closer to you can give some hints.

Here are some issues and proposed solutions.

* Add to the Issues List

  I would add these issues:

    ~ What are the political and economic implications
      of this proposal?  Can those implications affect
      the correctness of programs?  Are the implications
      arbitrary and harmful in any way and, if so, are
      there ways to improve the design to reduce these
      effects?

    ~ Related to those implications: what is the role
      in name-space design of referential transparency?

    ~ Is there a way to improve the description of the
      design?

* Political and Economic Implications

** General Description

  The proposal creates (by defining) an infinite name-space
  which has a natural tree topology.   By "tree topology"
  we mean that every point in this name-space (each name)
  is, aside from being a name-in-general, the name for an
  infinite subspace of names which are uniquely "contained"
  under that name.   The proposed tree looks like this:

                    community recognition
                     of the SRFI process
                            |
                           srfi-84
                   /   |    |       |   \
        schemers.org  rNrs srfi-N   DNS  HTTP-URL
         / ...  \     /..\  /..\    /..\    /..\
       libA    libZ r6 rINf 84 inf a .. z  a/~ .. z/~
       / ..\    ...        /..\   ...      /..\ ...
     X::Y     ...       Q::R      ...     F::G  ....

  The proposal is based on the idea of assigning ownership of
  subspaces (subtrees) to specific authorities.  Thus, ICANN is
  authoritative for the DNS subtree; the Scheme standard community is
  authoritative for the rNrs subtree; etc.  The proposal is pleasingly
  and appropriately "rooted" at nothing more or less than community
  recognition of SRFIs in general and srfi-84 in particular.

  Ownership is hierarchical and complex.  Thus, ICANN, W3C, and IETF
  are authoritative for the `http://swiss.mit.edu' name-space but they
  delegate that authority to MIT.   MIT in turn has delegated
  `http://swiss.mit.edu/~jaffer' to Aubrey.

** Referential Transparency?

  A name is referentially transparent if, always, it refers
  to "the same thing".   We don't necessarily mean that names
  in a program always refer to "the same string of bits".

  For example, Aubrey may want a unique name for "a development
  version of SLIB".  Which collection of Scheme code is referred to by
  that name may change over time but, always, the name refers to what
  Aubrey has designated as an SLIB snapshot.

  We ought to notice that in this design the hierarchy of authority is
  at odds with referential transparency.  At the whim of ICANN, W3C,
  IETF, MIT, or Aubrey a name that Aubrey chooses under this proposal
  can change meaning -- can no longer be the name for "a development
  version of SLIB".

  In fact, it seems inevitable that eventually such a whim *will* be
  exercised.  It is highly likely that, eventually, neither Aubrey nor
  any direct or indirect designee of his will control
  `http://.../~jaffer'.  It is entirely *possible* that, sometime
  after that point, the name will be re-used.

  Of course some of this concern is purely theoretical (like worrying
  about latency between Earth and Alpha Centauri while designing TCP).
  For example, none of us should be seriously concerned that `http:'
  won't have a stable enough meaning for our purposes.    By the time
  `http:' changes meaning *that* badly, enough else will have gone
  wrong that this Scheme name-space will be one of the least of our
  concerns.

  One doesn't have to go far down the tree, though, before the
  concerns are not so theoretical.   I'm not too sure I see
  strong continuity of operations guarantees for the `rNrs' subspace.
  I'm pretty sure I don't see strong continuity guarantees for
  the `schemers.org' subspace.   By the time we get to
  `http://.../~newbie' or open-source-project `joes:cool:hack', I'm
  nearly certain that the subspace is a jump-ball on the scale of
  just a few years, at best.

** Does it Really Matter?

  In some sense these concerns about referential transparency are
  exaggerated.   The Scheme community knows, by observing other
  communities who wrestle with name-spaces (e.g., Java), that we
  can potentially get very, very far without raising these
  concerns.

  Well, really, the question comes down to "what are these names
  for"?

  If these names are nothing more than (as in the Java world) a
  local convention for convenience -- perhaps everything is just
  fine.

  I would like to believe that SRFI-84 has the potential for much
  greater significance and that, if we can, we should design with
  that in mind.

  For example, some uses of this name-space may have historical
  significance.   They may wind up being used in R7RS.  They may
  be of use to future historians trying to figure out what this
  or that mailing list discussion was about 100 years hence.

  More immediately, I would like to believe that these names will take
  on a role in access methods.  I would like a Scheme loader to
  understand a subset of these names and for there to be a rigorous
  (perhaps local) correspondence between the name and what a loader
  should do to find a named data set or a named code library.

  I earlier pointed out that `http://.../~jaffer' is likely to
  eventually slip out of the control of Aubrey's intentions.
  What will happen, then, to systems which operate by loading
  data/code based on those names?   Will their correctness be
  affected?

  I earlier pointed out that it is *possible* for names in, say,
  the `schemers.org' name-space of open source projects to be
  re-used as economic control over the domain and political control
  over the pages shifts.

  "Possible" or "likely"?   Well, if nothing in the Scheme world
  ever matters much -- "unlikely".   If Scheme really starts serving
  certain vital roles in the world -- it all comes down to cost v.
  benefit, doesn't it?

** The Gist of the Nub of the Essence of the Political Foo

  This SRFI draft defines a set of names for people to use
  but the names selected by the definition have an independent
  existence:

  The names used by this draft exist prior to this draft, but
  for some small details of syntax, in a contentious political and
  economic space.  The name `http://.../~jaffer' exists prior to
  this SRFI.   Control of the name is something people can and
  do fight over (e.g., look at where ICANN is at).

  How nasty can that fight get?   The answer is determined by
  how valuable the named things are.

  That fight is already well under way.

  Why drag Scheme into it?

* Solutions

  In general we have seen that the current draft is weakened
  by a dependency upon authorities over which control is
  contingent and contentious.   More or less by definition we
  cannot define a *standard* for names which makes no reference
  at all to *some* authority.

  Does there exist a higher authority which is more stable
  upon which we can build universal names?   It's a definite
  maybe:

** Crypto and Being Human to the Rescue

  Math and physics, specifically the physics and math of computational
  complexity, provide us a handy and apparently non-contentious
  authority that mostly does the job.  With hashes and signing we
  can create names which are, for all practical purposes, as far
  as we can tell, unforgeable.   With signing, especially, we can
  democratize this process.

  We presume, in this SRFI, that individuals or groups wish to
  allocate to themselves an infinite subspace of names.  We
  presume that they have enough social coherence to manage
  internal-to-the-group secrets such as the password to a web site
  or....  a public key.

  Nothing here is certain, but I doubt we can do much
  better than to appeal to, extend, and improve our math and
  use math as the ultimate naming authority.   Names for strings
  of bits are ideally the best hashes we can muster.  Names for
  dynamic processes controlled by individuals or groups are
  ideally cryptographically signed definitions of those processes.
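
  As a rough illustration of the first case (a name for a string of
  bits), here is a minimal Python sketch.  The `srfi84://example'
  prefix, the name layout, and the choice of SHA-256 are assumptions
  made for the example, not anything the draft specifies.  Signed
  names for dynamic processes would need a public-key signature
  library and are not shown.

```python
import hashlib

def content_name(prefix: str, label: str, data: bytes) -> str:
    """Build a name whose last component is the SHA-256 digest of the named bits."""
    digest = hashlib.sha256(data).hexdigest()
    return f"{prefix}/{label}/{digest}"

def names_same_bits(name: str, data: bytes) -> bool:
    """Check that `data` is what `name` refers to by recomputing the digest."""
    return name.rsplit("/", 1)[1] == hashlib.sha256(data).hexdigest()

# Two hypothetical versions of a tiny library
library_v1 = b"(define (square x) (* x x))\n"
library_v2 = b"(define (square x) (expt x 2))\n"

name = content_name("srfi84://example", "square", library_v1)
print(name)
print(names_same_bits(name, library_v1))  # True: same bits, same name
print(names_same_bits(name, library_v2))  # False: different bits
```

  No registrar, domain owner, or webmaster appears anywhere in the
  check: anyone holding the bits can confirm the name.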

** "But quick...."

  The draft objects to this approach indirectly.   Hash values
  and signatures are, pretty much by definition, incomprehensible
  and inconvenient.   As the draft puts it, regarding a similar
  proposal:

        "Quick! Is urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6 the
         same or different than the example UUID found in the middle of
         page 4 of RFC 4122?"

  Can we form a math-based proposal which avoids such objections?

** Short Names, Long Names, and Disambiguating Footnotes

  Among certain close company I am certainly, simply, "Tom".

  At a greater distance I may be "Tom, the GNU Arch guy" or
  "Tom, the guy working on telemarketing and business development".

  In some places I am "Thomas Lord, with federal tax id XXX-XX-XXXX".

  You get the idea -- identity is one thing and names another.
  Identity is fundamental (well... but that's another story) but
  names exist in discursive context.   Names are relative not just
  to where the named thing is in the world but also to who is
  speaking to whom.

  In a paper, the reference `[Lamport 86]' has a precise meaning but
  to disambiguate the meaning there is a footnote with a complete
  bibliographic entry.  Beyond that there is a style guide that
  explains how to parse the bibliographic entry.  Beyond that are
  authorities like "The Association for Computing Machinery" who
  can disambiguate further.

  Lexical scoping, as in Scheme, reflects this locality property of
  names.  In a piece of code designed to be loaded into an environment
  where, at the time of loading, `call-with-current-continuation'
  is presumed to be defined I can usefully write:

        (let ((call/cc  call-with-current-continuation))
           ...)

  In doing so I have done two things:

        1. I've defined a convenient local short-hand for
           an unwieldy name.

        2. I've resolved, in a stable context, a "global"
           name by retrieving the named thing.  Thereafter
           (in the enclosed code) I am insulated from
           redefinitions that may take place in the global
           name-space.

  The same pattern can apply to a design for "Universal Identifiers"
  once we acknowledge separate needs for:

        1. "local" conventions for short-hand names

           Me and 12,000 friends can all agree that it is safe
           to cut and paste certain code between our works because
           we all mean the same thing by "call/cc".

        2. "global", math/physics/human-based disambiguators

           We can all agree that names like:

              srfi84://r6rs/call-with-current-continuation/ff352a...5734

           have a precise-as-we-can-get meaning.   We can do our
           best to make collisions utterly improbable while also
           taking steps to make sure we are not too badly disrupted
           if, to our surprise, collisions eventually occur.
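
  The split between the two kinds of name can be sketched as a small
  table that maps local short-hands onto universal names and refuses
  to let a short-hand silently change meaning.  This is only an
  illustration in Python; the `ShortNames' class and its methods are
  hypothetical, and the universal name is the elided example from
  above, kept as a placeholder.

```python
class ShortNames:
    """A local convention: short names that stand for agreed universal names."""

    def __init__(self):
        self._table = {}

    def declare(self, short: str, universal: str) -> None:
        # Re-declaring the same pair is harmless; a different target is a conflict.
        existing = self._table.get(short)
        if existing is not None and existing != universal:
            raise ValueError(f"{short!r} already means {existing!r}")
        self._table[short] = universal

    def resolve(self, short: str) -> str:
        # Look up the universal name behind a short-hand (KeyError if undeclared).
        return self._table[short]

# Placeholder universal name copied from the text; the digest is elided there too.
CALL_CC = "srfi84://r6rs/call-with-current-continuation/ff352a...5734"

names = ShortNames()
names.declare("call/cc", CALL_CC)
names.declare("call/cc", CALL_CC)   # same meaning again: no conflict
print(names.resolve("call/cc"))

try:
    names.declare("call/cc", "srfi84://example/something-else")
except ValueError as err:
    print("conflict:", err)
```

  Code that uses only `call/cc' stays readable, while the table
  records exactly which universal name each group of friends meant.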

* Conclusions and Final Thoughts

  The draft can be improved by:

    a) distinguishing conventional short-hands from universal names

    b) using crypto techniques to define universal names

    c) defining a mechanism for declaring conventional mappings from
       universal to short-hand names

    d) suggesting recovery procedures for the contingency that
       crypto techniques fail

  Difficult but solvable issues in this program include distinguishing
  cases where signatures are wanted from situations where hashes are
  wanted from situations where some combination is wanted.

  Reaching prematurely for too narrow a set of crypto technology
  is probably a mistake.  If we had this conversation too many years
  ago it might have been tempting to say `hashes are MD4' and things
  like that.   Nothing has really changed.

  A subtle point will be the idea of improving names over time.
  Supposing I have an SHA-based name today, can I smoothly transition
  to a better hash tomorrow?
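
  One possible answer, sketched in Python under assumptions of my own:
  a name publishes several digests, each tagged with its algorithm,
  and a reader checks the strongest one it understands.  The
  preference order, the function names, and the choice of SHA-256 and
  SHA3-256 are illustrative only.

```python
import hashlib

# Hypothetical preference order, weakest first; not part of the draft.
PREFERENCE = ["sha256", "sha3_256"]

def digests(data: bytes, algorithms) -> dict:
    """Compute one digest per algorithm, keyed by algorithm name."""
    return {alg: hashlib.new(alg, data).hexdigest() for alg in algorithms}

def verify(data: bytes, published: dict, understood=PREFERENCE) -> str:
    """Check data against the strongest published digest we understand.

    Returns the algorithm that was used; raises ValueError on mismatch.
    """
    for alg in reversed(understood):
        if alg in published:
            if hashlib.new(alg, data).hexdigest() != published[alg]:
                raise ValueError(f"{alg} digest mismatch")
            return alg
    raise ValueError("no understood digest in name")

data = b"(define (square x) (* x x))\n"
old_name = digests(data, ["sha256"])               # today's name
new_name = digests(data, ["sha256", "sha3_256"])   # transition: both digests

print(verify(data, old_name))                          # sha256
print(verify(data, new_name))                          # sha3_256
print(verify(data, new_name, understood=["sha256"]))   # sha256 (older reader)
```

  During the transition both digests travel together, so older
  readers keep working while newer ones move to the better hash.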

Regards,
-t