SRFI-1 round 3 discussion

Show/hide message thread
SRFI-1 round 3 discussion Olin Shivers (26 Jun 1999 15:20 UTC)
Left & right [Was: SRFI-1 round 3 discussion] sperber@xxxxxx (27 Jun 1999 15:51 UTC)
SRFI-1 round 3 discussion Olin Shivers 26 Jun 1999 15:25 UTC
Sorry for the down time on this discussion. I have collected all the feedback
I've gotten on the last two major rounds of discussion. Based on that
interaction, I have resolved most issues; five remain open. You can find the
summary of the resolved and outstanding issues on the AI Lab ftp server at
    ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/issues.txt
    ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/closed-issues.txt
I would particularly like to get feedback on the naming issues for
right & left variants of procedures.

I believe we will be able to put this SRFI to bed pretty quickly from this
point.

Below, I append issues.txt. See also closed-issues.txt to find out how
the other issues have been resolved.
    -Olin

-------------------------------------------------------------------------------
This is the list of the ongoing issues. For each issue, I list some of the
people who have raised it, and quote some of the email to sum up various
points of view.

In discussion, please refer to the relevant topic by its tag or header.
That will help us stay organised as we range over a lot of different
issues.

I am pretty sure that the draft reference implementation no longer precisely
conforms to the spec as modified by these topics. Once discussion has
converged, I will hack the ref imp into conformance.

Here is a list of the current issues/discussion topics I have identified from
the email. You may wish to read only the topics about which you care.  To aid
navigation, this document format can be parsed using emacs' outline mode.
    length and circular lists
    circular lists
    improper lists
    TAKE & DROP
    Naming: right & left variants

This document, along with current drafts of the reference implementation
and the draft SRFI (in ASCII format) can be found at
    ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/issues.txt
I'll HTML'ize them for the final SRFI format when discussion is done.

Related documents:
    ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/small-stuff.txt
      Minor issues -- typos, things I went ahead and changed without feeling
      they required discussion

    ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/closed-issues.txt
      Topics of discussion that I have closed out. Feel free to flame me
      if you think any have been resolved in a bad way, but I am basically
      moving ahead on these.

-Olin

-------------------------------------------------------------------------------
* length and circular lists

I have introduced a new function LENGTH+ that properly handles circular lists.

LENGTH+ returns the length of a list if the list is finite. If it is circular,
    it returns #F.

LENGTH is guaranteed to always return an integer. If the argument is circular,
    it may either raise an error exception or diverge, depending upon the
    implementation.

-------------------------------------------------------------------------------
* circular lists

>From: Doug Currie <xxxxxx@flavors.com>
    2 General Comment: The srfi should say which procedures work on improper
    lists and also which work on circular lists. A quick glance at the model
    implementation indicates to me that you did not intend the procedures to
    work on circular lists; the stfi should say so. On the other hand, maybe
    some procedures should work on circular lists (especially list-length, and
    maybe list-copy).

Good point. Here is my proposed taxonomy, which corresponds pretty
closely to the natural, straightforward definitions of these functions,
with some room left in for fancier implementations:

Works on circular lists:
   xcons cons*
   first ... tenth
   Taking and dropping from the left
   zip with at least one finite list
   append append! reverse-append reverse-append! -- last arg may be circular
   acons
   cons pair? null? circular-list? dotted-list? proper-list?
   car cdr ... cdddar cddddr set-car! set-cdr! list-ref
   length+

May diverge / is an error / bad thing:
(Plus'd entries have meaningful definitions for circular lists; we might
 leave these cases open to the implementation. Discussion?)
+  list-copy
   Taking and dropping from the right
   last last-pair
   zip with no finite list
   append append! reverse-append reverse-append! -- initial args must be finite
+  unzip2 unzip3 unzip4 unzip5
+  filter  partition  remove
+  filter! partition! remove!
+  del  delq  delv  delete
+  del! delq! delv! delete!
+  alist-copy
+  delq-duplicates  delv-duplicates  delete-duplicates  del-duplicates
+  delq-duplicates! delv-duplicates! delete-duplicates! del-duplicates!
+  alist-delete  del-ass  del-assq  del-assv  del-assoc
+  alist-delete! del-ass! del-assq! del-assv! del-assoc!
   reverse!
   length reverse

Some lists may be circular; at least one must be finite:
   foldl foldr pair-foldl pair-foldr reducel reducer
   append-map append-map! map! pair-for-each filter-map map-in-order
   map for-each

May diverge if no hit:
(Some implementations might do a fancy implementation
and *not* diverge. Sure.)
   find find-tail any every list-index
   mem member memq memv ass assoc assq assv

-------------------------------------------------------------------------------
* improper lists

The issue has been raised (see, for example, Currie's email quoted in
topic "circular lists") of what the list-lib procedures do when applied
to improper lists.

I try for an "accept a wide spectrum" approach to the inputs the list
functions take, and intend to spec as many of these procedures to work on
improper lists as possible. Procedures that work on improper lists must be
able to treat any non-pair as a list-tail signifying the end of the list --
that is, the empty list. E.g. a robust LENGTH function that handles improper
lists must be spec'd so that both of these calls return three:
    (length '(a b c . ())) => 3		; Proper
    (length '(a b c . d))  => 3		; Improper
This means that *any non-pair* must be treated as the empty list. This
fact falls out naturally from the straightforward base cases of these
recursive functions. Hence
    (length '()) => 0	; Proper
    (length 'd)  => 0	; Improper
This is the simplest, most consistent spec that covers improper lists.
The functions in the current reference implementation have this property.

In this spirit, *all* of the procedures in list-lib are defined to work
on improper lists.

As a related design criteria, I am specifying these procedures to replicate
the proper/improperness of their args in their results where this can
straightforwardly be defined. So, for example:
    (filter even? '(2 7 1 8 2))       => (2 8 2)
    (filter even? '(2 7 1 8 2 . end)) => (2 8 2 . end)

There is a trade-off here. On one hand, widening the set of arguments
acceptable to our utilities reduces our ability to provide error-checks
when an improper list means an error. On the other, it widens the
applicability of these procedures. I tilt towards the latter. Scheme
supports improper lists; so I will support them. Were this ML, the issue
would not arise -- tuple-trees and lists are statically distinguished by
the type system.

Or, let's put it another way: CONS, CAR and CDR support improper lists,
so we need to follow through.

Replicates the argument's terminating value:
    list-copy [e.g., (list-copy '(a b . c)) => (a b . c)]
    filter  partition  remove
    filter! partition! remove!
    del  delq  delv  delete
    del! delq! delv! delete!
    alist-copy
    delq-duplicates  delv-duplicates  delete-duplicates  del-duplicates
    delq-duplicates! delv-duplicates! delete-duplicates! del-duplicates!
    alist-delete  del-ass  del-assq  del-assv  del-assoc
    alist-delete! del-ass! del-assq! del-assv! del-assoc!
    take take! (from right) drop drop! (from left)
    member memq memv assoc assq assv
    map filter-map map-in-order map!
        -- These n-ary funcs use the terminator from left-most shortest list
        arg. Simple to implement, and produces the desired result in the
        single-list map case. Example:
	    (map + '(1 2 3 4 . a) '(1 0 1 . b) '(0 1 0 . c)) =>
                (2 3 4 . b)
    unzip2 unzip3 unzip4 unzip5

Not specified (may either replicate or nil-terminate):
    reverse!

Produces nil-terminated result:
    take take! (from left, always)
    drop drop! (from right, always)
    reverse

Handles improper-list args, but issue of nil-termination does not
apply to result or is trivial:
    xcons tree-copy make-list list-tabulate list*
    circular-list :iota iota:
    first second third fourth fifth sixth seventh eighth ninth tenth
    append append! reverse-append reverse-append!
    unfold unfold/tail
    pair-for-each append-map append-map!
    foldl foldr pair-foldl pair-foldr reducel reducer
    find find-tail any every list-index
    mem ass acons
    last last-pair
    zip
    cons pair? null? list? list length
    car cdr ... cdddar cddddr set-car! set-cdr! list-ref for-each

-------------------------------------------------------------------------------
* TAKE & DROP

>From: Doug Currie <xxxxxx@flavors.com>
    3. I like Sergei's comments on TAKE and DROP:
    >I would prefer to see (SUBLIST list start end) which
    >is both generic and quite natural, and (LIST-HEAD list k)
    >to complement existing (LIST-TAIL list k).
    Perhaps for DROP, LIST-BUTLAST ala Common Lisp and LIST-BUTFIRST (or NTH-CDR).

>From: Maciej Stachowiak <xxxxxx@mit.edu>
    Why not `list-head' and `list-tail' for these, I think those are more
    intuitive names. `list-ref' vs. `nth' is more an open question, why
    not have both.

I do not like LIST-HEAD and LIST-TAIL because I find the names confusing.
Does (LIST-TAIL list k) *drop* K elements or *give* you a K-element tail? With
TAKE and DROP there is no such confusion. If you *want* K elements, you TAKE
them. If you want to *remove* K elements, you DROP them.

I particularly dislike the names LIST-HEAD and LIST-TAIL. By all rights,
these names should be synonyms for CAR and CDR. The head of a list is its
car; the tail of a list is its cdr.

Matt Flatt argues that separate functions to take from the front and
back of the list gives better error checking; similarly for dropping.

This is a good point. Note that we currently have four procedures:
    take drop take! drop!
This would split each of these procedures into two, for a total of eight
procedures. We could call them
    take take/right   take! take/right!
    drop drop/right   drop! drop/right
So (DROP LIST 3) would drop the first 3 elements of LIST and
   (DROP/RIGHT LIST 3) would drop the last 3 elements of LIST.
This is good & bad. Good is that is might give better dynamic error checking,
should a client erroneously pass a negative index to the function. On the
other hand, I don't think it's a very common error. Fencepost errors are when
you confuse 0 & 1. Drifting over into negative values is much, much rarer. Bad
is that is induces namespace bloat, and there's a hidden multiplier lurking in
store: I will be proposing string and vector libs with TAKE and DROP functions
(e.g., STRING-TAKE and VECTOR-DROP), so the factor of two will hit these libs
as well.

There are two issues here:
    The actual functionality -- punting the negative-indexing for twice
        as many function bindings.
    In the event of punting negative-indexing, we must deal with the
        naming choices for the left-end and right-end variants.
	This relates to the larger naming issue discussed in topic
	"Naming: right & left variants."

I am coming around to Flatt's point of view, and now support splitting
the functionality. If there are dissenting voices, speak up now, otherwise
that's what I am going to do.

[By the way, in response to questions of the source of the right-end
negative-indexing convention -- it's a popular feature in APL and J.]

Votes:
    Split functionality into left & right versions: Egorov, Stone, lth, shivers

    John Stone prefers TAKE & DROP names to LIST-HEAD & LIST-TAIL.

-------------------------------------------------------------------------------
* Naming: right & left variants

This is a *thorny* naming issue. Many of the procedures in this list lib, and
also in the upcoming vector and string libs I will be proposing, come in
left/right pairs. I have been using an L and R suffix to name these pairs,
e.g. FOLDL and FOLDR, REDUCEL and REDUCER. This has the charm of being concise
-- as should be obvious by now, I code in 80-column buffers and do not like to
waste columns gratuitously. However, many people prefer longer names. Also,
some of the resultant names are unfortunate -- the most glaringly awkward one
is REDUCER, which really just means "REDUCE from the Right."  So my L and R
suffix convention is not without its downsides. But I like short. I like
simple. And FOLDL, in particular, is well-established in the FP world.

As will be obvious below, this is a very important naming convention to get
right, due to its pervasiveness across multiple libraries and many, many
names. So it really takes some careful thought.

I see three alternatives to consider:

- use *no* suffix for the left operator, and a /R suffix for the right
  operator. This is based on the idea that left-to-right is the "natural"
  processing order for lists. This gives us the following list, vector and
  string bindings (assuming we split the TAKE and DROP ops into left & right
  variants):

    fold	fold/r		reduce	reduce/r
    take	take/r    	take!	take!/r
    drop	drop/r    	drop!	drop!/r
    pair-fold	pair-fold/r

    vector-take vector-take/r vector-take/shared vector-take/rshared
    vector-drop vector-drop/r vector-drop/shared vector-drop/rshared
    vector-find vector-find/r vector-skip vector-skip/r
    vector-fold vector-fold/r vector-reduce vector-reduce/r
    vector-scan vector-scan/r

    string-fold  string-fold/r
    string-take string-take/r
    string-drop string-drop/r
    string-pad  string-pad/r
    string-trim string-trim/r
    string-index string-index/r
    string-skip string-skip/r

- use an /L suffix for the left operator, and an /R suffix for the right
  operator. This gives us the following list, vector and string bindings:

    fold/l	fold/r		reduce/l	reduce/r
    take/l	take/r    	take!/l		take!/r
    drop/l	drop/r    	drop!/l		drop!/r
    pair-fold/l	pair-fold/r

    vector-take/l vector-take/r vector-take/lshared vector-take/rshared
    vector-drop/l vector-drop/r vector-drop/lshared vector-drop/rshared
    vector-find/l vector-find/r vector-skip/l vector-skip/r
    vector-fold/l vector-fold/r vector-reduce/l vector-reduce/r
    vector-scan/l vector-scan/r

    string-fold/l  string-fold/r
    string-take/l string-take/r
    string-drop/l string-drop/r
    string-pad/l  string-pad/r
    string-trim/l string-trim/r
    string-index/l string-index/r
    string-skip/l string-skip/r

- use a -left suffix for the left operator, and a -right suffix for the right
  operator. This gives us the following list, vector and string bindings:

    fold-left	fold-right	reduce-left	reduce-right
    take-left	take-right    	take-left!	take-right!
    drop-left	drop-right    	drop-left!	drop-right!
    pair-fold-left		pair-fold-right

    vector-take-left vector-take-right vector-take-left/shared vector-take-right/shared
    vector-drop-left vector-drop-right vector-drop-left/shared vector-drop-right/shared
    vector-find-left vector-find-right vector-skip-left vector-skip-right
    vector-fold-left vector-fold-right vector-reduce-left vector-reduce-right
    vector-scan-left vector-scan-right

    string-fold-left	string-fold-right
    string-take-left	string-take-right
    string-drop-left	string-drop-right
    string-pad-left	string-pad-right
    string-trim-left	string-trim-right
    string-index-left	string-index-right
    string-skip-left	string-skip-right

  Ouch, I find these names painfully verbose for primitive, low-level
  operations. It makes it hard to use functional composition --
  (f (g (h x))) -- without drifting off the right side of the screen or
  shifting over to awkward, multiline indenting styles. Also, the actual
  operation (fold, reduce, pad, trim) gets lost amidst all the tacked-on
  modifiers.

- use no suffix for the left operator, and a -right suffix for the right
  operator. This gives us the following list, vector and string bindings:

    fold	fold-right	reduce	reduce-right
    take	take-right    	take!	take-right!
    drop	drop-right    	drop!	drop-right!
    pair-fold	pair-fold-right

    vector-take vector-take-right vector-take/shared vector-take-right/shared
    vector-drop vector-drop-right vector-drop/shared vector-drop-right/shared
    vector-find vector-find-right vector-skip vector-skip-right
    vector-fold vector-fold-right vector-reduce vector-reduce-right
    vector-scan vector-scan-right

    string-fold		string-fold-right
    string-take		string-take-right
    string-drop		string-drop-right
    string-pad		string-pad-right
    string-trim		string-trim-right
    string-index	string-index-right
    string-skip		string-skip-right

However, I listen attentively to the community voice.

[Let's not worry about what exactly these "vector shared" ops are; that is
a topic for another SRFI and another time. Let's focus on this issue of
left and right variant names.]

Votes:
    no left suffix; /r right suffix: Will Fitzgerald
    /l and /r suffix: Will Fitzgerald
    -left and -right suffix: lth
    no left suffix; -right right suffix: lth 2nd choice

I have received very little feedback on this issue. In the absence of further
votes, I intend to go with no left suffix; -right right suffix, e.g.
    FOLD and FOLD-RIGHT

-------------------------------------------------------------------------------