|
SRFI-1 round 3 discussion Olin Shivers (26 Jun 1999 15:20 UTC)
|
|
Left & right [Was: SRFI-1 round 3 discussion]
sperber@xxxxxx
(27 Jun 1999 15:51 UTC)
|
Sorry for the down time on this discussion. I have collected all the feedback
I've gotten on the last two major rounds of discussion. Based on that
interaction, I have resolved most issues; five remain open. You can find the
summary of the resolved and outstanding issues on the AI Lab ftp server at
ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/issues.txt
ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/closed-issues.txt
I would particularly like to get feedback on the naming issues for
right & left variants of procedures.
I believe we will be able to put this SRFI to bed pretty quickly from this
point.
Below, I append issues.txt. See also closed-issues.txt to find out how
the other issues have been resolved.
-Olin
-------------------------------------------------------------------------------
This is the list of the ongoing issues. For each issue, I list some of the
people who have raised it, and quote some of the email to sum up various
points of view.
In discussion, please refer to the relevant topic by its tag or header.
That will help us stay organised as we range over a lot of different
issues.
I am pretty sure that the draft reference implementation no longer precisely
conforms to the spec as modified by these topics. Once discussion has
converged, I will hack the ref imp into conformance.
Here is a list of the current issues/discussion topics I have identified from
the email. You may wish to read only the topics about which you care. To aid
navigation, this document format can be parsed using emacs' outline mode.
length and circular lists
circular lists
improper lists
TAKE & DROP
Naming: right & left variants
This document, along with current drafts of the reference implementation
and the draft SRFI (in ASCII format) can be found at
ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/issues.txt
I'll HTML'ize them for the final SRFI format when discussion is done.
Related documents:
ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/small-stuff.txt
Minor issues -- typos, things I went ahead and changed without feeling
they required discussion
ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/closed-issues.txt
Topics of discussion that I have closed out. Feel free to flame me
if you think any have been resolved in a bad way, but I am basically
moving ahead on these.
-Olin
-------------------------------------------------------------------------------
* length and circular lists
I have introduced a new function LENGTH+ that properly handles circular lists.
LENGTH+ returns the length of a list if the list is finite. If it is circular,
it returns #F.
LENGTH is guaranteed to always return an integer. If the argument is circular,
it may either raise an error exception or diverge, depending upon the
implementation.
-------------------------------------------------------------------------------
* circular lists
>From: Doug Currie <xxxxxx@flavors.com>
2 General Comment: The srfi should say which procedures work on improper
lists and also which work on circular lists. A quick glance at the model
implementation indicates to me that you did not intend the procedures to
work on circular lists; the stfi should say so. On the other hand, maybe
some procedures should work on circular lists (especially list-length, and
maybe list-copy).
Good point. Here is my proposed taxonomy, which corresponds pretty
closely to the natural, straightforward definitions of these functions,
with some room left in for fancier implementations:
Works on circular lists:
xcons cons*
first ... tenth
Taking and dropping from the left
zip with at least one finite list
append append! reverse-append reverse-append! -- last arg may be circular
acons
cons pair? null? circular-list? dotted-list? proper-list?
car cdr ... cdddar cddddr set-car! set-cdr! list-ref
length+
May diverge / is an error / bad thing:
(Plus'd entries have meaningful definitions for circular lists; we might
leave these cases open to the implementation. Discussion?)
+ list-copy
Taking and dropping from the right
last last-pair
zip with no finite list
append append! reverse-append reverse-append! -- initial args must be finite
+ unzip2 unzip3 unzip4 unzip5
+ filter partition remove
+ filter! partition! remove!
+ del delq delv delete
+ del! delq! delv! delete!
+ alist-copy
+ delq-duplicates delv-duplicates delete-duplicates del-duplicates
+ delq-duplicates! delv-duplicates! delete-duplicates! del-duplicates!
+ alist-delete del-ass del-assq del-assv del-assoc
+ alist-delete! del-ass! del-assq! del-assv! del-assoc!
reverse!
length reverse
Some lists may be circular; at least one must be finite:
foldl foldr pair-foldl pair-foldr reducel reducer
append-map append-map! map! pair-for-each filter-map map-in-order
map for-each
May diverge if no hit:
(Some implementations might do a fancy implementation
and *not* diverge. Sure.)
find find-tail any every list-index
mem member memq memv ass assoc assq assv
-------------------------------------------------------------------------------
* improper lists
The issue has been raised (see, for example, Currie's email quoted in
topic "circular lists") of what the list-lib procedures do when applied
to improper lists.
I try for an "accept a wide spectrum" approach to the inputs the list
functions take, and intend to spec as many of these procedures to work on
improper lists as possible. Procedures that work on improper lists must be
able to treat any non-pair as a list-tail signifying the end of the list --
that is, the empty list. E.g. a robust LENGTH function that handles improper
lists must be spec'd so that both of these calls return three:
(length '(a b c . ())) => 3 ; Proper
(length '(a b c . d)) => 3 ; Improper
This means that *any non-pair* must be treated as the empty list. This
fact falls out naturally from the straightforward base cases of these
recursive functions. Hence
(length '()) => 0 ; Proper
(length 'd) => 0 ; Improper
This is the simplest, most consistent spec that covers improper lists.
The functions in the current reference implementation have this property.
In this spirit, *all* of the procedures in list-lib are defined to work
on improper lists.
As a related design criteria, I am specifying these procedures to replicate
the proper/improperness of their args in their results where this can
straightforwardly be defined. So, for example:
(filter even? '(2 7 1 8 2)) => (2 8 2)
(filter even? '(2 7 1 8 2 . end)) => (2 8 2 . end)
There is a trade-off here. On one hand, widening the set of arguments
acceptable to our utilities reduces our ability to provide error-checks
when an improper list means an error. On the other, it widens the
applicability of these procedures. I tilt towards the latter. Scheme
supports improper lists; so I will support them. Were this ML, the issue
would not arise -- tuple-trees and lists are statically distinguished by
the type system.
Or, let's put it another way: CONS, CAR and CDR support improper lists,
so we need to follow through.
Replicates the argument's terminating value:
list-copy [e.g., (list-copy '(a b . c)) => (a b . c)]
filter partition remove
filter! partition! remove!
del delq delv delete
del! delq! delv! delete!
alist-copy
delq-duplicates delv-duplicates delete-duplicates del-duplicates
delq-duplicates! delv-duplicates! delete-duplicates! del-duplicates!
alist-delete del-ass del-assq del-assv del-assoc
alist-delete! del-ass! del-assq! del-assv! del-assoc!
take take! (from right) drop drop! (from left)
member memq memv assoc assq assv
map filter-map map-in-order map!
-- These n-ary funcs use the terminator from left-most shortest list
arg. Simple to implement, and produces the desired result in the
single-list map case. Example:
(map + '(1 2 3 4 . a) '(1 0 1 . b) '(0 1 0 . c)) =>
(2 3 4 . b)
unzip2 unzip3 unzip4 unzip5
Not specified (may either replicate or nil-terminate):
reverse!
Produces nil-terminated result:
take take! (from left, always)
drop drop! (from right, always)
reverse
Handles improper-list args, but issue of nil-termination does not
apply to result or is trivial:
xcons tree-copy make-list list-tabulate list*
circular-list :iota iota:
first second third fourth fifth sixth seventh eighth ninth tenth
append append! reverse-append reverse-append!
unfold unfold/tail
pair-for-each append-map append-map!
foldl foldr pair-foldl pair-foldr reducel reducer
find find-tail any every list-index
mem ass acons
last last-pair
zip
cons pair? null? list? list length
car cdr ... cdddar cddddr set-car! set-cdr! list-ref for-each
-------------------------------------------------------------------------------
* TAKE & DROP
>From: Doug Currie <xxxxxx@flavors.com>
3. I like Sergei's comments on TAKE and DROP:
>I would prefer to see (SUBLIST list start end) which
>is both generic and quite natural, and (LIST-HEAD list k)
>to complement existing (LIST-TAIL list k).
Perhaps for DROP, LIST-BUTLAST ala Common Lisp and LIST-BUTFIRST (or NTH-CDR).
>From: Maciej Stachowiak <xxxxxx@mit.edu>
Why not `list-head' and `list-tail' for these, I think those are more
intuitive names. `list-ref' vs. `nth' is more an open question, why
not have both.
I do not like LIST-HEAD and LIST-TAIL because I find the names confusing.
Does (LIST-TAIL list k) *drop* K elements or *give* you a K-element tail? With
TAKE and DROP there is no such confusion. If you *want* K elements, you TAKE
them. If you want to *remove* K elements, you DROP them.
I particularly dislike the names LIST-HEAD and LIST-TAIL. By all rights,
these names should be synonyms for CAR and CDR. The head of a list is its
car; the tail of a list is its cdr.
Matt Flatt argues that separate functions to take from the front and
back of the list gives better error checking; similarly for dropping.
This is a good point. Note that we currently have four procedures:
take drop take! drop!
This would split each of these procedures into two, for a total of eight
procedures. We could call them
take take/right take! take/right!
drop drop/right drop! drop/right
So (DROP LIST 3) would drop the first 3 elements of LIST and
(DROP/RIGHT LIST 3) would drop the last 3 elements of LIST.
This is good & bad. Good is that is might give better dynamic error checking,
should a client erroneously pass a negative index to the function. On the
other hand, I don't think it's a very common error. Fencepost errors are when
you confuse 0 & 1. Drifting over into negative values is much, much rarer. Bad
is that is induces namespace bloat, and there's a hidden multiplier lurking in
store: I will be proposing string and vector libs with TAKE and DROP functions
(e.g., STRING-TAKE and VECTOR-DROP), so the factor of two will hit these libs
as well.
There are two issues here:
The actual functionality -- punting the negative-indexing for twice
as many function bindings.
In the event of punting negative-indexing, we must deal with the
naming choices for the left-end and right-end variants.
This relates to the larger naming issue discussed in topic
"Naming: right & left variants."
I am coming around to Flatt's point of view, and now support splitting
the functionality. If there are dissenting voices, speak up now, otherwise
that's what I am going to do.
[By the way, in response to questions of the source of the right-end
negative-indexing convention -- it's a popular feature in APL and J.]
Votes:
Split functionality into left & right versions: Egorov, Stone, lth, shivers
John Stone prefers TAKE & DROP names to LIST-HEAD & LIST-TAIL.
-------------------------------------------------------------------------------
* Naming: right & left variants
This is a *thorny* naming issue. Many of the procedures in this list lib, and
also in the upcoming vector and string libs I will be proposing, come in
left/right pairs. I have been using an L and R suffix to name these pairs,
e.g. FOLDL and FOLDR, REDUCEL and REDUCER. This has the charm of being concise
-- as should be obvious by now, I code in 80-column buffers and do not like to
waste columns gratuitously. However, many people prefer longer names. Also,
some of the resultant names are unfortunate -- the most glaringly awkward one
is REDUCER, which really just means "REDUCE from the Right." So my L and R
suffix convention is not without its downsides. But I like short. I like
simple. And FOLDL, in particular, is well-established in the FP world.
As will be obvious below, this is a very important naming convention to get
right, due to its pervasiveness across multiple libraries and many, many
names. So it really takes some careful thought.
I see three alternatives to consider:
- use *no* suffix for the left operator, and a /R suffix for the right
operator. This is based on the idea that left-to-right is the "natural"
processing order for lists. This gives us the following list, vector and
string bindings (assuming we split the TAKE and DROP ops into left & right
variants):
fold fold/r reduce reduce/r
take take/r take! take!/r
drop drop/r drop! drop!/r
pair-fold pair-fold/r
vector-take vector-take/r vector-take/shared vector-take/rshared
vector-drop vector-drop/r vector-drop/shared vector-drop/rshared
vector-find vector-find/r vector-skip vector-skip/r
vector-fold vector-fold/r vector-reduce vector-reduce/r
vector-scan vector-scan/r
string-fold string-fold/r
string-take string-take/r
string-drop string-drop/r
string-pad string-pad/r
string-trim string-trim/r
string-index string-index/r
string-skip string-skip/r
- use an /L suffix for the left operator, and an /R suffix for the right
operator. This gives us the following list, vector and string bindings:
fold/l fold/r reduce/l reduce/r
take/l take/r take!/l take!/r
drop/l drop/r drop!/l drop!/r
pair-fold/l pair-fold/r
vector-take/l vector-take/r vector-take/lshared vector-take/rshared
vector-drop/l vector-drop/r vector-drop/lshared vector-drop/rshared
vector-find/l vector-find/r vector-skip/l vector-skip/r
vector-fold/l vector-fold/r vector-reduce/l vector-reduce/r
vector-scan/l vector-scan/r
string-fold/l string-fold/r
string-take/l string-take/r
string-drop/l string-drop/r
string-pad/l string-pad/r
string-trim/l string-trim/r
string-index/l string-index/r
string-skip/l string-skip/r
- use a -left suffix for the left operator, and a -right suffix for the right
operator. This gives us the following list, vector and string bindings:
fold-left fold-right reduce-left reduce-right
take-left take-right take-left! take-right!
drop-left drop-right drop-left! drop-right!
pair-fold-left pair-fold-right
vector-take-left vector-take-right vector-take-left/shared vector-take-right/shared
vector-drop-left vector-drop-right vector-drop-left/shared vector-drop-right/shared
vector-find-left vector-find-right vector-skip-left vector-skip-right
vector-fold-left vector-fold-right vector-reduce-left vector-reduce-right
vector-scan-left vector-scan-right
string-fold-left string-fold-right
string-take-left string-take-right
string-drop-left string-drop-right
string-pad-left string-pad-right
string-trim-left string-trim-right
string-index-left string-index-right
string-skip-left string-skip-right
Ouch, I find these names painfully verbose for primitive, low-level
operations. It makes it hard to use functional composition --
(f (g (h x))) -- without drifting off the right side of the screen or
shifting over to awkward, multiline indenting styles. Also, the actual
operation (fold, reduce, pad, trim) gets lost amidst all the tacked-on
modifiers.
- use no suffix for the left operator, and a -right suffix for the right
operator. This gives us the following list, vector and string bindings:
fold fold-right reduce reduce-right
take take-right take! take-right!
drop drop-right drop! drop-right!
pair-fold pair-fold-right
vector-take vector-take-right vector-take/shared vector-take-right/shared
vector-drop vector-drop-right vector-drop/shared vector-drop-right/shared
vector-find vector-find-right vector-skip vector-skip-right
vector-fold vector-fold-right vector-reduce vector-reduce-right
vector-scan vector-scan-right
string-fold string-fold-right
string-take string-take-right
string-drop string-drop-right
string-pad string-pad-right
string-trim string-trim-right
string-index string-index-right
string-skip string-skip-right
However, I listen attentively to the community voice.
[Let's not worry about what exactly these "vector shared" ops are; that is
a topic for another SRFI and another time. Let's focus on this issue of
left and right variant names.]
Votes:
no left suffix; /r right suffix: Will Fitzgerald
/l and /r suffix: Will Fitzgerald
-left and -right suffix: lth
no left suffix; -right right suffix: lth 2nd choice
I have received very little feedback on this issue. In the absence of further
votes, I intend to go with no left suffix; -right right suffix, e.g.
FOLD and FOLD-RIGHT
-------------------------------------------------------------------------------