Update -- !, UNFOLD, COUNT, LIST= & set ops

*shivers@xxxxxx*05 Sep 1999 01:05 UTC1. I am staying with ! for the linear-update suffix. 2. Sergei is right: UNFOLD comes in directional flavors, in concert with the two directions of FOLD. I am sticking with the FOO and FOO-RIGHT general convention, and adding the left unfold. I am also making the tail parameter an optional argument, hence unfold/tail is going away. This gives us unfold p f g seed [tail] unfold-right p f g seed [tail-gen] I append the new specs. 3. I have added two more procedures. The first is taken from Common Lisp: count PRED list1 list2 ... -> integer It applies PRED across the lists and returns the number of the true values thus produced. My experience, at least, tells me this is a very common operation; one worth naming. Secondly, I realised that list-lib has no equality procedure for lists! It's a check-list item: constructor, selector, predicate, comparison. So I have added one: list= elt= lis1 ... Two proper lists are equal if they are the same and their elements are equal (as determined by ELT=). It is an error to apply LIST= to anything except proper lists. Implementations may choose to extend it to circular lists. It cannot reasonably be extended to dotted lists, as there is no way to specify an equality procedure for comparing the list terminators. I did not add list *inequality* procedures LIST< and LIST<=. What would they be? Elementwise "tuple" comparison is one possibility, but so is lexicographic order. If lexicographic order, then which endianness is used? Taking left elements to be more significant than right elements fits the natural implementation of the comparison function, but one could argue that, given that we *construct* lists right-to-left, the most significant elements should be the earliest ones, i.e., the rightmost ones. In practice, these issues will be application-specific. In the absence of a clear-cut obvious single choice, I decided it was best not to institutionalise a particular comparison semantics in the library. I mention these inequality issues mostly to have them in the discussion record for future reference. I append the new specs for COUNT and LIST=. 5. I also considered adding, along the lines of LIST=, a TREE= function: (define (tree= elt= x y) (let recur ((x x) (y y)) (if (pair? x) (and (pair? y) (recur (car x) (car y)) (recur (cdr x) (cdr y))) (and (not (pair? y)) (elt= x y))))) Then I realised that besides TREE= and TREE-COPY, we also want various kinds of tree folding operations, tree unfolders, tree mappers, ... Partial coverage is bogus. So I'm killing TREE-COPY, and have no tree procs at all. Designing a good tree library is a thoughtful task; it's time to push this library out into the world, so we will leave that for another time, another library, another SRFI process. 5. I have more carefully specified the list-set operations, which were not spec'd in a detailed way. I append the definitions. The two most significant design elements of my spec for the list-set operations are: - The element-equality predicate must be "consistent" with EQ?. That is, for element-comparison function elt=, (eq? x y) => (elt= x y). This, in turn, implies that two lists that are EQ? are set-equal under any legal element-comparison function. - I have pretty carefully pinned down what the set ops do. For example, the intersection operator returns a filtered version of its first list argument -- elements are not disordered from the order in which they appear in the first list argument. One reason for such a spec is so that programmers can use ordered lists as sets, without disarranging useful order that might be important in later, non-set use of the results. Other points that are unambiguously pinned down are how multiple occurences of identical elements in a list are handled, and what order arguments are given to the element comparison procedure. So there's more in the spec than is strictly necessary for the bare requirements of a set operation, but it does make for a complete and reliable specification of the list->list operation performed. - The spec is tuned to allow fast, constant-time execution in specific special cases (EQ? arguments, () arguments). 6. I've gone ahead and added the R5RS text for all the R5RS procs exported by this library (CONS, CAR, CDR, ASSQ, ...), so the document is a complete spec/reference. 7. I have put a lot of work into updating the documentation. The HTML, in particular, is now nicely formatted; I have built a CSS style sheet for writing up SRFI's and other HTML documents that define procedures; it even exploits bugs in Netscape's CSS handling to work around other bugs in Netscape's CSS handling, and works fine with IE. But the HTML text is now also out-of-date wrt the text version. For the current state of the SRFI, be sure and check the text version: ftp://ftp.ai.mit.edu/people/shivers/srfi/srfi-1/srfi-1.txt The only one of these items on which I see much potential for controversy is the spec for the list-set ops. Whew. -Olin ------------------------------------------------------------------------------- list= elt= list1 ... -> boolean Determines list equality, given an element-equality procedure. Proper list A equals proper list B if they are of the same length, and their elements are equal, as determined by ELT=. If the element-comparison procedure's first argument is from LISTi, then its second argument is from LISTi+1, i.e. it is always called as (elt= a b) for a an element of list A, and b an element of list B. In the n-ary case, every LISTi is compared to LISTi+1 (as opposed, for example, to comparing LIST1 to every LISTi, for i>1). If there are no lists arguments at all, LIST= simply returns #t. It is an error to apply LIST= to anything except proper lists. While implementations may choose to extend it to circular lists, note that it cannot reasonably be extended to dotted lists, as it provides no way to specify an equality procedure for comparing the list terminators. Note that the dynamic order in which the ELT= procedure is applied to pairs of elements is not specified. For example, if LIST= is applied to three lists, A, B, and C, it may first completely compare A to B, then compare B to C, or it may compare the first elements of A and B, then the first elements of B and C, then the second elements of A and B, and so forth. The equality procedure must be consistent with EQ?. That is, it must be the case that (eq? x y) => (elt= x y). Note that this implies that two lists which are EQ? are always LIST=, as well. count pred clist1 clist2 ... -> integer PRED is a procedure taking as many arguments as there are lists and returning a single value. It is applied element-wise to the elements of the LISTs, and a count is tallied of the number of true values produced. This count is returned. COUNT is "iterative" in that it is guaranteed to apply PRED to the LIST elements in a left-to-right order. The counting stops when the shortest list expires. (count even? '(3 1 4 1 5 9 2 5 6)) => 3 (count < '(1 2 4 8) '(2 4 6 8 10 12 14 16)) => 3 At least one of the argument lists must be finite: (count < '(3 1 4 1) (circular-list 1 10)) => 2 unfold p f g seed [tail] -> value UNFOLD constructs a list with the following loop: (let lp ((seed seed) (lis tail)) (if (p seed) lis (lp (g seed) (cons (f seed) lis)))) P: Determines when to stop unfolding. F: Maps each seed value to the corresponding list element. G: Maps each seed value to next seed value. SEED: The "state" value for the unfold. TAIL: list terminator; defaults to '(). UNFOLD is the fundamental iterative list constructor, just as FOLD is the fundamental iterative list consumer. While UNFOLD may seem a bit abstract to novice functional programmers, it can be used in a number of ways: (unfold zero? ; List of squares: 1^2 ... 10^2. (lambda (x) (* x x)) (lambda (x) (- x 1)) 10) (unfold null-list? car cdr lis) ; Reverse a proper list. ;; Read current input port into a list of values. (unfold eof-object? values (lambda (x) (read)) (read)) ;; (APPEND-REVERSE rev-head tail) (unfold null-list? car cdr rev-head tail) Interested functional programmers may enjoy noting that FOLD and UNFOLD are in some sense inverses. That is, given operations KNULL?, KAR, KDR, KONS, and KNIL satisfying (kons (kar x) (kdr x)) = x and (knull? knil) = #t then (FOLD kons knil (UNFOLD knull? kar kdr x)) = x and (UNFOLD knull? kar kdr (FOLD kons knil x)) = x. This combinator presumably has some pretentious mathematical name; interested readers are invited to communicate it to the author. unfold-right p f g seed [tail-gen]-> list UNFOLD-RIGHT is best described by its basic recursion: (unfold-right p f g seed) = (if (p seed) (tail-gen seed) (cons (f seed) (unfold-right p f g (g seed)))) P: Determines when to stop unfolding. F: Maps each seed value to the corresponding list element. G: Maps each seed value to next seed value. SEED: The "state" value for the unfold. TAIL-GEN: creates the tail of the list; defaults to (lambda (x) '()) UNFOLD-RIGHT is the fundamental recursive list constructor, just as FOLD-RIGHT is the fundamental recursive list consumer. While UNFOLD-RIGHT may seem a bit abstract to novice functional programmers, it can be used in a number of ways: (unfold-right (lambda (x) (> x 10)) ; List of squares: 1^2 ... 10^2. (lambda (x) (* x x)) (lambda (x) (+ x 1)) 1) (unfold-right null-list? car cdr lis) ; Copy a proper list. ;; Read current input port into a list of values. (unfold-right eof-object? values (lambda (x) (read)) (read)) ;; Copy a possibly non-proper list: (unfold-right not-pair? car cdr lis values) ;; Append HEAD onto TAIL: (unfold-right null-list? car cdr head (lambda (x) tail)) Interested functional programmers may enjoy noting that FOLD-RIGHT and UNFOLD-RIGHT are in some sense inverses. That is, given operations KNULL?, KAR, KDR, KONS, and KNIL satisfying (kons (kar x) (kdr x)) = x and (knull? knil) = #t then (FOLD-RIGHT kons knil (UNFOLD-RIGHT knull? kar kdr x)) = x and (UNFOLD-RIGHT knull? kar kdr (FOLD-RIGHT kons knil x)) = x. This combinator sometimes is called an "anamorphism;" when an explicit TAIL-GEN procedure is supplied, it is called an "apomorphism." ** Set operations on lists ========================== The following procedures all take an = argument used to compare elements of lists. This equality procedure is required to be consistent with EQ?. That is, it must be the case that (eq? x y) => (= x y). Note that this implies, in turn, that two lists that are EQ? are also set-equal by any legal comparison procedure. This allows for constant-time determination of set operations on EQ? lists. Be aware that these procedures typically run in time O(n * m) for N- and M-element list arguments. Performance-critical applications operating upon large sets will probably wish to use other data structures and algorithms. lset<= = list1 ... -> boolean Returns true iff every LISTi is a subset of LISTi+1, using = for the element equality procedure. List A is a subset of list B if every element in A is equal to some element of B. When performing an element comparison, the = procedure's first argument is an element of A; its second, an element of B. (lset<= eq? '(a) '(a b a) '(a b c c)) => #t (lset<= eq?) => #t ; Trivial cases (lset<= eq? '(a)) => #t lset= = list1 list2 ... -> boolean Returns true iff every LISTi is set-equal to LISTi+1, using = for the element equality procedure. "Set-equal" simply means that LISTi is a subset of LISTi+1, and LISTi+1 is a subset of LISTi. (lset= eq? '(b e a) '(a e b) '(e e b a)) => #t (lset= eq?) => #t ; Trivial cases (lset= eq? '(a)) => #t lset-adjoin = list elt1 ... -> list Adds the ELTi elements not already in the list parameter to the result list. The result shares a common tail with the list parameter. The new elements are added to the front of the list, but no guarantees are made about their order. The = parameter is an equality procedure used to determine if an ELTi is already a member of LIST. Its first argument is an element of LIST; its second is one of the ELTi. The list parameter is always a suffix of the result -- even if the list parameter contains repeated elements, these are not reduced. (lset-adjoin eq? '(a b c d c e) 'a 'e 'i 'o 'u) => (u o i a b c d c e) lset-union = list1 ... -> list Returns the union of the lists, using = for the element equality procedure. The union of lists A and B is constructed as follows: - If A is the empty list, the answer is B (or a copy of B). - Otherwise, the result is initialised to be list A (or a copy of A). - Proceed through the elements of list B in a left-to-right order. If b is such an element of B, compare every element r of the current result list to b: (= r b). If all comparisons fail, b is consed onto the front of the result. However, there is no guarantee that = will be applied to every pair of arguments from A and B. In particular, if A is EQ? to B, the operation may immediately terminate. In the n-ary case, the two-argument list-union operation is simply folded across the argument lists. (lset-union eq? '(a b c d e) '(a e i o u)) => (u o i a b c d e) ;; Repeated elements in LIST1 are preserved. (lset-union eq? '(a a c) '(x a x)) => (x a a c) (lset-union eq?) => () ; Trivial cases (lset-union eq? '(a b c)) => (a b c) lset-intersection = list1 list2 ... -> list Returns the intersection of the lists, using = for the element equality procedure. The intersection of lists A and B is comprised of every element of A that is = to some element of B: (= a b), for a in A, and b in B. Note this implies that an element which appears in B and multiple times in list A will also appear multiple times in the result. The order in which elements appear in the result is the same as they appear in LIST1 -- that is, LSET-INTERSECTION essentially filters LIST1, without disarranging element order. The result may share a common tail with LIST1. In the n-ary case, the two-argument list-intersection operation is simply folded across the argument lists. However, the dynamic order in which the applications of = are made is not specified. The procedure may check an element of LIST1 for membership in every other list before proceeding to consider the next element of LIST1, or it may completely intersect LIST1 and LIST2 before proceeding to LIST3, or it may go about its work in some third order. (lset-intersection eq? '(a b c d e) '(a e i o u)) => (a e) ;; Repeated elements in LIST1 are preserved. (lset-intersection eq? '(a x y a) '(x a x z)) => '(a x a) (lset-intersection eq? '(a b c)) => (a b c) ; Trivial case lset-difference = list1 list2 ... -> list Returns the difference of the lists, using = for the element equality procedure -- all the elements of LIST1 that are not = to any element from one of the other LISTi parameters. The = procedure's first argument is always an element of LIST1; its second is an element of one of the other LISTi. The result may share a common tail with LIST1. Elements that are repeated multiple times in the LIST1 parameter will occur multiple times in the result. No constraint is placed on the ordering of the elements in the result. The result may share a common tail with LIST1. (lset-difference eq? '(a b c d e) '(a e i o u)) => (b c d) (lset-difference eq? '(a b c)) => (a b c) ; Trivial case lset-xor = list1 ... -> list Returns the XOR of the lists, using = for the element equality procedure. If there are exactly two lists, this is all the elements that appear in exactly one of the two lists. The operation is associative, and thus extends to the n-ary case -- the elements that appear in an odd number of the lists. The result may share a common tail with any of the LISTi parameters. More precisely, for two lists A and B, A xor B is a list of - every element a of A such that there is no element b of B such that (= a b) - every element b of B such that there is no element a of A such that (= b a) In the n-ary case, the binary-xor operation is simply folded across the lists. (lset-xor eq? '(a b c d e) '(a e i o u)) => (d c b i o u) ;; Trivial cases. (lset-xor eq?) => () (lset-xor eq? '(a b c d e)) => (a b c d e) lset-diff+intersection = list1 list2 ... -> [list list] Returns two values -- the difference and the intersection of the lists. Is equivalent to (values (lset-difference = list1 list2 ...) (lset-intersection = list1 list2 ...)) but can be implemented more efficiently. The = procedure's first argument is an element of LIST1; its second is an element of one of the other LISTi. Either of the answer lists may share a common tail with LIST1. This operation essentially partitions LIST1. lset-union! = list1 ... -> list lset-intersection! = list1 list2 ... -> list lset-difference! = list1 list2 ... -> list lset-xor! = list1 ... -> list lset-diff+intersection! = list1 list2 ... -> [list list] These are linear-update variants. They are allowed, but not required, to use the cons cells in their first list parameter to construct their answer. LSET-UNION! is permitted to recycle cons cells from *any* of its list arguments.